Exploring Efficient Neural Architectures for Linguistic-Acoustic Mapping in TTS

Santiago Pascual1, Joan Serrà2, Antonio Bonafonte1

1Universitat Politècnica de Catalunya, Barcelona, Spain

2Telefónica Research, Barcelona, Spain

This page shows qualitative results for our work "Exploring Efficient Neural Architectures for Linguistic-Acoustic Mapping in Text-to-Speech". In this work, we explore two pseudo-recurrent mechanisms (QLAD and SALAD) to make acoustic modeling in text-to-speech more efficient while preserving the naturalness of the generated speech.
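
As a rough illustration of what a "pseudo-recurrent" layer can look like, the sketch below contrasts two common ways of avoiding a full RNN cell: a quasi-recurrent block (gates computed in parallel with a causal convolution, followed by a cheap element-wise recurrence) and a purely self-attentive block. This is a minimal PyTorch sketch written for this page, not the model code from the repository; the class names, dimensions, and hyper-parameters are illustrative assumptions, and no claim is made that they match the QLAD or SALAD architectures exactly.

```python
# Illustrative sketch only (not the authors' code): two ways to replace a
# full recurrent acoustic decoder with cheaper, mostly parallel computation.
import torch
import torch.nn as nn


class QuasiRecurrentBlock(nn.Module):
    """Gates computed with a causal 1-D convolution (parallel over time),
    followed by a lightweight element-wise recurrence (fo-pooling style)."""

    def __init__(self, in_dim, hidden_dim, kernel_size=2):
        super().__init__()
        self.pad = kernel_size - 1  # causal padding: only look at past frames
        self.conv = nn.Conv1d(in_dim, 3 * hidden_dim, kernel_size)

    def forward(self, x):
        # x: (batch, time, in_dim) -> conv expects (batch, channels, time)
        h = nn.functional.pad(x.transpose(1, 2), (self.pad, 0))
        z, f, o = self.conv(h).chunk(3, dim=1)
        z, f, o = torch.tanh(z), torch.sigmoid(f), torch.sigmoid(o)
        # Element-wise recurrence over time (much cheaper than a full RNN cell).
        c = torch.zeros_like(z[..., 0])
        outs = []
        for t in range(z.size(-1)):
            c = f[..., t] * c + (1 - f[..., t]) * z[..., t]
            outs.append(o[..., t] * c)
        return torch.stack(outs, dim=1)  # (batch, time, hidden_dim)


class SelfAttentionBlock(nn.Module):
    """One multi-head self-attention layer plus a feed-forward sublayer,
    processing all time steps in parallel (no recurrence at all)."""

    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(),
                                nn.Linear(4 * dim, dim))
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, x):
        # x: (batch, time, dim) linguistic feature sequence
        a, _ = self.attn(x, x, x)
        x = self.norm1(x + a)
        return self.norm2(x + self.ff(x))
```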

The code for this project is publicly available on GitHub.

[Audio samples: six utterances, each presented in four versions — Human (natural recording), RNN baseline, SALAD, and QLAD.]