On Tue, Jan 11, 2022 at 12:11 PM Ken Fallon <k...@fallon.ie> wrote:
> In the past it has been argued that the more natural voices are
> difficult to understand when sped up. So I took the two most natural
> voices from the list and posted a side by side comparison to espeak at
> 150%, 200%, 250%, 300%, 350%, 400%, 450%, and 500%. In my opinion the
> coqui-tts_en_en_ljspeech is more understandable than espeak at every speed.
>
> Can everyone have a listen to this and tell me your preference
> https://hackerpublicradio.org/tts-espeak-ljspeech-vctk-normal-150-200-250-300-350-400-450-500-percent.ogg
I rarely listen faster than 2x (I prefer 1x but will speed up if I have a lot of content to get through), so I can imagine someone who deals with audio navigation day after day would have a much more nuanced (and, I think, valuable) opinion. That said, here's my feedback:

- I found voice #2 the most pleasant of the 3, particularly at 1x
- Both voices #2 and #3 were more pleasant than #1 at all speeds
- All the voices were intelligible at 1x
- At higher speeds, I had the easiest time understanding voice #3, but this could just be due to my own American accent
- I'd like to hear from folks like Mike who routinely listen at high speeds

_______________________________________________
Hpr mailing list
Hpr@hackerpublicradio.org
http://hackerpublicradio.org/mailman/listinfo/hpr_hackerpublicradio.org