More 'human' voice synth (TTS)

2011-06-21 Thread Sridhar Dhanapalan
I'm wondering if there's anything we can do to make TTS sound more
'human'. We'd like to be able to use the XOs to teach English
literacy, but the espeak voices are very robotic.

My understanding is that espeak is optimised for low-power devices
(great for XOs) and clear (if robotic) speech. Would it be feasible to
switch to something else, like festival?

This is some food for thought:
http://braille.uwo.ca/pipermail/speakup/2008-July/046755.html

Sridhar


Sridhar Dhanapalan
Technical Manager
One Laptop per Child Australia
M: +61 425 239 701
E: srid...@laptop.org.au
A: G.P.O. Box 731
     Sydney, NSW 2001
W: www.laptop.org.au
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: More 'human' voice synth (TTS)

2011-06-21 Thread Paul Fox
sridhar wrote:
  I'm wondering if there's anything we can do to make TTS sound more
  'human'. We'd like to be able to use the XOs to teach English
  literacy, but the espeak voices are very robotic.
  
  My understanding is that espeak is optimised for low-power devices
  (great for XOs) and clear (if robotic) speech. Would it be feasible to
  switch to something else, like festival?

i've run festival as part of my home automation system for many many
years, including the last 3 or so on an XO-1 (debxo) which acts as my
current HA server.

the first secret is to run it in client/server mode, to avoid the
server startup latency on every enunciation.  but even after that, i
think the latency will be too high for your application.  i just
tested it:  given a moderate english sentence, it took 3 seconds to
produce output.  (i hide this on my system by caching utterances --
that's more feasible in a menuing system than when teaching literacy.)
http://dev.laptop.org/~pgf/junk/festival_out.wav   (5 seconds on XO-1)
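
The utterance caching mentioned above could look roughly like the following. This is a minimal sketch, not the setup described in this message: the names `UtteranceCache` and `flite_synth` are hypothetical, and the default backend assumes a `flite` binary on the PATH that accepts `-t TEXT -o FILE`. Any other engine can be plugged in through the `synth` argument.

```python
import hashlib
import subprocess
from pathlib import Path


def flite_synth(text, wav_path):
    # Assumption: `flite` is installed; -t gives the text, -o the output WAV.
    subprocess.run(["flite", "-t", text, "-o", str(wav_path)], check=True)


class UtteranceCache:
    """Synthesize each distinct sentence once and reuse the WAV afterwards."""

    def __init__(self, cache_dir, synth=flite_synth):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.synth = synth  # callable(text, wav_path) that writes the file

    def wav_for(self, text):
        # Key the file on a hash of the text so any sentence maps to a
        # safe, fixed-length filename.
        key = hashlib.sha1(text.encode("utf-8")).hexdigest()
        wav_path = self.cache_dir / (key + ".wav")
        if not wav_path.exists():
            # Cache miss: pay the synthesis latency once.
            self.synth(text, wav_path)
        return wav_path
```

A cache like this only helps when the same sentences recur (menus, fixed prompts); free-form text entered by a learner would miss the cache on almost every call.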

flite is a lower cost version of festival that might be appropriate.
it seems to reduce the conversion time to about half a second.
but the quality suffers as well.
http://dev.laptop.org/~pgf/junk/flite_out.wav   (.5 seconds on XO-1)

fyi, current festival server process footprint:
root   999  0.0  9.4  26668 20004 ?        Ss   Jun06  10:03 /usr/bin/festival --server /usr/local/etc/nosil.scm

i haven't used espeak -- i suspect there are API interfaces that are
far richer than what i'm doing from the shell commandline.  i don't
know how one might access festival at that level.

paul

  

=-
 paul fox, p...@laptop.org


Re: [OLPC-AU] More 'human' voice synth (TTS)

2011-06-21 Thread Peter Robinson
On Tue, Jun 21, 2011 at 8:25 AM, Sridhar Dhanapalan
srid...@laptop.org.au wrote:

 I'm wondering if there's anything we can do to make TTS sound more
 'human'. We'd like to be able to use the XOs to teach English
 literacy, but the espeak voices are very robotic.

 My understanding is that espeak is optimised for low-power devices
 (great for XOs) and clear (if robotic) speech. Would it be feasible to
 switch to something else, like festival?


You might want to look at speech-dispatcher. It can be configured for
multiple backends. It's already packaged in Fedora.
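
As a rough illustration of switching backends, the sketch below builds a command line for `spd-say`, the command-line client that ships with speech-dispatcher. The helper names `spd_say_command` and `speak` are hypothetical, and it assumes the `--output-module` and `--language` options are available and that the named module (for example `espeak`, `flite` or `festival`) is installed and enabled in the local speech-dispatcher configuration.

```python
import shutil
import subprocess


def spd_say_command(text, module=None, language="en"):
    """Build the argv for spd-say; module=None uses the configured default."""
    argv = ["spd-say", "--language", language]
    if module is not None:
        # Assumption: this output module is enabled in speech-dispatcher.
        argv += ["--output-module", module]
    argv.append(text)
    return argv


def speak(text, module=None):
    """Speak text through speech-dispatcher, failing clearly if it is absent."""
    if shutil.which("spd-say") is None:
        raise RuntimeError("spd-say not found; is speech-dispatcher installed?")
    subprocess.run(spd_say_command(text, module), check=True)
```

Calling `speak("The cat sat on the mat.", module="flite")` and then the same sentence with `module="espeak"` would make it easy to compare voices without changing the calling code.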

Peter