Re: [whatwg] Text-To-Speech (TTS) Web API for JavaScript

2009-12-11 Thread Ian Hickson
On Thu, 10 Dec 2009, Weston Ruter wrote:

 I've been working on a web app which reads text in a web page, highlighting
 each word as it is read. For this to be possible, a Text-To-Speech API is
 needed which is able to:
 (1) generate the speech audio from some text, and
 (2) include the time indices for when each of the words in the text is
 spoken.
 
 Microsoft has its Sapi.SpVoice API via ActiveXObject which does (1) but not
 (2) apparently. There are web services (usable in conjunction with HTML5
 Audio) which also do (1) such as the iSpeech API
 http://www.ispeech.org/api and Google Translate's TTS
 http://translate.google.com/translate_tts?q=Hello%2C+World&tl=en, but none
 that I have found which do (2). In any case, web services
 aren't preferable since they require that the audio be transferred over the
 network which could take a significant amount of time.
 
 Is anyone aware of any work done to develop a standard TTS API for the Web?
 Operating systems already have this functionality built-in, and it's a shame
 that web apps can't make use of it. If Google Gears were alive, it would've
 been a good place to prototype this, but alas…

In addition to the suggestions made by Olli, you may also wish to read 
this thread, which includes some other people interested in this topic:

   
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-December/thread.html#24281

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

[whatwg] Text-To-Speech (TTS) Web API for JavaScript

2009-12-10 Thread Weston Ruter
I've been working on a web app which reads text in a web page, highlighting
each word as it is read. For this to be possible, a Text-To-Speech API is
needed which is able to:
(1) generate the speech audio from some text, and
(2) include the time indices for when each of the words in the text is
spoken.
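
To make requirement (2) concrete, here is a minimal sketch of how per-word time indices could drive the highlighting, assuming a TTS engine returned them alongside the audio. The `WordTiming` shape, the `wordIndexAt` helper, and the timing values are all hypothetical illustrations, not part of any existing API.

```typescript
// Hypothetical shape for the per-word timing data a TTS API might return.
interface WordTiming {
  word: string;
  startMs: number; // offset into the audio at which the word begins
}

// Return the index of the word being spoken at timeMs, or -1 if playback
// has not yet reached the first word. Assumes timings are sorted by startMs.
function wordIndexAt(timings: WordTiming[], timeMs: number): number {
  let lo = 0;
  let hi = timings.length - 1;
  let found = -1;
  // Binary search for the last word whose start time is <= timeMs.
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (timings[mid].startMs <= timeMs) {
      found = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}

// Made-up timings for "Hello, World"; a real engine would supply these.
const timings: WordTiming[] = [
  { word: "Hello,", startMs: 0 },
  { word: "World", startMs: 450 },
];
```

In a page, something like this could be called from an audio element's `timeupdate` handler, passing `currentTime * 1000`, and the returned index used to pick which word to highlight.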

Microsoft has its Sapi.SpVoice API via ActiveXObject which does (1) but not
(2) apparently. There are web services (usable in conjunction with HTML5
Audio) which also do (1) such as the iSpeech API
http://www.ispeech.org/api and Google Translate's TTS
http://translate.google.com/translate_tts?q=Hello%2C+World&tl=en, but none
that I have found which do (2). In any case, web services
aren't preferable since they require that the audio be transferred over the
network which could take a significant amount of time.

Is anyone aware of any work done to develop a standard TTS API for the Web?
Operating systems already have this functionality built-in, and it's a shame
that web apps can't make use of it. If Google Gears were alive, it would've
been a good place to prototype this, but alas…

Thanks,
Weston


Re: [whatwg] Text-To-Speech (TTS) Web API for JavaScript

2009-12-10 Thread Olli Pettay

On 12/10/09 4:54 PM, Weston Ruter wrote:

I've been working on a web app which reads text in a web page,
highlighting each word as it is read. For this to be possible, a
Text-To-Speech API is needed which is able to:
(1) generate the speech audio from some text, and
(2) include the time indices for when each of the words in the text is
spoken.

Microsoft has its Sapi.SpVoice API via ActiveXObject which does (1) but
not (2) apparently. There are web services (usable in conjunction with
HTML5 Audio) which also do (1) such as the iSpeech API
http://www.ispeech.org/api and Google Translate's TTS
http://translate.google.com/translate_tts?q=Hello%2C+World&tl=en, but
none that I have found which do (2). In any case, web services
aren't preferable since they require that the audio be transferred over
the network which could take a significant amount of time.

Is anyone aware of any work done to develop a standard TTS API for the
Web? Operating systems already have this functionality built-in, and
it's a shame that web apps can't make use of it. If Google Gears were
alive, it would've been a good place to prototype this, but alas…



You probably want to ask the W3C Multimodal Interaction Working Group.
There are specifications like XHTML+Voice and SALT
(neither is really a W3C specification) and (old) proposals like
MMI-CSS.


-Olli