As Dan Burnett wrote below: The HTML Speech Incubator Group [1] has recently wrapped up its work on use cases, requirements, and proposals for adding automatic speech recognition (ASR) and text-to-speech (TTS) capabilities to HTML. The work of the group is documented in the group's Final Report. [2] The members of the group intend this work to be input to one or more working groups, in W3C and/or other standards development organizations such as the IETF, as an aid to developing full standards in this space.
Because that work was so broad, Art Barstow asked (below) for a relatively specific proposal. We at Google are proposing that a subset of it be accepted as a work item by the Web Applications WG. Specifically, we are proposing this JavaScript API [3], which enables web developers to incorporate speech recognition and synthesis into their web pages. This simplified subset enables developers to use scripting to generate text-to-speech output and to use speech recognition as an input for forms, continuous dictation and control, and it supports the majority of use cases in the Incubator Group's Final Report.

We welcome your feedback and ask that the Web Applications WG consider accepting this JavaScript API [3] as a work item.

[1] charter: http://www.w3.org/2005/Incubator/htmlspeech/charter
[2] report: http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/
[3] API: http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html

Bjorn Bringert
Satish Sampath
Glen Shires

On Thu, Dec 22, 2011 at 11:38 AM, Glen Shires <gshi...@google.com> wrote:
> Milan,
> The IDLs contained in both documents are in the same format and order, so it's relatively easy to compare the two side<http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#speechreco-section>-by-side<http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html#api_description>. The semantics of the attributes, methods and events have not changed, and both IDLs link directly to the definitions contained in the Speech XG Final Report.
>
> As you mention, we agree that the protocol portions of the Speech XG Final Report are most appropriate for consideration by a group such as IETF, and believe such work can proceed independently, particularly because the Speech XG Final Report has provided a roadmap for these to remain compatible.
> Also, as shown in the Speech XG Final Report - Overview<http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#introductory>, the "Speech Web API" is not dependent on the "Speech Protocol" and a "Default Speech" service can be used for local or remote speech recognition and synthesis.
>
> Glen Shires
>
> On Thu, Dec 22, 2011 at 10:32 AM, Young, Milan <milan.yo...@nuance.com> wrote:
>
>> Hello Glen,
>>
>> The proposal says that it contains a “simplified subset of the JavaScript API”. Could you please clarify which elements of the HTMLSpeech recommendation’s JavaScript API were omitted? I think this would be the most efficient way for those of us familiar with the XG recommendation to evaluate the new proposal.
>>
>> I’d also appreciate clarification on how you see the protocol being handled. In the HTMLSpeech group we were thinking about this as a hand-in-hand relationship between W3C and IETF like WebSockets. Is this still your (and Google’s) vision?
>>
>> Thanks
>>
>> *From:* Glen Shires [mailto:gshi...@google.com]
>> *Sent:* Thursday, December 22, 2011 11:14 AM
>> *To:* public-webapps@w3.org; Arthur Barstow
>> *Cc:* public-xg-htmlspe...@w3.org; Dan Burnett
>> *Subject:* Re: HTML Speech XG Completes, seeks feedback for eventual standardization
>>
>> We at Google believe that a scripting-only (Javascript) subset of the API defined in the Speech XG Incubator Group Final Report is of appropriate scope for consideration by the WebApps WG.
>>
>> The enclosed scripting-only subset supports the majority of the use-cases and samples in the XG proposal. Specifically, it enables web-pages to generate speech output and to use speech recognition as an input for forms, continuous dictation and control.
>> The Javascript API will allow web pages to control activation and timing and to handle results and alternatives.
>>
>> We welcome your feedback and ask that the Web Applications WG consider accepting this as a work item.
>>
>> Bjorn Bringert
>> Satish Sampath
>> Glen Shires
>>
>> On Tue, Dec 13, 2011 at 11:39 AM, Glen Shires <gshi...@google.com> wrote:
>>
>> We at Google believe that a scripting-only (Javascript) subset of the API defined in the Speech XG Incubator Group Final Report [1] is of appropriate scope for consideration by the WebApps WG.
>>
>> A scripting-only subset supports the majority of the use-cases and samples in the XG proposal. Specifically, it enables web-pages to generate speech output and to use speech recognition as an input for forms, continuous dictation and control. The Javascript API will allow web pages to control activation and timing and to handle results and alternatives
>>
>> As Dan points out above, we envision that different portions of the Incubator Group Final Report are applicable to different working groups "in W3C and/or other standards development organizations such as the IETF".
>> This scripting API subset does not preclude other groups from pursuing standardization of relevant HTML markup or underlying transport protocols, and indeed the Incubator Group Final Report defines a potential roadmap such that such additions can be compatible.
>>
>> To make this more concrete, Google will provide to this mailing list a specific proposal extracted from the Incubator Group Final Report, that includes only those portions we believe are relevant to WebApps, with links back to the Incubator Report as appropriate.
>>
>> Bjorn Bringert
>> Satish Sampath
>> Glen Shires
>>
>> [1] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/
>>
>> On Tue, Dec 13, 2011 at 5:32 AM, Dan Burnett <dburn...@voxeo.com> wrote:
>>
>> Thanks for the info, Art. To be clear, I personally am *NOT* proposing adding any specs to WebApps, although others might. My email below as a Chair of the group is merely to inform people of this work and ask for feedback.
>> I expect that your information will be useful for others who might wish for some of this work to continue in WebApps.
>>
>> -- dan
>>
>> On Dec 13, 2011, at 7:06 AM, Arthur Barstow wrote:
>>
>> > Hi Dan,
>> >
>> > WebApps already has a relatively large number of specs in progress (see [PubStatus]) and the group has agreed to add some additional specs (see [CharterChanges]). As such, please provide a relatively specific proposal about the features/specs you and other proponents would like to add to WebApps.
>> >
>> > Regarding the level of detail for your proposal, I think a reasonable precedence is something like the Gamepad and Pointer/MouseLock proposals (see [CharterChanges]). (Perhaps this could be achieved by identifying specific sections in the XG's Final Report?)
>> >
>> > -Art Barstow
>> >
>> > [PubStatus] http://www.w3.org/2008/webapps/wiki/PubStatus#API_Specifications
>> > [CharterChanges] http://www.w3.org/2008/webapps/wiki/CharterChanges#Additions_Agreed
>> >
>> > On 12/12/11 5:25 PM, ext Dan Burnett wrote:
>> >> Dear WebApps people,
>> >>
>> >> The HTML Speech Incubator Group [1] has recently wrapped up its work on use cases, requirements, and proposals for adding automatic speech recognition (ASR) and text-to-speech (TTS) capabilities to HTML. The work of the group is documented in the group's Final Report. [2]
>> >>
>> >> The members of the group intend this work to be input to one or more working groups, in W3C and/or other standards development organizations such as the IETF, as an aid to developing full standards in this space.
>> >> Whether the W3C work happens in a new Working Group or an existing one, we are interested in collecting feedback on the Incubator Group's work. We are specifically interested in input from the members of the WebApps Working Group.
>> >>
>> >> If you have any feedback to share, please send it to, or cc, the group's mailing list (public-xg-htmlspe...@w3.org). This will allow comments to be archived in one consistent location for use by whatever group takes up this work.
>> >>
>> >> Dan Burnett, Co-Chair
>> >> HTML Speech Incubator Group
>> >>
>> >> [1] charter: http://www.w3.org/2005/Incubator/htmlspeech/charter
>> >> [2] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/
>> >>
>> >> p.s. This feedback request is being sent to the following groups: WebApps, HTML, Audio, DAP, Voice Browser, Multimodal Interaction