RE: Speech API Community Group

2012-04-03 Thread Young, Milan
The proposal mentions that the specification of a network speech protocol is 
out of scope. This makes sense given that protocols are the domain of the IETF.

But I'd like to confirm that the use of network speech services is in scope 
for this CG.  Would you mind amending the proposal to make this explicit?

Thanks


From: Glen Shires [mailto:gshi...@google.com]
Sent: Tuesday, April 03, 2012 8:13 AM
To: public-xg-htmlspe...@w3.org; public-webapps@w3.org
Subject: Speech API Community Group

We at Google have proposed the formation of a new Speech API Community Group 
to pursue a JavaScript Speech API. We encourage you to join and support this 
effort. [1]

We believe that forming a Community Group has the following advantages:

- It's quick, efficient and minimizes unnecessary process overhead.

- We believe it will allow us, as a group, to reach consensus in an efficient 
manner.

- We hope it will expedite interoperable implementations in multiple browsers. 
(A good example is the Web Media Text Tracks CG, where multiple implementations 
are happening quickly.)

Google plans to supply an implementation and a test suite for this 
specification, and will commit to serve as editor.  We hope that others will 
support this CG, as they stated support for the similar WebApps CfC. [2]

Bjorn Bringert
Satish Sampath
Glen Shires

[1] http://www.w3.org/community/groups/proposed/#speech-api
[2] http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/0315.html


RE: Speech API Community Group

2012-04-03 Thread Young, Milan
It matters to the application author that they can select a service that works 
best for them.  Relying on browser or OS configurations would not suffice for 
real-world speech applications.

I don't see how we can properly specify the process of selection without the 
mention of network services.  Hence the language request.


From: Jerry Carter [mailto:je...@jerrycarter.org]
Sent: Tuesday, April 03, 2012 11:46 AM
To: Young, Milan
Cc: Glen Shires; public-xg-htmlspe...@w3.org; public-webapps@w3.org
Subject: Re: Speech API Community Group


On Apr 3, 2012, at 11:48 AM, Young, Milan wrote:


The proposal mentions that the specification of a network speech protocol is 
out of scope. This makes sense given that protocols are the domain of the IETF.

But I'd like to confirm that the use of network speech services is in scope 
for this CG.  Would you mind amending the proposal to make this explicit?

I don't see why any such declaration is necessary.  From the perspective of the 
application author or of the application user, it matters very little where the 
speech-to-text operation occurs so long as the result is delivered promptly.  
There is no reason that local, network-based, or hybrid solutions would be 
unable to provide adequate performance.  I believe the current language in the 
proposal is appropriate.

-=- Jerry



RE: Speech API Community Group

2012-04-03 Thread Young, Milan
The problem is that the community group has an ambiguous “charter”, and at 
least some folks would like this clarified before joining.  Since Speech-XG 
and WebApps are the two most relevant lists, I don’t know where else the 
discussion would take place.

I believe that all of this could be cleared up by a simple statement from the 
CG chair (Glen Shires) that FPR7 and FPR12 are in scope.  There is “STRONG” 
interest in this domain and an editor has already volunteered (Jim Barnett).  
Seems like a simple decision.

Thanks


From: Charles Pritchard [mailto:ch...@jumis.com]
Sent: Tuesday, April 03, 2012 1:29 PM
To: Michael Bodell
Cc: Jerry Carter; Raj (Openstream); Young, Milan; Jim; Glen Shires; 
public-xg-htmlspe...@w3.org; public-webapps@w3.org
Subject: Re: Speech API Community Group

I'd like to encourage everyone interested in the Speech API to join the mailing 
list:
http://lists.w3.org/Archives/Public/public-speech-api/

For those interested in more hands-on interaction, there's the CG:
http://www.w3.org/community/speech-api/

For some archived mailing list discussion, browse the old XG list:
http://lists.w3.org/Archives/Public/public-xg-htmlspeech/

It seems like we can move this chatter over to public-speech-api and off of the 
webapps list.

-Charles


On 4/3/2012 1:08 PM, Michael Bodell wrote:
A little bit of historical context and resource references might be helpful for 
some on the email thread.

While this is still an early stage for a community group, if one happens, it 
actually isn’t early for the community as a group to talk about this.  In 
many ways we’ve already done the initial incubation and community discussion 
and investigation for this space in the HTML Speech XG.  This led to the XG’s 
use case and requirements document:
http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html

which were then refined to a prioritized requirement list after soliciting 
community input:
http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#prioritized

As I read it, the requirements that Milan, Jim, and Raj discussed are part of FPR7 
[Web apps should be able to request speech service different from default] and 
FPR12 [Speech services that can be specified by web apps must include network 
speech services], both of which were voted to have “Strong Interest” by the 
community.

Further work from these requirements led the community to a proposal, published 
in the XG final report, which is now ready to be taken to a standards-track 
process:
http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/

Hopefully we can all properly leverage the work the community has already done.

Michael Bodell
Co-chair HTML Speech XG


From: Jerry Carter [mailto:je...@jerrycarter.org]
Sent: Tuesday, April 03, 2012 12:50 PM
To: Raj (Openstream); Milan Young; Jim
Cc: Glen Shires; 
public-xg-htmlspe...@w3.org; public-webapps@w3.org
Subject: Re: Speech API Community Group


We can discuss this in generalities without reaching any resolution, so let me 
offer two more concrete use cases:

My friend José is working on a personal site to track teams and player 
statistics at the Brazil 2014 World Cup.  He recognizes that the browser will 
define a default language through the HTTP Accept-Language header, but knows 
that speakers may code-switch in their requests (e.g. Spanish + English or 
Portuguese + English) or be better served by using native pronunciations 
(Jesus = /heːzus/ vs. /ˈdʒiːzəs/).  Hence, he requires a resource that can 
provide support for Spanish, English, and Portuguese and that can also support 
multiple simultaneous languages.

These are two solid requirements.  A browser encountering the page might (1) be 
able to satisfy these requirements, (2) require user permission before 
accessing such a resource, or (3) be unable to meet the request.

My colleague Jim has another application for which hundreds of hours have been 
invested to optimize the performance for a specific recognition resource.  
Security considerations further restrict the physical location of conforming 
resources.  His page requires a very specific resource.

These are two solid requirements.  A browser encountering the page might (1) be 
able to satisfy these requirements, (2) require user permission before 
accessing such a resource, or (3) be unable to meet the request.
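
(Purely for illustration, here is a sketch of how a page might express such 
requirements in script once an API exists.  None of the names below are taken 
from the XG report or the Google proposal; SpeechRecognition, serviceURI, and 
lang are assumptions loosely modeled on the FPR7/FPR12 requirements.)

  // Hypothetical sketch: request a specific, possibly network-based speech
  // service (FPR7/FPR12) that can handle Spanish, English, and Portuguese.
  // All interface and attribute names here are assumed for illustration.
  var reco = new SpeechRecognition();
  reco.serviceURI = "https://asr.example.com/worldcup";  // assumed attribute
  reco.lang = "es";                                      // assumed attribute
  reco.onerror = function (event) {
    // Cases (2) and (3) above: the browser needs user permission or cannot
    // reach a conforming resource; the page could fall back or inform the
    // user here.
    console.log("Requested speech service unavailable");
  };
  reco.start();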

There are indeed commercial requirements around the capabilities of resources.  
We are in full agreement.  It is important to be able to list requirements for 
conforming resources and to ensure that the browser is enforcing those 
requirements.  That stated, the application author does not care where such a 
conforming resource resides so long as it is available to the targeted user 
population.  The user does not care where the resource resides so long as it 
works well and does not cost too much to use.

The trick within

RE: to add Speech API to Charter; deadline January 19

2012-01-13 Thread Young, Milan
That's exactly the right question to ask.  Please take a look at: 
http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#requirements

I am also in support of Olli's statement that we may not be able to 
spec/implement the complete XG recommendation in one pass.  But decisions made 
toward the definition of that initial feature set should be made in a 
democratic forum.  I feel the best way to do that is to start where the last 
democratic forum left off, and whittle down from there as schedule requires.

Thank you


-Original Message-
From: Dave Bernard [mailto:dbern...@intellectiongroup.com] 
Sent: Friday, January 13, 2012 8:14 AM
To: 'Deborah Dahl'; 'Satish S'; Young, Milan
Cc: 'Arthur Barstow'; 'public-webapps'; public-xg-htmlspe...@w3.org
Subject: RE: to add Speech API to Charter; deadline January 19

Deborah-

Is there a draft priority list in existence? I like the idea of getting "good 
enough" out there sooner, especially as an implementer with real projects in 
the space.

Dave

-Original Message-
From: Deborah Dahl [mailto:d...@conversational-technologies.com]
Sent: Friday, January 13, 2012 10:43 AM
To: 'Satish S'; 'Young, Milan'
Cc: 'Arthur Barstow'; 'public-webapps'; public-xg-htmlspe...@w3.org
Subject: RE: to add Speech API to Charter; deadline January 19

Olli has a good point that it makes sense to implement the SpeechAPI in pieces. 
That doesn't mean that the WebApps WG only has to look at one proposal in 
deciding how to proceed with the work. Another option would be to start off the 
Speech API work in the Web Apps group with both proposals (the Google proposal 
and the SpeechXG report) and let the editors prioritize the order that the 
different aspects of the API are worked out and published as specs.

 -Original Message-
 From: Satish S [mailto:sat...@google.com]
 Sent: Thursday, January 12, 2012 5:01 PM
 To: Young, Milan
 Cc: Arthur Barstow; public-webapps; public-xg-htmlspe...@w3.org
 Subject: Re: to add Speech API to Charter; deadline January 19
 
 Milan,
 It looks like we fundamentally agree on several things:
 *  That we'd like to see the JavaScript Speech API included in the 
    WebApps' charter.
 *  That we believe the wire protocol is best suited for another 
    organization, such as IETF.
 *  That we believe the markup bindings may be excluded.
 Our only difference seems to be whether to start with the extensive 
 Javascript API proposed in [1] or the simplified subset of it proposed 
 in [2], which supports the majority of the use cases in the XG's Final 
 Report.
 
 Art Barstow asked for a relatively specific proposal and provided 
 some precedent examples regarding the level of detail. [3]  Olli 
 Pettay wrote in [4]: "Since from practical point of view the 
 API+protocol XG defined is a huge thing to implement at once, it makes 
 sense to implement it in pieces."
 Starting with a baseline that supports the majority of use cases will 
 accelerate implementation, interoperability testing, standardization 
 and ultimately developer adoption.
 Cheers
 Satish
 
 [1] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/
 [2] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html
 [3] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1474.html
 [4] http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/0068.html
 
 On Thu, Jan 12, 2012 at 5:46 PM, Young, Milan milan.yo...@nuance.com wrote:
 
  I've made the point a few times now, and would appreciate a response.
  Why are we preferring to seed WebApps speech with [2] when we 
  already have [3] that represents industry consensus as of a month 
  ago (Google notwithstanding)?  Proceeding with [2] would almost 
  surely delay the resulting specification as functionality would be 
  patched and haggled over to meet consensus.
 
  My counter proposal is to open the HTML/speech marriage in WebApps 
  essentially where we left off at [3].  The only variants being: 1) 
  Dropping the markup bindings in sections 7.1.2/7.1.3 because its 
  primary supporter has since expressed non-interest, and 2) Spin the 
  protocol specification in 7.2 out to the IETF.  If I need to 
  formalize all of this in a document, please let me know.
 
  Thank you
 
  [3] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/
 
 
 
  -Original Message-
  From: Arthur Barstow [mailto:art.bars...@nokia.com]
  Sent: Thursday, January 12, 2012 4:31 AM
  To: public-webapps
  Cc: public-xg-htmlspe...@w3.org
  Subject: CfC: to add Speech API to Charter; deadline January 19
 
  Glen Shires and some others at Google proposed [1] that WebApps add 
  Speech API to WebApps' charter and they put forward the Speech 
  Javascript API Specification [2] as a starting point. Members of 
  Mozilla and Nuance have voiced various levels of support for this 
  proposal. As such, this is a Call for Consensus to add Speech API to 
  WebApps' charter.
 
  Positive response to this CfC is preferred and encouraged and silence 
  will be considered as agreeing with the proposal. The deadline for 
  comments is January 19 and all comments should be sent to public-webapps 
  at w3.org.

RE: to add Speech API to Charter; deadline January 19

2012-01-12 Thread Young, Milan
I've made the point a few times now, and would appreciate a response.
Why are we preferring to seed WebApps speech with [2] when we already
have [3] that represents industry consensus as of a month ago (Google
notwithstanding)?  Proceeding with [2] would almost surely delay the
resulting specification as functionality would be patched and haggled over
to meet consensus.

My counter proposal is to open the HTML/speech marriage in WebApps
essentially where we left off at [3].  The only variants being: 1)
Dropping the markup bindings in sections 7.1.2/7.1.3 because its primary
supporter has since expressed non-interest, and 2) Spin the protocol
specification in 7.2 out to the IETF.  If I need to formalize all of
this in a document, please let me know.

Thank you

[3] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/



-Original Message-
From: Arthur Barstow [mailto:art.bars...@nokia.com] 
Sent: Thursday, January 12, 2012 4:31 AM
To: public-webapps
Cc: public-xg-htmlspe...@w3.org
Subject: CfC: to add Speech API to Charter; deadline January 19

Glen Shires and some others at Google proposed [1] that WebApps add
Speech API to WebApps' charter and they put forward the Speech
Javascript API Specification [2] as a starting point. Members of
Mozilla and Nuance have voiced various levels of support for this
proposal. As such, this is a Call for Consensus to add Speech API to
WebApps' charter.

Positive response to this CfC is preferred and encouraged and silence
will be considered as agreeing with the proposal. The deadline for
comments is January 19 and all comments should be sent to public-webapps
at w3.org.

-AB

[1] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1696.html
[2] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html




RE: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization

2012-01-06 Thread Young, Milan
The HTML Speech XG worked for over a year prioritizing use cases against
timelines and packaged all of that into a recommendation complete with
IDLs and examples.  So while I understand that WebApps may not have the
time to review the entirety of this work, it's hard to see how
dissecting it would speed the process of understanding.

 

Perhaps a better approach would be to find half an hour to present to
select members of WebApps the content of the recommendation and the
possible relevance to their group.  Does that sound reasonable?

 

Thanks

 

 

 

From: Glen Shires [mailto:gshi...@google.com] 
Sent: Wednesday, January 04, 2012 11:15 PM
To: public-webapps@w3.org
Cc: public-xg-htmlspe...@w3.org; Arthur Barstow; Dan Burnett
Subject: Speech Recognition and Text-to-Speech Javascript API - seeking
feedback for eventual standardization

 

As Dan Burnett wrote below: "The HTML Speech Incubator Group [1] has
recently wrapped up its work on use cases, requirements, and proposals
for adding automatic speech recognition (ASR) and text-to-speech (TTS)
capabilities to HTML.  The work of the group is documented in the
group's Final Report. [2]  The members of the group intend this work to
be input to one or more working groups, in W3C and/or other standards
development organizations such as the IETF, as an aid to developing full
standards in this space."

 

Because that work was so broad, Art Barstow asked (below) for a
relatively specific proposal.  We at Google are proposing that a subset
of it be accepted as a work item by the Web Applications WG.
Specifically, we are proposing this Javascript API [3], which enables
web developers to incorporate speech recognition and synthesis into
their web pages. This simplified subset enables developers to use
scripting to generate text-to-speech output and to use speech
recognition as an input for forms, continuous dictation and control, and
it supports the majority of use-cases in the Incubator Group's Final
Report.
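
(For readers unfamiliar with the shape of such an API, here is a minimal
sketch of the kind of scripting usage described above: speech recognition
filling a form field plus synthesized speech output.  The names used --
SpeechRecognition, speechSynthesis, SpeechSynthesisUtterance -- are
assumptions for illustration, not quotations from the proposal [3].)

  // Hypothetical sketch of script-only speech input and output.
  // Interface names are assumed for illustration.
  var reco = new SpeechRecognition();
  reco.onresult = function (event) {
    // Copy the top transcript into a form field (element ids assumed).
    document.getElementById("query").value = event.results[0][0].transcript;
  };
  document.getElementById("mic").onclick = function () {
    reco.start();   // begin capturing speech for the form field
  };

  // Text-to-speech output: speak a short confirmation back to the user.
  speechSynthesis.speak(new SpeechSynthesisUtterance("Searching now"));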

 

We welcome your feedback and ask that the Web Applications WG consider
accepting this Javascript API [3] as a work item.

 

[1] charter:  http://www.w3.org/2005/Incubator/htmlspeech/charter

[2] report: http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/

[3] API: http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html

 

Bjorn Bringert

Satish Sampath

Glen Shires

 

On Thu, Dec 22, 2011 at 11:38 AM, Glen Shires gshi...@google.com
wrote:

Milan,

The IDLs contained in both documents are in the same format and order,
so it's relatively easy to compare the two side-by-side
(http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#speechreco-section
and
http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html#api_description).
The semantics of the attributes, methods and events have not changed,
and both IDLs link directly to the definitions contained in the Speech
XG Final Report.

 

As you mention, we agree that the protocol portions of the Speech XG
Final Report are most appropriate for consideration by a group such as
IETF, and believe such work can proceed independently, particularly
because the Speech XG Final Report has provided a roadmap for these to
remain compatible.  Also, as shown in the Speech XG Final Report
Overview
(http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#introductory),
the Speech Web API is not dependent on the Speech Protocol, and a
Default Speech service can be used for local or remote speech
recognition and synthesis.

 

Glen Shires

 

On Thu, Dec 22, 2011 at 10:32 AM, Young, Milan milan.yo...@nuance.com
wrote:

Hello Glen,

 

The proposal says that it contains a simplified subset of the
JavaScript API.  Could you please clarify which elements of the
HTMLSpeech recommendation's JavaScript API were omitted?   I think this
would be the most efficient way for those of us familiar with the XG
recommendation to evaluate the new proposal.

 

I'd also appreciate clarification on how you see the protocol being
handled.  In the HTMLSpeech group we were thinking about this as a
hand-in-hand relationship between W3C and IETF like WebSockets.  Is this
still your (and Google's) vision?

 

Thanks

 

 

From: Glen Shires [mailto:gshi...@google.com] 
Sent: Thursday, December 22, 2011 11:14 AM
To: public-webapps@w3.org; Arthur Barstow
Cc: public-xg-htmlspe...@w3.org; Dan Burnett


Subject: Re: HTML Speech XG Completes, seeks feedback for eventual
standardization

 

We at Google believe that a scripting-only (Javascript) subset of the
API defined in the Speech XG Incubator Group Final Report is of
appropriate scope for consideration by the WebApps WG.

 

The enclosed scripting-only subset supports the majority of the
use-cases and samples in the XG proposal. Specifically, it enables
web-pages to generate speech output and to use speech recognition as an
input for forms, continuous dictation and control. The Javascript API
will allow web pages to control activation and timing and to handle results
and alternatives.

RE: HTML Speech XG Completes, seeks feedback for eventual standardization

2011-12-22 Thread Young, Milan
Hello Glen,

 

The proposal says that it contains a simplified subset of the
JavaScript API.  Could you please clarify which elements of the
HTMLSpeech recommendation's JavaScript API were omitted?   I think this
would be the most efficient way for those of us familiar with the XG
recommendation to evaluate the new proposal.

 

I'd also appreciate clarification on how you see the protocol being
handled.  In the HTMLSpeech group we were thinking about this as a
hand-in-hand relationship between W3C and IETF like WebSockets.  Is this
still your (and Google's) vision?

 

Thanks

 

 

From: Glen Shires [mailto:gshi...@google.com] 
Sent: Thursday, December 22, 2011 11:14 AM
To: public-webapps@w3.org; Arthur Barstow
Cc: public-xg-htmlspe...@w3.org; Dan Burnett
Subject: Re: HTML Speech XG Completes, seeks feedback for eventual
standardization

 

We at Google believe that a scripting-only (Javascript) subset of the
API defined in the Speech XG Incubator Group Final Report is of
appropriate scope for consideration by the WebApps WG.

 

The enclosed scripting-only subset supports the majority of the
use-cases and samples in the XG proposal. Specifically, it enables
web-pages to generate speech output and to use speech recognition as an
input for forms, continuous dictation and control. The Javascript API
will allow web pages to control activation and timing and to handle
results and alternatives.
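
(To illustrate what "activation and timing" and "results and alternatives"
could look like in script, here is a sketch only.  Attribute names such as
continuous and maxAlternatives are assumptions for illustration and are not
drawn from the enclosed subset itself.)

  // Hypothetical sketch: continuous dictation with n-best alternatives.
  // Attribute and interface names are assumed for illustration.
  var reco = new SpeechRecognition();
  reco.continuous = true;        // assumed: keep listening across phrases
  reco.maxAlternatives = 3;      // assumed: request up to three alternatives
  reco.onresult = function (event) {
    // Log each alternative of the most recent result with its confidence.
    var latest = event.results[event.results.length - 1];
    for (var i = 0; i < latest.length; i++) {
      console.log(latest[i].transcript + " (" + latest[i].confidence + ")");
    }
  };
  // Activation and timing remain under page control:
  document.getElementById("start").onclick = function () { reco.start(); };
  document.getElementById("stop").onclick = function () { reco.stop(); };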

 

We welcome your feedback and ask that the Web Applications WG consider
accepting this as a work item.

 

Bjorn Bringert

Satish Sampath

Glen Shires

 

On Tue, Dec 13, 2011 at 11:39 AM, Glen Shires gshi...@google.com
wrote:

We at Google believe that a scripting-only (Javascript) subset of the
API defined in the Speech XG Incubator Group Final Report [1] is of
appropriate scope for consideration by the WebApps WG.

 

A scripting-only subset supports the majority of the use-cases and
samples in the XG proposal. Specifically, it enables web-pages to
generate speech output and to use speech recognition as an input for
forms, continuous dictation and control. The Javascript API will allow
web pages to control activation and timing and to handle results and
alternatives.

 

As Dan points out above, we envision that different portions of the
Incubator Group Final Report are applicable to different working groups
in W3C and/or other standards development organizations such as the
IETF.  This scripting API subset does not preclude other groups from
pursuing standardization of relevant HTML markup or underlying transport
protocols, and indeed the Incubator Group Final Report defines a
potential roadmap so that such additions can be compatible.

 

To make this more concrete, Google will provide to this mailing list a
specific proposal extracted from the Incubator Group Final Report, that
includes only those portions we believe are relevant to WebApps, with
links back to the Incubator Report as appropriate.

 

Bjorn Bringert

Satish Sampath

Glen Shires

 

[1] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/

 

On Tue, Dec 13, 2011 at 5:32 AM, Dan Burnett dburn...@voxeo.com wrote:

Thanks for the info, Art.  To be clear, I personally am *NOT* proposing
adding any specs to WebApps, although others might.  My email below as a
Chair of the group is merely to inform people of this work and ask for
feedback.
I expect that your information will be useful for others who might wish
for some of this work to continue in WebApps.

-- dan



On Dec 13, 2011, at 7:06 AM, Arthur Barstow wrote:

 Hi Dan,

 WebApps already has a relatively large number of specs in progress
(see [PubStatus]) and the group has agreed to add some additional specs
(see [CharterChanges]). As such, please provide a relatively specific
proposal about the features/specs you and other proponents would like to
add to WebApps.

 Regarding the level of detail for your proposal, I think a reasonable
precedent is something like the Gamepad and Pointer/MouseLock proposals
(see [CharterChanges]). (Perhaps this could be achieved by identifying
specific sections in the XG's Final Report?)

 -Art Barstow

 [PubStatus]
http://www.w3.org/2008/webapps/wiki/PubStatus#API_Specifications
 [CharterChanges]
http://www.w3.org/2008/webapps/wiki/CharterChanges#Additions_Agreed

 On 12/12/11 5:25 PM, ext Dan Burnett wrote:
 Dear WebApps people,

 The HTML Speech Incubator Group [1] has recently wrapped up its work
on use cases, requirements, and proposals for adding automatic speech
recognition (ASR) and text-to-speech (TTS) capabilities to HTML.  The
work of the group is documented in the group's Final Report. [2]

 The members of the group intend this work to be input to one or more
working groups, in W3C and/or other standards development organizations
such as the IETF, as an aid to developing full standards in this space.
 Whether the W3C work happens in a new Working Group or an existing
one, we are interested in collecting feedback on the Incubator Group's