On Tue, Jan 7, 2025 at 9:50 PM Evan Liu <ev...@google.com> wrote:

>> * Are the resources downloaded partitioned per top-level site? What should
>> typical download sizes be?
>
> This depends on the browser--for Chrome on Windows/Mac/Linux, there's only
> one instance of each on-device speech recognition language pack and each
> language pack is ~60MB. The spec doesn't necessarily dictate how the
> downloads are handled, only that websites should be allowed to trigger a
> download (or request a download) of a language.

This seems like it'd require, at the very least, some extra considerations as part of the Privacy & Security section of the spec. It would also be good to have that be explicitly an implementation-defined decision.

+Domenic Denicola <dome...@chromium.org> who's been working on similar privacy models related to translations, and can potentially advise you on the best path there.

>> Links to the minutes would be helpful. Filing official positions would be
>> even better.
>
> I've filed official positions for Mozilla
> <https://github.com/mozilla/standards-positions/issues/1157> and WebKit
> <https://github.com/WebKit/standards-positions/issues/443>.
>
>> Why not? Is it tested otherwise?
>
> Oops, I forgot to check that box. This feature is testable by
> web-platform-tests.
>
>> It’s implied that installOnDeviceSpeechRecognition() happens
>> synchronously. Making this a blocking call seems problematic since it could
>> involve a fetch and a download. I’d expect it to return a Promise (
>> https://www.w3.org/TR/design-principles/#promises). And
>> onDeviceWebSpeechAvailable should probably also be async since it could
>> involve reading data from disk.
>
> Totally agree--the implementations of those two APIs on Chrome return
> promises. I'll make sure the spec reflects this.
>
>> The SpeechRecognitionMode "ondevice-only" value is only defined by a
>> comment in the IDL stating that it “Returns an error if on-device speech
>> recognition is not available”. What specifically returns an error?
>> SpeechRecognition.start() doesn’t return any value, and in other error
>> conditions the behavior is to fire SpeechRecognitionErrorEvent. Also, what
>> should the behavior be if SpeechRecognitionMode is changed after start()
>> has already been called?
>
> Ah yeah, I'll update that comment to clarify that it fires a
> SpeechRecognitionErrorEvent. Updating the SpeechRecognitionMode after
> start() has been called has no effect on the existing session. This is
> consistent with how other SpeechRecognition attributes work (e.g. lang,
> maxAlternatives). This isn't explicitly stated anywhere in the spec,
> so I'll file a spec issue to clarify this as well.
>
> As for mitigating privacy and fingerprinting risks, we've been
> collaborating with the team building the Translator API
> <https://chromestatus.com/feature/5172811302961152> feature which also
> has the ability to download and detect language packs. Because the risks
> between these two features are nearly identical, on-device speech
> recognition language pack downloads will follow the same pattern and use
> the same permissions UI as on-device translation language packs. Here are
> some helpful links:
> Privacy Design Doc

I don't think that's a link.

> Translator API Developer Docs
> <https://developer.chrome.com/docs/ai/translator-api>
> Github Issue on Preventing Fingerprinting
> <https://github.com/webmachinelearning/translation-api/issues/3>
>
> Thanks,
> Evan
>
>
> On Tue, Jan 7, 2025 at 10:34 AM Daniel Clark <dan...@microsoft.com> wrote:
>
>> Adding to Yoav’s feedback about the spec:
>>
>> - It’s implied that installOnDeviceSpeechRecognition() happens
>> synchronously. Making this a blocking call seems problematic since it
>> could involve a fetch and a download. I’d expect it to return a Promise (
>> https://www.w3.org/TR/design-principles/#promises). And
>> onDeviceWebSpeechAvailable should probably also be async since it could
>> involve reading data from disk.
>> - The SpeechRecognitionMode "ondevice-only" value is only defined by
>> a comment in the IDL stating that it “Returns an error if on-device speech
>> recognition is not available”. What specifically returns an error?
>> SpeechRecognition.start() doesn’t return any value, and in other error
>> conditions the behavior is to fire SpeechRecognitionErrorEvent. Also, what
>> should the behavior be if SpeechRecognitionMode is changed after start()
>> has already been called?
>>
>> I also wonder if this should have a TAG review, especially given the
>> privacy/fingerprinting implications of websites being able to query which
>> on-device models are available.
>>
>> -- Dan Clark
>>
>> *From:* Yoav Weiss (@Shopify) <yoavwe...@chromium.org>
>> *Sent:* Tuesday, January 7, 2025 12:29 AM
>> *To:* Chromestatus <ad...@cr-status.appspotmail.com>
>> *Cc:* blink-dev@chromium.org; ev...@google.com
>> *Subject:* [EXTERNAL] Re: [blink-dev] Intent to Ship: On-device Web
>> Speech API
>>
>> On Tue, Jan 7, 2025 at 2:10 AM Chromestatus <
>> ad...@cr-status.appspotmail.com> wrote:
>>
>> Contact emails
>>
>> ev...@google.com
>>
>> Explainer
>>
>> https://github.com/WebAudio/web-speech-api/pull/122
>>
>> An actual explainer with usage examples would've been useful.
>>
>> Also, the spec is not very detailed:
>>
>> * It seems to be triggering resource downloads, but Fetch
>> <https://fetch.spec.whatwg.org/> integration is not specified.
>>
>> * Are the resources downloaded partitioned per top-level site? What
>> should typical download sizes be?
>>
>> Specification
>>
>> https://webaudio.github.io/web-speech-api
>>
>> Summary
>>
>> This feature adds on-device speech recognition support to the Web Speech
>> API, allowing websites to ensure that neither audio nor transcribed speech
>> are sent to a third-party service for processing. Websites can query the
>> availability of on-device speech recognition for specific languages, prompt
>> users to install the necessary resources for on-device speech recognition,
>> and choose between on-device or cloud-based speech recognition as needed.
>>
>> Blink component
>>
>> Blink>Speech
>> <https://issues.chromium.org/issues?q=customfield1222907:%22Blink%3ESpeech%22>
>>
>> Search tags
>>
>> speech, recognition, local, offline, on-device
>>
>> TAG review
>>
>> None
>>
>> TAG review status
>>
>> Pending
>>
>> Risks
>>
>> Interoperability and Compatibility
>>
>> None
>>
>> *Gecko*: Positive Discussed at TPAC 2024 with representatives from
>> Mozilla including Paul Adenot
>>
>> *WebKit*: Positive Discussed at TPAC 2024 with representatives from
>> Apple including Eric Carlson.
>>
>> Links to the minutes would be helpful. Filing official positions would be
>> even better.
>>
>> *Web developers*: Positive Commonly requested feature. Examples:
>> https://webwewant.fyi/wants/55/
>> https://github.com/WebAudio/web-speech-api/issues/108
>> https://stackoverflow.com/questions/49473369/offline-speech-recognition-in-browser
>> https://www.reddit.com/r/html5/comments/8jtv3u/offline_voice_recognition_without_the_webspeech/
>>
>> *Other signals*:
>>
>> WebView application risks
>>
>> *Does this intent deprecate or change behavior of existing APIs, such
>> that it has potentially high risk for Android WebView-based applications?*
>>
>> None
>>
>> Debuggability
>>
>> None
>>
>> Will this feature be supported on all six Blink platforms (Windows, Mac,
>> Linux, ChromeOS, Android, and Android WebView)?
>>
>> No
>>
>> Initially supported on Windows, Mac, and Linux with ChromeOS support to
>> follow.
>>
>> Is this feature fully tested by web-platform-tests
>> <https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_platform_tests.md>
>> ?
>>
>> No
>>
>> Why not? Is it tested otherwise?
>>
>> Flag name on about://flags
>>
>> None
>>
>> Finch feature name
>>
>> InstallOnDeviceSpeechRecognition,OnDeviceWebSpeechAvailable,OnDeviceWebSpeech
>>
>> Requires code in //chrome?
>>
>> False
>>
>> Estimated milestones
>>
>> Shipping on desktop
>>
>> 135
>>
>> Anticipated spec changes
>>
>> *Open questions about a feature may be a source of future web compat or
>> interop issues. Please list open issues (e.g. links to known github issues
>> in the project for the feature specification) whose resolution may
>> introduce web compat/interop risk (e.g., changing to naming or structure of
>> the API in a non-backward-compatible way).*
>>
>> https://github.com/WebAudio/web-speech-api/pull/122
>>
>> Link to entry on the Chrome Platform Status
>>
>> https://chromestatus.com/feature/6090916291674112?gate=4683906480340992
>>
>> This intent message was generated by Chrome Platform Status
>> <https://chromestatus.com/>.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "blink-dev" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to blink-dev+unsubscr...@chromium.org.
>> To view this discussion visit
>> https://groups.google.com/a/chromium.org/d/msgid/blink-dev/677c7f0e.2b0a0220.2e82a8.01f6.GAE%40google.com
>> <https://groups.google.com/a/chromium.org/d/msgid/blink-dev/677c7f0e.2b0a0220.2e82a8.01f6.GAE%40google.com?utm_medium=email&utm_source=footer>
>> .
>>
>> --
>> To view this discussion visit
>> https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAOmohSJFcq7nCbx372u8Qas0%3DUWbCUY9b37ak6fAN8CwGfFVcA%40mail.gmail.com
>> .

--
To view this discussion visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAOmohSJb0cmJS4MxC7sTAnXNtrOXdV601QoGa_pXwseJH4%2Bhcw%40mail.gmail.com.
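
For readers following the API-shape discussion in this thread, here is a rough sketch of what the agreed direction might look like from a caller's side: availability and install return promises, "ondevice-only" failures surface as an error event rather than a return value, and a mode change after start() leaves the running session alone. Only installOnDeviceSpeechRecognition, onDeviceWebSpeechAvailable and the "ondevice-only" value come from the thread. The argument lists, the boolean results, the "ondevice-preferred" value, the "language-not-supported" error code, and the RecognizerLike, FakeRecognizer and startOnDevice names are all assumptions made for illustration, and the in-memory fake stands in for a browser so the snippet can run anywhere.

```typescript
// Only installOnDeviceSpeechRecognition, onDeviceWebSpeechAvailable and the
// "ondevice-only" mode are named in the thread. Every other name, argument,
// return type and error code below is a hypothetical stand-in.
type RecognitionMode = "ondevice-only" | "ondevice-preferred";

interface RecognizerLike {
  mode: RecognitionMode;
  lang: string;
  onerror: ((event: { error: string }) => void) | null;
  onDeviceWebSpeechAvailable(lang: string): Promise<boolean>;
  installOnDeviceSpeechRecognition(lang: string): Promise<boolean>;
  start(): void;
}

// In-memory fake so the sketch runs outside a browser.
class FakeRecognizer implements RecognizerLike {
  mode: RecognitionMode = "ondevice-preferred";
  lang = "en-US";
  onerror: ((event: { error: string }) => void) | null = null;
  sessionMode: RecognitionMode | null = null; // mode captured by the active session
  private installed = new Set<string>();

  async onDeviceWebSpeechAvailable(lang: string): Promise<boolean> {
    return this.installed.has(lang);
  }

  async installOnDeviceSpeechRecognition(lang: string): Promise<boolean> {
    this.installed.add(lang); // a real browser would download a language pack here
    return true;
  }

  start(): void {
    if (this.mode === "ondevice-only" && !this.installed.has(this.lang)) {
      // start() returns nothing, so failure is reported through an error event.
      const handler = this.onerror;
      Promise.resolve().then(() => {
        if (handler) handler({ error: "language-not-supported" });
      });
      return;
    }
    // Attributes are read once here; later writes do not touch this session.
    this.sessionMode = this.mode;
  }
}

// Caller-side flow: check availability, install if needed, then start on-device.
async function startOnDevice(rec: RecognizerLike, lang: string): Promise<void> {
  if (!(await rec.onDeviceWebSpeechAvailable(lang))) {
    await rec.installOnDeviceSpeechRecognition(lang);
  }
  rec.lang = lang;
  rec.mode = "ondevice-only";
  rec.start();
}
```

Because both checks are awaited, the caller never blocks on disk reads or downloads, which is the concern raised about a synchronous installOnDeviceSpeechRecognition().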