Re: [EXTERNAL] Re: [blink-dev] Intent to Ship: On-device Web Speech API

Yoav Weiss (@Shopify) Tue, 07 Jan 2025 19:20:09 -0800

On Tue, Jan 7, 2025 at 9:50 PM Evan Liu <ev...@google.com> wrote:

> * Are the resources downloaded partitioned per top-level site? What should
>> typical download sizes be?
>
> This depends on the browser--for Chrome on Windows/Mac/Linux, there's only
> one instance of each on-device speech recognition language pack and each
> language pack is ~60MB. The spec doesn't necessarily dictate how the
> downloads are handled, only that websites should be allowed to trigger a
> download (or request a download) of a language.
>


This seems like it'd require, at the very least, some extra considerations
in the Privacy & Security section of the spec.
It would also be good for the spec to state explicitly that this is an
implementation-defined decision.

+Domenic Denicola <dome...@chromium.org> who's been working on similar
privacy models related to translations, and can potentially advise you on
the best path there.


> Links to the minutes would be helpful. Filing official positions would be
>> even better.
>
> I've filed official positions for Mozilla
> <https://github.com/mozilla/standards-positions/issues/1157> and WebKit
> <https://github.com/WebKit/standards-positions/issues/443>.
>
> Why not? Is it tested otherwise?
>
> Oops, I forgot to check that box. This feature is testable by
> web-platform-tests.
>
> It’s implied that installOnDeviceSpeechRecognition() happens
>> synchronously. Making this a blocking call seems problematic since it could
>> involve a fetch and a download. I’d expect it to return a Promise (
>> https://www.w3.org/TR/design-principles/#promises). And
>> onDeviceWebSpeechAvailable should probably also be async since it could
>> involve reading data from disk.
>
> Totally agree--the implementations of those two APIs in Chrome return
> promises. I'll make sure the spec reflects this.
>
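
For illustration, a minimal sketch of what a promise-based shape could look
like. The thread only establishes that both calls return promises; the
parameter and result types, the OnDeviceSpeech interface, the
FakeOnDeviceSpeech stand-in and the ensureOnDevice helper are all assumptions
made up for this example, not taken from the spec or from Chrome.

```typescript
// Hypothetical shape: only "returns a promise" comes from the discussion.
// The argument and result types are assumptions for illustration.
interface OnDeviceSpeech {
  onDeviceWebSpeechAvailable(lang: string): Promise<boolean>;
  installOnDeviceSpeechRecognition(lang: string): Promise<boolean>;
}

// In-memory stand-in so the sketch runs outside a browser.
class FakeOnDeviceSpeech implements OnDeviceSpeech {
  private installed = new Set<string>();
  private downloadable: Set<string>;

  constructor(downloadable: string[]) {
    this.downloadable = new Set(downloadable);
  }

  // A real implementation might read from disk here, hence the promise.
  async onDeviceWebSpeechAvailable(lang: string): Promise<boolean> {
    return this.installed.has(lang);
  }

  // A real implementation might fetch a language pack here.
  async installOnDeviceSpeechRecognition(lang: string): Promise<boolean> {
    if (!this.downloadable.has(lang)) return false;
    this.installed.add(lang);
    return true;
  }
}

// Resolves to true once the language is usable on-device, without ever
// blocking the caller on the availability check or the download.
async function ensureOnDevice(api: OnDeviceSpeech, lang: string): Promise<boolean> {
  if (await api.onDeviceWebSpeechAvailable(lang)) return true;
  return api.installOnDeviceSpeechRecognition(lang);
}
```

A caller would await ensureOnDevice(api, "en-US") before starting
recognition and fall back to another path when it resolves to false.
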
> The SpeechRecognitionMode "ondevice-only" value is only defined by a
>> comment in the IDL stating that it “Returns an error if on-device speech
>> recognition is not available”. What specifically returns an error?
>> SpeechRecognition.start() doesn’t return any value, and in other error
>> conditions the behavior is to fire SpeechRecognitionErrorEvent. Also, what
>> should the behavior be if SpeechRecognitionMode is changed after start()
>> has already been called?
>
> Ah yeah, I'll update that comment to clarify that it fires a
> SpeechRecognitionErrorEvent. Updating the SpeechRecognitionMode after
> start() has been called has no effect on the existing session. This is
> consistent with how other SpeechRecognition attributes work (i.e. lang,
> maxAlternatives, etc.). This isn't explicitly stated anywhere in the spec,
> so I'll file a spec issue to clarify this as well.
>
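
A small model of the two behaviors described above, as I read them: start()
returns nothing and reports failure through an error callback, and attribute
writes after start() don't affect the running session. FakeRecognition, the
"placeholder-default" mode value, the "on-device-unavailable" error string
and activeMode() are invented for this sketch. It also calls the handler
synchronously for brevity, whereas the real API fires an event.

```typescript
// Only "ondevice-only" comes from the thread; the second value is a stand-in.
type Mode = "ondevice-only" | "placeholder-default";

class FakeRecognition {
  mode: Mode = "placeholder-default";
  lang = "en-US";
  onerror: ((error: string) => void) | null = null;

  private session: { mode: Mode; lang: string } | null = null;
  private onDeviceLangs: Set<string>;

  constructor(onDeviceLangs: string[]) {
    this.onDeviceLangs = new Set(onDeviceLangs);
  }

  // Like start(), this returns no value; failures go through onerror.
  start(): void {
    // Snapshot the attributes so later writes do not touch this session.
    const snapshot = { mode: this.mode, lang: this.lang };
    if (snapshot.mode === "ondevice-only" && !this.onDeviceLangs.has(snapshot.lang)) {
      this.session = null;
      // Hypothetical error name, reported via the handler rather than thrown.
      this.onerror?.("on-device-unavailable");
      return;
    }
    this.session = snapshot;
  }

  // Hypothetical helper exposing the mode the current session started with.
  activeMode(): Mode | null {
    return this.session ? this.session.mode : null;
  }
}
```
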
> As for mitigating privacy and fingerprinting risks, we've been
> collaborating with the team building the Translator API
> <https://chromestatus.com/feature/5172811302961152> feature which also
> has the ability to download and detect language packs. Because the risks
> between these two features are nearly identical, on-device speech
> recognition language pack downloads will follow the same pattern and use
> the same permissions UI as on-device translation language packs. Here are
> some helpful links:
> Privacy Design Doc
>

I don't think that's a link.


> Translator API Developer Docs
> <https://developer.chrome.com/docs/ai/translator-api>
> Github Issue on Preventing Fingerprinting
> <https://github.com/webmachinelearning/translation-api/issues/3>
>
> Thanks,
> Evan
>
>
> On Tue, Jan 7, 2025 at 10:34 AM Daniel Clark <dan...@microsoft.com> wrote:
>
>> Adding to Yoav’s feedback about the spec:
>>
>>    - It’s implied that installOnDeviceSpeechRecognition() happens
>>    synchronously. Making this a blocking call seems problematic since it
>>    could involve a fetch and a download. I’d expect it to return a Promise (
>>    https://www.w3.org/TR/design-principles/#promises). And
>>    onDeviceWebSpeechAvailable should probably also be async since it could
>>    involve reading data from disk.
>>    - The SpeechRecognitionMode "ondevice-only" value is only defined by
>>    a comment in the IDL stating that it “Returns an error if on-device speech
>>    recognition is not available”. What specifically returns an error?
>>    SpeechRecognition.start() doesn’t return any value, and in other error
>>    conditions the behavior is to fire SpeechRecognitionErrorEvent. Also, what
>>    should the behavior be if SpeechRecognitionMode is changed after start()
>>    has already been called?
>>
>>
>>
>> I also wonder if this should have a TAG review, especially given the
>> privacy/fingerprinting implications of websites being able to query which
>> on-device models are available.
>>
>>
>>
>> -- Dan Clark
>>
>>
>>
>> *From:* Yoav Weiss (@Shopify) <yoavwe...@chromium.org>
>> *Sent:* Tuesday, January 7, 2025 12:29 AM
>> *To:* Chromestatus <ad...@cr-status.appspotmail.com>
>> *Cc:* blink-dev@chromium.org; ev...@google.com
>> *Subject:* [EXTERNAL] Re: [blink-dev] Intent to Ship: On-device Web
>> Speech API
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Jan 7, 2025 at 2:10 AM Chromestatus <
>> ad...@cr-status.appspotmail.com> wrote:
>>
>> Contact emails
>>
>> ev...@google.com
>> Explainer
>>
>> https://github.com/WebAudio/web-speech-api/pull/122
>>
>>
>>
>> An actual explainer with usage examples would've been useful.
>>
>> Also, the spec is not very detailed:
>>
>> * It seems to be triggering resource downloads, but Fetch
>> <https://fetch.spec.whatwg.org/> integration is not specified.
>>
>> * Are the resources downloaded partitioned per top-level site? What
>> should typical download sizes be?
>>
>>
>>
>>
>> Specification
>>
>> https://webaudio.github.io/web-speech-api
>> Summary
>>
>> This feature adds on-device speech recognition support to the Web Speech
>> API, allowing websites to ensure that neither audio nor transcribed speech
>> are sent to a third-party service for processing. Websites can query the
>> availability of on-device speech recognition for specific languages, prompt
>> users to install the necessary resources for on-device speech recognition,
>> and choose between on-device or cloud-based speech recognition as needed.
>>
>>
>> Blink component
>>
>> Blink>Speech
>> <https://issues.chromium.org/issues?q=customfield1222907:%22Blink%3ESpeech%22>
>> Search tags
>>
>> speech <http://features#tags:speech>, recognition
>> <http://features#tags:recognition>, local <http://features#tags:local>,
>> offline <http://features#tags:offline>, on-device
>> <http://features#tags:on-device>
>> TAG review
>>
>> None
>> TAG review status
>>
>> Pending
>> Risks
>>
>>
>> Interoperability and Compatibility
>>
>> None
>>
>>
>>
>> *Gecko*: Positive. Discussed at TPAC 2024 with representatives from
>> Mozilla, including Paul Adenot.
>>
>> *WebKit*: Positive. Discussed at TPAC 2024 with representatives from
>> Apple, including Eric Carlson.
>>
>>
>>
>> Links to the minutes would be helpful. Filing official positions would be
>> even better.
>>
>>
>>
>>
>>
>> *Web developers*: Positive. Commonly requested feature. Examples:
>> https://webwewant.fyi/wants/55/
>> https://github.com/WebAudio/web-speech-api/issues/108
>> https://stackoverflow.com/questions/49473369/offline-speech-recognition-in-browser
>> https://www.reddit.com/r/html5/comments/8jtv3u/offline_voice_recognition_without_the_webspeech/
>>
>> *Other signals*:
>> WebView application risks
>>
>> *Does this intent deprecate or change behavior of existing APIs, such
>> that it has potentially high risk for Android WebView-based applications?*
>>
>> None
>>
>>
>> Debuggability
>>
>> None
>>
>>
>> Will this feature be supported on all six Blink platforms (Windows, Mac,
>> Linux, ChromeOS, Android, and Android WebView)?
>>
>> No
>>
>> Initially supported on Windows, Mac, and Linux with ChromeOS support to
>> follow.
>>
>>
>> Is this feature fully tested by web-platform-tests
>> <https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_platform_tests.md>?
>>
>> No
>>
>>
>>
>> Why not? Is it tested otherwise?
>>
>>
>> Flag name on about://flags
>>
>> None
>> Finch feature name
>>
>> InstallOnDeviceSpeechRecognition,OnDeviceWebSpeechAvailable,OnDeviceWebSpeech
>>
>> Requires code in //chrome?
>>
>> False
>> Estimated milestones
>>
>> Shipping on desktop
>>
>> 135
>>
>>
>> Anticipated spec changes
>>
>> *Open questions about a feature may be a source of future web compat or
>> interop issues. Please list open issues (e.g. links to known github issues
>> in the project for the feature specification) whose resolution may
>> introduce web compat/interop risk (e.g., changing the naming or structure of
>> the API in a non-backward-compatible way).*
>>
>> https://github.com/WebAudio/web-speech-api/pull/122
>> Link to entry on the Chrome Platform Status
>>
>> https://chromestatus.com/feature/6090916291674112?gate=4683906480340992
>>
>> This intent message was generated by Chrome Platform Status
>> <https://chromestatus.com/>.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "blink-dev" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to blink-dev+unsubscr...@chromium.org.
>> To view this discussion visit
>> https://groups.google.com/a/chromium.org/d/msgid/blink-dev/677c7f0e.2b0a0220.2e82a8.01f6.GAE%40google.com
>> <https://groups.google.com/a/chromium.org/d/msgid/blink-dev/677c7f0e.2b0a0220.2e82a8.01f6.GAE%40google.com?utm_medium=email&utm_source=footer>
>> .
>>
>>
>

