+cc [email protected]

On Thu, Jan 15, 2026 at 12:32 PM Chromestatus <
[email protected]> wrote:

> *Contact emails*
> [email protected]
>
> *Explainer*
>
> https://docs.google.com/document/d/1fLqaTipW1AcHRwoa7V88Z_SJ52w_xV2lObLq-Sqaw5Q/edit?usp=sharing
>
> *Specification*
> https://webaudio.github.io/web-speech-api
>
> *Summary*
> Extends the SpeechRecognition interface by adding a quality property to
> SpeechRecognitionOptions. This allows developers to specify the semantic
> capability required for on-device recognition (requested via processLocally:
> true). The proposed quality enum supports three levels, 'command',
> 'dictation', and 'conversation', mapping to increasing task complexity and
> hardware requirements. This lets developers determine whether the local
> device can handle high-stakes use cases (such as meeting transcription) or
> whether they should fall back to cloud services, solving the current "black
> box" issue of on-device model capabilities.
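>
> A rough sketch of the intended usage, for illustration only. It uses the
> spec's SpeechRecognition.available() static method and processLocally
> attribute from the existing on-device recognition support; the quality
> member and its values are just the shape proposed here and may change:
>
>   const recognition = new SpeechRecognition();
>   recognition.lang = 'en-US';
>
>   // Ask whether the local model meets a conversation-grade quality floor.
>   const availability = await SpeechRecognition.available({
>     langs: ['en-US'],
>     processLocally: true,
>     quality: 'conversation',  // proposed: 'command' | 'dictation' | 'conversation'
>   });
>
>   if (availability === 'available') {
>     // The on-device model is capable enough; keep audio local.
>     recognition.processLocally = true;
>   } else {
>     // Otherwise fall back to cloud-based recognition (or prompt a model
>     // download and retry later).
>     recognition.processLocally = false;
>   }
>   recognition.start();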
>
> *Blink component*
> Blink>Speech
> <https://issues.chromium.org/issues?q=customfield1222907:%22Blink%3ESpeech%22>
>
> *Web Feature ID*
> speech-recognition <https://webstatus.dev/features/speech-recognition>
>
> *Motivation*
> While the introduction of processLocally: true was a significant step for
> privacy and latency, it currently treats all on-device models as
> functionally equivalent. In reality, on-device capabilities are highly
> fragmented: a lightweight model optimized for simple voice commands (e.g.,
> "turn on the lights") is often insufficient for high-stakes use cases like
> video conferencing transcription or accessibility captioning, which require
> handling continuous speech, multiple speakers, and background noise.
> Because developers currently have no way to verify the semantic capability
> of the local model, they must blindly trust the device or default to
> cloud-based recognition to guarantee a minimum user experience. This lack
> of transparency forces developers to bypass on-device capabilities for
> high-end use cases, effectively negating the privacy and bandwidth benefits
> of the API. There is a critical need for a mechanism that lets applications
> declare their required "floor" of utility (e.g., conversation-grade
> accuracy) so they can confidently use local processing.
>
> *Initial public proposal*
> https://github.com/WebAudio/web-speech-api/issues/182
>
> *Requires code in //chrome?*
> True
>
> *Tracking bug*
> https://g-issues.chromium.org/issues/476168420
>
> *Estimated milestones*
>
> No milestones specified
>
>
> *Link to entry on the Chrome Platform Status*
> https://chromestatus.com/feature/5136859632107520?gate=6299914562830336
>
> This intent message was generated by Chrome Platform Status
> <https://chromestatus.com>.
>
