On 1/31/14, Daniel Kahn Gillmor <[email protected]> wrote:

> i'd be curious to know what other people's intuitions are on this.  I
> haven't done signal-processing work in years, and i know very little
> about speech timing analysis mechanisms.

If (a) the users will tolerate enough latency for the MITM box to
buffer about a word of speech, and (b) the attacker can get some
advance information about the users' dialect, it sounds like a fun
problem for a grad student.

My understanding is that the main cue in speaker recognition is the
‘glottal pulse’ (which can easily be extracted from any voiced
phoneme), and that the next thing a human listener would most likely
notice is general pronunciation (e.g. General American vs. a New York
accent vs. British Received Pronunciation).  Once you know how the
user speaks in general, just detect the magic word and stomp it with a
synthesized replacement.
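
To give a feel for the "easily extracted" part, here is a minimal
sketch that estimates only the rate of the glottal pulse train (the
fundamental frequency) of one voiced frame by plain autocorrelation.
It says nothing about pulse shape, and it is a textbook technique
chosen for illustration, not necessarily what a real attack would use.
The function name, the 60-400 Hz search range, and the synthetic test
frame are all assumptions.

```python
import math

def estimate_pitch_hz(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency of a voiced frame (hypothetical helper).

    Picks the lag with the largest autocorrelation inside the lag range
    that corresponds to fmin..fmax (an assumed range for adult speech).
    Returns 0.0 if no lag has positive correlation.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove DC offset
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), n - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

# Synthetic stand-in for a voiced phoneme: 125 Hz fundamental plus two
# weaker harmonics, 50 ms at 8 kHz.  Real speech is far messier.
sample_rate = 8000
frame = [
    math.sin(2 * math.pi * 125 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 250 * t / sample_rate)
    + 0.25 * math.sin(2 * math.pi * 375 * t / sample_rate)
    for t in range(400)
]
pitch = estimate_pitch_hz(frame, sample_rate)
print(f"estimated pitch: {pitch:.1f} Hz")
```

On this clean synthetic frame the estimate should land close to
125 Hz; on real speech you would also need voicing detection and some
smoothing across frames.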


Robert Ransom
_______________________________________________
Messaging mailing list
[email protected]
https://moderncrypto.org/mailman/listinfo/messaging
