Steven M. Bellovin writes:
> The problem, from the perspective of an intelligence agency, is figuring out
> what to listen to. Let's do some arithmetic.
>
> The product you cite requires at least a 133 Mhz Pentium; 200 Mhz preferred.
> How many such chips are needed? Well, according to a map on a wall near my
Ahem. DSPs are very cheap (like <$10) in large quantities, and are
roughly comparable to or better than a P200, especially for numerical
purposes and if properly optimized. From a certain scale onward (I'm
sure that Echelon exceeds the threshold easily) you can use ASICs, at
least for the most computationally intensive algorithm stages, which
makes it even cheaper (also in terms of footprint and
power/air conditioning, which can be surprisingly expensive). Then of
course you can record to hard disk a rolling window of every current
conversation (which can be as deep as a whole call: compressed voice
is low-bandwidth and hard drives are cheap), while sifting for certain
keywords (with recent neural-network hardware, spotting keywords from
a limited vocabulary or picking out specific speakers should be a
piece of cake). If no hits occur, you discard the
call. (Or process random calls if there is too much slack in the
pool). If the hits are over threshold, you keep the call window
recorded so far, intercept the rest of the call, then submit the
job for (computationally expensive) high-fidelity crunching (which is
probably not better than the best commercially available), which may
give you -- how much? -- maybe a 90-95% recognition rate. If the AI
(which can be something trained on a large real-world case knowledge
base) deems the conversation to be hot, it passes on the transcript to
a human. If you swamp the humans' (limited, expensive, slow to expand)
processing capabilities, you tweak the threshold until they can just
cope.
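
The thresholding step above can be sketched roughly as follows. This is
only an illustration of the idea, not a description of any real system:
the Call record, the keyword_hits field (standing in for the output of
the cheap keyword-spotting stage), and the function names are all
made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Call:
    call_id: str
    keyword_hits: int  # hypothetical score from the cheap keyword-spotting stage


def triage(calls, threshold):
    """Keep calls whose hit count is over the threshold; the rest are discarded."""
    return [c for c in calls if c.keyword_hits > threshold]


def tune_threshold(calls, analyst_capacity, start=0):
    """Raise the threshold until the kept calls no longer swamp the analysts."""
    if analyst_capacity < 0:
        raise ValueError("analyst_capacity must be non-negative")
    threshold = start
    # Terminates: once the threshold reaches the largest hit count,
    # nothing is kept and the loop condition is false.
    while len(triage(calls, threshold)) > analyst_capacity:
        threshold += 1
    return threshold


# Made-up sample data, purely for illustration.
sample = [Call(f"call-{i}", hits) for i, hits in enumerate([0, 1, 2, 3, 5, 8])]
chosen = tune_threshold(sample, analyst_capacity=2)
kept = triage(sample, chosen)
print(chosen, [c.call_id for c in kept])
```

Only the calls that survive triage would go on to the expensive
high-fidelity stage; everything else is dropped after the cheap pass.
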
Of course there are certain people (crypto activists, politicians,
criminals, CEOs etc.) whose conversations get filed automatically.
I don't think the fact that most of Echelon is about
industrial espionage is a big secret.
Even if the above isn't true (yet), it is imo sound practice to assume
the worst case.
> office (see http://www.telegeography.com/Publications/cmap99.html), there are
> currently about 150 Gbps worth of fiber across the Atlantic. That's about 2.7
> million potential phone channels. A lot of that is data, of course -- shall
> we say 75%? That still leaves us with ~675K simultaneous calls. That's an
> awful lot of CPU power, even by NSA's standards.
Afaik the current telephone system can't work if more than (order of
magnitude) ~10% of customers attempt to speak simultaneously. The
world population might be 5-6 billion, but most of them don't have
phones. The number of international calls must be but a tiny fraction
of total calls.
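
For reference, the quoted arithmetic can be reproduced with a few
lines. The per-channel bit rate is my assumption: the quoted text does
not state it, but 56 kbit/s gives the quoted ~2.7 million channels
(64 kbit/s would give about 2.3 million). The 150 Gbps capacity and the
75% data share are taken from the quoted text.

```python
def phone_channels(capacity_bps, channel_bps):
    """Number of voice channels that fit into a given raw capacity."""
    return capacity_bps / channel_bps


def voice_calls(channels, data_fraction):
    """Channels left for voice after subtracting the share used for data."""
    return channels * (1.0 - data_fraction)


CAPACITY_BPS = 150e9   # transatlantic fiber capacity, from the quoted text
CHANNEL_BPS = 56e3     # assumption: bit rate of one phone channel
DATA_FRACTION = 0.75   # share of capacity carrying data, from the quoted text

channels = phone_channels(CAPACITY_BPS, CHANNEL_BPS)
calls = voice_calls(channels, DATA_FRACTION)

print(f"potential channels: {channels / 1e6:.1f} million")
print(f"simultaneous voice calls: {calls / 1e3:.0f}K")
```

Rounding the channel count to 2.7 million before taking 25% gives the
quoted ~675K; without the intermediate rounding it comes out slightly
lower. Either way this is an upper bound on simultaneous calls, not an
estimate of actual traffic.
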
> And it gets worse -- within a year, the FLAG and TAT-14 cables will come
> online, adding at least 800 Gbps of capacity...
As far as I know the transatlantic fibers are mostly idling. There
aren't that many humans around who want to make international
calls. I'd surmise the growth saturates sooner or later. The fraction
of voice vs. data will be increasingly skewed towards data.
Of course currently voice is typically much more important than data...
> Tentative conclusion: they need to listen to the signaling channels, so that
> they can focus their efforts. *Then* they can do the voice recognition and
> pattern-matching tricks.
Afaik focusing has been intelligence's basic tenet since time immemorial.
> --Steve Bellovin
>
>