Re: FW: on FPGAs vs ASICs

2005-03-22 Thread Tyler Durden
How much off-the-shelf crypto IP is available to be plopped on a crypto net
processor? Are there stego detection/cracking development kits and so on?

-TD



Re: FW: on FPGAs vs ASICs

2005-03-22 Thread Jack Lloyd
On Mon, Mar 21, 2005 at 06:34:07PM -0800, Major Variola (ret) wrote:

> Tangentially, I should note that there are "modes of encryption" which can be
> scaled infinitely with parallel hardware; they use interleaved blocks so each
> chip sees every Nth block of the real stream.  So high clock rates are not
> required to crypt.

Counter mode works this way, and is a fairly common mode in any case.
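
A minimal sketch of why counter mode interleaves so cleanly, assuming a
SHA-256-based stand-in for the block cipher (the Python standard library has
no AES). The names keystream_block, ctr_lane and ctr_interleaved are made up
for illustration. Each "lane" plays the part of one chip and only touches
blocks lane, lane+N, lane+2N, and so on; the result should match a plain
sequential pass.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 32  # keystream bytes per counter value (SHA-256 output size)

def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # ASSUMPTION: SHA-256(key || nonce || counter) stands in for a block
    # cipher call such as AES; it is illustrative only.
    return hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()

def xor(a: bytes, b: bytes) -> bytes:
    # zip() stops at the shorter input, which handles a short final block.
    return bytes(x ^ y for x, y in zip(a, b))

def ctr_sequential(key: bytes, nonce: bytes, data: bytes) -> bytes:
    out = bytearray()
    for off in range(0, len(data), BLOCK):
        out += xor(data[off:off + BLOCK],
                   keystream_block(key, nonce, off // BLOCK))
    return bytes(out)

def ctr_lane(key, nonce, data, lane, n_lanes):
    # One "chip": processes every n_lanes-th block, starting at block `lane`.
    n_blocks = (len(data) + BLOCK - 1) // BLOCK
    return {b: xor(data[b * BLOCK:(b + 1) * BLOCK],
                   keystream_block(key, nonce, b))
            for b in range(lane, n_blocks, n_lanes)}

def ctr_interleaved(key, nonce, data, n_lanes):
    # Lanes share nothing but the key, nonce and block index, so they can
    # run independently; threads here merely stand in for separate chips.
    with ThreadPoolExecutor(max_workers=n_lanes) as pool:
        parts = list(pool.map(
            lambda lane: ctr_lane(key, nonce, data, lane, n_lanes),
            range(n_lanes)))
    merged = {}
    for part in parts:
        merged.update(part)
    return b"".join(merged[b] for b in sorted(merged))
```

Since encryption and decryption are the same XOR, the receiver can pick a
different lane count from the sender in this sketch; only the key, nonce and
block numbering have to match.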

> It seems that hashing can be parallelized that way too, run a hash-chip on
> every Nth bit, and hash those partial results.  Both ends have to agree on
> the N-way division (as with the infinitely scalable crypto) but that's all.

Depending on the interconnect it would probably be faster to do it in blocks of
8-64k; doing it a bit at a time would eat your standard PCI bus alive.
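
A rough sketch of the interleaved-hash idea using chunk-sized pieces rather
than single bits, assuming SHA-256 for both the lane hashes and the final
hash. The function name interleaved_hash and the 8192-byte default are
illustrative choices, not anything standardised. The lane count and chunk
size are mixed into the final hash, which is one way of making "both ends
have to agree on the N-way division" explicit.

```python
import hashlib

def interleaved_hash(data: bytes, n_lanes: int, chunk: int = 8192) -> bytes:
    # Each lane is an independent hash state; on real hardware each one
    # could live on its own chip. Here they are simply updated in turn.
    lanes = [hashlib.sha256() for _ in range(n_lanes)]
    for idx, off in enumerate(range(0, len(data), chunk)):
        # Chunk number idx goes to lane idx mod N (every Nth chunk per lane).
        lanes[idx % n_lanes].update(data[off:off + chunk])

    # Hash the partial results together. ASSUMPTION: binding n_lanes, chunk
    # and the total length into the final hash is a design choice of this
    # sketch, so that a different division yields a different digest.
    top = hashlib.sha256()
    top.update(n_lanes.to_bytes(4, "big"))
    top.update(chunk.to_bytes(4, "big"))
    top.update(len(data).to_bytes(8, "big"))
    for lane in lanes:
        top.update(lane.digest())
    return top.digest()
```

Note that the digest is not the plain SHA-256 of the message, and two parties
using different values of N will not get matching results.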

There are message authentication modes which can scale 'infinitely' (assuming a
sufficiently long message), and don't depend on the number of functional units,
so for example I could generate a MAC using my regular single core CPU and you
could verify it on a machine with N functional units with a corresponding
speedup of N (modulo some fixed per-message overhead) without us having to
agree on anything in advance. For example there is the MAC used in Rogaway's
OCB. Unfortunately most (all?) of these algorithms have been patented.
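
To illustrate the "no agreement on N" property, here is a toy sketch that
XORs together one keyed value per block and then finalises the result. It is
NOT the MAC from OCB and has had no security review; it only shows why the
tag comes out the same however the blocks are divided among units. HMAC-SHA256
is used as the keyed function, and the names block_term, partial and mac are
hypothetical.

```python
import hashlib
import hmac

BLOCK = 64  # bytes per message block (arbitrary choice for this sketch)

def _prf(key: bytes, label: bytes, payload: bytes) -> bytes:
    # Keyed function with a label for domain separation.
    return hmac.new(key, label + payload, hashlib.sha256).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def block_term(key: bytes, index: int, block: bytes) -> bytes:
    # The block index is bound in so that reordering blocks changes the tag.
    return _prf(key, b"blk", index.to_bytes(8, "big") + block)

def partial(key: bytes, data: bytes, lane: int, n_units: int) -> bytes:
    # Work done by one functional unit: every n_units-th block from `lane`.
    acc = bytes(32)
    n_blocks = (len(data) + BLOCK - 1) // BLOCK
    for i in range(lane, n_blocks, n_units):
        acc = xor(acc, block_term(key, i, data[i * BLOCK:(i + 1) * BLOCK]))
    return acc

def mac(key: bytes, data: bytes, n_units: int = 1) -> bytes:
    # XOR is commutative and associative, so the combined value does not
    # depend on how many units there were or which blocks each one saw.
    acc = bytes(32)
    for lane in range(n_units):
        acc = xor(acc, partial(key, data, lane, n_units))
    # The fixed per-message overhead: one final keyed call over the
    # message length and the accumulated value.
    return _prf(key, b"fin", len(data).to_bytes(8, "big") + acc)
```

A sender could call mac(key, msg) on a single core and a verifier could call
mac(key, msg, 16) and compare with hmac.compare_digest; in this sketch the
two tags are identical by construction.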

-Jack



Re: FW: on FPGAs vs ASICs

2005-03-21 Thread Major Variola (ret)
At 05:44 PM 3/20/05 -0500, Tyler Durden wrote:
>What I suspect is that there's already some crypto net processors out
>there, though they may be classified, or the commercial equivalent (ie, I
>assume there are 'classified' catalogs from companies like General Dynamics
>that normal clients never see).

I've programmed (well, microcoded) the Intel IXA family.  Some variants
of that family can do line-rate AES.  They can handle insane line rates,
thanks to hardware everything and an array of hyperthreaded RISCs.  Not
at all classified.


At 09:49 AM 3/21/05 -0500, Trei, Peter wrote:
>One of the interesting twists of FPGAs is that you can
>optimize the circuit to the actual data being processed.
>For example, in DES keysearch you could hardwire into
>the circuit some of the subkey bits (which were determined
>by, say, high order key bits you rarely changed), thus
>simplifying the circuit. When those bits changed, you
>re-wrote the circuit.

It's quite possible that reconfigurability is part of the future.
Your N-way x86 die will come with a few hundred thou reconfigurable
gates, which you'll reconfigure to do your Photoshop or MPEG
or rendering or speech recognition or modular exponentiation
tasks.  Obviously this is a big change and there's a lot of software
support required (from OS to app) to make it happen.  Also
there are fascinating tech problems in coupling the reconfig hardware
to the high-bandwidth data flows required to keep it busy.  But the
benefits are substantial.

Tangentially, I should note that there are "modes of encryption" which
can be scaled infinitely with parallel hardware; they use interleaved
blocks so each chip sees every Nth block of the real stream.  So high
clock rates are not required to crypt.

It seems that hashing can be parallelized that way too: run a hash-chip
on every Nth bit, and hash those partial results.  Both ends have to agree
on the N-way division (as with the infinitely scalable crypto) but
that's all.
With regular hashing (and attacks thereof that require grinding out a lot
of hashes in order to find a collision, to go back to the original topic)
single-chip parallel hardware hacks could speed things up, but (given
that modern hashes are designed for CPUs, like AES) I don't ever expect
to see DESCrack-like gains there.

And while TD keeps alluding to the DESCrack suitcase, I'll point out
that a GSM cracker could fit in your carry-on luggage nowadays.  Every
'embassy' ought to have one :-)








FW: on FPGAs vs ASICs

2005-03-21 Thread Trei, Peter

From Major Variola (ret):

> Tyler, Riad, etc:
>
> FPGAs are used in telecom because the volumes do not support an ASIC
> run.  Riad doesn't seem to appreciate this.  He does understand that an
> ASIC is more efficient because its gates are used only for 1 computation,
> rather than most (FPGA) gates being used for reconfigurability ---useful
> if you can't afford an ASIC run (a million bucks a mask...) or if
> algorithms get tweaked (eg you release before the Spec comes out, or you
> are shooting for time-to-market).  Clock-wise, an FPGA wastes time in
> extra wire routing, although since an FPGA may be made in state of the
> art processes, and your ASIC may not, it's a complex tradeoff.  (Albeit
> some circuit topologies work very well on FPGAs.)
>
> So for the Cypherpunk wanting hardware (vs cluster) acceleration, FPGAs
> are the way to go.  For TLAs, you prototype in FPGAs of course, and
> then make some chips in your private fab.  (Same for Broadcom, etc.)
>
> For someone making 10,000 routers, you use FPGAs.
>
> DESCrack was solving a problem that the x86 is not very efficient
> at computing --all the sub-byte bit-diddling-- and hardware is very
> efficient (by design in DES, after all).

Indeed, during the initial DESCrack effort, I spent some time
investigating FPGAs. I came to the conclusion that it was
definitely possible to build a Wiener-style pipeline machine
(ie, one key tested per clock cycle), but it would be more
costly than I could afford.

One of the interesting twists of FPGAs is that you can
optimize the circuit to the actual data being processed. 
For example, in DES keysearch you could hardwire into
the circuit some of the subkey bits (which were determined
by, say, high order key bits you rarely changed), thus
simplifying the circuit. When those bits changed, you
re-wrote the circuit.

Peter Trei