Re: FW: on FPGAs vs ASICs
How much off-the-shelf crypto IP is available to be plopped on a crypto net processor? Are there stego detection/cracking development kits and so on?

-TD

From: "Major Variola (ret)" <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Subject: Re: FW: on FPGAs vs ASICs
Date: Mon, 21 Mar 2005 18:34:07 -0800

At 05:44 PM 3/20/05 -0500, Tyler Durden wrote:
>What I suspect is that there's already some crypto net processors out there,
>though they may be classified, or the commercial equivalent (ie, I assume
>there are 'classified' catalogs from companies like General Dynamics that
>normal clients never see).

I've programmed (well, microcoded) the Intel IXA family. Some variants of that family can do line-rate AES. They can handle insane line rates, thanks to hardware everything and an array of hyperthreaded RISCs. Not at all classified.

At 09:49 AM 3/21/05 -0500, Trei, Peter wrote:
>One of the interesting twists of FPGAs is that you can
>optimize the circuit to the actual data being processed.
>For example, in DES keysearch you could hardwire into
>the circuit some of the subkey bits (which were determined
>by, say, high order key bits you rarely changed), thus
>simplifying the circuit. When those bits changed, you
>re-wrote the circuit.

It's quite possible that reconfigurability is part of the future. Your N-way x86 die will come with a few hundred thou reconfigurable gates, which you'll reconfigure to do your Photoshop or MPEG or rendering or speech recognition or modular exponentiation tasks. Obviously this is a big change and there's a lot of software support required (from OS to app) to make it happen. Also there are fascinating tech problems in coupling the reconfig hardware to high bandwidth data flows, required to keep it busy. But the benefits are substantial.
Tangentially, I should note that there are "modes of encryption" which can be scaled infinitely with parallel hardware; they use interleaved blocks so each chip sees every Nth block of the real stream. So high clock rates are not required to crypt.

It seems that hashing can be parallelized that way too, run a hash-chip on every Nth bit, and hash those partial results. Both ends have to agree on the N-way division (as with the infinitely scalable crypto) but that's all.

With regular hashing (and attacks thereof that require grinding out a lot of hashes in order to find a collision, to go back to the original topic) single-chip parallel hardware hacks could speed things up, but (given that modern hashes, like AES, are designed for CPUs) I don't ever expect to see DESCrack-like gains there.

And while TD keeps alluding to the DESCrack suitcase, I'll point out that a GSM Cracker could fit in your carry-on luggage nowadays. Every 'embassy' ought to have one :-)
Re: FW: on FPGAs vs ASICs
On Mon, Mar 21, 2005 at 06:34:07PM -0800, Major Variola (ret) wrote:
> Tangentially, I should note that there are "modes of encryption" which can be
> scaled infinitely with parallel hardware; they use interleaved blocks so each
> chip sees every Nth block of the real stream. So high clock rates are not
> required to crypt.

Counter mode works this way, and is a fairly common mode in any case.

> It seems that hashing can be parallelized that way too, run a hash-chip on
> every Nth bit, and hash those partial results. Both ends have to agree on
> the N-way division (as with the infinitely scalable crypto) but that's all.

Depending on the interconnect it would probably be faster to do it in blocks of 8-64k; doing it a bit at a time would eat your standard PCI bus alive.

There are message authentication modes which can scale 'infinitely' (assuming a sufficiently long message), and don't depend on the number of functional units. For example, I could generate a MAC using my regular single core CPU and you could verify it on a machine with N functional units with a corresponding speedup of N (modulo some fixed per-message overhead) without us having to agree on anything in advance. One example is the MAC used in Rogaway's OCB. Unfortunately most (all?) of these algorithms have been patented.

-Jack
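The interleaving described above can be sketched in a few lines. This is only an illustration of the data flow, not a usable cipher: it assumes SHA-256 over key, nonce and counter as a stand-in keystream function (the Python standard library has no AES), and the names `keystream_block`, `ctr_sequential` and `ctr_interleaved` are made up for the example. The point is that block i depends only on counter i, so N units can each take every Nth block and the result matches a single sequential pass.

```python
import hashlib

BLOCK = 16  # bytes per block (the AES block size; the PRF below is a stand-in)


def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # ASSUMPTION: SHA-256(key || nonce || counter) stands in for AES_k(nonce || counter).
    # It only illustrates that each keystream block depends on its own counter.
    return hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()[:BLOCK]


def xor_block(data: bytes, ks: bytes) -> bytes:
    # zip() stops at the shorter input, so a short final block is handled.
    return bytes(a ^ b for a, b in zip(data, ks))


def ctr_sequential(key: bytes, nonce: bytes, msg: bytes) -> bytes:
    # One unit walks the blocks in order.
    out = bytearray()
    for off in range(0, len(msg), BLOCK):
        out += xor_block(msg[off:off + BLOCK], keystream_block(key, nonce, off // BLOCK))
    return bytes(out)


def ctr_interleaved(key: bytes, nonce: bytes, msg: bytes, n_units: int) -> bytes:
    # Each pass of the outer loop models one chip that sees every Nth block.
    blocks = [msg[off:off + BLOCK] for off in range(0, len(msg), BLOCK)]
    out = [b""] * len(blocks)
    for unit in range(n_units):
        for idx in range(unit, len(blocks), n_units):
            out[idx] = xor_block(blocks[idx], keystream_block(key, nonce, idx))
    return b"".join(out)
```

Here the units run one after another in a loop; in hardware they would run concurrently, and since no unit reads another unit's output, the ordering between them does not matter.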
Re: FW: on FPGAs vs ASICs
At 05:44 PM 3/20/05 -0500, Tyler Durden wrote:
>What I suspect is that there's already some crypto net processors out there,
>though they may be classified, or the commercial equivalent (ie, I assume
>there are 'classified' catalogs from companies like General Dynamics that
>normal clients never see).

I've programmed (well, microcoded) the Intel IXA family. Some variants of that family can do line-rate AES. They can handle insane line rates, thanks to hardware everything and an array of hyperthreaded RISCs. Not at all classified.

At 09:49 AM 3/21/05 -0500, Trei, Peter wrote:
>One of the interesting twists of FPGAs is that you can
>optimize the circuit to the actual data being processed.
>For example, in DES keysearch you could hardwire into
>the circuit some of the subkey bits (which were determined
>by, say, high order key bits you rarely changed), thus
>simplifying the circuit. When those bits changed, you
>re-wrote the circuit.

It's quite possible that reconfigurability is part of the future. Your N-way x86 die will come with a few hundred thou reconfigurable gates, which you'll reconfigure to do your Photoshop or MPEG or rendering or speech recognition or modular exponentiation tasks. Obviously this is a big change and there's a lot of software support required (from OS to app) to make it happen. Also there are fascinating tech problems in coupling the reconfig hardware to high bandwidth data flows, required to keep it busy. But the benefits are substantial.

Tangentially, I should note that there are "modes of encryption" which can be scaled infinitely with parallel hardware; they use interleaved blocks so each chip sees every Nth block of the real stream. So high clock rates are not required to crypt.

It seems that hashing can be parallelized that way too, run a hash-chip on every Nth bit, and hash those partial results. Both ends have to agree on the N-way division (as with the infinitely scalable crypto) but that's all.
With regular hashing (and attacks thereof that require grinding out a lot of hashes in order to find a collision, to go back to the original topic) single-chip parallel hardware hacks could speed things up, but (given that modern hashes, like AES, are designed for CPUs) I don't ever expect to see DESCrack-like gains there.

And while TD keeps alluding to the DESCrack suitcase, I'll point out that a GSM Cracker could fit in your carry-on luggage nowadays. Every 'embassy' ought to have one :-)
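A rough sketch of the N-way hashing idea, under stated assumptions: it interleaves by fixed-size chunks rather than single bits (8 KiB here, picked arbitrarily), uses SHA-256 for both the lanes and the final combine, and mixes N, the chunk size and the message length into the top-level hash so that both ends must agree on the division. The function name `interleaved_hash` and all of those choices are illustrative, not a vetted construction and not anything specified in this thread.

```python
import hashlib


def interleaved_hash(msg: bytes, n_units: int, chunk: int = 8192) -> bytes:
    # One running hash per modelled chip ("lane").
    lanes = [hashlib.sha256() for _ in range(n_units)]

    # Chunk k goes to lane k mod N, so each lane sees every Nth chunk.
    for k, off in enumerate(range(0, len(msg), chunk)):
        lanes[k % n_units].update(msg[off:off + chunk])

    # Combine: hash the agreed parameters, then the partial results in lane order.
    # ASSUMPTION: binding N, chunk size and length here is this sketch's choice.
    top = hashlib.sha256()
    top.update(n_units.to_bytes(4, "big"))
    top.update(chunk.to_bytes(4, "big"))
    top.update(len(msg).to_bytes(8, "big"))
    for lane in lanes:
        top.update(lane.digest())
    return top.digest()
```

The digest differs from a plain SHA-256 of the message and changes with N, which is why both ends have to use the same N-way division.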
Re: on FPGAs vs ASICs
At 11:11 AM 3/19/2005, Major Variola (ret) wrote:
> ---useful if you can't afford an ASIC run (a million bucks a mask...)
> ...
> For someone making 10,000 routers, you use FPGAs.
>
> DESCrack was solving a problem for which the x86 is not very efficient
> at computing --all the sub-byte bit-diddling-- and hardware is very
> efficient (by design in DES, after all).

EFF's DESCrack cost $200K in 1998 and used ASICs. (It's really only six years since we killed off single-DES!) There were 1500 DES-cracker ASIC chips in it. ASICs may cost a bit more today - Moore's Law helps, but it also means that chip designs can become larger and more complex, though code-cracker applications have a lot of uniformity in their design, and we've got six more years of experience building ASIC cell libraries that can be reused. I suspect a similar-sized machine would cost a similar amount but have a lot more DES functional units in it.

FPGAs probably make more sense for routers, because you want the ability to change the firmware more often, and a router has a bunch of other parts as well, and realistically, cypher-cracking is not an economically viable activity for most people, so the cost-benefit tradeoffs are a bit twisted.
Re: on FPGAs vs ASICs
> FPGAs probably make more sense for routers, because you want the ability
> to change the firmware more often, and a router has a bunch of other parts
> as well, and realistically, cypher-cracking is not an economically viable
> activity for most people, so the cost-benefit tradeoffs are a bit twisted.

The router world seems to use a good mixture. At a startup we were purchasing nice off-the-shelf MPLS ASICs, which did MPLS route setup and forwarding (and some enforcement) while the 'software'/control plane (eg, OSPF, RSVP-TE, etc...) was largely in FPGAs of our own brew. At that time (ca. 2000/2001) some vendors were starting to push net processors, which were somewhere in between, and at the time just weren't quite fast enough for ASIC-busting applications and not quite flexible enough for FPGA-ish applications. Now, however, I'd bet net processors are very effective for metro-edge applications.

What I suspect is that there's already some crypto net processors out there, though they may be classified, or the commercial equivalent (ie, I assume there are 'classified' catalogs from companies like General Dynamics that normal clients never see). They can periodically upgrade the code when they discover that some new form of stego (for instance) has become in-vogue at Al Qaeda. These won't be Variola Suitcase-type applications, though, but perhaps for special situations where they know the few locations in Cobble Hill Brooklyn they want to monitor and decrypt.

-TD
Re: on FPGAs vs ASICs
"Major Variola (ret)" <[EMAIL PROTECTED]> wrote: > Riad doesn't seem to appreciate this. Of course I do. I'm saying that for our purposes (a dedicated hashcracker) we want an ASIC. Whether we can afford one or not is another question (obviously if we can't, we buy the best FPGA we can). ...or are we no longer assuming an adversary with unlimited resources? -- Riad S. Wahby [EMAIL PROTECTED]
FW: on FPGAs vs ASICs
From Major Variola (ret):
> Tyler, Riad, etc:
> FPGAs are used in telecom because the volumes do not support an ASIC run.
> Riad doesn't seem to appreciate this. He does understand that an ASIC is
> more efficient because its gates are used only for 1 computation, rather
> than most (FPGA) gates being used for reconfigurability ---useful if you
> can't afford an ASIC run (a million bucks a mask...) or if algorithms get
> tweaked (eg you release before the Spec comes out, or you are shooting for
> time-to-market). Clock-wise, an FPGA wastes time in extra wire routing
> although since an FPGA may be made in state of the art processes,
> and your ASIC may not, it's a complex tradeoff. (Albeit some circuit
> topologies work very well on FPGAs)
>
> So for the Cypherpunk wanting hardware (vs cluster) acceleration, FPGAs
> are the way to go. For TLAs, you prototype in FPGAs of course, and
> then make some chips in your private fab. (Same for Broadcom, etc.)
>
> For someone making 10,000 routers, you use FPGAs.
>
> DESCrack was solving a problem for which the x86 is not very efficient
> at computing --all the sub-byte bit-diddling-- and hardware is very
> efficient (by design in DES, after all).

Indeed, during the initial DESCrack effort, I spent some time investigating FPGAs. I came to the conclusion that it was definitely possible to build a Wiener-style pipeline machine (ie, one key tested per clock cycle), but it would be more costly than I could afford.

One of the interesting twists of FPGAs is that you can optimize the circuit to the actual data being processed. For example, in DES keysearch you could hardwire into the circuit some of the subkey bits (which were determined by, say, high order key bits you rarely changed), thus simplifying the circuit. When those bits changed, you re-wrote the circuit.

Peter Trei
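To put rough numbers on "one key tested per clock cycle", here is a back-of-the-envelope sketch. The function names and the example figures (a 50 MHz clock and 100 pipelines) are hypothetical values chosen only to show the arithmetic; they are not taken from any machine discussed in this thread. The model is simply keyspace divided by aggregate keys per second.

```python
SECONDS_PER_DAY = 86400


def worst_case_days(key_bits: int, clock_hz: float, pipelines: int) -> float:
    # Each pipeline tests one key per clock, so the machine tests
    # clock_hz * pipelines keys per second; the full keyspace is 2**key_bits.
    keys = 2 ** key_bits
    return keys / (clock_hz * pipelines) / SECONDS_PER_DAY


def average_days(key_bits: int, clock_hz: float, pipelines: int) -> float:
    # On average the key turns up after searching half the keyspace.
    return worst_case_days(key_bits, clock_hz, pipelines) / 2


if __name__ == "__main__":
    # HYPOTHETICAL figures, for illustration only.
    print(worst_case_days(56, 50e6, 100))
    print(average_days(56, 50e6, 100))
```

Doubling either the clock or the number of pipelines halves the time, which is why the cost question comes down to how many pipelines you can afford to build.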
on FPGAs vs ASICs
Tyler, Riad, etc:

FPGAs are used in telecom because the volumes do not support an ASIC run. Riad doesn't seem to appreciate this. He does understand that an ASIC is more efficient because its gates are used only for 1 computation, rather than most (FPGA) gates being used for reconfigurability ---useful if you can't afford an ASIC run (a million bucks a mask...) or if algorithms get tweaked (eg you release before the Spec comes out, or you are shooting for time-to-market). Clock-wise, an FPGA wastes time in extra wire routing although since an FPGA may be made in state of the art processes, and your ASIC may not, it's a complex tradeoff. (Albeit some circuit topologies work very well on FPGAs)

So for the Cypherpunk wanting hardware (vs cluster) acceleration, FPGAs are the way to go. For TLAs, you prototype in FPGAs of course, and then make some chips in your private fab. (Same for Broadcom, etc.)

For someone making 10,000 routers, you use FPGAs.

DESCrack was solving a problem for which the x86 is not very efficient at computing --all the sub-byte bit-diddling-- and hardware is very efficient (by design in DES, after all).
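The volume argument can be made concrete with a simple break-even calculation. This is a sketch under loud assumptions: the one-time ASIC cost is taken as a flat $1,000,000 (the "million bucks a mask" figure, treated here as the whole up-front cost), and the per-unit prices of $50 for an FPGA and $10 for an ASIC are invented purely for illustration. `breakeven_units` is a hypothetical helper name.

```python
def breakeven_units(asic_nre: float, asic_unit: float, fpga_unit: float) -> float:
    # The ASIC pays off once its per-unit saving has covered the one-time cost:
    #   asic_nre + n * asic_unit = n * fpga_unit
    #   n = asic_nre / (fpga_unit - asic_unit)
    if fpga_unit <= asic_unit:
        return float("inf")  # no per-unit saving, so the ASIC never pays off
    return asic_nre / (fpga_unit - asic_unit)


if __name__ == "__main__":
    # HYPOTHETICAL prices, for illustration only.
    n = breakeven_units(asic_nre=1_000_000, asic_unit=10, fpga_unit=50)
    print(n)           # break-even volume under these assumed prices
    print(10_000 < n)  # whether a 10,000-unit run falls below break-even
```

Under those assumed prices a 10,000-unit run sits well below the break-even volume, which is the shape of the argument for using FPGAs at that scale; different prices would move the crossover.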