The fastest hardware implementation of RC4 that I know of runs at 2
bytes/clock. I personally programmed a 1 byte/clock RC4 in an FPGA; it's
quite simple.
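
For reference, a minimal software sketch of RC4 in Python. This is not the FPGA design, and the function names are just illustrative. It shows that every output byte needs a swap in the 256-byte state that depends on the previous byte's swap, which is why a single core is stuck at a small fixed number of bytes per clock.

```python
def rc4_keystream(key: bytes, n: int) -> bytes:
    """Return the first n RC4 keystream bytes for the given key."""
    # Key scheduling (KSA): permute the 256-byte state under the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]

    # Output generation (PRGA): one state swap and one lookup per byte.
    # Each iteration reads the state left behind by the previous one.
    out = bytearray()
    i = j = 0
    for _ in range(n):
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def rc4_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the RC4 keystream (decryption is the same operation)."""
    ks = rc4_keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))
```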

At 2 bytes/clock you still need a clock of about 6.25 GHz to encrypt 100
Gbps (12.5 GB/s). That's infeasible, so the way it's done is with
parallelism; then you can use any algorithm you want as long as you have
silicon available. Consider that 400 Gbps systems are coming online.
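
A back-of-envelope sketch of that arithmetic in Python. The 300 MHz fabric clock is purely an assumption for illustration, not a figure from any particular FPGA, and the helper names are mine. It also assumes the traffic can be split across independent cipher instances.

```python
def required_clock_hz(line_rate_bps: int, bytes_per_clock: int) -> float:
    """Clock a single core needs to keep up with the line rate."""
    return line_rate_bps / 8 / bytes_per_clock


def lanes_needed(line_rate_bps: int, bytes_per_clock: int, clock_hz: int) -> int:
    """Parallel cores needed at a given (assumed) clock, rounded up."""
    bytes_per_second = line_rate_bps // 8
    per_lane = bytes_per_clock * clock_hz
    return -(-bytes_per_second // per_lane)  # integer ceiling division


FABRIC_CLOCK_HZ = 300_000_000  # assumed FPGA clock, illustration only

for rate in (100_000_000_000, 400_000_000_000):
    ghz = required_clock_hz(rate, 2) / 1e9
    lanes = lanes_needed(rate, 2, FABRIC_CLOCK_HZ)
    print(f"{rate // 10**9} Gbps: single core at {ghz:.2f} GHz, "
          f"or {lanes} lanes at 300 MHz")
```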

Using a PC for that kind of workload is a waste of money and power. FPGAs
are not that expensive nowadays.




> Just as a data point, on x86 processors with AESNI you can encrypt AES in,
> say, XTS mode with about 0.75 cycles / byte on each core.
>
> On an Intel Xeon E5-2690 'openssl speed -multi 4 -evp aes-128-xts' tops out
> at 13.5 GB/s for 8k blocks, which is 108 Gbps. That's only using half the
> physical cores and no hyperthreading.
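
A quick sanity check of those numbers, as a Python sketch. The 2.9 GHz clock is my assumption (the E5-2690 base frequency, ignoring turbo), and the helper name is made up for the example.

```python
def throughput_gbps(cycles_per_byte: float, clock_hz: float, cores: int = 1) -> float:
    """Ideal throughput in Gbps for a given cycles/byte figure."""
    bytes_per_second = cores * clock_hz / cycles_per_byte
    return bytes_per_second * 8 / 1e9


# Assumed: 2.9 GHz base clock, 4 processes as in 'openssl speed -multi 4'.
ideal = throughput_gbps(0.75, 2.9e9, cores=4)
measured = 13.5 * 8  # 13.5 GB/s reported above, in Gbps

print(f"ideal {ideal:.1f} Gbps, measured {measured:.1f} Gbps")
```

The measured 108 Gbps sits a little under the ideal figure, which is plausible for a benchmark that includes loop and call overhead.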
>
> However, that's unlikely to be a realistic benchmark for whatever context
> the original question was referring to.
>
>
> On Sat, Jun 22, 2013 at 5:25 PM, Peter Maxwell
> <pe...@allicient.co.uk> wrote:
>
>>
>>
>> On 22 June 2013 23:31, James A. Donald <jam...@echeque.com> wrote:
>>
>>>  On 2013-06-23 6:47 AM, Peter Maxwell wrote:
>>>
>>>
>>>
>>>  I think Bernstein's Salsa20 is faster and significantly more secure
>>> than RC4, whether you'll be able to design hardware to run at
>>> line-speed is somewhat more questionable though (would be interested
>>> to know if it's possible right enough).
>>>
>>>
>>> I would be surprised if it is faster.
>>>
>>>
>>>
>>
>> Given the 100Gbps spec, I can only presume it's hardware that's being
>> talked about, which is well outwith my knowledge.  We also don't know
>> whether there is to be only one keystream allowed or not.
>>
>> However, just to give an idea of performance: from a cursory search on
>> Google, one can seemingly find Salsa20/12 being implemented recently on
>> GPU with performance around 43Gbps without memory transfer (2.7Gbps
>> with) - http://link.springer.com/chapter/10.1007%2F978-3-642-38553-7_11 -
>> unfortunately I don't have access to the paper.
>>
>> On a decent 64-bit processor, the full Salsa20/20 is coming in around
>> 3-4cpb - http://bench.cr.yp.to/results-stream.html - and while cpb isn't
>> a great measurement, it at least gives a feel for things.
>>
>>
>> Going on a very naive approach, I would imagine the standard RC4 will
>> suffer due to being byte-orientated and not particularly open to
>> parallelism.  Salsa20 operates on 32-bit words and from a cursory
>> inspection of the spec seems to offer at least some options to do
>> operations in parallel.
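
To make the parallelism point concrete, a short Python sketch of the Salsa20 quarter-round and column round as I understand them from Bernstein's specification (helper names are mine). The four quarter-rounds in a column round touch disjoint words, so hardware can evaluate them side by side, unlike RC4's byte-serial state updates.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def quarterround(y0: int, y1: int, y2: int, y3: int):
    """Salsa20 quarter-round: add, rotate, xor on four 32-bit words."""
    z1 = y1 ^ rotl32((y0 + y3) & MASK32, 7)
    z2 = y2 ^ rotl32((z1 + y0) & MASK32, 9)
    z3 = y3 ^ rotl32((z2 + z1) & MASK32, 13)
    z0 = y0 ^ rotl32((z3 + z2) & MASK32, 18)
    return z0, z1, z2, z3


def columnround(y):
    """Apply four quarter-rounds to disjoint columns of the 16-word state."""
    z = list(y)
    # Each tuple is an independent set of word indices, so the four
    # quarter-rounds have no data dependencies on one another.
    for a, b, c, d in ((0, 4, 8, 12), (5, 9, 13, 1),
                       (10, 14, 2, 6), (15, 3, 7, 11)):
        z[a], z[b], z[c], z[d] = quarterround(y[a], y[b], y[c], y[d])
    return z
```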
>>
>> If I were putting money on it, I suspect one could optimise at least
>> Salsa20/12 to be faster than RC4 on modern platforms; whether this has
>> been done is another story.  Fairly sure Salsa20/8 was faster than RC4
>> out-of-the-box.
>>
>> As with anything though, I stand to be corrected.
>>
>>
>>
>>
>> _______________________________________________
>> cryptography mailing list
>> cryptography@randombit.net
>> http://lists.randombit.net/mailman/listinfo/cryptography
>>
>>

