Re: [Firebird-devel] Crypto Algoritm Performance
Hardware AES uses a different magic constant (a seed?) than many popular software implementations, if I recall correctly from something I read. So the initial state of a non-hardware-accelerated implementation should match that of the hardware one, that's all.

PS: I can't remember the details. I just bumped into this thread and recalled an article on the subject that I once stumbled upon.

On Sat, Sep 5, 2015 at 9:50 PM, Leyne, Sean wrote:
> Forgive my naïve point of view, but given that AES instruction set has
> been built into AMD and Intel CPUs since 2011, why do you feel that it is
> necessary to push for ChaCha20?
> [...]

--
Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Crypto Algoritm Performance
Jim and Boris,

> Something you may want to investigate is replacing the "pure C"
> implementation of ChaCha20 with the rotate step replaced with either a
> compiler intrinsic (Microsoft) or a bit of assembler (gcc). SHA1 has
> the same issue.
> [...]

Forgive my naïve point of view, but given that the AES instruction set has been built into AMD and Intel CPUs since 2011, why do you feel that it is necessary to push for ChaCha20***?

To my reading, Boris' numbers have shown that AES performance is more than adequate (53.2 AVG seconds to process 256MB = 4+MB/s).

Further, considering that the use case is the encryption of data blocks which would be much smaller than even 1MB, will the performance difference really be noticeable?

Sean

*** Separately, with Intel HyperThreaded CPUs, and considering that AES is "on-chip", wouldn't that allow the core processing the encryption to shift its focus to the other thread's instructions while the first thread waits for the on-chip AES processor to operate? In other words, isn't it possible that ChaCha20 is only faster when CPUs are being "single minded", and that real-world performance on a server dealing with several tasks might favor CPUs with native AES instructions?
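For reference, the 4+ MB/s figure comes from reading the 53.2 average as seconds. The table of results does not state a unit, so here is a minimal sketch of the arithmetic under both readings. Treating the value as milliseconds is an assumption, not something this message confirms, and the function name is made up for illustration.

```python
def throughput_mb_per_s(size_mb: float, elapsed_s: float) -> float:
    """Return throughput in MB/s for size_mb megabytes processed in elapsed_s seconds."""
    return size_mb / elapsed_s


SIZE_MB = 256   # file size used in the benchmark
ENC_AVG = 53.2  # average of the "enc" column; the table gives no unit

# Reading 1: the value is in seconds (the reading used in the message above).
as_seconds = throughput_mb_per_s(SIZE_MB, ENC_AVG)

# Reading 2 (assumption): the value is in milliseconds.
as_millis = throughput_mb_per_s(SIZE_MB, ENC_AVG / 1000.0)

print(f"if seconds:      {as_seconds:.1f} MB/s")
print(f"if milliseconds: {as_millis:.0f} MB/s")
```

The two readings differ by a factor of 1000 (roughly 4.8 MB/s versus roughly 4.8 GB/s), so the unit matters a great deal for the "more than adequate" argument.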
Re: [Firebird-devel] Crypto Algoritm Performance
I have implemented ChaCha20 and compared it with various AES implementations on my other (still cheap) notebook. All my implementations are made for Windows and MS Visual Studio, but I think that Intel's AES-NI code (see below) and the original ChaCha code were made for the GNU C compiler.

The first implementation, with the AES-NI instruction set, is based on the BOTAN library:
http://botan.randombit.net/

For the second implementation, INTEL based code with the AES-NI instruction set, I have used code from the Intel White Paper (Shay Gueron) along with code from Dr. Brian Gladman:
https://software.intel.com/sites/default/files/article/165683/aes-wp-2012-09-22-v01.pdf
https://github.com/BrianGladman/AES/

The third (AES) and fourth (ChaCha) implementations are based on the Bouncy Castle Java library.

The following results show AES and ChaCha encryption of a 256 MB file with a 32 kB buffer and ECB mode of operation (for AES), without parallelization. The results are presented in two columns. The first column ("all") shows whole program execution, from program start to end. The second column ("enc") shows only the time needed for encryption and key setup (without IO operations).

Here are numbers:

AES, BOTAN based code, with AES-NI instruction set

     all     enc
     594      63
     562      61
     469      63
     547      78
     468      32
     562      47
     500      48
     562      46
     469      48
     578      46
avg  531.1    53.2

AES, INTEL based code, with AES-NI instruction set

     all     enc
     516      94
     531      47
     625      63
     578      79
     515      61
     532      95
     515      93
     574      76
     531      94
     531      64
avg  544.8    76.6

AES, code based on Bouncy Castle (Java), without AES-NI instruction set

     all     enc
    2031    1657
    2047    1625
    2015    1676
    2047    1578
    2078    1736
    2078    1543
    2015    1625
    2219    1672
    2063    1517
    2125    1577
avg 2071.8  1620.6

ChaCha20, code based on Bouncy Castle (Java)

     all     enc
    1625    1251
    1672    1143
    2016    1253
    1672    1313
    1750    1138
    1672    1298
    1625    1200
    1797    1298
    1641    1251
    1657    1203
avg 1712.7  1234.8

As you can see, the ChaCha implementation has far worse performance than the implementations accelerated with the AES-NI instruction set. However, it is somewhat faster than the AES implementation without the AES-NI instruction set.

I don't know whether the source code of these apps would satisfy your (Firebird) coding standards. If anyone wants to check it out, I could publish it on GitHub, or elsewhere. Just let me know how and where.

Boris Damjanovic

On 8/31/2015 3:03 PM, Jim Starkey wrote:
> For the non-aficionados, ECB is the electronic code book mode where each
> 16 byte block is independently encrypted/decrypted.
> [...]
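To make the two-column methodology concrete, here is a minimal sketch of a harness that times a whole buffered pass ("all") and, separately, only the cipher calls ("enc"), averaged over several runs. It is not the code used for the measurements above: `placeholder_encrypt`, `timed_run` and `average_runs` are hypothetical names, the XOR transform is only a stand-in for a real cipher, in-memory streams replace the file, and "all" here covers the read/encrypt/write loop rather than the whole program.

```python
import io
import time

BUFFER_SIZE = 32 * 1024  # 32 kB buffer, as in the measurements above


def placeholder_encrypt(key: int, chunk: bytes) -> bytes:
    # Hypothetical stand-in for the cipher under test; this is NOT encryption.
    return bytes(b ^ key for b in chunk)


def timed_run(source, sink, key, encrypt=placeholder_encrypt):
    """Process source in BUFFER_SIZE chunks; return (all_seconds, enc_seconds)."""
    start_all = time.perf_counter()
    enc_seconds = 0.0
    while True:
        chunk = source.read(BUFFER_SIZE)
        if not chunk:
            break
        start_enc = time.perf_counter()
        out = encrypt(key, chunk)
        enc_seconds += time.perf_counter() - start_enc  # cipher time only
        sink.write(out)                                 # IO counts toward "all" only
    return time.perf_counter() - start_all, enc_seconds


def average_runs(data: bytes, key: int, runs: int = 10):
    """Average the (all, enc) timings over the given number of runs."""
    results = [timed_run(io.BytesIO(data), io.BytesIO(), key) for _ in range(runs)]
    all_avg = sum(r[0] for r in results) / runs
    enc_avg = sum(r[1] for r in results) / runs
    return all_avg, enc_avg


if __name__ == "__main__":
    all_avg, enc_avg = average_runs(bytes(1024 * 1024), key=0x5A, runs=3)
    print(f"all: {all_avg * 1000:.1f} ms  enc: {enc_avg * 1000:.1f} ms")
```

Swapping a real cipher in for `placeholder_encrypt` and a file for the in-memory streams would give numbers comparable in shape, though not in value, to the tables above.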
Re: [Firebird-devel] Crypto Algoritm Performance
Something you may want to investigate is replacing the "pure C" implementation of ChaCha20 with one where the rotate step is done with either a compiler intrinsic (Microsoft) or a bit of assembler (gcc). SHA1 has the same issue.

I haven't a clue why this is so: popular crypto algorithms use a rotate, and virtually all microprocessors have rotate instructions, but C lacks a rotate operator and the standard libraries neglect to support it.

On Friday, September 4, 2015, Boris Damjanovic wrote:
> I have implemented ChaCha20 and compared it with various AES
> implementations on my other (still cheap) notebook.
> [...]
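To show where the rotate sits in ChaCha20, here is a minimal sketch of the quarter round with the rotate written as two shifts and an OR, which is what a language without a rotate operator has to do. It is in Python purely for illustration (it says nothing about performance); `rotl32` and `quarter_round` are names chosen here. In C the analogous expression on a 32-bit unsigned value is the usual portable form, and compilers typically turn it into a single rotate instruction, though that is worth checking in the generated code rather than assuming.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, n: int) -> int:
    """Rotate the 32-bit value x left by n bits (0 < n < 32)."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(a: int, b: int, c: int, d: int) -> tuple:
    """One ChaCha quarter round: add, xor, then rotate by 16, 12, 8 and 7 bits."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


# Input values are the quarter-round test vector from RFC 7539, section 2.1.1.
result = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
print([hex(v) for v in result])
```

Each quarter round contains four rotates and a ChaCha20 block runs 80 quarter rounds, so how the rotate is compiled has a direct effect on throughput.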
Re: [Firebird-devel] Crypto Algoritm Performance
Hi James,

more numbers here.

Software AES implementation vs AES-NI implementation, 512 MB, ECB mode of operation, single core, buffer size 32 kB, Windows:

AES 128:    3873 ms (average calculated on 10 measurements)
AES-NI 128: 1067 ms (average calculated on 10 measurements)

The numbers are from my study, and they were also computed on a pretty cheap notebook. The obtained results are similar to those in Intel's papers (there are many).

I will try to implement ChaCha20 on Windows over the next few days.

Boris Damjanovic

> Here are some numbers. The numbers were computed on my boat computer,
> which is a very cheap notebook, so consider them relative, not absolute.
> [...]
Re: [Firebird-devel] Crypto Algoritm Performance
For the non-aficionados, ECB is the electronic code book mode, where each 16 byte block is independently encrypted/decrypted. As such, it can reveal a great deal about an encrypted document or stream, as a repeating block will always have the same encrypted form.

Cipher Block Chaining (CBC) works around this problem by XORing the previous block's ciphertext with the next block's plaintext before encryption. This makes it measurably, but not significantly, slower than ECB.

Another interesting variation on CBC is Ciphertext Stealing mode (CTS), used to handle plaintexts of lengths that are not multiples of 16 bytes without padding. Ciphertext stealing works by padding the unused tail of the last -- and incomplete -- block with the trailing bytes of the previous block's ciphertext before encryption, transmitting this last block before the next to last block, then transmitting the next to last encrypted block truncated to the original length of the last block. It's a really cute hack, but it obviously doesn't work on plaintexts shorter than 16 bytes.

The differences between AES in software and AES-NI (new instructions) will vary wildly depending on whether AES-NI is implemented in just microcode or in actual hardware. But none of these affect the security of AES.

AES-256 isn't significantly more secure than AES-128 for normal computers, though NSA believes it will be more resilient against attack by quantum computers, if they ever show up. Personally, this is not something I'm losing sleep over.

Jim Starkey

> On Aug 31, 2015, at 2:01 AM, dbo...@poen.net wrote:
>
> Hi James,
>
> more numbers here.
> [...]
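The three modes described in this message can be sketched in a few lines. The block cipher below is a toy 4-round Feistel construction built on SHA-256, used only because it is invertible and needs nothing beyond the standard library; it is NOT AES and not secure. All function names are chosen here for illustration. The CBC functions assume whole blocks, and the stealing functions apply the steps as described above to independently encrypted blocks, which is a simplification rather than a faithful implementation of any standardised CBC-based variant.

```python
import hashlib

BLOCK = 16  # block size in bytes, the same as AES


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def f(key, rnd, half):
    # Toy Feistel round function: an invertible stand-in, NOT a secure cipher.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]


def enc_block(key, block):
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor(left, f(key, rnd, right))
    return left + right


def dec_block(key, block):
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor(right, f(key, rnd, left)), left
    return left + right


def split(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key, data):
    # Every block is encrypted independently.
    return b"".join(enc_block(key, b) for b in split(data))


def ecb_decrypt(key, data):
    return b"".join(dec_block(key, b) for b in split(data))


def cbc_encrypt(key, iv, data):
    # Each plaintext block is XORed with the previous ciphertext block first.
    out, prev = [], iv
    for block in split(data):
        prev = enc_block(key, xor(block, prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key, iv, data):
    out, prev = [], iv
    for block in split(data):
        out.append(xor(dec_block(key, block), prev))
        prev = block
    return b"".join(out)


def cts_encrypt(key, data):
    m = len(data) % BLOCK  # length of the incomplete last block
    if len(data) <= BLOCK or m == 0:
        raise ValueError("sketch needs more than one block and a partial tail")
    blocks = split(data)
    prev_ct = enc_block(key, blocks[-2])
    last_ct = enc_block(key, blocks[-1] + prev_ct[m:])  # pad with stolen bytes
    # The full last block goes out first, then the previous block cut to m bytes.
    return ecb_encrypt(key, b"".join(blocks[:-2])) + last_ct + prev_ct[:m]


def cts_decrypt(key, data):
    m = len(data) % BLOCK
    if len(data) <= BLOCK or m == 0:
        raise ValueError("sketch needs more than one block and a partial tail")
    blocks = split(data)
    padded = dec_block(key, blocks[-2])  # last plaintext block plus stolen bytes
    prev_plain = dec_block(key, blocks[-1] + padded[m:])
    return ecb_decrypt(key, b"".join(blocks[:-2])) + prev_plain + padded[:m]


key, iv = b"demo key", bytes(BLOCK)
repeated = b"A" * (3 * BLOCK)  # three identical plaintext blocks
ecb_blocks = split(ecb_encrypt(key, repeated))
cbc_blocks = split(cbc_encrypt(key, iv, repeated))
print("distinct ECB blocks:", len(set(ecb_blocks)))
print("distinct CBC blocks:", len(set(cbc_blocks)))

message = bytes(range(40))  # two full blocks plus 8 bytes
stolen = cts_encrypt(key, message)
print(len(stolen), cts_decrypt(key, stolen) == message)
```

The repeated plaintext collapses to a single distinct ciphertext block under ECB, which is the leak described above, while chaining hides the repetition. The stolen-ciphertext output has exactly the length of the input, with no padding.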
[Firebird-devel] Crypto Algoritm Performance
Here are some numbers. The numbers were computed on my boat computer, which is a very cheap notebook, so consider them relative, not absolute.

10 MB encryption with a single key:

    RC4:      0.021 seconds
    ChaCha20: 0.007
    AES-128:  0.212

10 MB encryption setting key every 1024 bytes:

    RC4:      0.201 seconds
    ChaCha20: 0.091
    AES-128:  2.400

ChaCha20 is a clear winner. And it has a cool name.

I make no claims that the AES implementation is anywhere near optimal -- it is one I found with an acceptable license and not deeply embedded in a huge crypto library. AES, unlike the stream ciphers, has opportunities for what D. J. Bernstein (the crypto god who invented ChaCha20 and all sorts of other good and valuable stuff) calls voodoo.

--
Jim Starkey
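The gap between the single-key and the rekey-every-1024-bytes figures comes from paying the key setup again for each chunk. A minimal sketch of that comparison, using RC4 because it fits in a few lines, might look like the following. This is not the code behind the numbers above. The function names are chosen here, reusing the same key for every chunk is an assumption, and pure Python timings say nothing about native performance.

```python
import time


def rc4_key_schedule(key: bytes) -> list:
    """RC4 key setup: 256 swaps, paid again on every rekey."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    return state


def rc4_crypt(state: list, data: bytes) -> bytes:
    """XOR data with the RC4 keystream; mutates state."""
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def encrypt_single_key(key: bytes, data: bytes) -> bytes:
    return rc4_crypt(rc4_key_schedule(key), data)


def encrypt_rekeying(key: bytes, data: bytes, every: int = 1024) -> bytes:
    # Assumption: the same key is set again for each chunk of `every` bytes.
    return b"".join(
        rc4_crypt(rc4_key_schedule(key), data[i:i + every])
        for i in range(0, len(data), every)
    )


def time_it(func, *args) -> float:
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


if __name__ == "__main__":
    payload = bytes(256 * 1024)  # 256 KiB keeps pure Python quick; the post used 10 MB
    print("single key:", time_it(encrypt_single_key, b"Key", payload))
    print("rekey/1024:", time_it(encrypt_rekeying, b"Key", payload))
```

How much rekeying costs depends on how expensive a cipher's key setup is relative to its per-byte work, which is why the rekeying penalty differs between the three ciphers in the table.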