Re: [Firebird-devel] Crypto Algoritm Performance

2015-09-07 Thread Tommi Prami
Hardware AES uses a different magic constant (seed?) than many popular
software implementations, if I recall correctly from something I read. So
the initial state of a non-HW-accelerated implementation should match the
HW one, that's all.


PS: I can't remember the details. I just bumped into this thread and
recalled that I once stumbled upon an article talking about that.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Crypto Algoritm Performance

2015-09-05 Thread Leyne, Sean
Jim and Boris,

> Something you may want to investigate is replacing the "pure C"
> implementation of ChaCha20 with one whose rotate step uses either a
> compiler intrinsic (Microsoft) or a bit of assembler (gcc).  SHA1 has
> the same issue.  I haven't a clue as to why popular crypto algorithms
> use a rotate, virtually all microprocessors have rotate instructions,
> but C lacks a rotate operator and the standard libraries neglect to
> support it.

Forgive my naïve point of view, but given that the AES instruction set has
been built into AMD and Intel CPUs since 2011, why do you feel that it is
necessary to push for ChaCha20***?

To my reading, Boris' numbers have shown that AES performance is more than
adequate (an average of 53.2 seconds to process 256 MB = 4+ MB/s).

Further, considering that the use case is the encryption of data blocks,
which would be much smaller than even 1 MB, will the performance difference
really be noticeable?


Sean

*** Separately, with Intel HyperThreaded CPUs, and considering that AES is
"on-chip", wouldn't that allow the core processing the encryption to shift
its focus to the other thread's instructions while the first thread waits
for the on-chip AES processor to operate?  In other words, isn't it possible
that ChaCha20 is only faster when CPUs are being "single minded", and that
real-world performance on a server dealing with several tasks might favor
CPUs with native AES instructions?


> Here are numbers:
>
> Implementation                                            all      enc
> AES, BOTAN based code, with AES-NI instruction set      531.1     53.2
> AES, INTEL based code, with AES-NI instruction set      544.8     76.6
> AES, Bouncy Castle (Java) based code, without AES-NI   2071.8   1620.6
> ChaCha20, Bouncy Castle (Java) based code              1712.7   1234.8




Re: [Firebird-devel] Crypto Algoritm Performance

2015-09-04 Thread Boris Damjanovic
I have implemented ChaCha20 and compared it with various AES
implementations on my other (still cheap) notebook. All my
implementations are made for Windows and MS Visual Studio, but I think
that Intel's AES-NI code (see below) and the original ChaCha code were
written for the GNU C compiler.

The first implementation, with the AES-NI instruction set, is based on the
BOTAN library:
http://botan.randombit.net/

For the second implementation, INTEL based code with the AES-NI instruction
set, I have used code from the Intel white paper (Shay Gueron) along with
code from Dr. Brian Gladman:
https://software.intel.com/sites/default/files/article/165683/aes-wp-2012-09-22-v01.pdf
https://github.com/BrianGladman/AES/

The third (AES) and fourth (ChaCha) implementations are based on the Bouncy
Castle Java library.

The following results show AES and ChaCha encryption of a 256 MB file with
a 32 kB buffer and ECB mode of operation (for AES), without parallelization.
The results are presented in two columns. The first column shows whole
program execution, from program start to end. The second column shows only
the time needed for encryption and key setup (without IO operations).

Here are numbers:
-
AES, BOTAN based code, with AES-NI instruction set
all enc
594  63
562  61
469  63
547  78
468  32
562  47
500  48
562  46
469  48
578  46

531.1  53.2

--

AES, INTEL based code, with AES-NI instruction set
all enc
516  94
531  47
625  63
578  79
515  61
532  95
515  93
574  76
531  94
531  64

544.8  76.6


-
AES, code based on Bouncy Castle (Java)  , without AES-NI instruction set
all    enc
2031   1657
2047   1625
2015   1676
2047   1578
2078   1736
2078   1543
2015   1625
2219   1672
2063   1517
2125   1577

2071.8 1620.6


-
ChaCha20, code based on Bouncy Castle (Java)
all    enc
1625   1251
1672   1143
2016   1253
1672   1313
1750   1138
1672   1298
1625   1200
1797   1298
1641   1251
1657   1203

1712.7 1234.8
-

As you can see, the ChaCha implementation has far worse performance than
the implementations accelerated with the AES-NI instruction set. However,
it is somewhat faster than the AES implementation without the AES-NI
instruction set.
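
To make the averages above easier to compare, here is a small sketch in
Python that converts the "enc" column into throughput and speedup figures.
The table does not state its time unit, so the unit is passed in
explicitly; the names FILE_MB, AVG_ENC, throughput_mb_per_s and speedup are
made up for this illustration and are not part of the benchmark programs.

```python
# Average "enc" times copied from the table above (unit not stated there).
FILE_MB = 256
AVG_ENC = {
    "AES, BOTAN, AES-NI": 53.2,
    "AES, INTEL, AES-NI": 76.6,
    "AES, Bouncy Castle based": 1620.6,
    "ChaCha20, Bouncy Castle based": 1234.8,
}


def throughput_mb_per_s(size_mb, elapsed, seconds_per_unit):
    """Throughput in MB/s for `elapsed` time units of `seconds_per_unit` each."""
    return size_mb / (elapsed * seconds_per_unit)


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline` (same unit)."""
    return baseline / candidate


if __name__ == "__main__":
    for name, t in AVG_ENC.items():
        as_seconds = throughput_mb_per_s(FILE_MB, t, 1.0)    # if the unit is s
        as_millis = throughput_mb_per_s(FILE_MB, t, 0.001)   # if the unit is ms
        print(f"{name}: {as_seconds:.2f} MB/s if seconds, "
              f"{as_millis:.0f} MB/s if milliseconds")
    aes_sw = AVG_ENC["AES, Bouncy Castle based"]
    print("AES-NI (BOTAN) vs software AES:",
          round(speedup(aes_sw, AVG_ENC["AES, BOTAN, AES-NI"]), 1))
    print("ChaCha20 vs software AES:",
          round(speedup(aes_sw, AVG_ENC["ChaCha20, Bouncy Castle based"]), 2))
```

The ratios do not depend on the unit: by these averages the BOTAN AES-NI
code is roughly 30 times faster than the software AES, and ChaCha20 is
roughly 1.3 times faster than the software AES.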

I don't know whether the source code of these apps would satisfy your 
(Firebird) coding standards. If anyone wants to check it out, I could 
publish it on GitHub, or elsewhere. Just let me know how and where.

Boris Damjanovic





Re: [Firebird-devel] Crypto Algoritm Performance

2015-09-04 Thread James Starkey
Something you may want to investigate is replacing the "pure C"
implementation of ChaCha20 with one whose rotate step uses either a
compiler intrinsic (Microsoft) or a bit of assembler (gcc).  SHA1 has the
same issue.  I haven't a clue as to why popular crypto algorithms use a
rotate, virtually all microprocessors have rotate instructions, but C lacks
a rotate operator and the standard libraries neglect to support it.
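
For readers who have not looked inside ChaCha20, the rotate step in
question sits in its quarter round. The sketch below is in Python purely
for brevity (the thread is about C code) and is not the code Boris
benchmarked; the helper names rotl32 and quarter_round are chosen for this
illustration. The shift-or-mask expression in rotl32 is the kind of
"pure C" style rotate that an intrinsic or a single machine instruction
would replace.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate the 32-bit value x left by n bits (0 < n < 32)."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def quarter_round(a, b, c, d):
    """One ChaCha quarter round on four 32-bit words: add, xor, rotate."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


if __name__ == "__main__":
    # Quarter-round test vector from RFC 7539, section 2.1.1.
    words = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
    print([f"{w:08x}" for w in words])
    # prints ['ea2a92f4', 'cb1cf8ce', '4581472e', '5881c4bb']
```

Each quarter round performs four rotates, and ChaCha20 applies 80 quarter
rounds per 64-byte block, so the cost of the rotate shows up directly in
the throughput.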


Re: [Firebird-devel] Crypto Algoritm Performance

2015-08-31 Thread dboris
Hi James,

more numbers here.

Software AES implementation vs AES-NI implementation, 512 MB, ECB mode of
operation, single core, buffer size 32 kB, Windows:
AES 128:    3873 ms (average of 10 measurements)
AES-NI 128: 1067 ms (average of 10 measurements)

The numbers are from my study, and they were also computed on a pretty
cheap notebook. The results are similar to those in Intel's papers (there
are many).

I will try to implement ChaCha20 on Windows over the next few days.

Boris Damjanovic




Re: [Firebird-devel] Crypto Algoritm Performance

2015-08-31 Thread Jim Starkey
For the non-aficionados, ECB is the electronic codebook mode, in which each
16-byte block is encrypted/decrypted independently.  As such, it can reveal
a great deal about an encrypted document or stream, since a repeating block
will always have the same encrypted form.

Cipher Block Chaining (CBC) works around this problem by XORing the
previous block's ciphertext with the next block's plaintext before
encryption.  This makes it measurably, but not significantly, slower than
ECB.
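
The difference between the two modes can be seen with a small sketch. It
is in Python and uses a toy 16-byte block function built from SHA-256 in a
Feistel arrangement as a stand-in for AES; that toy cipher, the key, the IV
and all function names here are assumptions made for the illustration and
must not be used for real encryption.

```python
import hashlib

BLOCK_SIZE = 16


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key, i, half):
    # Round function of the toy cipher: 8 bytes derived from key, round, half.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]


def toy_encrypt_block(key, block):
    """Toy 4-round Feistel permutation on one 16-byte block (not AES)."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def _split(data):
    if len(data) % BLOCK_SIZE:
        raise ValueError("length must be a multiple of 16 bytes")
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def ecb_encrypt(key, plaintext):
    # ECB: every block is encrypted on its own.
    return b"".join(toy_encrypt_block(key, b) for b in _split(plaintext))


def cbc_encrypt(key, iv, plaintext):
    # CBC: XOR the previous ciphertext block (or the IV) into each plaintext
    # block before encrypting it.
    prev, out = iv, []
    for block in _split(plaintext):
        prev = toy_encrypt_block(key, _xor(prev, block))
        out.append(prev)
    return b"".join(out)


if __name__ == "__main__":
    key, iv = b"example key 1234", bytes(range(16))
    plaintext = b"A" * 32                     # two identical blocks
    ecb = ecb_encrypt(key, plaintext)
    cbc = cbc_encrypt(key, iv, plaintext)
    print("ECB blocks equal:", ecb[:16] == ecb[16:])
    print("CBC blocks equal:", cbc[:16] == cbc[16:])
```

The extra work in CBC is one 16-byte XOR per block. Encryption of block n
also has to wait for block n-1, which is why CBC encryption cannot be
parallelized the way ECB can.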

Another interesting variation on CBC is Ciphertext Stealing mode (CTS), used
to handle plaintexts of lengths that are not multiples of 16 bytes without
padding.  Ciphertext stealing works by padding the unused tail of the last
(incomplete) block with the trailing bytes of the previous block's
ciphertext before encryption, transmitting this last block before the
next-to-last block, then transmitting the next-to-last encrypted block
truncated to the original length of the last block.  It's a really cute
hack, but it obviously doesn't work on plaintexts of less than 16 bytes.
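
A sketch of the scheme in Python may make the block shuffling easier to
follow. It uses a toy 16-byte Feistel block function as a stand-in for AES,
and zero-pads the last block before the CBC XOR, which leaves the tail of
the previous ciphertext block in the padded positions as described above.
The toy cipher and every name in the code are assumptions made for this
illustration, not anyone's production code. For simplicity the sketch
insists on more than 16 bytes of input.

```python
import hashlib

BLOCK_SIZE = 16


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key, i, half):
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]


def toy_encrypt_block(key, block):
    """Toy 4-round Feistel permutation on 16 bytes (stand-in for AES)."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key, block):
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def cbc_cts_encrypt(key, iv, plaintext):
    if len(plaintext) <= BLOCK_SIZE:
        raise ValueError("this sketch needs more than one block of input")
    d = len(plaintext) % BLOCK_SIZE or BLOCK_SIZE      # length of last block
    head, last = plaintext[:-d], plaintext[-d:]
    prev, out = iv, []
    for i in range(0, len(head), BLOCK_SIZE):          # ordinary CBC
        prev = toy_encrypt_block(key, _xor(prev, head[i:i + BLOCK_SIZE]))
        out.append(prev)
    penultimate = out.pop()
    # Zero-pad the short block; the XOR then fills the padded tail with the
    # trailing bytes of the previous ciphertext block (the "stolen" bytes).
    padded = last + bytes(BLOCK_SIZE - d)
    final = toy_encrypt_block(key, _xor(penultimate, padded))
    # Send the full final block first, then the truncated previous block.
    return b"".join(out) + final + penultimate[:d]


def cbc_cts_decrypt(key, iv, ciphertext):
    if len(ciphertext) <= BLOCK_SIZE:
        raise ValueError("this sketch needs more than one block of input")
    d = len(ciphertext) % BLOCK_SIZE or BLOCK_SIZE
    body = ciphertext[:-(BLOCK_SIZE + d)]
    final = ciphertext[-(BLOCK_SIZE + d):-d]
    truncated = ciphertext[-d:]
    x = toy_decrypt_block(key, final)
    last = _xor(x[:d], truncated)                      # recover short block
    penultimate = truncated + x[d:]                    # rebuild stolen block
    prev, out = iv, []
    full = body + penultimate
    for i in range(0, len(full), BLOCK_SIZE):          # ordinary CBC decrypt
        block = full[i:i + BLOCK_SIZE]
        out.append(_xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out) + last
```

The ciphertext has exactly the length of the plaintext, which is the point
of the exercise: no padding bytes are ever transmitted.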

The differences between AES in software and AES-NI (new instructions) will
vary wildly depending on whether AES-NI is implemented in just microcode or
in actual hardware.  But none of this affects the security of AES.

AES-256 isn't significantly more secure than AES-128 for normal computers,
though the NSA believes it will be more resilient against attack by quantum
computers, if they ever show up.  Personally, this is not something I'm
losing sleep over.

Jim Starkey




[Firebird-devel] Crypto Algoritm Performance

2015-08-28 Thread James Starkey
Here are some numbers.  The numbers were computed on my boat computer, which
is a very cheap notebook, so consider them relative, not absolute.

10 MB encryption with a single key:

RC4:      0.021 seconds
ChaCha20: 0.007
AES-128:  0.212

10 MB encryption, setting the key every 1024 bytes:

RC4:      0.201 seconds
ChaCha20: 0.091
AES-128:  2.400

ChaCha20 is a clear winner.  And it has a cool name.

I make no claims that the AES implementation is anywhere near optimal -- it
is one I found with an acceptable license and not deeply embedded in a huge
crypto library.  AES, unlike the stream ciphers, has opportunities for what
D. J. Bernstein (the crypto god who invented ChaCha20 and all sorts of
other good and valuable stuff) calls voodoo.
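
The shape of the two measurements above (one key for the whole buffer
versus a fresh key setup every 1024 bytes) can be sketched as follows. This
is Python, not the code used for the numbers above, it covers only RC4
because that cipher fits in a few lines, and the names rc4_crypt,
crypt_rekeyed and benchmark are made up for the illustration. Re-using the
same key for every chunk is done only to time the key setup; it would be
insecure in practice.

```python
import time


def rc4_init(key):
    """RC4 key schedule: build the 256-entry state from the key."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    return s


def rc4_crypt(key, data):
    """Encrypt or decrypt data with RC4 (the operation is its own inverse)."""
    s = rc4_init(key)
    i = j = 0
    out = bytearray(len(data))
    for n, byte in enumerate(data):
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out[n] = byte ^ s[(s[i] + s[j]) & 0xFF]
    return bytes(out)


def crypt_rekeyed(key, data, chunk=1024):
    """Run the key setup again for every chunk (timing illustration only)."""
    return b"".join(rc4_crypt(key, data[o:o + chunk])
                    for o in range(0, len(data), chunk))


def benchmark(size=1_000_000, chunk=1024):
    """Return (single-key seconds, rekeyed seconds) for `size` zero bytes."""
    key, data = b"benchmark key", bytes(size)
    t0 = time.perf_counter()
    rc4_crypt(key, data)
    t1 = time.perf_counter()
    crypt_rekeyed(key, data, chunk)
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1


if __name__ == "__main__":
    single, rekeyed = benchmark()
    print(f"single key: {single:.3f} s, key set every 1024 bytes: {rekeyed:.3f} s")
```

Absolute times from an interpreted language say nothing about the C
implementations discussed here; the sketch only shows how the second
measurement adds one key setup per 1024 bytes of data.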



-- 
Jim Starkey