Re: AW: via padlock support much slower in 0.9.8e than in 0.9.8d, why?
Hi!

Perhaps the benchmark differences come from the different CPU types (I have an Esther CPU at 1.2 GHz). Here is the content of /proc/cpuinfo on Linux:

processor       : 0
vendor_id       : CentaurHauls
cpu family      : 6
model           : 10
model name      : VIA Esther processor 1200MHz
stepping        : 9
cpu MHz         : 1200.052
cache size      : 128 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce apic sep mtrr pge cmov pat clflush acpi mmx fxsr sse sse2 tm nx up pni est tm2 rng rng_en ace ace_en ace2 ace2_en phe phe_en pmm pmm_en
bogomips        : 2401.87
clflush size    : 64

On 25.09.2007 at 21:38, Buddy Butterfly wrote:

> Hi,
>
> strange. What could be the reason then? I have 2 systems available
> for testing. C5 and C7. C5 runs Suse 9.3 (kernel 2.6.11) which shows
> the difference I have posted below. C7 runs Debian etch (kernel 2.6.18
> type i686). On the C7 I see no difference between openssl version d
> and e but speed seems to be much slower (max 292MB/s) with both
> versions and not going up to over 600MB/s like you posted.
>
> Any clues?
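The flags line of the /proc/cpuinfo output in this message can also be checked programmatically when comparing machines. The following is only a sketch: it assumes that the `ace` and `ace_en` flags are the ones that indicate the AES unit is present and enabled, which is a reading of the flag names and not something stated in this thread. The function names are made up for illustration.

```python
# Assumption: "ace" / "ace_en" mark the AES engine as present / enabled.
PADLOCK_AES_FLAGS = ("ace", "ace_en")

# Flags line copied from the /proc/cpuinfo output in this message.
SAMPLE_CPUINFO = (
    "vendor_id : CentaurHauls\n"
    "flags : fpu vme de pse tsc msr pae mce apic sep mtrr pge cmov pat "
    "clflush acpi mmx fxsr sse sse2 tm nx up pni est tm2 rng rng_en ace "
    "ace_en ace2 ace2_en phe phe_en pmm pmm_en\n"
)


def parse_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def padlock_aes_enabled(cpuinfo_text):
    """True if every flag in PADLOCK_AES_FLAGS appears in the flags line."""
    flags = parse_flags(cpuinfo_text)
    return all(flag in flags for flag in PADLOCK_AES_FLAGS)


if __name__ == "__main__":
    print(padlock_aes_enabled(SAMPLE_CPUINFO))
```

On a live system the same function could be fed the text read from /proc/cpuinfo, so that the C5 and C7 boxes can be compared flag by flag.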
via padlock support much slower in 0.9.8e than in 0.9.8d, why?
With a VIA C5 board I get a huge difference in speed with engine padlock support (same machine, same OS, etc.). Where is the difference coming from? Are there any changes regarding buffering or block sizes? Look at these results:

0.9.8e:

# ./openssl speed -evp aes-256-cbc -engine padlock
engine padlock set.
Doing aes-256-cbc for 3s on 16 size blocks: 9477714 aes-256-cbc's in 2.99s
Doing aes-256-cbc for 3s on 64 size blocks: 5371202 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 2058449 aes-256-cbc's in 2.99s
Doing aes-256-cbc for 3s on 1024 size blocks: 645381 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 93456 aes-256-cbc's in 3.00s
OpenSSL 0.9.8e 23 Feb 2007
built on: Wed Aug 22 17:00:48 CEST 2007
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) idea(int) blowfish(idx)
compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc     50716.86k   114585.64k   176241.79k   220290.05k   255197.18k
#

0.9.8d:

# ./openssl speed -evp aes-256-cbc -engine padlock
engine padlock set.
Doing aes-256-cbc for 3s on 16 size blocks: 13856973 aes-256-cbc's in 2.99s
Doing aes-256-cbc for 3s on 64 size blocks: 10520959 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 5370328 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 1807981 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 251498 aes-256-cbc's in 3.00s
OpenSSL 0.9.8d 28 Sep 2006
built on: Fri Nov 10 20:44:47 CET 2006
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) idea(int) blowfish(idx)
compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc     74151.03k   224447.13k   458267.99k   617124.18k   686757.21k
#

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                               [EMAIL PROTECTED]
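The summary table can be cross-checked against the raw "Doing ..." lines: each figure is the block size times the number of operations, divided by the elapsed time and by 1000. Below is a minimal Python sketch of that arithmetic, with the counts copied from the 0.9.8e run in this message. The names `RUN_098E` and `throughput_k` are made up for illustration and are not part of openssl.

```python
# (block size in bytes, number of operations, elapsed seconds),
# copied from the "Doing aes-256-cbc ..." lines of the 0.9.8e run.
RUN_098E = [
    (16, 9477714, 2.99),
    (64, 5371202, 3.00),
    (256, 2058449, 2.99),
    (1024, 645381, 3.00),
    (8192, 93456, 3.00),
]


def throughput_k(block_size, operations, seconds):
    """Return throughput in 1000s of bytes per second."""
    return block_size * operations / seconds / 1000.0


if __name__ == "__main__":
    for size, ops, secs in RUN_098E:
        print(f"{size:5d} bytes: {throughput_k(size, ops, secs):12.2f}k")
```

The same function applied to the 0.9.8d counts gives the second table, so the slowdown is already visible in the raw operation counts and is not introduced by the summary step.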
Re: via padlock support much slower in 0.9.8e than in 0.9.8d, why?
Hi!

I cannot confirm these performance differences between 0.9.8d and 0.9.8e. My results on a VIA CPU are:

0.9.8d
======
engine padlock set.
Doing aes-256-cbc for 3s on 16 size blocks: 11906104 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 9088256 aes-256-cbc's in 2.99s
Doing aes-256-cbc for 3s on 256 size blocks: 4744283 aes-256-cbc's in 2.98s
Doing aes-256-cbc for 3s on 1024 size blocks: 1624804 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 226672 aes-256-cbc's in 2.99s
OpenSSL 0.9.8d 28 Sep 2006
built on: Tue Sep 25 20:13:59 GMT 2007
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) idea(int) blowfish(idx)
compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc     63499.22k   194531.23k   407562.57k   554599.77k   621035.79k

0.9.8e
======
engine padlock set.
Doing aes-256-cbc for 3s on 16 size blocks: 11597661 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 64 size blocks: 8927779 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 4708369 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 1024 size blocks: 1622241 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 227275 aes-256-cbc's in 2.97s
OpenSSL 0.9.8e 23 Feb 2007
built on: Tue Sep 25 20:21:15 GMT 2007
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) idea(int) blowfish(idx)
compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc     61648.70k   190459.29k   400446.00k   553724.93k   626881.08k

Regards,
Harald

On 25.09.2007 at 19:35, Buddy Butterfly wrote:

> With a VIA C5 board I get a huge difference in speed with engine
> padlock support (same machine same OS etc.). Where is the difference
> coming from. Are there any changes regarding buffering or block sizes?
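To put a number on the disagreement, one can divide each 0.9.8e figure by the corresponding 0.9.8d figure for both machines. Here is a rough Python sketch using the table rows posted in this thread. The dictionary layout and the name `e_over_d_ratios` are illustrative assumptions only.

```python
BLOCK_SIZES = (16, 64, 256, 1024, 8192)

# Throughput in 1000s of bytes per second, copied from the posted tables.
RESULTS = {
    "Harald (VIA Esther)": {
        "0.9.8d": (63499.22, 194531.23, 407562.57, 554599.77, 621035.79),
        "0.9.8e": (61648.70, 190459.29, 400446.00, 553724.93, 626881.08),
    },
    "Buddy (VIA C5)": {
        "0.9.8d": (74151.03, 224447.13, 458267.99, 617124.18, 686757.21),
        "0.9.8e": (50716.86, 114585.64, 176241.79, 220290.05, 255197.18),
    },
}


def e_over_d_ratios(machine):
    """Return the 0.9.8e / 0.9.8d throughput ratio for each block size."""
    runs = RESULTS[machine]
    return [e / d for d, e in zip(runs["0.9.8d"], runs["0.9.8e"])]


if __name__ == "__main__":
    for machine in RESULTS:
        cells = ", ".join(
            f"{size}B: {ratio:.2f}"
            for size, ratio in zip(BLOCK_SIZES, e_over_d_ratios(machine))
        )
        print(f"{machine}: {cells}")
```

A ratio close to 1.0 means the two versions perform alike on that machine; a ratio well below 1.0 means 0.9.8e is slower there.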
AW: via padlock support much slower in 0.9.8e than in 0.9.8d, why?
Hi,

strange. What could be the reason then? I have two systems available for testing, a C5 and a C7. The C5 runs SuSE 9.3 (kernel 2.6.11) and shows the difference that I posted earlier. The C7 runs Debian etch (kernel 2.6.18, type i686). On the C7 I see no difference between OpenSSL versions d and e, but speed seems to be much slower (max 292MB/s) with both versions and does not go up to over 600MB/s like you posted.

Any clues?

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of Harald Latzko
Sent: Tuesday, 25 September 2007 20:25
To: openssl-users@openssl.org
Subject: Re: via padlock support much slower in 0.9.8e than in 0.9.8d, why?

Hi! I cannot confirm these performance differences between 0.9.8d and 0.9.8e.