Re: Performance on IA64 using icc vs gcc
OK, I've made some headway on this. First, I'd like to thank all those
who have provided input.

It looks like the issue might not be the lack of an additional
optimization option, but the adverse effect of one of the optimizations
included by -O2. I tried a build with -O1 instead and got much better
results:

cfe2.imorgan> apps/openssl speed aes bf rc4 md5 sha1 2>/dev/null
OpenSSL 0.9.8e 23 Feb 2007
built on: Mon Jun 11 12:12:59 PDT 2007
options:bn(64,64) md2(int) rc4(ptr,int) des(idx,cisc,4,long) aes(partial) idea(int) blowfish(idx)
compiler: icc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O1 -Wall -no_cpprt -i-static -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=1024 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
md5               8507.03k   27722.30k   75531.10k  132557.28k  169630.22k
sha1              9792.46k   27468.37k   81129.39k  159646.12k  222044.16k
rc4             239281.42k  298509.16k  312526.86k  317020.73k  317904.21k
blowfish cbc     51905.17k   55981.70k   57137.15k   57406.12k   57450.08k
aes-128 cbc      80270.47k   91255.53k   94353.92k   95308.12k   94093.90k
aes-192 cbc      73833.99k   83016.73k   85580.54k   86355.07k   85327.87k
aes-256 cbc      68343.73k   76161.19k   78277.94k   78952.79k   77976.92k

Iain Morgan wrote:
> Hello,
>
> Using the Intel 9.1 compiler on an IA64 system the performance of
> AES and (to a lesser extent) other algorithms implemented in
> assembly language is less than that using gcc. I've included the
> speed output for several of the algorithms below.
>
> Is this a known issue and is there a workaround other than switching
> to gcc?
>
> Thanks
>
> --
> Iain Morgan
>
> cfe2.imorgan> apps/openssl speed aes bf rc4 md5 sha 2>/dev/null
> OpenSSL 0.9.8e 23 Feb 2007
> built on: Fri Jun 8 10:46:48 PDT 2007
> options:bn(64,64) md2(int) rc4(ptr,int) des(idx,cisc,4,long) aes(partial)
> idea(int) blowfish(idx)
> compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H
> -DL_ENDIAN -DTERMIO -O3 -Wall -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM
> available timing options: TIMES TIMEB HZ=1024 [sysconf value]
> timing function used: times
> The 'numbers' are in 1000s of bytes per second processed.
> type              16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
> md5               8388.37k   27899.28k   73067.45k  123142.98k  153932.33k
> sha1             10270.44k   33538.23k   93979.13k  171431.58k  224889.51k
> rc4             248141.11k  299502.68k  312819.11k  316454.57k  317876.62k
> blowfish cbc     52879.87k   56659.39k   57974.95k   58306.56k   58335.11k
> aes-128 cbc      56603.67k   78418.22k   86639.62k   89007.75k   88984.23k
> aes-192 cbc      53353.86k   72285.93k   79266.68k   81229.82k   81267.37k
> aes-256 cbc      50445.43k   67040.72k   72999.08k   74656.43k   74604.54k
> sha256           10808.77k   29896.92k   62069.85k   84877.31k   95065.43k
> sha512            7113.80k   28534.17k   72103.59k  134958.08k  181021.35k
>
> cfe2.imorgan> $NOBACKUP/build/bin/openssl speed aes bf rc4 md5 sha 2>/dev/nul
>
> OpenSSL 0.9.8e 23 Feb 2007
> built on: Fri Jun 8 09:27:49 PDT 2007
> options:bn(64,64) md2(int) rc4(ptr,int) des(idx,cisc,4,long) aes(partial)
> idea(int) blowfish(idx)
> compiler: icc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H
> -DL_ENDIAN -DTERMIO -O2 -Wall -no_cpprt -i-static -DSHA1_ASM -DSHA256_ASM
> -DSHA512_ASM -DAES_ASM
> available timing options: TIMES TIMEB HZ=1024 [sysconf value]
> timing function used: times
> The 'numbers' are in 1000s of bytes per second processed.
> type              16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
> md5               5835.63k   20437.19k   60845.48k  119983.98k  166928.38k
> sha1              6356.02k   19805.21k   63273.44k  140241.24k  216834.46k
> rc4             247902.49k  299131.32k  314115.67k  317933.50k  318111.74k
> blowfish cbc     50625.69k   59182.33k   61785.80k   62476.09k   62615.65k
> aes-128 cbc      47999.54k   51204.48k   52011.06k   52207.27k   51920.38k
> aes-192 cbc      45619.43k   48507.73k   49234.01k   49402.50k   49119.23k
> aes-256 cbc      43471.76k   46080.66k   46732.78k   46890.42k   46632.80k
> sha256            6729.22k   21080.31k   50995.71k   79001.82k   94085.12k
> sha512            4036.13k   16274.26k   48357.42k  109740.71k  174342.74k
>
> __
> OpenSSL Project http://www.openssl.org
> Development Mailing List openssl-dev@openssl.org
> Automated List Manager [EMAIL PROTECTED]

--
Iain Morgan
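To put the three builds side by side, here is a small Python sketch (not
part of the original posts) that normalizes the 8192-byte AES throughput
against the gcc -O3 build. The figures are copied from the speed tables in
this message; the names `results` and `relative_to` are made up for
illustration.

```python
# Throughput in 1000s of bytes/second for 8192-byte blocks, copied from the
# `openssl speed` tables quoted in this thread (IA64, OpenSSL 0.9.8e).
results = {
    "aes-128 cbc": {"gcc -O3": 88984.23, "icc -O2": 51920.38, "icc -O1": 94093.90},
    "aes-192 cbc": {"gcc -O3": 81267.37, "icc -O2": 49119.23, "icc -O1": 85327.87},
    "aes-256 cbc": {"gcc -O3": 74604.54, "icc -O2": 46632.80, "icc -O1": 77976.92},
}


def relative_to(baseline, row):
    """Return each build's throughput as a multiple of the baseline build."""
    return {build: value / row[baseline] for build, value in row.items()}


for cipher, row in results.items():
    ratios = relative_to("gcc -O3", row)
    summary = ", ".join(f"{build}: {ratio:.2f}x" for build, ratio in ratios.items())
    print(f"{cipher}: {summary}")
```

For aes-128 cbc this works out to roughly 0.58 times the gcc figure with
icc -O2 and roughly 1.06 times with icc -O1, which matches the impression
that something enabled at -O2 is hurting rather than helping.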
Re: Performance on IA64 using icc vs gcc
On Mon, Jun 11, 2007 at 11:18:53 -0700, Rick Jones wrote:
> My favorite starting point, when I have little else to go on, is to try
> to pick something "close" from a SPECint suite and use the settings the
> submitter used for it.
>
> Still would be nice to know if the stuff that was supposed to be
> assembly was still assembly when using icc rather than gcc.
>
> rick jones

I hadn't thought of comparing against something from SPECint, but
that's an interesting idea.

Yes, it does look like icc is using the assembly language code.

--
Iain Morgan
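One quick, partial way to check the assembly question is to look for the
*_ASM defines on the "compiler:" line that `openssl speed` prints. The
Python sketch below does only that; the function name `asm_defines` is
hypothetical, and the presence of a define is assumed to be a hint that the
build was configured for the assembly routines, not proof that they are the
code actually being executed.

```python
import re


def asm_defines(compiler_line):
    """Return the names of -D..._ASM macros found on a 'compiler:' line."""
    return re.findall(r"-D(\w+_ASM)\b", compiler_line)


# The icc -O2 compiler line as posted in this thread.
icc_line = (
    "compiler: icc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H "
    "-DL_ENDIAN -DTERMIO -O2 -Wall -no_cpprt -i-static -DSHA1_ASM -DSHA256_ASM "
    "-DSHA512_ASM -DAES_ASM"
)

print(asm_defines(icc_line))
```

A stronger check would be to inspect the build log or the object files
themselves, which this sketch does not attempt.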
Re: Performance on IA64 using icc vs gcc
My favorite starting point, when I have little else to go on, is to try
to pick something "close" from a SPECint suite and use the settings the
submitter used for it.

Still would be nice to know if the stuff that was supposed to be
assembly was still assembly when using icc rather than gcc.

rick jones
Re: Performance on IA64 using icc vs gcc
On Sat, Jun 09, 2007 at 14:52:27 +0200, Frank Büttner wrote:
> Kurt Roeckx wrote:
> > On Fri, Jun 08, 2007 at 05:00:37PM -0700, David Schwartz wrote:
> >>> Using the Intel 9.1 compiler on an IA64 system the performance of
> >>> AES and (to a lesser extent) other algorithms implemented in
> >>> assembly language is less than that using gcc. I've included the
> >>> speed output for several of the algorithms below.
> >>>
> >>> Is this a known issue and is there a workaround other than switching
> >>> to gcc?
> >> You should compare with the best optimization flags for each compiler. I
> >> don't see any of the typical icc optimization flags used,
> >> like -ip, -march=pentium4, -msse3, -xP, or whatever is appropriate for your
> >> CPU.
> >
> > I don't think -march=pentium4 is going to work on an IA64, and I have my
> > doubts about sse3 too.
> >
> > Note that IA64 is not an x86_64/amd64/x64.
> >
> >
> > Kurt
>
> Have you tried the latest version of the Intel compiler (10.0)?
>

Not yet; the system I'm building on does not have the 10.0 compilers.
However, that is something I intend to try.

--
Iain Morgan
RE: Performance on IA64 using icc vs gcc
> I don't think -march=pentium4 is going to work on an IA64, and I have my
> doubts about sse3 too.

Yeah, I misread the original post.

I still recommend comparing using the appropriate optimization flags for
each compiler. If you're going to compare them just based on
performance, you should allow each compiler to fully optimize for
performance.

Depending on what you're really trying to get at, you may wish to avoid
using assembler code as well, especially if you're comparing on newer
CPUs that the assembler code might not really be optimized for.

DS
Re: Performance on IA64 using icc vs gcc
Kurt Roeckx wrote:
> On Fri, Jun 08, 2007 at 05:00:37PM -0700, David Schwartz wrote:
>>> Using the Intel 9.1 compiler on an IA64 system the performance of
>>> AES and (to a lesser extent) other algorithms implemented in
>>> assembly language is less than that using gcc. I've included the
>>> speed output for several of the algorithms below.
>>>
>>> Is this a known issue and is there a workaround other than switching
>>> to gcc?
>> You should compare with the best optimization flags for each compiler. I
>> don't see any of the typical icc optimization flags used,
>> like -ip, -march=pentium4, -msse3, -xP, or whatever is appropriate for your
>> CPU.
>
> I don't think -march=pentium4 is going to work on an IA64, and I have my
> doubts about sse3 too.
>
> Note that IA64 is not an x86_64/amd64/x64.
>
>
> Kurt

Have you tried the latest version of the Intel compiler (10.0)?
Re: Performance on IA64 using icc vs gcc
On Fri, Jun 08, 2007 at 05:00:37PM -0700, David Schwartz wrote:
>
> > Using the Intel 9.1 compiler on an IA64 system the performance of
> > AES and (to a lesser extent) other algorithms implemented in
> > assembly language is less than that using gcc. I've included the
> > speed output for several of the algorithms below.
> >
> > Is this a known issue and is there a workaround other than switching
> > to gcc?
>
> You should compare with the best optimization flags for each compiler. I
> don't see any of the typical icc optimization flags used,
> like -ip, -march=pentium4, -msse3, -xP, or whatever is appropriate for your
> CPU.

I don't think -march=pentium4 is going to work on an IA64, and I have my
doubts about sse3 too.

Note that IA64 is not an x86_64/amd64/x64.


Kurt
Re: Performance on IA64 using icc vs gcc
On Fri, Jun 08, 2007 at 17:00:37 -0700, David Schwartz wrote:
>
> > Using the Intel 9.1 compiler on an IA64 system the performance of
> > AES and (to a lesser extent) other algorithms implemented in
> > assembly language is less than that using gcc. I've included the
> > speed output for several of the algorithms below.
> >
> > Is this a known issue and is there a workaround other than switching
> > to gcc?
>
> You should compare with the best optimization flags for each compiler. I
> don't see any of the typical icc optimization flags used,
> like -ip, -march=pentium4, -msse3, -xP, or whatever is appropriate for your
> CPU.
>
> DS
>

The options used in the icc case were simply those set by
./Configure linux-ia64-icc. The one option that I added was -i-static,
to force static linking to libimf.

--
Iain Morgan
RE: Performance on IA64 using icc vs gcc
> Using the Intel 9.1 compiler on an IA64 system the performance of
> AES and (to a lesser extent) other algorithms implemented in
> assembly language is less than that using gcc. I've included the
> speed output for several of the algorithms below.
>
> Is this a known issue and is there a workaround other than switching
> to gcc?

Here's what I get doing a similar test on a P3-1GHz machine.

OpenSSL 0.9.8d 28 Sep 2006
built on: Fri Jun 8 17:02:33 PDT 2007
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) idea(int) blowfish(idx)
compiler: gcc420 -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -march=pentium3 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
md5               5931.24k   20709.35k   60287.32k  116997.46k  159391.74k
sha1              5550.15k   17675.63k   44248.15k   70832.13k   85983.23k
sha512            1224.25k    4897.15k    8170.41k   11913.22k   13757.10k

OpenSSL 0.9.8d 28 Sep 2006
built on: Fri Jun 8 17:04:17 PDT 2007
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) idea(int) blowfish(idx)
compiler: icc91038 -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O3 -xK -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
md5               6958.21k   24528.05k   68030.50k  124347.11k  161808.25k
sha1              6404.12k   19898.15k   47526.49k   72804.35k   86343.68k
sha512            1390.74k    5584.36k    8995.75k   12954.28k   14846.63k

It could be your compiler options. It could be your choice of gcc
version. It could be something quirky about your hardware. Mostly, I
think it's the compiler flags you passed to icc.

DS
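For comparing two such runs without eyeballing the columns, a short Python
sketch along these lines may help. It is an illustration, not a tool from
the thread: `parse_speed_row` is a hypothetical helper, and it assumes the
algorithm label is separated from the first number by whitespace (a label
fused to its first value cannot be split unambiguously). The two sample rows
are the md5 results from the gcc and icc builds above.

```python
import re


def parse_speed_row(line):
    """Split one `openssl speed` result row into (label, [throughputs])."""
    first = re.search(r"\d+\.\d+k", line)
    if first is None:
        raise ValueError("no throughput values found")
    label = line[:first.start()].strip()
    values = [float(v) for v in re.findall(r"(\d+\.\d+)k", line)]
    return label, values


# md5 rows from the P3 runs above (gcc build, then icc build).
gcc_row = "md5 5931.24k 20709.35k 60287.32k 116997.46k 159391.74k"
icc_row = "md5 6958.21k 24528.05k 68030.50k 124347.11k 161808.25k"

_, gcc = parse_speed_row(gcc_row)
_, icc = parse_speed_row(icc_row)

# Percentage difference of icc relative to gcc for each block size.
for size, g, i in zip((16, 64, 256, 1024, 8192), gcc, icc):
    print(f"{size:>5} bytes: icc {100 * (i - g) / g:+.1f}% vs gcc")
```

On these rows icc comes out ahead at every block size, with the gap
narrowing as the blocks get larger.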
RE: Performance on IA64 using icc vs gcc
> Using the Intel 9.1 compiler on an IA64 system the performance of
> AES and (to a lesser extent) other algorithms implemented in
> assembly language is less than that using gcc. I've included the
> speed output for several of the algorithms below.
>
> Is this a known issue and is there a workaround other than switching
> to gcc?

You should compare with the best optimization flags for each compiler. I
don't see any of the typical icc optimization flags used,
like -ip, -march=pentium4, -msse3, -xP, or whatever is appropriate for
your CPU.

DS
Re: Performance on IA64 using icc vs gcc
Iain Morgan wrote:
> Hello,
>
> Using the Intel 9.1 compiler on an IA64 system the performance of
> AES and (to a lesser extent) other algorithms implemented in
> assembly language is less than that using gcc. I've included the
> speed output for several of the algorithms below.
>
> Is this a known issue and is there a workaround other than switching
> to gcc?

How about if you use something other than -O2 with icc, compared to the
-O3 on gcc? That could, I would think, affect the 'C' stuff, but it
shouldn't affect the assembly stuff (although who knows, I'm just a
networking guy...).

Is the assembly stuff such that it is still done with icc as the
compiler rather than gcc?

rick jones