Re: Performance on IA64 using icc vs gcc

2007-06-11 Thread Iain Morgan
OK, I've made some headway on this. First, I'd like to thank all those
who have provided input.

It looks like the issue might not be the lack of an additional optimization
option, but the adverse effect of one of the optimizations included by -O2.
I tried a build with -O1 instead and got much better results:

cfe2.imorgan> apps/openssl speed aes bf rc4 md5 sha1 2>/dev/null
OpenSSL 0.9.8e 23 Feb 2007
built on: Mon Jun 11 12:12:59 PDT 2007
options:bn(64,64) md2(int) rc4(ptr,int) des(idx,cisc,4,long) aes(partial) 
idea(int) blowfish(idx) 
compiler: icc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H 
-DL_ENDIAN -DTERMIO -O1 -Wall -no_cpprt -i-static -DSHA1_ASM -DSHA256_ASM 
-DSHA512_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=1024 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5               8507.03k    27722.30k    75531.10k   132557.28k   169630.22k
sha1              9792.46k    27468.37k    81129.39k   159646.12k   222044.16k
rc4             239281.42k   298509.16k   312526.86k   317020.73k   317904.21k
blowfish cbc     51905.17k    55981.70k    57137.15k    57406.12k    57450.08k
aes-128 cbc      80270.47k    91255.53k    94353.92k    95308.12k    94093.90k
aes-192 cbc      73833.99k    83016.73k    85580.54k    86355.07k    85327.87k
aes-256 cbc      68343.73k    76161.19k    78277.94k    78952.79k    77976.92k
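
To put the change in numbers, the throughput figures can be compared
directly. The following is a minimal Python sketch, not part of the
original build or benchmark. It copies the aes-128 cbc figures for
8192-byte blocks from this run and from the two runs quoted below; the
dictionary and function names are made up for illustration.

```python
# Throughput in thousands of bytes per second for aes-128 cbc at 8192-byte
# blocks, copied from the three `openssl speed` runs in this thread.
AES128_CBC_8192 = {
    "gcc -O3": 88984.23,
    "icc -O2": 51920.38,
    "icc -O1": 94093.90,
}


def relative_throughput(results, baseline):
    """Return each build's throughput divided by the baseline build's."""
    base = results[baseline]
    return {build: value / base for build, value in results.items()}


if __name__ == "__main__":
    ratios = relative_throughput(AES128_CBC_8192, "icc -O2")
    for build, ratio in ratios.items():
        print(f"{build}: {ratio:.2f}x the icc -O2 throughput")
```

The same function can be pointed at any other row or block size by
swapping in the corresponding figures.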

Iain Morgan wrote:
> Hello,
> 
> Using the Intel 9.1 compiler on an IA64 system the performance of
> AES and (to a lesser extent) other algorithms implemented in
> assembly language is less than that using gcc. I've included the
> speed output for several of the algorithms below.
> 
> Is this a known issue and is there a workaround other than switching
> to gcc?
> 
> Thanks
> 
> -- 
> Iain Morgan
> 
> cfe2.imorgan> apps/openssl speed aes bf rc4 md5 sha 2>/dev/null
> OpenSSL 0.9.8e 23 Feb 2007
> built on: Fri Jun  8 10:46:48 PDT 2007
> options:bn(64,64) md2(int) rc4(ptr,int) des(idx,cisc,4,long) aes(partial) 
> idea(int) blowfish(idx) 
> compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H 
> -DL_ENDIAN -DTERMIO -O3 -Wall -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM
> available timing options: TIMES TIMEB HZ=1024 [sysconf value]
> timing function used: times
> The 'numbers' are in 1000s of bytes per second processed.
> type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> md5               8388.37k    27899.28k    73067.45k   123142.98k   153932.33k
> sha1             10270.44k    33538.23k    93979.13k   171431.58k   224889.51k
> rc4             248141.11k   299502.68k   312819.11k   316454.57k   317876.62k
> blowfish cbc     52879.87k    56659.39k    57974.95k    58306.56k    58335.11k
> aes-128 cbc      56603.67k    78418.22k    86639.62k    89007.75k    88984.23k
> aes-192 cbc      53353.86k    72285.93k    79266.68k    81229.82k    81267.37k
> aes-256 cbc      50445.43k    67040.72k    72999.08k    74656.43k    74604.54k
> sha256           10808.77k    29896.92k    62069.85k    84877.31k    95065.43k
> sha512            7113.80k    28534.17k    72103.59k   134958.08k   181021.35k
> 
> cfe2.imorgan> $NOBACKUP/build/bin/openssl speed aes bf rc4 md5 sha 2>/dev/null
> OpenSSL 0.9.8e 23 Feb 2007
> built on: Fri Jun  8 09:27:49 PDT 2007
> options:bn(64,64) md2(int) rc4(ptr,int) des(idx,cisc,4,long) aes(partial) 
> idea(int) blowfish(idx) 
> compiler: icc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H 
> -DL_ENDIAN -DTERMIO -O2 -Wall -no_cpprt -i-static -DSHA1_ASM -DSHA256_ASM 
> -DSHA512_ASM -DAES_ASM
> available timing options: TIMES TIMEB HZ=1024 [sysconf value]
> timing function used: times
> The 'numbers' are in 1000s of bytes per second processed.
> type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> md5               5835.63k    20437.19k    60845.48k   119983.98k   166928.38k
> sha1              6356.02k    19805.21k    63273.44k   140241.24k   216834.46k
> rc4             247902.49k   299131.32k   314115.67k   317933.50k   318111.74k
> blowfish cbc     50625.69k    59182.33k    61785.80k    62476.09k    62615.65k
> aes-128 cbc      47999.54k    51204.48k    52011.06k    52207.27k    51920.38k
> aes-192 cbc      45619.43k    48507.73k    49234.01k    49402.50k    49119.23k
> aes-256 cbc      43471.76k    46080.66k    46732.78k    46890.42k    46632.80k
> sha256            6729.22k    21080.31k    50995.71k    79001.82k    94085.12k
> sha512            4036.13k    16274.26k    48357.42k   109740.71k   174342.74k
> 
> __
> OpenSSL Project http://www.openssl.org
> Development Mailing List   openssl-dev@openssl.org
> Automated List Manager   [EMAIL PROTECTED]

-- 
Iain Morgan

Re: Performance on IA64 using icc vs gcc

2007-06-11 Thread Iain Morgan
On Mon, Jun 11, 2007 at 11:18:53 -0700, Rick Jones wrote:
> My favorite starting point, when I have little else to go on, is to try 
> to pick something "close" from a SPECint suite and use the settings the 
> submitter used for it.
> 
> Still would be nice to know if the stuff that was supposed to be 
> assembly was still assembly when using icc rather than gcc.
> 
> rick jones

I hadn't thought of comparing against something from SPECint,
but that's an interesting idea.

Yes, it does look like icc is using the assembly language code.
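
One rough way to see which algorithms a build was configured to take from
assembly is to look at the *_ASM macros on the "compiler:" line of the
speed output. The Python sketch below does that for the icc -O1 build
posted earlier; the function name is made up for illustration, and the
presence of a macro only shows how the build was configured, not which
code actually ran.

```python
import shlex

# The "compiler:" line reported by the icc -O1 build earlier in this thread.
COMPILER_LINE = (
    "icc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H "
    "-DL_ENDIAN -DTERMIO -O1 -Wall -no_cpprt -i-static -DSHA1_ASM "
    "-DSHA256_ASM -DSHA512_ASM -DAES_ASM"
)


def asm_defines(compiler_line):
    """List the -D..._ASM macros present in a compiler command line."""
    tokens = shlex.split(compiler_line)
    # Keep only preprocessor defines whose name ends in _ASM, minus the "-D".
    return [t[2:] for t in tokens if t.startswith("-D") and t.endswith("_ASM")]


if __name__ == "__main__":
    print(asm_defines(COMPILER_LINE))
```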

-- 
Iain Morgan


Re: Performance on IA64 using icc vs gcc

2007-06-11 Thread Rick Jones
My favorite starting point, when I have little else to go on, is to try 
to pick something "close" from a SPECint suite and use the settings the 
submitter used for it.


Still would be nice to know if the stuff that was supposed to be 
assembly was still assembly when using icc rather than gcc.


rick jones


Re: Performance on IA64 using icc vs gcc

2007-06-11 Thread Iain Morgan
On Sat, Jun 09, 2007 at 14:52:27 +0200, Frank Büttner wrote:
> Kurt Roeckx wrote:
> > On Fri, Jun 08, 2007 at 05:00:37PM -0700, David Schwartz wrote:
> >>> Using the Intel 9.1 compiler on an IA64 system the performance of
> >>> AES and (to a lesser extent) other algorithms implemented in
> >>> assembly language is less than that using gcc. I've included the
> >>> speed output for several of the algorithms below.
> >>>
> >>> Is this a known issue and is there a workaround other than switching
> >>> to gcc?
> >> You should compare with the best optimization flags for each compiler. I
> >> don't see any of the typical icc optimization flags used,
> >> like -ip, -march=pentium4, -msse3, -xP, or whatever is appropriate for your
> >> CPU.
> > 
> > I don't think -march=pentium4 is going to work on an IA64, and I have my
> > doubts about sse3 too.
> > 
> > Note that IA64 is not an x86_64/amd64/x64.
> > 
> > 
> > Kurt
> 
> Have you tried the latest version of the Intel compiler (10.0)?
> 

Not yet; the system I'm building on does not have the 10.0 compilers.
However, that is something I intend to try.

-- 
Iain Morgan


RE: Performance on IA64 using icc vs gcc

2007-06-10 Thread David Schwartz

> I don't think -march=pentium4 is going to work on an IA64, and I have my
> doubts about sse3 too.

Yeah, I misread the original post. I still recommend comparing using the
appropriate optimization flags for each compiler. If you're going to compare
them just based on performance, you should allow each compiler to fully
optimize for performance. Depending on what you're really trying to get at,
you may wish to avoid using assembler code as well, especially if you're
comparing on newer CPUs that the assembler code might not really be
optimized for.

DS




Re: Performance on IA64 using icc vs gcc

2007-06-09 Thread Frank Büttner
Kurt Roeckx wrote:
> On Fri, Jun 08, 2007 at 05:00:37PM -0700, David Schwartz wrote:
>>> Using the Intel 9.1 compiler on an IA64 system the performance of
>>> AES and (to a lesser extent) other algorithms implemented in
>>> assembly language is less than that using gcc. I've included the
>>> speed output for several of the algorithms below.
>>>
>>> Is this a known issue and is there a workaround other than switching
>>> to gcc?
>> You should compare with the best optimization flags for each compiler. I
>> don't see any of the typical icc optimization flags used,
>> like -ip, -march=pentium4, -msse3, -xP, or whatever is appropriate for your
>> CPU.
> 
> I don't think -march=pentium4 is going to work on an IA64, and I have my
> doubts about sse3 too.
> 
> Note that IA64 is not an x86_64/amd64/x64.
> 
> 
> Kurt

Have you tried the latest version of the Intel compiler (10.0)?





Re: Performance on IA64 using icc vs gcc

2007-06-09 Thread Kurt Roeckx
On Fri, Jun 08, 2007 at 05:00:37PM -0700, David Schwartz wrote:
> 
> > Using the Intel 9.1 compiler on an IA64 system the performance of
> > AES and (to a lesser extent) other algorithms implemented in
> > assembly language is less than that using gcc. I've included the
> > speed output for several of the algorithms below.
> >
> > Is this a known issue and is there a workaround other than switching
> > to gcc?
> 
> You should compare with the best optimization flags for each compiler. I
> don't see any of the typical icc optimization flags used,
> like -ip, -march=pentium4, -msse3, -xP, or whatever is appropriate for your
> CPU.

I don't think -march=pentium4 is going to work on an IA64, and I have my
doubts about sse3 too.

Note that IA64 is not an x86_64/amd64/x64.


Kurt



Re: Performance on IA64 using icc vs gcc

2007-06-08 Thread Iain Morgan
On Fri, Jun 08, 2007 at 17:00:37 -0700, David Schwartz wrote:
> 
> > Using the Intel 9.1 compiler on an IA64 system the performance of
> > AES and (to a lesser extent) other algorithms implemented in
> > assembly language is less than that using gcc. I've included the
> > speed output for several of the algorithms below.
> >
> > Is this a known issue and is there a workaround other than switching
> > to gcc?
> 
> You should compare with the best optimization flags for each compiler. I
> don't see any of the typical icc optimization flags used,
> like -ip, -march=pentium4, -msse3, -xP, or whatever is appropriate for your
> CPU.
> 
> DS
> 

The options used in the icc case were simply those set by
./Configure linux-ia64-icc. The one option that I added
was -i-static to force static linking to libimf.
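
For reference, the flags that differ between the two builds in the
original post can be listed mechanically. This is only a sketch in
Python: the two strings are the "compiler:" lines from the speed output
in that post, and the function name is made up for illustration.

```python
import shlex

# "compiler:" lines from the gcc and icc -O2 builds in the original post.
GCC_LINE = (
    "gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H "
    "-DL_ENDIAN -DTERMIO -O3 -Wall -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM "
    "-DAES_ASM"
)
ICC_LINE = (
    "icc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H "
    "-DL_ENDIAN -DTERMIO -O2 -Wall -no_cpprt -i-static -DSHA1_ASM "
    "-DSHA256_ASM -DSHA512_ASM -DAES_ASM"
)


def flag_difference(line_a, line_b):
    """Return (flags only in line_a, flags only in line_b).

    The first token of each line is the compiler name and is ignored.
    """
    flags_a = shlex.split(line_a)[1:]
    flags_b = shlex.split(line_b)[1:]
    only_a = [f for f in flags_a if f not in flags_b]
    only_b = [f for f in flags_b if f not in flags_a]
    return only_a, only_b


if __name__ == "__main__":
    only_gcc, only_icc = flag_difference(GCC_LINE, ICC_LINE)
    print("gcc only:", only_gcc)
    print("icc only:", only_icc)
```

Apart from the optimization level, the icc-only flags are -no_cpprt and
the hand-added -i-static; everything else is common to both builds.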

-- 
Iain Morgan


RE: Performance on IA64 using icc vs gcc

2007-06-08 Thread David Schwartz

> Using the Intel 9.1 compiler on an IA64 system the performance of
> AES and (to a lesser extent) other algorithms implemented in
> assembly language is less than that using gcc. I've included the
> speed output for several of the algorithms below.
>
> Is this a known issue and is there a workaround other than switching
> to gcc?

Here's what I get doing a similar test on a 1 GHz P3 machine.

OpenSSL 0.9.8d 28 Sep 2006
built on: Fri Jun  8 17:02:33 PDT 2007
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial)
idea(int) blowfish(idx)
compiler: gcc420 -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H
-march=pentium3 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall
-DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM
-DRMD160_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5               5931.24k    20709.35k    60287.32k   116997.46k   159391.74k
sha1              5550.15k    17675.63k    44248.15k    70832.13k    85983.23k
sha512            1224.25k     4897.15k     8170.41k    11913.22k    13757.10k

OpenSSL 0.9.8d 28 Sep 2006
built on: Fri Jun  8 17:04:17 PDT 2007
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial)
idea(int) blowfish(idx)
compiler: icc91038 -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H
-DL_ENDIAN -DTERMIO -O3 -xK -fomit-frame-pointer -Wall
-DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM
-DRMD160_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5               6958.21k    24528.05k    68030.50k   124347.11k   161808.25k
sha1              6404.12k    19898.15k    47526.49k    72804.35k    86343.68k
sha512            1390.74k     5584.36k     8995.75k    12954.28k    14846.63k

It could be your compiler options. It could be your choice of gcc version.
It could be something quirky about your hardware. Mostly, I think it's the
compiler flags you passed to icc.

DS




RE: Performance on IA64 using icc vs gcc

2007-06-08 Thread David Schwartz

> Using the Intel 9.1 compiler on an IA64 system the performance of
> AES and (to a lesser extent) other algorithms implemented in
> assembly language is less than that using gcc. I've included the
> speed output for several of the algorithms below.
>
> Is this a known issue and is there a workaround other than switching
> to gcc?

You should compare with the best optimization flags for each compiler. I
don't see any of the typical icc optimization flags used,
like -ip, -march=pentium4, -msse3, -xP, or whatever is appropriate for your
CPU.

DS




Re: Performance on IA64 using icc vs gcc

2007-06-08 Thread Rick Jones

Iain Morgan wrote:

> Hello,
> 
> Using the Intel 9.1 compiler on an IA64 system the performance of
> AES and (to a lesser extent) other algorithms implemented in
> assembly language is less than that using gcc. I've included the
> speed output for several of the algorithms below.
> 
> Is this a known issue and is there a workaround other than switching
> to gcc?


How about using something other than -O2 with icc, to compare against the 
-O3 on gcc?  I would think that could affect the 'C' stuff but shouldn't 
affect the assembly stuff (although who knows, I'm just a networking 
guy...).  Is the assembly stuff still used when icc is the compiler 
rather than gcc?


rick jones