Re: 1.3.10 memcmp() bug

2002-04-25 Thread Tim Prince


On Thursday 25 April 2002 00:22, Gareth Pearce wrote:
> >On Tuesday 23 April 2002 23:41, Sami Korhonen wrote:
> > > On Tue, 23 Apr 2002, Tim Prince wrote:
> >
> >AFAICT there's no reason this should behave differently on linux or
> >cygwin.  You're comparing the speed of memcmp() against the speed of
> >comparing ints in a loop.  When you don't ask the compiler to in-line
> >memcmp(), you get a library function which is written with enough smarts
> >to compare 4 bytes at a time.  Various versions of gcc are interpreting
> >the instruction to use "optimized" in-line code as a rep cmpsb, which is
> >slower than the newlib memcmp() function, even on my P-III.
> >P4's, particularly early versions, are notorious for various performance
> >glitches when using rep cmpsb on long strings.  gcc isn't smart enough to
> >look at the lengths of your strings and second guess your instruction to
> >do that, nor does it have a crystal ball to second guess your instruction
> >to generate 486 code, even if you were running a version with P4
> >optimizations.
> >In time critical applications, it can be quite important to learn the
> >particular tricks of your compiler and when to choose a separately
> >compiled string function, or when to ask for in-line, as well as to
> >acquire a library of such functions built for the processor of your
> >choice.  On the P4, you would have available 64-bit integer comparisons
> >if you chose to use them to speed this up.
> >--
>
> gcc 3.1+ are supposed to be 'more' intelligent about such things -
> although they aren't brilliant.
>
> Regards,
> Gareth

At least with -march=pentium3, gcc-3.1 has the same problem: it does not
know when not to do what it is asked.

-- 
Tim Prince

--
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
Bug reporting: http://cygwin.com/bugs.html
Documentation: http://cygwin.com/docs.html
FAQ:   http://cygwin.com/faq/

Re: 1.3.10 memcmp() bug

2002-04-25 Thread Gareth Pearce



>On Tuesday 23 April 2002 23:41, Sami Korhonen wrote:
> > On Tue, 23 Apr 2002, Tim Prince wrote:
> > > On Tuesday 23 April 2002 22:04, Sami Korhonen wrote:
> > > > I wasn't sure whether I should post about this on the gcc bug report
> > > > list or here. Anyway, it seems that using the -O2 flag with gcc causes
> > > > a huge slowdown in memcmp(). However, I don't see a performance drop
> > > > under linux, so I suppose it is a cygwin issue.
> > > >
> > > > $ gcc memtest.c -O2 -o memtest ; ./memtest.exe
> > > > Amount of memory to scan (mbytes)? 100
> > > > Memory block size (default 1024)? 1024
> > > > Allocating memory
> > > > Testing memory - read (1 byte at time)
> > > > Complete: 889.73MB/sec
> > > > Testing memory - read (4 bytes at time)
> > > > Complete: 3313.07MB/sec
> > > > Freeing memory
> > > >
> > > > $ gcc memtest.c -o memtest ; ./memtest.exe
> > > > Amount of memory to scan (mbytes)? 100
> > > > Memory block size (default 1024)? 1024
> > > > Allocating memory
> > > > Testing memory - read (1 byte at time)
> > > > Complete: 2517.94MB/sec
> > > > Testing memory - read (4 bytes at time)
> > > > Complete: 2933.50MB/sec
> > > > Freeing memory
> > > >
> > > >
> > > > '1 byte at time' is using memcmp() to compare two blocks.
> > >
> > > > You leave so many relevant considerations unspecified, that anything I
> > > > say must be a stab in the dark.  I assume you have a standard cygwin
> > > > installation, where binutils is built to honor only 4-byte alignments,
> > > > while recent linux configurations provide for 16-byte alignments.  The
> > > > significance of that is different on various CPU families, with code
> > > > alignment being quite important on certain CPU's, and data alignment
> > > > on others.  Do we assume that you are running on a 486, since you have
> > > > not told gcc otherwise?  You may have fallen accidentally into good
> > > > alignment in one case and bad in the other.  You might or might not be
> > > > using similar versions of gcc in cygwin and linux.  If you would
> > > > provide a test case, and mention some hardware parameters, some of the
> > > > mystery could be eliminated; for example, we could find out whether
> > > > memcmp() is code generated by gcc or from a library.  cygwin is not
> > > > generally considered an important target for performance optimization,
> > > > as you can see from the alignment considerations and the differences
> > > > in the libraries.
> > > --
> > > Tim Prince
> >
> > Sorry that I wasn't specific enough about my system configuration. I'm
> > running a standard installation of cygwin on x86 (P4) and WinXP. Both
> > tests were run under the same setup; the only difference was the use of
> > the -O2 flag. I find it odd that the performance difference is that huge.
> > Source is available at:
> > http://kotisivu.raketti.net/darkone/memtest/memtest.c
>AFAICT there's no reason this should behave differently on linux or cygwin.
>You're comparing the speed of memcmp() against the speed of comparing ints
>in a loop.  When you don't ask the compiler to in-line memcmp(), you get a
>library function which is written with enough smarts to compare 4 bytes at
>a time.  Various versions of gcc are interpreting the instruction to use
>"optimized" in-line code as a rep cmpsb, which is slower than the newlib
>memcmp() function, even on my P-III.
>P4's, particularly early versions, are notorious for various performance
>glitches when using rep cmpsb on long strings.  gcc isn't smart enough to
>look at the lengths of your strings and second guess your instruction to do
>that, nor does it have a crystal ball to second guess your instruction to
>generate 486 code, even if you were running a version with P4
>optimizations.
>In time critical applications, it can be quite important to learn the
>particular tricks of your compiler and when to choose a separately compiled
>string function, or when to ask for in-line, as well as to acquire a
>library of such functions built for the processor of your choice.  On the
>P4, you would have available 64-bit integer comparisons if you chose to use
>them to speed this up.
>--


gcc 3.1+ are supposed to be 'more' intelligent about such things - although
they aren't brilliant.

Regards,
Gareth


Re: 1.3.10 memcmp() bug

2002-04-24 Thread Tim Prince


On Tuesday 23 April 2002 23:41, Sami Korhonen wrote:
> On Tue, 23 Apr 2002, Tim Prince wrote:
> > On Tuesday 23 April 2002 22:04, Sami Korhonen wrote:
> > > I wasn't sure whether I should post about this on the gcc bug report
> > > list or here. Anyway, it seems that using the -O2 flag with gcc causes
> > > a huge slowdown in memcmp(). However, I don't see a performance drop
> > > under linux, so I suppose it is a cygwin issue.
> > >
> > > $ gcc memtest.c -O2 -o memtest ; ./memtest.exe
> > > Amount of memory to scan (mbytes)? 100
> > > Memory block size (default 1024)? 1024
> > > Allocating memory
> > > Testing memory - read (1 byte at time)
> > > Complete: 889.73MB/sec
> > > Testing memory - read (4 bytes at time)
> > > Complete: 3313.07MB/sec
> > > Freeing memory
> > >
> > > $ gcc memtest.c -o memtest ; ./memtest.exe
> > > Amount of memory to scan (mbytes)? 100
> > > Memory block size (default 1024)? 1024
> > > Allocating memory
> > > Testing memory - read (1 byte at time)
> > > Complete: 2517.94MB/sec
> > > Testing memory - read (4 bytes at time)
> > > Complete: 2933.50MB/sec
> > > Freeing memory
> > >
> > >
> > > '1 byte at time' is using memcmp() to compare two blocks.
> >
> > You leave so many relevant considerations unspecified, that anything I
> > say must be a stab in the dark.  I assume you have a standard cygwin
> > installation, where binutils is built to honor only 4-byte alignments,
> > while recent linux configurations provide for 16-byte alignments.  The
> > significance of that is different on various CPU families, with code
> > alignment being quite important on certain CPU's, and data alignment on
> > others.  Do we assume that you are running on a 486, since you have not
> > told gcc otherwise?  You may have fallen accidentally into good alignment
> > in one case and bad in the other.  You might or might not be using
> > similar versions of gcc in cygwin and linux.  If you would provide a test
> > case, and mention some hardware parameters, some of the mystery could be
> > eliminated; for example, we could find out whether memcmp() is code
> > generated by gcc or from a library.  cygwin is not generally considered
> > an important target for performance optimization, as you can see from the
> > alignment considerations and the differences in the libraries.
> > --
> > Tim Prince
>
> Sorry that I wasn't specific enough about my system configuration. I'm
> running a standard installation of cygwin on x86 (P4) and WinXP. Both
> tests were run under the same setup; the only difference was the use of
> the -O2 flag. I find it odd that the performance difference is that huge.
> Source is available at:
> http://kotisivu.raketti.net/darkone/memtest/memtest.c
AFAICT there's no reason this should behave differently on linux or cygwin.
You're comparing the speed of memcmp() against the speed of comparing ints in
a loop.  When you don't ask the compiler to in-line memcmp(), you get a
library function which is written with enough smarts to compare 4 bytes at a
time.  Various versions of gcc interpret the request for "optimized" in-line
code as a rep cmpsb, which is slower than the newlib memcmp() function, even
on my P-III.
P4s, particularly early versions, are notorious for various performance
glitches when using rep cmpsb on long strings.  gcc isn't smart enough to
look at the lengths of your strings and second-guess your instruction to do
that, nor does it have a crystal ball to second-guess your instruction to
generate 486 code, even if you were running a version with P4 optimizations.
In time-critical applications, it can be quite important to learn the
particular tricks of your compiler and when to choose a separately compiled
string function, or when to ask for in-line, as well as to acquire a library
of such functions built for the processor of your choice.  On the P4, you
would have available 64-bit integer comparisons if you chose to use them to
speed this up.
-- 
Tim Prince


Re: 1.3.10 memcmp() bug

2002-04-24 Thread C. J.


> > I wasn't sure whether I should post about this on the gcc bug report
> > list or here. Anyway, it seems that using the -O2 flag with gcc causes
> > a huge slowdown in memcmp(). However, I don't see a performance drop
> > under linux, so I suppose it is a cygwin issue.

cygwin's gcc version may be using an outdated x86 'optimization' for memcmp.
VC++ has a similar problem; see this:

http://groups.google.com/groups?hl=en&threadm=ucZKhyE3BHA.1464%40tkmsftngp02&rnum=3&prev=/groups%3Fq%3Drep%2Bgroup:microsoft.public.dotnet.languages.vc%26hl%3Den%26selm%3DucZKhyE3BHA.1464%2540tkmsftngp02%26rnum%3D3



Re: 1.3.10 memcmp() bug

2002-04-23 Thread Sami Korhonen


On Tue, 23 Apr 2002, Tim Prince wrote:

> On Tuesday 23 April 2002 22:04, Sami Korhonen wrote:
> > I wasn't sure whether I should post about this on the gcc bug report
> > list or here. Anyway, it seems that using the -O2 flag with gcc causes
> > a huge slowdown in memcmp(). However, I don't see a performance drop
> > under linux, so I suppose it is a cygwin issue.
> >
> > $ gcc memtest.c -O2 -o memtest ; ./memtest.exe
> > Amount of memory to scan (mbytes)? 100
> > Memory block size (default 1024)? 1024
> > Allocating memory
> > Testing memory - read (1 byte at time)
> > Complete: 889.73MB/sec
> > Testing memory - read (4 bytes at time)
> > Complete: 3313.07MB/sec
> > Freeing memory
> >
> > $ gcc memtest.c -o memtest ; ./memtest.exe
> > Amount of memory to scan (mbytes)? 100
> > Memory block size (default 1024)? 1024
> > Allocating memory
> > Testing memory - read (1 byte at time)
> > Complete: 2517.94MB/sec
> > Testing memory - read (4 bytes at time)
> > Complete: 2933.50MB/sec
> > Freeing memory
> >
> >
> > '1 byte at time' is using memcmp() to compare two blocks.
> You leave so many relevant considerations unspecified, that anything I say 
> must be a stab in the dark.  I assume you have a standard cygwin 
> installation, where binutils is built to honor only 4-byte alignments, while 
> recent linux configurations provide for 16-byte alignments.  The significance 
> of that is different on various CPU families, with code alignment being quite 
> important on certain CPU's, and data alignment on others.  Do we assume that 
> you are running on a 486, since you have not told gcc otherwise?  You may 
> have fallen accidentally into good alignment in one case and bad in the 
> other.  You might or might not be using similar versions of gcc in cygwin and 
> linux.  If you would provide a test case, and mention some hardware 
> parameters, some of the mystery could be eliminated; for example, we could 
> find out whether memcmp() is code generated by gcc or from a library.  cygwin 
> is not generally considered an important target for performance optimization, 
> as you can see from the alignment considerations and the differences in the 
> libraries.
> -- 
> Tim Prince
> 

Sorry that I wasn't specific enough about my system configuration. I'm
running a standard installation of cygwin on x86 (P4) and WinXP. Both
tests were run under the same setup; the only difference was the use of
the -O2 flag. I find it odd that the performance difference is that huge.
Source is available at:
http://kotisivu.raketti.net/darkone/memtest/memtest.c



Re: 1.3.10 memcmp() bug

2002-04-23 Thread Tim Prince


On Tuesday 23 April 2002 22:04, Sami Korhonen wrote:
> I wasn't sure whether I should post about this on the gcc bug report
> list or here. Anyway, it seems that using the -O2 flag with gcc causes
> a huge slowdown in memcmp(). However, I don't see a performance drop
> under linux, so I suppose it is a cygwin issue.
>
> $ gcc memtest.c -O2 -o memtest ; ./memtest.exe
> Amount of memory to scan (mbytes)? 100
> Memory block size (default 1024)? 1024
> Allocating memory
> Testing memory - read (1 byte at time)
> Complete: 889.73MB/sec
> Testing memory - read (4 bytes at time)
> Complete: 3313.07MB/sec
> Freeing memory
>
> $ gcc memtest.c -o memtest ; ./memtest.exe
> Amount of memory to scan (mbytes)? 100
> Memory block size (default 1024)? 1024
> Allocating memory
> Testing memory - read (1 byte at time)
> Complete: 2517.94MB/sec
> Testing memory - read (4 bytes at time)
> Complete: 2933.50MB/sec
> Freeing memory
>
>
> '1 byte at time' is using memcmp() to compare two blocks.
You leave so many relevant considerations unspecified that anything I say
must be a stab in the dark.  I assume you have a standard cygwin
installation, where binutils is built to honor only 4-byte alignments, while
recent linux configurations provide for 16-byte alignments.  The significance
of that differs among CPU families, with code alignment being quite
important on certain CPUs, and data alignment on others.  Do we assume that
you are running on a 486, since you have not told gcc otherwise?  You may
have fallen accidentally into good alignment in one case and bad in the
other.  You might or might not be using similar versions of gcc in cygwin and
linux.  If you would provide a test case, and mention some hardware
parameters, some of the mystery could be eliminated; for example, we could
find out whether memcmp() is code generated by gcc or from a library.  cygwin
is not generally considered an important target for performance optimization,
as you can see from the alignment considerations and the differences in the
libraries.
-- 
Tim Prince


1.3.10 memcmp() bug

2002-04-23 Thread Sami Korhonen


I wasn't sure whether I should post about this on the gcc bug report
list or here. Anyway, it seems that using the -O2 flag with gcc causes
a huge slowdown in memcmp(). However, I don't see a performance drop
under linux, so I suppose it is a cygwin issue.

$ gcc memtest.c -O2 -o memtest ; ./memtest.exe
Amount of memory to scan (mbytes)? 100
Memory block size (default 1024)? 1024
Allocating memory
Testing memory - read (1 byte at time)
Complete: 889.73MB/sec
Testing memory - read (4 bytes at time)
Complete: 3313.07MB/sec
Freeing memory

$ gcc memtest.c -o memtest ; ./memtest.exe
Amount of memory to scan (mbytes)? 100
Memory block size (default 1024)? 1024
Allocating memory
Testing memory - read (1 byte at time)
Complete: 2517.94MB/sec
Testing memory - read (4 bytes at time)
Complete: 2933.50MB/sec
Freeing memory


'1 byte at time' is using memcmp() to compare two blocks.

