Your message dated Thu, 17 Dec 2015 19:35:14 +0100
with message-id <20151217183514.ga22...@aurel32.net>
and subject line Re: Bug#572746: libm: sinf/cosf performance is awful on amd64
has caused the Debian Bug report #572746,
regarding libm: sinf/cosf performance is awful on amd64
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)
--
572746: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=572746
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
--- Begin Message ---
Package: libc6
Version: 2.10.2-6
Severity: normal
Hi,
After many tests and much research, I've come to the conclusion that the float
variants of sin/cos (and maybe others) are abnormally slow on Debian amd64.
The performance loss is really impressive (around 8 to 9 times slower).
I've attached the program used for my experiments and ran it in the following
cases.
+ Lenny-amd64: sinf/cosf is really slow
+ Lenny-i386: float performance is ok (faster than the cos/sin using double)
+ Sid-amd64: sinf/cosf slow
+ Lenny-amd64 using lenny-i386 binary and 32-bit libs: float performance is OK.
+ OpenSuse 64-bit (10.3 and 11.1): using the binary compiled on lenny-amd64,
the tests run fine!
=> The problem is not compiler related.
There seems to be a problem with the way libm is compiled for the amd64
architecture on Debian.
This is why the OpenSuse test was run: the problem is somewhere in the compile
chain or in Debian-specific patches.
We're using these functions extensively for calculations and this is a real
problem. Using cos/sin as a temporary workaround would do the trick, but this
is still slower than the sinf/cosf implementations that work so well on 32-bit
computers...
Thank you
Jerome
-- System Information:
Debian Release: squeeze/sid
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.32-trunk-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8) (ignored: LC_ALL
set to en_US.utf8)
Shell: /bin/sh linked to /bin/bash
Versions of packages libc6 depends on:
ii libc-bin 2.10.2-6 Embedded GNU C Library: Binaries
ii libgcc1 1:4.4.3-3 GCC support library
libc6 recommends no packages.
Versions of packages libc6 suggests:
ii debconf [debconf-2.0] 1.5.28 Debian configuration management sy
pn glibc-doc <none> (no description available)
ii locales 2.10.2-6 Embedded GNU C Library: National L
-- debconf information excluded
CC=gcc
CFLAGS=-DNDEBUG -O3 -D_ISOC99_SOURCE -Wall -Wextra
# Libraries go in LDLIBS so that make's implicit rule places -lm after the source file.
LDLIBS=-lm
all: test_trig
clean:
	rm -f test_trig
test_trig: test_trig.c
#include <math.h>
#include <sys/time.h>
#include <stdio.h>

/* Print the accumulated result and the elapsed time between two samples.
   The microsecond difference can be negative, so borrow one second if needed. */
static void print_result(float result, struct timeval start, struct timeval end)
{
    long sec = (long)(end.tv_sec - start.tv_sec);
    long usec = (long)(end.tv_usec - start.tv_usec);

    if (usec < 0) {
        sec -= 1;
        usec += 1000000L;
    }
    printf("Result: %f, Duration: %ld sec %ld usec\n", result, sec, usec);
}

int main(void)
{
    const int nbElement_i = 10000000;
    int i = 0;
    float f1 = 0.0f, f2 = 0.0f, f3 = 0.0f;
    struct timeval tv1, tv2;

    printf("Testing %d sinf and cosf... ", nbElement_i);
    fflush(stdout);
    gettimeofday(&tv1, NULL);
    for (i = 0; i < nbElement_i; i++) {
        f1 += cosf(i);
        f2 += sinf(i);
    }
    // This is needed so gcc knows the f1 and f2 results
    // really matter; otherwise the sinf and cosf calls
    // could be optimized away.
    f3 = f1 + f2;
    gettimeofday(&tv2, NULL);
    print_result(f3, tv1, tv2);

    f1 = 0.0f; f2 = 0.0f;
    printf("Testing %d sin and cos (with float args)... ", nbElement_i);
    fflush(stdout);
    gettimeofday(&tv1, NULL);
    for (i = 0; i < nbElement_i; i++) {
        f1 += cos(i);
        f2 += sin(i);
    }
    // Same as above: keep the sin and cos results live.
    f3 = f1 + f2;
    gettimeofday(&tv2, NULL);
    print_result(f3, tv1, tv2);
    return 0;
}
--- End Message ---
--- Begin Message ---
Version: 2.17-1
On 2010-03-06 11:42, Jerome Vizcaino wrote:
> Package: libc6
> Version: 2.10.2-6
> Severity: normal
>
> Hi,
>
> After many tests and much research, I've come to the conclusion that the
> float variants of sin/cos (and maybe others) are abnormally slow on Debian
> amd64.
> The performance loss is really impressive (around 8 to 9 times slower).
> I've attached the program used for my experiments and ran it in the
> following cases.
>
> + Lenny-amd64: sinf/cosf is really slow
> + Lenny-i386: float performance is ok (faster than the cos/sin using double)
> + Sid-amd64: sinf/cosf slow
> + Lenny-amd64 using lenny-i386 binary and 32-bit libs: float performance is
> OK.
>
> + OpenSuse 64-bit (10.3 and 11.1): using the binary compiled on lenny-amd64,
> the tests run fine!
> => The problem is not compiler related.
>
> There seems to be a problem with the way libm is compiled for the amd64
> architecture on Debian.
> This is why the OpenSuse test was run: the problem is somewhere in the
> compile chain or in Debian-specific patches.
>
> We're using these functions extensively for calculations and this is a real
> problem. Using cos/sin as a temporary workaround would do the trick, but
> this is still slower than the sinf/cosf implementations that work so well
> on 32-bit computers...
Optimized SSE2-based sinf/cosf routines were added in version 2.17-1,
fixing the performance and precision issue. I am therefore closing this
bug.
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net
--- End Message ---