Bug#625522: glibc: causes segfault in Xorg

2011-05-07 Thread Aurelien Jarno
On Wed, May 04, 2011 at 02:30:35PM +0200, Aurelien Jarno wrote:
 On 04/05/2011 07:42, Steve M. Robbins wrote:
  On Wed, May 04, 2011 at 12:10:48AM -0500, Jonathan Nieder wrote:
  
  Sounds like http://sourceware.org/bugzilla/show_bug.cgi?id=12518
  which is fixed (sort of) by commit 0354e355 (2011-04-01).
  
  Oh my word.  So glibc 2.13 breaks random binaries that happened to
  incorrectly use memcpy() instead of memmove()?  What's wrong with the
  glibc developers (and Ulrich Drepper in particular)?
  
  I'm with Linus on this: let's just revert to the old behaviour.  A
  tiny amount of clock cycles saved isn't worth the instability.
  
  Thanks,
  -Steve
  
  P.S.  I tried rebuilding glibc myself locally, but gcc also segfaults
  in the process :-(
  
 
 Are you sure it is something related? Which gcc version are you using?
 Do you have a backtrace pointing to the same issue?
 
 I have been using this libc version for two months (on a CPU with the
 SSSE3 instruction set), and it is also used by other distributions, so I
 find it strange that it breaks something as common as gcc. For Xorg it
 may be due to a difference in configuration, which would explain why the
 problem stayed unnoticed.
 

Any news about that? Which GCC version is affected? Can you please send
us the backtrace?

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#625522: glibc: causes segfault in Xorg

2011-05-07 Thread Steve M. Robbins
On Sat, May 07, 2011 at 12:25:15PM +0200, Aurelien Jarno wrote:
 On Wed, May 04, 2011 at 02:30:35PM +0200, Aurelien Jarno wrote:
  On 04/05/2011 07:42, Steve M. Robbins wrote:

   P.S.  I tried rebuilding glibc myself locally, but gcc also segfaults
   in the process :-(
  
  Are you sure it is something related? Which gcc version are you using?
  Do you have a backtrace pointing to the same issue?

I was careless in my initial report; I should have specified that I
tried rebuilding the *old* glibc and got a segfault.  At this point,
all I really know is that building the eglibc 2.11.2-11 Debian source
package on my up-to-date sid amd64 machine fails:

make[3]: Entering directory `/home/steve/tmp/old-eglibc/eglibc-2.11.2/sunrpc'
CPP='gcc-4.4 -E -x c-header' /home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/elf/ld-linux-x86-64.so.2 --library-path /home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc:/home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/math:/home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/elf:/home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/dlfcn:/home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/nss:/home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/nis:/home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/rt:/home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/resolv:/home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/crypt:/home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/nptl /home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/sunrpc/rpcgen -Y ../scripts -c rpcsvc/bootparam_prot.x -o /home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/sunrpc/xbootparam_prot.T
make[3]: *** [/home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/sunrpc/xbootparam_prot.stmp] Segmentation fault (core dumped)

The segfault is actually in the ld-linux-x86-64.so.2 binary produced
during the build, not gcc as I had earlier written.  The backtrace is:

(gdb) bt full
#0  0x0000000000000000 in ?? ()
No symbol table info available.
#1  0x2b4d84d7e990 in call_init (l=<value optimized out>, argc=7, 
argv=0x7fff26bf50a0, env=0x7fff26bf50e0) at dl-init.c:85
j = 1
jm = 4
init_array = 0x2b4d852ebb50
#2  0x2b4d84d7ea87 in _dl_init (main_map=0x2b4d84f90178, argc=7, 
argv=0x7fff26bf50a0, env=0x7fff26bf50e0) at dl-init.c:134
preinit_array = <value optimized out>
preinit_array_size = 0x0
i = 0
#3  0x2b4d84d71b2a in _dl_start_user ()
   from 
/home/steve/tmp/old-eglibc/eglibc-2.11.2/build-tree/amd64-libc/elf/ld-linux-x86-64.so.2
No symbol table info available.
#4  0x7fff26bf6574 in ?? ()
No symbol table info available.
#5  0x0007 in ?? ()
No symbol table info available.
#6  0x7fff26bf6825 in ?? ()
No symbol table info available.
#7  0x7fff26bf6872 in ?? ()
No symbol table info available.
#8  0x7fff26bf6875 in ?? ()
No symbol table info available.
#9  0x7fff26bf6880 in ?? ()
No symbol table info available.
#10 0x7fff26bf6883 in ?? ()
No symbol table info available.
#11 0x7fff26bf689b in ?? ()
No symbol table info available.
#12 0x7fff26bf689e in ?? ()
No symbol table info available.
#13 0x0000000000000000 in ?? ()
No symbol table info available.

Hope this clarifies the issue somewhat.

Thanks,
-Steve




Bug#625522: glibc: causes segfault in Xorg

2011-05-04 Thread Jonathan Nieder
Aurelien Jarno wrote:

 Except that package rebuild doesn't mean a new upload (e.g. binNMUs).

Yes, it would be painful if many packages have bugs of this kind.
Open source projects tend to check for this (and I've never run into
it after using libc 2.13 for a while) but I could easily be
underestimating how bad it is.

What I meant is that packages rebuilt against libc from sid are
generally targeted at wheezy.  That would (one hopes) give a little
time to test and fix them.

 I am not convinced that the upstream fix is really the solution. As soon
 as the package is rebuilt, the problem will happen again.

I think it's mostly meant as a workaround to allow people to keep
using Flash and old binaries.

Another big downside is making almost everything depend on libc6 (>=
2.14).  Binaries built against glibc with the upstream fix wouldn't be
usable on older systems.

 On 04/05/2011 09:05, Jonathan Nieder wrote:

 E.g., how about adopting hjl's suggestion and making the
 behavior (temporarily) conditional on a LD_DONT_BIND_IFUNC_MEMCPY_TO_MEMMOVE
 environment variable?

 I don't really feel like enabling critical features depending on an
 environment variable that might not be properly propagated in some shell
 scripts.

I'm not a huge fan of the envvar trick, but I think you read it
backwards.  Unlike hjl in the bug log, I was suggesting using the safe
behavior when the envvar is not set.  At worst a script using sudo or
env -i would cause programs it calls to use memmove instead of
memcpy.
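
As a rough illustration of the "safe unless opted in" idea, here is a
minimal C sketch. It is not how glibc does it (the real choice would be
made by the IFUNC resolver in the dynamic loader); the helper names
select_copy, fast_copy and safe_copy are hypothetical, and only the
variable name comes from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Hypothetical stand-in for a fast routine that requires
 * non-overlapping buffers. */
void *fast_copy(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Hypothetical stand-in for the overlap-safe routine. */
void *safe_copy(void *dst, const void *src, size_t n)
{
    return memmove(dst, src, n);
}

/* Pure selection logic: the fast routine is chosen only when the
 * variable is set to a non-empty value.  An unset or empty variable
 * (for example after `env -i` or sudo) falls back to the safe one. */
copy_fn select_copy(const char *opt_in)
{
    if (opt_in != NULL && opt_in[0] != '\0')
        return fast_copy;
    return safe_copy;
}

/* Reads the variable named in this thread from the environment. */
copy_fn select_copy_from_env(void)
{
    return select_copy(getenv("LD_DONT_BIND_IFUNC_MEMCPY_TO_MEMMOVE"));
}
```

With this shape, a script that drops the environment can only make
programs slower, never less safe.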

Unfortunately I fear testers would be unlikely to actually use such
a variable.  Even MALLOC_PERTURB_ is not as widely used as one would
like, judging from the bugs it sometimes uncovers.

So yes, back to the drawing board.  Thanks for your thoughtfulness.






Bug#625522: glibc: causes segfault in Xorg

2011-05-04 Thread Aurelien Jarno
On 04/05/2011 14:02, Jonathan Nieder wrote:
 Aurelien Jarno wrote:
 
 Except that package rebuild doesn't mean a new upload (e.g. binNMUs).
 
 Yes, it would be painful if many packages have bugs of this kind.
 Open source projects tend to check for this (and I've never run into
 it after using libc 2.13 for a while) but I could easily be
 underestimating how bad it is.
 
 What I meant is that packages rebuilt against libc from sid are
 generally targeted at wheezy.  That would (one hopes) give a little
 time to test and fix them.
 
 I am not convinced that the upstream fix is really the solution. As soon
 as the package is rebuilt, the problem will happen again.
 
 I think it's mostly meant as a workaround to allow people to keep
 using Flash and old binaries.
 
 Another big downside is making almost everything depend on libc6 (>=
 2.14).  Binaries built against glibc with the upstream fix wouldn't be
 usable on older systems.
 
 On 04/05/2011 09:05, Jonathan Nieder wrote:
 
 E.g., how about adopting hjl's suggestion and making the
 behavior (temporarily) conditional on a LD_DONT_BIND_IFUNC_MEMCPY_TO_MEMMOVE
 environment variable?

 I don't really feel like enabling critical features depending on an
 environment variable that might not be properly propagated in some shell
 scripts.
 
 I'm not a huge fan of the envvar trick, but I think you read it
 backwards.  Unlike hjl in the bug log, I was suggesting using the safe
 behavior when the envvar is not set.  At worst a script using sudo or
 env -i would cause programs it calls to use memmove instead of
 memcpy.
 
 Unfortunately I fear testers would be unlikely to actually use such
 a variable.  Even MALLOC_PERTURB_ is not as widely used as one would
 like, judging from the bugs it sometimes uncovers.
 
 So yes, back to the drawing board.  Thanks for your thoughtfulness.
 

I have experimented a bit with some test code. I have discovered that
even with the old memcpy() implementation, copying between overlapping
regions does not always work. What changes with the new memcpy_ssse3 is
that the copy happens backward, so the conditions under which an
overlapping copy happens to work are not the same.
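
To make the direction dependence concrete, here is a small C sketch. It
uses hand-written byte loops (copy_forward, copy_backward and overlap_ok
are hypothetical names) rather than calling memcpy() on overlapping
buffers, which would be undefined behaviour; the real glibc routines are
far more elaborate, so this only models the principle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*copy_fn)(char *, const char *, size_t);

/* Copy n bytes starting at the lowest address. */
void copy_forward(char *dst, const char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Copy n bytes starting at the highest address. */
void copy_backward(char *dst, const char *src, size_t n)
{
    while (n-- > 0)
        dst[n] = src[n];
}

/* Copy 6 bytes inside a 10-character buffer with the given routine and
 * report whether the result matches what memmove() produces. */
int overlap_ok(copy_fn copy, size_t dst_off, size_t src_off)
{
    char got[] = "0123456789";
    char want[] = "0123456789";
    size_t n = 6;

    assert(dst_off + n <= 10 && src_off + n <= 10);
    copy(got + dst_off, got + src_off, n);
    memmove(want + dst_off, want + src_off, n);
    return memcmp(got, want, sizeof got) == 0;
}
```

In this model a forward copy gives the memmove() result only when the
destination lies below the source, and a backward copy only when it lies
above, so code that happened to work with one direction can break with
the other.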

It means that if we simply replace memcpy() with memmove(), people might
write code that works well with the new libc but doesn't work with an old
libc (or, even worse, behaves differently depending on how other
distributions have chosen to work around this bug, if they chose to do
so). That doesn't seem to be a good idea, especially for people using
Debian as a development platform.

Basically it seems we only want to replace calls to __memcpy_ssse3_back
by calls to __memmove_ssse3_back.
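
A sketch of what that selective replacement could look like, in C. The
enum, the stand-in functions and resolve_memcpy are all hypothetical and
only model the idea; they are not the actual glibc resolver or its
optimized routines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Hypothetical labels for the variants a resolver might pick. */
enum variant { VARIANT_FORWARD, VARIANT_SSSE3_BACK };

/* Stand-in for the older, forward-copying memcpy. */
void *forward_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* Stand-in for __memcpy_ssse3_back: copies from the end. */
void *backward_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n-- > 0)
        d[n] = s[n];
    return dst;
}

/* Stand-in for __memmove_ssse3_back: safe for overlapping buffers. */
void *backward_memmove(void *dst, const void *src, size_t n)
{
    return memmove(dst, src, n);
}

/* Only the backward-copying variant is swapped for its memmove
 * counterpart; every other variant keeps its existing behaviour. */
copy_fn resolve_memcpy(enum variant v, int patched)
{
    if (v == VARIANT_SSSE3_BACK)
        return patched ? backward_memmove : backward_memcpy;
    return forward_memcpy;
}
```

The point of limiting the swap is that overlapping copies which never
worked with the old forward implementation still do not silently start
working on Debian alone.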

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net


