** Description changed: + NOTE: See comment 17 for a test case, and following comments for + analysis. + Binary package hint: gcc-4.5 I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only. It is hanging here: #0 0xffffe405 in __kernel_vsyscall () #1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142 #2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0 #3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61 #4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417 #5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766 #6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845 #7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857 #8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873 #9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76 #10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39 #11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1 #12 0xf3b48f0d in ?? () from /lib/libselinux.so.1 #13 0xf3b37158 in _init () from /lib/libselinux.so.1 #14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70 #15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134 #16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2 ...which is happening even before entering main() If I break in pthread_mutex_lock, and look at the contents of the mutex, I see: $61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}}, __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} ...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc(): static inline void * imalloc(size_t size) { assert(size != 0); if (size <= arena_maxclass) return (arena_malloc(choose_arena(), size, false)); else return (huge_malloc(size, false)); Note that in this case, size < arena_maxclass. choose_arena() is an inline that returns arenas_map, which is a pointer stored in TLS to a struct arena_t, and was previously initialized with the contents of arenas[0] (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see: (gdb) print arenas[0] $3 = (arena_t *) 0xf394e040 (gdb) print &arenas[0] $4 = (arena_t **) 0xf394e000 (gdb) print arenas[0]->lock $5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000", __align = 0} (gdb) print arenas_map $6 = (arena_t *) 0xf394e040 (gdb) print &arenas_map $7 = (arena_t **) 0xf3a4faac The mutex looks correctly initialized here, and is unlocked. If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely: (gdb) print arena $6 = (arena_t *) 0xf3a4fab0 ...this should be the same as arenas_map, which is what the C code suggests was returned by choose_arena and passed to arena_malloc Also: (gdb) print arena->lock $8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}}, __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer; (gdb) info registers eax 0xf3a4fab0 -207291728 ecx 0x3 3 edx 0x0 0 ebx 0xf7ffeebc -134222148 esp 0xffffd3d4 0xffffd3d4 ebp 0xffffd42c 0xffffd42c esi 0x160 352 edi 0xf3a4fab0 -207291728 eip 0xf7ff673f 0xf7ff673f <malloc+159> eflags 0x282 [ SF IF ] cs 0x23 35 ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x63 99 Here is the interesting part of malloc disassembled: 0xf7ff66a0 <+0>: push %ebp 0xf7ff66a1 <+1>: mov %esp,%ebp 0xf7ff66a3 <+3>: sub $0x58,%esp 0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp) 0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx> 0xf7ff66ae <+14>: add $0x880e,%ebx 0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp) 0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi 0xf7ff66ba <+26>: mov %edi,-0x4(%ebp) 0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx) 0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88> 0xf7ff66c6 <+38>: test %esi,%esi 0xf7ff66c8 <+40>: mov $0x1,%eax 0xf7ff66cd <+45>: cmove %eax,%esi 0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi 0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128> 0xf7ff66d8 <+56>: mov %esi,%eax 0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc> 0xf7ff66df <+63>: mov %eax,%esi 0xf7ff66e1 <+65>: test %esi,%esi 0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97> 0xf7ff66e5 <+69>: mov %esi,%eax 0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx 0xf7ff66ea <+74>: mov -0x8(%ebp),%esi 0xf7ff66ed <+77>: mov -0x4(%ebp),%edi 0xf7ff66f0 <+80>: mov %ebp,%esp 0xf7ff66f2 <+82>: pop %ebp 0xf7ff66f3 <+83>: ret 0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi 0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard> 0xf7ff66fd <+93>: test %al,%al 0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38> 0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt> 0xf7ff6706 <+102>: xor %esi,%esi 0xf7ff6708 <+104>: movl $0xc,(%eax) 0xf7ff670e <+110>: mov %esi,%eax 0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx 0xf7ff6713 <+115>: mov -0x8(%ebp),%esi 0xf7ff6716 <+118>: mov -0x4(%ebp),%edi 0xf7ff6719 <+121>: mov %ebp,%esp 0xf7ff671b <+123>: pop %ebp 0xf7ff671c <+124>: ret 0xf7ff671d <+125>: lea 0x0(%esi),%esi 0xf7ff6720 <+128>: mov %gs:0x0,%eax 0xf7ff6726 <+134>: nop 0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi 0xf7ff672b <+139>: lea 0x0,%edx 0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp) 0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi 0xf7ff6737 <+151>: test %edi,%edi 0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704> 0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi 0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512> 0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi 0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440> 0xf7ff6757 <+183>: sub $0x1,%esi 0xf7ff675a <+186>: mov %esi,%edx 0xf7ff675c <+188>: shr %edx 0xf7ff675e <+190>: or %esi,%edx 0xf7ff6760 <+192>: mov %edx,%eax 0xf7ff6762 <+194>: shr $0x2,%eax 0xf7ff6765 <+197>: or %edx,%eax 0xf7ff6767 <+199>: mov %eax,%edx 0xf7ff6769 <+201>: shr $0x4,%edx 0xf7ff676c <+204>: or %eax,%edx 0xf7ff676e <+206>: mov %edx,%eax 0xf7ff6770 <+208>: shr $0x8,%eax 0xf7ff6773 <+211>: or %edx,%eax 0xf7ff6775 <+213>: mov $0xffffffff,%edx 0xf7ff677a <+218>: mov %eax,%esi 0xf7ff677c <+220>: shr $0x10,%esi 0xf7ff677f <+223>: or %eax,%esi 0xf7ff6781 <+225>: add $0x1,%esi 0xf7ff6784 <+228>: mov %esi,%eax 0xf7ff6786 <+230>: shr $0x2,%eax 0xf7ff6789 <+233>: bsf %eax,%eax 0xf7ff678c <+236>: cmove %edx,%eax 0xf7ff678f <+239>: cmp $0x1,%esi 0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax 0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax 0xf7ff679d <+253>: mov %eax,-0x1c(%ebp) 0xf7ff67a0 <+256>: mov $0x2,%eax 0xf7ff67a5 <+261>: cmova %esi,%eax 0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp) 0xf7ff67ab <+267>: mov %edi,(%esp) 0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt> So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t). From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock: 0xf7ff6720 <+128>: mov %gs:0x0,%eax 0xf7ff672b <+139>: lea 0x0,%edx 0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi I would have thought that this sequence would be loading the contents of 0xf3a4faac in to the edi register, which is where the valid pointer is currently stored: (gdb) print *0xf3a4faac $10 = −208347070 ...which is 0xf394e040 I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4
-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/663294 Title: Firefox built with gcc-4.5 is a non-starter on i386 with -pie -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs