Re: dc hitting a compiler bug, or undefined behavior
On Mon, Mar 31, 2014 at 02:17:33PM +0200, Denys Vlasenko wrote: > On Sun, Mar 30, 2014 at 11:18 AM, Lauri Kasanen wrote: > > Hi, > > > > I'm seeing busybox dc acting funny when compiled with some versions of > > gcc. This is on busybox git. The x86 binary busybox_unstripped and > > config are attached. > > > > gcc 4.2.2 - ok > > gcc 4.7.2: > > nc 10 1 add p > > 2.738e+93 > > > > So either bb is hitting a compiler bug, > > Looks like a compiler bug: > > d = strtod(argument, &end); > if (end != argument && *end == '\0') { > push(d); > > is compiled to this: > >0x08049490 <+27>:call 0x80488b0 >0x08049495 <+32>:mov-0x14(%ebp),%eax >0x08049498 <+35>:pop%ecx >0x08049499 <+36>:pop%ebx >0x0804949a <+37>:cmp%esi,%eax >0x0804949c <+39>:je 0x80494b1 >0x0804949e <+41>:cmpb $0x0,(%eax) >0x080494a1 <+44>:jne0x80494b5 >0x080494a3 <+46>:push %eax >0x080494a4 <+47>:push %eax >0x080494a5 <+48>:fstpl (%esp) < HERE >0x080494a8 <+51>:call 0x8048f10 > > Note how push(double a) argument gets passed on stack. > But push() starts with this: > > Dump of assembler code for function push: > => 0x08048f10 <+0>:fldl (%edi) > > which reads argument from some bogus location instead > of stack: > > (gdb) nexti <=== EXECUTE fldl > 0x08048f12 in push () > (gdb) p $st0 > $1 = 2.0554264135011661055418432229752339e-314 > > That's not 10! > 10 exists on stack all right: > > (gdb) p *(double*)($esp+4) > $5 = 10 > > the program just doesn't fetch it correctly. It really looks to me like some non-default optimization is going on and misbehaving. I've never seen gcc emit this kind of 'bare' function without prologue/epilogue (or in this case, the ugly prologue moved inside the conditional just before the function call). Is the busybox build system doing anything behind the scenes (or perhaps with #pragma or hidden attribte tags?) to get gcc to adjust the calling convention for some functions? Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: dc hitting a compiler bug, or undefined behavior
On Sun, Mar 30, 2014 at 11:18 AM, Lauri Kasanen wrote: > Hi, > > I'm seeing busybox dc acting funny when compiled with some versions of > gcc. This is on busybox git. The x86 binary busybox_unstripped and > config are attached. > > gcc 4.2.2 - ok > gcc 4.7.2: > nc 10 1 add p > 2.738e+93 > > So either bb is hitting a compiler bug, Looks like a compiler bug: d = strtod(argument, &end); if (end != argument && *end == '\0') { push(d); is compiled to this: 0x08049490 <+27>:call 0x80488b0 0x08049495 <+32>:mov-0x14(%ebp),%eax 0x08049498 <+35>:pop%ecx 0x08049499 <+36>:pop%ebx 0x0804949a <+37>:cmp%esi,%eax 0x0804949c <+39>:je 0x80494b1 0x0804949e <+41>:cmpb $0x0,(%eax) 0x080494a1 <+44>:jne0x80494b5 0x080494a3 <+46>:push %eax 0x080494a4 <+47>:push %eax 0x080494a5 <+48>:fstpl (%esp) < HERE 0x080494a8 <+51>:call 0x8048f10 Note how push(double a) argument gets passed on stack. But push() starts with this: Dump of assembler code for function push: => 0x08048f10 <+0>:fldl (%edi) which reads argument from some bogus location instead of stack: (gdb) nexti <=== EXECUTE fldl 0x08048f12 in push () (gdb) p $st0 $1 = 2.0554264135011661055418432229752339e-314 That's not 10! 10 exists on stack all right: (gdb) p *(double*)($esp+4) $5 = 10 the program just doesn't fetch it correctly. ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: dc hitting a compiler bug, or undefined behavior
Lauri Kasanen wrote: On Mon, Mar 31, 2014, at 0:37, Ralf Friedl wrote: Are you using some special compiler options, especially regarding parameter passing in registers and stack alignment? None, I'm afraid. CFLAGS etc are all unset. See later on for the gcc defaults too. Thanks for the research. I only speak x86 asm on a "yep, that's asm" level. The instruction at 12 loads the double from address %edi after %edi has been set to point to the parameter area. The instruction at 0 in the failed case is exactly the same, except that %edi has not been setup before. So I would consider this a compiler bug. Would it be possible for you to submit a gcc bug? I can try to do so too, but it'll take a (long) while before I have time to build latest gcc to test. And if they ask what to do to reproduce the bug then I answer that I don't know. Besides I don't know whether 4.7.1 is still in development. They might ask whether an update to 4.7.3 or 4.8.2 solves the problem. ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: dc hitting a compiler bug, or undefined behavior
On Mon, Mar 31, 2014, at 0:37, Ralf Friedl wrote: > >>> What's even worse is that adding any output to push(), even a puts("hi") > >>> that does not print the argument or any of the stack vars, fixes it. So > >>> something magic is going on inside the GCC optimization, I'm afraid this > >>> is above my pay grade. > >> Could you send the file miscutils/dc.o that is created with and without > >> this puts("hi") in push()? > > Attached. > Are you using some special compiler options, especially regarding > parameter passing in registers and stack alignment? None, I'm afraid. CFLAGS etc are all unset. See later on for the gcc defaults too. Thanks for the research. I only speak x86 asm on a "yep, that's asm" level. > The instruction at 12 loads the double from address %edi after %edi has > been set to point to the parameter area. The instruction at 0 in the > failed case is exactly the same, except that %edi has not been setup > before. So I would consider this a compiler bug. Would it be possible for you to submit a gcc bug? I can try to do so too, but it'll take a (long) while before I have time to build latest gcc to test. On bb side, I wonder if this gcc version should be blacklisted, or if some workaround can be found. Here's the -S -fverbose-asm output of an empty C file. I see nothing about alignment, regparms, or otherwise relevant default options. # GNU C (GCC) version 4.7.2 (i486-pc-linux-gnu) # compiled by GNU C version 4.7.2, GMP version 5.1.1, MPFR version 3.1.1, MPC version 1.0.1 # GGC heuristics: --param ggc-min-expand=38 --param ggc-min-heapsize=15472 # options passed: # -iprefix /tmp/tcloop/gcc/usr/local/bin/../lib/gcc/i486-pc-linux-gnu/4.7.2/ # test.c -mtune=i486 -march=i486 -fverbose-asm # options enabled: -fauto-inc-dec -fbranch-count-reg -fcommon # -fdebug-types-section -fdelete-null-pointer-checks -fdwarf2-cfi-asm # -fearly-inlining -feliminate-unused-debug-types -ffunction-cse -fgcse-lm # -fgnu-runtime -fident -finline-atomics -fira-share-save-slots # -fira-share-spill-slots -fivopts -fkeep-static-consts # -fleading-underscore -fmath-errno -fmerge-debug-strings # -fmove-loop-invariants -fpcc-struct-return -fpeephole # -fprefetch-loop-arrays -fsched-critical-path-heuristic # -fsched-dep-count-heuristic -fsched-group-heuristic -fsched-interblock # -fsched-last-insn-heuristic -fsched-rank-heuristic -fsched-spec # -fsched-spec-insn-heuristic -fsched-stalled-insns-dep -fshow-column # -fsigned-zeros -fsplit-ivs-in-unroller -fstrict-volatile-bitfields # -ftrapping-math -ftree-cselim -ftree-forwprop -ftree-loop-if-convert # -ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize # -ftree-parallelize-loops= -ftree-phiprop -ftree-pta -ftree-reassoc # -ftree-scev-cprop -ftree-slp-vectorize -ftree-vect-loop-version # -funit-at-a-time -fvect-cost-model -fverbose-asm # -fzero-initialized-in-bss -m32 -m80387 -m96bit-long-double # -malign-stringops -mfancy-math-387 -mfp-ret-in-387 -mglibc -mieee-fp # -mno-red-zone -mno-sse4 -mpush-args -msahf -mtls-direct-seg-refs - Lauri -- http://www.fastmail.fm - Faster than the air-speed velocity of an unladen european swallow ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: dc hitting a compiler bug, or undefined behavior
Lauri Kasanen wrote: On Sun, Mar 30, 2014, at 18:26, Ralf Friedl wrote: What's even worse is that adding any output to push(), even a puts("hi") that does not print the argument or any of the stack vars, fixes it. So something magic is going on inside the GCC optimization, I'm afraid this is above my pay grade. Could you send the file miscutils/dc.o that is created with and without this puts("hi") in push()? Attached. Are you using some special compiler options, especially regarding parameter passing in registers and stack alignment? The relevant part of fail-dc.o is this: : 0: dd 07 fldl (%edi) The function expects the value to push at the address pointed to by %edi. But the functions that call push pass the value at the top of the CPU stack (not to be confused with the stack dc implements). The relevant part ofsuccess-dc.o is this: : 0: 57 push %edi 1: 8d 7c 24 08 lea0x8(%esp),%edi 5: 83 e4 f0and$0xfff0,%esp 8: ff 77 fcpushl -0x4(%edi) b: 55 push %ebp c: 89 e5 mov%esp,%ebp e: 57 push %edi f: 83 ec 14sub$0x14,%esp 12: dd 07 fldl (%edi) These lines set up an aligned stack. 0: save %edi 1: put address of top of stack at the time the function was called in %edi. This is the address of the parameter. 5: align stack to 0x10 boundary 8: push return address of the function b, c: normal frame setup e: save %edi for later use f: make space for a double and align to 0x10 12: load parameter, %edi still points to the address of the parameter. The instruction at 12 loads the double from address %edi after %edi has been set to point to the parameter area. The instruction at 0 in the failed case is exactly the same, except that %edi has not been setup before. So I would consider this a compiler bug. I wrote that the instruction at f makes space for an aligned double. This is itself is strange because later on the double that is loaded from %edi is saved on the CPU stack and later loaded from the CPU stack and saved in the dc stack, which is unnecessary. Also the double is always loaded to the FPU stack and then removed if bb_error_msg_and_die is called, instead of loading it only after it is clear that it will be used. So there is also opportunity for further optimization of the compiler. This stack alignment makes your code bigger, and the additional instructions also have to be executed, which also takes time. I'm not sure whether the aligned stack saves enough time to offset this. ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: dc hitting a compiler bug, or undefined behavior
On Sun, Mar 30, 2014, at 18:26, Ralf Friedl wrote: > > What's even worse is that adding any output to push(), even a puts("hi") > > that does not print the argument or any of the stack vars, fixes it. So > > something magic is going on inside the GCC optimization, I'm afraid this > > is above my pay grade. > > Could you send the file miscutils/dc.o that is created with and without > this puts("hi") in push()? Attached. - Lauri -- http://www.fastmail.fm - Email service worth paying for. Try it for free fail-dc.o.gz Description: GNU Zip compressed data success-dc.o.gz Description: GNU Zip compressed data ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: dc hitting a compiler bug, or undefined behavior
Lauri Kasanen schrieb: On Sun, Mar 30, 2014, at 15:05, Ralf Friedl wrote: I'm seeing busybox dc acting funny when compiled with some versions of gcc. This is on busybox git. The x86 binary busybox_unstripped and config are attached. gcc 4.2.2 - ok gcc 4.7.2: nc 10 1 add p 2.738e+93 So either bb is hitting a compiler bug, or undefined behavior somewhere with new gcc's more aggressive optimizations. I have 4.7.2 on x86 and I get 11 as output. You could add debug output to push and pop to see what happens. I think it's related to http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html which hit the radeon driver as well http://lists.freedesktop.org/archives/dri-devel/2013-August/044122.html but changing the stack[1] to stack[0], stack[], or even stack[10] did not fix the issue. If the change didn't fix it, then maybe it's not related to that. What's even worse is that adding any output to push(), even a puts("hi") that does not print the argument or any of the stack vars, fixes it. So something magic is going on inside the GCC optimization, I'm afraid this is above my pay grade. Could you send the file miscutils/dc.o that is created with and without this puts("hi") in push()? ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: dc hitting a compiler bug, or undefined behavior
On Sun, Mar 30, 2014, at 15:05, Ralf Friedl wrote: > > I'm seeing busybox dc acting funny when compiled with some versions of > > gcc. This is on busybox git. The x86 binary busybox_unstripped and > > config are attached. > > > > gcc 4.2.2 - ok > > gcc 4.7.2: > > nc 10 1 add p > > 2.738e+93 > > > > So either bb is hitting a compiler bug, or undefined behavior somewhere > > with new gcc's more aggressive optimizations. > > I have 4.7.2 on x86 and I get 11 as output. You could add debug output > to push and pop to see what happens. I think it's related to http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html which hit the radeon driver as well http://lists.freedesktop.org/archives/dri-devel/2013-August/044122.html but changing the stack[1] to stack[0], stack[], or even stack[10] did not fix the issue. What's even worse is that adding any output to push(), even a puts("hi") that does not print the argument or any of the stack vars, fixes it. So something magic is going on inside the GCC optimization, I'm afraid this is above my pay grade. FWIW printf calls in pop() or any other functions did not affect the calculation. push() is being given the correct argument after strtod. - Lauri -- http://www.fastmail.fm - The way an email service should be ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: dc hitting a compiler bug, or undefined behavior
Lauri Kasanen wrote: I'm seeing busybox dc acting funny when compiled with some versions of gcc. This is on busybox git. The x86 binary busybox_unstripped and config are attached. gcc 4.2.2 - ok gcc 4.7.2: nc 10 1 add p 2.738e+93 So either bb is hitting a compiler bug, or undefined behavior somewhere with new gcc's more aggressive optimizations. I have 4.7.2 on x86 and I get 11 as output. You could add debug output to push and pop to see what happens. Whether the output 11 is to be expected is another question: $ dc --version dc (GNU bc 1.06.95) 1.3.95 $ dc --help Usage: dc [OPTION] [file ...] -e, --expression=EXPRevaluate expression -f, --file=FILE evaluate contents of file -h, --help display this help and exit -V, --versionoutput version information and exit $ dc 10 1 add p dc: Could not open file 10 dc: Could not open file 1 dc: Could not open file add dc: Could not open file p $ dc -e '10 1 add p' $ dc -e '10 1 + p' 11 ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
dc hitting a compiler bug, or undefined behavior
Hi, I'm seeing busybox dc acting funny when compiled with some versions of gcc. This is on busybox git. The x86 binary busybox_unstripped and config are attached. gcc 4.2.2 - ok gcc 4.7.2: nc 10 1 add p 2.738e+93 So either bb is hitting a compiler bug, or undefined behavior somewhere with new gcc's more aggressive optimizations. - Lauri -- http://www.fastmail.fm - Email service worth paying for. Try it for free busybox_unstripped.gz Description: GNU Zip compressed data bb-config.gz Description: GNU Zip compressed data ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox