Re: dc hitting a compiler bug, or undefined behavior

2014-04-02 Thread Rich Felker
On Mon, Mar 31, 2014 at 02:17:33PM +0200, Denys Vlasenko wrote:
 On Sun, Mar 30, 2014 at 11:18 AM, Lauri Kasanen cur...@operamail.com wrote:
  Hi,
 
  I'm seeing busybox dc acting funny when compiled with some versions of
  gcc. This is on busybox git. The x86 binary busybox_unstripped and
  config are attached.
 
  gcc 4.2.2 - ok
  gcc 4.7.2:
  nc 10 1 add p
  2.738e+93
 
  So either bb is hitting a compiler bug,
 
 Looks like a compiler bug:
 
 d = strtod(argument, end);
 if (end != argument  *end == '\0') {
 push(d);
 
 is compiled to this:
 
0x08049490 +27:call   0x80488b0 strtod@plt
0x08049495 +32:mov-0x14(%ebp),%eax
0x08049498 +35:pop%ecx
0x08049499 +36:pop%ebx
0x0804949a +37:cmp%esi,%eax
0x0804949c +39:je 0x80494b1 stack_machine+60
0x0804949e +41:cmpb   $0x0,(%eax)
0x080494a1 +44:jne0x80494b5 stack_machine+64
0x080494a3 +46:push   %eax
0x080494a4 +47:push   %eax
0x080494a5 +48:fstpl  (%esp)   HERE
0x080494a8 +51:call   0x8048f10 push
 
 Note how push(double a) argument gets passed on stack.
 But push() starts with this:
 
 Dump of assembler code for function push:
 = 0x08048f10 +0:fldl   (%edi)
 
 which reads argument from some bogus location instead
 of stack:
 
 (gdb) nexti  === EXECUTE fldl
 0x08048f12 in push ()
 (gdb) p $st0
 $1 = 2.0554264135011661055418432229752339e-314
 
 That's not 10!
 10 exists on stack all right:
 
 (gdb) p *(double*)($esp+4)
 $5 = 10
 
 the program just doesn't fetch it correctly.

It really looks to me like some non-default optimization is going on
and misbehaving. I've never seen gcc emit this kind of 'bare' function
without prologue/epilogue (or in this case, the ugly prologue moved
inside the conditional just before the function call). Is the busybox
build system doing anything behind the scenes (or perhaps with #pragma
or hidden attribte tags?) to get gcc to adjust the calling convention
for some functions?

Rich
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: dc hitting a compiler bug, or undefined behavior

2014-03-31 Thread Lauri Kasanen
On Mon, Mar 31, 2014, at 0:37, Ralf Friedl wrote:
  What's even worse is that adding any output to push(), even a puts(hi)
  that does not print the argument or any of the stack vars, fixes it. So
  something magic is going on inside the GCC optimization, I'm afraid this
  is above my pay grade.
  Could you send the file miscutils/dc.o that is created with and without
  this puts(hi) in push()?
  Attached.
 Are you using some special compiler options, especially regarding 
 parameter passing in registers and stack alignment?

None, I'm afraid. CFLAGS etc are all unset. See later on for the gcc
defaults too.

Thanks for the research. I only speak x86 asm on a yep, that's asm
level.

 The instruction at 12 loads the double from address %edi after %edi has 
 been set to point to the parameter area. The instruction at 0 in the 
 failed case is exactly the same, except that %edi has not been setup 
 before. So I would consider this a compiler bug.

Would it be possible for you to submit a gcc bug? I can try to do so
too, but it'll take a (long) while before I have time to build latest
gcc to test.
On bb side, I wonder if this gcc version should be blacklisted, or if
some workaround can be found.

Here's the -S -fverbose-asm output of an empty C file. I see nothing
about alignment, regparms, or otherwise relevant default options.

# GNU C (GCC) version 4.7.2 (i486-pc-linux-gnu)
#   compiled by GNU C version 4.7.2, GMP version 5.1.1, MPFR version
3.1.1, MPC version 1.0.1
# GGC heuristics: --param ggc-min-expand=38 --param
ggc-min-heapsize=15472
# options passed: 
# -iprefix
/tmp/tcloop/gcc/usr/local/bin/../lib/gcc/i486-pc-linux-gnu/4.7.2/
# test.c -mtune=i486 -march=i486 -fverbose-asm
# options enabled:  -fauto-inc-dec -fbranch-count-reg -fcommon
# -fdebug-types-section -fdelete-null-pointer-checks -fdwarf2-cfi-asm
# -fearly-inlining -feliminate-unused-debug-types -ffunction-cse
-fgcse-lm
# -fgnu-runtime -fident -finline-atomics -fira-share-save-slots
# -fira-share-spill-slots -fivopts -fkeep-static-consts
# -fleading-underscore -fmath-errno -fmerge-debug-strings
# -fmove-loop-invariants -fpcc-struct-return -fpeephole
# -fprefetch-loop-arrays -fsched-critical-path-heuristic
# -fsched-dep-count-heuristic -fsched-group-heuristic -fsched-interblock
# -fsched-last-insn-heuristic -fsched-rank-heuristic -fsched-spec
# -fsched-spec-insn-heuristic -fsched-stalled-insns-dep -fshow-column
# -fsigned-zeros -fsplit-ivs-in-unroller -fstrict-volatile-bitfields
# -ftrapping-math -ftree-cselim -ftree-forwprop -ftree-loop-if-convert
# -ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize
# -ftree-parallelize-loops= -ftree-phiprop -ftree-pta -ftree-reassoc
# -ftree-scev-cprop -ftree-slp-vectorize -ftree-vect-loop-version
# -funit-at-a-time -fvect-cost-model -fverbose-asm
# -fzero-initialized-in-bss -m32 -m80387 -m96bit-long-double
# -malign-stringops -mfancy-math-387 -mfp-ret-in-387 -mglibc -mieee-fp
# -mno-red-zone -mno-sse4 -mpush-args -msahf -mtls-direct-seg-refs

- Lauri

-- 
http://www.fastmail.fm - Faster than the air-speed velocity of an
  unladen european swallow

___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: dc hitting a compiler bug, or undefined behavior

2014-03-31 Thread Ralf Friedl

Lauri Kasanen wrote:

On Mon, Mar 31, 2014, at 0:37, Ralf Friedl wrote:

Are you using some special compiler options, especially regarding
parameter passing in registers and stack alignment?

None, I'm afraid. CFLAGS etc are all unset. See later on for the gcc
defaults too.

Thanks for the research. I only speak x86 asm on a yep, that's asm
level.


The instruction at 12 loads the double from address %edi after %edi has
been set to point to the parameter area. The instruction at 0 in the
failed case is exactly the same, except that %edi has not been setup
before. So I would consider this a compiler bug.

Would it be possible for you to submit a gcc bug? I can try to do so
too, but it'll take a (long) while before I have time to build latest
gcc to test.
And if they ask what to do to reproduce the bug then I answer that I 
don't know. Besides I don't know whether 4.7.1 is still in development. 
They might ask whether an update to 4.7.3 or 4.8.2 solves the problem.

___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: dc hitting a compiler bug, or undefined behavior

2014-03-31 Thread Denys Vlasenko
On Sun, Mar 30, 2014 at 11:18 AM, Lauri Kasanen cur...@operamail.com wrote:
 Hi,

 I'm seeing busybox dc acting funny when compiled with some versions of
 gcc. This is on busybox git. The x86 binary busybox_unstripped and
 config are attached.

 gcc 4.2.2 - ok
 gcc 4.7.2:
 nc 10 1 add p
 2.738e+93

 So either bb is hitting a compiler bug,

Looks like a compiler bug:

d = strtod(argument, end);
if (end != argument  *end == '\0') {
push(d);

is compiled to this:

   0x08049490 +27:call   0x80488b0 strtod@plt
   0x08049495 +32:mov-0x14(%ebp),%eax
   0x08049498 +35:pop%ecx
   0x08049499 +36:pop%ebx
   0x0804949a +37:cmp%esi,%eax
   0x0804949c +39:je 0x80494b1 stack_machine+60
   0x0804949e +41:cmpb   $0x0,(%eax)
   0x080494a1 +44:jne0x80494b5 stack_machine+64
   0x080494a3 +46:push   %eax
   0x080494a4 +47:push   %eax
   0x080494a5 +48:fstpl  (%esp)   HERE
   0x080494a8 +51:call   0x8048f10 push

Note how push(double a) argument gets passed on stack.
But push() starts with this:

Dump of assembler code for function push:
= 0x08048f10 +0:fldl   (%edi)

which reads argument from some bogus location instead
of stack:

(gdb) nexti  === EXECUTE fldl
0x08048f12 in push ()
(gdb) p $st0
$1 = 2.0554264135011661055418432229752339e-314

That's not 10!
10 exists on stack all right:

(gdb) p *(double*)($esp+4)
$5 = 10

the program just doesn't fetch it correctly.
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: dc hitting a compiler bug, or undefined behavior

2014-03-30 Thread Ralf Friedl

Lauri Kasanen wrote:

I'm seeing busybox dc acting funny when compiled with some versions of
gcc. This is on busybox git. The x86 binary busybox_unstripped and
config are attached.

gcc 4.2.2 - ok
gcc 4.7.2:
nc 10 1 add p
2.738e+93

So either bb is hitting a compiler bug, or undefined behavior somewhere
with new gcc's more aggressive optimizations.
I have 4.7.2 on x86 and I get 11 as output. You could add debug output 
to push and pop to see what happens.


Whether the output 11 is to be expected is another question:
$ dc --version
dc (GNU bc 1.06.95) 1.3.95
$ dc --help
Usage: dc [OPTION] [file ...]
  -e, --expression=EXPRevaluate expression
  -f, --file=FILE  evaluate contents of file
  -h, --help   display this help and exit
  -V, --versionoutput version information and exit
$ dc 10 1 add p
dc: Could not open file 10
dc: Could not open file 1
dc: Could not open file add
dc: Could not open file p
$ dc -e '10 1 add p'

$ dc -e '10 1 + p'
11

___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: dc hitting a compiler bug, or undefined behavior

2014-03-30 Thread Lauri Kasanen
On Sun, Mar 30, 2014, at 15:05, Ralf Friedl wrote:
  I'm seeing busybox dc acting funny when compiled with some versions of
  gcc. This is on busybox git. The x86 binary busybox_unstripped and
  config are attached.
 
  gcc 4.2.2 - ok
  gcc 4.7.2:
  nc 10 1 add p
  2.738e+93
 
  So either bb is hitting a compiler bug, or undefined behavior somewhere
  with new gcc's more aggressive optimizations.

 I have 4.7.2 on x86 and I get 11 as output. You could add debug output 
 to push and pop to see what happens.

I think it's related to 
http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
which hit the radeon driver as well
http://lists.freedesktop.org/archives/dri-devel/2013-August/044122.html

but changing the stack[1] to stack[0], stack[], or even stack[10] did
not fix the issue.

What's even worse is that adding any output to push(), even a puts(hi)
that does not print the argument or any of the stack vars, fixes it. So
something magic is going on inside the GCC optimization, I'm afraid this
is above my pay grade.

FWIW printf calls in pop() or any other functions did not affect the
calculation. push() is being given the correct argument after strtod.

- Lauri

-- 
http://www.fastmail.fm - The way an email service should be

___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: dc hitting a compiler bug, or undefined behavior

2014-03-30 Thread Ralf Friedl

Lauri Kasanen schrieb:

On Sun, Mar 30, 2014, at 15:05, Ralf Friedl wrote:

I'm seeing busybox dc acting funny when compiled with some versions of
gcc. This is on busybox git. The x86 binary busybox_unstripped and
config are attached.

gcc 4.2.2 - ok
gcc 4.7.2:
nc 10 1 add p
2.738e+93

So either bb is hitting a compiler bug, or undefined behavior somewhere
with new gcc's more aggressive optimizations.

I have 4.7.2 on x86 and I get 11 as output. You could add debug output
to push and pop to see what happens.

I think it's related to
http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
which hit the radeon driver as well
http://lists.freedesktop.org/archives/dri-devel/2013-August/044122.html

but changing the stack[1] to stack[0], stack[], or even stack[10] did
not fix the issue.

If the change didn't fix it, then maybe it's not related to that.


What's even worse is that adding any output to push(), even a puts(hi)
that does not print the argument or any of the stack vars, fixes it. So
something magic is going on inside the GCC optimization, I'm afraid this
is above my pay grade.
Could you send the file miscutils/dc.o that is created with and without 
this puts(hi) in push()?

___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: dc hitting a compiler bug, or undefined behavior

2014-03-30 Thread Lauri Kasanen
On Sun, Mar 30, 2014, at 18:26, Ralf Friedl wrote:
  What's even worse is that adding any output to push(), even a puts(hi)
  that does not print the argument or any of the stack vars, fixes it. So
  something magic is going on inside the GCC optimization, I'm afraid this
  is above my pay grade.

 Could you send the file miscutils/dc.o that is created with and without 
 this puts(hi) in push()?

Attached.

- Lauri

-- 
http://www.fastmail.fm - Email service worth paying for. Try it for free



fail-dc.o.gz
Description: GNU Zip compressed data


success-dc.o.gz
Description: GNU Zip compressed data
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox

Re: dc hitting a compiler bug, or undefined behavior

2014-03-30 Thread Ralf Friedl

Lauri Kasanen wrote:

On Sun, Mar 30, 2014, at 18:26, Ralf Friedl wrote:

What's even worse is that adding any output to push(), even a puts(hi)
that does not print the argument or any of the stack vars, fixes it. So
something magic is going on inside the GCC optimization, I'm afraid this
is above my pay grade.

Could you send the file miscutils/dc.o that is created with and without
this puts(hi) in push()?

Attached.
Are you using some special compiler options, especially regarding 
parameter passing in registers and stack alignment?


The relevant part of fail-dc.o is this:
 push:
   0:   dd 07   fldl   (%edi)
The function expects the value to push at the address pointed to by 
%edi. But the functions that call push pass the value at the top of the 
CPU stack (not to be confused with the stack dc implements).


The relevant part ofsuccess-dc.o is this:
 push:
   0:   57  push   %edi
   1:   8d 7c 24 08 lea0x8(%esp),%edi
   5:   83 e4 f0and$0xfff0,%esp
   8:   ff 77 fcpushl  -0x4(%edi)
   b:   55  push   %ebp
   c:   89 e5   mov%esp,%ebp
   e:   57  push   %edi
   f:   83 ec 14sub$0x14,%esp
  12:   dd 07   fldl   (%edi)
These lines set up an aligned stack.
0: save %edi
1: put address of top of stack at the time the function was called in 
%edi. This is the address of the parameter.

5: align stack to 0x10 boundary
8: push return address of the function
b, c: normal frame setup
e: save %edi for later use
f: make space for a double and align to 0x10
12: load parameter, %edi still points to the address of the parameter.

The instruction at 12 loads the double from address %edi after %edi has 
been set to point to the parameter area. The instruction at 0 in the 
failed case is exactly the same, except that %edi has not been setup 
before. So I would consider this a compiler bug.


I wrote that the instruction at f makes space for an aligned double. 
This is itself is strange because later on the double that is loaded 
from %edi is saved on the CPU stack and later loaded from the CPU stack 
and saved in the dc stack, which is unnecessary. Also the double is 
always loaded to the FPU stack and then removed if bb_error_msg_and_die 
is called, instead of loading it only after it is clear that it will be 
used. So there is also opportunity for further optimization of the compiler.


This stack alignment makes your code bigger, and the additional 
instructions also have to be executed, which also takes time. I'm not 
sure whether the aligned stack saves enough time to offset this.

___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox