On Sat, Jan 05, 2013 at 03:36:49PM +0100, Thomas Preud'homme wrote: > Le jeudi 13 décembre 2012 17:55:41, vous avez écrit : > > [SNIP] > > > > > from arch/arm/kernel/stacktrace.c: > > > > * With framepointer enabled, a simple function prologue looks like > > this: * mov ip, sp > > * stmdb sp!, {fp, ip, lr, pc} > > * sub fp, ip, #4 > > * > > * A simple function epilogue looks like this: > > * ldm sp, {fp, sp, pc} > > > > so since fp is initialized from ip-4 (old_sp-4) it points to (near ?) > > fp, not pc, right? > > stmdb means sp is decremented first (sp = sp - 4 * nbregs) and then register > are saved in ascending order from this new location. So, pc will end up at > initial_sp - 4 :) Hence ip points at pc. > > > > > also linux kernel is compiled with > > > > -mapcs -mno-sched-prolog -mabi=apcs-gnu > > > > and this changes frame layout compared to what gcc produces by default. > > > > (sorry for that near - i don't remember stack and stmdb arm details > > good enough) > > stmdb stands for STore Multiple Decrement Before > > > > > > > 8: e92d0870 push {r4, r5, r6, fp} ; NOTE > > > > r4... go before fp > > > > > > > > 14: e28db00c add fp, sp, #12 > > > > > > > > 5c: e59b0000 ldr r0, [fp] ; BUG here! > > > > > > Why is this a bug in GCC? It will load the old value of fp to r0. > > > > Hmm, after push here is how stack looks like: > > > > r4 <- old_sp > > r5 <- sp + 12 > > r6 > > fp > > <- sp > > > > (again, off-by-one error probably, but anywayy that sp+12 can't point to > > old fp). > > Nope. push store register in ascending order from the lowest address to the > highest address. push {regs} is actually the same as stmdb sp! {regs}. So, > after that initial push, you'd have: > > x <- old_sp > fp <- sp + 12 > r6 > r5 > r4 <- new_sp > > so in fp is stored sp+12 that is the address where is stored the old fp :) > > Hence the ldr does load the address of the previous frame in the return > register (r0). > > > > > ~~~~ > > > > Anyway, I've prepared more simple complete example to demonstrate the > > bug: > > > > ---- 8< ---- fp.c > > #include <stdio.h> > > > > void *func(int x) > > { > > char *fp1 = __builtin_frame_address(1); > > > > printf("Hello World! %i\n", x); > > > > fp1 += x; > > return fp1; > > } > > > > int main() > > { > > void *fp0 = __builtin_frame_address(0); > > void *fp1 = func(0); > > > > printf("fp0: %p fp1: %p\n", fp0, fp1); > > return 0; > > } > > > > The code should output > > > > Hello World! 0 > > fp0: <addr> fp1: <the-same-addr!> > > > > right? on i386: > > > > $ gcc -o fp_i386 -O2 fp.c > > $ ./fp_i386 > > Hello World! 0 > > fp0: 0xafeda538 fp1: 0xafeda538 > > > > > > but it does not on arm, look: > > > > $ arm-linux-gnueabi-gcc-4.4 -marm -o fp_arm -O2 fp.c > > $ qemu-arm -L /usr/arm-linux-gnueabi ./fp_arm > > Hello World! 0 > > fp0: 0x4080010c fp1: 0x8414 > > > > and the problem here is that gcc, as I think, incorrectly setups fp: > > > > $ arm-linux-gnueabi-gcc-4.4 -marm -o fp.s -S -O2 fp.c > > $ cat fp.s > > .cpu arm9tdmi > > .fpu softvfp > > .eabi_attribute 20, 1 > > .eabi_attribute 21, 1 > > .eabi_attribute 23, 3 > > .eabi_attribute 24, 1 > > .eabi_attribute 25, 1 > > .eabi_attribute 26, 2 > > .eabi_attribute 30, 2 > > .eabi_attribute 18, 4 > > .file "fp.c" > > .text > > .align 2 > > .global func > > .type func, %function > > func: > > @ Function supports interworking. > > @ args = 0, pretend = 0, frame = 0 > > @ frame_needed = 1, uses_anonymous_args = 0 > > stmfd sp!, {r4, r5, fp, lr} > > add fp, sp, #12 > > mov r4, r0 > > ldr r5, [fp, #0] ; <- bug here, should be [fp, #-4] > > mov r1, r4 > > ldr r0, .L3 > > bl printf > > add r0, r5, r4 > > sub sp, fp, #12 > > ldmfd sp!, {r4, r5, fp, lr} > > bx lr > > .L4: > > .align 2 > > .L3: > > .word .LC0 > > .size func, .-func > > .align 2 > > .global main > > .type main, %function > > main: > > @ Function supports interworking. > > @ args = 0, pretend = 0, frame = 0 > > @ frame_needed = 1, uses_anonymous_args = 0 > > stmfd sp!, {fp, lr} > > mov r0, #0 > > add fp, sp, #4 > > bl func > > mov r1, fp > > mov r2, r0 > > ldr r0, .L7 > > bl printf > > mov r0, #0 > > sub sp, fp, #4 > > ldmfd sp!, {fp, lr} > > bx lr > > .L8: > > .align 2 > > .L7: > > .word .LC1 > > .size main, .-main > > .section .rodata.str1.4,"aMS",%progbits,1 > > .align 2 > > .LC0: > > .ascii "Hello World! %i\012\000" > > .space 3 > > .LC1: > > .ascii "fp0: %p fp1: %p\012\000" > > .ident "GCC: (Debian 4.4.6-11) 4.4.6" > > .section .note.GNU-stack,"",%progbits > > Agreed, that's weird. I fails equally on my machine (though the code is > different, maybe because I'm using armv7-a as machine arch). > > > > > > > and if I apply the following patch > > > > diff --git a/t/fp.s b/t/fp1.s > > index 2047b05..85bb402 100644 > > --- a/t/fp.s > > +++ b/t/fp1.s > > @@ -20,7 +20,7 @@ func: > > stmfd sp!, {r4, r5, fp, lr} > > add fp, sp, #12 > > mov r4, r0 > > - ldr r5, [fp, #0] > > + ldr r5, [fp, #-4] > > mov r1, r4 > > ldr r0, .L3 > > bl printf > > > > it does work: > > > > $ arm-linux-gnueabi-gcc-4.4 -o fp fp.s > > $ arm-linux-gnueabi-gcc-4.4 -o fp1 fp1.s > > $ qemu-arm -L /usr/arm-linux-gnueabi ./fp > > Hello World! 0 > > fp0: 0x4080012c fp1: 0x8414 > > > > $ qemu-arm -L /usr/arm-linux-gnueabi ./fp1 > > Hello World! 0 > > fp0: 0x4080012c fp1: 0x4080012c > > > > that's why I think gcc generated wrong code here. > > In this case I agree. It seems that no matter what, __builtin_frame_address > loads fp, no matter where it points. > > > > > And to the reference, here is how code would look like if being part of > > linux: > > > > $ arm-linux-gnueabi-gcc-4.4 -o fpapcs.s -marm -mapcs > > -mno-sched-prolog -S -O2 fp.c $ arm-linux-gnueabi-gcc-4.4 -o fpapcs_gnu.s > > -marm -mapcs -mno-sched-prolog -mabi=apcs-gnu -S -O2 fp.c $ diff -u fp.s > > fpapcs.s > > --- fp.s 2012-12-13 20:34:09.000000000 +0400 > > +++ fpapcs.s 2012-12-13 20:34:09.000000000 +0400 > > @@ -17,16 +17,17 @@ > > @ Function supports interworking. > > @ args = 0, pretend = 0, frame = 0 > > @ frame_needed = 1, uses_anonymous_args = 0 > > - stmfd sp!, {r4, r5, fp, lr} > > - add fp, sp, #12 > > + mov ip, sp > > + stmfd sp!, {r4, r5, fp, ip, lr, pc} > > + sub fp, ip, #4 > > mov r4, r0 > > ldr r5, [fp, #0] > > mov r1, r4 > > ldr r0, .L3 > > bl printf > > add r0, r5, r4 > > - sub sp, fp, #12 > > - ldmfd sp!, {r4, r5, fp, lr} > > + sub sp, fp, #20 > > + ldmfd sp, {r4, r5, fp, sp, lr} > > bx lr > > .L4: > > .align 2 > > @@ -40,17 +41,18 @@ > > @ Function supports interworking. > > @ args = 0, pretend = 0, frame = 0 > > @ frame_needed = 1, uses_anonymous_args = 0 > > - stmfd sp!, {fp, lr} > > + mov ip, sp > > + stmfd sp!, {fp, ip, lr, pc} > > + sub fp, ip, #4 > > thus fp points to the location where pc is stored > > > mov r0, #0 > > - add fp, sp, #4 > > bl func > > mov r1, fp > > mov r2, r0 > > ldr r0, .L7 > > bl printf > > mov r0, #0 > > - sub sp, fp, #4 > > - ldmfd sp!, {fp, lr} > > + sub sp, fp, #12 > > + ldmfd sp, {fp, sp, lr} > > bx lr > > .L8: > > .align 2 > > > > $ diff -u fpapcs.s fpapcs_gnu.s > > --- fpapcs.s 2012-12-13 20:34:09.000000000 +0400 > > +++ fpapcs_gnu.s 2012-12-13 20:32:51.000000000 +0400 > > @@ -1,13 +1,3 @@ > > - .cpu arm9tdmi > > - .fpu softvfp > > - .eabi_attribute 20, 1 > > - .eabi_attribute 21, 1 > > - .eabi_attribute 23, 3 > > - .eabi_attribute 24, 1 > > - .eabi_attribute 25, 1 > > - .eabi_attribute 26, 2 > > - .eabi_attribute 30, 2 > > - .eabi_attribute 18, 4 > > .file "fp.c" > > .text > > .align 2 > > > > > > but it does not work either: > > > > $ arm-linux-gnueabi-gcc-4.4 -o fpapcs -mapcs -mno-sched-prolog fpapcs.s > > $ qemu-arm -L /usr/arm-linux-gnueabi ./fpapcs > > Hello World! 0 > > fp0: 0x4080011c fp1: 0x83e0 > > Of course since it's the return address which is displayed > > > > > > > ~~~~ > > > > There are many arm abis and maybe I'm doing something stupid, but to me > > all this looks like gcc generates incorrect code. > > Are you sure the code you compiled for ARM doesn't contain > __builtin_return_address instead of __builtin_frame_address?
Thomas, thanks for correcting my arm assembly errors. I agree I did make mistakes, but I'm sure __builtin_frame_address() is used, not __builtin_return_address(), and that __builtin_frame_address(1) is wrong. Here it is once again: $ cat fp.c #include <stdio.h> void *func(int x) { char *fp1 = __builtin_frame_address(1); printf("Hello World! %i\n", x); fp1 += x; return fp1; } int main() { void *fp0 = __builtin_frame_address(0); void *fp1 = func(0); printf("fp0: %p fp1: %p\n", fp0, fp1); return 0; } $ arm-linux-gnueabi-gcc-4.4 -ofpapcs -marm -mapcs -mno-sched-prolog -O2 fp.c $ qemu-arm -L /usr/arm-linux-gnueabi ./fpapcs Hello World! 0 fp0: 0x4080010c fp1: 0x83e0 > > > > ... as probably required by arm calling convention (not looked at the > > > > spec yet, but it seems reasanoble, given how push is really a block > > > > str and that str stores register in ascending order). > > > > > > I have once seen a page in MSDN describing the ARM stack frame of > > > Windows CE. I don't know if ARM specified how stack frames should look > > > like. > > > > From gcc manual: > > > > `-mapcs-frame' > > Generate a stack frame that is compliant with the ARM Procedure > > Call Standard for all functions, even if this is not strictly > > necessary for correct execution of the code. Specifying > > `-fomit-frame-pointer' with this option will cause the stack > > frames not to be generated for leaf functions. The default is > > `-mno-apcs-frame'. > > > > so at least there is "ARM Procedure Call Standard". Sorry, no time to > > search for it... > > > > > > Thanks, > > Undertimed Kirill > > Sorry for taking so much time to answer you. Nevermind... and thanks, Kirill _______________________________________________ Tinycc-devel mailing list Tinycc-devel@nongnu.org https://lists.nongnu.org/mailman/listinfo/tinycc-devel