[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3
--- Comment #1 from mattst88 at gmail dot com 2010-04-08 16:50 ---
Created an attachment (id=20337)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20337&action=view)
rewritten.S - external assembly

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691
[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3
--- Comment #2 from mattst88 at gmail dot com 2010-04-08 16:50 ---
Created an attachment (id=20338)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20338&action=view)
test.c

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691
[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3
--- Comment #3 from pinskia at gmail dot com 2010-04-08 17:06 ---
Subject: Re: Code segfault when compiled with -Os, -O2, or -O3

I don't think this is a bug in gcc. The inline asm uses $16, but any of the output/temp registers could be assigned to $16, because you don't say the argument is used as an input.

On Apr 8, 2010, at 9:50 AM, mattst88 at gmail dot com gcc-bugzi...@gcc.gnu.org wrote:
> --- Comment #2 from mattst88 at gmail dot com 2010-04-08 16:50 ---
> Created an attachment (id=20338)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20338&action=view)
> test.c

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691
[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3
--- Comment #4 from zackw at panix dot com 2010-04-08 17:28 ---
(In reply to comment #0)
> When this testcase, using inline assembly, is compiled with -Os, -O2, or -O3
> it segfaults. -O0 and -O1 allow it to run correctly.
>
> Moving the inline assembly into a separate file and including it in the
> compilation allow the program to run correctly at all -O levels.

From these symptoms, it is practically certain that you have done something wrong with the asm inputs and outputs. I don't have an Alpha compiler to hand, but just from looking at your code, I bet it will work correctly if you rewrite it like so:

    unsigned long rewritten(const unsigned long b[2])
    {
        unsigned long ofs, output;

        asm(
            "cmoveq %0,64,%1  # ofs    = (b[0] ? ofs : 64);\n"
            "cmoveq %0,%2,%0  # temp   = (b[0] ? b[0] : b[1]);\n"
            "cttz   %0,%0     # output = cttz(temp);\n"
            : "=r" (output), "=r" (ofs)
            : "r" (b[1]), "0" (b[0]), "1" (0)
        );
        return output + ofs;
    }

(I've assumed that the semantic of cmoveq a,b,c is if (a==0) c=b;)

The trick with asm() is to do as little as possible. I assume that the reason the assembly version beats the pure-C version is the cmoveq's, so I stripped the setup code and the addition. This allows me to express the _real_ argument constraints rather than fake ones, which lets me be confident that the optimizers will do what you want. Note that this also means volatile is unnecessary.

As a general principle, if you find yourself writing an asm() with a big long list of earlyclobber outputs but no inputs, you are doing it wrong.

-- 
zackw at panix dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zackw at panix dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691
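As a cross-check for the asm proposed in comment #4, here is a portable C sketch of what the three instructions compute, under the same assumption that cmoveq a,b,c means if (a==0) c=b. It is not code from the report: the names cttz_ref and rewritten_ref are hypothetical, GCC's __builtin_ctzll stands in for cttz, and returning 64 for a zero word is an assumption made here because __builtin_ctzll(0) is undefined.

```c
#include <stdint.h>

/* Hypothetical stand-in for the Alpha cttz instruction. */
static uint64_t cttz_ref(uint64_t x)
{
    /* Assumption: a zero input counts as 64 trailing zeros. */
    return x ? (uint64_t)__builtin_ctzll(x) : 64;
}

/* Hypothetical reference version of rewritten(); uint64_t replaces
   unsigned long so the width is 64 bits on every target. */
uint64_t rewritten_ref(const uint64_t b[2])
{
    uint64_t ofs  = b[0] ? 0 : 64;        /* cmoveq %0,64,%1 (ofs starts at 0) */
    uint64_t temp = b[0] ? b[0] : b[1];   /* cmoveq %0,%2,%0 */
    return cttz_ref(temp) + ofs;          /* cttz, then the addition left in C */
}
```

Comparing rewritten_ref() against the asm version over a handful of inputs, at every -O level, is one way to catch a constraint mistake of the kind discussed in this thread.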
[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3
-- 
zackw at panix dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691
[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3
--- Comment #5 from ubizjak at gmail dot com 2010-04-08 17:45 ---
The problem is that when rewritten() gets inlined (the -O2+ case), you can't expect the argument to be passed into the inlined section of the function via the locations specified by the ABI. So, you have:

$main..ng:
        lda $30,-48($30)        #,,
        stq $26,16($30)         #,
        stq $9,24($30)          #,
        .prologue 1
        stq $31,32($30)         # array,
        stq $31,40($30)         # array,
        .set macro
 # 35 "test.c" 1
        ldq $1,0($16)           # b0    # b0
        clr $3                  # ofs   # ofs
        ldq $2,8($16)           # b1    # b1
        cmoveq $1,64,$3         # ofs    = (b0 ? ofs : 64);    # b0, ofs
        cmoveq $1,$2,$1         # output = (b0 ? b0 : b1);     # b0, b1
        cttz $1,$4              # temp   = cttz(temp);         # b0, temp
        addq $4,$3,$9           # ret    = temp + ofs          # temp, ofs, result_1

which is clearly wrong, due to wrong ASM: the loads still go through $16, which need not hold the argument once the function is inlined into main.

So, invalid.

-- 
ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|                            |INVALID

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691
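The point about not relying on ABI argument registers can be tried on any GCC or Clang target, not only Alpha, by using an empty asm template. This is a minimal sketch rather than code from the report, and the function name pass_through is hypothetical. The matching constraint "0" asks the compiler to put the input in whatever register it picked for output %0, so the asm never names a fixed register and behaves the same whether or not the function is inlined.

```c
/* Hypothetical demo, not from the bug report. The template is empty,
   so the only effect comes from the operand constraints. The keyword
   is spelled __asm__ so it also compiles in strict ISO C modes. */
static inline unsigned long pass_through(unsigned long in)
{
    unsigned long out;

    __asm__("" : "=r" (out) : "0" (in));   /* out shares a register with in */
    return out;
}
```

Because the input is declared, the compiler is free to choose the register and still knows where the value lives. Hard-coding a register such as $16 in the template hides that dependency from the optimizers.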
[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3
--- Comment #6 from ubizjak at gmail dot com 2010-04-08 17:52 ---
(In reply to comment #4)
> From these symptoms, it is practically certain that you have done something
> wrong with the asm inputs and outputs. I don't have an Alpha compiler to hand,
> but just from looking at your code, I bet it will work correctly if you rewrite

*** ** ** ** YOU WIN ** ** ** ***

Your proposed code works OK for all optimization flags...

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691
[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3
--- Comment #7 from mattst88 at gmail dot com 2010-04-08 17:53 ---
(In reply to comment #4)
> (In reply to comment #0)
> > When this testcase, using inline assembly, is compiled with -Os, -O2, or -O3
> > it segfaults. -O0 and -O1 allow it to run correctly.
> >
> > Moving the inline assembly into a separate file and including it in the
> > compilation allow the program to run correctly at all -O levels.
>
> From these symptoms, it is practically certain that you have done something
> wrong with the asm inputs and outputs. I don't have an Alpha compiler to hand,
> but just from looking at your code, I bet it will work correctly if you rewrite
> it like so:
>
>     unsigned long rewritten(const unsigned long b[2])
>     {
>         unsigned long ofs, output;
>
>         asm(
>             "cmoveq %0,64,%1  # ofs    = (b[0] ? ofs : 64);\n"
>             "cmoveq %0,%2,%0  # temp   = (b[0] ? b[0] : b[1]);\n"
>             "cttz   %0,%0     # output = cttz(temp);\n"
>             : "=r" (output), "=r" (ofs)
>             : "r" (b[1]), "0" (b[0]), "1" (0)
>         );
>         return output + ofs;
>     }

Yep, your code works.

> (I've assumed that the semantic of cmoveq a,b,c is if (a==0) c=b;)
>
> The trick with asm() is to do as little as possible. I assume that the reason
> the assembly version beats the pure-C version is the cmoveq's, so I stripped
> the setup code and the addition. This allows me to express the _real_ argument
> constraints rather than fake ones, which lets me be confident that the
> optimizers will do what you want. Note that this also means volatile is
> unnecessary.
>
> As a general principle, if you find yourself writing an asm() with a big long
> list of earlyclobber outputs but no inputs, you are doing it wrong.

Thanks a ton for the advice. You knocked that out of the water.

Marking as INVALID.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691