[Bug target/43440] Overwriting neon quad register does not clobber all included single registers
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43440 Richard Earnshaw changed: What|Removed |Added Status|NEW |RESOLVED CC||rearnsha at gcc dot gnu.org Resolution||FIXED Target Milestone|--- |4.6.0 --- Comment #12 from Richard Earnshaw 2010-11-13 23:09:47 UTC --- Fixed on trunk.
[Bug target/43440] Overwriting neon quad register does not clobber all included single registers
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43440 --- Comment #11 from Richard Earnshaw 2010-11-13 23:08:31 UTC --- Author: rearnsha Date: Sat Nov 13 23:08:26 2010 New Revision: 166723 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=166723 Log: PR target/43440 * tm.texi.in (OVERLAPPING_REGISTER_NAMES): Document new macro. * tm.texi: Regenerated. * output.h (decode_reg_name_and_count): Declare. * varasm.c (decode_reg_name_and_count): New function. (decode_reg_name): Reimplement using decode_reg_name_and_count. * reginfo.c (fix_register): Use decode_reg_name_and_count and iterate over all regs used. * stmt.c (expand_asm_operands): Likewise. * arm/aout.h (OVERLAPPING_REGISTER_NAMES): Define. (ADDITIONAL_REGISTER_NAMES): Remove aliases that overlap multiple machine registers. Modified: trunk/gcc/ChangeLog
[Bug target/43440] Overwriting neon quad register does not clobber all included single registers
--- Comment #10 from pinskia at gcc dot gnu dot org 2010-04-22 23:21 --- *** Bug 43860 has been marked as a duplicate of this bug. *** -- pinskia at gcc dot gnu dot org changed: What|Removed |Added CC||lessen42+gcc at gmail dot ||com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43440
[Bug target/43440] Overwriting neon quad register does not clobber all included single registers
--- Comment #9 from ramana at gcc dot gnu dot org 2010-03-21 13:41 --- (In reply to comment #8) > What about just forbidding to use "q" registers in the inline assembly clobber > list? Is it difficult to do? > It's not difficult to - The patch only penalises the cases where an s ends up clobbering another s in the same quad (or double for that matter) but the common case of the q or the d clobbering the s in the same quad is a necessity and correctness. Thus if we wanted to allow q and d in the clobber list then we need to live with this restriction. Ramana -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43440
[Bug target/43440] Overwriting neon quad register does not clobber all included single registers
--- Comment #8 from siarhei dot siamashka at gmail dot com 2010-03-21 10:05 --- What about just forbidding to use "q" registers in the inline assembly clobber list? Is it difficult to do? As a nice bonus, the existing potentially unsafe inline assembly will fail to compile, will be spotted and will have to be fixed (forcing the application developer to manually convert clobber list to use "d" or "s" registers). It will also solve compatibility problems with the older versions of gcc which still have this bug and still might be in use for a very long time. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43440
[Bug target/43440] Overwriting neon quad register does not clobber all included single registers
--- Comment #7 from ramana at gcc dot gnu dot org 2010-03-21 09:23 --- (In reply to comment #5) > The proper solution seems to be extremely simple to me and it should do > exactly > the same what an application programmer would do to workaround the bug. Just > when initially parsing clobber list do a simple text substitution "q0" -> > "d0", > "d1". Same for all the other "q" registers. > Maybe I wasn't clear enough. If you've written to s3 and used q0 some place in your program then you want to indicate that q0 has been clobbered. IIUC the way the Neon registers are represented in GCC is by having the same register "number" for s0, d0 and q0 - thus the clobber list generated for your inline asm statement at expand time is essentialy clobber ( reg:QI 68) or whatever be the register number for s0, d0 and q0. Thus at the time the register allocator comes along you've lost all information regarding the mode in which the register was used and it would think that just s0 was being clobbered. Thus the clean solution to this IMO is to have some way of representing a proper mode for each of the register numbers and then making the register allocator / other optimizers honour it. In theory it should work but in practice I'm not sure if that's the correct thing to do in stage4. This isn't a problem in standard C code because the RTL for accessing those variables would have the Mode in which the registers were being accessed and thus any overlapping can be detected. Hope that makes things clearer. Ramana -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43440
[Bug target/43440] Overwriting neon quad register does not clobber all included single registers
--- Comment #6 from siarhei dot siamashka at gmail dot com 2010-03-21 03:56 --- (In reply to comment #4) > IMO the reasons as described in my email is another motivation for Neon > programmers to be using intrinsics rather than inline assembler and to improve > in general Neon intrinsics. The problem is that today neon intrinsics have a lot more issues in practice. The resulting code is way too slow to be usable, especially when gcc thinks that it is running out of registers and starts spilling variables to memory. Bug 43118 and bug 43364 are just some very basic examples of performance issues without looking any deeper. Not having many bugs in bugzilla for NEON intrinsics means that either they work good enough or nobody seriously uses them. At least for me it is the latter case. Autovectorization is even worse than intrinsics. Inline assembly has a few bugs, but they can be easily workarounded. Sorry for this rant/offtopic. Just thought that you might be somewhat interested the opinion of someone from the other side of the fence :-) -- siarhei dot siamashka at gmail dot com changed: What|Removed |Added CC||siarhei dot siamashka at ||gmail dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43440
[Bug target/43440] Overwriting neon quad register does not clobber all included single registers
--- Comment #5 from siarhei dot siamashka at gmail dot com 2010-03-21 03:33 --- I don't quite understand what's the problem: "This patch has the unhappy side effect of clobbering s0, s1 and s2 if s3 is used because that's the only way we can indicate that q0 is clobbered by the write to s0." The proper solution seems to be extremely simple to me and it should do exactly the same what an application programmer would do to workaround the bug. Just when initially parsing clobber list do a simple text substitution "q0" -> "d0", "d1". Same for all the other "q" registers. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43440
[Bug target/43440] Overwriting neon quad register does not clobber all included single registers
--- Comment #4 from ramana at gcc dot gnu dot org 2010-03-20 12:42 --- Patch submitted here for comments. http://gcc.gnu.org/ml/gcc-patches/2010-03/msg00925.html IMO the reasons as described in my email is another motivation for Neon programmers to be using intrinsics rather than inline assembler and to improve in general Neon intrinsics. cheers Ramana -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43440
[Bug target/43440] Overwriting neon quad register does not clobber all included single registers
--- Comment #3 from batuzovk at ispras dot ru 2010-03-20 12:35 --- (In reply to comment #2) > Does that look any better to you ? Yes, definitely better. Thank you for quick response. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43440
[Bug target/43440] Overwriting neon quad register does not clobber all included single registers
--- Comment #2 from ramana at gcc dot gnu dot org 2010-03-20 10:51 --- TARGET_MD_ASSEMBLE_CLOBBERS might just be the help we need on this one - I think I have a patch which generates the following code. Does that look any better to you ? foo: @ args = 0, pretend = 0, frame = 8 @ frame_needed = 0, uses_anonymous_args = 0 @ link register save eliminated. fmsrs13, r0 fmsrs14, r0 fmacs s13, s14, s14 fstmfdd sp!, {d8, d9, d10, d11, d12, d13, d14, d15} movwr3, #:lower16:y sub sp, sp, #8 movtr3, #:upper16:y fstss13, [sp, #4] fstss13, [r3, #0] @ 5 "/tmp/fail.c" 1 vmov.i8 q0, #0 @ 6 "/tmp/fail.c" 1 vmov.i8 q1, #0 @ 7 "/tmp/fail.c" 1 vmov.i8 q2, #0 @ 8 "/tmp/fail.c" 1 vmov.i8 q3, #0 @ 9 "/tmp/fail.c" 1 vmov.i8 q4, #0 @ 10 "/tmp/fail.c" 1 vmov.i8 q5, #0 @ 11 "/tmp/fail.c" 1 vmov.i8 q6, #0 @ 12 "/tmp/fail.c" 1 vmov.i8 q7, #0 ldr r0, [sp, #4]@ float add sp, sp, #8 fldmfdd sp!, {d8, d9, d10, d11, d12, d13, d14, d15} bx lr .size foo, .-foo .comm y,4,4 .ident "GCC: (GNU) 4.5.0 20100319 (experimental)" -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43440
[Bug target/43440] Overwriting neon quad register does not clobber all included single registers
--- Comment #1 from ramana at gcc dot gnu dot org 2010-03-19 13:32 --- Yes, there is no easy way of describing the container relationship of the Neon registers to the compiler at the minute. Thus the only work around at the moment is to indicate that all the registers contained as part of this register name are clobbered in the explicit asm statement. -- ramana at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 Keywords||wrong-code Last reconfirmed|-00-00 00:00:00 |2010-03-19 13:32:13 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43440