[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #10 from Jakub Jelinek ---
Ok, closing.
[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

--- Comment #9 from Vladimir Makarov ---
(In reply to Jan Hubicka from comment #8)
> > Do we have any documentation that states how many registers can be used in
> > inline assembler for a particular arch and optset? "almost all" is not good
> > enough for that.
>
> It is somewhat difficult question. Basically you are not supposed to use
> any "fixed" registers where fixed set of registers depends on compilation
> flags. For example ESP is always fixed, EBP is fixed depending on frame
> pointer elimination.
> With PIC you need also EBX for GOT pointer and for DRAP you need another
> register.
> You need to leave enough registers so reload can load all additional operand
> and addresses.
> Things may have also changed with IRA - last time I looked into this was
> with old reload.
>
> Vladimir, do you have idea about more precise description?

It is very difficult to formulate this for all cases. For one alternative, I
would say that for any given class the number of allocatable hard regs of that
class should be not less than max (number of non-output operands with a
constraint register class intersecting the given class, the analogous number
for non-input operands). Early clobber operands are considered both non-input
and non-output. Operand modes should be taken into account too, e.g. an
operand may need 2 regs. We cannot assume that a pseudo has an equivalent
unallocatable hard reg expression.

Still, this description might not work when we have a subreg of a bigger
pseudo requiring at least 2 hard regs, as sometimes we need to reload the
whole pseudo. Also, this formulation can be too strict when there are both
register and non-register constraints, e.g. memory. Several alternatives can
complicate the situation even more.

As the algorithms behind reload/LRA are too complicated and have a lot of
details, the formulation will be too long. A simple formulation for all
possible cases would be too constrained and not practical. Therefore nobody
tried to write it down and we just have a fatal message 'can not reload'.

> > If the user code that worked in 4.8 correctly is now broken for 4.9, we
> > better need to respect the user and document it properly.
>
> Well, the problem is that precise set of constraints is extremely tied to
> compiler internals (and it also changes from release to release, indeed). It
> is overall problem of ASM statement extension in GCC, sadly.
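
The rule of thumb in comment #9 can be sketched as a small check. This is only an illustration of the counting rule for a single alternative, not anything taken from reload/LRA: the struct `asm_operand`, the bit-mask encoding of register classes, and the function names are all hypothetical, and the subreg and memory-constraint caveats from the comment are not modelled. An early-clobber operand is counted on both sides, which is one reading of "considered non-input and non-output".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one asm operand; not a GCC data structure. */
enum operand_kind { OP_INPUT, OP_OUTPUT, OP_EARLY_CLOBBER };

struct asm_operand {
    enum operand_kind kind;
    unsigned class_mask; /* bit i set: hard reg i satisfies the constraint */
    int nregs;           /* hard regs the operand's mode needs (>= 1) */
};

/* Count the set bits of a register mask. */
int mask_popcount(unsigned mask)
{
    int n = 0;
    while (mask) {
        n += (int)(mask & 1u);
        mask >>= 1;
    }
    return n;
}

/* max(regs for non-output operands, regs for non-input operands),
   restricted to operands whose class intersects class_mask. */
int regs_needed(const struct asm_operand *ops, size_t n, unsigned class_mask)
{
    int non_output = 0;
    int non_input = 0;
    for (size_t i = 0; i < n; i++) {
        assert(ops[i].nregs > 0);
        if ((ops[i].class_mask & class_mask) == 0)
            continue;                    /* class does not intersect */
        if (ops[i].kind != OP_OUTPUT)
            non_output += ops[i].nregs;  /* inputs and early clobbers */
        if (ops[i].kind != OP_INPUT)
            non_input += ops[i].nregs;   /* outputs and early clobbers */
    }
    return non_output > non_input ? non_output : non_input;
}

/* 1 if the allocatable regs of the class cover the estimate, else 0. */
int asm_fits(const struct asm_operand *ops, size_t n,
             unsigned class_mask, unsigned allocatable_mask)
{
    return mask_popcount(class_mask & allocatable_mask)
           >= regs_needed(ops, n, class_mask);
}
```

Passing the check does not guarantee that the asm can be reloaded; as the comment says, the formulation can be both too strict and too weak.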
[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

--- Comment #8 from Jan Hubicka ---
> Do we have any documentation that states how many registers can be used in
> inline assembler for a particular arch and optset? "almost all" is not good
> enough for that.

It is somewhat difficult question. Basically you are not supposed to use
any "fixed" registers where fixed set of registers depends on compilation
flags. For example ESP is always fixed, EBP is fixed depending on frame
pointer elimination.
With PIC you need also EBX for GOT pointer and for DRAP you need another
register.
You need to leave enough registers so reload can load all additional operand
and addresses.
Things may have also changed with IRA - last time I looked into this was
with old reload.

Vladimir, do you have idea about more precise description?

> If the user code that worked in 4.8 correctly is now broken for 4.9, we better
> need to respect the user and document it properly.

Well, the problem is that precise set of constraints is extremely tied to
compiler internals (and it also changes from release to release, indeed). It
is overall problem of ASM statement extension in GCC, sadly.

Honza
[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

--- Comment #7 from Alexander Ivchenko ---
Do we have any documentation that states how many registers can be used in
inline assembler for a particular arch and optset? "almost all" is not good
enough for that.

If the user code that worked in 4.8 correctly is now broken for 4.9, we better
need to respect the user and document it properly.
[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek ---
(In reply to Alexander Ivchenko from comment #5)
> I understand the technical reasons of the complexity of the correct and
> efficient register allocation here, but what I don't understand is this:
>
> $> gcc_4.7 test.c -c -fPIC -mstackrealign -march=core-avx2 -m32
> $> gcc_4.8 test.c -c -fPIC -mstackrealign -march=core-avx2 -m32
> $> gcc_4.9 test.c -c -fPIC -mstackrealign -march=core-avx2 -m32
> test.c: In function 'testFunc':
> test.c:7:3: error: 'asm' operand has impossible constraints
>    __asm__(
>    ^
>
> How can we allow to break the user code with the release version of the
> compiler here..?

If it does something wrong, and using almost all or all available registers in
an asm is always wrong, then why not. Just compile it with
-maccumulate-outgoing-args or without -mstackrealign, or better, rework it to
need fewer registers (pass some more arguments in memory, or even better in
some memory structure, so that they can all be loaded/saved from there using
fewer registers).

I'd say this should be closed NOTABUG (or WONTFIX?).
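
The "memory structure" rework suggested in comment #6 might look roughly like the sketch below. It is not the asm from the report: `struct asm_args` and `add_via_block` are made-up names, the arithmetic is a placeholder, and the hard-coded offsets assume a 4-byte `int` with no padding, as on x86. The point is only that the asm takes a single "r" operand (the block's address) and reaches every value through it, instead of asking for one register per argument.

```c
/* Hypothetical argument block; all values travel through memory. */
struct asm_args {
    int a;   /* offset 0 */
    int b;   /* offset 4 */
    int sum; /* offset 8, written by the asm */
};

/* Computes p->sum = p->a + p->b with one register operand plus one
   clobbered scratch register on x86; plain C elsewhere. */
void add_via_block(struct asm_args *p)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile(
        "movl 0(%0), %%eax\n\t"  /* load a */
        "addl 4(%0), %%eax\n\t"  /* add b */
        "movl %%eax, 8(%0)"      /* store sum */
        :                        /* no register outputs */
        : "r"(p)                 /* the only register operand */
        : "eax", "cc", "memory");
#else
    p->sum = p->a + p->b;        /* portable fallback for other targets */
#endif
}
```

The "memory" clobber tells the compiler that the asm reads and writes the block behind the pointer, so the fields are not cached in registers across the statement.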
[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

--- Comment #5 from Alexander Ivchenko ---
I understand the technical reasons of the complexity of the correct and
efficient register allocation here, but what I don't understand is this:

$> gcc_4.7 test.c -c -fPIC -mstackrealign -march=core-avx2 -m32
$> gcc_4.8 test.c -c -fPIC -mstackrealign -march=core-avx2 -m32
$> gcc_4.9 test.c -c -fPIC -mstackrealign -march=core-avx2 -m32
test.c: In function 'testFunc':
test.c:7:3: error: 'asm' operand has impossible constraints
   __asm__(
   ^

How can we allow to break the user code with the release version of the
compiler here..?
[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

Jan Hubicka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #4 from Jan Hubicka ---
We never really allowed asm statements to use nearly all available registers,
so I would declare this to be a bug in Android's asm statement... Though for
sure it would be better to free EBP somehow.
[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

--- Comment #3 from Vladimir Makarov ---
First of all, reload also cannot generate this code with the mentioned i386.c
change.

When we use -maccumulate-outgoing-args, we have before reload/LRA

(insn 6 10 7 2 (parallel [
            (set (reg:SI 87 [ x ])
                (asm_operands:SI ("1: ") ("=d") 0 [
                        (mem/c:SI (reg/f:SI 16 argp) [0 count+0 S4 A32])
                        (mem/c:V2DI (plus:SI (reg/f:SI 20 frame)
                                (const_int -32 [0xffe0])) [0 tmp1+0 S16 A128])
                        (reg:SI 87 [ x ])
                    ]
                     [
                        (asm_input:SI ("m") (null):0)
                        (asm_input:V2DI ("m") (null):0)
                        (asm_input:SI ("0") (null):0)
                    ]
                     [] b2.c:7))
            (clobber (reg:QI 18 fpsr))
            (clobber (reg:QI 17 flags))
            (clobber (reg:QI 0 ax))
            (clobber (reg:QI 5 di))
            (clobber (reg:QI 4 si))
            (clobber (reg:QI 2 cx))
        ]) b2.c:7 -1

and argp is transformed after lra/reload into fp+const:

(insn 6 10 7 2 (parallel [
            (set (reg:SI 1 dx [orig:87 x ] [87])
                (asm_operands:SI ("1: ") ("=d") 0 [
                        (mem/c:SI (plus:SI (reg/f:SI 6 bp)
                                (const_int 8 [0x8])) [0 count+0 S4 A32])
                        (mem/c:V2DI (reg/f:SI 7 sp) [0 tmp1+0 S16 A128])
                        (reg:SI 1 dx [orig:87 x ] [87])
                    ]

If we don't use -maccumulate-outgoing-args, we have before reload/LRA:

(insn/f 10 3 2 2 (set (reg:SI 89)
        (reg:SI 2 cx)) 86 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 2 cx)
        (expr_list:REG_CFA_SET_VDRAP (reg:SI 89)
            (nil ...
(insn 6 11 7 2 (parallel [
            (set (reg:SI 87 [ x ])
                (asm_operands:SI ("1: ") ("=d") 0 [
                        (mem/c:SI (reg:SI 89) [0 count+0 S4 A32])
                        (mem/c:V2DI (plus:SI (reg/f:SI 20 frame)
                                (const_int -32 [0xffe0])) [0 tmp1+0 S16 A128])
                        (reg:SI 87 [ x ])
                    ]

As we have only 1 free reg for the asm (4 regs are clobbered in the asm, ebx
is taken for -fPIC, ebp is always needed for -mstackrealign), we cannot use it
for both p87 (it has a + constraint) and p89. The generated code also has no
equiv for p89 to use instead. So reload/LRA can do nothing in this situation.

Even if I implement fp elimination in the presence of sp changes in RTL (and I
am working on it, as it is needed for other existing PRs and is very important
for generated code performance), bp will not be free (again because of
-mstackrealign).

So if we want to compile the code, we should revert the original change

-  /* ??? Unwind info is not correct around the CFG unless either a frame
-     pointer is present or M_A_O_A is set.  Fixing this requires rewriting
-     unwind info generation to be aware of the CFG and propagating states
-     around edges.  */
-  if ((flag_unwind_tables || flag_asynchronous_unwind_tables
-       || flag_exceptions || flag_non_call_exceptions)
-      && flag_omit_frame_pointer
-      && !(target_flags & MASK_ACCUMULATE_OUTGOING_ARGS))
-    {
-      if (target_flags_explicit & MASK_ACCUMULATE_OUTGOING_ARGS)
-        warning (0, "unwind tables currently require either a frame pointer "
-                 "or %saccumulate-outgoing-args%s for correctness",
-                 prefix, suffix);
-      target_flags |= MASK_ACCUMULATE_OUTGOING_ARGS;
-    }
-

although we could modify the comment if it is not true anymore. Still, the
code will be broken for some tunings with -mno-accumulate-outgoing-args by
default.

To really solve the problem, we should free bp somehow. Probably it can be
done by a smarter stack realign implementation or by saving/restoring bp
around the asm. The latter is a very complicated task and cannot be done in
the gcc-4.9 time frame.
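
The register arithmetic in comment #3 can be written out as a small sketch. The register numbers for ax, dx, cx, si, di, bp and sp are the ones shown in the RTL dumps; bx = 3 is filled in as an assumption from GCC's usual i386 numbering. The helper names are hypothetical and nothing here is GCC code.

```c
#include <assert.h>

/* 32-bit x86 general registers, numbered as in the RTL dumps above
   (BX = 3 does not appear in the dumps and is an assumption). */
enum gpr { AX = 0, DX = 1, CX = 2, BX = 3, SI = 4, DI = 5, BP = 6, SP = 7,
           NUM_GPRS = 8 };

#define REG_BIT(r) (1u << (r))

/* Count the set bits of a register mask. */
int count_set_bits(unsigned mask)
{
    int n = 0;
    while (mask) {
        n += (int)(mask & 1u);
        mask >>= 1;
    }
    return n;
}

/* General registers left for asm operands once reserved and clobbered
   registers are taken out. */
int free_gprs(unsigned reserved, unsigned clobbered)
{
    unsigned all = REG_BIT(NUM_GPRS) - 1u;
    assert((reserved & ~all) == 0 && (clobbered & ~all) == 0);
    return count_set_bits(all & ~(reserved | clobbered));
}

/* The situation from the report: -m32 -fPIC -mstackrealign. */
int free_gprs_in_report(void)
{
    unsigned reserved = REG_BIT(SP)    /* stack pointer, always fixed */
                      | REG_BIT(BX)    /* GOT pointer for -fPIC */
                      | REG_BIT(BP);   /* needed for -mstackrealign */
    unsigned clobbered = REG_BIT(AX) | REG_BIT(DI)
                       | REG_BIT(SI) | REG_BIT(CX);
    return free_gprs(reserved, clobbered);
}
```

Under these assumptions `free_gprs_in_report()` returns 1 (only edx is left), while without -maccumulate-outgoing-args the asm needs two registers, one for p87 and one for p89. With -maccumulate-outgoing-args the first operand is addressed as bp+8, so the single free register is enough.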
[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

Vladimir Makarov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #2 from Vladimir Makarov ---
There are not enough regs for the asm, as the frame pointer cannot be used
after the x86 tuning changes. The root of the problem is the absence of some
functionality in LRA, which prevents frame pointer elimination in the presence
of explicit sp changes. It also hurts performance for the new tuning (there
are PRs for this too).

I've started to work on this problem. I hope it will be solved in 2 weeks.
[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-11-14
          Component|regression                  |rtl-optimization
   Target Milestone|---                         |4.9.0
            Summary|error: ‘asm’ operand has    |[4.9 Regression] error:
                   |impossible constraints      |‘asm’ operand has
                   |                            |impossible constraints
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener ---
Confirmed.