[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints

2013-12-16 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #10 from Jakub Jelinek jakub at gcc dot gnu.org ---
Ok, closing.


[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints

2013-12-04 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek jakub at gcc dot gnu.org ---
(In reply to Alexander Ivchenko from comment #5)
> I understand the technical reasons for the complexity of correct and
> efficient register allocation here, but what I don't understand is this:
>
> $ gcc_4.7 test.c -c -fPIC -mstackrealign -march=core-avx2 -m32
> $ gcc_4.8 test.c -c -fPIC -mstackrealign -march=core-avx2 -m32
> $ gcc_4.9 test.c -c -fPIC -mstackrealign -march=core-avx2 -m32
> test.c: In function 'testFunc':
> test.c:7:3: error: 'asm' operand has impossible constraints
>    __asm__(
>    ^
>
> How can we allow a release version of the compiler to break user code
> here?

If it does something wrong, and using all or almost all available registers in
an asm is always wrong, then why not.  Just compile it with
-maccumulate-outgoing-args or without -mstackrealign, or better, rework it to
need fewer registers (pass some more arguments in memory, or even better in
some memory structure, so that they can all be loaded/saved from there using
fewer registers).
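
A minimal sketch of the "memory structure" suggestion, not the code from the
bug report: the struct asm_args type, its fields, and sum_via_block are
hypothetical names chosen for illustration.  The asm asks the compiler for a
single pointer register and clobbers only eax, instead of requesting one
register per value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical argument block: every value the asm needs lives in memory,
   so the asm only asks the compiler for one pointer register. */
struct asm_args {
    int32_t a;
    int32_t b;
    int32_t c;
    int32_t result;
};

/* The asm below hard-codes the field offsets 0, 4, 8 and 12. */
static_assert(offsetof(struct asm_args, result) == 12,
              "asm in sum_via_block assumes this layout");

/* Stores a + b + c into args->result. */
void sum_via_block(struct asm_args *args)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile(
        "movl   (%[p]), %%eax\n\t"   /* eax  = args->a        */
        "addl  4(%[p]), %%eax\n\t"   /* eax += args->b        */
        "addl  8(%[p]), %%eax\n\t"   /* eax += args->c        */
        "movl  %%eax, 12(%[p])"      /* args->result = eax    */
        :                            /* no register outputs   */
        : [p] "r"(args)              /* one register operand  */
        : "eax", "cc", "memory");    /* one clobbered scratch */
#else
    /* Portable fallback so the sketch also builds on other targets. */
    args->result = args->a + args->b + args->c;
#endif
}
```

With this shape the asm ties up two registers in total (the pointer plus eax),
however many values the block carries, which leaves the register allocator
room for ebx, ebp, and any address reloads.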

I'd say this should be closed NOTABUG (or WONTFIX?).


[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints

2013-12-04 Thread aivchenk at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

--- Comment #7 from Alexander Ivchenko aivchenk at gmail dot com ---
Do we have any documentation that states how many registers can be used in
inline assembler for a particular arch and option set? "Almost all" is not
good enough for that.

If user code that worked correctly in 4.8 is now broken in 4.9, we had better
respect the user and document it properly.


[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints

2013-12-04 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

--- Comment #8 from Jan Hubicka hubicka at ucw dot cz ---
> Do we have any documentation that states how many registers can be used in
> inline assembler for a particular arch and option set? "Almost all" is not
> good enough for that.

It is a somewhat difficult question.  Basically, you are not supposed to use
any fixed registers, and the set of fixed registers depends on compilation
flags.  For example, ESP is always fixed, and EBP is fixed depending on frame
pointer elimination.  With PIC you also need EBX for the GOT pointer, and for
DRAP you need another register.
You need to leave enough registers so that reload can load all additional
operands and addresses.
Things may also have changed with IRA; the last time I looked into this was
with the old reload.

Vladimir, do you have an idea for a more precise description?

> If user code that worked correctly in 4.8 is now broken in 4.9, we had better
> respect the user and document it properly.

Well, the problem is that the precise set of constraints is extremely tied to
compiler internals (and it also changes from release to release, indeed).  It
is an overall problem of the ASM statement extension in GCC, sadly.

Honza


[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints

2013-12-04 Thread vmakarov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

--- Comment #9 from Vladimir Makarov vmakarov at gcc dot gnu.org ---
(In reply to Jan Hubicka from comment #8)
> > Do we have any documentation that states how many registers can be used in
> > inline assembler for a particular arch and option set? "Almost all" is not
> > good enough for that.
>
> It is a somewhat difficult question.  Basically, you are not supposed to use
> any fixed registers, and the set of fixed registers depends on compilation
> flags.  For example, ESP is always fixed, and EBP is fixed depending on frame
> pointer elimination.  With PIC you also need EBX for the GOT pointer, and for
> DRAP you need another register.
> You need to leave enough registers so that reload can load all additional
> operands and addresses.
> Things may also have changed with IRA; the last time I looked into this was
> with the old reload.
>
> Vladimir, do you have an idea for a more precise description?

It is very difficult to formulate this for all cases.

For one alternative, I would say that for any given class, the number of
allocatable hard regs of that class should be not less than max (number of
non-output operands whose constraint register class intersects the given
class, the analogous number for non-input operands).  Early clobber operands
are considered both non-input and non-output.  Operand modes should be taken
into account too, e.g. when an operand needs 2 regs.

We cannot assume that a pseudo has an equivalent unallocatable hard reg
expression.

Still, this description might not work when we have a subreg of a bigger
pseudo requiring at least 2 hard regs, as sometimes we need to reload the
whole pseudo.

Also, this formulation can be too strict when there are both register and
non-register constraints, e.g. memory.  Several alternatives can complicate
the situation even more.

As the algorithms behind reload/LRA are too complicated and have a lot of
details, the formulation would be too long.  A simple formulation for all
possible cases would be too constrained and not practical.  Therefore nobody
has tried to write it down, and we just have a fatal message 'can not reload'.
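
The single-alternative rule above can be sketched as a small check.  This is
only an illustration of the rule as stated in this comment, not the algorithm
reload/LRA actually runs, and it shares the limitations listed above (subregs,
memory constraints, several alternatives).  The types and function names are
hypothetical, register classes are modelled as bit masks of hard registers,
and in-out operands are assumed to count on both sides just like early clobber
operands.

```c
#include <assert.h>
#include <stddef.h>

enum op_kind { OP_INPUT, OP_OUTPUT, OP_INOUT, OP_EARLYCLOBBER };

/* Hypothetical model of one asm operand that needs hard registers. */
struct asm_operand {
    enum op_kind kind;
    unsigned class_mask; /* hard regs its constraint allows (bit i = reg i) */
    int nregs;           /* hard regs needed for the operand's mode */
};

static int bits_set(unsigned mask)
{
    int n = 0;
    while (mask) {
        mask &= mask - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* max(regs for non-output operands, regs for non-input operands), counting
   only operands whose constraint class intersects class_mask. */
int regs_required(const struct asm_operand *ops, size_t n, unsigned class_mask)
{
    int non_output = 0, non_input = 0;
    for (size_t i = 0; i < n; i++) {
        if ((ops[i].class_mask & class_mask) == 0)
            continue;
        assert(ops[i].nregs > 0);
        if (ops[i].kind != OP_OUTPUT) /* input, in-out, early clobber */
            non_output += ops[i].nregs;
        if (ops[i].kind != OP_INPUT)  /* output, in-out, early clobber */
            non_input += ops[i].nregs;
    }
    return non_output > non_input ? non_output : non_input;
}

/* Nonzero if the class has at least as many allocatable hard regs as the
   rule asks for. */
int class_is_satisfiable(const struct asm_operand *ops, size_t n,
                         unsigned class_mask, unsigned allocatable_mask)
{
    return bits_set(class_mask & allocatable_mask)
           >= regs_required(ops, n, class_mask);
}
```

For example, an in-out operand tied to one register plus an address that
needs any general register requires two registers of the general class, so
the check fails when only one general register is allocatable.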


  
> > If user code that worked correctly in 4.8 is now broken in 4.9, we had
> > better respect the user and document it properly.
>
> Well, the problem is that the precise set of constraints is extremely tied to
> compiler internals (and it also changes from release to release, indeed).  It
> is an overall problem of the ASM statement extension in GCC, sadly.
 



[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints

2013-11-24 Thread aivchenk at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

--- Comment #5 from Alexander Ivchenko aivchenk at gmail dot com ---
I understand the technical reasons for the complexity of correct and
efficient register allocation here, but what I don't understand is this:

$ gcc_4.7 test.c -c -fPIC -mstackrealign -march=core-avx2 -m32
$ gcc_4.8 test.c -c -fPIC -mstackrealign -march=core-avx2 -m32
$ gcc_4.9 test.c -c -fPIC -mstackrealign -march=core-avx2 -m32
test.c: In function 'testFunc':
test.c:7:3: error: 'asm' operand has impossible constraints
   __asm__(
   ^

How can we allow a release version of the compiler to break user code
here?


[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints

2013-11-22 Thread vmakarov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

--- Comment #3 from Vladimir Makarov vmakarov at gcc dot gnu.org ---
First of all, reload also cannot generate this code with the mentioned i386.c
change.

When we use -maccumulate-outgoing-args, we have before reload/LRA:

(insn 6 10 7 2 (parallel [
            (set (reg:SI 87 [ x ])
                (asm_operands:SI ("1:
") ("=d") 0 [
                        (mem/c:SI (reg/f:SI 16 argp) [0 count+0 S4 A32])
                        (mem/c:V2DI (plus:SI (reg/f:SI 20 frame)
                                (const_int -32 [0xffe0])) [0 tmp1+0 S16 A128])
                        (reg:SI 87 [ x ])
                    ]
                     [
                        (asm_input:SI ("m") (null):0)
                        (asm_input:V2DI ("m") (null):0)
                        (asm_input:SI ("0") (null):0)
                    ]
                     [] b2.c:7))
            (clobber (reg:QI 18 fpsr))
            (clobber (reg:QI 17 flags))
            (clobber (reg:QI 0 ax))
            (clobber (reg:QI 5 di))
            (clobber (reg:QI 4 si))
            (clobber (reg:QI 2 cx))
        ]) b2.c:7 -1

and argp is transformed after lra/reload into fp+const:

(insn 6 10 7 2 (parallel [
            (set (reg:SI 1 dx [orig:87 x ] [87])
                (asm_operands:SI ("1:
") ("=d") 0 [
                        (mem/c:SI (plus:SI (reg/f:SI 6 bp)
                                (const_int 8 [0x8])) [0 count+0 S4 A32])
                        (mem/c:V2DI (reg/f:SI 7 sp) [0 tmp1+0 S16 A128])
                        (reg:SI 1 dx [orig:87 x ] [87])
                    ]

If we don't use -maccumulate-outgoing-args, we have before reload/LRA:

(insn/f 10 3 2 2 (set (reg:SI 89)
        (reg:SI 2 cx)) 86 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 2 cx)
        (expr_list:REG_CFA_SET_VDRAP (reg:SI 89)
            (nil
...
(insn 6 11 7 2 (parallel [
            (set (reg:SI 87 [ x ])
                (asm_operands:SI ("1:
") ("=d") 0 [
                        (mem/c:SI (reg:SI 89) [0 count+0 S4 A32])
                        (mem/c:V2DI (plus:SI (reg/f:SI 20 frame)
                                (const_int -32 [0xffe0])) [0 tmp1+0 S16 A128])
                        (reg:SI 87 [ x ])
                    ]

As we have only 1 free reg for the asm (4 regs are clobbered in the asm, ebx
is taken for -fPIC, and ebp is always needed for -mstackrealign), we cannot
use it for both p87 (it has a + constraint) and p89.  The generated code also
has no equiv for p89 that could be used instead.  So reload/LRA can do nothing
in this situation.
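
The register count in the paragraph above can be reproduced with a small
sketch.  It is a back-of-the-envelope model only, not how the compiler
computes its fixed register set: struct build_flags, free_regs and count_regs
are hypothetical names, and the only rules modelled are the ones stated in
this thread (esp is always fixed, ebx is taken for -fPIC, ebp is needed for
-mstackrealign).  The register numbers follow the dumps above (0 ax, 1 dx,
2 cx, 4 si, 5 di, 6 bp, 7 sp), with bx as number 3.

```c
#include <assert.h>
#include <stdbool.h>

/* 32-bit x86 integer registers in GCC's hard register numbering. */
enum x86_reg { AX, DX, CX, BX, SI, DI, BP, SP, NUM_REGS };

#define REG_BIT(r) (1u << (r))

/* Hypothetical summary of the options that matter in this thread. */
struct build_flags {
    bool pic;          /* -fPIC: ebx holds the GOT pointer */
    bool stackrealign; /* -mstackrealign: ebp is needed    */
};

/* Mask of registers left for asm operands and address reloads. */
unsigned free_regs(struct build_flags flags, unsigned asm_clobbers)
{
    assert((asm_clobbers >> NUM_REGS) == 0); /* only the 8 regs above */

    unsigned avail = REG_BIT(NUM_REGS) - 1u; /* start with all eight  */
    avail &= ~REG_BIT(SP);                   /* esp is always fixed   */
    if (flags.pic)
        avail &= ~REG_BIT(BX);
    if (flags.stackrealign)
        avail &= ~REG_BIT(BP);
    return avail & ~asm_clobbers;            /* drop the asm clobbers */
}

/* Number of registers in a mask. */
int count_regs(unsigned mask)
{
    int n = 0;
    while (mask) {
        mask &= mask - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

With both flags set and ax, di, si and cx clobbered, only dx remains, which
matches the "1 free reg" above: the "=d" operand takes it and nothing is left
to hold the address in p89.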

Even if I implement fp elimination in the presence of sp changes in RTL (and I
am working on it, as it is needed for other existing PRs and is very important
for generated code performance), bp will not be free (again because of
-mstackrealign).  So if we want to compile the code, we should revert the
original change

-  /* ??? Unwind info is not correct around the CFG unless either a frame
-     pointer is present or M_A_O_A is set.  Fixing this requires rewriting
-     unwind info generation to be aware of the CFG and propagating states
-     around edges.  */
-  if ((flag_unwind_tables || flag_asynchronous_unwind_tables
-       || flag_exceptions || flag_non_call_exceptions)
-      && flag_omit_frame_pointer
-      && !(target_flags & MASK_ACCUMULATE_OUTGOING_ARGS))
-    {
-      if (target_flags_explicit & MASK_ACCUMULATE_OUTGOING_ARGS)
-        warning (0, "unwind tables currently require either a frame pointer "
-                 "or %saccumulate-outgoing-args%s for correctness",
-                 prefix, suffix);
-      target_flags |= MASK_ACCUMULATE_OUTGOING_ARGS;
-    }
-

although we could modify the comment if it is not true anymore.

Still, the code will be broken for some tunings that use
-mno-accumulate-outgoing-args by default.

To really solve the problem, we should free bp somehow.  Probably it can be
done by a smarter stack realign implementation or by saving/restoring bp around
the asm.  The latter is a very complicated task and cannot be done in the
gcc-4.9 time frame.


[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints

2013-11-22 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

Jan Hubicka hubicka at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #4 from Jan Hubicka hubicka at gcc dot gnu.org ---
We never really allowed asm statements to use nearly all available registers,
so I would declare this to be a bug in Android's asm statement...
Though for sure it would be better to free EBP somehow.


[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints

2013-11-19 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1


[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints

2013-11-19 Thread vmakarov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

Vladimir Makarov vmakarov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #2 from Vladimir Makarov vmakarov at gcc dot gnu.org ---
There are not enough regs for the asm, as the frame pointer cannot be used
after the x86 tuning changes.  The root of the problem is the absence of some
functionality in LRA, which prevents frame pointer elimination in the presence
of explicit sp changes.  It also hurts performance for the new tuning (there
are PRs about this too).  I've started to work on this problem.  I hope it
will be solved in 2 weeks.


[Bug rtl-optimization/59086] [4.9 Regression] error: ‘asm’ operand has impossible constraints

2013-11-14 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59086

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-11-14
          Component|regression                  |rtl-optimization
   Target Milestone|---                         |4.9.0
            Summary|error: ‘asm’ operand has    |[4.9 Regression] error:
                   |impossible constraints      |‘asm’ operand has
                   |                            |impossible constraints
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener rguenth at gcc dot gnu.org ---
Confirmed.