> > This issue unfortunately was not solved correctly.  In that example we
> > don’t have -fmodulo-sched-allow-regmoves enabled and we should not
> > create any register moves at all.
>
> Yes, but if we do for whatever reason, we should never create register
> moves of hard registers.  Because that is incorrect (there may not be
> insns to do it).  It's a separate issue.
>
> You're extending Jakub's patch here, not replacing it, so that's fine.

Certainly, the patch contains assertions to catch both situations:
when we try to create reg_move when it wasn’t allowed at all, and when
we try to create reg_move for hard register. Preventing both of them
in theory can be easily achieved when SMS algorithm got absolutely
correct DDG as an input data.

> > In pr84524.c we got a loop with an extended inline asm:
> > asm volatile ("" : "+r" (v))
> > which also gives us a “surprising” situation Alexander predicts.
> >
> > For sched-deps scanner such volatile asm is a “true barrier” and we
> > create dependencies to almost all other instructions, while DF scanners
> > don’t give us such information.
>
> There is no such thing as a "true barrier" in extended asm.  The only thing
> volatile asm means is that the asm has a side effect unknown to the compiler;
> this can *not* be a modification of a register or of memory contents, such
> things are known by the compiler, that's what clobbers and "memory" clobber
> are about.

In sched-deps.c we got:
    case ASM_OPERANDS:
    case ASM_INPUT:
      {
       /* Traditional and volatile asm instructions must be considered to use
          and clobber all hard registers, all pseudo-registers and all of
          memory.  So must TRAP_IF and UNSPEC_VOLATILE operations.

          Consider for instance a volatile asm that changes the fpu rounding
          mode.  An insn should not be moved across this even if it only uses
          pseudo-regs because it might give an incorrectly rounded result.  */
       if ((code != ASM_OPERANDS || MEM_VOLATILE_P (x))
           && !DEBUG_INSN_P (insn))
         reg_pending_barrier = TRUE_BARRIER;
       ...

I understand that mentioned “changing fpu rounding mode” example may
not be relevant in current compiler.  But we still got TRUE_BARRIER
for all volatile asms.

> > Maybe it is a good idea somehow re-implement modulo scheduler using only
> > one scanner instead of two, but at the moment the best thing to do is to
> > detect the situation earlier and skip such loops.
>
> Or fix the broken code...

I’m not sure what you mean here, but the only alternative way to build
correct DDG is to change sched-deps.c parts to make analysis
consistent with DF, but that change will affect all schedulers.  And
we’ll have 2 rather big problems with such change:
1) Testing correctness and quality of generated code in all imaginable
situations in all schedulers.
2) Potential compilation slowdown.  Sched-deps analysis algorithm
implements heuristic ideas, while DF pretends to be more accurate but
slower analysis.  Trying to keep them in sync probably could increase
sched-deps analysis time.

A year ago there were relevant discussion about some unnecessary
inline asm dependencies in DF:
https://gcc.gnu.org/ml/gcc-patches/2018-04/msg01056.html

Roman

Reply via email to