> > This issue unfortunately was not solved correctly. In that example we > > don’t have -fmodulo-sched-allow-regmoves enabled and we should not > > create any register moves at all. > > Yes, but if we do for whatever reason, we should never create register > moves of hard registers. Because that is incorrect (there may not be > insns to do it). It's a separate issue. > > You're extending Jakub's patch here, not replacing it, so that's fine.
Certainly, the patch contains assertions to catch both situations: when we try to create reg_move when it wasn’t allowed at all, and when we try to create reg_move for hard register. Preventing both of them in theory can be easily achieved when SMS algorithm got absolutely correct DDG as an input data. > > In pr84524.c we got a loop with an extended inline asm: > > asm volatile ("" : "+r" (v)) > > which also gives us a “surprising” situation Alexander predicts. > > > > For sched-deps scanner such volatile asm is a “true barrier” and we > > create dependencies to almost all other instructions, while DF scanners > > don’t give us such information. > > There is no such thing as a "true barrier" in extended asm. The only thing > volatile asm means is that the asm has a side effect unknown to the compiler; > this can *not* be a modification of a register or of memory contents, such > things are known by the compiler, that's what clobbers and "memory" clobber > are about. In sched-deps.c we got: case ASM_OPERANDS: case ASM_INPUT: { /* Traditional and volatile asm instructions must be considered to use and clobber all hard registers, all pseudo-registers and all of memory. So must TRAP_IF and UNSPEC_VOLATILE operations. Consider for instance a volatile asm that changes the fpu rounding mode. An insn should not be moved across this even if it only uses pseudo-regs because it might give an incorrectly rounded result. */ if ((code != ASM_OPERANDS || MEM_VOLATILE_P (x)) && !DEBUG_INSN_P (insn)) reg_pending_barrier = TRUE_BARRIER; ... I understand that mentioned “changing fpu rounding mode” example may not be relevant in current compiler. But we still got TRUE_BARRIER for all volatile asms. > > Maybe it is a good idea somehow re-implement modulo scheduler using only > > one scanner instead of two, but at the moment the best thing to do is to > > detect the situation earlier and skip such loops. > > Or fix the broken code... I’m not sure what you mean here, but the only alternative way to build correct DDG is to change sched-deps.c parts to make analysis consistent with DF, but that change will affect all schedulers. And we’ll have 2 rather big problems with such change: 1) Testing correctness and quality of generated code in all imaginable situations in all schedulers. 2) Potential compilation slowdown. Sched-deps analysis algorithm implements heuristic ideas, while DF pretends to be more accurate but slower analysis. Trying to keep them in sync probably could increase sched-deps analysis time. A year ago there were relevant discussion about some unnecessary inline asm dependencies in DF: https://gcc.gnu.org/ml/gcc-patches/2018-04/msg01056.html Roman