Hi all,

Live range splitting for hard regs which are assigned to some pseudo is
already implemented.  However, live ranges of hard regs which are live
due to hard register assignments are not split so far which may result
in unsatisfiable allocations.  For example, assume that
FIRST_PSEUDO_REGISTER == 100

1: r5=...
2: r101=exp(r100)
...
9: // some user of r5

Furthermore, assume that in insn 2 the input operand is constrained to
hard register r5 (e.g. via a hard register constraint, a single register
constraint as e.g. {a,b,c,d,S,D} from x86_64, or even via a new
dependent register filter assuming there was another input operand
resulting in r100 being constrained to r5).

In those cases we need to temporarily free hard register r5 for insn 2.
I have been playing with an implementation where we end up with:

1: r5=...
3: r102=r5             // free r5
4: r103=r100           // assign r5 to input reload r103
2: r104=exp(r103)
6: r101=r104
5: r5=r102             // restore r5
...
9: // some user of r5

The implementation is naive in the sense that it is not checked for an
optimal restore point for r5, i.e., insns 3 and 5 are always emitted;
independent of subsequent insns and whether they run into the same
problem w.r.t. r5 which could in the worst case lead to many unnecessary
"in-between" reloads.

However, before diving into those missed optimizations, I would rather
like to fix the place where such reloads should be introduced.  At the
moment I decided to check during curr_insn_transform() whether the
currently selected alternative makes use of a hard register constraint
and whether this particular hard register is live.  If so, then create a
new reload and emit moves before/after.  The obvious problem with this
is, there is no liveness information at this point in time.  We could
compute liveness just prior constraint handling (that's what I'm doing
in my hack), however, from the liveness information we cannot
distinguish why a hard register is live, i.e., whether it is live
because we assigned it to a pseudo or whether it is live due to a hard
register assignment as e.g. given in insn 1 above.  In the former case I
think it would be better to rely on the current split mechanism for
pseudos.

Of course, we might unnecessarily introduce save/restores, if
implemented via curr_insn_transfrom(), in case a single alternative
contains not only a hard register constraint but also another constraint
which is chosen in the end which in turn may result in r5 not being used
rendering the save/restore unnecessary.  This would be another missed
optimization.  Again, for the moment, I'm focusing on getting rid of
ICEs.  Speaking of ICEs my current implementation hack also does not
differentiate between spilling to registers vs. memory which may lead to
other ICEs in the end in case of high register pressure.  I haven't
looked into that so far.

Before making any substantial changes: any thoughts about this?  Maybe
there is a better place or mechanism to handle this at all?  I'm not
expecting that we often hit those cases for most targets.  Especially,
since on x86_64 with single register constraints I'm not aware of a real
world example which fails due to this problem (even though some of them
may have been obscured over the years by subsequent fixes in numerous
passes).  That being said, for a first implementation, I would rather
aim for a computational inexpensive solution instead of trying to emit
optimal code in each and every scenario which is why I started by simply
carrying hard registers around a single instructions.  Maybe some sort
of light-liveness which considers only hard register assignments, which
isn't influenced by pseudo allocation and is therefore to some degree
static, would be enough.

Cheers,
Stefan

Reply via email to