================
@@ -1248,18 +1249,62 @@ void InlineSpiller::spillAroundUses(Register Reg) {
// Create a new virtual register for spill/fill.
// FIXME: Infer regclass from instruction alone.
- Register NewVReg = Edit->createFrom(Reg);
+
+ unsigned SubReg = 0;
+ LaneBitmask CoveringLanes = LaneBitmask::getNone();
+ // If the subreg liveness is enabled, identify the subreg use(s) to try
+ // subreg reload. Skip if the instruction also defines the register.
+ // For copy bundles, get the covering lane masks.
+ if (MRI.subRegLivenessEnabled() && !RI.Writes) {
+ for (auto [MI, OpIdx] : Ops) {
+ const MachineOperand &MO = MI->getOperand(OpIdx);
+ assert(MO.isReg() && MO.getReg() == Reg);
+ if (MO.isUse()) {
+ SubReg = MO.getSubReg();
+ if (SubReg)
+ CoveringLanes |= TRI.getSubRegIndexLaneMask(SubReg);
+ }
+ }
+ }
+
+ if (MI.isBundled() && CoveringLanes.any()) {
+ CoveringLanes = LaneBitmask(bit_ceil(CoveringLanes.getAsInteger()) - 1);
----------------
cdevadas wrote:
Admittedly, some of the lane-mask manipulation in this patch is somewhat hacky.
This code is needed to correctly handle copy bundles where the individual
copies may target non-contiguous subregisters of a tuple. For instance,
consider a bundle containing two copies: one covering sub0_sub1 and another
covering sub3 of a 256-bit tuple. With the bit_ceil-based compaction, the
resulting lane mask becomes the contiguous sub0_sub1_sub2_sub3 by filling in
sub2, which is a valid subreg index for this tuple, and the
`getSubRegIdxFromLaneMask` helper I added returns the correct SubRegIdx.
Originally, I tried to use `getCoveringSubRegIndexes`. However, I found it
isn’t fully reliable for my use case: when the covering mask already
represents a contiguous lane range (say sub0_sub1_sub2), the function fails to
produce the correct index. The root cause is the check at this line:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/TargetRegisterInfo.cpp#L558
Given that behavior, the compaction + lane-mask-to-SubRegIdx approach was an
option that consistently returned the correct index for these irregular bundles.
I also understand how the lane mask is interpreted here. The representation
is subtle, and I should find a better alternative to correctly gather the
required info.
https://github.com/llvm/llvm-project/pull/175002
_______________________________________________
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits