================
@@ -1248,18 +1249,62 @@ void InlineSpiller::spillAroundUses(Register Reg) {
 
     // Create a new virtual register for spill/fill.
     // FIXME: Infer regclass from instruction alone.
-    Register NewVReg = Edit->createFrom(Reg);
+
+    unsigned SubReg = 0;
+    LaneBitmask CoveringLanes = LaneBitmask::getNone();
+    // If the subreg liveness is enabled, identify the subreg use(s) to try
+    // subreg reload. Skip if the instruction also defines the register.
+    // For copy bundles, get the covering lane masks.
+    if (MRI.subRegLivenessEnabled() && !RI.Writes) {
+      for (auto [MI, OpIdx] : Ops) {
+        const MachineOperand &MO = MI->getOperand(OpIdx);
+        assert(MO.isReg() && MO.getReg() == Reg);
+        if (MO.isUse()) {
+          SubReg = MO.getSubReg();
+          if (SubReg)
+            CoveringLanes |= TRI.getSubRegIndexLaneMask(SubReg);
+        }
+      }
+    }
+
+    if (MI.isBundled() && CoveringLanes.any()) {
+      CoveringLanes = LaneBitmask(bit_ceil(CoveringLanes.getAsInteger()) - 1);
----------------
cdevadas wrote:

Admittedly, some of the lane-mask manipulation in this patch is somewhat hacky.
This code is needed to correctly handle copy bundles where the individual
copies may target non-contiguous subregisters of a tuple. For instance,
consider a bundle containing two copies: one covering sub0_sub1 and another
covering sub3 of a 256-bit tuple. With the bit_ceil-based compaction, the
resulting lane mask becomes the contiguous sub0_sub1_sub2_sub3 by filling in
sub2, which is a valid subreg index for this tuple, and the
`getSubRegIdxFromLaneMask` helper I added returns the correct SubRegIdx.
Originally, I tried to use `getCoveringSubRegIndexes`. However, I found it
isn’t fully reliable for my use case. When the covering mask already
represents a contiguous lane range (say sub0_sub1_sub2), the function fails to
produce the correct index. The root cause is the check at this line:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/TargetRegisterInfo.cpp#L558
Given that behavior, the compaction + lane-mask-to-subregIdx approach was an
option that consistently returned the correct index for these irregular bundles.
I also understand how the lane mask is interpreted here. The representation
is subtle, and I should find a better alternative to correctly gather the
required info.
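
To make the compaction step concrete, here is a minimal standalone sketch of
the `bit_ceil(mask) - 1` arithmetic. It is not the patch code: it assumes a toy
encoding with one lane bit per 32-bit subregister (real targets may use a
different lane layout), and `laneOf`, `bitCeil` and `compactLanes` are
hypothetical names, with `bitCeil` standing in for the `bit_ceil` call in the
diff.

```cpp
#include <cassert>
#include <cstdint>

// Toy lane encoding (assumption for illustration only): one bit per 32-bit
// subregister, so sub0 is bit 0, sub1 is bit 1, and so on.
constexpr std::uint64_t laneOf(unsigned SubIdx) {
  return std::uint64_t{1} << SubIdx;
}

// Hypothetical stand-in for bit_ceil: the smallest power of two >= V.
// Returns 1 for V == 0; only meaningful for V <= 2^63.
constexpr std::uint64_t bitCeil(std::uint64_t V) {
  std::uint64_t P = 1;
  while (P < V)
    P <<= 1;
  return P;
}

// The compaction from the diff: round up to a power of two, then subtract one
// to get a contiguous run of low bits.
constexpr std::uint64_t compactLanes(std::uint64_t Mask) {
  return bitCeil(Mask) - 1;
}

// sub0_sub1 plus sub3 (0b1011): the gap at sub2 is filled in.
static_assert(compactLanes(laneOf(0) | laneOf(1) | laneOf(3)) == 0b1111,
              "gap at sub2 is filled");
// An already contiguous sub0_sub1_sub2 mask (0b0111) is left unchanged.
static_assert(compactLanes(laneOf(0) | laneOf(1) | laneOf(2)) == 0b0111,
              "contiguous mask is unchanged");
```

Under this toy encoding, a mask that is itself a single power of two (for
example sub3 alone, 0b1000) compacts to 0b0111 and loses its top lane. Whether
that case can occur in practice depends on the target's actual lane encoding,
and it is part of why the representation is subtle.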

https://github.com/llvm/llvm-project/pull/175002
_______________________________________________
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
