================
@@ -1248,18 +1249,62 @@ void InlineSpiller::spillAroundUses(Register Reg) {
 
     // Create a new virtual register for spill/fill.
     // FIXME: Infer regclass from instruction alone.
-    Register NewVReg = Edit->createFrom(Reg);
+
+    unsigned SubReg = 0;
+    LaneBitmask CoveringLanes = LaneBitmask::getNone();
+    // If the subreg liveness is enabled, identify the subreg use(s) to try
+    // subreg reload. Skip if the instruction also defines the register.
+    // For copy bundles, get the covering lane masks.
+    if (MRI.subRegLivenessEnabled() && !RI.Writes) {
+      for (auto [MI, OpIdx] : Ops) {
+        const MachineOperand &MO = MI->getOperand(OpIdx);
+        assert(MO.isReg() && MO.getReg() == Reg);
+        if (MO.isUse()) {
+          SubReg = MO.getSubReg();
+          if (SubReg)
+            CoveringLanes |= TRI.getSubRegIndexLaneMask(SubReg);
+        }
+      }
+    }
+
+    if (MI.isBundled() && CoveringLanes.any()) {
+      CoveringLanes = LaneBitmask(bit_ceil(CoveringLanes.getAsInteger()) - 1);
+      // Obtain the covering subregister index, including any missing indices
+      // within the identified small range. Although this may be suboptimal due
+      // to gaps in the subregisters that are not part of the copy bundle, it is
+      // beneficial when components outside this range of the original tuple can
+      // be completely skipped from the reload.
+      SubReg = TRI.getSubRegIdxFromLaneMask(CoveringLanes);
+    }
+
+    // If the target doesn't support subreg reload, fallback to restoring the
+    // full tuple.
+    if (SubReg && !TRI.shouldEnableSubRegReload(SubReg))
----------------
cdevadas wrote:

Yes, targets can choose to ignore the subreg field passed to their 
loadRegFromStackSlot. However, whether the target actually implements subreg 
reload must be reported back to this call site: as part of this optimization we 
remove the subreg field from the use instruction, but only when the target 
truly implements the partial reload by constructing a concrete register class 
for the subreg access. The transition looks like this:

%tuple:VReg_128 = ...      ; 128-bit tuple (4x32-bit)
SPILL_V128 %tuple to stack
...
; Later, only need sub1 (second 32-bit component)
; Current implementation - restore full.
%reload:VReg_128 = RESTORE_V128, ofst:0
%val = USE **%reload.sub1**

; With subreg reload implemented.
%reload:VGPR_32 = RESTORE_V32, ofst:4
%val = USE **%reload** ; drop the subreg.

The subreg fields are dropped in the InlineSpiller at 
https://github.com/llvm/llvm-project/pull/175002/files#diff-855df7e3f96ef7f3f499fdafba308dde780d710f717f19158a2f39059c8a6f5dR1305.
Always passing the subreg info to `loadRegFromStackSlot` and letting the 
targets decide whether to implement it would require changes to how 
InlineSpiller and this target hook interact.
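
To make the contract concrete, here is a minimal standalone sketch of the decision described above. It is a toy model, not LLVM code: `ReloadPlan`, `planReload`, the `Sub0`..`Sub3` constants, and the fixed 4x32-bit tuple layout are all hypothetical names and assumptions chosen for illustration. The point it shows is that the use operand may only lose its subreg index when the target reports that it really performs the partial reload.

```cpp
#include <cassert>

// Toy model only; none of these names exist in LLVM.
struct ReloadPlan {
  unsigned SizeInBytes;   // bytes read back from the spill slot
  unsigned OffsetInBytes; // byte offset into the spill slot
  unsigned UseSubReg;     // subreg index kept on the use operand (0 = none)
};

// Hypothetical subreg indices for a 4 x 32-bit tuple; 0 means "no subreg".
constexpr unsigned NoSubReg = 0;
constexpr unsigned Sub0 = 1, Sub1 = 2, Sub2 = 3, Sub3 = 4;

// Decide how to reload a spilled 128-bit tuple for a use of SubReg.
// TargetSupportsSubRegReload stands in for the answer the target must
// report back to the spiller.
ReloadPlan planReload(unsigned SubReg, bool TargetSupportsSubRegReload) {
  constexpr unsigned TupleBytes = 16, LaneBytes = 4;
  assert(SubReg <= Sub3 && "unknown subreg index in this toy model");

  if (SubReg != NoSubReg && TargetSupportsSubRegReload) {
    // Partial reload: the new vreg gets the narrow class, so the use
    // operand must drop its subreg index.
    return {LaneBytes, (SubReg - Sub0) * LaneBytes, NoSubReg};
  }
  // Full reload: the new vreg keeps the tuple class, so the use operand
  // must keep its subreg index.
  return {TupleBytes, 0, SubReg};
}
```

With these assumptions, `planReload(Sub1, true)` corresponds to the `RESTORE_V32, ofst:4` case above, and `planReload(Sub1, false)` to the `RESTORE_V128, ofst:0` case where `%reload.sub1` is kept. If the spiller dropped the subreg without knowing the target's answer, the second case would leave a use reading the whole tuple.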


https://github.com/llvm/llvm-project/pull/175002
_______________________________________________
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
