[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3,gpu-compute: Set gpuDynInst exec_mask before use

Kyle Roarty (Gerrit) via gem5-dev Tue, 11 May 2021 10:43:54 -0700

Kyle Roarty has uploaded this change for review. (https://gem5-review.googlesource.com/c/public/gem5/+/45346 )


Change subject: arch-gcn3,gpu-compute: Set gpuDynInst exec_mask before use
......................................................................

arch-gcn3,gpu-compute: Set gpuDynInst exec_mask before use

vector_register_file uses the exec_mask of a memory instruction in
order to determine if it should mark a register as in-use or not.
Previously, the exec_mask of a memory instruction was only set on
execution of that instruction, which occurs after the code in
vector_register_file. This led to the code reading potentially garbage
data, leading to a scenario where a register would be marked used when
it shouldn't be.

This fix sets the exec_mask of memory instructions in schedule_stage,
which works because the only time the wavefront execMask() is updated is
when an instruction executes, and we know the previous instruction will
have executed by the time schedule_stage executes, due to the order in
which the pipeline stages are executed.
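
As a rough illustration of the ordering argument above, here is a toy
model rather than gem5 code. ToyWavefront, ToyDynInst, scheduleStage and
vrfMarksRegisterInUse are hypothetical stand-ins, and the "all lanes
set" default only simulates the stale value described earlier. It
sketches why copying the mask in the schedule stage lets a later
register-file check see the right lanes:

```cpp
#include <bitset>
#include <cassert>

// Toy stand-ins for gem5's Wavefront and GPUDynInst. Names and layout
// are simplified assumptions, not the real class definitions.
using ExecMask = std::bitset<64>;

struct ToyWavefront
{
    ExecMask mask;  // default-constructed: no lanes active
    const ExecMask &execMask() const { return mask; }
};

struct ToyDynInst
{
    const ToyWavefront *wf = nullptr;
    bool memRef = false;
    // Deliberately starts as "all lanes active" to stand in for the
    // stale/garbage value read before the mask has been set.
    ExecMask exec_mask = ExecMask().set();

    bool isMemRef() const { return memRef; }
    const ToyWavefront *wavefront() const { return wf; }
};

// Schedule stage: copy the wavefront's mask into a memory instruction
// before any later stage inspects it.
void
scheduleStage(ToyDynInst &inst)
{
    assert(inst.wavefront() != nullptr);
    if (inst.isMemRef()) {
        inst.exec_mask = inst.wavefront()->execMask();
    }
}

// Hypothetical register-file check: a destination register is only
// marked in-use when at least one lane is active.
bool
vrfMarksRegisterInUse(const ToyDynInst &inst)
{
    return inst.exec_mask.any();
}
```

In this sketch, a memory instruction whose wavefront has no active lanes
is reported as in-use if the check runs before scheduleStage, and as not
in-use once scheduleStage has run.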

This also undoes part of a patch from last year (62ec973) which treated
the symptom of accidental register allocation, without preventing the
registers from being allocated in the first place.

This patch also removes the now-redundant code that sets the exec_mask
in instructions.cc for memory instructions.

Change-Id: Idabd35020000764fb06133ac2458606c1aaf6f04
---
M src/arch/amdgpu/gcn3/insts/instructions.cc
M src/gpu-compute/schedule_stage.cc
2 files changed, 29 insertions(+), 155 deletions(-)

diff --git a/src/arch/amdgpu/gcn3/insts/instructions.cc b/src/arch/amdgpu/gcn3/insts/instructions.cc
index 4ae4c29..a5f28e3 100644
--- a/src/arch/amdgpu/gcn3/insts/instructions.cc
+++ b/src/arch/amdgpu/gcn3/insts/instructions.cc
@@ -31240,7 +31240,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -31301,7 +31300,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -31365,7 +31363,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -31545,7 +31542,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -31605,7 +31601,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -32070,7 +32065,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -32132,7 +32126,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -32197,7 +32190,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -32281,7 +32273,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -32362,7 +32353,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -32544,7 +32534,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()
                                 ->cyclesToTicks(Cycles(24)));
@@ -32616,7 +32605,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()
                                 ->cyclesToTicks(Cycles(24)));
@@ -32921,7 +32909,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -32982,7 +32969,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -33518,7 +33504,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -33580,7 +33565,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -33645,7 +33629,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(
                 gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
@@ -35046,7 +35029,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -35177,7 +35159,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -35308,7 +35289,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -35408,7 +35388,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -35513,7 +35492,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -35623,7 +35601,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -35738,7 +35715,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -35833,7 +35809,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -35928,7 +35903,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -36023,7 +35997,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -36122,7 +36095,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -36225,7 +36197,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -36348,7 +36319,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -36406,7 +36376,6 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();
         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -39422,19 +39391,15 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->rdGmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            gpuDynInst->exec_mask = wf->execMask();
-            wf->computeUnit->vrf[wf->simdId]->
-                scheduleWriteOperandsFromLoad(wf, gpuDynInst);
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = gpuDynInst->wavefront()->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -39495,19 +39460,15 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->rdGmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            gpuDynInst->exec_mask = wf->execMask();
-            wf->computeUnit->vrf[wf->simdId]->
-                scheduleWriteOperandsFromLoad(wf, gpuDynInst);
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = gpuDynInst->wavefront()->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -39567,19 +39528,15 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->rdGmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            gpuDynInst->exec_mask = wf->execMask();
-            wf->computeUnit->vrf[wf->simdId]->
-                scheduleWriteOperandsFromLoad(wf, gpuDynInst);
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = gpuDynInst->wavefront()->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -39668,19 +39625,15 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->rdGmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            gpuDynInst->exec_mask = wf->execMask();
-            wf->computeUnit->vrf[wf->simdId]->
-                scheduleWriteOperandsFromLoad(wf, gpuDynInst);
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = gpuDynInst->wavefront()->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -39741,19 +39694,15 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->rdGmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            gpuDynInst->exec_mask = wf->execMask();
-            wf->computeUnit->vrf[wf->simdId]->
-                scheduleWriteOperandsFromLoad(wf, gpuDynInst);
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = gpuDynInst->wavefront()->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -39814,19 +39763,15 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->rdGmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            gpuDynInst->exec_mask = wf->execMask();
-            wf->computeUnit->vrf[wf->simdId]->
-                scheduleWriteOperandsFromLoad(wf, gpuDynInst);
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -39896,19 +39841,15 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->rdGmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            gpuDynInst->exec_mask = wf->execMask();
-            wf->computeUnit->vrf[wf->simdId]->
-                scheduleWriteOperandsFromLoad(wf, gpuDynInst);
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -39981,7 +39922,7 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
@@ -39990,7 +39931,6 @@
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -40050,7 +39990,7 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
@@ -40059,7 +39999,6 @@
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -40119,7 +40058,7 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
@@ -40128,7 +40067,6 @@
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -40189,7 +40127,7 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
@@ -40198,7 +40136,6 @@
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -40259,7 +40196,7 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
@@ -40268,7 +40205,6 @@
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -40337,7 +40273,7 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
@@ -40346,7 +40282,6 @@
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -40425,23 +40360,17 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
             wf->rdGmReqsInPipe--;
             wf->wrLmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            if (instData.GLC) {
-                gpuDynInst->exec_mask = wf->execMask();
-                wf->computeUnit->vrf[wf->simdId]->
-                    scheduleWriteOperandsFromLoad(wf, gpuDynInst);
-            }
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -40534,23 +40463,17 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
             wf->rdGmReqsInPipe--;
             wf->wrLmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            if (instData.GLC) {
-                gpuDynInst->exec_mask = wf->execMask();
-                wf->computeUnit->vrf[wf->simdId]->
-                    scheduleWriteOperandsFromLoad(wf, gpuDynInst);
-            }
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -40644,23 +40567,17 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
             wf->rdGmReqsInPipe--;
             wf->wrLmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            if (instData.GLC) {
-                gpuDynInst->exec_mask = wf->execMask();
-                wf->computeUnit->vrf[wf->simdId]->
-                    scheduleWriteOperandsFromLoad(wf, gpuDynInst);
-            }
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -40741,23 +40658,17 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
             wf->rdGmReqsInPipe--;
             wf->wrLmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            if (instData.GLC) {
-                gpuDynInst->exec_mask = wf->execMask();
-                wf->computeUnit->vrf[wf->simdId]->
-                    scheduleWriteOperandsFromLoad(wf, gpuDynInst);
-            }
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -41012,23 +40923,17 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
             wf->rdGmReqsInPipe--;
             wf->wrLmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            if (instData.GLC) {
-                gpuDynInst->exec_mask = wf->execMask();
-                wf->computeUnit->vrf[wf->simdId]->
-                    scheduleWriteOperandsFromLoad(wf, gpuDynInst);
-            }
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -41109,23 +41014,17 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
             wf->rdGmReqsInPipe--;
             wf->wrLmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            if (instData.GLC) {
-                gpuDynInst->exec_mask = wf->execMask();
-                wf->computeUnit->vrf[wf->simdId]->
-                    scheduleWriteOperandsFromLoad(wf, gpuDynInst);
-            }
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -41235,23 +41134,17 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
             wf->rdGmReqsInPipe--;
             wf->wrLmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            if (instData.GLC) {
-                gpuDynInst->exec_mask = wf->execMask();
-                wf->computeUnit->vrf[wf->simdId]->
-                    scheduleWriteOperandsFromLoad(wf, gpuDynInst);
-            }
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -41346,23 +41239,17 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
             wf->rdGmReqsInPipe--;
             wf->wrLmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            if (instData.GLC) {
-                gpuDynInst->exec_mask = wf->execMask();
-                wf->computeUnit->vrf[wf->simdId]->
-                    scheduleWriteOperandsFromLoad(wf, gpuDynInst);
-            }
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -41445,23 +41332,17 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
             wf->rdGmReqsInPipe--;
             wf->wrLmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            if (instData.GLC) {
-                gpuDynInst->exec_mask = wf->execMask();
-                wf->computeUnit->vrf[wf->simdId]->
-                    scheduleWriteOperandsFromLoad(wf, gpuDynInst);
-            }
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -41726,23 +41607,17 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
             wf->rdGmReqsInPipe--;
             wf->wrLmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            if (instData.GLC) {
-                gpuDynInst->exec_mask = wf->execMask();
-                wf->computeUnit->vrf[wf->simdId]->
-                    scheduleWriteOperandsFromLoad(wf, gpuDynInst);
-            }
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

@@ -41826,23 +41701,17 @@
     {
         Wavefront *wf = gpuDynInst->wavefront();

-        if (wf->execMask().none()) {
+        if (gpuDynInst->exec_mask.none()) {
             wf->decVMemInstsIssued();
             wf->decLGKMInstsIssued();
             wf->wrGmReqsInPipe--;
             wf->rdGmReqsInPipe--;
             wf->wrLmReqsInPipe--;
             wf->rdLmReqsInPipe--;
-            if (instData.GLC) {
-                gpuDynInst->exec_mask = wf->execMask();
-                wf->computeUnit->vrf[wf->simdId]->
-                    scheduleWriteOperandsFromLoad(wf, gpuDynInst);
-            }
             return;
         }

         gpuDynInst->execUnitId = wf->execUnitId;
-        gpuDynInst->exec_mask = wf->execMask();
         gpuDynInst->latency.init(gpuDynInst->computeUnit());
         gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

diff --git a/src/gpu-compute/schedule_stage.cc b/src/gpu-compute/schedule_stage.cc
index ace6d0c..bee8703 100644
--- a/src/gpu-compute/schedule_stage.cc
+++ b/src/gpu-compute/schedule_stage.cc
@@ -581,6 +581,11 @@
                         computeUnit.globalMemoryPipe.acqCoalescerToken(mp);
                     }

+                    // Set instruction's exec_mask if it's a mem operation
+                    if (mp->isMemRef()) {
+                        mp->exec_mask = mp->wavefront()->execMask();
+                    }
+
                     doDispatchListTransition(j, EXREADY, schIter->first);

                     DPRINTF(GPUSched, "dispatchList[%d]:fillDispatchList: "
                             "EMPTY->EXREADY\n", j);

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/45346

To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Idabd35020000764fb06133ac2458606c1aaf6f04
Gerrit-Change-Number: 45346
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty <kyleroarty1...@gmail.com>
Gerrit-MessageType: newchange

_______________________________________________
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org