[gem5-dev] Change in gem5/gem5[develop]: arch-vega: Free dest registers in non-memory Load DS insts

2022-01-17 Thread Matthew Poremba (Gerrit) via gem5-dev
Matthew Poremba has submitted this change. (  
https://gem5-review.googlesource.com/c/public/gem5/+/55463 )


Change subject: arch-vega: Free dest registers in non-memory Load DS insts
..

arch-vega: Free dest registers in non-memory Load DS insts

Ported from https://gem5-review.googlesource.com/c/public/gem5/+/48019:

Certain DS insts are classified as Loads, but don't actually go through
the memory pipeline. However, any instruction classified as a load
marks its destination registers as free in the memory pipeline.

Because these instructions didn't use the memory pipeline, they
never freed their destination registers, which led to a deadlock.

This patch explicitly calls the function used to free the destination
registers in the execute() method of those Load instructions that
don't use the memory pipeline.

Change-Id: I8231217a79661ca6acc837b2ab4931b946049a1a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55463
Reviewed-by: Matt Sinclair 
Maintainer: Matt Sinclair 
Tested-by: kokoro 
---
M src/arch/amdgpu/vega/insts/instructions.cc
1 file changed, 53 insertions(+), 0 deletions(-)

Approvals:
  Matt Sinclair: Looks good to me, approved; Looks good to me, approved
  kokoro: Regressions pass




diff --git a/src/arch/amdgpu/vega/insts/instructions.cc b/src/arch/amdgpu/vega/insts/instructions.cc

index 7d55624..bd7ef70 100644
--- a/src/arch/amdgpu/vega/insts/instructions.cc
+++ b/src/arch/amdgpu/vega/insts/instructions.cc
@@ -35467,6 +35467,15 @@
         }

         vdst.write();
+
+        /**
+         * This is needed because we treat this instruction as a load
+         * but it's not an actual memory request.
+         * Without this, the destination register never gets marked as
+         * free, leading to a possible deadlock.
+         */
+        wf->computeUnit->vrf[wf->simdId]->
+            scheduleWriteOperandsFromLoad(wf, gpuDynInst);
     } // execute
 // --- Inst_DS__DS_PERMUTE_B32 class methods ---

@@ -35541,6 +35550,15 @@
         }

         vdst.write();
+
+        /**
+         * This is needed because we treat this instruction as a load
+         * but it's not an actual memory request.
+         * Without this, the destination register never gets marked as
+         * free, leading to a possible deadlock.
+         */
+        wf->computeUnit->vrf[wf->simdId]->
+            scheduleWriteOperandsFromLoad(wf, gpuDynInst);
     } // execute
 // --- Inst_DS__DS_BPERMUTE_B32 class methods ---

@@ -35615,6 +35633,15 @@
         }

         vdst.write();
+
+        /**
+         * This is needed because we treat this instruction as a load
+         * but it's not an actual memory request.
+         * Without this, the destination register never gets marked as
+         * free, leading to a possible deadlock.
+         */
+        wf->computeUnit->vrf[wf->simdId]->
+            scheduleWriteOperandsFromLoad(wf, gpuDynInst);
     } // execute

 // --- Inst_DS__DS_ADD_U64 class methods ---

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/55463
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I8231217a79661ca6acc837b2ab4931b946049a1a
Gerrit-Change-Number: 55463
Gerrit-PatchSet: 2
Gerrit-Owner: Matthew Poremba 
Gerrit-Reviewer: Kyle Roarty 
Gerrit-Reviewer: Matt Sinclair 
Gerrit-Reviewer: Matthew Poremba 
Gerrit-Reviewer: kokoro 
Gerrit-MessageType: merged
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org

[gem5-dev] Change in gem5/gem5[develop]: mem: implement x86 locked accesses in timing-mode classic cache

2022-01-17 Thread Austin Harris (Gerrit) via gem5-dev
Austin Harris has submitted this change. (  
https://gem5-review.googlesource.com/c/public/gem5/+/52303 )


Change subject: mem: implement x86 locked accesses in timing-mode classic cache

..

mem: implement x86 locked accesses in timing-mode classic cache

Add LockedRMW(Read|Write)(Req|Resp) commands.  In timing mode,
use a combination of clearing permission bits and leaving
an MSHR in place to prevent accesses & snoops from touching
a locked block between the read and write parts of a locked
RMW sequence.

Based on an old patch by Steve Reinhardt:
http://reviews.gem5.org/r/2691/index.html

Jira Issue: https://gem5.atlassian.net/browse/GEM5-1105

Change-Id: Ieadda4deb17667ca4a6282f87f6da2af3b011f66
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52303
Reviewed-by: Nikos Nikoleris 
Maintainer: Nikos Nikoleris 
Tested-by: kokoro 
---
M src/mem/cache/mshr.cc
M src/mem/cache/mshr.hh
M src/mem/cache/cache.cc
M src/mem/cache/base.cc
M src/mem/packet.cc
M src/mem/packet.hh
M src/mem/cache/queue_entry.hh
7 files changed, 276 insertions(+), 65 deletions(-)

Approvals:
  Nikos Nikoleris: Looks good to me, approved; Looks good to me, approved
  kokoro: Regressions pass




diff --git a/src/mem/cache/base.cc b/src/mem/cache/base.cc
index dc21151..435684a 100644
--- a/src/mem/cache/base.cc
+++ b/src/mem/cache/base.cc
@@ -223,6 +223,59 @@
 void
 BaseCache::handleTimingReqHit(PacketPtr pkt, CacheBlk *blk, Tick request_time)
 {
+
+    // handle special cases for LockedRMW transactions
+    if (pkt->isLockedRMW()) {
+        Addr blk_addr = pkt->getBlockAddr(blkSize);
+
+        if (pkt->isRead()) {
+            // Read hit for LockedRMW.  Since it requires exclusive
+            // permissions, there should be no outstanding access.
+            assert(!mshrQueue.findMatch(blk_addr, pkt->isSecure()));
+            // The keys to LockedRMW are that (1) we always have an MSHR
+            // allocated during the RMW interval to catch snoops and
+            // defer them until after the RMW completes, and (2) we
+            // clear permissions on the block to turn any upstream
+            // access other than the matching write into a miss, causing
+            // it to append to the MSHR as well.
+
+            // Because we hit in the cache, we have to fake an MSHR to
+            // achieve part (1).  If the read had missed, this MSHR
+            // would get allocated as part of normal miss processing.
+            // Basically we need to get the MSHR in the same state as if
+            // we had missed and just received the response.
+            // Request *req2 = new Request(*(pkt->req));
+            RequestPtr req2 = std::make_shared<Request>(*(pkt->req));
+            PacketPtr pkt2 = new Packet(req2, pkt->cmd);
+            MSHR *mshr = allocateMissBuffer(pkt2, curTick(), true);
+            // Mark the MSHR "in service" (even though it's not) to prevent
+            // the cache from sending out a request.
+            mshrQueue.markInService(mshr, false);
+            // Part (2): mark block inaccessible
+            assert(blk);
+            blk->clearCoherenceBits(CacheBlk::ReadableBit);
+            blk->clearCoherenceBits(CacheBlk::WritableBit);
+        } else {
+            assert(pkt->isWrite());
+            // All LockedRMW writes come here, as they cannot miss.
+            // Need to undo the two things described above.  Block
+            // permissions were already restored earlier in this
+            // function, prior to the access() call.  Now we just need
+            // to clear out the MSHR.
+
+            // Read should have already allocated MSHR.
+            MSHR *mshr = mshrQueue.findMatch(blk_addr, pkt->isSecure());
+            assert(mshr);
+            // Fake up a packet and "respond" to the still-pending
+            // LockedRMWRead, to process any pending targets and clear
+            // out the MSHR
+            PacketPtr resp_pkt =
+                new Packet(pkt->req, MemCmd::LockedRMWWriteResp);
+            resp_pkt->senderState = mshr;
+            recvTimingResp(resp_pkt);
+        }
+    }
+
 if (pkt->needsResponse()) {
 // These delays should have been consumed by now
 assert(pkt->headerDelay == 0);
@@ -353,6 +406,20 @@
 // the delay provided by the crossbar
 Tick forward_time = clockEdge(forwardLatency) + pkt->headerDelay;

+    if (pkt->cmd == MemCmd::LockedRMWWriteReq) {
+        // For LockedRMW accesses, we mark the block inaccessible after the
+        // read (see below), to make sure no one gets in before the write.
+        // Now that the write is here, mark it accessible again, so the
+        // write will succeed.  LockedRMWReadReq brings the block in in
+        // exclusive mode, so we know it was previously writable.
+        CacheBlk *blk = tags->findBlock(pkt->getAddr(), pkt->isSecure());
+        assert(blk && blk->isValid());
+