[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Fix some settings installed by the init interrupt.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57051 )



Change subject: arch-x86: Fix some settings installed by the init interrupt.
..

arch-x86: Fix some settings installed by the init interrupt.

VMX requires that the present bit of the LDT and the TR are set, and
that the unusable bit of the TR is zero.

Change-Id: I4c4feba38d4fef11ad3b804d41dacb69cc3e6bd5
---
M src/arch/x86/faults.cc
1 file changed, 15 insertions(+), 3 deletions(-)



diff --git a/src/arch/x86/faults.cc b/src/arch/x86/faults.cc
index 09b4d99..209f3e8 100644
--- a/src/arch/x86/faults.cc
+++ b/src/arch/x86/faults.cc
@@ -256,7 +256,7 @@

 SegAttr tslAttr = 0;
 tslAttr.unusable = 1;
-tslAttr.present = 0;
+tslAttr.present = 1;
 tslAttr.type = 2; // LDT
 tc->setMiscReg(MISCREG_TSL, 0);
 tc->setMiscReg(MISCREG_TSL_BASE, 0);
@@ -264,8 +264,8 @@
 tc->setMiscReg(MISCREG_TSL_ATTR, tslAttr);

 SegAttr trAttr = 0;
-trAttr.unusable = 1;
-trAttr.present = 0;
+trAttr.unusable = 0;
+trAttr.present = 1;
 trAttr.type = 3; // Busy 16-bit TSS
 tc->setMiscReg(MISCREG_TR, 0);
 tc->setMiscReg(MISCREG_TR_BASE, 0);

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/57051
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I4c4feba38d4fef11ad3b804d41dacb69cc3e6bd5
Gerrit-Change-Number: 57051
Gerrit-PatchSet: 1
Gerrit-Owner: Gabe Black 
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s

[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Propogate the unusable bit to KVM.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57050 )



Change subject: arch-x86: Propogate the unusable bit to KVM.
..

arch-x86: Propogate the unusable bit to KVM.

The unusable bit is now used by gem5. Pass that bit through to KVM
instead of a dummy value 0.

Change-Id: I59912b478a3de95684fb0cc65ff5703d201df8cb
---
M src/arch/x86/kvm/x86_cpu.cc
1 file changed, 13 insertions(+), 6 deletions(-)



diff --git a/src/arch/x86/kvm/x86_cpu.cc b/src/arch/x86/kvm/x86_cpu.cc
index af041a5..49274da 100644
--- a/src/arch/x86/kvm/x86_cpu.cc
+++ b/src/arch/x86/kvm/x86_cpu.cc
@@ -741,12 +741,7 @@
 kvm_seg.l = attr.longMode;
 kvm_seg.g = attr.granularity;
 kvm_seg.avl = attr.avl;
-
-// A segment is normally unusable when the selector is zero. There
-// is a attr.unusable flag in gem5, but it seems unused. qemu
-// seems to set this to 0 all the time, so we just do the same and
-// hope for the best.
-kvm_seg.unusable = 0;
+kvm_seg.unusable = attr.unusable;
 }

 static inline void

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/57050
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I59912b478a3de95684fb0cc65ff5703d201df8cb
Gerrit-Change-Number: 57050
Gerrit-PatchSet: 1
Gerrit-Owner: Gabe Black 
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s

[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Respect LDT and TR bases in long mode.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57049 )



Change subject: arch-x86: Respect LDT and TR bases in long mode.
..

arch-x86: Respect LDT and TR bases in long mode.

The LDT and TR bases *are* respected in 64 bit mode, so the base values
need to be set as specified.

Change-Id: Ieb1b58511d9651e6e59be199059b9d2b8c670472
---
M src/arch/x86/fs_workload.cc
1 file changed, 13 insertions(+), 3 deletions(-)



diff --git a/src/arch/x86/fs_workload.cc b/src/arch/x86/fs_workload.cc
index dd71222..90df698 100644
--- a/src/arch/x86/fs_workload.cc
+++ b/src/arch/x86/fs_workload.cc
@@ -66,9 +66,7 @@
SegDescriptor desc, bool longmode)
 {
 bool honorBase = !longmode || seg == SEGMENT_REG_FS ||
-  seg == SEGMENT_REG_GS ||
-  seg == SEGMENT_REG_TSL ||
-  seg == SYS_SEGMENT_REG_TR;
+  seg == SEGMENT_REG_GS;

 SegAttr attr = 0;


--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/57049
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Ieb1b58511d9651e6e59be199059b9d2b8c670472
Gerrit-Change-Number: 57049
Gerrit-PatchSet: 1
Gerrit-Owner: Gabe Black 
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s

[gem5-dev] Change in gem5/gem5[develop]: mem-cache: Remove unused stats

2022-02-22 Thread Hoa Nguyen (Gerrit) via gem5-dev
Hoa Nguyen has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57029 )



Change subject: mem-cache: Remove unused stats
..

mem-cache: Remove unused stats

hitLatency, demandHitLatency, and overallHitLatency are not used.

Change-Id: I55f4e3f357b8fba28ad878697260fddb38c9208e
Signed-off-by: Hoa Nguyen 
---
M src/mem/cache/base.cc
M src/mem/cache/base.hh
2 files changed, 12 insertions(+), 35 deletions(-)



diff --git a/src/mem/cache/base.cc b/src/mem/cache/base.cc
index e268ae7..58a0f6c 100644
--- a/src/mem/cache/base.cc
+++ b/src/mem/cache/base.cc
@@ -2028,8 +2028,6 @@
("number of " + name + " hits").c_str()),
   ADD_STAT(misses, statistics::units::Count::get(),
("number of " + name + " misses").c_str()),
-  ADD_STAT(hitLatency, statistics::units::Tick::get(),
-   ("number of " + name + " hit ticks").c_str()),
   ADD_STAT(missLatency, statistics::units::Tick::get(),
("number of " + name + " miss ticks").c_str()),
   ADD_STAT(accesses, statistics::units::Count::get(),
@@ -2086,15 +2084,6 @@
 misses.subname(i, system->getRequestorName(i));
 }

-// Hit latency statistics
-hitLatency
-.init(max_requestors)
-.flags(total | nozero | nonan)
-;
-for (int i = 0; i < max_requestors; i++) {
-hitLatency.subname(i, system->getRequestorName(i));
-}
-
 // Miss latency statistics
 missLatency
 .init(max_requestors)
@@ -2201,10 +2190,6 @@
  "number of demand (read+write) hits"),
 ADD_STAT(overallHits, statistics::units::Count::get(),
  "number of overall hits"),
-ADD_STAT(demandHitLatency, statistics::units::Tick::get(),
- "number of demand (read+write) hit ticks"),
-ADD_STAT(overallHitLatency, statistics::units::Tick::get(),
-"number of overall hit ticks"),
 ADD_STAT(demandMisses, statistics::units::Count::get(),
  "number of demand (read+write) misses"),
 ADD_STAT(overallMisses, statistics::units::Count::get(),
@@ -2339,17 +2324,6 @@
 overallMissLatency.subname(i, system->getRequestorName(i));
 }

-demandHitLatency.flags(total | nozero | nonan);
-demandHitLatency = SUM_DEMAND(hitLatency);
-for (int i = 0; i < max_requestors; i++) {
-demandHitLatency.subname(i, system->getRequestorName(i));
-}
-overallHitLatency.flags(total | nozero | nonan);
-overallHitLatency = demandHitLatency + SUM_NON_DEMAND(hitLatency);
-for (int i = 0; i < max_requestors; i++) {
-overallHitLatency.subname(i, system->getRequestorName(i));
-}
-
 demandAccesses.flags(total | nozero | nonan);
 demandAccesses = demandHits + demandMisses;
 for (int i = 0; i < max_requestors; i++) {
diff --git a/src/mem/cache/base.hh b/src/mem/cache/base.hh
index 6fc7628..51cd2d1 100644
--- a/src/mem/cache/base.hh
+++ b/src/mem/cache/base.hh
@@ -1006,11 +1006,6 @@
 @sa Packet::Command */
 statistics::Vector misses;
 /**
- * Total number of ticks per thread/command spent waiting for a  
hit.

- * Used to calculate the average hit latency.
- */
-statistics::Vector hitLatency;
-/**
  * Total number of ticks per thread/command spent waiting for a  
miss.

  * Used to calculate the average miss latency.
  */
@@ -1055,10 +1050,6 @@
 statistics::Formula demandHits;
 /** Number of hit for all accesses. */
 statistics::Formula overallHits;
-/** Total number of ticks spent waiting for demand hits. */
-statistics::Formula demandHitLatency;
-/** Total number of ticks spent waiting for all hits. */
-statistics::Formula overallHitLatency;

 /** Number of misses for demand accesses. */
 statistics::Formula demandMisses;

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/57029
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I55f4e3f357b8fba28ad878697260fddb38c9208e
Gerrit-Change-Number: 57029
Gerrit-PatchSet: 1
Gerrit-Owner: Hoa Nguyen 
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s

[gem5-dev] Change in gem5/gem5[develop]: scons: Handle TARGET_GPU_ISA not being set.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has submitted this change. (  
https://gem5-review.googlesource.com/c/public/gem5/+/56895 )


 (

4 is the latest approved patch-set.
No files were changed between the latest approved patch-set and the  
submitted one.

 )Change subject: scons: Handle TARGET_GPU_ISA not being set.
..

scons: Handle TARGET_GPU_ISA not being set.

If TARGET_GPU_ISA is not set, even if the GPU ISA namespace isn't used
by anything, the logic which figures out what to set it to will fail.
This checks for that condition and sets it to something invalid, but
doesn't crash. If that namespace is actually used, then the build will
still fail.

Change-Id: Iec44255cccbafa4aceaa68bdd8b6a835dc0637a0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56895
Reviewed-by: Matthew Poremba 
Maintainer: Gabe Black 
Tested-by: kokoro 
---
M src/SConscript
1 file changed, 23 insertions(+), 1 deletion(-)

Approvals:
  Matthew Poremba: Looks good to me, approved
  Gabe Black: Looks good to me, approved
  kokoro: Regressions pass




diff --git a/src/SConscript b/src/SConscript
index 6c8ccaf..d3c5a5c 100644
--- a/src/SConscript
+++ b/src/SConscript
@@ -563,7 +563,10 @@
 def makeTheGPUISA(source, target, env):
 gpu_isa = env['TARGET_GPU_ISA']

-namespace = gpu_isa[0].upper() + gpu_isa[1:].lower() + 'ISA'
+if gpu_isa:
+namespace = gpu_isa[0].upper() + gpu_isa[1:].lower() + 'ISA'
+else:
+namespace = 'None'

 code = code_formatter()
 code('''\

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/56895
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Iec44255cccbafa4aceaa68bdd8b6a835dc0637a0
Gerrit-Change-Number: 56895
Gerrit-PatchSet: 6
Gerrit-Owner: Gabe Black 
Gerrit-Reviewer: Gabe Black 
Gerrit-Reviewer: Matthew Poremba 
Gerrit-Reviewer: kokoro 
Gerrit-MessageType: merged
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s

[gem5-dev] Change in gem5/gem5[develop]: dev: Make VirtIORng device use gem5's rng instead of C++'s

2022-02-22 Thread Hoa Nguyen (Gerrit) via gem5-dev
Hoa Nguyen has submitted this change. (  
https://gem5-review.googlesource.com/c/public/gem5/+/56889 )


Change subject: dev: Make VirtIORng device use gem5's rng instead of C++'s
..

dev: Make VirtIORng device use gem5's rng instead of C++'s

Currently, VirtIORng uses C++'s RNG. This causes nondeterminism
across simulations using this device. One example is the example RISC-V
board booting Ubuntu,

configs/example/gem5_library/riscv-ubuntu-run.py

JIRA: https://gem5.atlassian.net/browse/GEM5-1193

Change-Id: I299e72eb891819007b4260390f5c2ba94d2dec7b
Signed-off-by: Hoa Nguyen 
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56889
Maintainer: Bobby Bruce 
Reviewed-by: Jason Lowe-Power 
Tested-by: kokoro 
---
M src/dev/virtio/VirtIORng.py
M src/dev/virtio/rng.cc
M src/dev/virtio/rng.hh
3 files changed, 29 insertions(+), 36 deletions(-)

Approvals:
  Jason Lowe-Power: Looks good to me, approved
  Bobby Bruce: Looks good to me, approved
  kokoro: Regressions pass




diff --git a/src/dev/virtio/VirtIORng.py b/src/dev/virtio/VirtIORng.py
index 54848ee..13df059 100644
--- a/src/dev/virtio/VirtIORng.py
+++ b/src/dev/virtio/VirtIORng.py
@@ -46,5 +46,3 @@
 cxx_class = 'gem5::VirtIORng'

 qSize = Param.Unsigned(16, "Request queue size")
-
-entropy_source = Param.String("/dev/random", "The source of entropy")
diff --git a/src/dev/virtio/rng.cc b/src/dev/virtio/rng.cc
index 50a747c..c26568e 100644
--- a/src/dev/virtio/rng.cc
+++ b/src/dev/virtio/rng.cc
@@ -38,9 +38,7 @@

 #include "dev/virtio/rng.hh"

-#include 
-#include 
-
+#include "base/random.hh"
 #include "debug/VIORng.hh"
 #include "params/VirtIORng.hh"
 #include "sim/system.hh"
@@ -50,8 +48,7 @@

 VirtIORng::VirtIORng(const Params )
 : VirtIODeviceBase(params, ID_RNG, 0, 0),
-  qReq(params.system->physProxy, byteOrder, params.qSize,
-   params.entropy_source, *this)
+  qReq(params.system->physProxy, byteOrder, params.qSize, *this)
 {
 registerQueue(qReq);
 }
@@ -60,16 +57,10 @@
 {
 }

-VirtIORng::RngQueue::RngQueue(PortProxy , ByteOrder bo,
-uint16_t size, const std::string _file_path,
+VirtIORng::RngQueue::RngQueue(PortProxy , ByteOrder bo, uint16_t  
size,

 VirtIORng &_parent)
-: VirtQueue(proxy, bo, size), parent(_parent), dist(0,255)
+: VirtQueue(proxy, bo, size), parent(_parent)
 {
-rng_fd = open(rng_file_path.c_str(), O_RDONLY);
-if (rng_fd < 0) {
-DPRINTF(VIORng, "error when open entropy source: %s\n",
-rng_file_path.c_str());
-}
 }

 void
@@ -89,16 +80,7 @@
 DPRINTF(VIORng, "Got descriptor (len: %i)\n", d->size());
 size_t len = 0;
 while (len < d->size()) {
-uint8_t byte = 0;
-bool rng_read_success = false;
-if (rng_fd >= 0) {
-ssize_t result = read(rng_fd, , sizeof(uint8_t));
-rng_read_success = (result > 0);
-}
-if (!rng_read_success) {
-// fallback to C++ std rng generator
-byte = dist(rd_device);
-}
+uint8_t byte = gem5::random_mt.random();
 d->chainWrite(len, , sizeof(uint8_t));
 ++len;
 }
diff --git a/src/dev/virtio/rng.hh b/src/dev/virtio/rng.hh
index 50a3723..7be2354 100644
--- a/src/dev/virtio/rng.hh
+++ b/src/dev/virtio/rng.hh
@@ -39,8 +39,6 @@
 #ifndef __DEV_VIRTIO_RNG_HH__
 #define __DEV_VIRTIO_RNG_HH__

-#include 
-
 #include "base/compiler.hh"
 #include "dev/virtio/base.hh"

@@ -76,9 +74,8 @@
 : public VirtQueue
 {
   public:
-RngQueue(PortProxy , ByteOrder bo,
-uint16_t size, const std::string _file_path,
-VirtIORng &_parent);
+RngQueue(PortProxy , ByteOrder bo, uint16_t size,
+ VirtIORng &_parent);
 virtual ~RngQueue() {}

 void onNotify() { trySend(); }
@@ -90,12 +87,6 @@

   protected:
 VirtIORng 
-  private:
-// system's special file for generating random number
-int rng_fd;
-// fallback random number generator
-std::random_device rd_device;
-std::uniform_int_distribution dist;
 };
 /** Receive queue for port 0 */
 RngQueue qReq;

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/56889
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I299e72eb891819007b4260390f5c2ba94d2dec7b
Gerrit-Change-Number: 56889
Gerrit-PatchSet: 4
Gerrit-Owner: Hoa Nguyen 
Gerrit-Reviewer: Bobby Bruce 
Gerrit-Reviewer: Gabe Black 
Gerrit-Reviewer: Hoa Nguyen 
Gerrit-Reviewer: Jason Lowe-Power 
Gerrit-Reviewer: kokoro 
Gerrit-CC: Luming Wang 
Gerrit-MessageType: merged
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an 

[gem5-dev] Build failed in Jenkins: nightly #138

2022-02-22 Thread jenkins-no-reply--- via gem5-dev
See 

Changes:


--
[...truncated 1.76 MB...]
 [ CXX] GCN3_X86/debug/VIOBlock.cc -> .o
 [ CXX] GCN3_X86/debug/VIO9P.cc -> .o
 [ CXX] GCN3_X86/debug/VIO9PData.cc -> .o
 [ CXX] GCN3_X86/python/m5/defines.py.cc -> .o
 [ CXX] GCN3_X86/python/m5/info.py.cc -> .o
 [   SHCXX] iostream3/zfstream.cc -> .os
 [   SHCXX] nomali/lib/gpu.cc -> .os
 [   SHCXX] nomali/lib/gpublock.cc -> .os
 [   SHCXX] nomali/lib/gpucontrol.cc -> .os
 [   SHCXX] nomali/lib/jobcontrol.cc -> .os
 [   SHCXX] nomali/lib/jobslot.cc -> .os
 [   SHCXX] nomali/lib/mali_midgard.cc -> .os
 [   SHCXX] nomali/lib/mali_t6xx.cc -> .os
 [   SHCXX] nomali/lib/mali_t7xx.cc -> .os
 [   SHCXX] nomali/lib/addrspace.cc -> .os
 [   SHCXX] nomali/lib/mmu.cc -> .os
 [   SHCXX] nomali/lib/nomali_api.cc -> .os
 [   SHCXX] drampower/src/CommandAnalysis.cc -> .os
 [   SHCXX] drampower/src/MemArchitectureSpec.cc -> .os
 [   SHCXX] drampower/src/MemCommand.cc -> .os
 [   SHCXX] drampower/src/MemPowerSpec.cc -> .os
 [   SHCXX] drampower/src/MemTimingSpec.cc -> .os
 [   SHCXX] drampower/src/MemoryPowerModel.cc -> .os
 [   SHCXX] drampower/src/MemorySpecification.cc -> .os
 [   SHCXX] drampower/src/Parameter.cc -> .os
 [   SHCXX] drampower/src/Parametrisable.cc -> .os
 [   SHCXX] drampower/src/libdrampower/LibDRAMPower.cc -> .os
 [   SHCXX] drampower/src/CAHelpers.cc -> .os
 [   SHCXX] drampower/src/CmdHandlers.cc -> .os
 [   SHCXX] drampower/src/MemBankWiseParams.cc -> .os
 [ CXX] GCN3_X86/base/date.cc -> .o
 [LINK]  -> GCN3_X86/gem5.opt
scons: done building targets.
*** Summary of Warnings ***
Warning: Deprecated namespaces are not supported by this compiler.
 Please make sure to check the mailing list for deprecation
 announcements.
+ wget -qN http://dist.gem5.org/dist/develop/test-progs/square/square
+ mkdir -p tests/testing-results
+ docker run --rm -u 118: --volume 
/nobackup/jenkins/workspace/nightly/tests/..:/nobackup/jenkins/workspace/nightly/tests/..
 -w /nobackup/jenkins/workspace/nightly/tests/.. 
gcr.io/gem5-test/gcn-gpu:latest build/GCN3_X86/gem5.opt 
configs/example/apu_se.py --reg-alloc-policy=dynamic -n3 -c square
Global frequency set at 1 ticks per second
build/GCN3_X86/mem/mem_interface.cc:791: warn: DRAM device capacity (8192 
Mbytes) does not match the address range assigned (512 Mbytes)
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (5) does not divide 
range [1:75] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (2) does not divide 
range [1:10] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (2) does not divide 
range [1:64] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (1) does not 
divide range [1:1e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (5) does not divide 
range [1:75] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (2) does not divide 
range [1:10] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (2) does not divide 
range [1:64] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (1) does not 
divide range [1:1e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (5) does not divide 
range [1:75] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (2) does not divide 
range [1:10] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (2) does not divide 
range [1:64] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (1) does not 
divide range [1:1e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (5) does not divide 
range [1:75] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (2) does not divide 
range [1:10] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (2) does not divide 
range [1:64] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (1) does not 
divide range [1:1e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (1) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (1) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (1) does not 
divide range [1:1.6e+06] into equal-sized 

[gem5-dev] Re: Build failed in Jenkins: nightly #135

2022-02-22 Thread Poremba, Matthew via gem5-dev
[AMD Official Use Only]

Yes, I will take a look soon.


-Matt

From: Jason Lowe-Power 
Sent: Sunday, February 20, 2022 9:18 AM
To: gem5 Developer List ; Poremba, Matthew 

Cc: jenkins-no-re...@gem5.org
Subject: Re: [gem5-dev] Build failed in Jenkins: nightly #135

[CAUTION: External Email]
Hey Matt,

It looks like one of the Ruby changes you recently pushed is breaking the 
nightly tests. Can you take a look?

Thanks!
Jason

On Sat, Feb 19, 2022 at 2:06 PM jenkins-no-reply--- via gem5-dev 
mailto:gem5-dev@gem5.org>> wrote:
See 
>

Changes:

[tiago.muck] mem-ruby: Fix handling of stale CleanUnique

[matthew.poremba] mem-ruby: Ensure MOESI_AMD_Base-dir has probe destinations

[matthew.poremba] mem-ruby: Remove DirectoryMemory storage in MOESI_AMD_BASE-dir

[matthew.poremba] mem-ruby: Add protocol prints to MOESI_AMD_BASE-dma

[matthew.poremba] cpu: Only acquire needed tokens in PTL tester

[matthew.poremba] configs: Allow for no DMAs in Ruby GPU tester

[tiago.muck] cpu: fix issues with ruby's memtest

[mattdsinclair] gpu-compute: Set scratch_base, lds_base for gfx902

[mattdsinclair] gpu-compute: Fix register checking and allocation in dyn manager

[gabe.black] ext-testlib: Import MutableSet properly.


--
[...truncated 1.65 MB...]
 [ CXX] GCN3_X86/debug/VIOBlock.cc -> .o
 [ CXX] GCN3_X86/debug/VIO9P.cc -> .o
 [ CXX] GCN3_X86/debug/VIO9PData.cc -> .o
 [ CXX] 
GCN3_X86/python/m5/defines.py.cc
 -> .o
 [ CXX] 
GCN3_X86/python/m5/info.py.cc
 -> .o
 [   SHCXX] iostream3/zfstream.cc -> .os
 [   SHCXX] nomali/lib/gpu.cc -> .os
 [   SHCXX] nomali/lib/gpublock.cc -> .os
 [   SHCXX] nomali/lib/gpucontrol.cc -> .os
 [   SHCXX] nomali/lib/jobcontrol.cc -> .os
 [   SHCXX] nomali/lib/jobslot.cc -> .os
 [   SHCXX] nomali/lib/mali_midgard.cc -> .os
 [   SHCXX] nomali/lib/mali_t6xx.cc -> .os
 [   SHCXX] nomali/lib/mali_t7xx.cc -> .os
 [   SHCXX] nomali/lib/addrspace.cc -> .os
 [   SHCXX] nomali/lib/mmu.cc -> .os
 [   SHCXX] nomali/lib/nomali_api.cc -> .os
 [   SHCXX] drampower/src/CommandAnalysis.cc -> .os
 [   SHCXX] drampower/src/MemArchitectureSpec.cc -> .os
 [   SHCXX] drampower/src/MemCommand.cc -> .os
 [   SHCXX] drampower/src/MemPowerSpec.cc -> .os
 [   SHCXX] drampower/src/MemTimingSpec.cc -> .os
 [   SHCXX] drampower/src/MemoryPowerModel.cc -> .os
 [   SHCXX] drampower/src/MemorySpecification.cc -> .os
 [   SHCXX] drampower/src/Parameter.cc -> .os
 [   SHCXX] drampower/src/Parametrisable.cc -> .os
 [   SHCXX] drampower/src/libdrampower/LibDRAMPower.cc -> .os
 [   SHCXX] drampower/src/CAHelpers.cc -> .os
 [   SHCXX] drampower/src/CmdHandlers.cc -> .os
 [   SHCXX] drampower/src/MemBankWiseParams.cc -> .os
 [ CXX] GCN3_X86/base/date.cc -> .o
 [LINK]  -> GCN3_X86/gem5.opt
scons: done building targets.
*** Summary of Warnings ***
Warning: Deprecated namespaces are not supported by this compiler.
 Please make sure to check the mailing list for deprecation
 announcements.
+ wget -qN 
http://dist.gem5.org/dist/develop/test-progs/square/square
+ mkdir -p tests/testing-results
+ docker run --rm -u 118: --volume 
/nobackup/jenkins/workspace/nightly/tests/..:/nobackup/jenkins/workspace/nightly/tests/..
 -w /nobackup/jenkins/workspace/nightly/tests/.. 

[gem5-dev] Change in gem5/gem5[develop]: arch: Improve error reporting from the microcode assembler.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57020 )



Change subject: arch: Improve error reporting from the microcode assembler.
..

arch: Improve error reporting from the microcode assembler.

Automatically include a filename and line number.

Unfortunately the line number is wherever the parser is, which might be
past the point of the actual error. For instance, macroops are parsed in
their entirity before being processed. These error messages could be
improved further by recording what line microops, labels, directives,
etc, were on so that can be printed later.

Change-Id: I72c0e4ef0442080e382987dda15470de920daa62
---
M src/arch/ucasmlib/assembler.py
M src/arch/ucasmlib/block.py
M src/arch/ucasmlib/parser.py
3 files changed, 30 insertions(+), 10 deletions(-)



diff --git a/src/arch/ucasmlib/assembler.py b/src/arch/ucasmlib/assembler.py
index dcf77b7..e3d4d1b 100644
--- a/src/arch/ucasmlib/assembler.py
+++ b/src/arch/ucasmlib/assembler.py
@@ -47,8 +47,8 @@
 try:
 text = eval(f'(lambda {self.params} :  
f{repr(self.body)})({args})',

 {}, assembler.symbols)
-except Exception as e:
-assembler.print_error(f'Failed to expand macro: {e}')
+except BaseException as err:
+assembler.print_error(f'Failed to expand macro: {err}')
 raise
 return text

@@ -316,8 +316,8 @@
 'macroop_def : DEF MACROOP ID block SEMI'
 try:
 curop = self.macro_type(t[3])
-except TypeError:
-self.print_error("Error creating macroop object.")
+except BaseException as err:
+self.print_error(f'Error creating macroop object: {err}')
 raise
 for statement in t[4]:
 statement.handle(self, curop)
@@ -361,6 +361,6 @@
 def assemble(self, asm, path=None):
 if path is not None:
 # Update the path of the first level lexer.
-self.lexers[-1] = (os.path.dirname(path), self.lexer)
+self.lexers[-1] = (os.path.dirname(path), self.lexer, path)
 self.parser.parse(asm, lexer=self)
 return self.macroops
diff --git a/src/arch/ucasmlib/block.py b/src/arch/ucasmlib/block.py
index e6f1cad..0f49d59 100644
--- a/src/arch/ucasmlib/block.py
+++ b/src/arch/ucasmlib/block.py
@@ -55,8 +55,9 @@
 try:
 microop = eval(f'_cls({self.params})',
 {'_cls': microop}, assembler.symbols)
-except:
-assembler.print_error(f'Error instantiating  
microop "{self.name}"')

+except BaseException as err:
+assembler.print_error(
+f'Error instantiating microop "{self.name}": {err}')
 raise

 container.add_microop(self.name, microop)
@@ -74,8 +75,9 @@
 try:
 eval(f'_dir({self.params})',
 {'_dir': directive}, assembler.symbols)
-except:
-assembler.print_error(f'Error executing  
directive "{self.name}"')

+except BaseException as err:
+assembler.print_error(
+f'Error executing directive "{self.name}": {err}')
 raise

 class Label(Statement):
diff --git a/src/arch/ucasmlib/parser.py b/src/arch/ucasmlib/parser.py
index 1ee18e8..0feee11 100644
--- a/src/arch/ucasmlib/parser.py
+++ b/src/arch/ucasmlib/parser.py
@@ -30,7 +30,8 @@
 class ParserBase:
 def print_error(self, message):
 print()
-print("*** %s" % message)
+print(f'*** "{self.lexers[-1][2]}", line {self.lexer.lineno}: '
+  f'{message}')
 print()

 ##

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/57020
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I72c0e4ef0442080e382987dda15470de920daa62
Gerrit-Change-Number: 57020
Gerrit-PatchSet: 1
Gerrit-Owner: Gabe Black 
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s

[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Add ldio and stio microops for IO port access.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57019 )



Change subject: arch-x86: Add ldio and stio microops for IO port access.
..

arch-x86: Add ldio and stio microops for IO port access.

Instead of using an "internal" segment to access IO ports, the only
remaining use of the internal segment, use specialized microops instead.
From the perspective of simulation this should be basically functionally
equivalent, but it simplifies the microcode, simplifies the
communication with the TLB, simplifies the TLB implementation, frees up
some bits for x86 specific memory request flags, and makes the special
"internal" segment available for other uses. It was never very clear
what the "internal" segment was actually supposed to be for, I had just
made something up that seemed plausible. If another plausible use comes
up, then it could be reused for that.

Change-Id: Ib1b7c13c58a9a70c1290bbee3229bcb532a45afe
---
M src/arch/amdgpu/common/tlb.cc
M src/arch/amdgpu/common/tlb.hh
M src/arch/x86/insts/microldstop.hh
M src/arch/x86/insts/microop_args.hh
M src/arch/x86/isa/microops/ldstop.isa
M src/arch/x86/isa/operands.isa
M src/arch/x86/ldstflags.hh
M src/arch/x86/microcode/general_purpose/input_output/general_io.ucode
M src/arch/x86/microcode/general_purpose/input_output/string_io.ucode
M src/arch/x86/tlb.cc
M src/arch/x86/tlb.hh
M src/arch/x86/ucasmlib/arch/x86/microops/ldstop.py
M src/arch/x86/x86_traits.hh
13 files changed, 303 insertions(+), 101 deletions(-)



diff --git a/src/arch/amdgpu/common/tlb.cc b/src/arch/amdgpu/common/tlb.cc
index 6b98363..7afe176 100644
--- a/src/arch/amdgpu/common/tlb.cc
+++ b/src/arch/amdgpu/common/tlb.cc
@@ -274,32 +274,6 @@
 }
 }

-
-
-Fault
-GpuTLB::translateInt(bool read, const RequestPtr , ThreadContext  
*tc)

-{
-DPRINTF(GPUTLB, "Addresses references internal memory.\n");
-Addr vaddr = req->getVaddr();
-Addr prefix = (vaddr >> 3) & IntAddrPrefixMask;
-
-if (prefix == IntAddrPrefixIO) {
-// TODO If CPL > IOPL or in virtual mode, check the I/O  
permission

-// bitmap in the TSS.
-
-Addr IOPort = vaddr & ~IntAddrPrefixMask;
-// Make sure the address fits in the expected 16 bit IO address
-// space.
-assert(!(IOPort & ~0x));
-req->setFlags(Request::UNCACHEABLE | Request::STRICT_ORDER);
-req->setPaddr(PhysAddrPrefixIO | IOPort);
-return NoFault;
-} else {
-panic("Access to unrecognized internal address space %#x.\n",
-  prefix);
-}
-}
-
 /**
  * TLB_lookup will only perform a TLB lookup returning true on a TLB  
hit

  * and false on a TLB miss.
@@ -361,21 +335,28 @@
   Translation *translation, Mode mode,
   bool , bool timing, int )
 {
-uint32_t flags = req->getFlags();
-int seg = flags & SegmentFlagMask;
-bool storeCheck = flags & Request::READ_MODIFY_WRITE;
-
-// If this is true, we're dealing with a request
-// to a non-memory address space.
-if (seg == SEGMENT_REG_MS) {
-return translateInt(mode == Mode::Read, req, tc);
-}
+const Request::Flags flags = req->getFlags();

 delayedResponse = false;
-Addr vaddr = req->getVaddr();
+
+const Addr vaddr = req->getVaddr();
+
+// If this is true, we're dealing with a request to a non-memory
+// address space.
+if (flags & IOFlagBit) {
+DPRINTF(GPUTLB, "Translating IO port %#x.\n", vaddr);
+
+// Make sure the IO port is within the valid range.
+assert((vaddr & ~mask(16)) == 0);
+
+req->setPaddr(PhysAddrPrefixIO | vaddr);
+return NoFault;
+}
+
 DPRINTF(GPUTLB, "Translating vaddr %#x.\n", vaddr);

-HandyM5Reg m5Reg = tc->readMiscRegNoEffect(MISCREG_M5_REG);
+const bool storeCheck = flags & Request::READ_MODIFY_WRITE;
+const HandyM5Reg m5Reg = tc->readMiscRegNoEffect(MISCREG_M5_REG);

 // If protected mode has been enabled...
 if (m5Reg.prot) {
diff --git a/src/arch/amdgpu/common/tlb.hh b/src/arch/amdgpu/common/tlb.hh
index dec0d61..4c65a85 100644
--- a/src/arch/amdgpu/common/tlb.hh
+++ b/src/arch/amdgpu/common/tlb.hh
@@ -158,9 +158,6 @@
  */
 std::vector entryList;

-Fault translateInt(bool read, const RequestPtr ,
-   ThreadContext *tc);
-
 Fault translate(const RequestPtr , ThreadContext *tc,
 Translation *translation, Mode mode, bool ,
 bool timing, int );
diff --git a/src/arch/x86/insts/microldstop.hh  
b/src/arch/x86/insts/microldstop.hh

index 90b8058..980deb7 100644
--- a/src/arch/x86/insts/microldstop.hh
+++ 

[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Fill out the implementation of IN/OUT/INS/OUTS insts.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57021 )



Change subject: arch-x86: Fill out the implementation of IN/OUT/INS/OUTS  
insts.

..

arch-x86: Fill out the implementation of IN/OUT/INS/OUTS insts.

Implement IOPL and IO-permission bitmap checks, and the virtual 8086 mode
versions. The difference is that in virtual 8086 mode, the IOPL is not
checked, and the IO-permission bitmap is always checked.

Change-Id: Id836a482493c06a14e08993ead1e4ab7d630e078
---
M src/arch/x86/microcode/general_purpose/input_output/general_io.ucode
M src/arch/x86/microcode/general_purpose/input_output/string_io.ucode
A src/arch/x86/microcode/macros/ioports.ucode
3 files changed, 261 insertions(+), 23 deletions(-)



diff --git  
a/src/arch/x86/microcode/general_purpose/input_output/general_io.ucode  
b/src/arch/x86/microcode/general_purpose/input_output/general_io.ucode

index 58df96e..e01f9f6 100644
--- a/src/arch/x86/microcode/general_purpose/input_output/general_io.ucode
+++ b/src/arch/x86/microcode/general_purpose/input_output/general_io.ucode
@@ -36,55 +36,116 @@
 # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

+include "macros/ioports.ucode"
+
 def macroop IN {
 .args 'R', 'I'
 .adjust_imm trimImm(8)
 limm t1, imm, dataSize=8
+m_check_iopl iopl_temp="t2", cpl_temp="t3"
+
 mfence
+
+# Do we need to check the IO-permission bitmap?
+br "iopl_ok", flags=(nCECF,)
+
+m_check_iopb io_port="t1", temp1="t2", temp2="t3"
+
+iopl_ok:
 ldio reg, t1
 mfence
 };

 def macroop IN_VIRT {
 .args 'R', 'I'
-panic "Virtual mode in isn't implemented!"
+.adjust_imm trimImm(8)
+limm t1, imm, dataSize=8
+mfence
+
+m_check_iopb io_port="t1", temp1="t2", temp2="t3"
+
+ldio reg, t1
+mfence
 };

 def macroop IN {
 .args 'R', 'R'
+m_check_iopl iopl_temp="t1", cpl_temp="t2"
+
 mfence
+
+# Do we need to check the IO-permission bitmap?
+br "iopl_ok", flags=(nCECF,)
+
+m_check_iopb io_port="regm", temp1="t1", temp2="t2"
+
+iopl_ok:
 ldio reg, regm
 mfence
 };

 def macroop IN_VIRT {
 .args 'R', 'R'
-panic "Virtual mode in isn't implemented!"
+mfence
+
+m_check_iopb io_port="regm", temp1="t1", temp2="t2"
+
+ldio reg, regm
+mfence
 };

 def macroop OUT {
 .args 'I', 'R'
 .adjust_imm trimImm(8)
 limm t1, imm, dataSize=8
+m_check_iopl iopl_temp="t2", cpl_temp="t3"
+
 mfence
+
+# Do we need to check the IO-permission bitmap?
+br "iopl_ok", flags=(nCECF,)
+
+m_check_iopb io_port="t1", temp1="t2", temp2="t3"
+
+iopl_ok:
 stio reg, t1
 mfence
 };

 def macroop OUT_VIRT {
 .args 'I', 'R'
-panic "Virtual mode out isn't implemented!"
+.adjust_imm trimImm(8)
+limm t1, imm, dataSize=8
+mfence
+
+m_check_iopb io_port="t1", temp1="t2", temp2="t3"
+
+stio reg, t1
+mfence
 };

 def macroop OUT {
 .args 'R', 'R'
-zexti t2, reg, 15, dataSize=8
+m_check_iopl iopl_temp="t1", cpl_temp="t2"
+
 mfence
+
+# Do we need to check the IO-permission bitmap?
+br "iopl_ok", flags=(nCECF,)
+
+m_check_iopb io_port="reg", temp1="t1", temp2="t2"
+
+iopl_ok:
 stio regm, reg
 mfence
 };

 def macroop OUT_VIRT {
 .args 'R', 'R'
-panic "Virtual mode out isn't implemented!"
+mfence
+
+m_check_iopb io_port="reg", temp1="t1", temp2="t2"
+
+stio regm, reg
+mfence
 };
diff --git  
a/src/arch/x86/microcode/general_purpose/input_output/string_io.ucode  
b/src/arch/x86/microcode/general_purpose/input_output/string_io.ucode

index 40157fc..fffae9d 100644
--- a/src/arch/x86/microcode/general_purpose/input_output/string_io.ucode
+++ b/src/arch/x86/microcode/general_purpose/input_output/string_io.ucode
@@ -33,18 +33,27 @@
 # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

+include "macros/ioports.ucode"
+
 def macroop INS {
 .args 'Y', 'R'
+
 # Find the constant we need to either add or subtract from rdi
 ruflag t0, 10
 movi t3, t3, dsz, flags=(CEZF,), dataSize=asz
 subi t4, t0, dsz, dataSize=asz
 mov t3, t3, t4, flags=(nCEZF,), dataSize=asz

-zexti t2, reg, 15, dataSize=8
+m_check_iopl iopl_temp="t8", cpl_temp="t9"

 mfence
-ldio t6, t2
+# Do we need to check the IO-permission bitmap?
+br "iopl_ok", flags=(nCECF,)
+
+m_check_iopb io_port="reg", temp1="t8", temp2="t9"
+
+iopl_ok:
+ldio t6, reg
 st t6, es, [1, t0, rdi]
 mfence

@@ -53,7 +62,22 @@

 def macroop INS_VIRT {
 .args 'Y', 'R'
-panic "Virtual mode ins isn't implemented!"
+
+# Find the constant we need to either add or subtract from rdi
+ruflag t0, 10
+movi t3, t3, dsz, flags=(CEZF,), 

[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Simplify and consolidate ld/st microop definitions.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57018 )



Change subject: arch-x86: Simplify and consolidate ld/st microop  
definitions.

..

arch-x86: Simplify and consolidate ld/st microop definitions.

A lot of mechanism was repeated between the various flavors of microop
base class in python. This code can mostly be consolidated. One
exception are the __init__ methods which are still pretty repetitive,
but they are often slightly different and provide the top level
interface to those microops. Having specific signatures which exactly
match how the microops can/should be used is valuable in that case, and
worth a little repetition.

Change-Id: Ie440ac3b8dcbd1b3d00e33a1b1209d4fde0a966e
---
M src/arch/x86/ucasmlib/arch/x86/microops/ldstop.py
1 file changed, 50 insertions(+), 96 deletions(-)



diff --git a/src/arch/x86/ucasmlib/arch/x86/microops/ldstop.py  
b/src/arch/x86/ucasmlib/arch/x86/microops/ldstop.py

index 5d46c71..960fc18 100644
--- a/src/arch/x86/ucasmlib/arch/x86/microops/ldstop.py
+++ b/src/arch/x86/ucasmlib/arch/x86/microops/ldstop.py
@@ -63,66 +63,30 @@
 if uncacheable:
 self.instFlags += " | Request::UNCACHEABLE"

-def getAllocator(self, microFlags):
-allocator = '''new %(class_name)s(machInst, macrocodeBlock,
-%(flags)s, %(dataSize)s, %(addressSize)s, %(memFlags)s,
-%(data)s, %(scale)s, %(index)s, %(base)s,
-%(disp)s, %(segment)s)''' % {
-"class_name" : self.className,
-"flags" : self.microFlagsText(microFlags) + self.instFlags,
-"scale" : self.scale, "index" : self.index,
-"base" : self.base,
-"disp" : self.disp,
-"segment" : self.segment, "data" : self.data,
-"dataSize" : self.dataSize, "addressSize" : self.addressSize,
-"memFlags" : self.memFlags}
-return allocator
-
-class BigLdStOp(X86Microop):
-def __init__(self, data, segment, addr, disp,
-dataSize, addressSize, baseFlags, atCPL0, prefetch, nonSpec,
-uncacheable):
-super().__init__()
-self.data = data
-[self.scale, self.index, self.base] = addr
-self.disp = disp
-self.segment = segment
-self.dataSize = dataSize
-self.addressSize = addressSize
-self.memFlags = baseFlags
-if atCPL0:
-self.memFlags += " | CPL0FlagBit"
-self.instFlags = ""
-if prefetch:
-self.memFlags += " | Request::PREFETCH"
-self.instFlags += " | (1ULL << StaticInst::IsDataPrefetch)"
-if nonSpec:
-self.instFlags += " | (1ULL << StaticInst::IsNonSpeculative)"
-if uncacheable:
-self.instFlags += " | Request::UNCACHEABLE"
+allocatorTemplate = '''new {self.className}(machInst, macrocodeBlock,
+{flags}, {self.dataSize}, {self.addressSize},
+{self.memFlags}, {self.data}, {self.scale},
+{self.index}, {self.base}, {self.disp}, {self.segment})
+'''

 def getAllocator(self, microFlags):
-allocString = '''
-(%(dataSize)s >= 4) ?
-(StaticInstPtr)(new %(class_name)sBig(machInst,
-macrocodeBlock, %(flags)s, %(dataSize)s,
-%(addressSize)s, %(memFlags)s, %(data)s, %(scale)s,
-%(index)s, %(base)s, %(disp)s, %(segment)s)) :
-(StaticInstPtr)(new %(class_name)s(machInst,
-macrocodeBlock, %(flags)s, %(dataSize)s,
-%(addressSize)s, %(memFlags)s, %(data)s, %(scale)s,
-%(index)s, %(base)s, %(disp)s, %(segment)s))
+return self.allocatorTemplate.format(self=self,
+flags=self.microFlagsText(microFlags))
+
+class BigLdStOp(LdStOp):
+allocatorTemplate = '''
+({self.dataSize} >= 4) ?
+(StaticInstPtr)(new {self.className}Big(machInst,
+macrocodeBlock, {flags}, {self.dataSize},
+{self.addressSize}, {self.memFlags}, {self.data},
+{self.scale}, {self.index}, {self.base}, {self.disp},
+{self.segment})) :
+(StaticInstPtr)(new {self.className}(machInst,
+macrocodeBlock, {flags}, {self.dataSize},
+{self.addressSize}, {self.memFlags}, {self.data},
+{self.scale}, {self.index}, {self.base}, {self.disp},
+{self.segment}))
 '''
-allocator = allocString % {
-"class_name" : self.className,
-"flags" : self.microFlagsText(microFlags) + self.instFlags,
-"scale" : self.scale, "index" : self.index,
-"base" : self.base,
-"disp" : self.disp,
-"segment" : self.segment, 

[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Move segmentation checks to ld/st microops.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57009 )



Change subject: arch-x86: Move segmentation checks to ld/st microops.
..

arch-x86: Move segmentation checks to ld/st microops.

These microops were applying the segment base, but not checking limits,
segmentation level access permissions, whether the segment is usable,
etc. Then in the TLB, the original offset and other information had to
be recovered from the post segmentation address, and the request flags.

Instead, we can perform those checks in the microops which already have
all the information they need. This simplifies the plumbing to the TLB,
and avoids having to reverse segmentation which is error prone.

Change-Id: I1fe70ad650bf4f221660e608c49b70e081223e3f
---
M src/arch/amdgpu/common/tlb.cc
M src/arch/x86/SConscript
A src/arch/x86/insts/microldstop.cc
M src/arch/x86/insts/microldstop.hh
M src/arch/x86/isa/microops/ldstop.isa
M src/arch/x86/ldstflags.hh
M src/arch/x86/tlb.cc
M src/arch/x86/ucasmlib/arch/x86/microops/ldstop.py
8 files changed, 235 insertions(+), 187 deletions(-)



diff --git a/src/arch/amdgpu/common/tlb.cc b/src/arch/amdgpu/common/tlb.cc
index 698570b..f0f806a 100644
--- a/src/arch/amdgpu/common/tlb.cc
+++ b/src/arch/amdgpu/common/tlb.cc
@@ -438,53 +438,25 @@
 // If protected mode has been enabled...
 if (m5Reg.prot) {
 DPRINTF(GPUTLB, "In protected mode.\n");
-// If we're not in 64-bit mode, do protection/limit checks
-if (m5Reg.mode != LongMode) {
-DPRINTF(GPUTLB, "Not in long mode. Checking segment "
-"protection.\n");
+// If we're not in 64-bit mode and doing a fetch, check CS  
limits.

+if (m5Reg.mode != LongMode && mode == BaseMMU::Execute) {
+DPRINTF(GPUTLB, "Not in long mode. Checking CS limits.\n");

-// Check for a null segment selector.
-if (!(seg == SEGMENT_REG_TSG || seg ==  
SYS_SEGMENT_REG_IDTR ||

-seg == SEGMENT_REG_HS || seg == SEGMENT_REG_LS)
-&& !tc->readMiscRegNoEffect(MISCREG_SEG_SEL(seg))) {
+const SegAttr attr =  
tc->readMiscRegNoEffect(MISCREG_CS_ATTR);

+// Check for an unusable segment.
+if (attr.unusable) {
+DPRINTF(GPUTLB, "Unusable segment.\n");
 return std::make_shared(0);
 }
-
-bool expandDown = false;
-SegAttr attr =  
tc->readMiscRegNoEffect(MISCREG_SEG_ATTR(seg));

-
-if (seg >= SEGMENT_REG_ES && seg <= SEGMENT_REG_HS) {
-if (!attr.writable && (mode == BaseMMU::Write ||
-storeCheck))
-return std::make_shared(0);
-
-if (!attr.readable && mode == BaseMMU::Read)
-return std::make_shared(0);
-
-expandDown = attr.expandDown;
-
-}
-
-Addr base = tc->readMiscRegNoEffect(MISCREG_SEG_BASE(seg));
-Addr limit =  
tc->readMiscRegNoEffect(MISCREG_SEG_LIMIT(seg));
-Addr logSize = (flags >> AddrSizeFlagShift) &  
AddrSizeFlagMask;

-int size = 8 << logSize;
-
-Addr offset = (vaddr - base) & mask(size);
-Addr endOffset = offset + req->getSize() - 1;
-
-if (expandDown) {
-DPRINTF(GPUTLB, "Checking an expand down segment.\n");
-warn_once("Expand down segments are untested.\n");
-
-if (offset <= limit || endOffset <= limit)
-return std::make_shared(0);
-} else {
-if (offset > limit || endOffset > limit)
-return std::make_shared(0);
+const Addr base =  
tc->readMiscRegNoEffect(MISCREG_CS_EFF_BASE);
+const Addr limit =  
tc->readMiscRegNoEffect(MISCREG_CS_LIMIT);

+const Addr offset = vaddr - base;
+if (offset + req->getSize() - 1 > limit) {
+DPRINTF(GPUTLB, "Segment limit check failed, "
+"offset = %#x limit = %#x.\n", offset, limit);
+return std::make_shared(0);
 }
 }
-
 // If paging is enabled, do the translation.
 if (m5Reg.paging) {
 DPRINTF(GPUTLB, "Paging enabled.\n");
diff --git a/src/arch/x86/SConscript b/src/arch/x86/SConscript
index 5c80532..c592248 100644
--- a/src/arch/x86/SConscript
+++ b/src/arch/x86/SConscript
@@ -48,6 +48,7 @@
 Source('fs_workload.cc', tags='x86 isa')
 Source('insts/badmicroop.cc', tags='x86 isa')
 Source('insts/microop.cc', tags='x86 isa')
+Source('insts/microldstop.cc', tags='x86 isa')
 

[gem5-dev] Change in gem5/gem5[develop]: arch-x86,dev: Handle PCI config addresses in an x86 PCI host.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57017 )



Change subject: arch-x86,dev: Handle PCI config addresses in an x86 PCI  
host.

..

arch-x86,dev: Handle PCI config addresses in an x86 PCI host.

Stop handling it in the x86 TLB.

Change-Id: I9d5d44e614e29cd6bcdac9b0efe75ffe08af2f5f
---
M src/arch/amdgpu/common/tlb.cc
M src/arch/x86/kvm/x86_cpu.cc
M src/arch/x86/regs/misc.hh
M src/arch/x86/tlb.cc
M src/arch/x86/x86_traits.hh
M src/dev/x86/Pc.py
M src/dev/x86/SConscript
A src/dev/x86/pci_host.cc
A src/dev/x86/pci_host.hh
9 files changed, 191 insertions(+), 129 deletions(-)



diff --git a/src/arch/amdgpu/common/tlb.cc b/src/arch/amdgpu/common/tlb.cc
index 127aa02..6b98363 100644
--- a/src/arch/amdgpu/common/tlb.cc
+++ b/src/arch/amdgpu/common/tlb.cc
@@ -276,26 +276,6 @@



-namespace
-{
-
-Cycles
-localMiscRegAccess(bool read, MiscRegIndex regNum,
-   ThreadContext *tc, PacketPtr pkt)
-{
-if (read) {
-RegVal data = htole(tc->readMiscReg(regNum));
-// Make sure we don't trot off the end of data.
-pkt->setData((uint8_t *));
-} else {
-RegVal data = htole(tc->readMiscRegNoEffect(regNum));
-tc->setMiscReg(regNum, letoh(data));
-}
-return Cycles(1);
-}
-
-} // anonymous namespace
-
 Fault
 GpuTLB::translateInt(bool read, const RequestPtr , ThreadContext  
*tc)

 {
@@ -311,29 +291,8 @@
 // Make sure the address fits in the expected 16 bit IO address
 // space.
 assert(!(IOPort & ~0x));
-if (IOPort == 0xCF8 && req->getSize() == 4) {
-req->setLocalAccessor(
-[read](ThreadContext *tc, PacketPtr pkt)
-{
-return localMiscRegAccess(
-read, MISCREG_PCI_CONFIG_ADDRESS, tc, pkt);
-}
-);
-} else if ((IOPort & ~mask(2)) == 0xCFC) {
-req->setFlags(Request::UNCACHEABLE |  
Request::STRICT_ORDER);

-Addr configAddress =
-tc->readMiscRegNoEffect(MISCREG_PCI_CONFIG_ADDRESS);
-if (bits(configAddress, 31, 31)) {
-req->setPaddr(PhysAddrPrefixPciConfig |
-mbits(configAddress, 30, 2) |
-(IOPort & mask(2)));
-} else {
-req->setPaddr(PhysAddrPrefixIO | IOPort);
-}
-} else {
-req->setFlags(Request::UNCACHEABLE |  
Request::STRICT_ORDER);

-req->setPaddr(PhysAddrPrefixIO | IOPort);
-}
+req->setFlags(Request::UNCACHEABLE | Request::STRICT_ORDER);
+req->setPaddr(PhysAddrPrefixIO | IOPort);
 return NoFault;
 } else {
 panic("Access to unrecognized internal address space %#x.\n",
diff --git a/src/arch/x86/kvm/x86_cpu.cc b/src/arch/x86/kvm/x86_cpu.cc
index af041a5..fbfda62 100644
--- a/src/arch/x86/kvm/x86_cpu.cc
+++ b/src/arch/x86/kvm/x86_cpu.cc
@@ -55,9 +55,6 @@

 #define MSR_TSC 0x10

-#define IO_PCI_CONF_ADDR 0xCF8
-#define IO_PCI_CONF_DATA_BASE 0xCFC
-
 // Task segment type of an inactive 32-bit or 64-bit task
 #define SEG_SYS_TYPE_TSS_AVAILABLE 9
 // Task segment type of an active 32-bit or 64-bit task
@@ -1319,30 +1316,7 @@
 DPRINTF(KvmIO, "KVM-x86: Handling IO instruction (%s) (port: 0x%x)\n",
 (isWrite ? "out" : "in"), kvm_run.io.port);

-/* Vanilla gem5 handles PCI discovery in the TLB(!). Since we
- * don't use the TLB component, we need to intercept and handle
- * the PCI configuration space IO ports here.
- *
- * The IO port PCI discovery mechanism uses one address register
- * and one data register. We map the address register to a misc
- * reg and use that to re-route data register accesses to the
- * right location in the PCI configuration space.
- */
-if (port == IO_PCI_CONF_ADDR) {
-handleIOMiscReg32(MISCREG_PCI_CONFIG_ADDRESS);
-return 0;
-} else if ((port & ~0x3) == IO_PCI_CONF_DATA_BASE) {
-Addr pciConfigAddr(tc->readMiscRegNoEffect(
-MISCREG_PCI_CONFIG_ADDRESS));
-if (pciConfigAddr & 0x8000) {
-pAddr = X86ISA::x86PciConfigAddress((pciConfigAddr &  
0x7ffc) |

-(port & 0x3));
-} else {
-pAddr = X86ISA::x86IOAddress(port);
-}
-} else {
-pAddr = X86ISA::x86IOAddress(port);
-}
+pAddr = X86ISA::x86IOAddress(port);

 const MemCmd cmd(isWrite ? MemCmd::WriteReq : MemCmd::ReadReq);
 // Temporarily lock and migrate to the device event queue to
diff --git a/src/arch/x86/regs/misc.hh 

[gem5-dev] Change in gem5/gem5[develop]: dev: Fix minor style problem in dev/pci/host.hh

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57012 )



Change subject: dev: Fix minor style problem in dev/pci/host.hh
..

dev: Fix minor style problem in dev/pci/host.hh

Change-Id: Id00431a972cfbae083813b8eb4ba6687fdb1eea9
---
M src/dev/pci/host.hh
1 file changed, 18 insertions(+), 3 deletions(-)



diff --git a/src/dev/pci/host.hh b/src/dev/pci/host.hh
index de36338..04302f5 100644
--- a/src/dev/pci/host.hh
+++ b/src/dev/pci/host.hh
@@ -286,15 +286,21 @@
 AddrRangeList getAddrRanges() const override;

   protected: // PciHost
-Addr pioAddr(const PciBusAddr _addr, Addr pci_addr) const override  
{

+Addr
+pioAddr(const PciBusAddr _addr, Addr pci_addr) const override
+{
 return pciPioBase + pci_addr;
 }

-Addr memAddr(const PciBusAddr _addr, Addr pci_addr) const override  
{

+Addr
+memAddr(const PciBusAddr _addr, Addr pci_addr) const override
+{
 return pciMemBase + pci_addr;
 }

-Addr dmaAddr(const PciBusAddr _addr, Addr pci_addr) const override  
{

+Addr
+dmaAddr(const PciBusAddr _addr, Addr pci_addr) const override
+{
 return pciDmaBase + pci_addr;
 }


--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/57012
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Id00431a972cfbae083813b8eb4ba6687fdb1eea9
Gerrit-Change-Number: 57012
Gerrit-PatchSet: 1
Gerrit-Owner: Gabe Black 
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s

[gem5-dev] Change in gem5/gem5[develop]: dev: Fix the size of the config space for the IDE controller.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57014 )



Change subject: dev: Fix the size of the config space for the IDE  
controller.

..

dev: Fix the size of the config space for the IDE controller.

The PCI_CONFIG_SIZE constant isn't really a size, it's actually a mask.
When using it to size the IDE controller config space, we need to add 1.

Change-Id: I4aa6c0a488ac3dc1977a95d36423160a4bdda133
---
M src/dev/storage/ide_ctrl.hh
1 file changed, 15 insertions(+), 1 deletion(-)



diff --git a/src/dev/storage/ide_ctrl.hh b/src/dev/storage/ide_ctrl.hh
index 38b97bf..635c446 100644
--- a/src/dev/storage/ide_ctrl.hh
+++ b/src/dev/storage/ide_ctrl.hh
@@ -96,7 +96,9 @@
 /* 0x48  */ Register8 udmaControl = {"udma control"};
 /* 0x49  */ RegisterRaz raz1 = {"raz1", 1};
 /* 0x4a-0x4b */ Register16 udmaTiming = {"udma timing"};
-/* 0x4c-...  */ RegisterRaz raz2 = {"raz2", PCI_CONFIG_SIZE -  
0x4c};

+// PCI_CONFIG_SIZE is really a mask and not a size, so we add one.
+/* 0x4c-...  */ RegisterRaz raz2 =
+{"raz2", (PCI_CONFIG_SIZE + 1) - 0x4c};

 void serialize(CheckpointOut ) const;
 void unserialize(CheckpointIn );

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/57014
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I4aa6c0a488ac3dc1977a95d36423160a4bdda133
Gerrit-Change-Number: 57014
Gerrit-PatchSet: 1
Gerrit-Owner: Gabe Black 
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s

[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Dedent and constexpr-ize x86_traits.hh.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57013 )



Change subject: arch-x86: Dedent and constexpr-ize x86_traits.hh.
..

arch-x86: Dedent and constexpr-ize x86_traits.hh.

Change-Id: I285a9868be9cf4cee13bf572d9f914485e60ca26
---
M src/arch/x86/x86_traits.hh
1 file changed, 58 insertions(+), 52 deletions(-)



diff --git a/src/arch/x86/x86_traits.hh b/src/arch/x86/x86_traits.hh
index 71d0929..113326e 100644
--- a/src/arch/x86/x86_traits.hh
+++ b/src/arch/x86/x86_traits.hh
@@ -42,64 +42,61 @@

 #include "base/types.hh"

-namespace gem5
+namespace gem5::X86ISA
 {

-namespace X86ISA
+constexpr int NumMicroIntRegs = 16;
+
+constexpr int NumMMXRegs = 8;
+constexpr int NumXMMRegs = 16;
+constexpr int NumMicroFpRegs = 8;
+
+constexpr int NumCRegs = 16;
+constexpr int NumDRegs = 8;
+
+constexpr int NumSegments = 6;
+constexpr int NumSysSegments = 4;
+
+constexpr Addr IntAddrPrefixMask = 0xULL;
+constexpr Addr IntAddrPrefixIO = 0x3ULL;
+
+constexpr Addr PhysAddrPrefixIO = 0x8000ULL;
+constexpr Addr PhysAddrPrefixPciConfig = 0xC000ULL;
+constexpr Addr PhysAddrPrefixLocalAPIC = 0x2000ULL;
+constexpr Addr PhysAddrPrefixInterrupts = 0xA000ULL;
+// Each APIC gets two pages. One page is used for local apics to field
+// accesses from the CPU, and the other is for all APICs to communicate.
+constexpr Addr PhysAddrAPICRangeSize = 1 << 12;
+
+// Put this in an unused part of the 16 bit IO port address space.
+constexpr Addr PhysAddrIntA = 0x8001ULL;
+
+static constexpr inline Addr
+x86IOAddress(const uint32_t port)
 {
-const int NumMicroIntRegs = 16;
+return PhysAddrPrefixIO | port;
+}

-const int NumMMXRegs = 8;
-const int NumXMMRegs = 16;
-const int NumMicroFpRegs = 8;
+static constexpr inline Addr
+x86PciConfigAddress(const uint32_t addr)
+{
+return PhysAddrPrefixPciConfig | addr;
+}

-const int NumCRegs = 16;
-const int NumDRegs = 8;
+static constexpr inline Addr
+x86LocalAPICAddress(const uint8_t id, const uint16_t addr)
+{
+assert(addr < (1 << 12));
+return PhysAddrPrefixLocalAPIC | (id * (1 << 12)) | addr;
+}

-const int NumSegments = 6;
-const int NumSysSegments = 4;
+static constexpr inline Addr
+x86InterruptAddress(const uint8_t id, const uint16_t addr)
+{
+assert(addr < PhysAddrAPICRangeSize);
+return PhysAddrPrefixInterrupts | (id * PhysAddrAPICRangeSize) | addr;
+}

-const Addr IntAddrPrefixMask = 0xULL;
-const Addr IntAddrPrefixIO = 0x3ULL;
-
-const Addr PhysAddrPrefixIO = 0x8000ULL;
-const Addr PhysAddrPrefixPciConfig = 0xC000ULL;
-const Addr PhysAddrPrefixLocalAPIC = 0x2000ULL;
-const Addr PhysAddrPrefixInterrupts = 0xA000ULL;
-// Each APIC gets two pages. One page is used for local apics to field
-// accesses from the CPU, and the other is for all APICs to  
communicate.

-const Addr PhysAddrAPICRangeSize = 1 << 12;
-
-// Put this in an unused part of the 16 bit IO port address space.
-const Addr PhysAddrIntA = 0x8001ULL;
-
-static inline Addr
-x86IOAddress(const uint32_t port)
-{
-return PhysAddrPrefixIO | port;
-}
-
-static inline Addr
-x86PciConfigAddress(const uint32_t addr)
-{
-return PhysAddrPrefixPciConfig | addr;
-}
-
-static inline Addr
-x86LocalAPICAddress(const uint8_t id, const uint16_t addr)
-{
-assert(addr < (1 << 12));
-return PhysAddrPrefixLocalAPIC | (id * (1 << 12)) | addr;
-}
-
-static inline Addr
-x86InterruptAddress(const uint8_t id, const uint16_t addr)
-{
-assert(addr < PhysAddrAPICRangeSize);
-return PhysAddrPrefixInterrupts | (id * PhysAddrAPICRangeSize) |  
addr;

-}
-
-} // namespace X86ISA
-} // namespace gem5
+} // namespace gem5::X86ISA

 #endif //__ARCH_X86_X86TRAITS_HH__

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/57013
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I285a9868be9cf4cee13bf572d9f914485e60ca26
Gerrit-Change-Number: 57013
Gerrit-PatchSet: 1
Gerrit-Owner: Gabe Black 
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s

[gem5-dev] Change in gem5/gem5[develop]: dev: Stop passing the pkt into read|writeConfig.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57016 )



Change subject: dev: Stop passing the pkt into read|writeConfig.
..

dev: Stop passing the pkt into read|writeConfig.

The config space offset is not necessarily the masked lower bits of the
address in the original packet. We're already computing the actual
offset, but then we're throwing it away. Instead, pass in the offset,
target (or source) buffer, and size.

Change-Id: Ib01ce1f8e6c40c73d1b14e29ba979370b2ab9e6b
---
M src/dev/amdgpu/amdgpu_device.cc
M src/dev/amdgpu/amdgpu_device.hh
M src/dev/net/i8254xGBe.cc
M src/dev/net/i8254xGBe.hh
M src/dev/net/ns_gige.cc
M src/dev/net/ns_gige.hh
M src/dev/pci/device.cc
M src/dev/pci/device.hh
M src/dev/pci/host.cc
M src/dev/storage/ide_ctrl.cc
M src/dev/storage/ide_ctrl.hh
11 files changed, 123 insertions(+), 104 deletions(-)



diff --git a/src/dev/amdgpu/amdgpu_device.cc  
b/src/dev/amdgpu/amdgpu_device.cc

index a4edef4..13c50a1 100644
--- a/src/dev/amdgpu/amdgpu_device.cc
+++ b/src/dev/amdgpu/amdgpu_device.cc
@@ -97,13 +97,12 @@
 }

 Tick
-AMDGPUDevice::readConfig(PacketPtr pkt)
+AMDGPUDevice::readConfig(Addr offset, void *data, Addr size)
 {
-[[maybe_unused]] int offset = pkt->getAddr() & PCI_CONFIG_SIZE;
 DPRINTF(AMDGPUDevice, "Read Config: from offset: %#x size: %#x "
 "data: %#x\n", offset, pkt->getSize(), config.data[offset]);

-Tick delay = PciDevice::readConfig(pkt);
+Tick delay = PciDevice::readConfig(offset, data, size);

 // Before sending MMIOs the driver sends three interrupts in a row.
 // Use this to trigger creating a checkpoint to restore in timing mode.
@@ -125,14 +124,25 @@
 }

 Tick
-AMDGPUDevice::writeConfig(PacketPtr pkt)
+AMDGPUDevice::writeConfig(Addr offset, const void *data, Addr size)
 {
-[[maybe_unused]] int offset = pkt->getAddr() & PCI_CONFIG_SIZE;
-DPRINTF(AMDGPUDevice, "Write Config: from offset: %#x size: %#x "
-"data: %#x\n", offset, pkt->getSize(),
-pkt->getUintX(ByteOrder::little));
+if (debug::AMDGPUDevice) {
+uint64_t val = 0;
+if (size == 1)
+val = letoh(*(const uint8_t *)data);
+else if (size == 2)
+val = letoh(*(const uint16_t *)data);
+else if (size == 4)
+val = letoh(*(const uint32_t *)data);
+else if (size == 8)
+val = letoh(*(const uint64_t *)data);
+else
+panic("Unsupported config write size %d.", size);
+DPRINTF(AMDGPUDevice, "Write Config: from offset: %#x size: %#x "
+"data: %#x\n", offset, size, val);
+}

-return PciDevice::writeConfig(pkt);
+return PciDevice::writeConfig(offset, data, size);
 }

 void
diff --git a/src/dev/amdgpu/amdgpu_device.hh  
b/src/dev/amdgpu/amdgpu_device.hh

index b97d682..62e40e9 100644
--- a/src/dev/amdgpu/amdgpu_device.hh
+++ b/src/dev/amdgpu/amdgpu_device.hh
@@ -114,8 +114,8 @@
  */
 void intrPost();

-Tick writeConfig(PacketPtr pkt) override;
-Tick readConfig(PacketPtr pkt) override;
+Tick writeConfig(Addr offset, const void *data, Addr size) override;
+Tick readConfig(Addr offset, void *data, Addr size) override;

 Tick read(PacketPtr pkt) override;
 Tick write(PacketPtr pkt) override;
diff --git a/src/dev/net/i8254xGBe.cc b/src/dev/net/i8254xGBe.cc
index 1bfa85f..c09e9a8 100644
--- a/src/dev/net/i8254xGBe.cc
+++ b/src/dev/net/i8254xGBe.cc
@@ -148,11 +148,10 @@
 }

 Tick
-IGbE::writeConfig(PacketPtr pkt)
+IGbE::writeConfig(Addr offset, const void *data, Addr size)
 {
-int offset = pkt->getAddr() & PCI_CONFIG_SIZE;
 if (offset < PCI_DEVICE_SPECIFIC)
-PciDevice::writeConfig(pkt);
+PciDevice::writeConfig(offset, data, size);
 else
 panic("Device specific PCI config space not implemented.\n");

diff --git a/src/dev/net/i8254xGBe.hh b/src/dev/net/i8254xGBe.hh
index e6aa35d..9db4871 100644
--- a/src/dev/net/i8254xGBe.hh
+++ b/src/dev/net/i8254xGBe.hh
@@ -490,7 +490,7 @@
 Tick read(PacketPtr pkt) override;
 Tick write(PacketPtr pkt) override;

-Tick writeConfig(PacketPtr pkt) override;
+Tick writeConfig(Addr offset, const void *data, Addr size) override;

 bool ethRxPkt(EthPacketPtr packet);
 void ethTxDone();
diff --git a/src/dev/net/ns_gige.cc b/src/dev/net/ns_gige.cc
index b8c3705..434948d 100644
--- a/src/dev/net/ns_gige.cc
+++ b/src/dev/net/ns_gige.cc
@@ -148,11 +148,10 @@
  * This is to write to the PCI general configuration registers
  */
 Tick
-NSGigE::writeConfig(PacketPtr pkt)
+NSGigE::writeConfig(Addr offset, const void *data, Addr size)
 {
-int offset = pkt->getAddr() & PCI_CONFIG_SIZE;
 if (offset < PCI_DEVICE_SPECIFIC)
-PciDevice::writeConfig(pkt);
+PciDevice::writeConfig(offset, data, size);
 else
 panic("Device specific PCI config space not 

[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Eliminate the CPUID internal addr prefix.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57010 )



Change subject: arch-x86: Eliminate the CPUID internal addr prefix.
..

arch-x86: Eliminate the CPUID internal addr prefix.

This was not implemented and not used. Since CPUID is implemented using
a different mechanism, this will never be used and so can be deleted.

Change-Id: I75c3eb7ebe290c972d4dc6a4827172b136387431
---
M src/arch/amdgpu/common/tlb.cc
M src/arch/x86/tlb.cc
M src/arch/x86/x86_traits.hh
3 files changed, 14 insertions(+), 7 deletions(-)



diff --git a/src/arch/amdgpu/common/tlb.cc b/src/arch/amdgpu/common/tlb.cc
index f0f806a..0a00185 100644
--- a/src/arch/amdgpu/common/tlb.cc
+++ b/src/arch/amdgpu/common/tlb.cc
@@ -303,9 +303,7 @@
 Addr vaddr = req->getVaddr();
 Addr prefix = (vaddr >> 3) & IntAddrPrefixMask;

-if (prefix == IntAddrPrefixCPUID) {
-panic("CPUID memory space not yet implemented!\n");
-} else if (prefix == IntAddrPrefixMSR) {
+if (prefix == IntAddrPrefixMSR) {
 vaddr = (vaddr >> 3) & ~IntAddrPrefixMask;

 MiscRegIndex regNum;
diff --git a/src/arch/x86/tlb.cc b/src/arch/x86/tlb.cc
index 6f5d51d..024f880 100644
--- a/src/arch/x86/tlb.cc
+++ b/src/arch/x86/tlb.cc
@@ -200,9 +200,7 @@
 DPRINTF(TLB, "Addresses references internal memory.\n");
 Addr vaddr = req->getVaddr();
 Addr prefix = (vaddr >> 3) & IntAddrPrefixMask;
-if (prefix == IntAddrPrefixCPUID) {
-panic("CPUID memory space not yet implemented!\n");
-} else if (prefix == IntAddrPrefixMSR) {
+if (prefix == IntAddrPrefixMSR) {
 vaddr = (vaddr >> 3) & ~IntAddrPrefixMask;

 MiscRegIndex regNum;
diff --git a/src/arch/x86/x86_traits.hh b/src/arch/x86/x86_traits.hh
index e4a6392..71c4c87 100644
--- a/src/arch/x86/x86_traits.hh
+++ b/src/arch/x86/x86_traits.hh
@@ -60,7 +60,6 @@
 const int NumSysSegments = 4;

 const Addr IntAddrPrefixMask = 0xULL;
-const Addr IntAddrPrefixCPUID = 0x1ULL;
 const Addr IntAddrPrefixMSR = 0x2ULL;
 const Addr IntAddrPrefixIO = 0x3ULL;


--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/57010
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I75c3eb7ebe290c972d4dc6a4827172b136387431
Gerrit-Change-Number: 57010
Gerrit-PatchSet: 1
Gerrit-Owner: Gabe Black 
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s

[gem5-dev] Change in gem5/gem5[develop]: dev: Factor out the majority of read|writeConfig.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57015 )



Change subject: dev: Factor out the majority of read|writeConfig.
..

dev: Factor out the majority of read|writeConfig.

Most of what these two methods are doing in the GenericPciHost are
essentially required for any PCI host. They must be able to get the
device associated with a BDF, and then ask it to either read or write to
its config space. The DPRINTF is also non-trivial and generic.

This also uses structured binding to tidy up the moved code slightly.

Change-Id: I79187bbd2a56e4f40ebeda0e7c228e30c72f1d7d
---
M src/dev/pci/host.cc
M src/dev/pci/host.hh
2 files changed, 65 insertions(+), 35 deletions(-)



diff --git a/src/dev/pci/host.cc b/src/dev/pci/host.cc
index e7dea6c..ce86967 100644
--- a/src/dev/pci/host.cc
+++ b/src/dev/pci/host.cc
@@ -86,6 +86,48 @@
 return device != devices.end() ? device->second : nullptr;
 }

+Tick
+PciHost::writeConfig(const PciBusAddr , Addr offset, PacketPtr pkt)
+{
+const Addr size = pkt->getSize();
+const auto &[bus, dev, func] = bdf;
+pkt->makeAtomicResponse();
+DPRINTF(PciHost, "%02x:%02x.%i: write: offset=0x%x, size=0x%x\n",
+bus, dev, func, offset, size);
+
+PciDevice *const pci_dev = getDevice(bdf);
+panic_if(!pci_dev,
+ "%02x:%02x.%i: Write to config space on non-existent PCI  
device",

+ bus, dev, func);
+
+// @todo Remove this after testing
+pkt->headerDelay = pkt->payloadDelay = 0;
+
+return pci_dev->writeConfig(pkt);
+}
+
+Tick
+PciHost::readConfig(const PciBusAddr , Addr offset, PacketPtr pkt)
+{
+const auto &[bus, dev, func] = bdf;
+const Addr size = pkt->getSize();
+uint8_t *pkt_data = pkt->getPtr();
+pkt->makeAtomicResponse();
+
+DPRINTF(PciHost, "%02x:%02x.%i: read: offset=0x%x, size=0x%x\n",
+bus, dev, func, offset, size);
+
+PciDevice *const pci_dev = getDevice(bdf);
+if (pci_dev) {
+// @todo Remove this after testing
+pkt->headerDelay = pkt->payloadDelay = 0;
+return pci_dev->readConfig(pkt);
+} else {
+std::fill(pkt_data, pkt_data + size, 0xFF);
+return 0;
+}
+}
+
 PciHost::DeviceInterface::DeviceInterface(
 PciHost &_host,
 PciBusAddr _addr, PciIntPin interrupt_pin)
@@ -136,46 +178,15 @@
 Tick
 GenericPciHost::read(PacketPtr pkt)
 {
-const auto dev_addr(decodeAddress(pkt->getAddr() - confBase));
-const Addr size(pkt->getSize());
-
-DPRINTF(PciHost, "%02x:%02x.%i: read: offset=0x%x, size=0x%x\n",
-dev_addr.first.bus, dev_addr.first.dev, dev_addr.first.func,
-dev_addr.second,
-size);
-
-PciDevice *const pci_dev(getDevice(dev_addr.first));
-if (pci_dev) {
-// @todo Remove this after testing
-pkt->headerDelay = pkt->payloadDelay = 0;
-return pci_dev->readConfig(pkt);
-} else {
-uint8_t *pkt_data(pkt->getPtr());
-std::fill(pkt_data, pkt_data + size, 0xFF);
-pkt->makeAtomicResponse();
-return 0;
-}
+const auto &[bdf, offset] = decodeAddress(pkt->getAddr() - confBase);
+return readConfig(bdf, offset, pkt);
 }

 Tick
 GenericPciHost::write(PacketPtr pkt)
 {
-const auto dev_addr(decodeAddress(pkt->getAddr() - confBase));
-
-DPRINTF(PciHost, "%02x:%02x.%i: write: offset=0x%x, size=0x%x\n",
-dev_addr.first.bus, dev_addr.first.dev, dev_addr.first.func,
-dev_addr.second,
-pkt->getSize());
-
-PciDevice *const pci_dev(getDevice(dev_addr.first));
-panic_if(!pci_dev,
- "%02x:%02x.%i: Write to config space on non-existent PCI  
device\n",

- dev_addr.first.bus, dev_addr.first.dev, dev_addr.first.func);
-
-// @todo Remove this after testing
-pkt->headerDelay = pkt->payloadDelay = 0;
-
-return pci_dev->writeConfig(pkt);
+const auto &[bdf, offset] = decodeAddress(pkt->getAddr() - confBase);
+return writeConfig(bdf, offset, pkt);
 }

 AddrRangeList
diff --git a/src/dev/pci/host.hh b/src/dev/pci/host.hh
index 04302f5..f7bd459 100644
--- a/src/dev/pci/host.hh
+++ b/src/dev/pci/host.hh
@@ -240,6 +240,9 @@
  */
 const PciDevice *getDevice(const PciBusAddr ) const;

+Tick writeConfig(const PciBusAddr , Addr offset, PacketPtr pkt);
+Tick readConfig(const PciBusAddr , Addr offset, PacketPtr pkt);
+
   private:
 /** Currently registered PCI devices */
 std::map devices;

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/57015
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I79187bbd2a56e4f40ebeda0e7c228e30c72f1d7d
Gerrit-Change-Number: 57015
Gerrit-PatchSet: 1
Gerrit-Owner: Gabe Black 
Gerrit-MessageType: newchange

[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Handle MSRs with rdmsr/wrmsr microops.

2022-02-22 Thread Gabe Black (Gerrit) via gem5-dev
Gabe Black has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/57011 )



Change subject: arch-x86: Handle MSRs with rdmsr/wrmsr microops.
..

arch-x86: Handle MSRs with rdmsr/wrmsr microops.

MSRs are accessed with address-like indexes which map from a 32 bit
space down to the actual registers, but they are not actually loads or
stores in that they can't be accessed partially, or offset, or interact
with other loads/stores. It's much simpler to just map the MSR index to
a MiscReg index and then use the TC's readMiscReg and writeMiscReg
accessors.

Change-Id: I911bc54f5b7228af20a930234b34651a0042298d
---
M src/arch/amdgpu/common/tlb.cc
M src/arch/x86/isa/includes.isa
M src/arch/x86/microcode/system/msrs.ucode
M src/arch/x86/tlb.cc
M src/arch/x86/ucasmlib/arch/x86/microops/regop.py
M src/arch/x86/x86_traits.hh
6 files changed, 43 insertions(+), 38 deletions(-)



diff --git a/src/arch/amdgpu/common/tlb.cc b/src/arch/amdgpu/common/tlb.cc
index 0a00185..127aa02 100644
--- a/src/arch/amdgpu/common/tlb.cc
+++ b/src/arch/amdgpu/common/tlb.cc
@@ -303,22 +303,7 @@
 Addr vaddr = req->getVaddr();
 Addr prefix = (vaddr >> 3) & IntAddrPrefixMask;

-if (prefix == IntAddrPrefixMSR) {
-vaddr = (vaddr >> 3) & ~IntAddrPrefixMask;
-
-MiscRegIndex regNum;
-if (!msrAddrToIndex(regNum, vaddr))
-return std::make_shared(0);
-
-req->setLocalAccessor(
-[read,regNum](ThreadContext *tc, PacketPtr pkt)
-{
-return localMiscRegAccess(read, regNum, tc, pkt);
-}
-);
-
-return NoFault;
-} else if (prefix == IntAddrPrefixIO) {
+if (prefix == IntAddrPrefixIO) {
 // TODO If CPL > IOPL or in virtual mode, check the I/O  
permission

 // bitmap in the TSS.

diff --git a/src/arch/x86/isa/includes.isa b/src/arch/x86/isa/includes.isa
index 387319b..56b2f4c 100644
--- a/src/arch/x86/isa/includes.isa
+++ b/src/arch/x86/isa/includes.isa
@@ -117,6 +117,7 @@
 #include "arch/x86/memhelpers.hh"
 #include "arch/x86/pseudo_inst_abi.hh"
 #include "arch/x86/regs/misc.hh"
+#include "arch/x86/regs/msr.hh"
 #include "arch/x86/tlb.hh"
 #include "base/compiler.hh"
 #include "base/condcodes.hh"
diff --git a/src/arch/x86/microcode/system/msrs.ucode  
b/src/arch/x86/microcode/system/msrs.ucode

index 3ea64d3..deba922 100644
--- a/src/arch/x86/microcode/system/msrs.ucode
+++ b/src/arch/x86/microcode/system/msrs.ucode
@@ -38,8 +38,7 @@

 def macroop RDMSR
 {
-ld t2, intseg, [8, rcx, t0], "IntAddrPrefixMSR << 3", \
-dataSize=8, addressSize=8
+rdmsr t2, rcx
 mov rax, rax, t2, dataSize=4
 srli t2, t2, 32, dataSize=8
 mov rdx, rdx, t2, dataSize=4
@@ -51,8 +50,7 @@
 mov t2, t2, rax, dataSize=4
 slli t3, rdx, 32, dataSize=8
 or t2, t2, t3, dataSize=8
-st t2, intseg, [8, rcx, t0], "IntAddrPrefixMSR << 3", \
-dataSize=8, addressSize=8
+wrmsr rcx, t2
 };

 def macroop RDTSC
diff --git a/src/arch/x86/tlb.cc b/src/arch/x86/tlb.cc
index 024f880..0db17e5 100644
--- a/src/arch/x86/tlb.cc
+++ b/src/arch/x86/tlb.cc
@@ -200,23 +200,7 @@
 DPRINTF(TLB, "Addresses references internal memory.\n");
 Addr vaddr = req->getVaddr();
 Addr prefix = (vaddr >> 3) & IntAddrPrefixMask;
-if (prefix == IntAddrPrefixMSR) {
-vaddr = (vaddr >> 3) & ~IntAddrPrefixMask;
-
-MiscRegIndex regNum;
-if (!msrAddrToIndex(regNum, vaddr))
-return std::make_shared(0);
-
-req->setPaddr(req->getVaddr());
-req->setLocalAccessor(
-[read,regNum](ThreadContext *tc, PacketPtr pkt)
-{
-return localMiscRegAccess(read, regNum, tc, pkt);
-}
-);
-
-return NoFault;
-} else if (prefix == IntAddrPrefixIO) {
+if (prefix == IntAddrPrefixIO) {
 // TODO If CPL > IOPL or in virtual mode, check the I/O permission
 // bitmap in the TSS.

diff --git a/src/arch/x86/ucasmlib/arch/x86/microops/regop.py  
b/src/arch/x86/ucasmlib/arch/x86/microops/regop.py

index bca98e4..52c9ddf 100644
--- a/src/arch/x86/ucasmlib/arch/x86/microops/regop.py
+++ b/src/arch/x86/ucasmlib/arch/x86/microops/regop.py
@@ -1097,6 +1097,28 @@
 "DestReg = merge(DestReg, dest, ControlSrc1, dataSize);"
 big_code = rdcrCode % "DestReg = ControlSrc1 & mask(dataSize * 8);"

+class Wrmsr(RegOp):
+operand_types = (IntSrc1Op, IntSrc2Op)
+def __init__(self, msr_num, src):
+super().__init__(msr_num, src, flags=None, dataSize='8')
+code = '''
+MiscRegIndex reg_num;
+if (!X86ISA::msrAddrToIndex(reg_num, SrcReg1 & mask(32)))
+return std::make_shared(0);
+xc->tcBase()->setMiscReg(reg_num, SrcReg2);
+'''
+
+class Rdmsr(RegOp):
+operand_types = (IntDestOp, IntSrc1Op)
+ 

[gem5-dev] Change in gem5/gem5[develop]: cpu: Handle Request::NO_ACCESS flag in MinorCPU and O3CPU

2022-02-22 Thread Giacomo Travaglini (Gerrit) via gem5-dev
Giacomo Travaglini has submitted this change. (  
https://gem5-review.googlesource.com/c/public/gem5/+/56591 )


Change subject: cpu: Handle Request::NO_ACCESS flag in MinorCPU and O3CPU
..

cpu: Handle Request::NO_ACCESS flag in MinorCPU and O3CPU

The Request::NO_ACCESS flag instructs the cpu model to not issue
the request to the memory port.

While Atomic and Timing CPU models properly implement it [1], [2],

* MinorCPU is not looking at the flag
* O3CPU is looking at the flag only in case of a nested transaction
start/commit

This patch is extending NO_ACCESS support to all memory instructions.
This is achieved by using the localAccess callback in the Request object.

Handling of nested hardware transactions in the O3 LSQUnit is moved within
the local accessor callback

[1]: https://github.com/gem5/gem5/blob/v21.1.0.2/\
src/cpu/simple/timing.cc#L318
[2]: https://github.com/gem5/gem5/blob/v21.1.0.2/\
src/cpu/simple/atomic.cc#L396

Change-Id: Ifd5b388c53ead4fe358aa35d2197c12f1c5bb4f2
Signed-off-by: Giacomo Travaglini 
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56591
Tested-by: kokoro 
Reviewed-by: ZHENGRONG WANG 
Reviewed-by: Jason Lowe-Power 
Maintainer: Jason Lowe-Power 
---
M src/cpu/minor/lsq.cc
M src/cpu/o3/lsq.cc
M src/cpu/o3/lsq_unit.cc
3 files changed, 61 insertions(+), 33 deletions(-)

Approvals:
  Jason Lowe-Power: Looks good to me, approved; Looks good to me, approved
  ZHENGRONG WANG: Looks good to me, but someone else must approve
  kokoro: Regressions pass




diff --git a/src/cpu/minor/lsq.cc b/src/cpu/minor/lsq.cc
index e4c000b..17ae290 100644
--- a/src/cpu/minor/lsq.cc
+++ b/src/cpu/minor/lsq.cc
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2013-2014,2017-2018,2020 ARM Limited
+ * Copyright (c) 2013-2014,2017-2018,2020-2021 Arm Limited
  * All rights reserved
  *
  * The license below extends only to copyright in the software and shall
@@ -1651,6 +1651,14 @@
 inst->pc->instAddr(), std::move(amo_op));
 request->request->setByteEnable(byte_enable);

+/* If the request is marked as NO_ACCESS, setup a local access
+ * doing nothing */
+if (flags.isSet(Request::NO_ACCESS)) {
+assert(!request->request->isLocalAccess());
+request->request->setLocalAccessor(
+[] (ThreadContext *tc, PacketPtr pkt) { return Cycles(1); });
+}
+
 requests.push(request);
 inst->inLSQ = true;
 request->startAddrTranslation();
diff --git a/src/cpu/o3/lsq.cc b/src/cpu/o3/lsq.cc
index 72ecc52..c932a93 100644
--- a/src/cpu/o3/lsq.cc
+++ b/src/cpu/o3/lsq.cc
@@ -1089,6 +1089,23 @@
 _inst->pcState().instAddr(), _inst->contextId(),
 std::move(_amo_op));
 req->setByteEnable(byte_enable);
+
+/* If the request is marked as NO_ACCESS, setup a local access */
+if (_flags.isSet(Request::NO_ACCESS)) {
+req->setLocalAccessor(
+[this, req](gem5::ThreadContext *tc, PacketPtr pkt) ->  
Cycles

+{
+if ((req->isHTMStart() || req->isHTMCommit())) {
+auto& inst = this->instruction();
+assert(inst->inHtmTransactionalState());
+pkt->setHtmTransactional(
+inst->getHtmTransactionUid());
+}
+return Cycles(1);
+}
+);
+}
+
 _reqs.push_back(req);
 }
 }
diff --git a/src/cpu/o3/lsq_unit.cc b/src/cpu/o3/lsq_unit.cc
index 1541d2c..856cef3 100644
--- a/src/cpu/o3/lsq_unit.cc
+++ b/src/cpu/o3/lsq_unit.cc
@@ -1336,7 +1336,6 @@

 if (request->mainReq()->isLocalAccess()) {
 assert(!load_inst->memData);
-assert(!load_inst->inHtmTransactionalState());
 load_inst->memData = new uint8_t[MaxDataBytes];

 gem5::ThreadContext *thread = cpu->tcBase(lsqID);
@@ -1351,37 +1350,6 @@
 return NoFault;
 }

-// hardware transactional memory
-if (request->mainReq()->isHTMStart() ||  
request->mainReq()->isHTMCommit())

-{
-// don't want to send nested transactionStarts and
-// transactionStops outside of core, e.g. to Ruby
-if (request->mainReq()->getFlags().isSet(Request::NO_ACCESS)) {
-Cycles delay(0);
-PacketPtr data_pkt =
-new Packet(request->mainReq(), MemCmd::ReadReq);
-
-// Allocate memory if this is the first time a load is issued.
-if (!load_inst->memData) {
-load_inst->memData =
-new uint8_t[request->mainReq()->getSize()];
-// sanity checks espect zero in request's data
-memset(load_inst->memData, 0,  
request->mainReq()->getSize());

-}
-
-data_pkt->dataStatic(load_inst->memData);
-if (load_inst->inHtmTransactionalState()) {
-