date:20210424

[gem5-dev] Change in gem5/gem5[develop]: configs: apu_se.py hotfix

2021-04-24 Thread Matthew Poremba (Gerrit) via gem5-dev

Matthew Poremba has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/44825 )


Change subject: configs: apu_se.py hotfix
..

configs: apu_se.py hotfix

Missed two optparse -> argparse changes. Square runs.

Change-Id: I3a652380e4c4202a376413602fa3698a28ff9206
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/44825
Maintainer: Matthew Poremba 
Reviewed-by: Daniel Carvalho 
Tested-by: kokoro 
---
M configs/example/apu_se.py
1 file changed, 2 insertions(+), 2 deletions(-)

Approvals:
  Daniel Carvalho: Looks good to me, approved
  Matthew Poremba: Looks good to me, approved
  kokoro: Regressions pass



diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py
index b0da8df..f779df3 100644
--- a/configs/example/apu_se.py
+++ b/configs/example/apu_se.py
@@ -181,14 +181,14 @@
 #-- 0   1   x   UC_L2   (Uncached_GL2)
 #-- 0   0   x   UC_All  (Uncached_All_Load)
 # default value: 5/C_RO_S (only allow caching in GL2 for read. Shared)
-parser.add_argument("--m-type", type='int', default=5,
+parser.add_argument("--m-type", type=int, default=5,
 help="Default Mtype for GPU memory accesses.  This is the "
 "value used for all memory accesses on an APU and is the "
 "default mode for dGPU unless explicitly overwritten by "
 "the driver on a per-page basis.  Valid values are "
 "between 0-7")

-parser.add_argument("--gfx-version", type="string", default='gfx801',
+parser.add_argument("--gfx-version", type=str, default='gfx801',
 help="Gfx version for gpu: gfx801, gfx803, gfx900")

 Ruby.define_options(parser)
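
For readers unfamiliar with the difference behind this hotfix: optparse accepted type names as strings ('int', "string"), whereas argparse expects type to be a callable such as int or str. The following is a minimal standard-library sketch of that difference. The parsers built here are stand-ins for illustration, not the parser used in apu_se.py, and the helper names build_parser and try_string_type are hypothetical.

```python
import argparse


def build_parser():
    """Build a stand-in parser with the two corrected options."""
    parser = argparse.ArgumentParser()
    # argparse wants a callable for `type`, not a type name string.
    parser.add_argument("--m-type", type=int, default=5)
    parser.add_argument("--gfx-version", type=str, default="gfx801")
    return parser


def try_string_type():
    """Try the optparse-style spelling and return the error text, if any."""
    parser = argparse.ArgumentParser()
    try:
        # optparse-style type name; argparse rejects a non-callable type.
        parser.add_argument("--m-type", type="int", default=5)
    except (TypeError, ValueError) as err:
        return str(err)
    return None


if __name__ == "__main__":
    args = build_parser().parse_args(["--m-type", "6"])
    print(args.m_type, args.gfx_version)
    print("string type rejected:", try_string_type() is not None)
```

Note that argparse converts the option name --m-type to the attribute m_type, which is how the script later reads args.m_type.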

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/44825
To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I3a652380e4c4202a376413602fa3698a28ff9206
Gerrit-Change-Number: 44825
Gerrit-PatchSet: 2
Gerrit-Owner: Matthew Poremba 
Gerrit-Reviewer: Daniel Carvalho 
Gerrit-Reviewer: Jason Lowe-Power 
Gerrit-Reviewer: Matthew Poremba 
Gerrit-Reviewer: kokoro 
Gerrit-MessageType: merged
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org

[gem5-dev] Change in gem5/gem5[develop]: configs: apu_se.py hotfix

2021-04-24 Thread Matthew Poremba (Gerrit) via gem5-dev

Matthew Poremba has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/44825 )



Change subject: configs: apu_se.py hotfix
..

configs: apu_se.py hotfix

Missed two optparse -> argparse changes. Square runs.

Change-Id: I3a652380e4c4202a376413602fa3698a28ff9206
---
M configs/example/apu_se.py
1 file changed, 2 insertions(+), 2 deletions(-)



diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py
index b0da8df..f779df3 100644
--- a/configs/example/apu_se.py
+++ b/configs/example/apu_se.py
@@ -181,14 +181,14 @@
 #-- 0   1   x   UC_L2   (Uncached_GL2)
 #-- 0   0   x   UC_All  (Uncached_All_Load)
 # default value: 5/C_RO_S (only allow caching in GL2 for read. Shared)
-parser.add_argument("--m-type", type='int', default=5,
+parser.add_argument("--m-type", type=int, default=5,
 help="Default Mtype for GPU memory accesses.  This is the "
 "value used for all memory accesses on an APU and is the "
 "default mode for dGPU unless explicitly overwritten by "
 "the driver on a per-page basis.  Valid values are "
 "between 0-7")

-parser.add_argument("--gfx-version", type="string", default='gfx801',
+parser.add_argument("--gfx-version", type=str, default='gfx801',
 help="Gfx version for gpu: gfx801, gfx803, gfx900")

 Ruby.define_options(parser)

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/44825
To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I3a652380e4c4202a376413602fa3698a28ff9206
Gerrit-Change-Number: 44825
Gerrit-PatchSet: 1
Gerrit-Owner: Matthew Poremba 
Gerrit-MessageType: newchange

[gem5-dev] Change in gem5/gem5[develop]: configs, gpu-compute: Add option to specify gfx version

2021-04-24 Thread Matthew Poremba (Gerrit) via gem5-dev

Matthew Poremba has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/42217 )


Change subject: configs, gpu-compute: Add option to specify gfx version
..

configs, gpu-compute: Add option to specify gfx version

Currently uses gfx801, gfx803, and gfx900 for Carrizo, Fiji,
and Vega, respectively.

Change-Id: I62758914b6a60f16dd4f2141a23c0a9141a4e1a0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42217
Maintainer: Matthew Poremba 
Maintainer: Matt Sinclair 
Reviewed-by: Matt Sinclair 
Tested-by: kokoro 
---
M configs/example/apu_se.py
M configs/example/hsaTopology.py
M src/gpu-compute/GPU.py
M src/gpu-compute/gpu_compute_driver.cc
M src/gpu-compute/gpu_compute_driver.hh
5 files changed, 248 insertions(+), 10 deletions(-)

Approvals:
  Matt Sinclair: Looks good to me, approved; Looks good to me, approved
  Matthew Poremba: Looks good to me, approved
  kokoro: Regressions pass



diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py
index da49efa..b0da8df 100644
--- a/configs/example/apu_se.py
+++ b/configs/example/apu_se.py
@@ -188,6 +188,9 @@
 "the driver on a per-page basis.  Valid values are "
 "between 0-7")

+parser.add_argument("--gfx-version", type="string", default='gfx801',
+help="Gfx version for gpu: gfx801, gfx803, gfx900")
+
 Ruby.define_options(parser)

 # add TLB options to the parser
@@ -430,6 +433,7 @@

 # HSA kernel mode driver
 gpu_driver = GPUComputeDriver(filename = "kfd", isdGPU = args.dgpu,
+  gfxVersion = args.gfx_version,
   dGPUPoolID = 1, m_type = args.m_type)

 # Creating the GPU kernel launching components: that is the HSA
@@ -667,8 +671,15 @@
 # Create the /sys/devices filesystem for the simulator so that the HSA Runtime
 # knows what type of GPU hardware we are simulating
 if args.dgpu:
-hsaTopology.createFijiTopology(args)
+assert (args.gfx_version in ['gfx803', 'gfx900']),\
+"Incorrect gfx version for dGPU"
+if args.gfx_version == 'gfx803':
+hsaTopology.createFijiTopology(args)
+elif args.gfx_version == 'gfx900':
+hsaTopology.createVegaTopology(args)
 else:
+assert (args.gfx_version in ['gfx801']),\
+"Incorrect gfx version for APU"
 hsaTopology.createCarrizoTopology(args)

 m5.ticks.setGlobalFrequency('1THz')
diff --git a/configs/example/hsaTopology.py b/configs/example/hsaTopology.py
index a5e0d44..51585de 100644
--- a/configs/example/hsaTopology.py
+++ b/configs/example/hsaTopology.py
@@ -49,6 +49,183 @@
 rmtree(path)
 makedirs(path)

+# This fakes out a dGPU setup so the runtime operates correctly.  The spoofed
+# system has a single dGPU and a single socket CPU.  Note that more complex
+# topologies (multi-GPU, multi-socket CPUs) need to have a different setup
+# here or the runtime won't be able to issue Memcpies from one node to another.
+#
+# TODO: There is way too much hardcoded here.  It doesn't effect anything in
+# our current ROCm stack (1.6), but it is highly possible that it will in the
+# future.  We might need to scrub through this and extract the appropriate
+# fields from the simulator in the future.
+def createVegaTopology(options):
+topology_dir = joinpath(m5.options.outdir, \
+'fs/sys/devices/virtual/kfd/kfd/topology')
+remake_dir(topology_dir)
+
+amdgpu_dir = joinpath(m5.options.outdir, \
+'fs/sys/module/amdgpu/parameters')
+remake_dir(amdgpu_dir)
+
+pci_ids_dir = joinpath(m5.options.outdir, \
+'fs/usr/share/hwdata/')
+remake_dir(pci_ids_dir)
+
+# Vega reported VM size in GB.  Used to reserve an allocation from CPU
+# to implement SVM (i.e. GPUVM64 pointers and X86 pointers agree)
+file_append((amdgpu_dir, 'vm_size'), 256)
+
+# Ripped from real Vega platform to appease KMT version checks
+file_append((topology_dir, 'generation_id'), 2)
+
+# Set up system properties.  Regiter as ast-rocm server
+sys_prop = 'platform_oem 35498446626881\n' + \
+   'platform_id 71791775140929\n' + \
+   'platform_rev 2\n'
+file_append((topology_dir, 'system_properties'), sys_prop)
+
+# Populate the topology tree
+# Our dGPU system is two nodes.  Node 0 is a CPU and Node 1 is a dGPU
+node_dir = joinpath(topology_dir, 'nodes/0')
+remake_dir(node_dir)
+
+# Register as a CPU
+file_append((node_dir, 'gpu_id'), 0)
+file_append((node_dir, 'name'), '')
+
+# CPU links.  Only thing that matters is we tell the runtime that GPU is
+# connected through PCIe to CPU socket 0.
+io_links = 1
+io_dir = joinpath(node_dir, 'io_links/0')
+remake_dir(io_dir)
+io_prop = 'type 2\n'+ \
+  'version_major 0\n'   + \
+  'version_minor 0\n'
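
The version check added to apu_se.py in this change can be summarised with a small standalone sketch. It assumes only what the diff shows: gfx803 and gfx900 are valid for a dGPU, gfx801 is valid for an APU, and each version selects one hsaTopology helper. The table and the function select_topology are hypothetical names for illustration, and the helpers are represented by their names rather than called.

```python
# Hypothetical lookup table; the values are the hsaTopology helper names
# that appear in the diff, kept as strings so the sketch is self-contained.
TOPOLOGY_FOR_GFX = {
    "gfx801": "createCarrizoTopology",
    "gfx803": "createFijiTopology",
    "gfx900": "createVegaTopology",
}

DGPU_VERSIONS = {"gfx803", "gfx900"}
APU_VERSIONS = {"gfx801"}


def select_topology(gfx_version, dgpu):
    """Return the topology helper name for a gfx version, or raise."""
    allowed = DGPU_VERSIONS if dgpu else APU_VERSIONS
    if gfx_version not in allowed:
        kind = "dGPU" if dgpu else "APU"
        raise ValueError(f"Incorrect gfx version for {kind}: {gfx_version}")
    return TOPOLOGY_FOR_GFX[gfx_version]


if __name__ == "__main__":
    print(select_topology("gfx900", dgpu=True))
    print(select_topology("gfx801", dgpu=False))
```

The sketch raises ValueError where the script uses assert; that is a design choice of the example, not of the change.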

[gem5-dev] Change in gem5/gem5[develop]: gpu-compute, dev-hsa: Fix doorbell for gfx900

2021-04-24 Thread Matthew Poremba (Gerrit) via gem5-dev

Matthew Poremba has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/42220 )


Change subject: gpu-compute, dev-hsa: Fix doorbell for gfx900
..

gpu-compute, dev-hsa: Fix doorbell for gfx900

gfx9 changed the size of the doorbell and the value of the write
index when the doorbell is rung. The --gfx-version flag is used to
set the doorbell size.

Change-Id: I48e4e57dc1c80a08133b17cdf3f92533b541f7c3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42220
Tested-by: kokoro 
Reviewed-by: Matt Sinclair 
Maintainer: Matt Sinclair 
---
M src/dev/hsa/hsa_packet_processor.cc
M src/dev/hsa/hsa_packet_processor.hh
M src/dev/hsa/hw_scheduler.cc
M src/dev/hsa/hw_scheduler.hh
M src/gpu-compute/gpu_compute_driver.cc
M src/gpu-compute/gpu_compute_driver.hh
6 files changed, 47 insertions(+), 31 deletions(-)

Approvals:
  Matt Sinclair: Looks good to me, approved; Looks good to me, approved
  kokoro: Regressions pass



diff --git a/src/dev/hsa/hsa_packet_processor.cc b/src/dev/hsa/hsa_packet_processor.cc
index 9721d0e..cd6ef5a 100644
--- a/src/dev/hsa/hsa_packet_processor.cc
+++ b/src/dev/hsa/hsa_packet_processor.cc
@@ -91,22 +91,22 @@
 }

 void
-HSAPacketProcessor::unsetDeviceQueueDesc(uint64_t queue_id)
+HSAPacketProcessor::unsetDeviceQueueDesc(uint64_t queue_id, int doorbellSize)
 {
-hwSchdlr->unregisterQueue(queue_id);
+hwSchdlr->unregisterQueue(queue_id, doorbellSize);
 }

 void
 HSAPacketProcessor::setDeviceQueueDesc(uint64_t hostReadIndexPointer,
uint64_t basePointer,
uint64_t queue_id,
-   uint32_t size)
+   uint32_t size, int doorbellSize)
 {
 DPRINTF(HSAPacketProcessor,
  "%s:base = %p, qID = %d, ze = %d\n", __FUNCTION__,
  (void *)basePointer, queue_id, size);
 hwSchdlr->registerNewQueue(hostReadIndexPointer,
-   basePointer, queue_id, size);
+   basePointer, queue_id, size, doorbellSize);
 }

 AddrRangeList
@@ -133,7 +133,16 @@
   "%s: write of size %d to reg-offset %d (0x%x)\n",
   __FUNCTION__, pkt->getSize(), daddr, daddr);

-uint32_t doorbell_reg = pkt->getLE<uint32_t>();
+int doorbellSize = gpu_device->driver()->doorbellSize();
+assert(doorbellSize == pkt->getSize());
+
+uint64_t doorbell_reg(0);
+if (pkt->getSize() == 8)
+doorbell_reg = pkt->getLE<uint64_t>() + 1;
+else if (pkt->getSize() == 4)
+doorbell_reg = pkt->getLE<uint32_t>();
+else
+fatal("invalid db size");

 DPRINTF(HSAPacketProcessor,
 "%s: write data 0x%x to offset %d (0x%x)\n",
diff --git a/src/dev/hsa/hsa_packet_processor.hh b/src/dev/hsa/hsa_packet_processor.hh
index 7e8f6a5..fe71612 100644
--- a/src/dev/hsa/hsa_packet_processor.hh
+++ b/src/dev/hsa/hsa_packet_processor.hh
@@ -331,8 +331,8 @@
 void setDeviceQueueDesc(uint64_t hostReadIndexPointer,
 uint64_t basePointer,
 uint64_t queue_id,
-uint32_t size);
-void unsetDeviceQueueDesc(uint64_t queue_id);
+uint32_t size, int doorbellSize);
+void unsetDeviceQueueDesc(uint64_t queue_id, int doorbellSize);
 void setDevice(GPUCommandProcessor * dev);
 void updateReadIndex(int, uint32_t);
 void getCommandsFromHost(int pid, uint32_t rl_idx);
diff --git a/src/dev/hsa/hw_scheduler.cc b/src/dev/hsa/hw_scheduler.cc
index b52e592..00dff99 100644
--- a/src/dev/hsa/hw_scheduler.cc
+++ b/src/dev/hsa/hw_scheduler.cc
@@ -84,18 +84,13 @@
 HWScheduler::registerNewQueue(uint64_t hostReadIndexPointer,
   uint64_t basePointer,
   uint64_t queue_id,
-  uint32_t size)
+  uint32_t size, int doorbellSize)
 {
 assert(queue_id < MAX_ACTIVE_QUEUES);
 // Map queue ID to doorbell.
 // We are only using offset to pio base address as doorbell
 // We use the same mapping function used by hsa runtime to do this mapping
-//
-// Originally
-// #define VOID_PTR_ADD32(ptr,n)
-// (void*)((uint32_t*)(ptr) + n)/*ptr + offset*/
-// (Addr)VOID_PTR_ADD32(0, queue_id)
-Addr db_offset = sizeof(uint32_t)*queue_id;
+Addr db_offset = queue_id * doorbellSize;
 if (dbMap.find(db_offset) != dbMap.end()) {
 panic("Creating an already existing queue (queueID %d)", queue_id);
 }
@@ -318,7 +313,7 @@
 }

 void
-HWScheduler::write(Addr db_addr, uint32_t doorbell_reg)
+HWScheduler::write(Addr db_addr, uint64_t doorbell_reg)
 {
 auto dbmap_iter = dbMap.find(db_addr);
 if (dbmap_iter == dbMap.end()) {
@@ -335,17 +330,9 @@
 }

 void
-HWScheduler::unregisterQueue(uint64_t queue_id)
+HWScheduler::unregisterQueue(uint64_t
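
The two doorbell calculations in this change can be sketched in a few lines of Python. The sketch follows what the hunks above show: the doorbell offset is queue_id * doorbellSize, an 8-byte doorbell write yields the written value plus one, and a 4-byte write yields the value unchanged. The function names are hypothetical, and the byte strings stand in for the packet payload.

```python
def doorbell_offset(queue_id, doorbell_size):
    """Offset of a queue's doorbell from the PIO base address."""
    return queue_id * doorbell_size


def decode_doorbell(raw_bytes):
    """Decode a little-endian doorbell write into a write index."""
    size = len(raw_bytes)
    value = int.from_bytes(raw_bytes, "little")
    if size == 8:
        # 8-byte doorbells: the diff adds one to the written value.
        return value + 1
    if size == 4:
        # 4-byte doorbells: the written value is used as is.
        return value
    raise ValueError("invalid db size")


if __name__ == "__main__":
    print(doorbell_offset(3, 4), doorbell_offset(3, 8))
    print(decode_doorbell((7).to_bytes(4, "little")))
    print(decode_doorbell((7).to_bytes(8, "little")))
```

Because the offset depends on the doorbell size, registering and unregistering a queue must use the same size, which is consistent with doorbellSize being threaded through both calls in the diff.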

[gem5-dev] Change in gem5/gem5[develop]: dev-hsa: Fix doorbell mmap for APU

2021-04-24 Thread Matthew Poremba (Gerrit) via gem5-dev

Matthew Poremba has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/42218 )


Change subject: dev-hsa: Fix doorbell mmap for APU
..

dev-hsa: Fix doorbell mmap for APU

Commit id ef44dc9a removed mmap-based doorbell allocation since dGPUs
use ioctls instead.  However, APUs still need this to work correctly.
Add that logic back in, as well as some new logic to distinguish doorbell
mmaps from other types. Also add some additional commentary regarding
Event page mmaps.

Change-Id: I8507ac85c8f07886d0fb4f95bde5e18a7790eab8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42218
Tested-by: kokoro 
Reviewed-by: Matt Sinclair 
Reviewed-by: Matthew Poremba 
Maintainer: Matt Sinclair 
---
M src/dev/hsa/hsa_driver.cc
M src/dev/hsa/kfd_event_defines.h
2 files changed, 42 insertions(+), 38 deletions(-)

Approvals:
  Matthew Poremba: Looks good to me, approved
  Matt Sinclair: Looks good to me, but someone else must approve; Looks good to me, approved
  kokoro: Regressions pass



diff --git a/src/dev/hsa/hsa_driver.cc b/src/dev/hsa/hsa_driver.cc
index f2db436..40f00f3 100644
--- a/src/dev/hsa/hsa_driver.cc
+++ b/src/dev/hsa/hsa_driver.cc
@@ -70,48 +70,41 @@
 HSADriver::mmap(ThreadContext *tc, Addr start, uint64_t length, int prot,
 int tgt_flags, int tgt_fd, off_t offset)
 {
-// Is this a signal event mmap
-bool is_event_mmap = false;
-// If addr == 0, then we may need to do mmap.
-bool should_mmap = (start == 0);
 auto process = tc->getProcessPtr();
 auto mem_state = process->memState;
-// Check if mmap is for signal events first
-if (((offset >> PAGE_SHIFT) & KFD_MMAP_TYPE_MASK) ==
-KFD_MMAP_TYPE_EVENTS) {
-is_event_mmap = true;
-DPRINTF(HSADriver, "amdkfd mmap for events(start: %p, length: 0x%x,"
-"offset: 0x%x,  )\n", start, length, offset);
-panic_if(start != 0,
- "Start address should be provided by KFD\n");
-panic_if(length != 8 * KFD_SIGNAL_EVENT_LIMIT,
- "Requested length %d, expected length %d; length mismatch\n",
-  length, 8 * KFD_SIGNAL_EVENT_LIMIT);
-// For signal event, do mmap only is eventPage is uninitialized
-should_mmap = (!eventPage);
-} else {
-DPRINTF(HSADriver, "amdkfd doorbell mmap (start: %p, length: 0x%x,"
-"offset: 0x%x)\n", start, length, offset);
-}

-// Extend global mmap region if necessary.
-if (should_mmap) {
-// Assume mmap grows down, as in x86 Linux
-start = mem_state->getMmapEnd() - length;
-mem_state->setMmapEnd(start);
-}
+Addr pg_off = offset >> PAGE_SHIFT;
+Addr mmap_type = pg_off & KFD_MMAP_TYPE_MASK;
+DPRINTF(HSADriver, "amdkfd mmap (start: %p, length: 0x%x,"
+"offset: 0x%x)\n", start, length, offset);

-if (is_event_mmap) {
- if (should_mmap) {
- eventPage = start;
- }
-} else {
-// Now map this virtual address to our PIO doorbell interface
-// in the page tables (non-cacheable)
-process->pTable->map(start, device->hsaPacketProc().pioAddr,
- length, false);
-
-DPRINTF(HSADriver, "amdkfd doorbell mapped to %xp\n", start);
+switch (mmap_type) {
+case KFD_MMAP_TYPE_DOORBELL:
+DPRINTF(HSADriver, "amdkfd mmap type DOORBELL offset\n");
+start = mem_state->extendMmap(length);
+process->pTable->map(start, device->hsaPacketProc().pioAddr,
+length, false);
+break;
+case KFD_MMAP_TYPE_EVENTS:
+DPRINTF(HSADriver, "amdkfd mmap type EVENTS offset\n");
+panic_if(start != 0,
+ "Start address should be provided by KFD\n");
+panic_if(length != 8 * KFD_SIGNAL_EVENT_LIMIT,
+ "Requested length %d, expected length %d; length "
+ "mismatch\n", length, 8 * KFD_SIGNAL_EVENT_LIMIT);
+/**
+ * We don't actually access these pages.  We just need to reserve
+ * some VA space.  See commit id 5ce8abce for details on how
+ * events are currently implemented.
+ */
+ if (!eventPage) {
+eventPage = mem_state->extendMmap(length);
+start = eventPage;
+ }
+ break;
+default:
+warn_once("Unrecognized kfd mmap type %llx\n", mmap_type);
+break;
 }

 return start;
@@ -133,6 +126,9 @@
 fatal("%s: Exceeded maximum number of HSA queues allowed\n", name());
 }

+args->doorbell_offset = (KFD_MMAP_TYPE_DOORBELL |
+KFD_MMAP_GPU_ID(args->gpu_id)) << PAGE_SHIFT;
+
 args->queue_id = queueId++;
 auto _pp = device->hsaPacketProc();
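
The mmap-type dispatch above relies on the type being encoded in the high bits of the mmap offset. The sketch below mirrors the arithmetic in the diff (shift the offset right by PAGE_SHIFT, mask out the type, and build the doorbell offset from the type and GPU id). The numeric constants are assumptions chosen for illustration; the real definitions live in kfd_event_defines.h, which is not shown in this message, and may differ. The function names are hypothetical.

```python
# Assumed values for illustration only; see kfd_event_defines.h for the
# definitions actually used by the simulator.
PAGE_SHIFT = 12
KFD_MMAP_TYPE_SHIFT = 62 - PAGE_SHIFT
KFD_MMAP_TYPE_MASK = 0x3 << KFD_MMAP_TYPE_SHIFT
KFD_MMAP_TYPE_DOORBELL = 0x3 << KFD_MMAP_TYPE_SHIFT
KFD_MMAP_TYPE_EVENTS = 0x2 << KFD_MMAP_TYPE_SHIFT
KFD_MMAP_GPU_ID_SHIFT = 46 - PAGE_SHIFT
KFD_MMAP_GPU_ID_MASK = 0xFFFF << KFD_MMAP_GPU_ID_SHIFT


def kfd_mmap_gpu_id(gpu_id):
    """Place a GPU id in its (assumed) field of the page offset."""
    return (gpu_id << KFD_MMAP_GPU_ID_SHIFT) & KFD_MMAP_GPU_ID_MASK


def make_doorbell_offset(gpu_id):
    """Build the offset handed back to user space for a doorbell mmap."""
    return (KFD_MMAP_TYPE_DOORBELL | kfd_mmap_gpu_id(gpu_id)) << PAGE_SHIFT


def classify_mmap(offset):
    """Classify an mmap offset the way the switch statement does."""
    pg_off = offset >> PAGE_SHIFT
    mmap_type = pg_off & KFD_MMAP_TYPE_MASK
    if mmap_type == KFD_MMAP_TYPE_DOORBELL:
        return "doorbell"
    if mmap_type == KFD_MMAP_TYPE_EVENTS:
        return "events"
    return "unknown"


if __name__ == "__main__":
    print(classify_mmap(make_doorbell_offset(0x1234)))
    print(classify_mmap(KFD_MMAP_TYPE_EVENTS << PAGE_SHIFT))
```

With an encoding of this kind, the driver can tell a doorbell mmap from an event-page mmap using the offset alone, without extra state.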

[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Implement per-request MTYPEs

2021-04-24 Thread Matthew Poremba (Gerrit) via gem5-dev

Matthew Poremba has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/42216 )


Change subject: gpu-compute: Implement per-request MTYPEs
..

gpu-compute: Implement per-request MTYPEs

GPU MTYPE is currently set using a global config passed to the
PACoalescer.  This patch enables MTYPE to be set by the shader on a
per-request basis.  In real hardware, the MTYPE is extracted from a
GPUVM PTE during address translation.  However, our current simulator
only models x86 page tables which do not have the appropriate bits for
GPU MTYPES.  Rather than hacking non-x86 bits into our x86 page table
models, this patch instead keeps an interval tree of all pages that
request custom MTYPES in the driver itself.  This is currently
only used to map host pages to the GPU as uncacheable, but is easily
extensible to other MTYPES.
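
As a rough illustration of the idea described above (not the driver's actual data structure), a per-page MTYPE override can be modelled as a sorted list of address ranges with a fallback to the global default. The class MtypeRanges and its methods are hypothetical names; the sketch assumes ranges never overlap and uses the standard bisect module instead of a true interval tree.

```python
import bisect


class MtypeRanges:
    """Address ranges with a custom MTYPE, plus a default for the rest."""

    def __init__(self, default_mtype):
        self.default_mtype = default_mtype
        self._starts = []  # sorted range start addresses
        self._ranges = []  # parallel list of (start, end, mtype), end exclusive

    def set_mtype(self, start, length, mtype):
        # Sketch only: assumes the new range does not overlap existing ones.
        idx = bisect.bisect_left(self._starts, start)
        self._starts.insert(idx, start)
        self._ranges.insert(idx, (start, start + length, mtype))

    def lookup(self, addr):
        # Find the last range starting at or before addr, then bounds-check.
        idx = bisect.bisect_right(self._starts, addr) - 1
        if idx >= 0:
            start, end, mtype = self._ranges[idx]
            if start <= addr < end:
                return mtype
        return self.default_mtype


if __name__ == "__main__":
    ranges = MtypeRanges(default_mtype=6)
    ranges.set_mtype(0x1000, 0x2000, 0)  # mark two pages with an uncached MTYPE
    print(ranges.lookup(0x1800), ranges.lookup(0x4000))
```

Each memory request would consult such a structure with its virtual address and use the default MTYPE when no range matches.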

Change-Id: I7daab0ffae42084b9131a67c85cd0aa4bbbfc8d6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42216
Maintainer: Matthew Poremba 
Reviewed-by: Matt Sinclair 
Tested-by: kokoro 
---
M configs/example/apu_se.py
M src/gpu-compute/GPU.py
M src/gpu-compute/compute_unit.cc
M src/gpu-compute/gpu_command_processor.cc
M src/gpu-compute/gpu_command_processor.hh
M src/gpu-compute/gpu_compute_driver.cc
M src/gpu-compute/gpu_compute_driver.hh
M src/mem/request.hh
M src/sim/mem_state.hh
9 files changed, 274 insertions(+), 23 deletions(-)

Approvals:
  Matt Sinclair: Looks good to me, approved
  Matthew Poremba: Looks good to me, approved
  kokoro: Regressions pass



diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py
index 01213bd..da49efa 100644
--- a/configs/example/apu_se.py
+++ b/configs/example/apu_se.py
@@ -173,6 +173,21 @@
 "transfered from host to device memory using runtime calls "
 "that copy data over a PCIe-like IO bus.")

+# Mtype option
+#-- 1   1   1   C_RW_S  (Cached-ReadWrite-Shared)
+#-- 1   1   0   C_RW_US (Cached-ReadWrite-Unshared)
+#-- 1   0   1   C_RO_S  (Cached-ReadOnly-Shared)
+#-- 1   0   0   C_RO_US (Cached-ReadOnly-Unshared)
+#-- 0   1   x   UC_L2   (Uncached_GL2)
+#-- 0   0   x   UC_All  (Uncached_All_Load)
+# default value: 5/C_RO_S (only allow caching in GL2 for read. Shared)
+parser.add_argument("--m-type", type='int', default=5,
+help="Default Mtype for GPU memory accesses.  This is the "
+"value used for all memory accesses on an APU and is the "
+"default mode for dGPU unless explicitly overwritten by "
+"the driver on a per-page basis.  Valid values are "
+"between 0-7")
+
 Ruby.define_options(parser)

 # add TLB options to the parser
@@ -407,8 +422,15 @@
 hsapp_gpu_map_size = 0x1000
 hsapp_gpu_map_paddr = int(Addr(args.mem_size))

+if args.dgpu:
+# Default --m-type for dGPU is write-back gl2 with system coherence
+# (coherence at the level of the system directory between other dGPUs and
+# CPUs) managed by kernel boundary flush operations targeting the gl2.
+args.m_type = 6
+
 # HSA kernel mode driver
-gpu_driver = GPUComputeDriver(filename = "kfd", isdGPU = args.dgpu)
+gpu_driver = GPUComputeDriver(filename = "kfd", isdGPU = args.dgpu,
+  dGPUPoolID = 1, m_type = args.m_type)

 # Creating the GPU kernel launching components: that is the HSA
 # packet processor (HSAPP), GPU command processor (CP), and the
diff --git a/src/gpu-compute/GPU.py b/src/gpu-compute/GPU.py
index e548823..091fdde 100644
--- a/src/gpu-compute/GPU.py
+++ b/src/gpu-compute/GPU.py
@@ -237,6 +237,16 @@
 type = 'GPUComputeDriver'
 cxx_header = 'gpu-compute/gpu_compute_driver.hh'
 isdGPU = Param.Bool(False, 'Driver is for a dGPU')
+dGPUPoolID = Param.Int(False, 'Pool ID for dGPU.')
+# Default Mtype for caches
+#-- 1   1   1   C_RW_S  (Cached-ReadWrite-Shared)
+#-- 1   1   0   C_RW_US (Cached-ReadWrite-Unshared)
+#-- 1   0   1   C_RO_S  (Cached-ReadOnly-Shared)
+#-- 1   0   0   C_RO_US (Cached-ReadOnly-Unshared)
+#-- 0   1   x   UC_L2   (Uncached_GL2)
+#-- 0   0   x   UC_All  (Uncached_All_Load)
+# default value: 5/C_RO_S (only allow caching in GL2 for read. Shared)
+m_type = Param.Int("Default MTYPE for cache. Valid values between 0-7");

 class GPUDispatcher(SimObject):
 type = 'GPUDispatcher'
diff --git a/src/gpu-compute/compute_unit.cc b/src/gpu-compute/compute_unit.cc
index bb1480b..cff04c1 100644
--- a/src/gpu-compute/compute_unit.cc
+++ b/src/gpu-compute/compute_unit.cc
@@ -48,6 +48,7 @@
 #include "debug/GPUSync.hh"
 #include "debug/GPUTLB.hh"
 #include "gpu-compute/dispatcher.hh"
+#include "gpu-compute/gpu_command_processor.hh"
 #include "gpu-compute/gpu_dyn_inst.hh"
 #include "gpu-compute/gpu_static_inst.hh"
 #include "gpu-compute/scalar_register_file.hh"
@@
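
The Mtype table in the comments above reads as a 3-bit encoding. The sketch below decodes a value into the mnemonic from that table, assuming the three columns are, from most to least significant bit, cached, read-write, and shared; that reading is inferred from the mnemonics and from the stated default of 5/C_RO_S, since the column headers are not shown in this message. The function name decode_mtype is hypothetical.

```python
def decode_mtype(m_type):
    """Map an Mtype value (0-7) to the mnemonic used in the table."""
    if not 0 <= m_type <= 7:
        raise ValueError("Mtype must be between 0 and 7")
    cached = (m_type >> 2) & 1      # assumed: bit 2 selects cached modes
    read_write = (m_type >> 1) & 1  # assumed: bit 1 selects read-write
    shared = m_type & 1             # assumed: bit 0 selects shared
    if not cached:
        # The lowest bit is a don't-care ("x") for the uncached modes.
        return "UC_L2" if read_write else "UC_All"
    access = "RW" if read_write else "RO"
    scope = "S" if shared else "US"
    return f"C_{access}_{scope}"


if __name__ == "__main__":
    for value in range(8):
        print(value, decode_mtype(value))
```

Under this reading, the APU default of 5 decodes to C_RO_S and the dGPU default of 6 set in apu_se.py decodes to C_RW_US.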

[gem5-dev] Change in gem5/gem5[develop]: tests: Make the ISA-dependent tests run

2021-04-24 Thread Daniel Carvalho (Gerrit) via gem5-dev

Daniel Carvalho has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/44367 )


Change subject: tests: Make the ISA-dependent tests run
..

tests: Make the ISA-dependent tests run

dev/ has unit tests, but they are not run when
using the NULL ISA. The existing tests are not
ISA-specific, so they are now run in an ARM
environment.

For now this is enough, but when ISA-specific
tests for ISAs other than ARM are added, the
presubmit script will need to change to cover
them too.

Change-Id: I18df0141d415286325463afa759459b04ac8a92f
Signed-off-by: Daniel R. Carvalho 
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/44367
Reviewed-by: Bobby R. Bruce 
Maintainer: Bobby R. Bruce 
Tested-by: kokoro 
---
M tests/jenkins/presubmit-stage2.sh
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Bobby R. Bruce: Looks good to me, approved; Looks good to me, approved
  kokoro: Regressions pass



diff --git a/tests/jenkins/presubmit-stage2.sh b/tests/jenkins/presubmit-stage2.sh
index aed60fd..d4d5841 100755
--- a/tests/jenkins/presubmit-stage2.sh
+++ b/tests/jenkins/presubmit-stage2.sh
@@ -47,4 +47,4 @@
 # Once complete, run the Google Tests
 cd tests
 ./main.py run -j4 -t4 gem5 && scons -C .. --no-compress-debug \
-build/NULL/unittests.opt
+build/ARM/unittests.opt

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/44367
To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I18df0141d415286325463afa759459b04ac8a92f
Gerrit-Change-Number: 44367
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Carvalho 
Gerrit-Reviewer: Bobby R. Bruce 
Gerrit-Reviewer: Daniel Carvalho 
Gerrit-Reviewer: kokoro 
Gerrit-MessageType: merged
