That error in #2 means MIOpen can't find the kernel again.  Did you change the 
number of CUs to 128 (or whatever number of CUs you are using) when you 
generated the cachefiles?
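
Before rerunning, it may be worth confirming that the cachefiles directory mounted into the container actually contains generated files. The sketch below is only a pre-flight check under that assumption: `check_cachefiles` is a hypothetical helper name, and it does not verify that the files were generated for the right number of CUs.

```shell
# Hypothetical pre-flight check: confirm that the MIOpen cache directory which
# will be bind-mounted into the container exists and holds at least one file.
# It does NOT verify that the files match the CU count in use.
# Usage: check_cachefiles DIR
check_cachefiles() {
    cache_dir=$1
    if [ ! -d "$cache_dir" ]; then
        echo "missing: $cache_dir" >&2
        return 1
    fi
    # find prints one line per regular file; no output means nothing was generated.
    if [ -z "$(find "$cache_dir" -type f | head -n 1)" ]; then
        echo "empty: $cache_dir" >&2
        return 1
    fi
    echo "ok: $cache_dir"
}
```

With the layout used in this thread, it could be called as `check_cachefiles gem5/gem5-resources/src/gpu/DNNMark/cachefiles` from the directory where the run command is issued.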

Matt

From: David Fong via gem5-users <gem5-users@gem5.org>
Sent: Wednesday, March 9, 2022 12:50 PM
To: Poremba, Matthew <matthew.pore...@amd.com>; gem5 users mailing list 
<gem5-users@gem5.org>
Cc: David Fong <da...@chronostech.com>
Subject: [gem5-users] Re: gem5 : X86 + APU (gfx801) with CUs128 error with 
DNNMark test_fwd_softmax

Hi Matt,

Thanks for your quick response.
The hack is not working.

  1.  I had to start from scratch, or I get the same error.
  2.  After running the same steps plus the hack before compiling gem5, I'm 
getting these error messages:
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
sh: 1: Cannot fork
MIOpen Error: /root/driver/MLOpen/src/hipoc/hipoc_program.cpp:195: Cant find 
file: /tmp/miopen-MIOpenSoftmax.cl-96e7-d3d7-ce59-9759/MIOpenSoftmax.cl.o
MIOpen Error: 7 at 
/home/dfong/work/ext_ips/gem5-apu-cu128-dnn/gem5/gem5-resources/src/gpu/DNNMark/core/include/dnn_wrapper.h485Ticks:
 574458882500

Am I missing some other setting?

David

Full output, with repeated lines elided as ". . ." to reduce size:

docker run --rm -v ${PWD}:${PWD} -v 
${PWD}/gem5/gem5-resources/src/gpu/DNNMark/cachefiles:/root/.cache/miopen/2.9.0 
-w ${PWD} gcr.io/gem5-test/gcn-gpu:v21-2 gem5/build/GCN3_X86/gem5.opt 
gem5/configs/example/apu_se.py --num-compute-units 128 -n3 
--benchmark-root=gem5/gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_softmax
 -cdnnmark_test_fwd_softmax --options="-config 
gem5/gem5-resources/src/gpu/DNNMark/config_example/softmax_config.dnnmark -mmap 
gem5/gem5-resources/src/gpu/DNNMark/mmap.bin" |& tee 
gem5_apu_cu128_run_dnnmark_test_fwd_softmax_50latency.log
Global frequency set at 1000000000000 ticks per second
build/GCN3_X86/mem/mem_interface.cc:791: warn: DRAM device capacity (8192 
Mbytes) does not match the address range assigned (512 Mbytes)
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (5) does not divide 
range [1:75] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (2) does not divide 
range [1:10] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (2) does not divide 
range [1:64] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1e+06] into equal-sized buckets. Rounding up.
. . .
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
. . .
Forcing maxCoalescedReqs to 32 (TLB assoc.)
Forcing maxCoalescedReqs to 32 (TLB assoc.)
Forcing maxCoalescedReqs to 32 (TLB assoc.)
Forcing maxCoalescedReqs to 32 (TLB assoc.)
. . .
build/GCN3_X86/base/remote_gdb.cc:381: warn: Sockets disabled, not accepting 
gdb connections
warn: dir_cntrl0.memory is deprecated. The request port for Ruby memory output 
to the main memory is now called `memory_out_port`
warn: system.ruby.network adopting orphan SimObject param 'ext_links'
warn: system.ruby.network adopting orphan SimObject param 'int_links'
warn: failed to generate dot output from m5out/config.dot
build/GCN3_X86/sim/simulate.cc:194: info: Entering event queue @ 0.  Starting 
simulation...
build/GCN3_X86/mem/ruby/system/Sequencer.cc:573: warn: Replacement policy 
updates recently became the responsibility of SLICC state machines. Make sure 
to setMRU() near callbacks in .sm files!
gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 version 21.2.1.0
gem5 compiled Mar  9 2022 18:21:02
gem5 started Mar  9 2022 18:27:12
gem5 executing on dc013b3a89f5, pid 1
command line: gem5/build/GCN3_X86/gem5.opt gem5/configs/example/apu_se.py 
--num-compute-units 128 -n3 
--benchmark-root=gem5/gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_softmax
 -cdnnmark_test_fwd_softmax '--options=-config 
gem5/gem5-resources/src/gpu/DNNMark/config_example/softmax_config.dnnmark -mmap 
gem5/gem5-resources/src/gpu/DNNMark/mmap.bin'

info: Standard input is not a terminal, disabling listeners.
Num SQC =  32 Num scalar caches =  32 Num CU =  128
incrementing idx on  4
incrementing idx on  8
incrementing idx on  12
incrementing idx on  16
incrementing idx on  20
incrementing idx on  24
incrementing idx on  28
incrementing idx on  32
incrementing idx on  36
incrementing idx on  40
incrementing idx on  44
incrementing idx on  48
incrementing idx on  52
incrementing idx on  56
incrementing idx on  60
incrementing idx on  64
incrementing idx on  68
incrementing idx on  72
incrementing idx on  76
incrementing idx on  80
incrementing idx on  84
incrementing idx on  88
incrementing idx on  92
incrementing idx on  96
incrementing idx on  100
incrementing idx on  104
incrementing idx on  108
incrementing idx on  112
incrementing idx on  116
incrementing idx on  120
incrementing idx on  124
. . .
"dot" with args ['-Tsvg', '/tmp/tmped75d08r'] returned code: 1

stdout, stderr:
b''
b'Error: /tmp/tmped75d08r: syntax error in line 119533 scanning a quoted string 
(missing endquote? longer than 16384?)\nString 
starting:"clk_domain&#61;system.ruby.clk_domain&#10;\\eventq_index&#61;0&#10;\\latency&#61;1\n'

build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
. . .
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall 
set_robust_list(...)
build/GCN3_X86/sim/syscall_emul.cc:85: warn: ignoring syscall rt_sigaction(...)
      (further warnings will be suppressed)
build/GCN3_X86/sim/syscall_emul.cc:85: warn: ignoring syscall 
rt_sigprocmask(...)
      (further warnings will be suppressed)
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall get_mempolicy(...)
build/GCN3_X86/arch/generic/debugfaults.hh:145: warn: MOVNTDQ: Ignoring 
non-temporal hint, modeling as cacheable!
build/GCN3_X86/arch/x86/generated/exec-ns.cc.inc:27: warn: instruction 
'frndint' unimplemented
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:704: warn: unimplemented 
ioctl: AMDKFD_IOC_ACQUIRE_VM
build/GCN3_X86/sim/syscall_emul.hh:1862: warn: mmap: writing to shared mmap 
region is currently unsupported. The write succeeds on the target, but it will 
not be propagated to the host or shared mappings
build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:455: warn: Signal events are 
only supported currently
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/power_state.cc:105: warn: PowerState: Already in the 
requested power state, request ignored
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall 
set_robust_list(...)
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:599: warn: unimplemented 
ioctl: AMDKFD_IOC_SET_SCRATCH_BACKING_VA
build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:609: warn: unimplemented 
ioctl: AMDKFD_IOC_SET_TRAP_HANDLER
build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall fdatasync(...)
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall fdatasync(...)
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall fdatasync(...)
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall fdatasync(...)
. . .
build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
sh: 1: Cannot fork
MIOpen Error: /root/driver/MLOpen/src/hipoc/hipoc_program.cpp:195: Cant find 
file: /tmp/miopen-MIOpenSoftmax.cl-96e7-d3d7-ce59-9759/MIOpenSoftmax.cl.o
MIOpen Error: 7 at 
/home/dfong/work/ext_ips/gem5-apu-cu128-dnn/gem5/gem5-resources/src/gpu/DNNMark/core/include/dnn_wrapper.h485Ticks:
 574458882500
Exiting because  exiting with last active thread context

David
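
Most of the output above is warnings. A minimal sketch for pulling out only the lines that stop the run, assuming the output was saved with `tee` as in the command above; `show_errors` is a hypothetical helper name, and the patterns are just the three failure strings that appear in this thread.

```shell
# Hypothetical helper: print, with line numbers, only the log lines matching
# the failure strings seen in this thread.
# Usage: show_errors LOGFILE
show_errors() {
    grep -nE 'fatal:|MIOpen Error|Cannot fork' "$1"
}
```

For example: `show_errors gem5_apu_cu128_run_dnnmark_test_fwd_softmax_50latency.log`.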


From: Poremba, Matthew <matthew.pore...@amd.com>
Sent: Tuesday, March 8, 2022 4:23 PM
To: gem5 users mailing list <gem5-users@gem5.org>
Cc: David Fong <da...@chronostech.com>
Subject: RE: gem5 : X86 + APU (gfx801) with CUs128 error with DNNMark 
test_fwd_softmax



Hi David,


You are hitting the limit on the number of machines of the same MachineType in 
a Ruby network.  You can raise it by modifying the `build_opts/GCN3_X86` file, 
adding a new line with `NUMBER_BITS_PER_SET = '128'` (or higher), and then 
recompiling gem5.  As far as I know, there is no limit on the number of CUs.
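
The edit described above can be sketched as a small shell helper. This is only an illustration: `set_bits_per_set` is a hypothetical name, and it assumes the build options file for the `build/GCN3_X86` target is `build_opts/GCN3_X86` in the gem5 checkout.

```shell
# Hypothetical helper: add NUMBER_BITS_PER_SET to a gem5 build options file.
# Usage: set_bits_per_set FILE [BITS]   (BITS defaults to 128)
set_bits_per_set() {
    opts_file=$1
    bits=${2:-128}
    # Leave the file alone if the variable is already there, so reruns are harmless.
    if grep -q '^NUMBER_BITS_PER_SET' "$opts_file" 2>/dev/null; then
        echo "NUMBER_BITS_PER_SET already set in $opts_file"
        return 0
    fi
    # If the file is non-empty and lacks a trailing newline, add one first so
    # the new setting lands on its own line.
    if [ -s "$opts_file" ] && [ -n "$(tail -c 1 "$opts_file")" ]; then
        echo >> "$opts_file"
    fi
    echo "NUMBER_BITS_PER_SET = '$bits'" >> "$opts_file"
}
```

From the gem5 checkout root this might be called as `set_bits_per_set build_opts/GCN3_X86 128`, followed by the same scons build command used in the original steps. David's follow-up in this thread reports that he had to rebuild from scratch after making the change.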


-Matt

From: David Fong via gem5-users <gem5-users@gem5.org>
Sent: Tuesday, March 8, 2022 3:51 PM
To: David Fong via gem5-users <gem5-users@gem5.org>
Cc: David Fong <da...@chronostech.com>
Subject: [gem5-users] gem5 : X86 + APU (gfx801) with CUs128 error with DNNMark 
test_fwd_softmax

Hi,

I built gem5 with X86 and APU (gfx801) with 128 CUs to run DNNMark 
test_fwd_softmax. The steps and the output from the run are shown below.

Is there a limit on the number of CUs (compute units) for the APU (gfx801), or 
do I need to specify the number of compute units (128) on one of the command 
lines below?

Thanks,

David



git clone https://gem5.googlesource.com/public/gem5
git clone https://gem5.googlesource.com/public/gem5-resources gem5/gem5-resources

# COMPILE DNNMARK TESTS
cd gem5/gem5-resources/src/gpu/DNNMark
docker run --rm -v ${PWD}:${PWD} -w ${PWD} -u $UID:$GID 
gcr.io/gem5-test/gcn-gpu:v21-2 ./setup.sh HIP
docker run --rm -v ${PWD}:${PWD} -w ${PWD}/build -u $UID:$GID 
gcr.io/gem5-test/gcn-gpu:v21-2 make
docker run --rm -v ${PWD}:${PWD} -v${PWD}/cachefiles:/root/.cache/miopen/2.9.0 
-w ${PWD} gcr.io/gem5-test/gcn-gpu:v21-2 python3 generate_cachefiles.py 
cachefiles.csv --gfx-version=gfx801 --num-cus=128
g++ -std=c++0x generate_rand_data.cpp -o generate_rand_data
./generate_rand_data
# BUILD GEM5
cd ../../../..
docker run --rm -v ${PWD}:${PWD} -w ${PWD} -u $UID:$GID 
gcr.io/gem5-test/gcn-gpu:v21-2 scons -sQ -j$(nproc) build/GCN3_X86/gem5.opt
# RUN TEST
cd ../
docker run --rm -v ${PWD}:${PWD} -v 
${PWD}/gem5/gem5-resources/src/gpu/DNNMark/cachefiles:/root/.cache/miopen/2.9.0 
-w ${PWD} gcr.io/gem5-test/gcn-gpu:v21-2 gem5/build/GCN3_X86/gem5.opt 
gem5/configs/example/apu_se.py --num-compute-units 128 -n3 
--benchmark-root=gem5/gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_softmax
 -cdnnmark_test_fwd_softmax --options="-config 
gem5/gem5-resources/src/gpu/DNNMark/config_example/softmax_config.dnnmark -mmap 
gem5/gem5-resources/src/gpu/DNNMark/mmap.bin" |& tee 
gem5_apu_cu128_run_dnnmark_test_fwd_softmax_50latency.log
Global frequency set at 1000000000000 ticks per second
build/GCN3_X86/mem/mem_interface.cc:791: warn: DRAM device capacity (8192 
Mbytes) does not match the address range assigned (512 Mbytes)
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (5) does not divide 
range [1:75] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (2) does not divide 
range [1:10] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (2) does not divide 
range [1:64] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1e+06] into equal-sized buckets. Rounding up.
. . .
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/stats/storage.hh:279: warn: Bucket size (10000) does not 
divide range [1:1.6e+06] into equal-sized buckets. Rounding up.
build/GCN3_X86/base/statistics.hh:280: warn: One of the stats is a legacy stat. 
Legacy stat is a stat that does not belong to any statistics::Group. Legacy 
stat is deprecated.
. . .
Forcing maxCoalescedReqs to 32 (TLB assoc.)
Forcing maxCoalescedReqs to 32 (TLB assoc.)
Forcing maxCoalescedReqs to 32 (TLB assoc.)
Forcing maxCoalescedReqs to 32 (TLB assoc.)
. . .
build/GCN3_X86/base/statistics.hh:280: warn: One of the stats is a legacy stat. 
Legacy stat is a stat that does not belong to any statistics::Group. Legacy 
stat is deprecated.
build/GCN3_X86/mem/ruby/common/Set.hh:214: fatal: Number of bits(64) < size 
specified(65). Increase the number of bits and recompile.
Memory Usage: 2359940 Kbytes

_______________________________________________
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org