Collating responses to emails since you all type faster than me

- Imad: glad to hear things work with the updates Matt P proposed!
- documentation: Matt P, yes we did update the documentation here:
https://resources.gem5.org/ (e.g.,
https://resources.gem5.org/resources/square), but apparently didn't
propagate those updates to the webpage Imad was using.  I will add that to
my list for the week.  Bobby, I see you did part of this already.  I
believe there is more that needs to be cleaned up based on what Imad/Matt P
said, but I will wait until your version is checked in (imminently) before
re-reading and updating.
- apt repos: Matt P, you must be right about rocblas updating
something.  *Kyle, can you please take care of updating the docker to use
the specific rocblas version we need?*
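
For reference, one common way to pin a package during a Docker build is
apt's `name=version` syntax. The sketch below only assembles and prints such
a command so it can be reviewed before use. The version string is a
placeholder, since this thread does not say which rocblas version is needed,
and the helper names `pin_spec` and `pinned_install_cmd` are made up for
illustration. The failing step in the log below runs `./install.sh` and dpkg
rather than apt-get, so the actual fix in the Dockerfile may look different.

```shell
# Hypothetical sketch: build an apt command that installs one exact version.
# Nothing here is taken from the gem5 Dockerfile; names are illustrative.

# Build the "name=version" spec that apt-get accepts for an exact version.
pin_spec() {
    # $1 = package name, $2 = exact version string (placeholder in this sketch;
    # `apt-cache madison <package>` lists the versions a repo offers)
    printf '%s=%s\n' "$1" "$2"
}

# Print (do not run) the install command for the pinned package.
pinned_install_cmd() {
    printf 'apt-get install -y --no-install-recommends %s\n' \
        "$(pin_spec "$1" "$2")"
}

# Example use, with ROCBLAS_VERSION set to whatever version is required:
#   pinned_install_cmd rocblas "$ROCBLAS_VERSION"
```

Printing the command instead of executing it keeps the sketch free of side
effects; in a Dockerfile the resulting line would typically go in a `RUN`
step.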

Matt

On Wed, Sep 22, 2021 at 1:03 PM Bobby Bruce via gem5-users <
gem5-users@gem5.org> wrote:

> Just jumping in here,
>
> I can confirm I can't build the image anymore. I had assumed this was just
> a problem on my end before reading these emails. However, the image hosted
> at http://gcr.io/gem5-test/gcn-gpu should be the most up-to-date version
> of this Docker prior to this build error being introduced. It should work.
>
> I've updated the website script here:
> https://gem5-review.googlesource.com/c/public/gem5-website/+/50807.
> Apologies, our documentation could definitely do with some tidying up :).
>
> --
> Dr. Bobby R. Bruce
> Room 3050,
> Kemper Hall, UC Davis
> Davis,
> CA, 95616
>
> web: https://www.bobbybruce.net
>
>
> On Wed, Sep 22, 2021 at 10:02 AM Imad Al Assir via gem5-users <
> gem5-users@gem5.org> wrote:
>
>> Dear Matt,
>>
>> Many thanks for catching this error! It did indeed solve the problem; I
>> was able to successfully run square and other applications from hip-samples
>> on both the manually built Docker image (with everything related to rocBLAS
>> and MIOpen commented out of the Dockerfile) and the pre-built Docker image,
>> which I believe has rocBLAS and MIOpen installed (based on its size).
>>
>> Many thanks again,
>> Imad
>>
>> On Sep 22 2021, at 6:48 pm, Poremba, Matthew <matthew.pore...@amd.com>
>> wrote:
>>
>> Hi Imad,
>>
>> Yes, the docker seems to have broken in the past few days.
>>
>> Regarding the benchmark not completing, please change your command to use
>> 3 CPUs:
>>
>> docker run --rm -v $PWD/gem5:/gem5 \
>>                 -v $PWD/gem5-resources:/gem5-resources \
>>                 -w /gem5 gcr.io/gem5-test/gcn-gpu \
>>                 build/GCN3_X86/gem5.opt configs/example/apu_se.py -n3 \
>>                 --benchmark-root=/gem5-resources/src/gpu/square/bin \
>>                 -c square
>>
>> ROCm 4.0 requires 3 CPUs to run now.  I thought we had updated the
>> README.md and website before gem5 21.1 release to reflect this but looks
>> like they are not up to date.
>>
>> -Matt
>>
>> *From:* Imad Al Assir via gem5-users <gem5-users@gem5.org>
>> *Sent:* Wednesday, September 22, 2021 9:31 AM
>> *To:* Matt Sinclair <sincl...@cs.wisc.edu>
>> *Cc:* gem5 users mailing list <gem5-users@gem5.org>; Kyle Roarty <
>> kroa...@wisc.edu>; Imad Al Assir <imad.al.as...@upc.edu>
>> *Subject:* [gem5-users] Re: gem5 GCN GPU docker error
>>
>> Hello,
>> Thank you for your reply. I was simply following the documentation on the
>> gem5 website:
>> https://www.gem5.org/documentation/general_docs/gpu_models/GCN3
>> In other words, to build the image, I used:
>>  docker build -t gcn-gpu .
>>
>>
>> This command didn't complete and was interrupted by the error I pasted in
>> the previous mail.
>>
>>
>> I was also using the command in the documentation to compile square:
>> docker run --rm -v $PWD/gem5-resources:$PWD/gem5-resources \
>>                 -w $PWD/gem5-resources/src/gpu/square \
>>                 gcr.io/gem5-test/gcn-gpu make square
>>
>>
>> I used "make square", NOT "make gfx8-apu" as written in the documentation.
>> The latter caused an error ("no rule to make target 'gfx8-apu'"), which I
>> assumed was due to a typo in the documentation.
>>
>>
>> To run it, I also used the command in the doc:
>> docker run --rm -v $PWD/gem5:/gem5 \
>>                 -v $PWD/gem5-resources:/gem5-resources \
>>                 -w /gem5 gcr.io/gem5-test/gcn-gpu \
>>                 build/GCN3_X86/gem5.opt configs/example/apu_se.py -n2 \
>>                 --benchmark-root=/gem5-resources/src/gpu/square/bin \
>>                 -c square
>>
>>
>> Note that in these commands, I changed the path of square to
>> 'gem5-resources/src/gpu/square' instead of 'gem5-resources/src/square',
>> because that's where I found the code for it.
>> Also note that I tried downloading the pre-built binary of square (from
>> the gem5-resources website: http://resources.gem5.org/README),
>> but the result was the same: application running indefinitely.
>>
>>
>> Thanks again for your help,
>> Imad
>>
>>
>> PS: If it helps, here are the last things printed when running square in
>> gem5 in the pre-built docker image:
>>
>>
>> [...] just warnings
>>
>>
>> gem5 Simulator System.  http://gem5.org
>> gem5 is copyrighted software; use the --copyright option for details.
>>
>>
>> gem5 version 21.1.0.1
>> gem5 compiled Sep 21 2021 14:52:55
>> gem5 started Sep 22 2021 15:26:26
>> gem5 executing on 8d532399b09e, pid 1
>> command line: build/GCN3_X86/gem5.opt configs/example/apu_se.py -n2
>> --benchmark-root=/gem5-resources/src/gpu/square/bin -c square
>>
>>
>> info: Standard input is not a terminal, disabling listeners.
>> Num SQC =  1 Num scalar caches =  1 Num CU =  4
>> coalescer.slave is deprecated. `slave` is now called `in_ports`
>> warn: coalescer.slave is deprecated. `slave` is now called `in_ports`
>> warn: coalescer.slave is deprecated. `slave` is now called `in_ports`
>>
>>
>> [...] same warning as the one right above this line, repeated multiple
>> times
>>
>>
>> warn: system.ruby.network adopting orphan SimObject param 'ext_links'
>> warn: system.ruby.network adopting orphan SimObject param 'int_links'
>> build/GCN3_X86/sim/simulate.cc:107: info: Entering event queue @ 0.
>> Starting simulation...
>> build/GCN3_X86/mem/ruby/system/Sequencer.cc:573: warn: Replacement policy
>> updates recently became the responsibility of SLICC state machines. Make
>> sure to setMRU() near callbacks in .sm files!
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall access(...)
>> build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one
>> page.
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall
>> mprotect(...)
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall
>> mprotect(...)
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall
>> mprotect(...)
>>
>>
>> [...] same warning as above repeated multiple times
>>
>>
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall
>> mprotect(...)
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall
>> mprotect(...)
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall
>> set_robust_list(...)
>> build/GCN3_X86/sim/syscall_emul.cc:84: warn: ignoring syscall
>> rt_sigaction(...)
>>       (further warnings will be suppressed)
>> build/GCN3_X86/sim/syscall_emul.cc:84: warn: ignoring syscall
>> rt_sigprocmask(...)
>>       (further warnings will be suppressed)
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall
>> get_mempolicy(...)
>> build/GCN3_X86/arch/generic/debugfaults.hh:144: warn: MOVNTDQ: Ignoring
>> non-temporal hint, modeling as cacheable!
>> build/GCN3_X86/arch/x86/generated/exec-ns.cc.inc:27: warn: instruction
>> 'frndint' unimplemented
>> build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one
>> page.
>> build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:699: warn: unimplemented
>> ioctl: AMDKFD_IOC_ACQUIRE_VM
>> build/GCN3_X86/sim/syscall_emul.hh:1676: warn: mmap: writing to shared
>> mmap region is currently unsupported. The write succeeds on the target, but
>> it will not be propagated to the host or shared mappings
>> build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one
>> page.
>> build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:450: warn: Signal events
>> are only supported currently
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall
>> mprotect(...)
>> build/GCN3_X86/sim/power_state.cc:105: warn: PowerState: Already in the
>> requested power state, request ignored
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall
>> set_robust_list(...)
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall
>> mprotect(...)
>> build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:594: warn: unimplemented
>> ioctl: AMDKFD_IOC_SET_SCRATCH_BACKING_VA
>> build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:604: warn: unimplemented
>> ioctl: AMDKFD_IOC_SET_TRAP_HANDLER
>> info: running on device
>> info: architecture on AMD GPU device is: 801
>> info: allocate host and device mem (  7.63 MB)
>> info: launch 'vector_square' kernel
>> build/GCN3_X86/sim/syscall_emul.cc:84: warn: ignoring syscall
>> sched_yield(...)
>>       (further warnings will be suppressed)
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall
>> mprotect(...)
>> build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall
>> mprotect(...)
>>
>>
>> On Sep 22 2021, at 5:17 pm, Matt Sinclair <sincl...@cs.wisc.edu> wrote:
>>
>> Hi Imad,
>>
>> I just built the docker earlier this week and did not have any problems
>> (e.g., I ran square and it completed in < 2 hours).  How are you trying to
>> build it?  And how are you running the applications you mentioned?
>>
>> Thanks,
>> Matt
>>
>>
>> On Wed, Sep 22, 2021 at 12:31 AM Imad Al Assir via gem5-users <
>> gem5-users@gem5.org> wrote:
>>
>> Hello,
>> Is there a problem with the most recent gcn-gpu docker file?
>> I tried building it several times on Ubuntu 20.04 and 18.04 but it kept
>> giving me this error:
>>
>> [...]
>> Unpacking rocblas (2.32.0-cc18d25f) ...
>> dpkg: dependency problems prevent configuration of rocblas:
>>  rocblas depends on rocm-core; however:
>>   Package rocm-core is not installed.
>>
>>
>> dpkg: error processing package rocblas (--install):
>>  dependency problems - leaving unconfigured
>> dpkg: dependency problems prevent configuration of rocblas-dev:
>>  rocblas-dev depends on rocblas (>= 2.32.0); however:
>>   Package rocblas is not configured yet.
>>
>>
>> dpkg: error processing package rocblas-dev (--install):
>>  dependency problems - leaving unconfigured
>> Errors were encountered while processing:
>>  rocblas
>>  rocblas-dev
>> + check_exit_code 1
>> + ((  1 != 0  ))
>> + exit 1
>> The command '/bin/sh -c ./install.sh -d -a all -i' returned a non-zero
>> code: 1
>>
>>
>> I also tried downloading the pre-built docker image
>> (gcr.io/gem5-test/gcn-gpu)
>> and built gem5 supposedly with no errors (but with a warning about
>> deprecated namespaces not being supported by the compiler). Then when I
>> tried running the 'square' sample application and other ones from
>> gem5-resources/src/gpu/hip-samples (e.g. MatrixTranspose, dynamic_shared,
>> inline_asm, etc.), they just kept running indefinitely (> 2 hours), and I
>> had to kill them to stop them.
>>
>>
>> Could you please try building the latest version of the gcn-gpu dockerfile
>> and/or running a sample application on the pre-built docker image, and
>> let us know whether it works and, if not, how to fix the problem?
>>
>>
>> Thanks in advance,
>> Imad Al Assir
>> _______________________________________________
>> gem5-users mailing list -- gem5-users@gem5.org
>> To unsubscribe send an email to gem5-users-le...@gem5.org