Hi Richard,

You wrote below:

“As noted earlier, mpirun will fail in gem5 without a list of hosts.”

This should not happen. Without a list of hosts, mpirun should launch all the 
MPI processes on ‘localhost’ (i.e. where mpirun is running). mpirun uses 
ssh to start new processes.

Make sure that ssh is working before you try mpirun. For example, run the 
following from your rcS script:
‘ssh localhost ls’
to check if ssh works locally, and
‘ssh 10.0.0.X ls’
to check if ssh can launch a new process on a remote host. If ssh does not 
work, check if sshd is started from the Linux init.
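
The two checks above could be wrapped in a small helper inside the rcS script. 
This is only a sketch: the function name check_ssh and the SSH variable are 
made up for illustration, the -o options assume an OpenSSH client on the disk 
image, and 10.0.0.X still has to be replaced with a real peer address.

```shell
#!/bin/sh
# Hypothetical helper: run `ls` on a host through ssh and report the result.
# SSH can be overridden (e.g. for a dry run); it defaults to plain ssh.
SSH="${SSH:-ssh}"

check_ssh() {
    host="$1"
    # BatchMode avoids hanging on a password prompt inside the simulated system.
    if $SSH -o BatchMode=yes -o ConnectTimeout=10 "$host" ls >/dev/null 2>&1; then
        echo "ssh to $host: OK"
    else
        echo "ssh to $host: FAILED (is sshd started by init?)"
        return 1
    fi
}
```

Calling check_ssh localhost first and check_ssh 10.0.0.X second, before the 
mpirun line, separates a local sshd problem from a network problem.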

- Gabor


From: gem5-users <gem5-users-boun...@gem5.org> on behalf of "Afoakwa, Richard" 
<rafoa...@ur.rochester.edu>
Reply-To: gem5 users mailing list <gem5-users@gem5.org>
Date: Friday, 7 September 2018 at 20:10
To: gem5 users mailing list <gem5-users@gem5.org>
Subject: Re: [gem5-users] dist-gem5 panic - No 32bit reads implemented for this 
device.

Gabor,

Thanks very much for your response. Using the vanilla version, it appears to me 
that the error message was due to the fact that I was not including the host ip 
address(es) with my mpirun calls. Thanks again.

As a secondary question, I am trying to understand the basic framework of 
dist-gem5. From what I infer, the "gem5-dist.sh" script launches gem5 FS processes 
(using the same *.rcS script and Linux image) onto dedicated machines. Using 
the *.rcS script, each gem5 process updates the network configuration of the 
"image". For example, in the tutorials, this is done using the line:

/sbin/ifconfig eth0 hw ether 00:90:00:00:00:${MY_ADDR_PADDED} 10.0.0.${MY_ADDR}

Subsequently, the base gem5 process (the one with RANK = 0) can ping the other 
processes (as evident in the tutorial screenshot). Assuming all this works 
without error and I am trying to run an MPI application, the RANK0 gem5 process 
needs a list of hosts to execute mpirun. As noted earlier, mpirun will fail 
in gem5 without a list of hosts.

For this purpose, I pass the list of host IP addresses, 10.0.0.${MY_ADDR}, to 
mpirun. But I keep receiving "connection refused" error messages after mpirun 
starts. Trying different ports does not work either.
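
For illustration, a host list of this kind could be assembled in the rcS 
script as sketched below. The function name build_host_list is hypothetical, 
and the sketch assumes that each node's last address octet is its rank plus 2 
(rank 0 is .2, rank 1 is .3), which matches the terminal output quoted further 
down but may not hold for other bootscripts.

```shell
#!/bin/sh
# Hypothetical helper: print a comma-separated list of node IPs for
# ranks 0..size-1. Assumes last octet = rank + 2 (an assumption, see above).
build_host_list() {
    size="$1"
    prefix="${2:-10.0.0}"   # network prefix; defaults to the tutorial's 10.0.0
    rank=0
    list=""
    while [ "$rank" -lt "$size" ]; do
        addr=$((rank + 2))
        # Prepend a comma only when the list is already non-empty.
        list="${list:+$list,}${prefix}.${addr}"
        rank=$((rank + 1))
    done
    echo "$list"
}
```

The result, e.g. HOSTS=$(build_host_list "$MY_SIZE"), would then be handed to 
mpirun through whichever host option the installed MPI provides (for example 
--host in Open MPI or -hosts in MPICH's Hydra launcher).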

I would be grateful if anyone can provide some direction on this. Thanks.

Richard


________________________________
From: gem5-users <gem5-users-boun...@gem5.org> on behalf of Gabor Dozsa 
<gabor.do...@arm.com>
Sent: Thursday, August 30, 2018 12:20:53 PM
To: gem5 users mailing list
Subject: Re: [gem5-users] dist-gem5 panic - No 32bit reads implemented for this 
device.


Hi Richard,



I would suggest that you first try to run the same MPI app on a single simulated 
system to see whether it is a dist-gem5 specific issue or not. Simply use vanilla 
gem5 instead of dist-gem5 with exactly the same configuration (e.g. gem5 flags, 
kernel, disk image, etc.). You will need to remove the dist-gem5 and ethernet 
config commands from the bootscript, but the mpirun command line should just 
work as it is.
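
One way to reuse the same flags for such a vanilla run is to filter the 
dist-gem5 options out of the existing argument list. The sketch below is 
illustrative only: strip_dist_flags is a made-up name, it assumes every 
dist-gem5 option is spelled --dist or --dist-..., and it does not handle 
arguments that contain spaces.

```shell
#!/bin/sh
# Hypothetical helper: echo the given arguments with every --dist / --dist-*
# option removed, leaving the remaining flags in their original order.
strip_dist_flags() {
    out=""
    for arg in "$@"; do
        case "$arg" in
            --dist|--dist-*) ;;                 # drop dist-gem5 options
            *) out="${out:+$out }$arg" ;;       # keep everything else
        esac
    done
    echo "$out"
}
```

For example, strip_dist_flags --num-cpus=2 --dist --dist-rank=0 prints only 
--num-cpus=2, which can then be passed to fs.py unchanged.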



- Gabor



From: gem5-users <gem5-users-boun...@gem5.org> on behalf of "Afoakwa, Richard" 
<rafoa...@ur.rochester.edu>
Reply-To: gem5 users mailing list <gem5-users@gem5.org>
Date: Thursday, 30 August 2018 at 15:57
To: "gem5-users@gem5.org" <gem5-users@gem5.org>
Subject: [gem5-users] dist-gem5 panic - No 32bit reads implemented for this 
device.



Hi all, this is my first time using dist-gem5, but I have a working knowledge 
of gem5.



I have everything set up correctly, I think, but I keep getting the following 
panic message: "No 32bit reads implemented for this device. Offset 0x44", and I 
have run out of ideas to fix or work around it.



The testsys.terminal outputs suggest that the images are all loaded correctly 
and things run fine until it gets to executing the application. I have 
updated the image to include the MPI libraries so that I can call mpirun 
(armv8-linux-gnueabi-mpirun). When I boot the image in a VM, I can run the 
application just fine with mpirun. But I keep getting this panic message when 
it's run inside dist-gem5.



I am using an arm64 setup. The image is aarch64-ubuntu-trusty-headless.img, the 
kernel is vmlinux.aarch64.20140821, and the dtb is vexpress.aarch64.20140821.dtb.



Here are the text outputs:



***** rcS *****

# --------------------------------------------
#  ------ Start your tests below ... ---------
# --------------------------------------------
## Start workload
NUM_CORES=$(/sbin/m5 initparam num-cpus)
echo "Num-Cores: $NUM_CORES"

echo "[RKA] Load modules and set omp threads..."
export OMP_NUM_THREADS=$NUM_CORES  #Number of threads to use

echo "[RKA] Start work..."

if [ "$MY_RANK" == "0" ]
then
    echo "[RKA] Stats dump and rest..."
    /sbin/m5 dumpstats
    /sbin/m5 resetstats

    echo "[RKA] Starting workload..."

    cd /benchmarks/lulesh

    mpirun -np ${MY_SIZE} ./lulesh2.0 -s 5 -i 10

    /sbin/m5 exit 1
else
    printf "Wait for main to finish ...\n"
    while /bin/true
    do
        sleep 5
        printf "."
    done
fi



***** m5out.0/testsys.terminal *****



[RKA] bootscript.rcS running

[RKA] Rank: 0

[RKA] Size: 2

[RKA] Address: 02

[RKA] Set ethernet config...

[    3.600382] CPU3: failed to come online

[RKA] Display updated config...

eth0      Link encap:Ethernet  HWaddr 00:90:00:00:00:02

          inet addr:192.168.0.2  Bcast:192.168.0.255  Mask:255.255.255.0

          UP BROADCAST MULTICAST  MTU:1500  Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0

          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)



lo        Link encap:Local Loopback

          inet addr:127.0.0.1  Mask:255.0.0.0

          UP LOOPBACK RUNNING  MTU:65536  Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0

          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)



Preparing hosts for mpirun. Rank: 0 of 2

PING 192.168.0.2 (192.168.0.2) 56(84) bytes of data.

64 bytes from 192.168.0.2: icmp_seq=1 ttl=64 time=0.003 ms



--- 192.168.0.2 ping statistics ---

1 packets transmitted, 1 received, 0% packet loss, time 0ms

rtt min/avg/max/mdev = 0.003/0.003/0.003/0.000 ms

PING 192.168.0.3 (192.168.0.3) 56(84) bytes of data.

64 bytes from 192.168.0.3: icmp_seq=1 ttl=64 time=997 ms



--- 192.168.0.3 ping statistics ---

1 packets transmitted, 1 received, 0% packet loss, time 0ms

rtt min/avg/max/mdev = 997.900/997.900/997.900/0.000 ms

Num-Cores: 2

[RKA] Load modules and set omp threads...

[RKA] Start work...

[RKA] Stats dump and rest...

[RKA] Starting workload...

[    4.620381] CPU2: failed to come online





***** m5out.1/testsys.terminal *****



[RKA] bootscript.rcS is running

[RKA] Rank: 1

[RKA] Size: 2

[RKA] Address: 03

[RKA] Set ethernet config...

[    3.600382] CPU3: failed to come online

[RKA] Display updated config...

eth0      Link encap:Ethernet  HWaddr 00:90:00:00:00:03

          inet addr:192.168.0.3  Bcast:192.168.0.255  Mask:255.255.255.0

          UP BROADCAST MULTICAST  MTU:1500  Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0

          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)



lo        Link encap:Local Loopback

          inet addr:127.0.0.1  Mask:255.0.0.0

          UP LOOPBACK RUNNING  MTU:65536  Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0

          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)



Preparing hosts for mpirun. Rank: 1 of 2

Num-Cores: 2

[RKA] Load modules and set omp threads...

[RKA] Start work...

Wait for main to finish ...

[    4.620382] CPU2: failed to come online



***** Log.0 *****



command line: gem5-dist/000.init/util/dist/test/./../../../build/ARM/gem5.opt 
-d gem5-dist/000.init/util/dist/test/m5out.0 
--debug-flags=EthernetAll,DistEthernetAll 
gem5-dist/000.init/util/dist/test/./../../../configs/example/fs.py 
--cpu-type=AtomicSimpleCPU --num-cpus=2 --machine-type=VExpress_EMM64 
--disk-image=aarch64-ubuntu-trusty-headless.img 
--kernel=vmlinux.aarch64.20140821 --dtb-filename=vexpress.aarch64.20140821.dtb 
--script=gem5-dist/000.init/util/dist/test/./../../../util/dist/test/bootscript.rcS
 --checkpoint-dir=gem5-dist/000.init/util/dist/test/m5out.0 --dist 
--dist-rank=0 --dist-size=2 --dist-server-name=bhx0062 --dist-server-port=2200



info: Standard input is not a terminal, disabling listeners.

Global frequency set at 1000000000000 ticks per second

      0: etherlink: Switch Link created. Delay: 10000000, Speed: 800

      0: global: DistIface() ctor rank:0

warn: DRAM device capacity (8192 Mbytes) does not match the address range 
assigned (512 Mbytes)

info: kernel located at: gem5-dist/full_system/binaries/vmlinux.aarch64.20140821

warn: Highest ARM exception-level set to AArch32 but bootloader is for AArch64. 
Assuming you wanted these to match.

warn: Sockets disabled, not accepting vnc client connections

warn: Sockets disabled, not accepting terminal connections

      0: etherlink: DistEtherLink::init() called

…

…

…

18290945047000: testsys.realview.ethernet: Checking interrupts icr: 0 imr: 0x9d

18290945047000: testsys.realview.ethernet: Mask cleaned all interrupts

18290945047000: testsys.realview.ethernet: ITR = 0XC3 itr.interval = 0XC3

panic: No 32bit reads implemented for this device. Offset 0x44

Memory Usage: 1243356 KBytes

Program aborted at tick 18372912712000

--- BEGIN LIBC BACKTRACE ---



***** log.1 *****



command line: gem5-dist/000.init/util/dist/test/./../../../build/ARM/gem5.opt 
-d gem5-dist/000.init/util/dist/test/m5out.1 
--debug-flags=EthernetAll,DistEthernetAll 
gem5-dist/000.init/util/dist/test/./../../../configs/example/fs.py 
--cpu-type=AtomicSimpleCPU --num-cpus=2 --machine-type=VExpress_EMM64 
--disk-image=aarch64-ubuntu-trusty-headless.img 
--kernel=vmlinux.aarch64.20140821 --dtb-filename=vexpress.aarch64.20140821.dtb 
--script=gem5-dist/000.init/util/dist/test/./../../../util/dist/test/bootscript.rcS
 --checkpoint-dir=gem5-dist/000.init/util/dist/test/m5out.1 --dist 
--dist-rank=1 --dist-size=2 --dist-server-name=bhx0062 --dist-server-port=2200



info: Standard input is not a terminal, disabling listeners.

Global frequency set at 1000000000000 ticks per second

      0: etherlink: Switch Link created. Delay: 10000000, Speed: 800

      0: global: DistIface() ctor rank:1

warn: DRAM device capacity (8192 Mbytes) does not match the address range 
assigned (512 Mbytes)

info: kernel located at: gem5-dist/full_system/binaries/vmlinux.aarch64.20140821

warn: Highest ARM exception-level set to AArch32 but bootloader is for AArch64. 
Assuming you wanted these to match.

warn: Sockets disabled, not accepting vnc client connections

warn: Sockets disabled, not accepting terminal connections

      0: etherlink: DistEtherLink::init() called



…

…

…

18290981199500: testsys.realview.ethernet: ITR = 0XCD itr.interval = 0XCD

18290982340500: testsys.realview.ethernet: Checking interrupts icr: 0 imr: 0x9d

18290982340500: testsys.realview.ethernet: Mask cleaned all interrupts

18290982340500: testsys.realview.ethernet: ITR = 0XC3 itr.interval = 0XC3

info: recv(): Connection closed

Exiting @ tick 18372920000000 because connection to gem5 peer got closed





Any help would be appreciated.



Thanks,

Richard
IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.
_______________________________________________
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
