Re: [gem5-users] Simulation core of gem5
Do you mean that gem5 is not cycle accurate? If so, how are the execution and evaluation order determined? If there is a notion of a quantum to synchronize the events, then this is not event-driven. Am I right? I think I'm missing something. Regards, Fela PhD candidate ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Re: [gem5-users] Simulation core of gem5
fela via gem5-users gem5-users at gem5.org writes: I mean, gem5 does not support delta cycles like SystemC, am I right? Regards, Fela PhD candidate
[gem5-users] Ruby stats information
Hi all,

I ran a gem5 + Ruby/Garnet configuration with the following command:

  ./build/ALPHA_MOESI_hammer/gem5.opt configs/example/fs.py --ruby --caches --num-cpus=16 --topology=Mesh --mesh-rows=4 --garnet-network=flexible --script=./scripts/blackscholes_16c_simsmall.rcS

First, I would like to know why no ruby.stats file was generated. I can look at stats.txt and retrieve some information there, but I am wondering why ruby.stats does not appear.

Second, what does each of the stats lines below mean, and what is the unit of these values? I'm a little confused.

  system.ruby.L1Cache_Controller.L1_to_L2::total 2979968
  system.ruby.L1Cache.hit_mach_latency_hist::total 1220365855
  system.ruby.L1Cache.miss_mach_latency_hist::total 58641
  system.ruby.L2Cache.hit_mach_latency_hist::total 2943140

And why is there no system.ruby.L2Cache.miss_mach_latency_hist::total (miss information for the L2 cache)?

Thank you all!
Best regards,
Matheus
--
Sincerely, Matheus Alcântara Souza
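For comparing lines like the ones quoted above, a small parser can help. The following is only a sketch, assuming the usual stats.txt layout of a stat name, a value, and an optional "#" description on each line; the function name parse_stats and the SAMPLE string are illustrative and not part of gem5. It makes no claim about the units of the values.

```python
def parse_stats(text):
    """Map each stat name to its numeric value; skip lines that are not stats."""
    stats = {}
    for line in text.splitlines():
        # Drop a trailing "# description" if one is present (assumed layout).
        line = line.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) < 2:
            continue
        name, raw = fields[0], fields[1]
        try:
            stats[name] = float(raw)
        except ValueError:
            # Banner or other non-numeric lines are ignored.
            continue
    return stats


# The four lines from the message above, used as sample input.
SAMPLE = """\
system.ruby.L1Cache_Controller.L1_to_L2::total 2979968
system.ruby.L1Cache.hit_mach_latency_hist::total 1220365855
system.ruby.L1Cache.miss_mach_latency_hist::total 58641
system.ruby.L2Cache.hit_mach_latency_hist::total 2943140
"""

if __name__ == "__main__":
    for name, value in sorted(parse_stats(SAMPLE).items()):
        print(f"{name:60s} {value:.0f}")
```

The same function can be pointed at the contents of a real stats.txt, for example to list every stat whose name contains "L2Cache" and confirm which histograms are actually present.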
Re: [gem5-users] Simulation core of gem5
Hi Fela,

gem5 supports scheduling events at the current tick (this is used in quite a few places). As such, you can do as much as you want before moving to the next point in time. Also, to answer your previous question, you can build ticked models in gem5 and make them as accurate as you want.

The way I see it, there is one fundamental difference between gem5's event model and the event model of e.g. SystemC or Verilog/VHDL, and that is the built-in notify-update concept in the latter. In gem5, actions take immediate effect, and when you, for example, send a packet, you call a function on the receiving side straight away. In SystemC you split this into a posting phase and an acting phase so that all modules have a chance to compute their next output before acting on new input. If you check out the Minor CPU you can see how it is still possible to achieve something similar using gem5's event model.

In the end, the models currently available in gem5 are a mix of ticked (an event every cycle), cleverly ticked (an event every cycle in which there is something to do), and event-on-action only.

I hope that provides some additional clarity.

Andreas

On 15/10/2014 13:43, fela via gem5-users gem5-users@gem5.org wrote: I mean gem5 do not support delta cycles like systemC, Am I right? Regards, Fela PhD candidate
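The two points in the reply above (events may be scheduled at the current tick, and a send is a direct function call that takes effect immediately) can be illustrated with a toy event queue. This is a sketch only: the class EventQueue and the functions below are hypothetical and are not gem5's actual C++ classes or API.

```python
import heapq
import itertools


class EventQueue:
    """Toy discrete-event queue (illustrative, not gem5's implementation)."""

    def __init__(self):
        self._heap = []
        # The sequence number keeps events at the same tick in FIFO order.
        self._seq = itertools.count()
        self.cur_tick = 0

    def schedule(self, tick, callback):
        if tick < self.cur_tick:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._heap, (tick, next(self._seq), callback))

    def run(self):
        while self._heap:
            tick, _, callback = heapq.heappop(self._heap)
            self.cur_tick = tick
            callback()


def demo():
    eq = EventQueue()
    log = []

    def receiver(value):
        # Immediate effect: invoked directly by the sender, with no
        # separate posting phase as in a notify-update scheme.
        log.append((eq.cur_tick, "recv", value))

    def follow_up():
        log.append((eq.cur_tick, "follow_up"))

    def sender():
        log.append((eq.cur_tick, "send"))
        receiver(42)                         # direct call on the receiving side
        eq.schedule(eq.cur_tick, follow_up)  # more work at the same tick

    def later():
        log.append((eq.cur_tick, "later"))

    eq.schedule(10, sender)
    eq.schedule(20, later)
    eq.run()
    return log


if __name__ == "__main__":
    for entry in demo():
        print(entry)
```

In this sketch the follow-up event runs at tick 10, before time advances to tick 20, which is the same-tick scheduling described in the reply. A notify-update style model would instead buffer the value passed to receiver and apply it only after every callback at that tick had run.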
[gem5-users] Problem to run Asimbench
Hi all,

I have downloaded the full-system Android files for AsimBench, followed the instructions, and modified the FSConfig.py file as shown in the wiki. I launched a simulation in FS mode with the atomic CPU, using m5term, just to check whether the AsimBench apps run correctly. The kernel seems to have booted correctly. Then I used the command lines from the provided rcS files to run some apps:

  root@android:/ # am start -W com.fsck.k9/com.fsck.k9.activity.Accounts (for the k9mail benchmark)
  root@android:/ # am start -n com.adobe.reader/com.adobe.reader.AdobeReader file:///mnt/sdcard/app_data/IISWC-2011.pdf (for the adobe benchmark)

However, I get the following message for both commands:

  Error type 2
  android.util.AndroidException: Can't connect to activity manager; is the system running?
    at com.android.commands.am.Am.run(Am.java:99)
    at com.android.commands.am.Am.main(Am.java:80)
    at com.android.internal.os.RuntimeInit.finishInit(Native Method)
    at com.android.internal.os.RuntimeInit.main(RuntimeInit.java:238)
    at dalvik.system.NativeStart.main(Native Method)

Has anyone faced the same issue? Or has someone managed to run AsimBench in gem5? Maybe I am missing something?

The command I used in gem5:

  build/ARM/gem5.opt configs/example/fs.py --kernel=/dist/m5/system/binaries/vmlinux.smp.ics.arm.asimbench.2.6.35 --disk-image=/dist/m5/system/disks/ARMv7a-ICS-Android.SMP.Asimbench-v3.img --mem-size=256MB

Thanks for your help.
--
Best Regards, SENNI Sophiane Ph.D. candidate - Microelectronics LIRMM - www.lirmm.fr
[gem5-users] Some issues about creating checkpoints and restoring it on parsec benchmarks with ruby, timing cpu and detailed cpu
Hi all,

I use these two commands:

  /mnt/nokrb/tangdj/gem5-new/build/X86/gem5.opt -d /mnt/nokrb/tangdj/gem5-new/Output/parsec_4_freqmine /mnt/nokrb/tangdj/gem5-new/configs/example/fs.py --script=/mnt/nokrb/tangdj/gem5-new/parsec/freqmine_4c_simlarge.rcS --cpu-type=timing --caches --l2cache --kernel=/mnt/nokrb/tangdj/full-system_images/binaries/x86_64-vmlinux-2.6.22.9.smp --l1d_size=16kB --l1i_size=16kB --mem-size=2400MB -n 4 --l2_size=256kB --take-checkpoints=20 --at-instruction --max-checkpoints=1 --l2_size=256kB --ruby --disk-image=/mnt/nokrb/tangdj/full-system_images/disks/x86root-parsec.img

  /mnt/nokrb/tangdj/gem5-new/build/X86/gem5.opt -d /mnt/nokrb/tangdj/gem5-new/Output/parsec_4_freqmine /mnt/nokrb/tangdj/gem5-new/configs/example/fs.py --script=/mnt/nokrb/tangdj/gem5-new/parsec/freqmine_4c_simlarge.rcS --cpu-type=detailed --caches --l2cache --kernel=/mnt/nokrb/tangdj/full-system_images/binaries/x86_64-vmlinux-2.6.22.9.smp --l1d_size=16kB --l1i_size=16kB --mem-size=2400MB -n 4 --l2_size=256kB --checkpoint-restore=20 --at-instruction --maxinsts=1 --restore-with-cpu=timing --ruby --disk-image=/mnt/nokrb/tangdj/full-system_images/disks/x86root-parsec.img

The checkpoint was created and written without any apparent problem, but when the run finished, I found that the file system.pc.com_1.terminal was empty. What is the problem? Is it because the second command overwrites the output file when it starts running?

Thank you very much.
Best regards, Dongjie Tang
Re: [gem5-users] Questions on DRAM Controller model
Thanks Andreas.

On Tue, Oct 14, 2014 at 4:22 PM, Andreas Hansson andreas.hans...@arm.com wrote:

Hello Prathap, I do not dare say, but perhaps some interaction between your generated access sequence and the O3 model (parameters) restricts the number of outstanding L1 misses? There are plenty of debug flags to help in drilling down on this issue. Have a look in src/cpu/o3/SConscript for the O3-related debug flags and src/mem/cache/SConscript for the cache flags. Andreas

From: Prathap Kolakkampadath kvprat...@gmail.com Date: Tuesday, October 14, 2014 at 9:21 PM To: Andreas Hansson andreas.hans...@arm.com Cc: gem5 users mailing list gem5-users@gem5.org Subject: Re: [gem5-users] Questions on DRAM Controller model

Hello Andreas, Whenever I switch to the O3 CPU from a checkpoint, I can see from config.ini that the CPU is switched, but mem_mode is still set to atomic. However, when booting with the O3 CPU directly (without restoring from a checkpoint), mem_mode is set to timing. I am not sure why. Anyhow, I could run my tests on the O3 CPU with mem_mode set to timing (as verified from config.ini). I run a memory-intensive test, which generates a cache miss on every read, in parallel with a pointer-chasing test (one outstanding request at a time), and both CPUs share the same bank of the DRAM controller. In my setup the number of L1 MSHRs is 10, so the memory-intensive test can generate up to 10 outstanding requests at a time. Since the CPU is much faster than the DRAM controller and all the requests target the same bank, I expect the DRAM queue size to be 10 whenever a request arrives from the pointer-chasing test. If this assumption is correct, I should see more interference in the model, as I do on real platforms. Don't you think the DRAM queue would fill up to the number of L1 MSHRs in the above scenario? And what would it take to fill the DRAM queue up to the number of L1 MSHRs?
Thanks, Prathap Kumar Valsan Research Assistant University of Kansas

On Tue, Oct 14, 2014 at 2:30 AM, Andreas Hansson andreas.hans...@arm.com wrote:

Hi Prathap, The O3 CPU only works with the memory system in timing mode, so I do not understand which two points you are comparing when you say the results are exactly the same. The read queue is likely to never fill up unless all these transactions are generated at once. While the first one is being served by the memory controller you may have more coming in, etc., but I do not understand why you think it would ever fill up. For “debugging”, make sure that config.ini actually captures what you think you are simulating. Also, you have a lot of DRAM-related stats in the stats.txt output. Andreas

From: Prathap Kolakkampadath kvprat...@gmail.com Date: Tuesday, 14 October 2014 04:33 To: Andreas Hansson andreas.hans...@arm.com Cc: gem5 users mailing list gem5-users@gem5.org Subject: Re: [gem5-users] Questions on DRAM Controller model

Hi Andreas, users, I ran the test with the ARM O3 CPU (--cpu-type=detailed) and mem_mode=timing, and the results are exactly the same as with mem_mode=atomic. I have partitioned the DRAM banks in software. Both benchmarks, one latency-sensitive and one bandwidth-sensitive (both generate only reads), run in parallel using the same DRAM bank. From the stats file, I observe that the expected number of L2 misses and DRAM requests are generated. In my system, the number of L1 MSHRs is 10 and the number of L2 MSHRs is 32. So I expect that when a request from the latency-sensitive benchmark reaches the DRAM, the readQ size should be 10. However, what I observe is that most of the time the queue does not fill up, and hence there is less queueing latency and interference. I am using the classic memory system with the default DRAM controller, DDR3_1600_x64. The address mapping is RoRaBaChCo, the page policy is open_adaptive, and the scheduler is frfcfs. Do you have any thoughts on this? How could I debug this further? I appreciate your help.
Thanks, Prathap Kumar Valsan Research Assistant University of Kansas

On Mon, Oct 13, 2014 at 4:21 AM, Andreas Hansson andreas.hans...@arm.com wrote:

Hi Prathap, Indeed. The atomic mode is for fast-forwarding only. Once you actually want to get some representative performance numbers, you have to run in timing mode with either the O3 or the Minor CPU model. Andreas

From: Prathap Kolakkampadath kvprat...@gmail.com Date: Monday, 13 October 2014 10:19 To: Andreas Hansson andreas.hans...@arm.com Cc: gem5 users mailing list gem5-users@gem5.org Subject: Re: [gem5-users] Questions on DRAM Controller model

Thanks for your reply. The memory mode I used is atomic. I think I need to run the tests in timing mode, which I believe will show interference and queueing delay similar to real platforms. Prathap

On Oct 13, 2014 2:55 AM, Andreas Hansson
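The advice in the thread above, to confirm that config.ini matches what you believe you are simulating, can be partly automated. The following is a sketch under stated assumptions: that config.ini is a plain INI file with a [system] section holding mem_mode and with CPU sections whose names start with "system.cpu". The sample text, the type values in it, and the function name mem_mode_mismatch are illustrative only and were not taken from a real run.

```python
import configparser

# Illustrative stand-in for a config.ini (assumed layout, not real output).
SAMPLE_CONFIG_INI = """\
[root]
type=Root
children=system

[system]
type=LinuxArmSystem
mem_mode=atomic

[system.cpu]
type=DerivO3CPU
"""


def mem_mode_mismatch(ini_text):
    """Return (mem_mode, O3 CPU sections, True if an O3 CPU lacks timing mode)."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    mem_mode = parser.get("system", "mem_mode", fallback=None)
    # Collect CPU sections whose type string mentions O3 (assumed naming).
    o3_cpus = [
        section
        for section in parser.sections()
        if section.startswith("system.cpu")
        and "O3" in parser.get(section, "type", fallback="")
    ]
    return mem_mode, o3_cpus, bool(o3_cpus) and mem_mode != "timing"


if __name__ == "__main__":
    mode, cpus, mismatch = mem_mode_mismatch(SAMPLE_CONFIG_INI)
    print(f"mem_mode={mode}, O3 CPUs={cpus}, mismatch={mismatch}")
```

Run against the text of an actual config.ini, a True in the last field would flag the situation described in the thread, where the CPU has been switched to O3 but mem_mode still reads atomic.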
[gem5-users] number of cores can only be power of 2 and no bigger than 64
Hi all,

I am using gem5/Garnet to simulate an on-chip network. First I used the example scripts to run the simulation, and it worked well:

  ./build/ALPHA/gem5.debug --debug-flags=NetworkTest --outdir=m5out/test_bc configs/example/ruby_network_test.py --num-cpus=16 --num-dir=16 --topology=Mesh --mesh-rows=4 --sim-cycles=1000 --injectionrate=0.1 --synthetic=0 --maxpackets=1 --garnet-network=fixed

Then I tried to go further with more simulations and found the following problems:

1. num-cpus can only be a power of 2: 2, 4, 8, 16, 32, 64
2. num-cpus can't be bigger than 64; otherwise I get the error "number of cores xxx limited to xxx because of false sharing".

From the error message, I figured out that the Python file ruby_network_test.py contains the following code:

  block_size = 64
  if options.num_cpus > block_size:
      print "Error: Number of cores %d limited to %d because of false sharing" \
          % (options.num_cpus, block_size)
      sys.exit(1)

To make the simulation run, I changed block_size to 256 and then ran with num-cpus=256, but I got the error "segmentation fault, core dumped".

Any idea or suggestion will be appreciated. Thank you
--
Warm regards, Shawn Zhang
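For reference, the guard quoted in the message above can be restated as a standalone Python 3 function. This is only a sketch of that one check; the function name check_core_count is hypothetical, and the sketch says nothing about whether raising block_size is safe or about the cause of the segmentation fault.

```python
def check_core_count(num_cpus, block_size=64):
    """Return the script's error text if num_cpus exceeds block_size, else None."""
    if num_cpus > block_size:
        return ("Error: Number of cores %d limited to %d because of false sharing"
                % (num_cpus, block_size))
    return None


if __name__ == "__main__":
    for n in (16, 64, 256):
        print(n, check_core_count(n) or "ok")
```

With the default block_size of 64, a count of 64 passes and 256 is rejected, which matches the limit reported in the message.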