Re: [gem5-users] Questions on DRAM Controller model
Hello Andreas,

Even after configuring the model like the actual hardware, I am still not seeing enough interference with the read request under consideration. I am using the classic memory system model. Since it uses the atomic and functional packet allocation protocol, I would like to switch to Ruby (I think it more closely resembles a real platform). I am hitting the problem below when I use Ruby.

/build/ARM/gem5.opt --stats-file=cr1A1.txt configs/example/fs.py --caches --l2cache --l1d_size=32kB --l1i_size=32kB --l2_size=1MB --num-cpus=4 --mem-size=512MB --kernel=/home/prathap/WorkSpace/linux-linaro-tracking-gem5/vmlinux --disk-image=/home/prathap/WorkSpace/gem5/fullsystem/disks/arm-ubuntu-natty-headless.img --machine-type=VExpress_EMM --dtb-file=/home/prathap/WorkSpace/linux-linaro-tracking-gem5/arch/arm/boot/dts/vexpress-v2p-ca15-tc1-gem5_4cpus.dtb --cpu-type=detailed --ruby --mem-type=ddr3_1600_x64

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/prathap/WorkSpace/gem5/src/python/m5/main.py", line 388, in main
    exec filecode in scope
  File "configs/example/fs.py", line 302, in <module>
    test_sys = build_test_system(np)
  File "configs/example/fs.py", line 138, in build_test_system
    Ruby.create_system(options, test_sys, test_sys.iobus, test_sys._dma_ports)
  File "/home/prathap/WorkSpace/gem5/src/python/m5/SimObject.py", line 825, in __getattr__
    raise AttributeError, err_string
AttributeError: object 'LinuxArmSystem' has no attribute '_dma_ports'
  (C++ object is not yet constructed, so wrapped C++ methods are unavailable.)

What could be the cause of this?

Thanks,
Prathap

On Tue, Sep 9, 2014 at 1:35 PM, Andreas Hansson andreas.hans...@arm.com wrote:

Hi Prathap,

There are many possible reasons for the discrepancy, and obviously there are many ways of building a memory controller :-). Have you configured the model to look like the actual hardware? The most obvious differences would be in terms of buffer sizes, the page policy, arbitration policy, the threshold before closing a page, the read/write switching, actual timings, etc. It is also worth checking if the controller hardware treats writes the same way the model does (early responses, minimise switching).

Andreas

From: Prathap Kolakkampadath kvprat...@gmail.com
Date: Tuesday, 9 September 2014 18:56
To: Andreas Hansson andreas.hans...@arm.com
Cc: gem5 users mailing list gem5-users@gem5.org
Subject: Re: [gem5-users] Questions on DRAM Controller model

Hello Andreas,

Thanks for your reply. I read your ISPASS paper and got a fair understanding of the architecture. I am trying to reproduce, in the simulator environment, the results collected from running synthetic benchmarks (latency and bandwidth) on real hardware. However, I see variations in the results and I am trying to understand the reasons.

The experiment has latency (memory non-intensive with random access) as the primary task and bandwidth (memory intensive with sequential access) as the co-runner task.

On real hardware:
case 1 - 0 corunner: latency of the test is 74.88 ns and b/w 854.74 MB/s
case 2 - 1 corunner: latency of the test is 225.95 ns and b/w 283.24 MB/s

On simulator:
case 1 - 0 corunner: latency of the test is 76.08 ns and b/w 802.25 MB/s
case 2 - 1 corunner: latency of the test is 93.69 ns and b/w 651.57 MB/s

In case 1, where the latency test runs alone (0 corunner), the results match in both environments. However, in case 2, when it runs with bandwidth (1 corunner), the results vary a lot. Do you have any thoughts about this?

Thanks,
Prathap

On Mon, Sep 8, 2014 at 1:46 PM, Andreas Hansson andreas.hans...@arm.com wrote:

Hi Prathap,

Have you read our ISPASS paper from last year? It’s referenced in the header file, as well as on gem5.org.

1. Yes and no. Two different buffers are used in the model, but they are random access, so you can treat the entries any way you want.
2. Yes and no. It’s a C++ model, so the scheduler executes in 0 time. Thus, when looking at the various requests it effectively sees all the banks.
3. Yes and no. See above.

Remember that this is a model. The goal is not to be representative down to every last element of an RTL design. The goal is to be representative of a real design, and then be fast. Both of these goals are delivered upon by the model.

I hope that explains it. If there is anything in the results you do not agree with, please do say so.

Thanks,
Andreas

From: Prathap Kolakkampadath via gem5-users gem5-users@gem5.org
Reply-To: Prathap Kolakkampadath kvprat...@gmail.com, gem5 users mailing list gem5-users@gem5.org
Date: Monday, 8 September 2014 18:38
To: gem5 users mailing list gem5-users@gem5.org
Subject: [gem5-users] Questions on DRAM Controller model

Hello Everybody,

I am using DDR3_1600_x64. I am trying to understand the memory controller design and have a few doubts about it.

1) Do the memory
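To put the measurements quoted in this thread side by side, here is a small sketch that computes how much each platform degrades when the co-runner is added. The numbers are the ones reported above; the dictionary layout and the `slowdown` helper are illustrative names only, not part of gem5 or of the benchmarks.

```python
# Measurements quoted in the thread: (latency in ns, bandwidth in MB/s).
results = {
    "hardware":  {"alone": (74.88, 854.74), "corun": (225.95, 283.24)},
    "simulator": {"alone": (76.08, 802.25), "corun": (93.69, 651.57)},
}


def slowdown(alone, corun):
    """Return (latency ratio, bandwidth ratio) of the co-run case vs. running alone."""
    lat_alone, bw_alone = alone
    lat_corun, bw_corun = corun
    return lat_corun / lat_alone, bw_corun / bw_alone


# Hypothetical helper usage: one (latency, bandwidth) ratio pair per platform.
ratios = {name: slowdown(r["alone"], r["corun"]) for name, r in results.items()}

for name, (lat_ratio, bw_ratio) in ratios.items():
    print(f"{name}: latency x{lat_ratio:.2f}, bandwidth x{bw_ratio:.2f}")
```

On these figures the latency test slows down by roughly a factor of three on the real hardware but only by about a fifth in the simulator, which is the gap the thread is trying to explain.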
Re: [gem5-users] Questions on DRAM Controller model
Hello Andreas/Users,

I create a checkpoint after Linux boots using the Atomic Simple CPU and then restore from this checkpoint with the detailed O3 CPU before running the test. I notice that the mem-mode is set to atomic and not timing. Could that be the reason for the low contention on the memory bus that I am observing?

Thanks,
Prathap

On Sun, Oct 12, 2014 at 4:56 PM, Prathap Kolakkampadath kvprat...@gmail.com wrote:

Hello Andreas,

Even after configuring the model like the actual hardware, I am still not seeing enough interference with the read request under consideration. I am using the classic memory system model. Since it uses the atomic and functional packet allocation protocol, I would like to switch to Ruby (I think it more closely resembles a real platform). I am hitting the problem below when I use Ruby.

/build/ARM/gem5.opt --stats-file=cr1A1.txt configs/example/fs.py --caches --l2cache --l1d_size=32kB --l1i_size=32kB --l2_size=1MB --num-cpus=4 --mem-size=512MB --kernel=/home/prathap/WorkSpace/linux-linaro-tracking-gem5/vmlinux --disk-image=/home/prathap/WorkSpace/gem5/fullsystem/disks/arm-ubuntu-natty-headless.img --machine-type=VExpress_EMM --dtb-file=/home/prathap/WorkSpace/linux-linaro-tracking-gem5/arch/arm/boot/dts/vexpress-v2p-ca15-tc1-gem5_4cpus.dtb --cpu-type=detailed --ruby --mem-type=ddr3_1600_x64

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/prathap/WorkSpace/gem5/src/python/m5/main.py", line 388, in main
    exec filecode in scope
  File "configs/example/fs.py", line 302, in <module>
    test_sys = build_test_system(np)
  File "configs/example/fs.py", line 138, in build_test_system
    Ruby.create_system(options, test_sys, test_sys.iobus, test_sys._dma_ports)
  File "/home/prathap/WorkSpace/gem5/src/python/m5/SimObject.py", line 825, in __getattr__
    raise AttributeError, err_string
AttributeError: object 'LinuxArmSystem' has no attribute '_dma_ports'
  (C++ object is not yet constructed, so wrapped C++ methods are unavailable.)

What could be the cause of this?
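As background to the atomic-versus-timing question: in gem5's atomic mode an access completes immediately and returns an estimated latency, so queueing between requesters is not modelled, whereas timing mode does model it. The toy sketch below illustrates only that difference. It is not gem5 code; the function names and the single-server FIFO memory are assumptions made for illustration, and whether this explains the results in this thread depends on which mode the system is actually in after the CPU switch.

```python
def atomic_latencies(arrivals, service_ns):
    """Atomic-style toy model: every access sees the same fixed latency,
    regardless of how many other accesses are outstanding."""
    return [service_ns for _ in arrivals]


def timing_latencies(arrivals, service_ns):
    """Timing-style toy model: one memory server, FIFO order.
    An access that arrives while the server is busy waits for it."""
    free_at = 0  # time at which the server next becomes idle
    latencies = []
    for t in sorted(arrivals):
        start = max(t, free_at)       # wait if the server is still busy
        free_at = start + service_ns  # server is occupied for one service time
        latencies.append(free_at - t)
    return latencies


# Hypothetical workload: three accesses arriving 10 ns apart, 50 ns service time.
arrivals = [0, 10, 20]
print(atomic_latencies(arrivals, 50))
print(timing_latencies(arrivals, 50))
```

In the atomic-style model the co-runner's accesses never delay the latency test, while in the timing-style model each access queues behind the ones before it.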