Thanks Andreas.
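A side note for anyone following this thread: the mem_mode check against config.ini that comes up in the quoted messages below can be scripted. This is only a minimal sketch in standalone Python 3. It assumes that config.ini parses as a plain INI file and that the mem_mode key sits in a [system] section; the sample text and the helper name read_mem_mode are made up for illustration and are not part of gem5.

```python
import configparser


def read_mem_mode(ini_text, section="system"):
    """Return the mem_mode value from INI text, or None if it is absent."""
    # strict=False tolerates repeated sections/keys; interpolation is disabled
    # so that '%' characters in values are not treated specially.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    return parser.get(section, "mem_mode", fallback=None)


# Hypothetical excerpt standing in for a real config.ini.
SAMPLE = """
[root]
type=Root
children=system

[system]
type=LinuxArmSystem
mem_mode=atomic
"""

mode = read_mem_mode(SAMPLE)
if mode != "timing":
    print("warning: mem_mode is %r, but the O3 CPU needs 'timing'" % mode)
```

For a real run, the text would come from the config.ini in the simulation output directory, read with open(...).read().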
On Tue, Oct 14, 2014 at 4:22 PM, Andreas Hansson <andreas.hans...@arm.com> wrote:
> Hello Prathap,
>
> I do not dare say, but perhaps some interaction between your generated access sequence and the O3 model (parameters) restricts the number of outstanding L1 misses? There are plenty of debug flags to help in drilling down on this issue. Have a look in src/cpu/o3/SConscript for the O3-related debug flags and src/mem/cache/SConscript for the cache flags.
>
> Andreas
>
> From: Prathap Kolakkampadath <kvprat...@gmail.com>
> Date: Tuesday, October 14, 2014 at 9:21 PM
> To: Andreas Hansson <andreas.hans...@arm.com>
> Cc: gem5 users mailing list <gem5-users@gem5.org>
> Subject: Re: [gem5-users] Questions on DRAM Controller model
>
> Hello Andreas
>
> Whenever I switch to the O3 CPU from a checkpoint, I can see from config.ini that the CPU is getting switched, but the mem_mode is still set to atomic. However, when booting on the O3 CPU itself (without restoring from a checkpoint), the mem_mode is set to timing. Not sure why. Anyhow, I could run my tests on the O3 CPU with mem_mode timing (as verified from config.ini).
>
> I run one memory-intensive test, which generates a cache miss on every read, in parallel with a pointer-chasing test (one outstanding request at a time), and both CPUs share the same bank of the DRAM controller. In my setup, as the number of L1 MSHRs is 10, the memory-intensive test can generate up to 10 outstanding requests at a time. Since the CPU is much faster than the DRAM controller and can keep generating outstanding requests, and all the requests are targeted to the same bank, I expect the DRAM queue size to be 10 all the time when a request comes in from the pointer-chasing test. If this assumption is correct, I would see interference in the model closer to what I see on real platforms.
>
> Don't you think the DRAM queue would get filled up to the number of L1 MSHRs in the above scenario?
> And what would it take to fill the DRAM queue up to the number of L1 MSHRs?
>
> Thanks,
> Prathap Kumar Valsan
> Research Assistant
> University of Kansas
>
> On Tue, Oct 14, 2014 at 2:30 AM, Andreas Hansson <andreas.hans...@arm.com> wrote:
>
>> Hi Prathap,
>>
>> The O3 CPU only works with the memory system in timing mode, so I do not understand what two points you are comparing when you say the results are exactly the same.
>>
>> The read queue is likely to never fill up unless all these transactions are generated at once. While the first one is being served by the memory controller you may have more coming in etc., but I do not understand why you think it would ever fill up.
>>
>> For “debugging”, make sure that the config.ini actually captures what you think you are simulating. Also, you have a lot of DRAM-related stats in the stats.txt output.
>>
>> Andreas
>>
>> From: Prathap Kolakkampadath <kvprat...@gmail.com>
>> Date: Tuesday, 14 October 2014 04:33
>> To: Andreas Hansson <andreas.hans...@arm.com>
>> Cc: gem5 users mailing list <gem5-users@gem5.org>
>> Subject: Re: [gem5-users] Questions on DRAM Controller model
>>
>> Hi Andreas, users
>>
>> I ran the test with the ARM O3 CPU (--cpu-type=detailed) and mem_mode=timing; the results are exactly the same as with mem_mode=atomic.
>> I have partitioned the DRAM banks using software. Both benchmarks, latency-sensitive and bandwidth-sensitive (both generate only reads), run in parallel using the same DRAM bank.
>> From the stats file, I observe that the expected number of L2 misses and DRAM requests are getting generated.
>> In my system, the number of L1 MSHRs is 10 and the number of L2 MSHRs is 32. So I expect that when a request from the latency-sensitive benchmark comes to DRAM, the readQ size has to be 10. However, what I am observing is that most of the time the queue is not getting filled, and hence there is less queueing latency and interference.
>>
>> I am using the classic memory system with the default DRAM controller, DDR3_1600_x64. The address mapping is RoRaBaChCo, the page policy is open_adaptive, and the scheduler is frfcfs.
>>
>> Do you have any thoughts on this? How could I debug this further?
>>
>> Appreciate your help.
>>
>> Thanks,
>> Prathap Kumar Valsan
>> Research Assistant
>> University of Kansas
>>
>> On Mon, Oct 13, 2014 at 4:21 AM, Andreas Hansson <andreas.hans...@arm.com> wrote:
>>
>>> Hi Prathap,
>>>
>>> Indeed. The atomic mode is for fast-forwarding only. Once you actually want to get some representative performance numbers you have to run in timing mode with either the O3 or Minor CPU model.
>>>
>>> Andreas
>>>
>>> From: Prathap Kolakkampadath <kvprat...@gmail.com>
>>> Date: Monday, 13 October 2014 10:19
>>> To: Andreas Hansson <andreas.hans...@arm.com>
>>> Cc: gem5 users mailing list <gem5-users@gem5.org>
>>> Subject: Re: [gem5-users] Questions on DRAM Controller model
>>>
>>> Thanks for your reply. The memory mode I used is atomic. I think I need to run the tests in timing mode, which I believe will show interference and queueing delay similar to real platforms.
>>>
>>> Prathap
>>> On Oct 13, 2014 2:55 AM, "Andreas Hansson" <andreas.hans...@arm.com> wrote:
>>>
>>>> Hi Prathap,
>>>>
>>>> I don’t dare say exactly what is going wrong in your setup, but I am confident that Ruby will not magically make things more representative (it will likely give you a whole lot more problems though). In the end it is all about configuring the building blocks to match the system you want to capture. The crossbars and caches in the classic memory system do make some simplifications, but I have not yet seen a case when they are not sufficiently accurate.
>>>>
>>>> Have you looked at the various policy settings in the DRAM controller, e.g. the page policy and address mapping?
>>>> If you’re trying to correlate with a real platform, also see Anthony’s ISPASS paper from last year for some sensible steps in simplifying the problem and dividing it into manageable chunks.
>>>>
>>>> Good luck.
>>>>
>>>> Andreas
>>>>
>>>> From: Prathap Kolakkampadath <kvprat...@gmail.com>
>>>> Date: Monday, 13 October 2014 00:29
>>>> To: Andreas Hansson <andreas.hans...@arm.com>
>>>> Cc: gem5 users mailing list <gem5-users@gem5.org>
>>>> Subject: Re: [gem5-users] Questions on DRAM Controller model
>>>>
>>>> Hello Andreas/Users,
>>>>
>>>> I have been creating a checkpoint up to the Linux boot using the Atomic Simple CPU and then restoring from it to the detailed O3 CPU before running the test. I notice that the mem-mode is set to atomic and not timing. Could that be the reason for the lower contention on the memory bus that I am observing?
>>>>
>>>> Thanks,
>>>> Prathap
>>>>
>>>> On Sun, Oct 12, 2014 at 4:56 PM, Prathap Kolakkampadath <kvprat...@gmail.com> wrote:
>>>>
>>>>> Hello Andreas,
>>>>>
>>>>> Even after configuring the model like the actual hardware, I am still not seeing enough interference to the read request under consideration. I am using the classic memory system model. Since it uses the atomic and functional packet allocation protocol, I would like to switch to Ruby (I think it more closely resembles a real platform).
>>>>>
>>>>> I am hitting the problem below when I use Ruby.
>>>>>
>>>>> /build/ARM/gem5.opt --stats-file=cr1A1.txt configs/example/fs.py
>>>>> --caches --l2cache --l1d_size=32kB --l1i_size=32kB --l2_size=1MB
>>>>> --num-cpus=4 --mem-size=512MB
>>>>> --kernel=/home/prathap/WorkSpace/linux-linaro-tracking-gem5/vmlinux
>>>>> --disk-image=/home/prathap/WorkSpace/gem5/fullsystem/disks/arm-ubuntu-natty-headless.img
>>>>> --machine-type=VExpress_EMM
>>>>> --dtb-file=/home/prathap/WorkSpace/linux-linaro-tracking-gem5/arch/arm/boot/dts/vexpress-v2p-ca15-tc1-gem5_4cpus.dtb
>>>>> --cpu-type=detailed --ruby --mem-type=ddr3_1600_x64
>>>>>
>>>>> Traceback (most recent call last):
>>>>>   File "<string>", line 1, in <module>
>>>>>   File "/home/prathap/WorkSpace/gem5/src/python/m5/main.py", line 388, in main
>>>>>     exec filecode in scope
>>>>>   File "configs/example/fs.py", line 302, in <module>
>>>>>     test_sys = build_test_system(np)
>>>>>   File "configs/example/fs.py", line 138, in build_test_system
>>>>>     Ruby.create_system(options, test_sys, test_sys.iobus, test_sys._dma_ports)
>>>>>   File "/home/prathap/WorkSpace/gem5/src/python/m5/SimObject.py", line 825, in __getattr__
>>>>>     raise AttributeError, err_string
>>>>> AttributeError: object 'LinuxArmSystem' has no attribute '_dma_ports'
>>>>>   (C++ object is not yet constructed, so wrapped C++ methods are unavailable.)
>>>>>
>>>>> What could be the cause of this?
>>>>>
>>>>> Thanks,
>>>>> Prathap
>>>>>
>>>>> On Tue, Sep 9, 2014 at 1:35 PM, Andreas Hansson <andreas.hans...@arm.com> wrote:
>>>>>
>>>>>> Hi Prathap,
>>>>>>
>>>>>> There are many possible reasons for the discrepancy, and obviously there are many ways of building a memory controller :-). Have you configured the model to look like the actual hardware? The most obvious differences would be in terms of buffer sizes, the page policy, arbitration policy, the threshold before closing a page, the read/write switching, actual timings etc.
>>>>>> It is also worth checking if the controller hardware treats writes the same way the model does (early responses, minimise switching).
>>>>>>
>>>>>> Andreas
>>>>>>
>>>>>> From: Prathap Kolakkampadath <kvprat...@gmail.com>
>>>>>> Date: Tuesday, 9 September 2014 18:56
>>>>>> To: Andreas Hansson <andreas.hans...@arm.com>
>>>>>> Cc: gem5 users mailing list <gem5-users@gem5.org>
>>>>>> Subject: Re: [gem5-users] Questions on DRAM Controller model
>>>>>>
>>>>>> Hello Andreas,
>>>>>>
>>>>>> Thanks for your reply. I read your ISPASS paper and got a fair understanding of the architecture.
>>>>>> I am trying to reproduce, in the simulator environment, the results collected from running synthetic benchmarks (latency and bandwidth) on real hardware. However, I see variations in the results and I am trying to understand the reasons.
>>>>>>
>>>>>> The experiment has latency (memory non-intensive with random access) as the primary task and bandwidth (memory-intensive with sequential access) as the co-runner task.
>>>>>>
>>>>>> On real hardware
>>>>>> case 1 - 0 corunner : latency of the test is 74.88ns and b/w 854.74MB/s
>>>>>> case 2 - 1 corunner : latency of the test is 225.95ns and b/w 283.24MB/s
>>>>>>
>>>>>> On simulator
>>>>>> case 1 - 0 corunner : latency of the test is 76.08ns and b/w 802.25MB/s
>>>>>> case 2 - 1 corunner : latency of the test is 93.69ns and b/w 651.57MB/s
>>>>>>
>>>>>> In case 1, where the latency test runs alone (0 corunner), the results match in both environments. However, in case 2, when it runs with bandwidth (1 corunner), the results vary a lot.
>>>>>> Do you have any thoughts about this?
>>>>>> Thanks,
>>>>>> Prathap
>>>>>>
>>>>>> On Mon, Sep 8, 2014 at 1:46 PM, Andreas Hansson <andreas.hans...@arm.com> wrote:
>>>>>>
>>>>>>> Hi Prathap,
>>>>>>>
>>>>>>> Have you read our ISPASS paper from last year?
>>>>>>> It’s referenced in the header file, as well as on gem5.org.
>>>>>>>
>>>>>>> 1. Yes and no. Two different buffers are used in the model, but they are random access, so you can treat the entries any way you want.
>>>>>>> 2. Yes and no. It’s a C++ model, so the scheduler executes in 0 time. Thus, when looking at the various requests it effectively sees all the banks.
>>>>>>> 3. Yes and no. See above.
>>>>>>>
>>>>>>> Remember that this is a model. The goal is not to be representative down to every last element of an RTL design. The goal is to be representative of a real design, and then be fast. Both of these goals are delivered upon by the model.
>>>>>>>
>>>>>>> I hope that explains it. If there is anything in the results you do not agree with, please do say so.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Andreas
>>>>>>>
>>>>>>> From: Prathap Kolakkampadath via gem5-users <gem5-users@gem5.org>
>>>>>>> Reply-To: Prathap Kolakkampadath <kvprat...@gmail.com>, gem5 users mailing list <gem5-users@gem5.org>
>>>>>>> Date: Monday, 8 September 2014 18:38
>>>>>>> To: gem5 users mailing list <gem5-users@gem5.org>
>>>>>>> Subject: [gem5-users] Questions on DRAM Controller model
>>>>>>>
>>>>>>> Hello Everybody,
>>>>>>>
>>>>>>> I am using DDR3_1600_x64. I am trying to understand the memory controller design and have a few doubts about it.
>>>>>>>
>>>>>>> 1) Does the memory controller have a separate bank request buffer (read and write buffer) for each bank, or just a global queue?
>>>>>>> 2) Is there a scheduler per bank which arbitrates between different queue requests in parallel with other bank schedulers?
>>>>>>> 3) Is there a DRAM bus scheduler that arbitrates between different bank requests?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Prathap
>>>>>>>
>>>>>>> -- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
>>>>>>>
>>>>>>> ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590
>>>>>>> ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782
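On the recurring question in this thread of whether the controller's read queue should sit at the number of L1 MSHRs, a toy model may help build intuition. The sketch below is not the gem5 DRAM controller: it is a made-up tick-based simulation of one CPU with a fixed number of MSHRs feeding one bank, and every name and parameter in it (issue_gap, service_time, transit) is a hypothetical stand-in. It only illustrates that the queue depth seen at the controller depends on how fast misses are issued and on how many outstanding requests are still in transit through caches and crossbars.

```python
from collections import deque


def average_queue_depth(mshrs, issue_gap, service_time, transit, sim_time=100_000):
    """Average number of requests held at a single-bank controller.

    All times are in abstract ticks; every parameter is hypothetical.
    """
    arrivals = deque()    # times at which issued requests reach the controller
    responses = deque()   # times at which responses reach the CPU again
    queue = 0             # requests at the controller (waiting or in service)
    outstanding = 0       # MSHRs currently in use
    next_issue = 0        # earliest tick at which the CPU may issue again
    service_done = None   # completion tick of the request in service, if any
    depth_sum = 0

    for now in range(sim_time):
        # A returning response frees one MSHR.
        while responses and responses[0] <= now:
            responses.popleft()
            outstanding -= 1
        # The bank finishes the request it was serving.
        if service_done is not None and service_done <= now:
            queue -= 1
            responses.append(now + transit)
            service_done = None
        # Requests in flight arrive at the controller queue.
        while arrivals and arrivals[0] <= now:
            arrivals.popleft()
            queue += 1
        # The bank picks up the next request (one at a time).
        if service_done is None and queue > 0:
            service_done = now + service_time
        # The CPU issues a new miss if it has a free MSHR.
        if outstanding < mshrs and now >= next_issue:
            outstanding += 1
            arrivals.append(now + transit)
            next_issue = now + issue_gap
        depth_sum += queue

    return depth_sum / sim_time


for label, args in [
    ("back-to-back misses, no transit delay", (10, 1, 30, 0)),
    ("back-to-back misses, 40-tick transit", (10, 1, 30, 40)),
    ("one miss every 50 ticks", (10, 50, 30, 0)),
]:
    print(label, round(average_queue_depth(*args), 2))
```

In this toy setting the queue only stays close to the MSHR count when misses are issued back to back and nothing is in flight elsewhere; with transit delay or a slower issue rate the average depth drops. Whether either effect explains the behaviour reported above would have to be checked against the O3 and cache debug flags and the DRAM stats in stats.txt.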