Hello Andreas

Whenever I switch to the O3 CPU from a checkpoint, I can see from config.ini
that the CPU is switched but mem_mode is still set to atomic.
However, when booting with the O3 CPU directly (without restoring from a
checkpoint), mem_mode is set to timing. I am not sure why. In any case, I was
able to run my tests on the O3 CPU with mem_mode set to timing (as verified
from config.ini).
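
For reference, the config.ini check can be scripted. This is only a minimal sketch: it assumes the file is plain INI text with a [system] section carrying a mem_mode key, and the SAMPLE excerpt and the helper name read_mem_mode are hypothetical, not taken from an actual run.

```python
import configparser


def read_mem_mode(config_text, section="system"):
    """Return the mem_mode value found in INI-style config text, or None."""
    # interpolation=None keeps '%' characters literal; strict=False tolerates
    # repeated sections or keys instead of raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(config_text)
    if not parser.has_section(section):
        return None
    return parser.get(section, "mem_mode", fallback=None)


# Hypothetical excerpt in the style of a config.ini file (assumption).
SAMPLE = """
[root]
type=Root
children=system

[system]
type=LinuxArmSystem
mem_mode=atomic
"""

if __name__ == "__main__":
    print(read_mem_mode(SAMPLE))
```

To check a real run, the text of the config.ini in the simulation output directory would be passed in place of SAMPLE.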

I run a memory-intensive test, which generates a cache miss on every read,
in parallel with a pointer-chasing test (one outstanding request at a
time), and both CPUs share the same bank of the DRAM controller. In my
setup the number of L1 MSHRs is 10, so the memory-intensive test can
generate up to 10 outstanding requests at a time. Since the CPU is much
faster than the DRAM controller and can keep generating outstanding
requests, and all the requests target the same bank, I expect the DRAM queue
size to be 10 whenever a request arrives from the pointer-chasing test. If
this assumption is correct, I should see stronger interference in the model,
similar to what I see on real platforms.

Don't you think the DRAM queue would fill up to the number of L1 MSHRs in
the above scenario? If not, under what conditions would the DRAM queue fill
up to the number of L1 MSHRs?
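
As a rough sanity check of this expectation, here is a back-of-the-envelope sketch based on Little's law for a closed loop with a fixed number of outstanding requests. It is not the gem5 DRAM controller model. The function name and both timing parameters are hypothetical: t_service_ns stands for the time the controller needs per request to the shared bank, and t_other_ns for the time a request spends everywhere else (caches, crossbar, response path).

```python
def dram_queue_estimate(n_outstanding, t_service_ns, t_other_ns):
    """Estimate the average number of requests at the DRAM controller.

    n_outstanding: maximum requests in flight (e.g. the number of L1 MSHRs)
    t_service_ns:  assumed controller time per request to the shared bank
    t_other_ns:    assumed time a request spends outside the controller
    """
    # Throughput is capped both by the controller service rate and by the
    # MSHR-limited loop (n_outstanding requests per round trip).
    throughput = min(1.0 / t_service_ns,
                     n_outstanding / (t_other_ns + t_service_ns))
    # Little's law: requests outside the controller = throughput * time there.
    outside = throughput * t_other_ns
    # Whatever is not elsewhere in the loop is waiting or being served at DRAM.
    return n_outstanding - outside


if __name__ == "__main__":
    # Hypothetical timings: slow per-request service (e.g. row misses).
    print(dram_queue_estimate(10, 30.0, 50.0))
    # Hypothetical timings: fast per-request service (e.g. row hits).
    print(dram_queue_estimate(10, 5.0, 50.0))
```

Under these assumptions the estimate only approaches 10 when the per-request service time dominates the rest of the round trip. When requests are served quickly, most of the 10 outstanding requests are in flight elsewhere, so the controller queue stays short.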

Thanks,
Prathap Kumar Valsan
Research Assistant
University of Kansas

On Tue, Oct 14, 2014 at 2:30 AM, Andreas Hansson <andreas.hans...@arm.com>
wrote:

>  Hi Prathap,
>
>  The O3 CPU only works with the memory system in timing mode, so I do not
> understand what two points you are comparing when you say the results are
> exactly the same.
>
>  The read queue is likely to never fill up unless all these transactions
> are generated at once. While the first one is being served by the memory
> controller you may have more coming in etc, but I do not understand why you
> think it would ever fill up.
>
>  For “debugging” make sure that the config.ini actually captures what you
> think you are simulating. Also, you have a lot of DRAM-related stats in the
> stats.txt output.
>
>  Andreas
>
>   From: Prathap Kolakkampadath <kvprat...@gmail.com>
> Date: Tuesday, 14 October 2014 04:33
>
> To: Andreas Hansson <andreas.hans...@arm.com>
> Cc: gem5 users mailing list <gem5-users@gem5.org>
> Subject: Re: [gem5-users] Questions on DRAM Controller model
>
>    Hi Andreas, users
>
>  I ran the test with the ARM O3 CPU (--cpu-type=detailed) and
> mem_mode=timing; the results are exactly the same as with mem_mode=atomic.
>  I have partitioned the DRAM banks in software. Both benchmarks, one
> latency-sensitive and one bandwidth-sensitive (both generate only reads),
> run in parallel using the same DRAM bank.
> From the stats file, I observe that the expected number of L2 misses and
> DRAM requests are generated.
> In my system, the number of L1 MSHRs is 10 and the number of L2 MSHRs is
> 32. So I expect that when a request from the latency-sensitive benchmark
> arrives at the DRAM, the readQ size should be 10. However, what I observe is
> that most of the time the queue does not fill up, and hence there is less
> queueing latency and interference.
>
>  I am using the classic memory system with the default DRAM
> controller, DDR3_1600_x64. The address mapping is RoRaBaChCo, the page
> policy is open_adaptive, and the scheduler is frfcfs.
>
>  Do you have any thoughts on this? How could I debug this further?
>
>  Appreciate your help.
>
>  Thanks,
>  Prathap Kumar Valsan
>  Research Assistant
>  University of Kansas
>
> On Mon, Oct 13, 2014 at 4:21 AM, Andreas Hansson <andreas.hans...@arm.com>
> wrote:
>
>>  Hi Prathap,
>>
>>  Indeed. The atomic mode is for fast-forwarding only. Once you actually
>> want to get some representative performance numbers you have to run in
>> timing mode with either the O3 or Minor CPU model.
>>
>>  Andreas
>>
>>   From: Prathap Kolakkampadath <kvprat...@gmail.com>
>> Date: Monday, 13 October 2014 10:19
>>
>> To: Andreas Hansson <andreas.hans...@arm.com>
>> Cc: gem5 users mailing list <gem5-users@gem5.org>
>> Subject: Re: [gem5-users] Questions on DRAM Controller model
>>
>>  Thanks for your reply. The memory mode I used is atomic. I think
>> I need to run the tests in timing mode. I believe that will show
>> interference and queueing delay similar to real platforms.
>>
>> Prathap
>> On Oct 13, 2014 2:55 AM, "Andreas Hansson" <andreas.hans...@arm.com>
>> wrote:
>>
>>>  Hi Prathap,
>>>
>>>  I don’t dare say exactly what is going wrong in your setup, but I am
>>> confident that Ruby will not magically make things more representative (it
>>> will likely give you a whole lot more problems though). In the end it is
>>> all about configuring the building blocks to match the system you want to
>>> capture. The crossbars and caches in the classic memory system do make some
>>> simplifications, but I have not yet seen a case when they are not
>>> sufficiently accurate.
>>>
>>>  Have you looked at the various policy settings in the DRAM controller,
>>> e.g. the page policy and address mapping? If you’re trying to correlate
>>> with a real platform, also see Anthony’s ISPASS paper from last year for
>>> some sensible steps in simplifying the problem and dividing it into
>>> manageable chunks.
>>>
>>>  Good luck.
>>>
>>>  Andreas
>>>
>>>   From: Prathap Kolakkampadath <kvprat...@gmail.com>
>>> Date: Monday, 13 October 2014 00:29
>>> To: Andreas Hansson <andreas.hans...@arm.com>
>>> Cc: gem5 users mailing list <gem5-users@gem5.org>
>>> Subject: Re: [gem5-users] Questions on DRAM Controller model
>>>
>>>   Hello Andreas/Users,
>>>
>>> I create a checkpoint after Linux boot using the Atomic Simple CPU
>>> and then restore from this checkpoint to the detailed O3 CPU before running
>>> the test. I notice that mem_mode is set to atomic and not timing. Could
>>> that be the reason for the low contention on the memory bus I am observing?
>>>
>>>  Thanks,
>>>  Prathap
>>>
>>> On Sun, Oct 12, 2014 at 4:56 PM, Prathap Kolakkampadath <
>>> kvprat...@gmail.com> wrote:
>>>
>>>>  Hello Andreas,
>>>>
>>>>  Even after configuring the model like the actual hardware, I am still not
>>>> seeing enough interference with the read request under consideration. I am
>>>> using the classic memory system model. Since it uses the atomic and
>>>> functional packet allocation protocol, I would like to switch to Ruby (I
>>>> think it more closely resembles a real platform).
>>>>
>>>>
>>>>  I am hitting the problem below when I use Ruby.
>>>>
>>>> /build/ARM/gem5.opt --stats-file=cr1A1.txt configs/example/fs.py
>>>> --caches --l2cache --l1d_size=32kB --l1i_size=32kB --l2_size=1MB
>>>> --num-cpus=4 --mem-size=512MB
>>>> --kernel=/home/prathap/WorkSpace/linux-linaro-tracking-gem5/vmlinux
>>>> --disk-image=/home/prathap/WorkSpace/gem5/fullsystem/disks/arm-ubuntu-natty-headless.img
>>>> --machine-type=VExpress_EMM
>>>> --dtb-file=/home/prathap/WorkSpace/linux-linaro-tracking-gem5/arch/arm/boot/dts/vexpress-v2p-ca15-tc1-gem5_4cpus.dtb
>>>> --cpu-type=detailed --ruby --mem-type=ddr3_1600_x64
>>>>
>>>> Traceback (most recent call last):
>>>>   File "<string>", line 1, in <module>
>>>>   File "/home/prathap/WorkSpace/gem5/src/python/m5/main.py", line 388,
>>>> in main
>>>>     exec filecode in scope
>>>>   File "configs/example/fs.py", line 302, in <module>
>>>>     test_sys = build_test_system(np)
>>>>   File "configs/example/fs.py", line 138, in build_test_system
>>>>     Ruby.create_system(options, test_sys, test_sys.iobus,
>>>> test_sys._dma_ports)
>>>>   File "/home/prathap/WorkSpace/gem5/src/python/m5/SimObject.py", line
>>>> 825, in __getattr__
>>>>     raise AttributeError, err_string
>>>> AttributeError: object 'LinuxArmSystem' has no attribute '_dma_ports'
>>>>   (C++ object is not yet constructed, so wrapped C++ methods are
>>>> unavailable.)
>>>>
>>>>  What could be the cause of this?
>>>>
>>>>  Thanks,
>>>> Prathap
>>>>
>>>>
>>>>
>>>> On Tue, Sep 9, 2014 at 1:35 PM, Andreas Hansson <
>>>> andreas.hans...@arm.com> wrote:
>>>>
>>>>>  Hi Prathap,
>>>>>
>>>>>  There are many possible reasons for the discrepancy, and obviously
>>>>> there are many ways of building a memory controller :-). Have you
>>>>> configured the model to look like the actual hardware? The most obvious
>>>>> differences would be in terms of buffer sizes, the page policy,
>>>>> arbitration policy, the threshold before closing a page, the read/write
>>>>> switching, actual timings etc. It is also worth checking if the controller
>>>>> hardware treats writes the same way the model does (early responses,
>>>>> minimise switching).
>>>>>
>>>>>  Andreas
>>>>>
>>>>>   From: Prathap Kolakkampadath <kvprat...@gmail.com>
>>>>> Date: Tuesday, 9 September 2014 18:56
>>>>> To: Andreas Hansson <andreas.hans...@arm.com>
>>>>> Cc: gem5 users mailing list <gem5-users@gem5.org>
>>>>> Subject: Re: [gem5-users] Questions on DRAM Controller model
>>>>>
>>>>>  Hello Andreas,
>>>>>
>>>>>  Thanks for your reply. I read your ISPASS paper and got a fair
>>>>> understanding of the architecture.
>>>>> I am trying to reproduce in the simulator the results collected from
>>>>> running synthetic benchmarks (latency and bandwidth) on real hardware.
>>>>> However, I see variations in the results and I am trying
>>>>> to understand the reasons.
>>>>>
>>>>>  The experiment has latency (memory non-intensive with random access)
>>>>> as the primary task and bandwidth (memory-intensive with sequential
>>>>> access) as the co-runner task.
>>>>>
>>>>>
>>>>>  On real hardware
>>>>> case 1 - 0 corunner : latency of the test is 74.88ns and b/w 854.74MB/s
>>>>> case 2 - 1 corunner : latency of the test is 225.95ns and b/w 283.24MB/s
>>>>>
>>>>>  On simulator
>>>>> case 1 - 0 corunner : latency of the test is 76.08ns and b/w 802.25MB/s
>>>>> case 2 - 1 corunner : latency of the test is 93.69ns and b/w 651.57MB/s
>>>>>
>>>>>
>>>>>  In case 1, where the latency test runs alone (0 corunner), the results
>>>>> match in both environments. However, in case 2, when it runs with
>>>>> bandwidth (1 corunner), the results vary a lot.
>>>>> Do you have any thoughts about this?
>>>>> Thanks,
>>>>> Prathap
>>>>>
>>>>> On Mon, Sep 8, 2014 at 1:46 PM, Andreas Hansson <
>>>>> andreas.hans...@arm.com> wrote:
>>>>>
>>>>>>  Hi Prathap,
>>>>>>
>>>>>>  Have you read our ISPASS paper from last year? It’s referenced in
>>>>>> the header file, as well as on gem5.org.
>>>>>>
>>>>>>    1. Yes and no. Two different buffers are used in the model, but
>>>>>>    they are random access, so you can treat the entries any way you
>>>>>>    want.
>>>>>>    2. Yes and no. It’s a C++ model, so the scheduler executes in 0
>>>>>>    time. Thus, when looking at the various requests it effectively sees
>>>>>>    all the banks.
>>>>>>    3. Yes and no. See above.
>>>>>>
>>>>>> Remember that this is a model. The goal is not to be representative
>>>>>> down to every last element of an RTL design. The goal is to be
>>>>>> representative of a real design, and then be fast. Both of these goals
>>>>>> are delivered upon by the model.
>>>>>>
>>>>>>  I hope that explains it. If there is anything in the results you do
>>>>>> not agree with, please do say so.
>>>>>>
>>>>>>  Thanks,
>>>>>>
>>>>>>  Andreas
>>>>>>
>>>>>>   From: Prathap Kolakkampadath via gem5-users <gem5-users@gem5.org>
>>>>>> Reply-To: Prathap Kolakkampadath <kvprat...@gmail.com>, gem5 users
>>>>>> mailing list <gem5-users@gem5.org>
>>>>>> Date: Monday, 8 September 2014 18:38
>>>>>> To: gem5 users mailing list <gem5-users@gem5.org>
>>>>>> Subject: [gem5-users] Questions on DRAM Controller model
>>>>>>
>>>>>>  Hello Everybody,
>>>>>>
>>>>>> I am using DDR3_1600_x64. I am trying to understand the memory
>>>>>> controller design and have a few questions about it.
>>>>>>
>>>>>> 1) Does the memory controller have a separate bank request buffer (read
>>>>>> and write buffer) for each bank, or just a global queue?
>>>>>> 2) Is there a scheduler per bank that arbitrates between different
>>>>>> queued requests in parallel with the other bank schedulers?
>>>>>> 3) Is there a DRAM bus scheduler that arbitrates between different bank
>>>>>> requests?
>>>>>>
>>>>>> Thanks,
>>>>>> Prathap
>>>>>>
>>>>>> -- IMPORTANT NOTICE: The contents of this email and any attachments
>>>>>> are confidential and may also be privileged. If you are not the intended
>>>>>> recipient, please notify the sender immediately and do not disclose the
>>>>>> contents to any other person, use it for any purpose, or store or copy
>>>>>> the information in any medium. Thank you.
>>>>>>
>>>>>> ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ,
>>>>>> Registered in England & Wales, Company No: 2557590
>>>>>> ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1
>>>>>> 9NJ, Registered in England & Wales, Company No: 2548782
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
>
_______________________________________________
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
