Re: [gem5-users] DRAM memory access latency

Prathap Kolakkampadath via gem5-users Thu, 06 Nov 2014 09:48:19 -0800

Hello Andreas,

Thanks for your reply.



OK. I understand now that the memory access latency does include the queueing
latency, and that a read/write request that misses the buffer incurs a static
latency of static frontend latency + static backend latency.


To summarize, the test I run is a latency benchmark: a pointer-chasing test
(only one request at a time) that generates reads to a specific DRAM bank
(bank partitioned). The test runs on cpu0 of a 4-CPU arm_detailed system at
1 GHz with a 1MB shared L2 cache and single-channel LPDDR3 x32 DRAM. The bank
used by cpu0 is not shared with the other CPUs.

Test statistics:

system.mem_ctrls.avgQLat          43816.35    # Average queueing delay per DRAM burst
system.mem_ctrls.avgBusLat         5000.00    # Average bus latency per DRAM burst
system.mem_ctrls.avgMemAccLat     63816.35    # Average memory access latency per DRAM burst
system.mem_ctrls.avgRdQLen            2.00    # Average read queue length when enqueuing
system.mem_ctrls.avgGap          136814.25    # Average gap between requests
system.l2.ReadReq_avg_miss_latency::switch_cpus0.data    114767.654811    # average ReadReq miss latency
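
For reading these numbers in nanoseconds, here is a minimal sketch that parses stats.txt-style lines and converts ticks to ns. It assumes 1 tick = 1 ps, which is how the values are interpreted in this thread; `parse_stats` and the constant names are hypothetical helpers, not part of gem5.

```python
# Stats copied from this message; values are in gem5 ticks.
STATS_TEXT = """\
system.mem_ctrls.avgQLat          43816.35    # Average queueing delay per DRAM burst
system.mem_ctrls.avgBusLat         5000.00    # Average bus latency per DRAM burst
system.mem_ctrls.avgMemAccLat     63816.35    # Average memory access latency per DRAM burst
system.mem_ctrls.avgRdQLen            2.00    # Average read queue length when enqueuing
system.mem_ctrls.avgGap          136814.25    # Average gap between requests
system.l2.ReadReq_avg_miss_latency::switch_cpus0.data    114767.654811    # average ReadReq miss latency
"""

# Assumption: 1 tick = 1 ps, i.e. 1000 ticks per nanosecond.
TICKS_PER_NS = 1000.0

# avgRdQLen is a queue length, not a time, so it is excluded from conversion.
NON_TIME_STATS = {"system.mem_ctrls.avgRdQLen"}


def parse_stats(text):
    """Return {stat_name: value} for lines shaped like 'name value # comment'."""
    stats = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) != 2:
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue
    return stats


stats = parse_stats(STATS_TEXT)
latencies_ns = {
    name: value / TICKS_PER_NS
    for name, value in stats.items()
    if name not in NON_TIME_STATS
}

for name, ns in latencies_ns.items():
    print(f"{name}: {ns:.3f} ns")
```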

Based on the above test statistics:

avgMemAccLat is ~63.8 ns, which I presume is the sum of tRP (15 ns) +
tRCD (15 ns) + tCL (15 ns) + static latency (20 ns).
Is this breakdown correct?
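
A quick arithmetic check of that breakdown against the reported stats, as a sketch only: the 10 ns + 10 ns split of the static latency is an assumption, and nothing here shows how the controller actually composes these values. The proposed sum comes to 65 ns, slightly above the reported value, while the reported stats themselves differ by exactly 15 ns once avgQLat and avgBusLat are subtracted.

```python
# DRAM timings quoted in this thread, in ns.
T_RP, T_RCD, T_CL = 15.0, 15.0, 15.0

# Assumption: the 20 ns static latency is frontend 10 ns + backend 10 ns.
STATIC_FRONTEND, STATIC_BACKEND = 10.0, 10.0

# Reported stats, converted from ticks assuming 1 tick = 1 ps.
avg_q_lat = 43816.35 / 1000.0
avg_bus_lat = 5000.00 / 1000.0
avg_mem_acc_lat = 63816.35 / 1000.0

# Breakdown proposed in the message.
proposed = T_RP + T_RCD + T_CL + STATIC_FRONTEND + STATIC_BACKEND
proposed_error = proposed - avg_mem_acc_lat

# What is left of avgMemAccLat after removing queueing and bus latency.
remainder = avg_mem_acc_lat - (avg_q_lat + avg_bus_lat)

print(f"proposed breakdown:        {proposed:.3f} ns")
print(f"proposed minus reported:   {proposed_error:.3f} ns")
print(f"avgMemAccLat - QLat - Bus: {remainder:.3f} ns")
```

Which timing parameter the 15 ns remainder corresponds to, and what makes up the 43.8 ns counted as queueing, would have to be confirmed in the controller source.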

However, l2.ReadReq_avg_miss_latency is ~114.8 ns, which is ~50 ns more than
avgMemAccLat. I couldn't figure out the components contributing to this
50 ns of latency. Your thoughts on this are much appreciated.
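
The size of the unexplained portion can be computed directly from the stats. This is a sketch assuming 1 tick = 1 ps; the variable names are illustrative. It also compares avgGap with the L2 miss latency, since the quoted messages below report a 22 ns cache-hit latency and describe avgGap as L2 latency + DRAM latency.

```python
# Reported stats in ticks (assumption: 1 tick = 1 ps).
TICKS_PER_NS = 1000.0

l2_miss_ns = 114767.654811 / TICKS_PER_NS  # system.l2.ReadReq_avg_miss_latency
mem_acc_ns = 63816.35 / TICKS_PER_NS       # system.mem_ctrls.avgMemAccLat
avg_gap_ns = 136814.25 / TICKS_PER_NS      # system.mem_ctrls.avgGap

# Latency seen by the L2 on a miss that the controller stat does not cover.
unexplained_ns = l2_miss_ns - mem_acc_ns

# Time per request outside the L2 miss itself.
gap_minus_miss_ns = avg_gap_ns - l2_miss_ns

print(f"L2 miss latency - avgMemAccLat: {unexplained_ns:.3f} ns")
print(f"avgGap - L2 miss latency:       {gap_minus_miss_ns:.3f} ns")
```

The second difference is close to the 22 ns cache-hit latency measured by the benchmark, so the open question is limited to the first one.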

Regards,
Prathap




On Thu, Nov 6, 2014 at 3:03 AM, Andreas Hansson <andreas.hans...@arm.com>
wrote:

>  Hi Prathap,
>
>  The avgMemAccLat does indeed include any queueing latency. For the
> precise components included in the various latencies I would suggest
> checking the source code.
>
>  Note that the controller is not just accounting for the static (and
> dynamic) DRAM latency, but also the static controller pipeline latency (and
> dynamic queueing latency). The controller static latency is set by two
> parameters that by default also add a few tens of nanoseconds.
>
>  Let me know if you need more help breaking out the various components.
>
>  Andreas
>
>   From: Prathap Kolakkampadath via gem5-users <gem5-users@gem5.org>
> Reply-To: Prathap Kolakkampadath <kvprat...@gmail.com>, gem5 users
> mailing list <gem5-users@gem5.org>
> Date: Wednesday, 5 November 2014 05:36
> To: Tao Zhang <tao.zhang.0...@gmail.com>, gem5 users mailing list <
> gem5-users@gem5.org>, Amin Farmahini <amin...@gmail.com>
> Subject: Re: [gem5-users] DRAM memory access latency
>
>  Hi Tao,Amin,
>
>  According to the gem5 source, MemAccLat is the time between a packet
> entering the controller and leaving it. I presume this, added to
> BusLatency and the static backend latency, should match
> system.l2.ReadReq_avg_miss_latency. However, I see a difference of approx.
> 50ns.
>
>
>  As mentioned above, if MemAccLat is the time a packet spends in the memory
> controller, then it should include the queueing latency too. In that case
> the value of avgQLat looks suspicious. Is avgQLat part of
> avgMemAccLat?
>
>  Thanks,
> Prathap
>
>
>
> On Tue, Nov 4, 2014 at 3:11 PM, Tao Zhang <tao.zhang.0...@gmail.com>
> wrote:
>
>>  From the stats, I'd like to use system.mem_ctrls.avgMemAccLat as the
>> overall average memory latency. It is 63.816ns, which is very close to 60ns
>> as you calculated. I guess the extra 3.816ns is due to the refresh penalty.
>>
>> -Tao
>>
>> On Tue, Nov 4, 2014 at 12:10 PM, Prathap Kolakkampadath <
>> kvprat...@gmail.com> wrote:
>>
>>>  Hi Tao, Amin,
>>>
>>>
>>>  Thanks for your reply.
>>>
>>>  To discard interbank interference and queueing delay, i have
>>> partitioned the banks so that the latency benchmark has exclusive access to
>>> a bank. Also latency benchmark is a pointer chasing benchmark, which will
>>> generate a single read request at a time.
>>>
>>>
>>>  stats.txt says this:
>>>
>>> system.mem_ctrls.avgQLat          43816.35    # Average queueing delay per DRAM burst
>>> system.mem_ctrls.avgBusLat         5000.00    # Average bus latency per DRAM burst
>>> system.mem_ctrls.avgMemAccLat     63816.35    # Average memory access latency per DRAM burst
>>> system.mem_ctrls.avgRdQLen            2.00    # Average read queue length when enqueuing
>>> system.mem_ctrls.avgGap          136814.25    # Average gap between requests
>>> system.l2.ReadReq_avg_miss_latency::switch_cpus0.data    114767.654811    # average ReadReq miss latency
>>>
>>>  The average gap between requests is equal to the L2 latency + DRAM
>>> latency for this test. Also, avgRdQLen is 2 because the cache line size is
>>> 64 bytes and the DRAM interface is x32.
>>>
>>>  Is the final latency the sum of avgQLat + avgBusLat + avgMemAccLat?
>>> Also, when avgRdQLen is only 2, I am not sure what accounts for the high
>>> queueing latency.
>>>
>>>  Regards,
>>>  Prathap
>>>
>>>
>>>
>>> On Tue, Nov 4, 2014 at 1:38 PM, Amin Farmahini <amin...@gmail.com>
>>> wrote:
>>>
>>>>  Prathap,
>>>>
>>>>  You are probably missing DRAM queuing latency (major reason) and other
>>>> on-chip latencies (such as bus latency) if any.
>>>>
>>>>  Thanks,
>>>> Amin
>>>>
>>>>  On Tue, Nov 4, 2014 at 1:28 PM, Prathap Kolakkampadath via gem5-users
>>>> <gem5-users@gem5.org> wrote:
>>>>
>>>>>    Hello Users,
>>>>>
>>>>>  I am measuring DRAM worst-case memory access latency (tRP+tRCD
>>>>> +tCL+tBURST) using a latency benchmark on arm_detailed (1 GHz) with a 1MB
>>>>> shared L2 cache and LPDDR3 x32 DRAM.
>>>>>
>>>>>  According to the DRAM timing parameters, tRP = '15ns', tRCD = '15ns',
>>>>> tCL = '15ns', tBURST = '5ns'. The latency measured by the benchmark on a
>>>>> cache hit is 22 ns and on a cache miss is 132 ns, which means the DRAM
>>>>> memory access latency is ~110 ns. However, according to calculation it
>>>>> should be tRP+tRCD+tCL+tBURST+static_backend_latency(10ns) = 60ns.
>>>>>
>>>>>
>>>>>  The latency I observe is almost 50 ns higher than it is supposed to
>>>>> be. Is there anything I am missing? Does anyone know what else could
>>>>> add to the DRAM memory access latency?
>>>>>
>>>>>  Thanks,
>>>>>  Prathap
>>>>>
>>>>>
>>>>>  _______________________________________________
>>>>> gem5-users mailing list
>>>>> gem5-users@gem5.org
>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>>>>>
>>>>
>>>>
>>>
>>
>

