> On July 1, 2015, 3:07 p.m., Andreas Hansson wrote:
> > src/mem/DRAMCtrl.py, line 460
> > <http://reviews.gem5.org/r/2932/diff/1/?file=47349#file47349line460>
> >
> >     Really? I was under the impression that the vault controllers were 
> > simple closed page. See for example 
> > http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7095154

In fact, the vault controllers do use a simple closed-page policy. However, I 
ran into a problem with the simple closed policy in the DRAMCtrl class: if we 
issue a packet longer than burst_length (for example a 128-byte or 256-byte 
burst, which HMC supports), DRAMCtrl opens and closes the row for every 
subtransaction, whereas close_adaptive keeps the row open until all 
subtransactions have completed.
An alternative would be to keep the simple closed policy but increase 
burst_length to the maximum supported by HMC (128 or 256 bytes); the problem 
then is that smaller transactions would incur a large tBURST latency. A 
minimal sketch of the close_adaptive option is below.
I hope this makes sense.
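
As a rough illustration (page_policy, device_bus_width, and burst_length are 
existing DRAMCtrl parameters, but the class name and the values below are 
assumptions for the sketch, not the actual patch contents):

    # Hypothetical vault controller derived from DRAMCtrl, using
    # close_adaptive so the row stays open across the subtransactions
    # of one long (128/256-byte) HMC packet.
    from m5.objects import DRAMCtrl

    class HMCVault(DRAMCtrl):
        device_bus_width = 32           # assumed 32-bit vault data path
        burst_length = 8                # keep the native burst; longer
                                        # packets split into subtransactions
        page_policy = 'close_adaptive'  # close the row only after the last
                                        # subtransaction of the packet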


> On July 1, 2015, 3:07 p.m., Andreas Hansson wrote:
> > src/mem/DRAMCtrl.py, line 469
> > <http://reviews.gem5.org/r/2932/diff/1/?file=47349#file47349line469>
> >
> >     I am somewhat surprised that we need values this high. With a 32 vault 
> > cube this would amount to an astonishing number of queued transactions (and 
> > there is probably no system out there that would even get close to using 
> > 2048 outstanding transactions).

I performed some accuracy comparisons between the gem5 model and our 
cycle-accurate HMC model, and I realized that the 'buffer_size' parameter in 
gem5 does not correspond directly to the number of flip-flops in the system. 
This is because gem5's standard memory system does not operate on flits; it 
only approximates them. That is acceptable, but we should keep in mind that 
the 'buffer_size' in gem5 does not allow any conclusions about the number of 
hardware flip-flops in those components.
Here is how I obtained this value:
I injected identical high-pressure traffic into both models (gem5 and the 
cycle-accurate model) and tried to match their delivered bandwidth and 
execution time to within 5%.
I observed that if 'buffer_size' is not large enough, gem5 does not deliver 
the intended bandwidth. Once I increased this value, gem5's bandwidth matched 
the cycle-accurate simulation closely.
This also stems from the fact that in real hardware most components interact 
through request/grant or valid/stall handshaking FIFOs, which regulate 
bandwidth, whereas these concepts cannot be modeled directly in gem5's 
standard memory system. For this reason the buffer sizes in the bridge and 
DRAMCtrl components should be tuned to achieve the desired bandwidth and 
execution time; a sketch of the relevant knobs is below.
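
As a rough sketch of the knobs involved (read_buffer_size, write_buffer_size, 
req_size, and resp_size are existing gem5 parameters; the values below are 
illustrative placeholders, not the values used in this patch):

    # Hypothetical calibration of buffering in the vault controller and the
    # serial-link bridge; the sizes are placeholders that would be chosen by
    # matching delivered bandwidth against a cycle-accurate reference.
    from m5.objects import DDR3_1600_x64, Bridge

    vault_ctrl = DDR3_1600_x64()       # stands in for the HMC vault model
    vault_ctrl.read_buffer_size = 64   # enlarged to sustain bandwidth
    vault_ctrl.write_buffer_size = 64  # enlarged to sustain bandwidth

    link_bridge = Bridge(delay='5ns',  # assumed link latency
                         req_size=32,  # assumed request buffering
                         resp_size=32) # assumed response buffering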


- Erfan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2932/#review6656
-----------------------------------------------------------


On July 1, 2015, 10:25 a.m., Erfan Azarkhish wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2932/
> -----------------------------------------------------------
> 
> (Updated July 1, 2015, 10:25 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> -------
> 
> Minor fixes in the HMC vault model
> This patch and the subsequent chain of patches aim to model a simple Hybrid 
> Memory Cube device in gem5
> Please apply the complete chain before running the simulation.
> 
> 
> Diffs
> -----
> 
>   src/mem/DRAMCtrl.py 73d4798871a5 
> 
> Diff: http://reviews.gem5.org/r/2932/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Erfan Azarkhish
> 
>
