Re: [gem5-users] How to use stack distance calculator in gem5.

2015-11-11 Thread Bhaskar Kalita
Hi Andreas,

I installed pydot, and the topology is now much clearer to me. I want to put the
monitor "before L2", so I tried changing BaseCPU.py as follows:

   #self.toL2Bus.master = self.l2cache.cpu_side
   self.monitor = CommMonitor()
   self.monitor.StackDist = StackDistProbe(verify = True)
   self.toL2Bus.master = self.monitor.slave
   self.monitor.master = self.l2cache.cpu_side
   self._cached_ports = ['l2cache.mem_side']

I recompiled and tried to run an example, and received this error:

fatal: system.monitor.stackdist without default or user set value

I also cannot see the CommMonitor anywhere in the dot files.

Do I also need to make changes in se.py and CommMonitor.py? I have
attached the three files for reference.

Thanks,

Bhaskar

On Sun, Nov 8, 2015 at 10:34 PM, Andreas Hansson 
wrote:

> Hi Bhaskar,
>
> Something is not quite right in the topology you are expressing in these
> lines. Have you looked at the graphical output (make sure you have py-dot
> installed)?
>
> You want to trace _after_ the L2? If so, I would suggest connecting the L2
> cache as usual. Then instantiate and connect the monitor:
>
> self.monitor = CommMonitor()
> self.monitor.slave = self.l2cache.mem_side
> self._cached_ports = ['monitor.master']
>
> This will leave the CommMonitor as the “exposed” port being connected
> downwards.
>
> Make sure this is all working before you start fiddling with the probes.
> The graphical output is your friend…
>
> Once the above is working, it should just be a matter of adding a line:
>
> self.monitor.sdprobe = StackDistProbe()
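>
> Putting the two pieces together, a minimal sketch of the relevant lines in
> BaseCPU.py (assuming the usual toL2Bus/l2cache naming, and that nothing else
> is connected to the L2's mem_side) would be:
>
> self.toL2Bus.master = self.l2cache.cpu_side
> self.monitor = CommMonitor()
> self.monitor.slave = self.l2cache.mem_side
> self.monitor.sdprobe = StackDistProbe()
> self._cached_ports = ['monitor.master']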
>
> Andreas
>
> From: gem5-users  on behalf of Bhaskar
> Kalita 
> Reply-To: gem5 users mailing list 
> Date: Sunday, 8 November 2015 at 12:16
>
> To: gem5 users mailing list 
> Subject: Re: [gem5-users] How to use stack distance calculator in gem5.
>
> Hi Andreas,
>
> You did not respond to my previous mail; I hope you can reply to this one. I went
> through the mail archive regarding CommMonitor. I tried again changing
> BaseCPU.py as:
>
> self.toL2Bus.master = self.l2cache.cpu_side
> self._cached_ports = ['l2cache.mem_side']
> self.monitor = CommMonitor()
> self.monitor.stackdist = StackDistProbe(verify = True)
> self.l2cache.cpu_side = self.monitor.master
> self.monitor.slave = self.l2cache.mem_side
>
> I re-compiled and tried to run an example but received the following error:
>
> fatal: system.monitor.stackdist without default or user set value
>
> For reference my command line was:
>
> build/X86/gem5.debug --debug-flag=StackDist --stats-file=hello.txt
> --dump-config=hello.ini --json-config=hello.json configs/example/se.py
> --num-cpus=1 --cpu-type=DerivO3CPU --caches --l1i_size=32kB --l1d_size=32kB
> --l2cache --num-l2caches=1 --l2_size=256kB --l2_assoc=4 -c
> 'tests/test-progs/hello/bin/x86/linux/hello;'
>
> Can you help me figure out where I am making the mistake?
>
> Thanks,
> Bhaskar
>
>
>
>
>
> On Fri, Nov 6, 2015 at 5:03 AM, Bhaskar Kalita 
> wrote:
>
>> Hi Andreas
>>
>> I want to measure the stack distance for the L2 cache, so I tried to place the
>> CommMonitor between toL2Bus.master and l2cache.cpu_side in BaseCPU.py as:
>>
>>#self.toL2Bus.master = self.l2cache.cpu_side
>>#self._cached_ports = ['l2cache.mem_side']
>> self.l2MONITOR = CommMonitor()
>> self.l2MONITOR.stackdist = StackDistProbe(verify = True)
>> self.toL2Bus.master = self.l2MONITOR.slave
>> self.l2MONITOR.master = self.l2cache.cpu_side
>> self._cached_ports = ['l2cache.mem_side']
>>
>> I tried assigning a StackDistProbe to the comm monitor as:
>>
>>  stackdist = Param.StackDistProbe(NULL)
>>
>> I re-compiled with scons build/X86/gem5.debug, which worked fine. The error I
>> got while trying to run an example was:
>>
>>Traceback (most recent call last):
>>   File "", line 1, in 
>>   File
>> "/home/bhaskar/Downloads/gem5-stable-a48faafdb3bf/src/python/m5/main.py",
>> line 389, in main
>> exec filecode in scope
>>   File "configs/example/se.py", line 286, in 
>> Simulation.run(options, root, system, FutureClass)
>>   File
>> "/home/bhaskar/Downloads/gem5-stable-a48faafdb3bf/configs/common/Simulation.py",
>> line 583, in run
>> m5.instantiate(checkpoint_dir)
>>   File
>> "/home/bhaskar/Downloads/gem5-stable-a48faafdb3bf/src/python/m5/simulate.py",
>> line 114, in instantiate
>> for obj in root.descendants(): obj.createCCObject()
>>   File
>> "/home/bhaskar/Downloads/gem5-stable-a48faafdb3bf/src/python/m5/SimObject.py",
>> line 1453, in createCCObject
>> self.getCCParams()
>>   File
>> "/home/bhaskar/Downloads/gem5-stable-a48faafdb3bf/src/python/m5/SimObject.py",
>> line 1400, in getCCParams
>> value = value.getValue()
>>   File
>> "/home/bhaskar/Downloads/gem5-stable-a48faafdb3bf/src/python/m5/SimObject.py",
>> line 1457, in getValue
>> return self.getCCObject()
>>   

Re: [gem5-users] Getting segfault upon restoring a startup of arm-detailed Full system simulation

2015-11-11 Thread rahul shrivastava
Hi All,

I have the following confusions about the checkpointing feature, and I have
been stuck here for quite a while.
1) The page http://www.m5sim.org/Checkpoints says that during restore, gem5
assumes that the checkpoint was taken using AtomicSimpleCPU, so we have to
mention the cpu-type using --restore-with-cpu. This means that while taking the
checkpoint I should also mention the cpu-type. Is my statement correct?
2) Some of the mail threads say that we can take the checkpoint using
AtomicSimpleCPU and then, upon restore, mention the cpu-type along with the
cache configuration (concrete commands are sketched below).

The above two statements seem contradictory.
I tried both, but neither seems to work for me; both techniques give a
segfault. Can you please help me here?
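
For concreteness, my reading of option (2) would be two runs along these lines
(same kernel/dtb/disk and M5_PATH options as in my commands further down; this
is only a sketch of the workflow, not something I have managed to get working):

# take the checkpoint with the default AtomicSimpleCPU and no caches
./build/ARM/gem5.fast configs/example/fs.py -n 4 --machine-type=VExpress_EMM \
    --script=./configs/boot/hack_back_ckpt.rcS ...

# restore from checkpoint 1, switching to the detailed CPU and adding caches
./build/ARM/gem5.fast configs/example/fs.py -n 4 --machine-type=VExpress_EMM \
    --checkpoint-restore=1 --cpu-type=arm_detailed --caches \
    --l1d_size=32kB --l1i_size=32kB ...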


Regards
Rahul


On Tue, Nov 10, 2015 at 2:12 AM, rahul shrivastava 
wrote:

> Hi All,
>
> Some of the links suggested that the checkpoint should be created using the
> default CPU and without any cache configuration as input. I made the
> suggested changes in the command, but on restoring I still get the same
> error. The following is the command that I am executing to take the
> checkpoint:
>
>  ./build/ARM/gem5.fast configs/example/fs.py
> --script=./configs/boot/hack_back_ckpt.rcS -n 4 --machine-type=VExpress_EMM
> --kernel=../linux-linaro-tracking-gem5/vmlinux
> --dtb-filename=../linux-linaro-tracking-gem5/arch/arm/boot/dts/vexpress-v2p-ca15-tc1-gem5_dvfs_per_core_4cpus.dtb
> --disk-image=../disks/arm-ubuntu-natty-headless.img --cpu-clock=[1 GHz,750
> MHz,500 MHz]
>
>
> Could you please shed some light?
>
>
> Regards
> Rahul
>
>
>
>
> On Mon, Nov 9, 2015 at 6:51 PM, rahul shrivastava  > wrote:
>
>> Hi All,
>>
>> I am checkpointing the startup of an arm_detailed full-system simulation
>> using the hack_back_ckpt.rcS script. I can see that the checkpoint is created
>> successfully, but the restore fails with a segfault.
>>
>> : system.remote_gdb.listener: listening for remote gdb on port 7006
>> 0: system.remote_gdb.listener: listening for remote gdb on port 7007
>> Switch at curTick count:1
>> info: Entering event queue @ 3748342464000.  Starting simulation...
>> Switched CPUS @ tick 3748342474000
>> switching cpus
>> info: Entering event queue @ 3748342474000.  Starting simulation...
>> Segmentation fault (core dumped)
>>
>>
>> The command to take the checkpoint is:
>> M5_PATH=$(pwd)/.. ./build/ARM/gem5.fast configs/example/fs.py
>> --script=./configs/boot/hack_back_ckpt.rcS --cpu-type=arm_detailed --caches
>> -n 4 --l1d_size=32kB --l1i_size=32kB --machine-type=VExpress_EMM
>> --kernel=../linux-linaro-tracking-gem5/vmlinux
>> --dtb-filename=../linux-linaro-tracking-gem5/arch/arm/boot/dts/vexpress-v2p-ca15-tc1-gem5_dvfs_per_core_4cpus.dtb
>> --disk-image=../disks/arm-ubuntu-natty-headless.img --cpu-clock=\['1
>> GHz','750 MHz','500 MHz'\]
>>
>> The command to restore is:
>> M5_PATH=$(pwd)/.. ./build/ARM/gem5.fast configs/example/fs.py
>> --checkpoint-restore=1 --restore-with-cpu arm_detailed --caches -n 4
>> --l1d_size=32kB --l1i_size=32kB --machine-type=VExpress_EMM
>> --kernel=../linux-linaro-tracking-gem5/vmlinux
>> --dtb-filename=../linux-linaro-tracking-gem5/arch/arm/boot/dts/vexpress-v2p-ca15-tc1-gem5_dvfs_per_core_4cpus.dtb
>> --disk-image=../disks/arm-ubuntu-natty-headless.img --cpu-clock=\['1
>> GHz','750 MHz','500 MHz'\]
>>
>>
>>
>> Am I giving some wrong option either while checkpointing or restoring?
>> Can you please help me here?
>>
>>
>> Regards
>> Rahul
>>
>
>

Re: [gem5-users] Modelling command bus contention in DRAM controller

2015-11-11 Thread Andreas Hansson
Hi Prathap,

Could you elaborate on why you think this line is causing problems? It sounds
like you are suggesting that this line is too restrictive.

It simply enforces a minimum col-to-col timing, there could still be other 
constraints that are more restrictive.

Andreas

From: gem5-users 
> on behalf of 
Prathap Kolakkampadath >
Reply-To: gem5 users mailing list 
>
Date: Tuesday, 10 November 2015 at 21:30
To: gem5 users mailing list >
Subject: Re: [gem5-users] Modelling command bus contention in DRAM controller

Hi Andreas,

To be more precise, I believe the code snippet below in doDRAMAccess() should
be executed only for a row-hit request. For a row-miss request, why do we have
to update bank.colAllowedAt for all the banks?

// update the time for the next read/write burst for each
// bank (add a max with tCCD/tCCD_L here)
ranks[j]->banks[i].colAllowedAt = std::max(cmd_at + cmd_dly,
                                           ranks[j]->banks[i].colAllowedAt);

Thanks,

Prathap


On Tue, Nov 10, 2015 at 12:13 PM, Prathap Kolakkampadath 
> wrote:
Hi Andreas,

As you said, all the ACT-to-ACT constraints are taken into account.
All col-to-col constraints are taken into account except when there is an open
request (hit) after a closed request (miss).
If I am using the FCFS scheduler and there are two requests in the queue,
Request1 and Request2 as below, then according to the current implementation
the CAS of Request2 is only issued after the CAS of Request1. Is that correct?
I don't see where in doDRAMAccess() the CAS of the second request is updated
ahead of the CAS of the first request.

Request1@Bank1 (PRE-ACT-CAS) --> Request2@Bank2 (CAS)

Could you please clarify?

I will also take a look into the util/dram_sweep_plot.py.

Thanks,
Prathap

On Tue, Nov 10, 2015 at 9:41 AM, Andreas Hansson 
> wrote:
Hi Prathap,

All the col-to-col, act-to-act etc are taken into account, just not command-bus 
contention. Have a look at util/dram_sweep_plot.py for a graphical “test bench” 
for the DRAM controller. As you will see, it never exceeds the theoretical max. 
This script relies on the configs/dram/sweep.py for the actual generation of 
data.

Andreas

From: gem5-users 
> on behalf of 
Prathap Kolakkampadath >
Reply-To: gem5 users mailing list 
>
Date: Monday, 9 November 2015 at 21:53
To: gem5 users mailing list >
Subject: Re: [gem5-users] Modelling command bus contention in DRAM controller

Hello Andreas,

One problem could be when a miss request is followed by a hit request. Taking
the example below: initially the queue has only one request, R1 (miss); as soon
as this request is selected, another request, R2 (hit), arrives in the queue.
Here the CAS of R2 is ready and could be issued right away in the next clock
cycle. However, I believe that while the simulator computes the ready time of
R1, it also recomputes the next CAS that can be issued to the other banks. Thus
the CAS of R2 can now be issued only after the CAS of R1. If I am right, this
could be a problem.

Request1@Bank1 (PRE-ACT-CAS) --> Request2@Bank2 (CAS)

Thanks,
Prathap

On Mon, Nov 9, 2015 at 1:27 PM, Andreas Hansson 
> wrote:
Hi Prathap,

Command-bus contention is intentionally not modelled. The main reason for this 
is to keep the model performant. Moreover, in real devices the command bus is 
typically designed to _not_ be a bottleneck. Admittedly this choice could be 
reassessed if needed.

Andreas

From: gem5-users 
> on behalf of 
Prathap Kolakkampadath >
Reply-To: gem5 users mailing list 
>
Date: Monday, 9 November 2015 at 18:25
To: gem5 users mailing list >
Subject: [gem5-users] Modelling command bus contention in DRAM controller


Hello Users,

After looking closely at doDRAMAccess() in the DRAM controller implementation
in gem5, I suspect that the current implementation may not be taking into
account the command-bus contention that could occur if the DRAM timing
constraints take particular values.

For example, in the scenario below the queue has two closed (row-miss)
requests, one to Bank1 and the other to Bank2.

Request1@Bank1 (PRE-ACT-CAS) --> Request2@Bank2 (PRE-ACT-CAS)

Let's say tRP (8 cycles), tRCD (8 cycles), tCL (8 cycles), and tRRD (8 cycles).
In this case the ACT of R2 and the CAS of R1 become ready to issue at the same time.
At this point one command needs to be 

Re: [gem5-users] Modelling command bus contention in DRAM controller

2015-11-11 Thread Prathap Kolakkampadath
Hello Andreas,

Please see my comments below

Thanks,
Prathap

On Wed, Nov 11, 2015 at 12:38 PM, Andreas Hansson 
wrote:

> Hi Prathap,
>
> I don’t quite understand the statement about the second CAS being issued
> before the first one. FCFS by construction won’t do that (in any case,
> please do not use FCFS for anything besides debugging, it’s really not
> representative).
>

 This could happen even in FR-FCFS, in case a hit request arrives soon
after a miss request has been selected by the scheduler.

>
> The latency you quote for access (2), is that taking the colAllowedAt and
> busBusyUntil into account? Remember that doDRAMAccess is not necessarily
> coinciding with then this access actually starts.
>

 My point here is that a CAS to a bank should be issued as soon as the bank
is available. In that case, request 2 should be ready before request 1.
However, in the current implementation all CAS commands are strictly ordered.

>
> It could very well be that there is a bug, and if there is we should get
> it sorted.
>
 I believe that this could be a bug.

>
> Andreas
>
> From: gem5-users  on behalf of Prathap
> Kolakkampadath 
> Reply-To: gem5 users mailing list 
> Date: Wednesday, 11 November 2015 at 17:43
>
> To: gem5 users mailing list 
> Subject: Re: [gem5-users] Modelling command bus contention in DRAM
> controller
>
> Hello Andreas,
>
> I believe it is restrictive.
> Below is the DRAM trace under fcfs scheduler for two requests, where first
> request is a RowMiss request to Bank0
> and second request is a RowHit request to Bank1.
>
> 1) *Memory access latency of first miss request*.
> From the trace, the Memory access latency of the first miss request is
> 52.5ns (tRP(15) + tRCD(15) + tCL(15) + tBURST(7.5)).
> This is expected.
> 2) *Memory access latency of second request, which is a Hit to a
> different Bank.*
>From the trace, the memory access latency for the second request is
> also 52.5ns
>This is unexpected. CAS of this ready request should have issued before
> the CAS of the first Miss request.
>
> In doDRAMAccess() the miss request is updating the next read/write burst
> of all banks, thus the CAS of Ready request
> can now be issued only after the CAS of the Miss Request.
>
> 321190719635810: system.mem_ctrls: Timing access to addr 4291233984,
> rank/bank/row 0 0 65422
> 321190719635810: system.mem_ctrls: RowMiss:READ
> 321190719635810: system.mem_ctrls: Access to 4291233984, ready at
> 321190719688310 bus busy until 321190719688310.
> 321190719643310: system.mem_ctrls: Timing access to addr 3983119872,
> rank/bank/row 0 1 56019
> 321190719643310: system.mem_ctrls: RowHit:READ
> 321190719643310: system.mem_ctrls: Access to 3983119872, ready at
> 321190719695810 bus busy until 321190719695810.
>
> Please let me know what you think.
>
> Thanks,
> Prathap
>
>
> On Wed, Nov 11, 2015 at 3:00 AM, Andreas Hansson 
> wrote:
>
>> Hi Prathap,
>>
>> Could you elaborate on why you think this line is causing problems. It
>> sounds like you are suggesting this line is too restrictive?
>>
>> It simply enforces a minimum col-to-col timing, there could still be
>> other constraints that are more restrictive.
>>
>> Andreas
>>
>> From: gem5-users  on behalf of Prathap
>> Kolakkampadath 
>> Reply-To: gem5 users mailing list 
>> Date: Tuesday, 10 November 2015 at 21:30
>>
>> To: gem5 users mailing list 
>> Subject: Re: [gem5-users] Modelling command bus contention in DRAM
>> controller
>>
>> Hi Andreas,
>>
>> To be more precise, I believe the below code snippet in doDRAMAccess(),
>> should be called only  for the Row Hit request. For a Row Miss request why
>> do we have to update the bank.colAllowedAt for all the Banks?
>>
>> // update the time for the next read/write burst for each
>> // bank (add a max with tCCD/tCCD_L here)
>> ranks[j]->banks[i].colAllowedAt = std::max(cmd_at + cmd_dly,
>>                                            ranks[j]->banks[i].colAllowedAt);
>>
>>
>> Thanks,
>>
>> Prathap
>>
>>
>>
>> On Tue, Nov 10, 2015 at 12:13 PM, Prathap Kolakkampadath <
>> kvprat...@gmail.com> wrote:
>>
>>> Hi Andreas,
>>>
>>> As you said all the act-act are taken in to account.
>>> All col-to-col is taken in to account except, if there is a open
>>> request(Hit) after a closed request(Miss).
>>> If i am using* FCFS* scheduler, and there are two requests in the queue
>>> Request1 and Request2 like below, according
>>> to the current implementation CAS of Request2 is only issued after CAS
>>> of Request1.  Is that correct?
>>> I don't see in doDramAccess(), where the CAS of second request is
>>> updated ahead of CAS of first request.
>>>
>>> *Request1@Bank1 (PRE-ACT-CAS) --> Request2@Bank2 (CAS)*
>>>
>>> Could you please clarify?
>>>
>>> I will also take a look into the 

Re: [gem5-users] Modelling command bus contention in DRAM controller

2015-11-11 Thread Andreas Hansson
Hi Prathap,

I don’t quite understand the statement about the second CAS being issued before 
the first one. FCFS by construction won’t do that (in any case, please do not 
use FCFS for anything besides debugging, it’s really not representative).

The latency you quote for access (2), is that taking the colAllowedAt and
busBusyUntil into account? Remember that doDRAMAccess is not necessarily
coinciding with when this access actually starts.

It could very well be that there is a bug, and if there is we should get it 
sorted.

Andreas

From: gem5-users 
> on behalf of 
Prathap Kolakkampadath >
Reply-To: gem5 users mailing list 
>
Date: Wednesday, 11 November 2015 at 17:43
To: gem5 users mailing list >
Subject: Re: [gem5-users] Modelling command bus contention in DRAM controller

Hello Andreas,

I believe it is too restrictive.
Below is the DRAM trace under the FCFS scheduler for two requests, where the
first request is a row-miss request to Bank0 and the second request is a
row-hit request to Bank1.

1) Memory access latency of the first (miss) request.
   From the trace, the memory access latency of the first miss request is
   52.5 ns (tRP(15) + tRCD(15) + tCL(15) + tBURST(7.5)).
   This is expected.
2) Memory access latency of the second request, which is a hit to a different bank.
   From the trace, the memory access latency of the second request is also
   52.5 ns.
   This is unexpected; the CAS of this ready request should have issued before
   the CAS of the first (miss) request.

In doDRAMAccess() the miss request updates the next read/write burst time of
all banks, so the CAS of the ready request can now be issued only after the CAS
of the miss request.

321190719635810: system.mem_ctrls: Timing access to addr 4291233984, 
rank/bank/row 0 0 65422
321190719635810: system.mem_ctrls: RowMiss:READ
321190719635810: system.mem_ctrls: Access to 4291233984, ready at 
321190719688310 bus busy until 321190719688310.
321190719643310: system.mem_ctrls: Timing access to addr 3983119872, 
rank/bank/row 0 1 56019
321190719643310: system.mem_ctrls: RowHit:READ
321190719643310: system.mem_ctrls: Access to 3983119872, ready at 
321190719695810 bus busy until 321190719695810.
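
To spell out the arithmetic, plugging the timing values above into a few lines
of Python (outside gem5; this takes cmd_dly to be tBURST and assumes the data
bus is otherwise free in this window, which it is here, since the two bursts
would not overlap):

tRP, tRCD, tCL, tBURST = 15.0, 15.0, 15.0, 7.5   # ns, as in the trace

t1 = 0.0    # row-miss to Bank0 selected at t1
t2 = 7.5    # row-hit to Bank1 arrives 7.5 ns later, as in the trace

cas1 = t1 + tRP + tRCD          # PRE + ACT before the miss's CAS can issue
ready1 = cas1 + tCL + tBURST    # 52.5 ns, matches the trace

# current model: the miss pushes colAllowedAt of *all* banks to cas1 + tBURST,
# so the hit's CAS waits behind the miss's CAS even though Bank1 is ready
cas2_now = max(t2, cas1 + tBURST)
ready2_now = cas2_now + tCL + tBURST        # latency 52.5 ns, matches the trace

# if the CAS could instead issue as soon as Bank1 is available
cas2_ready = t2
ready2_ready = cas2_ready + tCL + tBURST    # latency 22.5 ns

print(ready1 - t1, ready2_now - t2, ready2_ready - t2)   # 52.5 52.5 22.5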

Please let me know what you think.

Thanks,
Prathap


On Wed, Nov 11, 2015 at 3:00 AM, Andreas Hansson 
> wrote:
Hi Prathap,

Could you elaborate on why you think this line is causing problems. It sounds 
like you are suggesting this line is too restrictive?

It simply enforces a minimum col-to-col timing, there could still be other 
constraints that are more restrictive.

Andreas

From: gem5-users 
> on behalf of 
Prathap Kolakkampadath >
Reply-To: gem5 users mailing list 
>
Date: Tuesday, 10 November 2015 at 21:30

To: gem5 users mailing list >
Subject: Re: [gem5-users] Modelling command bus contention in DRAM controller

Hi Andreas,

To be more precise, I believe the code snippet below in doDRAMAccess() should
be executed only for a row-hit request. For a row-miss request, why do we have
to update bank.colAllowedAt for all the banks?

// update the time for the next read/write burst for each
// bank (add a max with tCCD/tCCD_L here)
ranks[j]->banks[i].colAllowedAt = std::max(cmd_at + cmd_dly,
                                           ranks[j]->banks[i].colAllowedAt);


Thanks,

Prathap


On Tue, Nov 10, 2015 at 12:13 PM, Prathap Kolakkampadath 
> wrote:
Hi Andreas,

As you said all the act-act are taken in to account.
All col-to-col is taken in to account except, if there is a open request(Hit) 
after a closed request(Miss).
If i am using FCFS scheduler, and there are two requests in the queue Request1 
and Request2 like below, according
to the current implementation CAS of Request2 is only issued after CAS of 
Request1.  Is that correct?
I don't see in doDramAccess(), where the CAS of second request is updated ahead 
of CAS of first request.

Request1@Bank1 (PRE-ACT-CAS) --> Request2@Bank2 (CAS)

Could you please clarify?

I will also take a look into the util/dram_sweep_plot.py.

Thanks,
Prathap

On Tue, Nov 10, 2015 at 9:41 AM, Andreas Hansson 
> wrote:
Hi Prathap,

All the col-to-col, act-to-act etc are taken into account, just not command-bus 
contention. Have a look at util/dram_sweep_plot.py for a graphical “test bench” 
for the DRAM controller. As you will see, it never exceeds the theoretical max. 
This script relies on the configs/dram/sweep.py for the actual generation of 
data.

Andreas

From: gem5-users 

Re: [gem5-users] Modelling command bus contention in DRAM controller

2015-11-11 Thread Andreas Hansson
Hi Prathap,

Ok, so for FCFS we are seeing the expected behaviour. Agreed?

I completely agree on the point of the ordered CAS, and for FR-FCFS we could
indeed hit the case you describe. Additionally, the model makes scheduling
decisions “conservatively” early (assuming it has to precharge the page), so
there is also an inherent window where we decide to do something, and something
else could show up in the meantime that we would have chosen instead.

I agree that we could fix this. The arguments against: 1) in any case, a real
controller has a pipeline latency that limits the scheduler's visibility, so if
the window is on the order of the “front-end pipeline latency” of the model
then it is not really a problem, since we would have missed those requests in
reality as well (admittedly here it is slightly longer); 2) with more things in
the queues (the typical case), the likelihood of having to make a bad decision
because of this window is very small; 3) I fear it might add quite some
complexity to account for these gaps (as opposed to just tracking the next
CAS), with a very small impact in most full-blown use-cases.

It would be great to actually figure out if this is an issue on larger 
use-cases, and what the performance impact on the simulator is for fixing the 
issue. Will you take a stab at coding up a fix?

Andreas

From: gem5-users 
> on behalf of 
Prathap Kolakkampadath >
Reply-To: gem5 users mailing list 
>
Date: Wednesday, 11 November 2015 at 18:47
To: gem5 users mailing list >
Subject: Re: [gem5-users] Modelling command bus contention in DRAM controller

Hello Andreas,

Please see my comments below

Thanks,
Prathap

On Wed, Nov 11, 2015 at 12:38 PM, Andreas Hansson 
> wrote:
Hi Prathap,

I don’t quite understand the statement about the second CAS being issued before 
the first one. FCFS by construction won’t do that (in any case, please do not 
use FCFS for anything besides debugging, it’s really not representative).

 This could happen even in fr-fcfs, incase a hit request arrives soon after 
 a miss request has been selected by the scheduler.

The latency you quote for access (2), is that taking the colAllowedAt and 
busBusyUntil into account? Remember that doDRAMAccess is not necessarily 
coinciding with then this access actually starts.

 My point here is a  CAS to a bank has to be issued as soon as the bank is 
 available. In that case, the request 2 should be ready before request one. 
 However, in the current implementation, "all CAS are strictly ordered".

It could very well be that there is a bug, and if there is we should get it 
sorted.
 I believe that this could be a bug.

Andreas

From: gem5-users 
> on behalf of 
Prathap Kolakkampadath >
Reply-To: gem5 users mailing list 
>
Date: Wednesday, 11 November 2015 at 17:43

To: gem5 users mailing list >
Subject: Re: [gem5-users] Modelling command bus contention in DRAM controller

Hello Andreas,

I believe it is restrictive.
Below is the DRAM trace under fcfs scheduler for two requests, where first 
request is a RowMiss request to Bank0
and second request is a RowHit request to Bank1.

1) Memory access latency of first miss request.
From the trace, the Memory access latency of the first miss request is 
52.5ns (tRP(15) + tRCD(15) + tCL(15) + tBURST(7.5)).
This is expected.
2) Memory access latency of second request, which is a Hit to a different Bank.
   From the trace, the memory access latency for the second request is also 
52.5ns
   This is unexpected. CAS of this ready request should have issued before the 
CAS of the first Miss request.

In doDRAMAccess() the miss request is updating the next read/write burst of all 
banks, thus the CAS of Ready request
can now be issued only after the CAS of the Miss Request.

321190719635810: system.mem_ctrls: Timing access to addr 4291233984, 
rank/bank/row 0 0 65422
321190719635810: system.mem_ctrls: RowMiss:READ
321190719635810: system.mem_ctrls: Access to 4291233984, ready at 
321190719688310 bus busy until 321190719688310.
321190719643310: system.mem_ctrls: Timing access to addr 3983119872, 
rank/bank/row 0 1 56019
321190719643310: system.mem_ctrls: RowHit:READ
321190719643310: system.mem_ctrls: Access to 3983119872, ready at 
321190719695810 bus busy until 321190719695810.

Please let me know what you think.

Thanks,
Prathap


On Wed, Nov 11, 2015 at 3:00 AM, Andreas Hansson 
> wrote:
Hi Prathap,

Could you elaborate on why you think this 

Re: [gem5-users] Modelling command bus contention in DRAM controller

2015-11-11 Thread Prathap Kolakkampadath
Hello Andreas,

see my comments below.

Thanks,
Prathap

On Wed, Nov 11, 2015 at 12:59 PM, Andreas Hansson 
wrote:

> Hi Prathap,
>
> Ok, so for FCFS we are seeing the expected behaviour. Agreed?
>

  >> Agreed.  Because CAS is ordered.


>
> I completely agree on the point of the ordered CAS, and for FR-FCFS we
> could indeed hit the case you describe. Additionally, the model makes
> scheduling decisions “conservatively” early (assuming it has to precharge
> the page), so there is also an inherent window where we decide to do
> something, and something else could show up in the meanwhile, which we
> would have chosen instead.
>


> I agree that we could fix this. The arguments against: 1) in any case, a
> real controller has a pipeline latency that will limit the visibility to
> the scheduler, so if the window is in the order of the “fronted pipeline
> latency” of the model then it’s not really a problem since we would have
> missed them in reality as well (admittedly here it is slightly longer), 2)
> with more things in the queues (typical case), the likelihood of having to
> make a bad decision because of this window is very small, 3) I fear it
> might add quite some complexity to account for these gaps (as opposed to
> just tracking next CAS), with a very small impact in most full-blown
> use-cases.
>

   >> I agree that this may not be an issue on larger use-cases; however, the
   >> implementation differs from how a real DRAM controller schedules
   >> commands, where a CAS can be reordered based on the readiness of the
   >> respective bank.


>
> It would be great to actually figure out if this is an issue on larger
> use-cases, and what the performance impact on the simulator is for fixing
> the issue. Will you take a stab at coding up a fix?
>

   >> I think this can be easily fixed by updating the next CAS time of the
   >> other banks only if the packet is a row hit. I believe this works
   >> assuming that tRRD for any DRAM module is greater than the CAS-to-CAS
   >> delay.
   >> I did the fix and ran dram_sweep.py; there was absolutely no difference
   >> in the performance, which was expected.
   >> Presently I am not able to anticipate any other complexity.


>
> Andreas
>
> From: gem5-users  on behalf of Prathap
> Kolakkampadath 
> Reply-To: gem5 users mailing list 
> Date: Wednesday, 11 November 2015 at 18:47
>
> To: gem5 users mailing list 
> Subject: Re: [gem5-users] Modelling command bus contention in DRAM
> controller
>
> Hello Andreas,
>
> Please see my comments below
>
> Thanks,
> Prathap
>
> On Wed, Nov 11, 2015 at 12:38 PM, Andreas Hansson  > wrote:
>
>> Hi Prathap,
>>
>> I don’t quite understand the statement about the second CAS being issued
>> before the first one. FCFS by construction won’t do that (in any case,
>> please do not use FCFS for anything besides debugging, it’s really not
>> representative).
>>
>
>  This could happen even in fr-fcfs, incase a hit request arrives soon
> after a miss request has been selected by the scheduler.
>
>>
>> The latency you quote for access (2), is that taking the colAllowedAt and
>> busBusyUntil into account? Remember that doDRAMAccess is not necessarily
>> coinciding with then this access actually starts.
>>
>
>  My point here is a  CAS to a bank has to be issued as soon as the
> bank is available. In that case, the request 2 should be ready before
> request one. However, in the current implementation, "all CAS are strictly
> ordered".
>
>>
>> It could very well be that there is a bug, and if there is we should get
>> it sorted.
>>
>  I believe that this could be a bug.
>
>>
>> Andreas
>>
>> From: gem5-users  on behalf of Prathap
>> Kolakkampadath 
>> Reply-To: gem5 users mailing list 
>> Date: Wednesday, 11 November 2015 at 17:43
>>
>> To: gem5 users mailing list 
>> Subject: Re: [gem5-users] Modelling command bus contention in DRAM
>> controller
>>
>> Hello Andreas,
>>
>> I believe it is restrictive.
>> Below is the DRAM trace under fcfs scheduler for two requests, where
>> first request is a RowMiss request to Bank0
>> and second request is a RowHit request to Bank1.
>>
>> 1) *Memory access latency of first miss request*.
>> From the trace, the Memory access latency of the first miss request
>> is 52.5ns (tRP(15) + tRCD(15) + tCL(15) + tBURST(7.5)).
>> This is expected.
>> 2) *Memory access latency of second request, which is a Hit to a
>> different Bank.*
>>From the trace, the memory access latency for the second request is
>> also 52.5ns
>>This is unexpected. CAS of this ready request should have issued
>> before the CAS of the first Miss request.
>>
>> In doDRAMAccess() the miss request is updating the next read/write burst
>> of all banks, thus the CAS of Ready request
>> can now be issued only after the CAS of the Miss 

Re: [gem5-users] Modelling command bus contention in DRAM controller

2015-11-11 Thread Andreas Hansson
Hi Prathap,

Let me first reiterate that I don’t think this would ever be a problem in a
realistic scenario (the three arguments from before), but it would be good to
quantify the impact.

The “solution” in my view would need the controller to take decisions in a
non-monotonic temporal order, and that would also mean that the data bus
occupancy would have to be tracked as intervals rather than a single end value.
I think the same holds true for the column (and other) constraints. Perhaps the
latter can be “tricked” by not updating it and relying on the other
constraints, but conceptually we would need to track the start and the end, not
just the end. Agreed?
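
Roughly, instead of a single busBusyUntil the controller would have to keep a
set of reserved intervals and search for the earliest gap that fits a burst.
A sketch of the idea only (not gem5 code, and ignoring all the other timing
constraints):

# busy: non-overlapping (start, end) intervals already reserved on the data bus
def earliest_slot(busy, earliest, burst):
    t = earliest
    for start, end in sorted(busy):
        if t + burst <= start:   # the burst fits in the gap before this interval
            break
        t = max(t, end)          # otherwise move past the reserved interval
    busy.append((t, t + burst))
    return t

The same kind of bookkeeping would then be needed for colAllowedAt and the
other per-bank constraints, which is where I fear the extra complexity comes
from.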

Andreas

From: gem5-users 
> on behalf of 
Prathap Kolakkampadath >
Reply-To: gem5 users mailing list 
>
Date: Wednesday, 11 November 2015 at 21:54
To: gem5 users mailing list >
Subject: Re: [gem5-users] Modelling command bus contention in DRAM controller

Hello Andreas,

see my comments below.

Thanks,
Prathap

On Wed, Nov 11, 2015 at 12:59 PM, Andreas Hansson 
> wrote:
Hi Prathap,

Ok, so for FCFS we are seeing the expected behaviour. Agreed?

  >> Agreed.  Because CAS is ordered.


I completely agree on the point of the ordered CAS, and for FR-FCFS we could 
indeed hit the case you describe. Additionally, the model makes scheduling 
decisions “conservatively” early (assuming it has to precharge the page), so 
there is also an inherent window where we decide to do something, and something 
else could show up in the meanwhile, which we would have chosen instead.

I agree that we could fix this. The arguments against: 1) in any case, a real 
controller has a pipeline latency that will limit the visibility to the 
scheduler, so if the window is in the order of the “fronted pipeline latency” 
of the model then it’s not really a problem since we would have missed them in 
reality as well (admittedly here it is slightly longer), 2) with more things in 
the queues (typical case), the likelihood of having to make a bad decision 
because of this window is very small, 3) I fear it might add quite some 
complexity to account for these gaps (as opposed to just tracking next CAS), 
with a very small impact in most full-blown use-cases.

   >> I agree that this may not be an issue on larger use-cases, however the 
implementation differs from how a real DRAM controllers schedules the commands, 
where CAS can be reordered based
   >> on the readiness of the respective Bank.


It would be great to actually figure out if this is an issue on larger 
use-cases, and what the performance impact on the simulator is for fixing the 
issue. Will you take a stab at coding up a fix?

   >> I think this can be easily fixed by "updating the next CAS to banks, only 
if the packet is a row hit". I believe this works assuming tRRD  for any DRAM 
module is greater than the CAS-CAS delay.
   >> I did a fix and ran dram_sweep.py. There was absolutely no difference in 
the performance, which was expected.
   >> Presently i am not able to anticipate any other complexity.


Andreas

From: gem5-users 
> on behalf of 
Prathap Kolakkampadath >
Reply-To: gem5 users mailing list 
>
Date: Wednesday, 11 November 2015 at 18:47

To: gem5 users mailing list >
Subject: Re: [gem5-users] Modelling command bus contention in DRAM controller

Hello Andreas,

Please see my comments below

Thanks,
Prathap

On Wed, Nov 11, 2015 at 12:38 PM, Andreas Hansson 
> wrote:
Hi Prathap,

I don’t quite understand the statement about the second CAS being issued before 
the first one. FCFS by construction won’t do that (in any case, please do not 
use FCFS for anything besides debugging, it’s really not representative).

 This could happen even in fr-fcfs, incase a hit request arrives soon after 
 a miss request has been selected by the scheduler.

The latency you quote for access (2), is that taking the colAllowedAt and 
busBusyUntil into account? Remember that doDRAMAccess is not necessarily 
coinciding with then this access actually starts.

 My point here is a  CAS to a bank has to be issued as soon as the bank is 
 available. In that case, the request 2 should be ready before request one. 
 However, in the current implementation, "all CAS are strictly ordered".

It could very well be that there is a bug, and if there is we should get it 
sorted.
 I believe that this could be a bug.

Andreas

From: gem5-users 
>