[gem5-dev] Review Request 2685: tests: Recategorise regressions based on run time

2015-03-06 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2685/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10736:e92e8235fd08
---
tests: Recategorise regressions based on run time

This patch takes a first stab at recategorising the regression tests
based on actual run times. The simple-atomic and simple-timing runs of
vortex and twolf all finish in less than 180 s, and they are
consequently moved from long to quick. All realview64 linux-boot
regressions take more than 700 s, and they are therefore moved to
long.

Later patches will rename quick to short, and further divide the
regressions into short, medium and long.
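
As a rough illustration of sorting regressions by measured run time, here is a minimal Python sketch. Only the 180 s and 700 s figures come from the description above; treating them as category boundaries, the function name `categorise`, and the sample timings are all assumptions made for illustration, not what the patch itself does.

```python
# Hypothetical thresholds: only the 180 s and 700 s figures appear in the
# description; treating them as category boundaries is an assumption.
QUICK_LIMIT_S = 180.0
LONG_LIMIT_S = 700.0


def categorise(run_time_s):
    """Return a category name for a measured run time in seconds."""
    if run_time_s < QUICK_LIMIT_S:
        return "quick"
    if run_time_s > LONG_LIMIT_S:
        return "long"
    # Run times in between are not covered by the description.
    return "undecided"


# Made-up timings, not measurements from the patch.
sample_times = {
    "50.vortex/simple-atomic": 95.0,
    "10.linux-boot/realview64-simple-atomic": 820.0,
    "example/in-between": 400.0,
}
categories = {name: categorise(t) for name, t in sample_times.items()}
```

The "undecided" bucket is where a later medium category could slot in.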


Diffs
-

  
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic-dual/config.ini PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic-dual/simerr PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic-dual/simout PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic-dual/stats.txt PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic-dual/system.terminal 8a20e2a1562d
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic/config.ini PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic/simerr PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic/simout PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic/stats.txt PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic/system.terminal 8a20e2a1562d
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-timing-dual/config.ini PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-timing-dual/simerr PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-timing-dual/simout PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-timing-dual/stats.txt PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-timing-dual/system.terminal 8a20e2a1562d
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-timing/config.ini PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-timing/simerr PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-timing/simout PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-timing/stats.txt PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-timing/system.terminal 8a20e2a1562d
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-switcheroo-atomic/config.ini PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-switcheroo-atomic/simerr PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-switcheroo-atomic/simout PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-switcheroo-atomic/stats.txt PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-switcheroo-atomic/system.terminal 8a20e2a1562d
  tests/long/se/50.vortex/ref/alpha/tru64/simple-atomic/config.ini 8a20e2a1562d 
  tests/long/se/50.vortex/ref/alpha/tru64/simple-atomic/simerr 8a20e2a1562d 
  tests/long/se/50.vortex/ref/alpha/tru64/simple-atomic/simout 8a20e2a1562d 
  tests/long/se/50.vortex/ref/alpha/tru64/simple-atomic/smred.msg 8a20e2a1562d 
  tests/long/se/50.vortex/ref/alpha/tru64/simple-atomic/smred.out 8a20e2a1562d 
  tests/long/se/50.vortex/ref/alpha/tru64/simple-atomic/stats.txt 8a20e2a1562d 
  tests/long/se/50.vortex/ref/alpha/tru64/simple-timing/config.ini 8a20e2a1562d 
  tests/long/se/50.vortex/ref/alpha/tru64/simple-timing/simerr 8a20e2a1562d 
  tests/long/se/50.vortex/ref/alpha/tru64/simple-timing/simout 8a20e2a1562d 
  tests/long/se/50.vortex/ref/alpha/tru64/simple-timing/smred.msg 8a20e2a1562d 
  tests/long/se/50.vortex/ref/alpha/tru64/simple-timing/smred.out 8a20e2a1562d 
  tests/long/se/50.vortex/ref/alpha/tru64/simple-timing/stats.txt 8a20e2a1562d 
  tests/long/se/50.vortex/ref/arm/linux/simple-atomic/config.ini 8a20e2a1562d 
  tests/long/se/50.vortex/ref/arm/linux/simple-atomic/simerr 8a20e2a1562d 
  tests/long/se/50.vortex/ref/arm/linux/simple-atomic/simout 8a20e2a1562d 
  tests/long/se/50.vortex/ref/arm/linux/simple-atomic/smred.out 8a20e2a1562d 
  tests/long/se/50.vortex/ref/arm/linux/simple-atomic/stats.txt 8a20e2a1562d 
  tests/long/se/50.vortex/ref/arm/linux/simple-timing/config.ini 8a20e2a1562d 
  tests/long/se/50.vortex/ref/arm/linux/simple-timing/simerr 8a20e2a1562d 
  tests/long/se/50.vortex/ref/arm/linux/simple-timing/simout 8a20e2a1562d 
  tests/long/se/50.vortex/ref/arm/linux/simple-timing/smred.out 8a20e2a1562d 
  tests/long/se/50.vortex/ref/arm/linux/simple-timing/stats.txt 8

[gem5-dev] Review Request 2682: arm: Add a GICv2m device

2015-03-06 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2682/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10733:582cb95c654c
---
arm: Add a GICv2m device

This patch adds a new PIO-accessible GICv2m shim. This shim has a PIO
slave port on one side, and SPI 'wires' on the other. It accepts MSIs
from the system and triggers SPIs on the GIC. It is configurable with
a number of frames, each of which has a number of SPIs and a base SPI
offset.

A Linux driver for GICv2m is available upstream.
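
To make the frame configuration concrete, here is a minimal Python sketch of how an MSI write might be turned into an SPI. It assumes the written value selects the SPI number directly and is only accepted when it falls inside the frame's window; the `Frame` class, `msi_to_spi`, and the SPI numbers are hypothetical and do not reflect the actual gic_v2m.cc code, which is not shown in this message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One MSI frame: a contiguous window of SPIs (hypothetical model)."""
    base_spi: int   # first SPI number owned by this frame
    num_spis: int   # how many SPIs the frame owns

    def owns(self, spi):
        return self.base_spi <= spi < self.base_spi + self.num_spis


def msi_to_spi(frames, frame_index, written_value):
    """Return the SPI to raise for an MSI write, or None if out of range.

    Assumption: the value written to the frame selects the SPI directly.
    """
    frame = frames[frame_index]
    return written_value if frame.owns(written_value) else None


# Placeholder configuration: two frames of 32 SPIs each.
frames = [Frame(base_spi=64, num_spis=32), Frame(base_spi=96, num_spis=32)]
```

With this layout a write of 70 to frame 0 raises SPI 70, while a write of 100 to frame 0 is ignored because SPI 100 belongs to frame 1.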


Diffs
-

  src/dev/arm/Gic.py 8a20e2a1562d 
  src/dev/arm/SConscript 8a20e2a1562d 
  src/dev/arm/gic_v2m.hh PRE-CREATION 
  src/dev/arm/gic_v2m.cc PRE-CREATION 

Diff: http://reviews.gem5.org/r/2682/diff/


Testing
---


Thanks,

Andreas Hansson

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] Review Request 2684: test, arm: Add scripts to test checkpoints

2015-03-06 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2684/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10735:7f058afecefe
---
test, arm: Add scripts to test checkpoints

Add a set of scripts to automatically test checkpointing in the
regression framework. The checkpointing tests are similar to the
switcheroo tests, but instead of switching between CPUs, they
checkpoint the system and restore from the checkpoint again. This is
done at regular intervals, typically while booting Linux.

The implementation is fairly straightforward, with the exception that
we have to work around gem5's inability to restore from a checkpoint
after a system has been instantiated. We work around this by forking
off child processes that do the actual simulation and never
instantiate a system in the parent process unless a maximum checkpoint
count is reached (in which case we just simulate the system to
completion in the parent).

Checkpoint testing is currently only enabled for 32- and 64-bit ARM
systems using atomic CPUs.

Note: An unfortunate side-effect of forking is that every new process
will overwrite the stats and terminal output from the previous
process. This means that the output directory only contains data from
the last checkpoint.
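
The fork-per-checkpoint structure described above can be sketched in plain Python as follows. This is a minimal, Unix-only sketch: `run_interval`, the exit codes and `checkpoint_test` are hypothetical stand-ins, and no gem5 calls are made. The real logic lives in tests/configs/checkpoint.py, which is not shown here.

```python
import os

DONE, CHECKPOINTED = 0, 1  # hypothetical child exit codes


def run_interval(index, total_intervals):
    """Stand-in for simulating one interval; not a gem5 call."""
    return DONE if index + 1 >= total_intervals else CHECKPOINTED


def checkpoint_test(total_intervals, max_checkpoints):
    """Fork one child per interval; return (checkpoints taken, who finished)."""
    taken = 0
    while taken < max_checkpoints:
        pid = os.fork()
        if pid == 0:
            # Child: "instantiate" and simulate, then leave without
            # running any of the parent's cleanup code.
            os._exit(run_interval(taken, total_intervals))
        _, status = os.waitpid(pid, 0)
        if os.WEXITSTATUS(status) == DONE:
            return taken, "child"
        taken += 1
    # Checkpoint limit reached: the parent simulates to completion itself.
    return taken, "parent"
```

Because every child starts from a fresh fork of the parent, the parent never holds an instantiated system until the checkpoint limit is hit.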


Diffs
-

  tests/SConscript 8a20e2a1562d 
  tests/configs/checkpoint.py PRE-CREATION 
  tests/configs/realview-simple-atomic-checkpoint.py PRE-CREATION 
  tests/configs/realview64-simple-atomic-checkpoint.py PRE-CREATION 
  
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic-checkpoint/config.ini PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic-checkpoint/config.json PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic-checkpoint/stats.txt PRE-CREATION
  tests/long/fs/10.linux-boot/ref/arm/linux/realview64-simple-atomic-checkpoint/system.terminal 8a20e2a1562d
  tests/quick/fs/10.linux-boot/ref/arm/linux/realview-simple-atomic-checkpoint/config.ini PRE-CREATION
  tests/quick/fs/10.linux-boot/ref/arm/linux/realview-simple-atomic-checkpoint/config.json PRE-CREATION
  tests/quick/fs/10.linux-boot/ref/arm/linux/realview-simple-atomic-checkpoint/stats.txt PRE-CREATION
  tests/quick/fs/10.linux-boot/ref/arm/linux/realview-simple-atomic-checkpoint/system.terminal 8a20e2a1562d

Diff: http://reviews.gem5.org/r/2684/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] Review Request 2681: arm: Remove the 'magic MSI register' in the GIC (PL390)

2015-03-06 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2681/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10732:598eb88a159f
---
arm: Remove the 'magic MSI register' in the GIC (PL390)

This patch removes the code that added this magic register. A
follow-up patch provides a GICv2m MSI shim that gives the same
functionality in a standard ARM system architecture way.


Diffs
-

  src/dev/arm/Gic.py 8a20e2a1562d 
  src/dev/arm/gic_pl390.hh 8a20e2a1562d 
  src/dev/arm/gic_pl390.cc 8a20e2a1562d 

Diff: http://reviews.gem5.org/r/2681/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] Review Request 2683: config: Add soak test for memtest.py

2015-03-06 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2683/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10734:0675f16e3ba9
---
config: Add soak test for memtest.py

This patch adds a random option to memtest.py which allows the user to
easily test valid random tree topologies. The patch also adds a
wrapper script to run soak tests using the newly introduced option.

We also adjust the progress interval and progress limit check to make
the output less noisy, and avoid false positives.
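
For readers unfamiliar with the idea, a seeded random tree generator might look like the sketch below. The nested-list encoding, the function names and the depth and fan-out limits are assumptions for illustration; what memtest.py considers a "valid" topology is not spelled out in this message.

```python
import random


def random_tree(rng, max_depth, max_fanout):
    """Return a nested list describing a random tree.

    Each node is a list of child subtrees; a leaf is an empty list and
    stands in for a tester. Hypothetical encoding, not memtest.py's.
    """
    if max_depth == 0:
        return []
    fanout = rng.randint(1, max_fanout)
    return [random_tree(rng, max_depth - 1, max_fanout) for _ in range(fanout)]


def count_leaves(tree):
    """Number of leaves (testers) in the tree."""
    return 1 if not tree else sum(count_leaves(child) for child in tree)


def depth(tree):
    """Number of levels below the root."""
    return 0 if not tree else 1 + max(depth(child) for child in tree)


# A fixed seed makes a failing topology reproducible.
tree = random_tree(random.Random(1), max_depth=3, max_fanout=2)
```

Passing an explicit `random.Random` instance keeps each soak-test run reproducible from its seed alone.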

Bring on the pain.


Diffs
-

  configs/example/memtest.py 8a20e2a1562d 
  util/memtest-soak.py PRE-CREATION 

Diff: http://reviews.gem5.org/r/2683/diff/


Testing
---


Thanks,

Andreas Hansson



Re: [gem5-dev] Review Request 2680: cpu: o3: record cpi stacks

2015-03-05 Thread Andreas Hansson via gem5-dev
Hi Nilay,

I don’t think anyone is suggesting a general split between function and
stats gathering. The whole point with putting stats in probes is
compartmentalisation. We already have problems with too many stats, and
while some stats definitely make sense as part of the running code, bigger
bundles of isolated functionality are probably better off as probes (a CPI
probe, trace probes etc). Needless to say the probe infrastructure is very
useful for things other than bundling up of stats as well.

The specific patch (#2680) is probably a borderline case, and I am not
suggesting it has to be a probe, I am merely arguing that we should
consider it, and not just blindly add more stats since we already have
issues with the signal-to-noise ratio.
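
To illustrate the compartmentalisation argument, here is a generic Python sketch of a notification point with a detachable stats listener. All class and method names are hypothetical and this is not gem5's probe API; it only shows the shape of the idea.

```python
class ProbePoint:
    """Minimal notification point; names are hypothetical, not gem5's API."""

    def __init__(self):
        self._listeners = []

    def connect(self, listener):
        self._listeners.append(listener)

    def notify(self, payload):
        for listener in self._listeners:
            listener(payload)


class CommitStage:
    """Functional code: it only announces events, it keeps no stats."""

    def __init__(self):
        self.commit_probe = ProbePoint()

    def commit(self, num_insts):
        self.commit_probe.notify(num_insts)


class CommitCounter:
    """Stats bundle living outside the functional code."""

    def __init__(self):
        self.cycles = 0
        self.insts = 0

    def __call__(self, num_insts):
        self.cycles += 1
        self.insts += num_insts


stage = CommitStage()
counter = CommitCounter()
stage.commit_probe.connect(counter)
for n in (2, 0, 4):
    stage.commit(n)
```

If no listener is connected, the stage runs unchanged and nothing is recorded.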

Andreas


On 05/03/2015 04:56, "Nilay Vaish"  wrote:

>On Wed, 4 Mar 2015, Andreas Hansson wrote:
>
>>
>> ---
>> This is an automatically generated e-mail. To reply, visit:
>> http://reviews.gem5.org/r/2680/#review5937
>> ---
>>
>>
>> Should this perhaps be a probe rather?
>>
>
>OK, I just read the commit message from the changeset 10023 91faf6649de0.
>I am going to argue against separating statistics gathering from the so
>called functional code.  One of the ways I use to learn gem5 (or have
>used) is observing how the statistics are being collected.  Moving
>statistics collection to a separate file, makes that collection code less
>visible, which is as important as the functional code itself.  This
>approach was being used in ruby originally.  And I changed it (changeset
>67d9da312ef0) so that a person reading the code knows what statistic is
>being updated when.
>
>--
>Nilay
>


-- IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium.  Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered 
in England & Wales, Company No:  2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, 
Registered in England & Wales, Company No:  2548782


Re: [gem5-dev] Review Request 2680: cpu: o3: record cpi stacks

2015-03-04 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2680/#review5937
---


Should this perhaps be a probe rather?

- Andreas Hansson


On March 4, 2015, 9 a.m., Nilay Vaish wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2680/
> ---
> 
> (Updated March 4, 2015, 9 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10735:c8c9b6d902cb
> ---
> cpu: o3: record cpi stacks
> 
> This patch labels each empty slot of the commit width every cycle using four
> different types of delays: misprediction, fetch, memory and execution.  For
> memory delays, we check if a memory reference instruction is at the head of
> the rob.  Otherwise, we label the slot as execution delayed.  If the rob is
> empty, we assume the reason for the vacancy to be a delay in fetching the
> instruction.  Lastly, if the cpu is squashing instructions, then we assume
> that slots are going vacant because of misprediction.
> 
> 
> Diffs
> -
> 
>   src/cpu/o3/commit.hh 8a20e2a1562d 
>   src/cpu/o3/commit_impl.hh 8a20e2a1562d 
> 
> Diff: http://reviews.gem5.org/r/2680/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Nilay Vaish
> 
>
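
The slot-labelling rule in the quoted description can be sketched in Python as below. The precedence of the checks (squashing first, then an empty ROB, then the instruction at the ROB head) and the choice to give every empty slot in a cycle the same label are assumptions; the description does not state either, and the function names are hypothetical.

```python
def label_empty_slot(squashing, rob_empty, head_is_mem_ref):
    """Pick one of the four delay types for an unused commit slot.

    The order of the checks is an assumption, not taken from the patch.
    """
    if squashing:
        return "misprediction"
    if rob_empty:
        return "fetch"
    if head_is_mem_ref:
        return "memory"
    return "execution"


def label_cycle(commit_width, committed, squashing, rob_empty, head_is_mem_ref):
    """Return a dict counting the labels of this cycle's empty slots."""
    empty = commit_width - committed
    counts = {"misprediction": 0, "fetch": 0, "memory": 0, "execution": 0}
    if empty > 0:
        # Assumption: all empty slots in one cycle share a single label.
        counts[label_empty_slot(squashing, rob_empty, head_is_mem_ref)] += empty
    return counts
```

Summing these per-cycle counts over a run gives the four components of the CPI stack alongside the committed slots.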



Re: [gem5-dev] Configuration GUI for gem5

2015-03-03 Thread Andreas Hansson via gem5-dev
Hi Marcus,

I’d also say this would be a great topic for the User Workshop at ISCA:
http://www.gem5.org/User_workshop_2015


Andreas

On 03/03/2015 15:14, "Ali Saidi via gem5-dev"  wrote:

>Hi Marcus,
>
>Option 1 is probably the most preferred route as the code is much more
>likely to stay in sync and compatible if it's in the same repository and
>tested alongside gem5. The best way to proceed with that option is to
>post your patches to the gem5 review board http://reviews.gem5.org. If
>you're using Mercurial this can be done with the post-review plugin to
>gem5. Take a look at: http://www.gem5.org/Commit_Access
>
>Thanks for doing this, I think it would be a great help to new users!
>
>Ali
>
>
>On 3/3/15, 9:07 AM, "Marcus Johnson via gem5-dev" 
>wrote:
>
>>Hi,
>>
>>I am a senior at Colorado State University working with two other
>>students under Dr. Sudeep Pasricha on a senior design project.  We have
>>created a GUI to aid in configuring gem5 simulations.
>>
>>It currently supports a variety of command line options which are
>>supplied to the se.py and fs.py scripts.  We are beginning to implement
>>some exploration functionality, so that for any particular parameter a
>>user could specify multiple options, and a simulation would run for each.
>>
>>One of the most important advantages this GUI brings is a reduction in
>>the learning curve for using gem5, as well as a huge aid in debugging
>>configuration options.  It has many simple error checks to prevent simple
>>mistakes.
>>
>>We would like to contribute this project open-source to the gem5
>>community, and wanted to see what your thoughts are on integrating it
>>with gem5.  We believe there are two different routes we can take:
>>
>>Option 1 would be to push our code to the gem5 repository.  If this is an
>>acceptable option, what is a good way to proceed with this?
>>
>>Option 2 would be to have our code run completely independent of gem5
>>(the user would specify as an option where they have downloaded gem5 code,
>>or we release it as a patch for gem5).  If this were the case, is there a
>>good way for us to market this to the gem5 community?
>>
>>What are your thoughts on the above two options?
>>
>>Thanks,
>>Marcus Johnson
>>
>
>
>




Re: [gem5-dev] Review Request 2619: cpu: Add L-TAGE branch predictor

2015-03-03 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2619/#review5927
---

Ship it!


Some minor formatting issues, but overall it looks good. Thanks for getting 
this in shape.

- Andreas Hansson


On March 3, 2015, 2:53 a.m., Dibakar Gope wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2619/
> ---
> 
> (Updated March 3, 2015, 2:53 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> It is the L-TAGE predictor from the Branch Prediction Championship, 
> originally coded by Vignyan Reddy, and modified by me.
> 
> Changeset 10727:73315fc01762
> ---
> [mq]: ltage_updated.patch
> 
> 
> Diffs
> -
> 
>   src/cpu/pred/2bit_local.hh 8a20e2a1562d 
>   src/cpu/pred/2bit_local.cc 8a20e2a1562d 
>   src/cpu/pred/BranchPredictor.py 8a20e2a1562d 
>   src/cpu/pred/SConscript 8a20e2a1562d 
>   src/cpu/pred/bi_mode.hh 8a20e2a1562d 
>   src/cpu/pred/bi_mode.cc 8a20e2a1562d 
>   src/cpu/pred/bpred_unit.hh 8a20e2a1562d 
>   src/cpu/pred/bpred_unit.cc 8a20e2a1562d 
>   src/cpu/pred/bpred_unit_impl.hh 8a20e2a1562d 
>   src/cpu/pred/ltage.hh PRE-CREATION 
>   src/cpu/pred/ltage.cc PRE-CREATION 
>   src/cpu/pred/tournament.hh 8a20e2a1562d 
>   src/cpu/pred/tournament.cc 8a20e2a1562d 
> 
> Diff: http://reviews.gem5.org/r/2619/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Dibakar Gope
> 
>



Re: [gem5-dev] Review Request 2636: mem: fix prefetcher bug regarding write buffer hits

2015-03-02 Thread Andreas Hansson via gem5-dev
Great. Let us know if there are still any remaining issues.

We’ve got some additional cache fixes that should be on RB before the end of 
the week.

Andreas

From: Steve Reinhardt <ste...@gmail.com>
Date: Monday, 2 March 2015 16:47
To: Andreas Hansson <andreas.hans...@arm.com>
Cc: Stephan Diestelhorst <stephan.diestelho...@gmail.com>, Default <gem5-dev@gem5.org>
Subject: Re: Review Request 2636: mem: fix prefetcher bug regarding write buffer hits

Done.  Thanks for the reminder.

Steve

On Mon, Mar 2, 2015 at 2:59 AM, Andreas Hansson <andreas.hans...@arm.com> wrote:
This is an automatically generated e-mail. To reply, visit: 
http://reviews.gem5.org/r/2636/


On February 10th, 2015, 5:37 p.m. UTC, Stephan Diestelhorst wrote:

I have had a similar impulse when inspecting this code.  However, the prefetch
hitting a write-back in an upper cache is actually already handled in
Cache::getTimingPacket():

    // Check if the prefetch hit a writeback in an upper cache
    // and if so we will eventually get a HardPFResp from
    // above
    if (snoop_pkt.memInhibitAsserted()) {
        // If we are getting a non-shared response it is dirty
        bool pending_dirty_resp = !snoop_pkt.sharedAsserted();
        markInService(mshr, pending_dirty_resp);
        DPRINTF(Cache, "Upward snoop of prefetch for addr"
                " %#x (%s) hit\n",
                tgt_pkt->getAddr(), tgt_pkt->isSecure()? "s": "ns");
        return NULL;
    }

We are currently testing a patch that shuffles this thing upwards; we will have
more detail tomorrow.  Either way, if we want to go with this patch, it should
address this seemingly dead bit of logic as well.  My suggestion is to give
this an additional day, and in the meantime test as Andreas has suggested.

On February 10th, 2015, 7:44 p.m. UTC, Steve Reinhardt wrote:

Good catch, I hadn't noticed that before.  I believe what's happening in
George's case is that there are multiple L1s above a shared L2, and one of them
has/had the block in O state (dirty shared) and is in the process of writing it
back, while others have the block in the shared state.  So the one with the
block in writeback goes to write back the data (in accordance with the code
snippet you have there), but meanwhile other caches have squashed the prefetch,
so your code never gets executed, because this code immediately above it gets
executed instead:

    // Check to see if the prefetch was squashed by an upper
    // cache (to prevent us from grabbing the line) or if a
    // writeback arrived between the time the prefetch was
    // placed in the MSHRs and when it was selected to be sent.
    if (snoop_pkt.prefetchSquashed() || blk != NULL) {
        DPRINTF(Cache, "Prefetch squashed by cache.  "
                "Deallocating mshr target %#x.\n", mshr->addr);
        [...]
        return NULL;
    }

Interestingly the point where the prefetches get squashed is here, in
handleSnoop():

    // Invalidate any prefetches from below that would strip write permissions
    // MemCmd::HardPFReq is only observed by upstream caches.  After missing
    // above and in its own cache, a new MemCmd::ReadReq is created that
    // downstream caches observe.
    if (pkt->cmd == MemCmd::HardPFReq) {
        DPRINTF(Cache, "Squashing prefetch from lower cache %#x\n",
                pkt->getAddr());
        pkt->setPrefetchSquashed();
        return;
    }

where despite the language about "strip write permissions", the prefetch
appears to get squashed as long as the block is valid, regardless of the state.

On February 10th, 2015, 7:48 p.m. UTC, Steve Reinhardt wrote:

So if nothing else, my commit message describes the bug incorrectly... it's not 
just a matter of hitting in the write buffer, it's handling the case where it 
*both* hits in the write buffer of one upper-level cache, and also gets 
squashed because of a hit in another upper-level cache.  The actual symptom we 
were seeing was that the response from the cache with the write-buffer copy was 
causing an assertion, since the receiving cache wasn't expecting a response 
because it had squashed the prefetch.
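
The interaction described in this thread can be modelled with a few lines of Python. This is a simplified sketch of the order of the two quoted checks only; the function names and the "send_downstream" fallback are assumptions, and none of this is gem5 code.

```python
def handle_upward_snoop(prefetch_squashed, block_present, mem_inhibit):
    """Model the order of the two quoted checks for a prefetch MSHR.

    Returns what the lower cache does with the MSHR.
    """
    if prefetch_squashed or block_present:
        return "deallocate"        # no response expected afterwards
    if mem_inhibit:
        return "await_response"    # an upper cache will supply the data
    return "send_downstream"       # assumed default, not from the thread


def unexpected_response(prefetch_squashed, block_present, mem_inhibit):
    """True when an upper cache responds but the MSHR was already dropped."""
    action = handle_upward_snoop(prefetch_squashed, block_present, mem_inhibit)
    return mem_inhibit and action == "deallocate"
```

The problematic case is the one where one upper cache squashes the prefetch while another asserts mem-inhibit: the squash check wins, yet a response still arrives.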

On February 13th, 2015, 8:37 a.m. UTC, Andreas Hansson wrote:

Here is the fix: http://reviews.gem5.org/r/2654/

The fix has been pushed. I believe this patch can be discarded.


- Andreas


On February 6th, 2015, 12:38 a.m. UTC, Steve Reinhardt wrote:

Review request for Default.
By Steve Reinhardt.

Updated Feb. 6, 2015, 12:38 a.m.

Repository: gem5
Description

Changeset 10683:3147f3a868f7
---
mem: fix prefetcher bug regarding write buffer hits

Prefetches are supposed to be squashed if the block is already
present in a higher-level cache.  We squash appropriately if
the block is in a higher-level ca

Re: [gem5-dev] Review Request 2655: config: Fix for 'android' lookup in disk name

2015-03-02 Thread Andreas Hansson via gem5-dev


> On Feb. 19, 2015, 10:50 p.m., Andreas Hansson wrote:
> > Ship It!

Could someone be kind enough to push this?


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2655/#review5898
---


On Feb. 19, 2015, 10:46 p.m., Rizwana Begum wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2655/
> ---
> 
> (Updated Feb. 19, 2015, 10:46 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10695:74aaa564d5cc
> ---
> config: Fix for 'android' lookup in disk name
> 
> This patch modifies FSConfig.py to look for 'android' only in disk
> image name. Before this patch, 'android' was searched in full
> disk path.
> 
> 
> Diffs
> -
> 
>   configs/common/FSConfig.py 1a6785e37d81 
> 
> Diff: http://reviews.gem5.org/r/2655/diff/
> 
> 
> Testing
> ---
> 
> All quick and long regressions for ARM passed
> 
> 
> Thanks,
> 
> Rizwana Begum
> 
>
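
The difference between the two lookups in the quoted description can be shown with a short Python sketch. The function names and the example path are hypothetical; the actual change to FSConfig.py is not shown in this message.

```python
import os.path


def is_android_image(disk_path):
    """True if 'android' occurs in the image file name, ignoring directories."""
    return "android" in os.path.basename(disk_path)


def is_android_image_old(disk_path):
    """Behaviour before the patch: search the whole path."""
    return "android" in disk_path


# Hypothetical path: a plain Linux image stored under an 'android' directory.
example_path = "/home/user/android-work/disks/linux-arm.img"
```

With the example path, the old lookup reports an Android image because of the directory name, while the file-name-only lookup does not.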



Re: [gem5-dev] Review Request 2636: mem: fix prefetcher bug regarding write buffer hits

2015-03-02 Thread Andreas Hansson via gem5-dev


> On Feb. 10, 2015, 5:37 p.m., Stephan Diestelhorst wrote:
> > I have had a similar impulse when inspecting this code.  However, the
> > prefetch hitting a write-back in an upper cache is actually already handled
> > in Cache::getTimingPacket():
> >
> >     // Check if the prefetch hit a writeback in an upper cache
> >     // and if so we will eventually get a HardPFResp from
> >     // above
> >     if (snoop_pkt.memInhibitAsserted()) {
> >         // If we are getting a non-shared response it is dirty
> >         bool pending_dirty_resp = !snoop_pkt.sharedAsserted();
> >         markInService(mshr, pending_dirty_resp);
> >         DPRINTF(Cache, "Upward snoop of prefetch for addr"
> >                 " %#x (%s) hit\n",
> >                 tgt_pkt->getAddr(), tgt_pkt->isSecure()? "s": "ns");
> >         return NULL;
> >     }
> >
> > We are currently testing a patch that shuffles this thing upwards; we will
> > have more detail tomorrow.  Either way, if we want to go with this patch, it
> > should address this seemingly dead bit of logic as well.  My suggestion is
> > to give this an additional day, and in the meantime test as Andreas has
> > suggested.
> 
> Steve Reinhardt wrote:
> Good catch, I hadn't noticed that before.  I believe what's happening in
> George's case is that there are multiple L1s above a shared L2, and one of
> them has/had the block in O state (dirty shared) and is in the process of
> writing it back, while others have the block in the shared state.  So the one
> with the block in writeback goes to write back the data (in accordance with
> the code snippet you have there), but meanwhile other caches have squashed
> the prefetch, so your code never gets executed, because this code immediately
> above it gets executed instead:
>
>     // Check to see if the prefetch was squashed by an upper
>     // cache (to prevent us from grabbing the line) or if a
>     // writeback arrived between the time the prefetch was
>     // placed in the MSHRs and when it was selected to be sent.
>     if (snoop_pkt.prefetchSquashed() || blk != NULL) {
>         DPRINTF(Cache, "Prefetch squashed by cache.  "
>                 "Deallocating mshr target %#x.\n", mshr->addr);
>         [...]
>         return NULL;
>     }
>
> Interestingly the point where the prefetches get squashed is here, in
> handleSnoop():
>
>     // Invalidate any prefetches from below that would strip write permissions
>     // MemCmd::HardPFReq is only observed by upstream caches.  After missing
>     // above and in its own cache, a new MemCmd::ReadReq is created that
>     // downstream caches observe.
>     if (pkt->cmd == MemCmd::HardPFReq) {
>         DPRINTF(Cache, "Squashing prefetch from lower cache %#x\n",
>                 pkt->getAddr());
>         pkt->setPrefetchSquashed();
>         return;
>     }
>
> where despite the language about "strip write permissions", the prefetch
> appears to get squashed as long as the block is valid, regardless of the
> state.
> 
> Steve Reinhardt wrote:
> So if nothing else, my commit message describes the bug incorrectly... 
> it's not just a matter of hitting in the write buffer, it's handling the case 
> where it *both* hits in the write buffer of one upper-level cache, and also 
> gets squashed because of a hit in another upper-level cache.  The actual 
> symptom we were seeing was that the response from the cache with the 
> write-buffer copy was causing an assertion, since the receiving cache wasn't 
> expecting a response because it had squashed the prefetch.
> 
> Andreas Hansson wrote:
> Here is the fix: http://reviews.gem5.org/r/2654/

The fix has been pushed. I believe this patch can be discarded.


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2636/#review5885
---


On Feb. 6, 2015, 12:38 a.m., Steve Reinhardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2636/
> ---
> 
> (Updated Feb. 6, 2015, 12:38 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10683:3147f3a868f7
> ---
> mem: fix prefetcher bug regarding write buffer hits
> 
> Prefetches are supposed to be squashed if the block is already
> present in a higher-level cache.  We squash appropriately if
> the block is in a higher-level cache or MSHR, but did not
> properly handle the

[gem5-dev] changeset in gem5: mem: Move crossbar default latencies to subcl...

2015-03-02 Thread Andreas Hansson via gem5-dev
changeset 67b3e74de9ae in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=67b3e74de9ae
description:
mem: Move crossbar default latencies to subclasses

This patch introduces a few subclasses to the CoherentXBar and
NoncoherentXBar to distinguish the different uses in the system. We
use the crossbar in a wide range of places: interfacing cores to the
L2, as a system interconnect, connecting I/O and peripherals,
etc. Needless to say, these crossbars have very different performance,
and the clock frequency alone is not enough to distinguish these
scenarios.

Instead of trying to capture every possible case, this patch
introduces dedicated subclasses for the three primary use-cases:
L2XBar, SystemXBar and IOXBar. More can be added if needed, and the
defaults can be overridden.
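
The pattern of subclass defaults that remain overridable can be sketched in plain Python. The classes below are stand-ins that only mimic the names from the description; they are not gem5 SimObjects. The `frontend_latency` parameter and all numeric values are placeholders chosen for illustration, not the defaults the patch introduces.

```python
class CoherentXBar:
    """Stand-in base class; the real one is a gem5 SimObject."""
    width = None             # bytes per cycle, left unset in the base
    frontend_latency = None  # cycles; hypothetical parameter for this sketch

    def __init__(self, **overrides):
        for name, value in overrides.items():
            if not hasattr(type(self), name):
                raise AttributeError("unknown parameter: " + name)
            setattr(self, name, value)


class L2XBar(CoherentXBar):
    # Placeholder defaults for a crossbar between cores and the L2.
    width = 32
    frontend_latency = 1


class SystemXBar(CoherentXBar):
    # Placeholder defaults for a system interconnect.
    width = 16
    frontend_latency = 3


default_bus = L2XBar()
wide_bus = L2XBar(width=64)
```

A config script then picks the subclass that matches the use-case and overrides individual values only where it needs to.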

diffstat:

 configs/common/CacheConfig.py          |   6 +--
 configs/common/FSConfig.py             |  14 
 configs/dram/sweep.py                  |   2 +-
 configs/example/memcheck.py            |   4 +-
 configs/example/memtest.py             |   4 +-
 configs/example/ruby_mem_test.py       |   2 +-
 configs/example/se.py                  |   2 +-
 configs/ruby/Ruby.py                   |   2 +-
 configs/splash2/cluster.py             |  10 +++---
 configs/splash2/run.py                 |   4 +-
 src/cpu/BaseCPU.py                     |   7 +---
 src/mem/XBar.py                        |  51 ++---
 tests/configs/base_config.py           |   4 +-
 tests/configs/memtest-filter.py        |   6 ++--
 tests/configs/memtest.py               |   4 +-
 tests/configs/o3-timing-mp-ruby.py     |   2 +-
 tests/configs/o3-timing-ruby.py        |   2 +-
 tests/configs/simple-atomic-mp-ruby.py |   2 +-
 tests/configs/tgen-dram-ctrl.py        |   2 +-
 tests/configs/tgen-simple-mem.py       |   2 +-
 20 files changed, 84 insertions(+), 48 deletions(-)

diffs (truncated from 458 to 300 lines):

diff -r b4fc9ad648aa -r 67b3e74de9ae configs/common/CacheConfig.py
--- a/configs/common/CacheConfig.py Mon Mar 02 04:00:46 2015 -0500
+++ b/configs/common/CacheConfig.py Mon Mar 02 04:00:47 2015 -0500
@@ -65,14 +65,12 @@
 if options.l2cache:
 # Provide a clock for the L2 and the L1-to-L2 bus here as they
 # are not connected using addTwoLevelCacheHierarchy. Use the
-# same clock as the CPUs, and set the L1-to-L2 bus width to 32
-# bytes (256 bits).
+# same clock as the CPUs.
 system.l2 = l2_cache_class(clk_domain=system.cpu_clk_domain,
size=options.l2_size,
assoc=options.l2_assoc)
 
-system.tol2bus = CoherentXBar(clk_domain = system.cpu_clk_domain,
-  width = 32)
+system.tol2bus = L2XBar(clk_domain = system.cpu_clk_domain)
 system.l2.cpu_side = system.tol2bus.master
 system.l2.mem_side = system.membus.slave
 
diff -r b4fc9ad648aa -r 67b3e74de9ae configs/common/FSConfig.py
--- a/configs/common/FSConfig.pyMon Mar 02 04:00:46 2015 -0500
+++ b/configs/common/FSConfig.pyMon Mar 02 04:00:47 2015 -0500
@@ -50,7 +50,7 @@
 def childImage(self, ci):
 self.image.child.image_file = ci
 
-class MemBus(CoherentXBar):
+class MemBus(SystemXBar):
 badaddr_responder = BadAddr()
 default = Self.badaddr_responder.pio
 
@@ -78,7 +78,7 @@
 self.tsunami = BaseTsunami()
 
 # Create the io bus to connect all device ports
-self.iobus = NoncoherentXBar()
+self.iobus = IOXBar()
 self.tsunami.attachIO(self.iobus)
 
 self.tsunami.ide.pio = self.iobus.master
@@ -143,7 +143,7 @@
 # generic system
 mdesc = SysConfig()
 self.readfile = mdesc.script()
-self.iobus = NoncoherentXBar()
+self.iobus = IOXBar()
 self.membus = MemBus()
 self.bridge = Bridge(delay='50ns')
 self.t1000 = T1000()
@@ -205,7 +205,7 @@
 mdesc = SysConfig()
 
 self.readfile = mdesc.script()
-self.iobus = NoncoherentXBar()
+self.iobus = IOXBar()
 self.membus = MemBus()
 self.membus.badaddr_responder.warn_access = "warn"
 self.bridge = Bridge(delay='50ns')
@@ -311,7 +311,7 @@
 # generic system
 mdesc = SysConfig()
 self.readfile = mdesc.script()
-self.iobus = NoncoherentXBar()
+self.iobus = IOXBar()
 self.membus = MemBus()
 self.bridge = Bridge(delay='50ns')
 self.mem_ranges = [AddrRange('1GB')]
@@ -358,7 +358,7 @@
 x86_sys.membus = MemBus()
 
 # North Bridge
-x86_sys.iobus = NoncoherentXBar()
+x86_sys.iobus = IOXBar()
 x86_sys.bridge = Bridge(delay='50ns')
 x86_sys.bridge.master = x86_sys.iobus.slave
 x86_sys.bridge.slave = x86_sys.membus.master
@@ -394,7 +394,7 @@
 
 def connectX86RubySystem(x86_sys):
 # North Bridge
-x86_sys.iobus = NoncoherentXBar()
+x86_sys.iobus = IOXBar

[gem5-dev] changeset in gem5: mem: Fix cache MSHR conflict determination

2015-03-02 Thread Andreas Hansson via gem5-dev
changeset 1072b1381560 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=1072b1381560
description:
mem: Fix cache MSHR conflict determination

This patch fixes a rather subtle issue in the sending of MSHR requests
in the cache, where the logic previously did not check for conflicts
between the MSHR queue and the write queue when requests were not
ready. The correct thing to do is to always check, since not having a
ready MSHR does not guarantee that there is no conflict.

The underlying problem seems to have slipped past due to the symmetric
timings used for the write queue and MSHR queue. However, with the
recent timing changes the bug caused regressions to fail.
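
The selection logic described above can be sketched in plain Python. This
is a simplified model under stated assumptions, not the gem5 code: the
Entry and Queue classes are hypothetical, conflicts are matched on address
only (the real findPending also takes size and the secure flag), the
in-service check on the write buffer is omitted, and the prefetch path
that follows in the real function is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    addr: int
    order: int           # allocation order, lower means older
    ready: bool = True


class Queue:
    def __init__(self, entries: List[Entry], capacity: int = 4):
        self.entries = entries
        self.capacity = capacity

    def next_ready(self) -> Optional[Entry]:
        # None means "nothing is ready yet", not "the queue is empty"
        return next((e for e in self.entries if e.ready), None)

    def find_pending(self, addr: int) -> Optional[Entry]:
        # Considers every entry, whether it is ready or not
        return next((e for e in self.entries if e.addr == addr), None)

    def is_full(self) -> bool:
        return len(self.entries) >= self.capacity


def get_next(mshrs: Queue, writes: Queue) -> Optional[Entry]:
    miss = mshrs.next_ready()
    write = writes.next_ready()
    if write and (writes.is_full() or not miss):
        # A write is chosen: check for an older conflicting miss first
        conflict = mshrs.find_pending(write.addr)
        if conflict and conflict.order < write.order:
            return conflict
        return write
    if miss:
        # A miss is chosen: always check the write queue for an older
        # conflict, even when no write was reported as ready
        conflict = writes.find_pending(miss.addr)
        if conflict and conflict.order < miss.order:
            return conflict
        return miss
    return None


older_write = Entry(addr=0x40, order=1, ready=False)
newer_miss = Entry(addr=0x40, order=2)
chosen = get_next(Queue([newer_miss]), Queue([older_write]))
```

In this example the write is not ready, so a check that only runs when
both queues report a ready entry would return the miss. With the check
always performed, the older conflicting write is returned instead.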

diffstat:

 src/mem/cache/cache_impl.hh |  48 
 1 files changed, 22 insertions(+), 26 deletions(-)

diffs (72 lines):

diff -r b1d90d88420e -r 1072b1381560 src/mem/cache/cache_impl.hh
--- a/src/mem/cache/cache_impl.hh   Mon Mar 02 04:00:52 2015 -0500
+++ b/src/mem/cache/cache_impl.hh   Mon Mar 02 04:00:54 2015 -0500
@@ -1887,39 +1887,33 @@
 MSHR *
 Cache::getNextMSHR()
 {
-// Check both MSHR queue and write buffer for potential requests
+// Check both MSHR queue and write buffer for potential requests,
+// note that null does not mean there is no request, it could
+// simply be that it is not ready
 MSHR *miss_mshr  = mshrQueue.getNextMSHR();
 MSHR *write_mshr = writeBuffer.getNextMSHR();
 
-// Now figure out which one to send... some cases are easy
-if (miss_mshr && !write_mshr) {
-return miss_mshr;
-}
-if (write_mshr && !miss_mshr) {
-return write_mshr;
-}
+// If we got a write buffer request ready, first priority is a
+// full write buffer, otherwhise we favour the miss requests
+if (write_mshr &&
+((writeBuffer.isFull() && writeBuffer.inServiceEntries == 0) ||
+ !miss_mshr)) {
+// need to search MSHR queue for conflicting earlier miss.
+MSHR *conflict_mshr =
+mshrQueue.findPending(write_mshr->addr, write_mshr->size,
+  write_mshr->isSecure);
 
-if (miss_mshr && write_mshr) {
-// We have one of each... normally we favor the miss request
-// unless the write buffer is full
-if (writeBuffer.isFull() && writeBuffer.inServiceEntries == 0) {
-// Write buffer is full, so we'd like to issue a write;
-// need to search MSHR queue for conflicting earlier miss.
-MSHR *conflict_mshr =
-mshrQueue.findPending(write_mshr->addr, write_mshr->size,
-  write_mshr->isSecure);
+if (conflict_mshr && conflict_mshr->order < write_mshr->order) {
+// Service misses in order until conflict is cleared.
+return conflict_mshr;
 
-if (conflict_mshr && conflict_mshr->order < write_mshr->order) {
-// Service misses in order until conflict is cleared.
-return conflict_mshr;
-}
-
-// No conflicts; issue write
-return write_mshr;
+// @todo Note that we ignore the ready time of the conflict here
 }
 
-// Write buffer isn't full, but need to check it for
-// conflicting earlier writeback
+// No conflicts; issue write
+return write_mshr;
+} else if (miss_mshr) {
+// need to check for conflicting earlier writeback
 MSHR *conflict_mshr =
 writeBuffer.findPending(miss_mshr->addr, miss_mshr->size,
 miss_mshr->isSecure);
@@ -1937,6 +1931,8 @@
 // have to flush writes in order?  I don't think so... not
 // for Alpha anyway.  Maybe for x86?
 return conflict_mshr;
+
+// @todo Note that we ignore the ready time of the conflict here
 }
 
 // No conflicts; issue read
___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] changeset in gem5: tests: Run regression timeout as foreground

2015-03-02 Thread Andreas Hansson via gem5-dev
changeset 9b71309d29f9 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=9b71309d29f9
description:
tests: Run regression timeout as foreground

Allow the user to send signals such as Ctrl-C to the gem5 runs. Note
that this assumes coreutils >= 8.13, which aligns with Ubuntu 12.04
and RHEL 6.
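
The version requirement can be checked roughly as follows. This is a
stand-alone sketch: version_key and timeout_supports_foreground are
hypothetical helpers written for this example (the patch itself uses the
build system's readCommand and compareVersions), and the sample version
strings are assumed output formats.

```python
import re
from typing import List


def version_key(version: str) -> List[int]:
    # "8.13" -> [8, 13], so that 8.4 compares lower than 8.13
    return [int(part) for part in re.findall(r"\d+", version)]


def timeout_supports_foreground(version_output: str,
                                minimum: str = "8.13") -> bool:
    # Take the first line of `timeout --version` and tokenize it;
    # the version number is assumed to be the last token.
    lines = version_output.splitlines()
    tokens = lines[0].split() if lines else []
    if not tokens:
        return False
    return version_key(tokens[-1]) >= version_key(minimum)


new_enough = timeout_supports_foreground("timeout (GNU coreutils) 8.21\n")
too_old = timeout_supports_foreground("timeout (GNU coreutils) 8.4\n")
missing = timeout_supports_foreground("")
```

Comparing the components as integers matters here: a plain string
comparison would rank "8.4" above "8.13".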

diffstat:

 SConstruct   |  13 +
 tests/SConscript |   2 +-
 2 files changed, 10 insertions(+), 5 deletions(-)

diffs (35 lines):

diff -r 890269a13188 -r 9b71309d29f9 SConstruct
--- a/SConstructMon Mar 02 04:00:28 2015 -0500
+++ b/SConstructMon Mar 02 04:00:29 2015 -0500
@@ -784,10 +784,15 @@
 swig_flags=Split('-c++ -python -modern -templatereduce $_CPPINCFLAGS')
 main.Append(SWIGFLAGS=swig_flags)
 
-# Check for 'timeout' from GNU coreutils.  If present, regressions
-# will be run with a time limit.
-TIMEOUT_version = readCommand(['timeout', '--version'], exception=False)
-main['TIMEOUT'] = TIMEOUT_version and TIMEOUT_version.find('timeout') == 0
+# Check for 'timeout' from GNU coreutils. If present, regressions will
+# be run with a time limit. We require version 8.13 since we rely on
+# support for the '--foreground' option.
+timeout_lines = readCommand(['timeout', '--version'],
+exception='').splitlines()
+# Get the first line and tokenize it
+timeout_version = timeout_lines[0].split() if timeout_lines else []
+main['TIMEOUT'] =  timeout_version and \
+compareVersions(timeout_version[-1], '8.13') >= 0
 
 # filter out all existing swig scanners, they mess up the dependency
 # stuff for some reason
diff -r 890269a13188 -r 9b71309d29f9 tests/SConscript
--- a/tests/SConscript  Mon Mar 02 04:00:28 2015 -0500
+++ b/tests/SConscript  Mon Mar 02 04:00:29 2015 -0500
@@ -107,7 +107,7 @@
 # The slowest regression (bzip2) requires ~2.8 hours;
 # 4 hours was chosen to be conservative.
 elif env['TIMEOUT']:
-cmd = 'timeout 4h %s' % cmd
+cmd = 'timeout --foreground 4h %s' % cmd
 
 # Create a default value for the status string, changed as needed
 # based on the status.


[gem5-dev] changeset in gem5: mem: Unify all cache DPRINTF address formatting

2015-03-02 Thread Andreas Hansson via gem5-dev
changeset d1387fcd94b8 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=d1387fcd94b8
description:
mem: Unify all cache DPRINTF address formatting

This patch changes all the DPRINTF messages in the cache to use
'%#llx' every time a packet address is printed. The inclusion of '#'
ensures '0x' is prepended, and since the address type is a uint64_t %x
really should be %llx.
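
A small Python illustration of the two points above. Python integers are
arbitrary precision and its % operator has no 'll' length modifier, so the
32-bit case is emulated with an explicit mask; the address is an arbitrary
example, and whether the original '%x' messages actually lost bits is not
stated in the commit message.

```python
addr = 0x1_8000_0040                       # needs more than 32 bits

with_prefix = "%#x" % addr                 # '#' prepends the 0x prefix
without_prefix = "%x" % addr               # bare hex digits
low_32_bits = "%#x" % (addr & 0xFFFFFFFF)  # what a 32-bit conversion keeps
```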

diffstat:

 src/mem/cache/cache_impl.hh |  73 
 src/mem/cache/mshr.cc   |   4 +-
 2 files changed, 41 insertions(+), 36 deletions(-)

diffs (truncated from 318 to 300 lines):

diff -r 1072b1381560 -r d1387fcd94b8 src/mem/cache/cache_impl.hh
--- a/src/mem/cache/cache_impl.hh   Mon Mar 02 04:00:54 2015 -0500
+++ b/src/mem/cache/cache_impl.hh   Mon Mar 02 04:00:56 2015 -0500
@@ -179,7 +179,7 @@
 // appended themselves to this cache before knowing the store
 // will fail.
 blk->status |= BlkDirty;
-DPRINTF(Cache, "%s for %s address %x size %d (write)\n", __func__,
+DPRINTF(Cache, "%s for %s addr %#llx size %d (write)\n", __func__,
 pkt->cmdString(), pkt->getAddr(), pkt->getSize());
 } else if (pkt->isRead()) {
 if (pkt->isLLSC()) {
@@ -241,7 +241,7 @@
 assert(blk != tempBlock);
 tags->invalidate(blk);
 blk->invalidate();
-DPRINTF(Cache, "%s for %s address %x size %d (invalidation)\n",
+DPRINTF(Cache, "%s for %s addr %#llx size %d (invalidation)\n",
 __func__, pkt->cmdString(), pkt->getAddr(), pkt->getSize());
 }
 }
@@ -308,7 +308,7 @@
 // sanity check
 assert(pkt->isRequest());
 
-DPRINTF(Cache, "%s for %s address %x size %d\n", __func__,
+DPRINTF(Cache, "%s for %s addr %#llx size %d\n", __func__,
 pkt->cmdString(), pkt->getAddr(), pkt->getSize());
 if (pkt->req->isUncacheable()) {
 uncacheableFlush(pkt);
@@ -323,9 +323,9 @@
 // that can modify its value.
 blk = tags->accessBlock(pkt->getAddr(), pkt->isSecure(), lat, id);
 
-DPRINTF(Cache, "%s%s %x (%s) %s\n", pkt->cmdString(),
+DPRINTF(Cache, "%s%s addr %#llx size %d (%s) %s\n", pkt->cmdString(),
 pkt->req->isInstFetch() ? " (ifetch)" : "",
-pkt->getAddr(), pkt->isSecure() ? "s" : "ns",
+pkt->getAddr(), pkt->getSize(), pkt->isSecure() ? "s" : "ns",
 blk ? "hit " + blk->print() : "miss");
 
 // Writeback handling is special case.  We can write the block into
@@ -392,7 +392,7 @@
 void
 Cache::recvTimingSnoopResp(PacketPtr pkt)
 {
-DPRINTF(Cache, "%s for %s address %x size %d\n", __func__,
+DPRINTF(Cache, "%s for %s addr %#llx size %d\n", __func__,
 pkt->cmdString(), pkt->getAddr(), pkt->getSize());
 
 assert(pkt->isResponse());
@@ -409,7 +409,7 @@
 assert(pkt->cmd == MemCmd::HardPFResp);
 // Check if it's a prefetch response and handle it. We shouldn't
 // get any other kinds of responses without FRRs.
-DPRINTF(Cache, "Got prefetch response from above for addr %#x (%s)\n",
+DPRINTF(Cache, "Got prefetch response from above for addr %#llx (%s)\n",
 pkt->getAddr(), pkt->isSecure() ? "s" : "ns");
 recvTimingResp(pkt);
 return;
@@ -470,7 +470,7 @@
 if (pkt->memInhibitAsserted()) {
 // a cache above us (but not where the packet came from) is
 // responding to the request
-DPRINTF(Cache, "mem inhibited on 0x%x (%s): not responding\n",
+DPRINTF(Cache, "mem inhibited on addr %#llx (%s): not responding\n",
 pkt->getAddr(), pkt->isSecure() ? "s" : "ns");
 assert(!pkt->req->isUncacheable());
 
@@ -678,6 +678,10 @@
 
 // Coalesce unless it was a software prefetch (see above).
 if (pkt) {
+DPRINTF(Cache, "%s coalescing MSHR for %s addr %#llx size %d\n",
+__func__, pkt->cmdString(), pkt->getAddr(),
+pkt->getSize());
+
 assert(pkt->req->masterId() < system->maxMasters());
 mshr_hits[pkt->cmdToIndex()][pkt->req->masterId()]++;
 if (mshr->threadNum != 0/*pkt->req->threadId()*/) {
@@ -833,7 +837,7 @@
 PacketPtr pkt = new Packet(cpu_pkt->req, cmd, blkSize);
 
 pkt->allocate();
-DPRINTF(Cache, "%s created %s address %x size %d\n",
+DPRINTF(Cache, "%s created %s addr %#llx size %d\n",
 __func__, pkt->cmdString(), pkt->getAddr(), pkt->getSize());
 return pkt;
 }
@@ -863,19 +867,19 @@
 if (blk && blk->isValid()) {
 tags->invalidate(blk);
 blk->invalidate();
-DPRINTF(Cache, "rcvd mem-inhibited %s on 0x%x (%s):"
+DPRINTF(Cache, "rcvd mem-inhibited %s on %#llx (%s):"
 " invalidating\n",
 pkt->cmdString(), pkt->getAddr(),
   

[gem5-dev] changeset in gem5: stats: Update stats to reflect cache and inte...

2015-03-02 Thread Andreas Hansson via gem5-dev
changeset 8a20e2a1562d in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=8a20e2a1562d
description:
stats: Update stats to reflect cache and interconnect changes

This is a bulk update of stats to match the changes to cache timing,
interconnect timing, and a few minor changes to the o3 CPU.

diffstat:

 tests/long/fs/10.linux-boot/ref/alpha/linux/tsunami-minor/stats.txt | 1493 +-
 tests/long/fs/10.linux-boot/ref/alpha/linux/tsunami-o3-dual/stats.txt | 3848 +++---
 tests/long/fs/10.linux-boot/ref/alpha/linux/tsunami-o3/stats.txt | 2079 +-
 tests/long/fs/10.linux-boot/ref/alpha/linux/tsunami-switcheroo-full/stats.txt | 3027 ++--
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-minor-dual/stats.txt | 4418 +++---
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-minor/stats.txt | 1832 +-
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-o3-checker/stats.txt | 2532 ++--
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-o3-dual/stats.txt | 5864
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-o3/stats.txt | 2460 +-
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-full/stats.txt | 3756 ++---
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-o3/stats.txt | 4062 +++---
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-timing/stats.txt | 2795 ++--
 tests/long/fs/10.linux-boot/ref/arm/linux/realview64-minor-dual/stats.txt | 4969 +++---
 tests/long/fs/10.linux-boot/ref/arm/linux/realview64-minor/stats.txt | 2129 +-
 tests/long/fs/10.linux-boot/ref/arm/linux/realview64-o3-checker/stats.txt | 2818 ++--
 tests/long/fs/10.linux-boot/ref/arm/linux/realview64-o3-dual/stats.txt | 6279 -
 tests/long/fs/10.linux-boot/ref/arm/linux/realview64-o3/stats.txt | 2704 ++--
 tests/long/fs/10.linux-boot/ref/arm/linux/realview64-switcheroo-full/stats.txt | 4291 +++---
 tests/long/fs/10.linux-boot/ref/arm/linux/realview64-switcheroo-o3/stats.txt | 4414 +++---
 tests/long/fs/10.linux-boot/ref/arm/linux/realview64-switcheroo-timing/stats.txt | 3220 ++--
 tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/stats.txt | 2555 ++--
 tests/long/fs/10.linux-boot/ref/x86/linux/pc-switcheroo-full/stats.txt | 3105 ++--
 tests/long/se/10.mcf/ref/arm/linux/minor-timing/stats.txt |  584 +-
 tests/long/se/10.mcf/ref/arm/linux/o3-timing/stats.txt | 1595 +-
 tests/long/se/10.mcf/ref/arm/linux/simple-atomic/stats.txt |   22 +-
 tests/long/se/10.mcf/ref/arm/linux/simple-timing/stats.txt |  228 +-
 tests/long/se/10.mcf/ref/sparc/linux/simple-timing/stats.txt |  572 +-
 tests/long/se/10.mcf/ref/x86/linux/o3-timing/stats.txt | 1439 +-
 tests/long/se/10.mcf/ref/x86/linux/simple-timing/stats.txt |  572 +-
 tests/long/se/20.parser/ref/alpha/tru64/minor-timing/stats.txt | 1034 +-
 tests/long/se/20.parser/ref/arm/linux/minor-timing/stats.txt | 1050 +-
 tests/long/se/20.parser/ref/arm/linux/o3-timing/stats.txt | 1628 +-
 tests/long/se/20.parser/ref/arm/linux/simple-atomic/stats.txt |   22 +-
 tests/long/se/20.parser/ref/arm/linux/simple-timing/stats.txt |  282 +-
 tests/long/se/20.parser/ref/x86/linux/o3-timing/stats.txt | 1657 +-
 tests/long/se/20.parser/ref/x86/linux/simple-timing/stats.txt |  452 +-
 tests/long/se/30.eon/ref/alpha/tru64/minor-timing/stats.txt |  500 +-
 tests/long/se/30.eon/ref/alpha/tru64/o3-timing/stats.txt | 1338 +-
 tests/long/se/30.eon/ref/alpha/tru64/simple-timing/stats.txt |  564 +-
 tests/long/se/30.eon/ref/arm/linux/minor-timing/stats.txt |  708 +-
 tests/long/se/30.eon/ref/arm/linux/o3-timing/stats.txt | 1464 +-
 tests/long/se/30.eon/ref/arm/linux/simple-atomic/stats.txt |   22 +-
 tests/long/se/30.eon/ref/arm/linux/simple-timing/stats.txt |  250 +-
 tests/long/se/40.perlbmk/ref/alpha/tru64/minor-timing/stats.txt |  730 +-
 tests/long/se/40.perlbmk/ref/alpha/tru64/o3-timing/stats.txt | 1404 +-
 tests/long/se/40.perlbmk/ref/alpha/tru64/simple-timing/stats.txt |  430 +-
 tests/long/se/40.perlbmk/ref/arm/linux/minor-timing/stats.txt |  924 +-
 tests/long/se/40.perlbmk/ref/arm/linux/o3-timing/stats.txt 
  

[gem5-dev] changeset in gem5: mem: Tidy up the cache debug messages

2015-03-02 Thread Andreas Hansson via gem5-dev
changeset 9ba5e70964a4 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=9ba5e70964a4
description:
mem: Tidy up the cache debug messages

Avoid redundant inclusion of the name in the DPRINTF string.

diffstat:

 src/mem/cache/base.cc   |  10 +-
 src/mem/cache/base.hh   |   3 ++-
 src/mem/cache/cache_impl.hh |   7 +++
 3 files changed, 10 insertions(+), 10 deletions(-)

diffs (80 lines):

diff -r eddb533708cb -r 9ba5e70964a4 src/mem/cache/base.cc
--- a/src/mem/cache/base.cc Mon Mar 02 04:00:35 2015 -0500
+++ b/src/mem/cache/base.cc Mon Mar 02 04:00:37 2015 -0500
@@ -92,13 +92,13 @@
 BaseCache::CacheSlavePort::setBlocked()
 {
 assert(!blocked);
-DPRINTF(CachePort, "Cache port %s blocking new requests\n", name());
+DPRINTF(CachePort, "Port is blocking new requests\n");
 blocked = true;
 // if we already scheduled a retry in this cycle, but it has not yet
 // happened, cancel it
 if (sendRetryEvent.scheduled()) {
 owner.deschedule(sendRetryEvent);
-DPRINTF(CachePort, "Cache port %s deschedule retry\n", name());
+DPRINTF(CachePort, "Port descheduled retry\n");
 mustSendRetry = true;
 }
 }
@@ -107,10 +107,10 @@
 BaseCache::CacheSlavePort::clearBlocked()
 {
 assert(blocked);
-DPRINTF(CachePort, "Cache port %s accepting new requests\n", name());
+DPRINTF(CachePort, "Port is accepting new requests\n");
 blocked = false;
 if (mustSendRetry) {
-// @TODO: need to find a better time (next bus cycle?)
+// @TODO: need to find a better time (next cycle?)
 owner.schedule(sendRetryEvent, curTick() + 1);
 }
 }
@@ -118,7 +118,7 @@
 void
 BaseCache::CacheSlavePort::processSendRetry()
 {
-DPRINTF(CachePort, "Cache port %s sending retry\n", name());
+DPRINTF(CachePort, "Port is sending retry\n");
 
 // reset the flag and call retry
 mustSendRetry = false;
diff -r eddb533708cb -r 9ba5e70964a4 src/mem/cache/base.hh
--- a/src/mem/cache/base.hh Mon Mar 02 04:00:35 2015 -0500
+++ b/src/mem/cache/base.hh Mon Mar 02 04:00:37 2015 -0500
@@ -129,7 +129,8 @@
  */
 void requestBus(RequestCause cause, Tick time)
 {
-DPRINTF(CachePort, "Asserting bus request for cause %d\n", cause);
+DPRINTF(CachePort, "Scheduling request at %llu due to %d\n",
+time, cause);
 reqQueue.schedSendEvent(time);
 }
 
diff -r eddb533708cb -r 9ba5e70964a4 src/mem/cache/cache_impl.hh
--- a/src/mem/cache/cache_impl.hh   Mon Mar 02 04:00:35 2015 -0500
+++ b/src/mem/cache/cache_impl.hh   Mon Mar 02 04:00:37 2015 -0500
@@ -261,8 +261,7 @@
 markInServiceInternal(mshr, pending_dirty_resp);
 #if 0
 if (mshr->originalCmd == MemCmd::HardPFReq) {
-DPRINTF(HWPrefetch, "%s:Marking a HW_PF in service\n",
-name());
+DPRINTF(HWPrefetch, "Marking a HW_PF in service\n");
 //Also clear pending if need be
 if (!prefetcher->havePending())
 {
@@ -324,10 +323,10 @@
 // that can modify its value.
 blk = tags->accessBlock(pkt->getAddr(), pkt->isSecure(), lat, id);
 
-DPRINTF(Cache, "%s%s %x (%s) %s %s\n", pkt->cmdString(),
+DPRINTF(Cache, "%s%s %x (%s) %s\n", pkt->cmdString(),
 pkt->req->isInstFetch() ? " (ifetch)" : "",
 pkt->getAddr(), pkt->isSecure() ? "s" : "ns",
-blk ? "hit" : "miss", blk ? blk->print() : "");
+blk ? "hit " + blk->print() : "miss");
 
 // Writeback handling is special case.  We can write the block into
 // the cache without having a writeable copy (or any copy at all).


[gem5-dev] changeset in gem5: mem: Add byte mask to Packet::checkFunctional

2015-03-02 Thread Andreas Hansson via gem5-dev
changeset b1d90d88420e in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=b1d90d88420e
description:
mem: Add byte mask to Packet::checkFunctional

This patch changes the valid-bytes start/end to a proper byte
mask. With the changes in timing introduced in previous patches there
are more packets waiting in queues, and there are regressions using
the checker CPU failing due to non-contiguous read data being found in
the various cache queues.

This patch also adds some more comments explaining what is going on,
and adds the fourth and missing case to Packet::checkFunctional.
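
Why a start/end pair is not enough can be seen in a small Python sketch of
the byte-mask idea. This is an illustration under stated assumptions, not
the gem5 implementation: mark_valid is a hypothetical helper, address
ranges are treated as half-open, and the overlap is computed with max/min
rather than by enumerating the four cases from the patch.

```python
from typing import List


def mark_valid(mask: List[bool], func_start: int,
               val_start: int, val_size: int) -> bool:
    """Mark the bytes of a functional read that a queued packet supplies.

    mask has one flag per byte of the read starting at func_start.
    Returns True once every byte of the read has been found.
    """
    func_end = func_start + len(mask)   # half-open range
    val_end = val_start + val_size
    lo = max(func_start, val_start)
    hi = min(func_end, val_end)
    for addr in range(lo, hi):          # empty when there is no overlap
        mask[addr - func_start] = True
    return all(mask)


mask = [False] * 8                                # 8-byte read at 0x100
first = mark_valid(mask, 0x100, 0x0FE, 4)         # supplies bytes 0..1
second = mark_valid(mask, 0x100, 0x106, 4)        # supplies bytes 6..7
holes = [i for i, ok in enumerate(mask) if not ok]
third = mark_valid(mask, 0x100, 0x102, 4)         # fills bytes 2..5
```

After the first two packets the valid bytes are {0, 1, 6, 7}, a set that
a single start/end range cannot describe, while the mask records it
exactly.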

diffstat:

 src/mem/packet.cc |  109 -
 src/mem/packet.hh |   13 +
 2 files changed, 46 insertions(+), 76 deletions(-)

diffs (195 lines):

diff -r 886d2458e0d6 -r b1d90d88420e src/mem/packet.cc
--- a/src/mem/packet.cc Mon Mar 02 04:00:49 2015 -0500
+++ b/src/mem/packet.cc Mon Mar 02 04:00:52 2015 -0500
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2011-2014 ARM Limited
+ * Copyright (c) 2011-2015 ARM Limited
  * All rights reserved
  *
  * The license below extends only to copyright in the software and shall
@@ -207,6 +207,8 @@
 if (isRead()) {
 if (func_start >= val_start && func_end <= val_end) {
 memcpy(getPtr(), _data + offset, getSize());
+if (bytesValid.empty())
+bytesValid.resize(getSize(), true);
 // complete overlap, and as the current packet is a read
 // we are done
 return true;
@@ -218,91 +220,64 @@
 
 // calculate offsets and copy sizes for the two byte arrays
 if (val_start < func_start && val_end <= func_end) {
+// the one we are checking against starts before and
+// ends before or the same
 val_offset = func_start - val_start;
 func_offset = 0;
 overlap_size = val_end - func_start;
 } else if (val_start >= func_start && val_end > func_end) {
+// the one we are checking against starts after or the
+// same, and ends after
 val_offset = 0;
 func_offset = val_start - func_start;
 overlap_size = func_end - val_start;
 } else if (val_start >= func_start && val_end <= func_end) {
+// the one we are checking against is completely
+// subsumed in the current packet, possibly starting
+// and ending at the same address
 val_offset = 0;
 func_offset = val_start - func_start;
 overlap_size = size;
+} else if (val_start < func_start && val_end > func_end) {
+// the current packet is completely subsumed in the
+// one we are checking against
+val_offset = func_start - val_start;
+func_offset = 0;
+overlap_size = func_end - func_start;
 } else {
-panic("BUG: Missed a case for a partial functional request");
+panic("Missed a case for checkFunctional with "
+  " %s 0x%x size %d, against 0x%x size %d\n",
+  cmdString(), getAddr(), getSize(), addr, size);
 }
 
-// Figure out how much of the partial overlap should be copied
-// into the packet and not overwrite previously found bytes.
-if (bytesValidStart == 0 && bytesValidEnd == 0) {
-// No bytes have been copied yet, just set indices
-// to found range
-bytesValidStart = func_offset;
-bytesValidEnd = func_offset + overlap_size;
-} else {
-// Some bytes have already been copied. Use bytesValid
-// indices and offset values to figure out how much data
-// to copy and where to copy it to.
-
-// Indice overlap conditions to check
-int a = func_offset - bytesValidStart;
-int b = (func_offset + overlap_size) - bytesValidEnd;
-int c = func_offset - bytesValidEnd;
-int d = (func_offset + overlap_size) - bytesValidStart;
-
-if (a >= 0 && b <= 0) {
-// bytes already in pkt data array are superset of
-// found bytes, will not copy any bytes
-overlap_size = 0;
-} else if (a < 0 && d >= 0 && b <= 0) {
-// found bytes will move bytesValidStart towards 0
-overlap_size = bytesValidStart - func_offset;
-bytesValidStart = func_offset;
-} else if (b > 0 && c <= 0 && a >= 0) {
-// found bytes will move bytesValidEnd
-// towards end of pkt data array
-   

[gem5-dev] changeset in gem5: mem: Split port retry for all different packe...

2015-03-02 Thread Andreas Hansson via gem5-dev
changeset eddb533708cb in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=eddb533708cb
description:
mem: Split port retry for all different packet classes

This patch fixes a long-standing issue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.

The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.

The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with their own
flow control, but the same physical MasterPort. These changes fix
the previously seen deadlocks.
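
The effect of one retry per message class can be modelled in a few lines
of Python. This is a toy model under stated assumptions, not the gem5 port
interface: SplitRetryPort, MsgClass and the peer_accepts callback are
hypothetical names invented for this sketch.

```python
from collections import deque
from enum import Enum


class MsgClass(Enum):
    REQ = "request"
    SNOOP_RESP = "snoop response"


class SplitRetryPort:
    """One queue and one retry flag per message class."""

    def __init__(self):
        self.queues = {c: deque() for c in MsgClass}
        self.waiting_for_retry = {c: False for c in MsgClass}

    def send(self, msg_class, packet, peer_accepts):
        # Only packets of the same class can be held up by a pending retry
        if self.waiting_for_retry[msg_class] or not peer_accepts(msg_class):
            self.queues[msg_class].append(packet)
            self.waiting_for_retry[msg_class] = True
            return False
        return True

    def recv_retry(self, msg_class, peer_accepts):
        # The peer signals that this class may try again
        self.waiting_for_retry[msg_class] = False
        sent = []
        queue = self.queues[msg_class]
        while queue and peer_accepts(msg_class):
            sent.append(queue.popleft())
        if queue:
            self.waiting_for_retry[msg_class] = True
        return sent


def busy_for_requests(msg_class):
    # Peer that currently refuses requests but accepts snoop responses
    return msg_class is not MsgClass.REQ


port = SplitRetryPort()
req_sent = port.send(MsgClass.REQ, "ReadReq", busy_for_requests)
resp_sent = port.send(MsgClass.SNOOP_RESP, "SnoopResp", busy_for_requests)
retried = port.recv_retry(MsgClass.REQ, lambda msg_class: True)
```

The request waits for its retry while the snoop response goes through,
which is the situation a single shared retry could not handle.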

diffstat:

 src/arch/x86/pagetable_walker.cc   |6 +-
 src/arch/x86/pagetable_walker.hh   |4 +-
 src/cpu/kvm/base.hh|4 +-
 src/cpu/minor/fetch1.cc|2 +-
 src/cpu/minor/fetch1.hh|4 +-
 src/cpu/minor/lsq.cc   |2 +-
 src/cpu/minor/lsq.hh   |4 +-
 src/cpu/o3/cpu.cc  |8 +-
 src/cpu/o3/cpu.hh  |4 +-
 src/cpu/o3/fetch.hh|2 +-
 src/cpu/o3/fetch_impl.hh   |2 +-
 src/cpu/o3/lsq.hh  |2 +-
 src/cpu/o3/lsq_impl.hh |2 +-
 src/cpu/simple/atomic.hh   |2 +-
 src/cpu/simple/timing.cc   |8 +-
 src/cpu/simple/timing.hh   |8 +-
 src/cpu/testers/directedtest/RubyDirectedTester.hh |2 +-
 src/cpu/testers/memtest/memtest.cc |2 +-
 src/cpu/testers/memtest/memtest.hh |2 +-
 src/cpu/testers/networktest/networktest.cc |2 +-
 src/cpu/testers/networktest/networktest.hh |2 +-
 src/cpu/testers/rubytest/RubyTester.hh |2 +-
 src/cpu/testers/traffic_gen/traffic_gen.cc |2 +-
 src/cpu/testers/traffic_gen/traffic_gen.hh |4 +-
 src/dev/dma_device.cc  |2 +-
 src/dev/dma_device.hh  |2 +-
 src/mem/addr_mapper.cc |8 +-
 src/mem/addr_mapper.hh |   12 +-
 src/mem/bridge.cc  |8 +-
 src/mem/bridge.hh  |4 +-
 src/mem/cache/base.cc  |2 +-
 src/mem/cache/base.hh  |   12 +-
 src/mem/cache/cache.hh |   15 +-
 src/mem/cache/cache_impl.hh|  120 +--
 src/mem/coherent_xbar.cc   |   12 +-
 src/mem/coherent_xbar.hh   |   20 +-
 src/mem/comm_monitor.cc|8 +-
 src/mem/comm_monitor.hh|   12 +-
 src/mem/dram_ctrl.cc   |4 +-
 src/mem/dram_ctrl.hh   |2 +-
 src/mem/dramsim2.cc|8 +-
 src/mem/dramsim2.hh|4 +-
 src/mem/external_slave.cc  |6 +-
 src/mem/mem_checker_monitor.cc |8 +-
 src/mem/mem_checker_monitor.hh |   12 +-
 src/mem/mport.hh   |6 +-
 src/mem/noncoherent_xbar.cc|2 +-
 src/mem/noncoherent_xbar.hh|   10 +-
 src/mem/packet_queue.cc|  153 +++
 src/mem/packet_queue.hh|  160 +++-
 src/mem/port.cc|   16 +-
 src/mem/port.hh|   48 --
 src/mem/qport.hh   |   65 +---
 src/mem/ruby/slicc_interface/AbstractController.cc |6 +-
 src/mem/ruby/slicc_interface/AbstractController.hh |5 +-
 src/mem/ruby/structures/RubyMemoryControl.hh   |2 +-
 src/mem/ruby/system/DMASequencer.cc|2 +-
 src/mem/ruby/system/DMASequencer.hh|2 +-
 src/mem/ruby/system/RubyPort.cc|8 +-
 src/mem/ruby/system/RubyPort.hh

[gem5-dev] changeset in gem5: arm: Share a port for the two table walker ob...

2015-03-02 Thread Andreas Hansson via gem5-dev
changeset 4f8c1bd6fdb8 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=4f8c1bd6fdb8
description:
arm: Share a port for the two table walker objects

This patch changes how the MMU and table walkers are created such that
a single port is used to connect the MMU and the TLBs to the memory
system. Previously two ports were needed as there are two table walker
objects (stage one and stage two), and they both had a port. Now the
port itself is moved to the Stage2MMU, and each TableWalker is simply
using the port from the parent.

By using the same port we also remove the need for having an
additional crossbar joining the two ports before the walker cache or
the L2. This simplifies the creation of the CPU cache topology in
BaseCPU.py considerably. Moreover, for naming and symmetry reasons,
the TLB walker port is connected through the stage-one table walker
thus making the naming identical to x86. Along the same line, we use
the stage-one table walker to generate the master id that is used by
all TLB-related requests.

diffstat:

 src/arch/arm/ArmTLB.py   |  22 ++---
 src/arch/arm/stage2_mmu.cc   |  43 +++---
 src/arch/arm/stage2_mmu.hh   |  58 ---
 src/arch/arm/table_walker.cc |  72 +++
 src/arch/arm/table_walker.hh |  63 +++---
 src/arch/arm/tlb.cc  |  12 +-
 src/arch/arm/tlb.hh  |  13 +--
 src/cpu/BaseCPU.py   |  26 +--
 8 files changed, 161 insertions(+), 148 deletions(-)

diffs (truncated from 674 to 300 lines):

diff -r 4408a83f7881 -r 4f8c1bd6fdb8 src/arch/arm/ArmTLB.py
--- a/src/arch/arm/ArmTLB.pyMon Mar 02 04:00:41 2015 -0500
+++ b/src/arch/arm/ArmTLB.pyMon Mar 02 04:00:42 2015 -0500
@@ -1,6 +1,6 @@
 # -*- mode:python -*-
 
-# Copyright (c) 2009, 2013 ARM Limited
+# Copyright (c) 2009, 2013, 2015 ARM Limited
 # All rights reserved.
 #
 # The license below extends only to copyright in the software and shall
@@ -48,11 +48,17 @@
 cxx_class = 'ArmISA::TableWalker'
 cxx_header = "arch/arm/table_walker.hh"
 is_stage2 =  Param.Bool(False, "Is this object for stage 2 translation?")
-port = MasterPort("Port for TableWalker to do walk the translation with")
-sys = Param.System(Parent.any, "system object parameter")
 num_squash_per_cycle = Param.Unsigned(2,
 "Number of outstanding walks that can be squashed per cycle")
 
+# The port to the memory system. This port is ultimately belonging
+# to the Stage2MMU, and shared by the two table walkers, but we
+# access it through the ITB and DTB walked objects in the CPU for
+# symmetry with the other ISAs.
+port = MasterPort("Port used by the two table walkers")
+
+sys = Param.System(Parent.any, "system object parameter")
+
 class ArmTLB(SimObject):
 type = 'ArmTLB'
 cxx_class = 'ArmISA::TLB'
@@ -77,10 +83,16 @@
 tlb = Param.ArmTLB("Stage 1 TLB")
 stage2_tlb = Param.ArmTLB("Stage 2 TLB")
 
+sys = Param.System(Parent.any, "system object parameter")
+
 class ArmStage2IMMU(ArmStage2MMU):
+# We rely on the itb being a parameter of the CPU, and get the
+# appropriate object that way
 tlb = Parent.itb
-stage2_tlb = ArmStage2TLB(walker = ArmStage2TableWalker())
+stage2_tlb = ArmStage2TLB()
 
 class ArmStage2DMMU(ArmStage2MMU):
+# We rely on the dtb being a parameter of the CPU, and get the
+# appropriate object that way
 tlb = Parent.dtb
-stage2_tlb = ArmStage2TLB(walker = ArmStage2TableWalker())
+stage2_tlb = ArmStage2TLB()
diff -r 4408a83f7881 -r 4f8c1bd6fdb8 src/arch/arm/stage2_mmu.cc
--- a/src/arch/arm/stage2_mmu.ccMon Mar 02 04:00:41 2015 -0500
+++ b/src/arch/arm/stage2_mmu.ccMon Mar 02 04:00:42 2015 -0500
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2012-2013 ARM Limited
+ * Copyright (c) 2012-2013, 2015 ARM Limited
  * All rights reserved
  *
  * The license below extends only to copyright in the software and shall
@@ -37,29 +37,31 @@
  * Authors: Thomas Grocutt
  */
 
+#include "arch/arm/stage2_mmu.hh"
 #include "arch/arm/faults.hh"
-#include "arch/arm/stage2_mmu.hh"
 #include "arch/arm/system.hh"
+#include "arch/arm/table_walker.hh"
 #include "arch/arm/tlb.hh"
 #include "cpu/base.hh"
 #include "cpu/thread_context.hh"
-#include "debug/Checkpoint.hh"
-#include "debug/TLB.hh"
-#include "debug/TLBVerbose.hh"
 
 using namespace ArmISA;
 
 Stage2MMU::Stage2MMU(const Params *p)
-: SimObject(p), _stage1Tlb(p->tlb), _stage2Tlb(p->stage2_tlb)
+: SimObject(p), _stage1Tlb(p->tlb), _stage2Tlb(p->stage2_tlb),
+  port(_stage1Tlb->getTableWalker(), p->sys),
+  masterId(p->sys->getMasterId(_stage1Tlb->getTableWalker()->name()))
 {
-stage1Tlb()->setMMU(this);
-stage2Tlb()->setMMU(this);
+// we use the stage-on

Re: [gem5-dev] Review Request 2677: cpu: o3: commit: mark pipeline delay variable as consts

2015-03-01 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2677/#review5922
---

Ship it!

- Andreas Hansson


On Feb. 28, 2015, 10:41 p.m., Nilay Vaish wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2677/
> ---
> 
> (Updated Feb. 28, 2015, 10:41 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10713:175c1dba1179
> ---
> cpu: o3: commit: mark pipeline delay variable as consts
> 
> 
> Diffs
> -
> 
>   src/cpu/o3/commit.hh 4206946d60fe 
> 
> Diff: http://reviews.gem5.org/r/2677/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Nilay Vaish
> 
>

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


Re: [gem5-dev] Review Request 2676: cpu: o3: remove unused stat variables.

2015-03-01 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2676/#review5921
---

Ship it!

- Andreas Hansson


On Feb. 28, 2015, 10:40 p.m., Nilay Vaish wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2676/
> ---
> 
> (Updated Feb. 28, 2015, 10:40 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10712:bb6de70c386f
> ---
> cpu: o3: remove unused stat variables.
> 
> 
> Diffs
> -
> 
>   src/cpu/o3/commit.hh 4206946d60fe 
>   src/cpu/o3/commit_impl.hh 4206946d60fe 
> 
> Diff: http://reviews.gem5.org/r/2676/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Nilay Vaish
> 
>



Re: [gem5-dev] Review Request 2675: cpu: o3: combine if with same condition

2015-03-01 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2675/#review5920
---

Ship it!

- Andreas Hansson


On Feb. 28, 2015, 10:39 p.m., Nilay Vaish wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2675/
> ---
> 
> (Updated Feb. 28, 2015, 10:39 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10711:fde343df1e97
> ---
> cpu: o3: combine if with same condition
> 
> 
> Diffs
> -
> 
>   src/cpu/o3/commit_impl.hh 4206946d60fe 
> 
> Diff: http://reviews.gem5.org/r/2675/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Nilay Vaish
> 
>



Re: [gem5-dev] Review Request 2674: cpu: o3: remove member variable squashCounter

2015-03-01 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2674/#review5919
---

Ship it!

- Andreas Hansson


On Feb. 28, 2015, 10:37 p.m., Nilay Vaish wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2674/
> ---
> 
> (Updated Feb. 28, 2015, 10:37 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10709:e490e8f78f64
> ---
> cpu: o3: remove member variable squashCounter
> The variable is used in only one place and a whole new function
> setNextStatus() has been defined just to compute the value of the variable.
> Instead of calling the function, the value is now computed in the loop that
> preceded the function call.
> 
> 
> Diffs
> -
> 
>   src/cpu/o3/commit.hh 4206946d60fe 
>   src/cpu/o3/commit_impl.hh 4206946d60fe 
> 
> Diff: http://reviews.gem5.org/r/2674/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Nilay Vaish
> 
>



Re: [gem5-dev] Review Request 2673: cpu: o3: remove unused function annotateMemoryUnits()

2015-03-01 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2673/#review5918
---

Ship it!

- Andreas Hansson


On Feb. 28, 2015, 10:34 p.m., Nilay Vaish wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2673/
> ---
> 
> (Updated Feb. 28, 2015, 10:34 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10708:a440c1e9ccfb
> ---
> cpu: o3: remove unused function annotateMemoryUnits()
> 
> 
> Diffs
> -
> 
>   src/cpu/o3/fu_pool.hh 4206946d60fe 
>   src/cpu/o3/fu_pool.cc 4206946d60fe 
> 
> Diff: http://reviews.gem5.org/r/2673/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Nilay Vaish
> 
>



Re: [gem5-dev] how can I add cache in tgen-simple-mem

2015-02-26 Thread Andreas Hansson via gem5-dev
As the error message suggests, you seem to have a packet that spans a cache
line boundary. Have you checked the address and size of each request to make
sure it fits within a single cache line?
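
The check can be sketched in a few lines of Python. This is not gem5 code:
it assumes each trace line has the form command,address,size,tick (the layout
of the tgen-simple-mem.trc excerpt posted in this thread) and a 64-byte cache
line, and the helper names are hypothetical. It mirrors the failed assertion,
pkt->getOffset(blkSize) + pkt->getSize() <= blkSize.

```python
def crosses_cache_line(addr, size, blk_size=64):
    """Return True if [addr, addr + size) spills past the end of its cache line."""
    offset = addr % blk_size  # plays the role of pkt->getOffset(blkSize)
    return offset + size > blk_size


def find_offenders(trace_lines, blk_size=64):
    """Collect the trace entries that would trip the assertion."""
    offenders = []
    for line in trace_lines:
        line = line.strip()
        if not line:
            continue
        # Assumed field order: command, address, size, tick.
        cmd, addr, size, tick = line.split(",")
        if crosses_cache_line(int(addr), int(size), blk_size):
            offenders.append((cmd, int(addr), int(size), int(tick)))
    return offenders
```

Under these assumptions, a 64-byte access at address 196 starts 4 bytes into
a line and therefore crosses into the next one, while the same access at
address 128 does not.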

Andreas

On 26/02/2015 21:15, "Sensen Hu - EWI via gem5-dev" 
wrote:

>Hi Andreas,
>I've tried setting maxtick to more than 16000, but the command window then
>shows Aborted (core dumped).
>
>And the simerr file shows:
>
>gem5.opt: build/ARM/mem/cache/cache_impl.hh:164: void
>Cache<TagStore>::satisfyCpuSideRequest(PacketPtr,
>Cache<TagStore>::BlkType*, bool, bool) [with TagStore = LRU; PacketPtr =
>Packet*; Cache<TagStore>::BlkType = CacheBlk]: Assertion
>`pkt->getOffset(blkSize) + pkt->getSize() <= blkSize' failed.
>Program aborted at tick 57000
>
>I adjusted maxtick and the duration (ticks) in tgen-simple-mem.cfg, but I
>got the same error.
>So I don't know how to solve the error.
>
>


-- IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium.  Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered 
in England & Wales, Company No:  2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, 
Registered in England & Wales, Company No:  2548782


Re: [gem5-dev] how can I add cache in tgen-simple-mem

2015-02-26 Thread Andreas Hansson via gem5-dev
Could it not be as simple as back pressure?

The traffic generator can only send requests as fast as the port (the crossbar
in this case) can accept them.

I suspect that if you set the max tick to some larger value, all is fine.

Andreas

On 26/02/2015 08:04, "Sensen Hu - EWI via gem5-dev" 
wrote:

>Thanks, Erfan.
>I see. But my TraceGen still can't traverse the whole tgen-simple-mem.trc
>file; it only executes the first 4 instructions and then stops. The
>simout file shows:
>Global frequency set at 1 ticks per second
>info: Entering event queue @ 0.  Starting simulation...
>Exiting @ tick 16000 because simulate() limit reached
>
>There are 12 instructions in my tgen-simple-mem.trc, and the last
>instruction's tick is 15000.
>I decoded the monitor.ptrc.trc file and found that the TraceGen only executes 4
>instructions.
>
>I'm wondering what my error is.
>
>
>




Re: [gem5-dev] Adding a supplementary stats file in gem5

2015-02-24 Thread Andreas Hansson via gem5-dev
Hi all,

The patches are all on RB for the C++/Python stats revamp, including json
and database output. The problem is speed. Due to the C++/Python swig
interface the impact on performance is quite substantial (even ignoring
the actual output). We have been playing around with ‘swig -builtin’ but
have not managed to get it working. Perhaps someone would like to give it a go?

Andreas

On 24/02/2015 22:25, "Jason Power via gem5-dev"  wrote:

>Hi Sooraj, everyone,
>
>I agree with the sentiment here that the current stats system isn't as
>useful as it could be. In a lot of cases there is either way too much
>information, or way too little information. And all of the information
>could be easier to parse!
>
>On a related note, whatever happened to all of the changes to export the
>statistics to python/some database? Perhaps resurrecting that discussion
>could provide the more user-tunable stats output that you're wanting,
>Sooraj.
>
>Cheers,
>Jason
>
>On Wed Feb 18 2015 at 11:27:39 AM Sooraj Puthoor via gem5-dev <
>gem5-dev@gem5.org> wrote:
>
>> Hi all,
>>
>> I would like to get everyone's thoughts on adding an optional supplementary
>> stats file to gem5 in addition to the stats.txt file which we have today.
>> The idea is to have a supplementary stats file which will be generated
>> only if we explicitly pass a command line option to generate it.
>>
>> The rationale behind having such a supplementary stats file is to push the
>> not so commonly used stats out of the stats.txt file. Thus, the stats.txt
>> file can be kept simple/clean/small with only the important (or most
>> commonly used) statistics. In addition to making it more readable, we are
>> also avoiding the size explosion problem of the stats.txt file. If a user
>> needs to look into stats which are not available in the stats.txt, she can
>> always pass the command line option to generate the supplementary stats
>> file. The supplementary stats file will have all the remaining statistics
>> which are available in gem5 but not printed in the stats.txt file.
>>
>> Please let me know your thoughts on this.
>>
>> Thanks
>> Sooraj




[gem5-dev] Changes on the horizon

2015-02-24 Thread Andreas Hansson via gem5-dev
Hi all,

As you may have seen, we have posted a number of patches that improve the
fidelity of the classic interconnect models, along with quite a few fixes for
the caches and CPUs. Unsurprisingly, these patches shake up the regression stats
quite a bit, and I encourage anyone interested in memory-system performance to
have a look and provide feedback. I hope to push these patches next week.

Thanks,

Andreas

# Port flow control
http://reviews.gem5.org/r/2646/ (already got Ship It)

# ARM CPU timings and TLB organisation
http://reviews.gem5.org/r/2647/
http://reviews.gem5.org/r/2658/

# Device fixes
http://reviews.gem5.org/r/2659/

# Crossbar timings
http://reviews.gem5.org/r/2660/
http://reviews.gem5.org/r/2661/
http://reviews.gem5.org/r/2662/

# Cache fixes
http://reviews.gem5.org/r/2663/
http://reviews.gem5.org/r/2664/
http://reviews.gem5.org/r/2670/





Re: [gem5-dev] how can I add cache in tgen-simple-mem

2015-02-23 Thread Andreas Hansson via gem5-dev
Hi,

I would suggest making a self-contained script rather than relying on
test.py. That way it is easier to get a global picture of what you are
actually doing.

I suspect something is going wrong in how you are launching the simulator.

Andreas

On 23/02/2015 16:47, "Sensen Hu - EWI via gem5-dev" 
wrote:

>Thank you very much. I've solved the problem.
>But the TraceGen doesn't traverse the whole trace file.
>I made a small trace, for example:
>r,64,64,4000
>r,128,64,5000
>w,196,64,6000
>r,256,64,7000
>r,3453276,64,8000
>r,320,64,9000
>r,320,64,1
>w,196,64,11000
>r,3453276,64,12000
>
>The TraceGen only runs to tick 9000. I want to know why the execution
>stops.
>
>My tgen-simple-mem.cfg is as follows:
>STATE 0 100 TRACE tests/quick/se/70.tgen/tgen-simple-mem.trc 1
>INIT 0
>TRANSITION 0 0 1
>
>I set maxtick = 16000 in test.py




[gem5-dev] Review Request 2671: config: Specify OS type and release on command line

2015-02-23 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2671/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10725:46ee7538a15b
---
config: Specify OS type and release on command line

This patch enables users to specify --os-type on the command
line. This option is used to take specific actions for an OS type,
such as changing the kernel command line. This patch is part of the
Android KitKat enablement.
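
A rough sketch of how such an option could steer OS-specific behaviour is
given below. It uses Python's standard argparse module rather than the
option handling in the gem5 config scripts, and the OS type names, the extra
kernel argument and the helper names are all hypothetical placeholders, not
values taken from the patch.

```python
import argparse

# Hypothetical OS types and extra kernel arguments, for illustration only.
EXTRA_KERNEL_ARGS = {
    "linux": "",
    "android-kitkat": "example.flag=1",
}


def build_parser():
    """Build a parser exposing an --os-type option."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--os-type",
        choices=sorted(EXTRA_KERNEL_ARGS),
        default="linux",
        help="OS type used to select OS-specific actions",
    )
    return parser


def kernel_cmdline(base, os_type):
    """Append the OS-specific arguments (if any) to the base command line."""
    extra = EXTRA_KERNEL_ARGS[os_type]
    return f"{base} {extra}".strip()
```

Keeping the OS-specific pieces in one table means a new OS type only needs a
new entry rather than another branch in the configuration script.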


Diffs
-

  configs/common/Benchmarks.py c6cb94a14fea 
  configs/common/FSConfig.py c6cb94a14fea 
  configs/common/Options.py c6cb94a14fea 
  configs/example/fs.py c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2671/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] Review Request 2670: mem: Fix cache MSHR conflict determination

2015-02-23 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2670/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10723:a8355eea658c
---
mem: Fix cache MSHR conflict determination

This patch fixes a rather subtle issue in the sending of MSHR requests
in the cache, where the logic previously did not check for conflicts
between the MSHR queue and the write queue when requests were not
ready. The correct thing to do is to always check, since not having a
ready MSHR does not guarantee that there is no conflict.

The underlying problem seems to have slipped past due to the symmetric
timings used for the write queue and MSHR queue. However, with the
recent timing changes the bug caused regressions to fail.
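
The principle can be illustrated with a toy model. This is only a sketch and
not the code in cache_impl.hh: the Entry type, its fields and both functions
are hypothetical. The point it shows is that the conflict lookup scans every
pending entry in the other queue, whether or not that entry is ready.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    blk_addr: int   # block address the entry targets
    order: int      # allocation order, lower means older
    ready: bool     # whether the entry may be sent right now


def find_conflict(queue, blk_addr):
    """Return the oldest pending entry for blk_addr, ready or not."""
    matches = [e for e in queue if e.blk_addr == blk_addr]
    return min(matches, key=lambda e: e.order, default=None)


def next_to_send(mshr_queue, write_queue):
    """Pick the next entry, always checking the other queue for conflicts."""
    miss = min((e for e in mshr_queue if e.ready),
               key=lambda e: e.order, default=None)
    write = min((e for e in write_queue if e.ready),
                key=lambda e: e.order, default=None)
    if miss is not None:
        other, candidate = write_queue, miss
    elif write is not None:
        other, candidate = mshr_queue, write
    else:
        return None
    # The check is unconditional: an older entry to the same block in the
    # other queue must go first even if it is not marked ready.
    conflict = find_conflict(other, candidate.blk_addr)
    if conflict is not None and conflict.order < candidate.order:
        return conflict
    return candidate
```

In this model, a ready write to a block with an older, not-yet-ready miss to
the same block yields the miss, which is the case the description says was
previously overlooked.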


Diffs
-

  src/mem/cache/cache_impl.hh c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2670/diff/


Testing
---


Thanks,

Andreas Hansson



Re: [gem5-dev] Review Request 2664: mem: Add byte mask to Packet::checkFunctional

2015-02-23 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2664/
---

(Updated Feb. 23, 2015, 1:04 p.m.)


Review request for Default.


Repository: gem5


Description (updated)
---

Changeset 10722:366b2fa691b7
---
mem: Add byte mask to Packet::checkFunctional

This patch changes the valid-bytes start/end to a proper byte
mask. With the changes in timing introduced in previous patches there
are more packets waiting in queues, and there are regressions using
the checker CPU failing due to non-contiguous read data being found in
the various cache queues.

This patch also adds some more comments explaining what is going on,
and adds the fourth and missing case to Packet::checkFunctional.
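
To see why a start/end pair is not enough, consider the toy model below. It
is a sketch under assumptions and not the Packet::checkFunctional code: the
function name, its arguments and the first-found-wins policy for already
valid bytes are all hypothetical. A per-byte mask can record that the first
and last bytes of a read were found while the middle is still missing.

```python
def check_functional(read_addr, read_size, data, valid, pkt_addr, pkt_data):
    """Copy the bytes a queued packet holds for the read.

    `data` is a bytearray of read_size bytes and `valid` is a list of
    booleans, one per byte, playing the role of the byte mask.
    Returns True once every byte of the read has been found.
    """
    for i in range(read_size):
        j = read_addr + i - pkt_addr  # index of this byte inside the queued packet
        if 0 <= j < len(pkt_data) and not valid[i]:
            data[i] = pkt_data[j]
            valid[i] = True
    return all(valid)
```

For an 8-byte read, finding bytes 0-1 in one queue and bytes 6-7 in another
leaves a mask of two valid bytes, four invalid bytes and two valid bytes,
which a single contiguous range cannot describe.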


Diffs (updated)
-

  src/mem/packet.cc c6cb94a14fea 
  src/mem/packet.hh c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2664/diff/


Testing
---


Thanks,

Andreas Hansson



Re: [gem5-dev] Review Request 2660: mem: Add crossbar latencies

2015-02-23 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2660/
---

(Updated Feb. 23, 2015, 1:02 p.m.)


Review request for Default.


Repository: gem5


Description (updated)
---

Changeset 10718:1f9584f1a1c6
---
mem: Add crossbar latencies

This patch introduces latencies in the crossbar that were neglected
before. In particular, it adds three parameters to the crossbar model:
front_end_latency, forward_latency, and response_latency. Along with
these parameters, three corresponding members are added:
frontEndLatency, forwardLatency, and responseLatency. The coherent
crossbar has an additional snoop_response_latency.

The latency of the request path through the xbar is set as
--> frontEndLatency + forwardLatency

In case the snoop filter is enabled, the request path is also charged
the look-up latency of the snoop filter:
--> frontEndLatency + SF(lookupLatency) + forwardLatency.

The latency of the response path through the xbar is instead set as
--> responseLatency.

In case of a snoop response, if the response is treated as a normal response
the associated latency is again
--> responseLatency;

If instead it is forwarded as a snoop response, we add an additional variable
+ snoopResponseLatency
and the associated latency is
--> snoopResponseLatency;

Furthermore, this patch lets the crossbar progress on the next clock
edge after an unused retry, changing the time the crossbar considers
itself busy after sending a retry that was not acted upon.
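
The latency rules above can be written down as two small functions. This is
a sketch of the arithmetic only, not the crossbar implementation; the
function names are hypothetical and the values are taken to be in cycles.

```python
def request_path_latency(front_end_latency, forward_latency,
                         snoop_filter_lookup_latency=None):
    """Request path: front end + forward, plus the snoop filter look-up if enabled."""
    latency = front_end_latency + forward_latency
    if snoop_filter_lookup_latency is not None:
        latency += snoop_filter_lookup_latency
    return latency


def response_path_latency(response_latency, snoop_response_latency,
                          forwarded_as_snoop_response=False):
    """Response path: normal responses and snoop responses are charged differently."""
    if forwarded_as_snoop_response:
        return snoop_response_latency
    return response_latency
```

With made-up values of 3 cycles for the front end, 4 for forwarding and 1
for the snoop filter look-up, a request is charged 7 cycles without the
snoop filter and 8 cycles with it.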


Diffs (updated)
-

  src/mem/xbar.cc c6cb94a14fea 
  src/mem/noncoherent_xbar.cc c6cb94a14fea 
  src/mem/xbar.hh c6cb94a14fea 
  src/mem/noncoherent_xbar.hh c6cb94a14fea 
  src/mem/coherent_xbar.cc c6cb94a14fea 
  src/mem/XBar.py c6cb94a14fea 
  src/mem/coherent_xbar.hh c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2660/diff/


Testing
---


Thanks,

Andreas Hansson



Re: [gem5-dev] Reading Memory Traces

2015-02-20 Thread Andreas Hansson via gem5-dev
Hi Gregory,

The traffic generators should work just fine with Ruby, and they already
have trace players. There is nothing precluding you from using the
non-Ruby memory system. Ruby gives you the option of having a more
elaborate network topology, but if you are fine with crossbars the
non-Ruby memory system is probably a more sensible starting point.

gem5 also has trace capturing facilities through the CommMonitor, so if
you capture the traces on gem5 it should be easy to replay them (and avoid
the PA VA issues).

Andreas


On 20/02/2015 14:17, "VAUMOURIN Grégory via gem5-dev" 
wrote:

>Hello,
>
>I'm new to gem5!
>
>I want to read some pre-recorded memory traces (a file that contains
>a series of memory accesses) with gem5. I want to stress the interconnection
>network and the memories with it. I need precise simulations, so I want
>to use the Ruby simulator.
>
>I've started to implement a CPU that just reads an access from my memory
>trace and converts it into a request for Ruby. This is quite similar to
>the Ruby Tester, which generates traffic with random addresses. I
>don't know if some work has already been done in this direction.
>
>Anyway, I'm having trouble with the addresses of my requests, because I
>recorded my memory traces on my computer, so the addresses that I have
>recorded are 48-bit virtual addresses and I don't know how to convert them
>to physical addresses (which seem to be 27 bits in my case?).
>
>Does anyone have a clue, or can anyone point me to a better way to stimulate
>the Ruby system with a pre-recorded file? I can't use the TrafficGen
>generators if I use Ruby systems, right?
>
>--
>Grégory Vaumourin
>Doctorant - Phd Student
>Laboratoire Adéquation Algorithme Architecture -- Matching
>Algorithm Architecture Laboratory
>Phone: +33 (1) 69080069
>
>
>
>




Re: [gem5-dev] Review Request 2655: config: Fix for 'android' lookup in disk name

2015-02-19 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2655/#review5898
---

Ship it!

- Andreas Hansson


On Feb. 19, 2015, 10:46 p.m., Rizwana Begum wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2655/
> ---
> 
> (Updated Feb. 19, 2015, 10:46 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10695:74aaa564d5cc
> ---
> config: Fix for 'android' lookup in disk name
> 
> This patch modifies FSConfig.py to look for 'android' only in the disk
> image name. Before this patch, 'android' was searched for in the full
> disk path.
> 
> 
> Diffs
> -
> 
>   configs/common/FSConfig.py 1a6785e37d81 
> 
> Diff: http://reviews.gem5.org/r/2655/diff/
> 
> 
> Testing
> ---
> 
> All quick and long regressions for ARM passed
> 
> 
> Thanks,
> 
> Rizwana Begum
> 
>
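
The change described in the quoted changeset can be sketched as follows.
This is not the FSConfig.py code: the function name and the example paths
are hypothetical, and the case-insensitive match is an assumption.

```python
import os.path


def is_android_image(disk_path):
    """Match 'android' against the image file name only, not its directories."""
    return "android" in os.path.basename(disk_path).lower()
```

With this check a path such as /home/user/android-work/disks/linux-arm.img
is no longer mistaken for an Android image just because a parent directory
happens to contain the word.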



[gem5-dev] Review Request 2664: mem: Add byte mask to Packet::checkFunctional

2015-02-19 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2664/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10722:e34eb3130030
---
mem: Add byte mask to Packet::checkFunctional

This patch changes the valid-bytes start/end to a proper byte
mask. With the changes in timing introduced in previous patches there
are more packets waiting in queues, and there are regressions using
the checker CPU failing due to non-contiguous read data being found in
the various cache queues.

This patch also adds some more comments explaining what is going on,
and adds the fourth and missing case to Packet::checkFunctional.


Diffs
-

  src/mem/packet.hh c6cb94a14fea 
  src/mem/packet.cc c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2664/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] Review Request 2662: mem: Downstream components consumes new crossbar delays

2015-02-19 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2662/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10720:bd4fc2b06322
---
mem: Downstream components consumes new crossbar delays

This patch makes the caches and memory controllers consume the delay
that is annotated to a packet by the crossbar. Previously many
components simply threw these delays away. Note that the devices still
do not pay for these delays.


Diffs
-

  src/mem/cache/cache_impl.hh c6cb94a14fea 
  src/mem/dram_ctrl.cc c6cb94a14fea 
  src/mem/dramsim2.cc c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2662/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] Review Request 2661: mem: Move crossbar default latencies to subclasses

2015-02-19 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2661/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10719:857838bcc36e
---
mem: Move crossbar default latencies to subclasses

This patch introduces a few subclasses to the CoherentXBar and
NoncoherentXBar to distinguish the different uses in the system. We
use the crossbar in a wide range of places: interfacing cores to the
L2, as a system interconnect, connecting I/O and peripherals,
etc. Needless to say, these crossbars have very different performance,
and the clock frequency alone is not enough to distinguish these
scenarios.

Instead of trying to capture every possible case, this patch
introduces dedicated subclasses for the three primary use-cases:
L2XBar, SystemXBar and IOXBar. More can be added if needed, and the
defaults can be overridden.
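
The idea of use-case subclasses with overridable defaults can be sketched in
plain Python. This is not the XBar.py from the patch: the base classes here
are stand-ins, the latency values are placeholders, and which subclass
derives from the coherent or the non-coherent crossbar is an assumption.

```python
class XBar:
    """Stand-in for the crossbar base class; latencies are in cycles."""
    front_end_latency = None
    forward_latency = None
    response_latency = None

    def __init__(self, **overrides):
        # Allow per-instance overrides of the class-level defaults.
        for name, value in overrides.items():
            if not hasattr(type(self), name):
                raise AttributeError(f"unknown parameter: {name}")
            setattr(self, name, value)


class CoherentXBar(XBar):
    snoop_response_latency = None


class NoncoherentXBar(XBar):
    pass


class L2XBar(CoherentXBar):
    # Placeholder defaults for a crossbar between the cores and the L2.
    front_end_latency = 1
    forward_latency = 0
    response_latency = 1
    snoop_response_latency = 1


class SystemXBar(CoherentXBar):
    # Placeholder defaults for a system interconnect.
    front_end_latency = 3
    forward_latency = 4
    response_latency = 2
    snoop_response_latency = 4


class IOXBar(NoncoherentXBar):
    # Placeholder defaults for an I/O and peripheral crossbar.
    front_end_latency = 2
    forward_latency = 1
    response_latency = 2
```

A configuration would then pick the subclass matching the use case and
override only what differs, for example SystemXBar(response_latency=5).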


Diffs
-

  configs/common/CacheConfig.py c6cb94a14fea 
  configs/common/FSConfig.py c6cb94a14fea 
  configs/dram/sweep.py c6cb94a14fea 
  configs/example/memcheck.py c6cb94a14fea 
  configs/example/memtest.py c6cb94a14fea 
  configs/example/ruby_mem_test.py c6cb94a14fea 
  configs/example/se.py c6cb94a14fea 
  configs/ruby/Ruby.py c6cb94a14fea 
  configs/splash2/cluster.py c6cb94a14fea 
  configs/splash2/run.py c6cb94a14fea 
  src/cpu/BaseCPU.py c6cb94a14fea 
  src/mem/XBar.py c6cb94a14fea 
  tests/configs/base_config.py c6cb94a14fea 
  tests/configs/memtest-filter.py c6cb94a14fea 
  tests/configs/memtest.py c6cb94a14fea 
  tests/configs/o3-timing-mp-ruby.py c6cb94a14fea 
  tests/configs/o3-timing-ruby.py c6cb94a14fea 
  tests/configs/simple-atomic-mp-ruby.py c6cb94a14fea 
  tests/configs/tgen-dram-ctrl.py c6cb94a14fea 
  tests/configs/tgen-simple-mem.py c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2661/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] Review Request 2660: mem: Add crossbar latencies

2015-02-19 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2660/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10718:6a1f2d99bf79
---
mem: Add crossbar latencies

This patch introduces latencies in the crossbar that were neglected
before. In particular, it adds three parameters to the crossbar model:
front_end_latency, forward_latency, and response_latency. Along with
these parameters, three corresponding members are added:
frontEndLatency, forwardLatency, and responseLatency. The coherent
crossbar has an additional snoop_response_latency.

The latency of the request path through the xbar is set as
--> frontEndLatency + forwardLatency

In case the snoop filter is enabled, the request path is also charged
the look-up latency of the snoop filter:
--> frontEndLatency + SF(lookupLatency) + forwardLatency.

The latency of the response path through the xbar is instead set as
--> responseLatency.

In case of a snoop response, if the response is treated as a normal response
the associated latency is again
--> responseLatency;

If instead it is forwarded as a snoop response, we add an additional variable
+ snoopResponseLatency
and the associated latency is
--> snoopResponseLatency;

Furthermore, this patch lets the crossbar progress on the next clock
edge after an unused retry, changing the time the crossbar considers
itself busy after sending a retry that was not acted upon.


Diffs
-

  src/mem/XBar.py c6cb94a14fea 
  src/mem/coherent_xbar.hh c6cb94a14fea 
  src/mem/coherent_xbar.cc c6cb94a14fea 
  src/mem/noncoherent_xbar.cc c6cb94a14fea 
  src/mem/xbar.hh c6cb94a14fea 
  src/mem/xbar.cc c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2660/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] Review Request 2663: mem: Add option to force in-order insertion in PacketQueue

2015-02-19 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2663/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10721:212f4fdb3eaa
---
mem: Add option to force in-order insertion in PacketQueue

By default, the packet queue is ordered by the ticks of the to-be-sent
packets. With the recent modifications of packets sinking their header time
when their response leaves the caches, there could be cases of MSHR targets
being allocated and ordered A, B, but their responses being sent out in the
order B, A. This led to inconsistencies in bus traffic, in particular the snoop
filter observing first a ReadExResp and later a ReadRespWithInv. Logically,
these were ordered the other way around behind the MSHR, but due to the timing
adjustments when inserting into the PacketQueue, they were sent out in the
wrong order on the bus, confusing the snoop filter.

This patch adds a flag (off by default) such that these special cases can
request in-order insertion into the packet queue, which might offset timing
slightly. This is expected to occur rarely and not affect timing results.
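
One way to picture the flag is the toy queue below. It is a sketch under
assumptions, not the packet_queue.cc code: the class, method and attribute
names are hypothetical, and pushing the send tick back to that of the last
queued packet is just one possible way to keep insertion order.

```python
class ToyPacketQueue:
    """Queue of (tick, packet) pairs kept sorted by tick."""

    def __init__(self):
        self.transmit_list = []

    def sched_send_timing(self, pkt, when, force_order=False):
        if force_order and self.transmit_list:
            back_tick = self.transmit_list[-1][0]
            if when < back_tick:
                # Delay the packet so it cannot overtake earlier insertions.
                when = back_tick
        # Insert after the last entry whose tick is not later than `when`.
        idx = len(self.transmit_list)
        while idx > 0 and self.transmit_list[idx - 1][0] > when:
            idx -= 1
        self.transmit_list.insert(idx, (when, pkt))

    def send_order(self):
        return [pkt for _, pkt in self.transmit_list]
```

Without the flag, scheduling A at tick 200 and then B at tick 100 sends B
first; with the flag set for B, B is delayed to tick 200 and leaves after A,
which is the small timing offset the description mentions.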


Diffs
-

  src/mem/cache/cache_impl.hh c6cb94a14fea 
  src/mem/packet_queue.hh c6cb94a14fea 
  src/mem/packet_queue.cc c6cb94a14fea 
  src/mem/qport.hh c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2663/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] Review Request 2657: cpu: Add a PC-value to the traffic generator requests

2015-02-19 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2657/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10710:756b63187a8e
---
cpu: Add a PC-value to the traffic generator requests

Have the traffic generator add its masterID as the PC address to the
requests. That way, prefetchers (and other components) that use a PC
for request classification will see per-tester streams of requests.
This enables us to test strided prefetchers with the memchecker, too.
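
To see why a per-tester PC matters, consider a rough sketch of PC-indexed stride detection. It is not gem5's prefetcher: SketchStrideTable, StrideEntry and countStrideMatches are hypothetical names, and the addresses and strides are made up for illustration. Two testers issue interleaved strided streams; the table only recognises the strides when each stream carries its own PC.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using Addr = std::uint64_t;

// Per-PC state of the illustrative stride detector.
struct StrideEntry {
    Addr lastAddr = 0;
    std::int64_t stride = 0;
    bool seenAddr = false;
    bool seenStride = false;
};

// Illustrative PC-indexed stride table; not gem5's prefetcher.
class SketchStrideTable {
  public:
    // Returns true when the stride for this PC repeats.
    bool observe(Addr pc, Addr addr)
    {
        StrideEntry& e = table[pc];
        bool match = false;
        if (e.seenAddr) {
            const std::int64_t s =
                static_cast<std::int64_t>(addr - e.lastAddr);
            match = e.seenStride && s == e.stride;
            e.stride = s;
            e.seenStride = true;
        }
        e.lastAddr = addr;
        e.seenAddr = true;
        return match;
    }

  private:
    std::unordered_map<Addr, StrideEntry> table;
};

// Interleave two strided streams (made-up bases and strides) and count
// how often the table sees a repeated stride.
int countStrideMatches(bool per_tester_pc)
{
    SketchStrideTable table;
    int matches = 0;
    for (int i = 0; i < 8; ++i) {
        for (int tester = 0; tester < 2; ++tester) {
            const Addr base = tester == 0 ? 0x1000 : 0x80000;
            const Addr step = tester == 0 ? 64 : 128;
            // Either a distinct PC per tester (standing in for the
            // master ID) or one shared PC for all requests.
            const Addr pc = per_tester_pc ? tester + 1 : 0;
            if (table.observe(pc, base + step * i))
                ++matches;
        }
    }
    return matches;
}
```

In this sketch, countStrideMatches(true) finds a repeated stride for every request after the second one of each stream, while countStrideMatches(false) finds none because the interleaving hides both strides.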


Diffs
-

  src/cpu/testers/traffic_gen/generators.cc c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2657/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] Review Request 2659: dev, arm: Clean up PL011 and rewrite interrupt handling

2015-02-19 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2659/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10717:d4b5abfcc632
---
dev, arm: Clean up PL011 and rewrite interrupt handling

The ARM PL011 UART model didn't clear and raise interrupts
correctly. This changeset rewrites the whole interrupt handling and
makes it both simpler and fixes several cases where the correct
interrupts weren't raised or cleared. Additionally, it cleans up many
other aspects of the code.
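
One common way to structure such interrupt handling is to funnel every change of the raw status or the mask through a single update function that compares the masked status before and after. The sketch below assumes that structure; it is not the code from the patch, and SketchUartInterrupts, its members and the two bit constants are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative bit positions for two interrupt sources.
const std::uint16_t SKETCH_INT_RX = 1 << 4;
const std::uint16_t SKETCH_INT_TX = 1 << 5;

// Illustrative interrupt logic; not the PL011 model from the patch.
class SketchUartInterrupts {
  public:
    SketchUartInterrupts() : rawInt(0), mask(0), line(false) {}

    void raise(std::uint16_t ints)
    {
        update(static_cast<std::uint16_t>(rawInt | ints), mask);
    }

    void clear(std::uint16_t ints)
    {
        update(static_cast<std::uint16_t>(rawInt & ~ints), mask);
    }

    void setMask(std::uint16_t new_mask) { update(rawInt, new_mask); }

    // Raw status filtered by the mask.
    std::uint16_t maskedStatus() const
    {
        return static_cast<std::uint16_t>(rawInt & mask);
    }

    bool lineAsserted() const { return line; }

  private:
    // Single point where the interrupt line changes: compare the
    // masked status before and after the update.
    void update(std::uint16_t new_raw, std::uint16_t new_mask)
    {
        const bool was_active = (rawInt & mask) != 0;
        rawInt = new_raw;
        mask = new_mask;
        const bool is_active = (rawInt & mask) != 0;

        if (!was_active && is_active)
            line = true;    // a real model would post the interrupt
        else if (was_active && !is_active)
            line = false;   // a real model would clear the interrupt
    }

    std::uint16_t rawInt;
    std::uint16_t mask;
    bool line;
};
```

Because raise, clear and setMask all go through update, the line cannot stay asserted after the last enabled source is cleared, nor stay low after an already pending source is unmasked.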


Diffs
-

  src/dev/arm/pl011.hh c6cb94a14fea 
  src/dev/arm/pl011.cc c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2659/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] Review Request 2656: tests: Run regression timeout as foreground

2015-02-19 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2656/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10709:1fe2beace503
---
tests: Run regression timeout as foreground

Allow the user to send signals such as Ctrl C to the gem5 runs. Note
that this assumes coreutils >= 8.13, which aligns with Ubuntu 12.04
and RHEL 6.


Diffs
-

  SConstruct c6cb94a14fea 
  tests/SConscript c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2656/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] Review Request 2658: arm: Share a port for the two table walker objects

2015-02-19 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2658/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10716:30365626f911
---
arm: Share a port for the two table walker objects

This patch changes how the MMU and table walkers are created such that
a single port is used to connect the MMU and the TLBs to the memory
system. Previously two ports were needed as there are two table walker
objects (stage one and stage two), and they both had a port. Now the
port itself is moved to the Stage2MMU, and each TableWalker is simply
using the port from the parent.

By using the same port we also remove the need for having an
additional crossbar joining the two ports before the walker cache or
the L2. This simplifies the creation of the CPU cache topology in
BaseCPU.py considerably. Moreover, for naming and symmetry reasons,
the TLB walker port is connected through the stage-one table walker
thus making the naming identical to x86. Along the same line, we use
the stage-one table walker to generate the master id that is used by
all TLB-related requests.
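
The ownership change can be pictured with a small sketch in which the parent owns one port and both walkers hold a reference to it. All names here (SketchPort, SketchTableWalker, SketchStage2MMU, SketchWalkRequest) are hypothetical and the classes are heavily simplified; passing a single master id to both walkers is an assumption that mirrors the description, not the actual interface.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using Addr = std::uint64_t;

// Hypothetical record of a page-table walk request.
struct SketchWalkRequest {
    int masterId;
    std::string walker;
    Addr addr;
};

// Stand-in for the single port towards the memory system.
class SketchPort {
  public:
    void send(const SketchWalkRequest& req) { sent.push_back(req); }
    const std::vector<SketchWalkRequest>& requests() const { return sent; }

  private:
    std::vector<SketchWalkRequest> sent;
};

// A walker no longer owns a port; it borrows the one from its parent.
class SketchTableWalker {
  public:
    SketchTableWalker(const std::string& name, SketchPort& port,
                      int master_id)
        : name(name), port(port), masterId(master_id) {}

    void walk(Addr vaddr)
    {
        port.send(SketchWalkRequest{masterId, name, vaddr});
    }

  private:
    std::string name;
    SketchPort& port;
    int masterId;
};

// The parent owns the port and hands it to both walkers.
class SketchStage2MMU {
  public:
    explicit SketchStage2MMU(int master_id)
        : port(),
          stage1("stage1", port, master_id),
          stage2("stage2", port, master_id) {}

    // Declared before the walkers so it is constructed first.
    SketchPort port;
    SketchTableWalker stage1;
    SketchTableWalker stage2;
};
```

Both walkers then emit their requests on the same port with the same master id, so nothing downstream needs to merge two streams.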


Diffs
-

  src/arch/arm/ArmTLB.py c6cb94a14fea 
  src/arch/arm/stage2_mmu.hh c6cb94a14fea 
  src/arch/arm/stage2_mmu.cc c6cb94a14fea 
  src/arch/arm/table_walker.hh c6cb94a14fea 
  src/arch/arm/table_walker.cc c6cb94a14fea 
  src/arch/arm/tlb.hh c6cb94a14fea 
  src/arch/arm/tlb.cc c6cb94a14fea 
  src/cpu/BaseCPU.py c6cb94a14fea 

Diff: http://reviews.gem5.org/r/2658/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] changeset in gem5: dev: Fix undefined behaviour in i8254xGBe

2015-02-16 Thread Andreas Hansson via gem5-dev
changeset ac3236a0873b in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=ac3236a0873b
description:
dev: Fix undefined behaviour in i8254xGBe

This patch fixes a rather unfortunate oversight where the annotation
pointer was used even though it is null. Somehow the code still works,
but UBSan is rather unhappy. The use is now guarded, and the variable
is initialised in the constructor (as well as init()).

diffstat:

 src/dev/i8254xGBe.cc |   4 ++--
 src/dev/i8254xGBe.hh |  23 +++++++++++++++--------
 2 files changed, 17 insertions(+), 10 deletions(-)

diffs (71 lines):

diff -r aaa0f985da8e -r ac3236a0873b src/dev/i8254xGBe.cc
--- a/src/dev/i8254xGBe.cc  Mon Feb 16 03:34:18 2015 -0500
+++ b/src/dev/i8254xGBe.cc  Mon Feb 16 03:34:35 2015 -0500
@@ -58,7 +58,7 @@
 using namespace Net;
 
 IGbE::IGbE(const Params *p)
-: EtherDevice(p), etherInt(NULL),  drainManager(NULL),
+: EtherDevice(p), etherInt(NULL), cpa(NULL), drainManager(NULL),
   rxFifo(p->rx_fifo_size), txFifo(p->tx_fifo_size), rxTick(false),
   txTick(false), txFifoTick(false), rxDmaPacket(false), pktOffset(0),
   fetchDelay(p->fetch_delay), wbDelay(p->wb_delay), 
@@ -2390,7 +2390,7 @@
 
 anPq("TXQ", "TX FIFO Q");
 if (etherInt->sendPacket(txFifo.front())) {
-cpa->hwQ(CPA::FL_NONE, sys, macAddr, "TXQ", "WireQ", 0);
+anQ("TXQ", "WireQ");
 if (DTRACE(EthernetSM)) {
 IpPtr ip(txFifo.front());
 if (ip)
diff -r aaa0f985da8e -r ac3236a0873b src/dev/i8254xGBe.hh
--- a/src/dev/i8254xGBe.hh  Mon Feb 16 03:34:18 2015 -0500
+++ b/src/dev/i8254xGBe.hh  Mon Feb 16 03:34:35 2015 -0500
@@ -182,31 +182,38 @@
 void checkDrain();
 
 void anBegin(std::string sm, std::string st, int flags = CPA::FL_NONE) {
-cpa->hwBegin((CPA::flags)flags, sys, macAddr, sm, st);
+if (cpa)
+cpa->hwBegin((CPA::flags)flags, sys, macAddr, sm, st);
 }
 
-void anQ(std::string sm, std::string q) { 
-cpa->hwQ(CPA::FL_NONE, sys, macAddr, sm, q, macAddr);
+void anQ(std::string sm, std::string q) {
+if (cpa)
+cpa->hwQ(CPA::FL_NONE, sys, macAddr, sm, q, macAddr);
 }
 
 void anDq(std::string sm, std::string q) {
-cpa->hwDq(CPA::FL_NONE, sys, macAddr, sm, q, macAddr);
+if (cpa)
+cpa->hwDq(CPA::FL_NONE, sys, macAddr, sm, q, macAddr);
 }
 
 void anPq(std::string sm, std::string q, int num = 1) {
-cpa->hwPq(CPA::FL_NONE, sys, macAddr, sm, q, macAddr, NULL, num);
+if (cpa)
+cpa->hwPq(CPA::FL_NONE, sys, macAddr, sm, q, macAddr, NULL, num);
 }
 
 void anRq(std::string sm, std::string q, int num = 1) {
-cpa->hwRq(CPA::FL_NONE, sys, macAddr, sm, q, macAddr, NULL, num);
+if (cpa)
+cpa->hwRq(CPA::FL_NONE, sys, macAddr, sm, q, macAddr, NULL, num);
 }
 
 void anWe(std::string sm, std::string q) {
-cpa->hwWe(CPA::FL_NONE, sys, macAddr, sm, q, macAddr);
+if (cpa)
+cpa->hwWe(CPA::FL_NONE, sys, macAddr, sm, q, macAddr);
 }
 
 void anWf(std::string sm, std::string q) {
-cpa->hwWf(CPA::FL_NONE, sys, macAddr, sm, q, macAddr);
+if (cpa)
+cpa->hwWf(CPA::FL_NONE, sys, macAddr, sm, q, macAddr);
 }
 
 


[gem5-dev] changeset in gem5: cpu: TrafficGen sinks snoops without complaining

2015-02-16 Thread Andreas Hansson via gem5-dev
changeset 63810213a687 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=63810213a687
description:
cpu: TrafficGen sinks snoops without complaining

To be able to use the TrafficGen in a system with caches we need to
allow it to sink incoming snoop requests. By default the master port
panics, so silently ignore any snoops.

diffstat:

 src/cpu/testers/traffic_gen/traffic_gen.hh |  6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diffs (16 lines):

diff -r 41413f830836 -r 63810213a687 src/cpu/testers/traffic_gen/traffic_gen.hh
--- a/src/cpu/testers/traffic_gen/traffic_gen.hh  Mon Feb 16 03:34:47 2015 -0500
+++ b/src/cpu/testers/traffic_gen/traffic_gen.hh  Mon Feb 16 03:34:55 2015 -0500
@@ -152,6 +152,12 @@
 
 bool recvTimingResp(PacketPtr pkt);
 
+void recvTimingSnoopReq(PacketPtr pkt) { }
+
+void recvFunctionalSnoop(PacketPtr pkt) { }
+
+Tick recvAtomicSnoop(PacketPtr pkt) { return 0; }
+
   private:
 
 TrafficGen& trafficGen;


[gem5-dev] changeset in gem5: config: Add memcheck stress test

2015-02-16 Thread Andreas Hansson via gem5-dev
changeset c6cb94a14fea in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=c6cb94a14fea
description:
config: Add memcheck stress test

This is a rather unfortunate copy of the memtest.py example script,
that actually stresses the system with true sharing as opposed to the
false sharing of the MemTest. To do so it uses TrafficGen instances to
generate the reads/writes, and MemCheckerMonitor combined with the
MemChecker to check the validity of the read/written values.

As a bonus, this script also enables the addition of prefetchers, and
the traffic is created to have a mix of random addresses and linear
strides. We use the TaggedPrefetcher since the packets do not have a
request with a PC.

At the moment the code is almost identical to the memtest.py script,
and no effort has been made to factor out the construction of the
tree. The challenge is that the instantiation and connection of the
testers and monitors is done as part of the tree building.

diffstat:

 configs/example/memcheck.py |  306 
 configs/example/memtest.py  |2 +-
 2 files changed, 307 insertions(+), 1 deletions(-)

diffs (truncated from 322 to 300 lines):

diff -r 63810213a687 -r c6cb94a14fea configs/example/memcheck.py
--- /dev/null   Thu Jan 01 00:00:00 1970 +
+++ b/configs/example/memcheck.py   Mon Feb 16 03:35:23 2015 -0500
@@ -0,0 +1,306 @@
+# Copyright (c) 2015 ARM Limited
+# All rights reserved.
+#
+# The license below extends only to copyright in the software and shall
+# not be construed as granting a license to any other intellectual
+# property including but not limited to intellectual property relating
+# to a hardware implementation of the functionality of the software
+# licensed hereunder.  You may use the software subject to the license
+# terms below provided that you ensure that this notice is replicated
+# unmodified and in its entirety in all distributions of the software,
+# modified or unmodified, in source code or in binary form.
+#
+# Copyright (c) 2006-2007 The Regents of The University of Michigan
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met: redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer;
+# redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution;
+# neither the name of the copyright holders nor the names of its
+# contributors may be used to endorse or promote products derived from
+# this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+# Authors: Ron Dreslinski
+#  Andreas Hansson
+
+import optparse
+import sys
+
+import m5
+from m5.objects import *
+
+parser = optparse.OptionParser()
+
+parser.add_option("-a", "--atomic", action="store_true",
+  help="Use atomic (non-timing) mode")
+parser.add_option("-b", "--blocking", action="store_true",
+  help="Use blocking caches")
+parser.add_option("-m", "--maxtick", type="int", default=m5.MaxTick,
+  metavar="T",
+  help="Stop after T ticks")
+parser.add_option("-p", "--prefetchers", action="store_true",
+  help="Use prefetchers")
+parser.add_option("-s", "--stridepref", action="store_true",
+  help="Use strided prefetchers")
+
+# This example script has a lot in common with the memtest.py in that
+# it is designed to stress tests the memory system. However, this
+# script uses oblivious traffic generators to create the stimuli, and
+# couples them with memcheckers to verify that the data read matches
+# the allowed outcomes. Just like memtest.py, the traffic generators
+# and checkers are placed in a tree topology. At the bottom of the
+# tree is a shared memory, and then at each level a number of
+# generators and checkers are attached, alo

[gem5-dev] changeset in gem5: mem: Use the range cache for lookup as well a...

2015-02-16 Thread Andreas Hansson via gem5-dev
changeset d0004c12d024 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=d0004c12d024
description:
mem: Use the range cache for lookup as well as access

This patch changes the range cache used in the global physical memory
to be an iterator so that we can use it not only as part of isMemAddr,
but also access and functionalAccess. This matches use-cases where a
core is using the atomic non-caching memory mode, and repeatedly calls
isMemAddr and access.

Linux boot on aarch32, with a single atomic CPU, is now more than 30%
faster when using "--fastmem" compared to not using the direct memory
access.
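
The idea of caching an iterator rather than a range can be sketched independently of gem5. The following stand-in uses std::map in place of the address range map; SketchPhysMem, SketchRange and the lookup counter are hypothetical and exist only to show that a cached iterator serves both the membership test and the access.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

using Addr = std::uint64_t;

// Half-open address range [start, end).
struct SketchRange {
    Addr start;
    Addr end;
    bool contains(Addr a) const { return a >= start && a < end; }
};

// Illustrative memory map; a std::map stands in for the interval tree.
class SketchPhysMem {
  public:
    SketchPhysMem() : rangeCache(ranges.end()), lookups(0) {}
    // The cached iterator points into this object's own map, so
    // copying is disabled in the sketch.
    SketchPhysMem(const SketchPhysMem&) = delete;
    SketchPhysMem& operator=(const SketchPhysMem&) = delete;

    void addMemory(Addr start, Addr size, const std::string& name)
    {
        ranges.emplace(start, Entry{SketchRange{start, start + size}, name});
        rangeCache = ranges.end();
    }

    bool isMemAddr(Addr addr) const
    {
        if (cacheHit(addr))
            return true;
        const auto it = find(addr);
        if (it == ranges.end())
            return false;
        rangeCache = it;   // remember the matching entry
        return true;
    }

    // Returns the name of the memory that would service the access.
    const std::string& access(Addr addr) const
    {
        if (cacheHit(addr))
            return rangeCache->second.name;
        // No cache update here, mirroring the comment in the patch.
        const auto it = find(addr);
        assert(it != ranges.end());
        return it->second.name;
    }

    unsigned treeLookups() const { return lookups; }

  private:
    struct Entry {
        SketchRange range;
        std::string name;
    };
    typedef std::map<Addr, Entry> Map;

    bool cacheHit(Addr addr) const
    {
        return rangeCache != ranges.end() &&
               rangeCache->second.range.contains(addr);
    }

    // Find the entry whose range contains addr, or end() if none does.
    Map::const_iterator find(Addr addr) const
    {
        ++lookups;
        Map::const_iterator it = ranges.upper_bound(addr);
        if (it == ranges.begin())
            return ranges.end();
        --it;
        return it->second.range.contains(addr) ? it : ranges.end();
    }

    Map ranges;
    mutable Map::const_iterator rangeCache;
    mutable unsigned lookups;
};
```

A membership test followed by an access to the same range then costs a single tree lookup, which is the pattern the description attributes to the atomic non-caching memory mode.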

diffstat:

 src/mem/physical.cc |  38 ++++++++++++++++++++++++--------------
 src/mem/physical.hh |   5 +++--
 2 files changed, 27 insertions(+), 16 deletions(-)

diffs (92 lines):

diff -r 829adc48e175 -r d0004c12d024 src/mem/physical.cc
--- a/src/mem/physical.cc   Mon Feb 16 03:33:28 2015 -0500
+++ b/src/mem/physical.cc   Mon Feb 16 03:33:37 2015 -0500
@@ -60,7 +60,7 @@
 
 PhysicalMemory::PhysicalMemory(const string& _name,
const vector& _memories) :
-_name(_name), size(0)
+_name(_name), rangeCache(addrMap.end()), size(0)
 {
 // add the memories from the system to the address map as
 // appropriate
@@ -181,7 +181,9 @@
 PhysicalMemory::isMemAddr(Addr addr) const
 {
 // see if the address is within the last matched range
-if (!rangeCache.contains(addr)) {
+if (rangeCache != addrMap.end() && rangeCache->first.contains(addr)) {
+return true;
+} else {
 // lookup in the interval tree
 const auto& r = addrMap.find(addr);
 if (r == addrMap.end()) {
@@ -189,13 +191,9 @@
 return false;
 }
 // the range is in the tree, update the cache
-rangeCache = r->first;
+rangeCache = r;
+return true;
 }
-
-assert(addrMap.find(addr) != addrMap.end());
-
-// either matched the cache or found in the tree
-return true;
 }
 
 AddrRangeList
@@ -239,9 +237,15 @@
 {
 assert(pkt->isRequest());
 Addr addr = pkt->getAddr();
-const auto& m = addrMap.find(addr);
-assert(m != addrMap.end());
-m->second->access(pkt);
+if (rangeCache != addrMap.end() && rangeCache->first.contains(addr)) {
+rangeCache->second->access(pkt);
+} else {
+// do not update the cache here, as we typically call
+// isMemAddr before calling access
+const auto& m = addrMap.find(addr);
+assert(m != addrMap.end());
+m->second->access(pkt);
+}
 }
 
 void
@@ -249,9 +253,15 @@
 {
 assert(pkt->isRequest());
 Addr addr = pkt->getAddr();
-const auto& m = addrMap.find(addr);
-assert(m != addrMap.end());
-m->second->functionalAccess(pkt);
+if (rangeCache != addrMap.end() && rangeCache->first.contains(addr)) {
+rangeCache->second->functionalAccess(pkt);
+} else {
+// do not update the cache here, as we typically call
+// isMemAddr before calling functionalAccess
+const auto& m = addrMap.find(addr);
+assert(m != addrMap.end());
+m->second->functionalAccess(pkt);
+}
 }
 
 void
diff -r 829adc48e175 -r d0004c12d024 src/mem/physical.hh
--- a/src/mem/physical.hh   Mon Feb 16 03:33:28 2015 -0500
+++ b/src/mem/physical.hh   Mon Feb 16 03:33:37 2015 -0500
@@ -75,8 +75,9 @@
 // Global address map
 AddrRangeMap addrMap;
 
-// a mutable cache for the last range that matched an address
-mutable AddrRange rangeCache;
+// a mutable cache for the last address map iterator that matched
+// an address
+mutable AddrRangeMap::const_iterator rangeCache;
 
 // All address-mapped memories
 std::vector memories;
___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] changeset in gem5: arch: Make readMiscRegNoEffect const throughout

2015-02-16 Thread Andreas Hansson via gem5-dev
changeset 829adc48e175 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=829adc48e175
description:
arch: Make readMiscRegNoEffect const throughout

Finally took the plunge and made this apply to all ISAs, not just ARM.

diffstat:

 src/arch/alpha/isa.cc |  2 +-
 src/arch/alpha/isa.hh |  2 +-
 src/arch/mips/isa.cc  |  4 ++--
 src/arch/mips/isa.hh  |  4 ++--
 src/arch/power/isa.hh |  2 +-
 src/arch/sparc/isa.cc |  2 +-
 src/arch/sparc/isa.hh |  2 +-
 src/arch/x86/isa.cc   |  2 +-
 src/arch/x86/isa.hh   |  2 +-
 src/cpu/checker/cpu.hh|  2 +-
 src/cpu/checker/thread_context.hh |  2 +-
 src/cpu/minor/exec_context.hh |  2 +-
 src/cpu/o3/cpu.cc |  2 +-
 src/cpu/o3/cpu.hh |  2 +-
 src/cpu/o3/thread_context.hh  |  2 +-
 src/cpu/simple/base.hh|  2 +-
 src/cpu/simple_thread.hh  |  2 +-
 src/cpu/thread_context.hh |  4 ++--
 18 files changed, 21 insertions(+), 21 deletions(-)

diffs (242 lines):

diff -r 71c40e5c8bd4 -r 829adc48e175 src/arch/alpha/isa.cc
--- a/src/arch/alpha/isa.cc Fri Jan 16 14:12:03 2015 -0600
+++ b/src/arch/alpha/isa.cc Mon Feb 16 03:33:28 2015 -0500
@@ -74,7 +74,7 @@
 
 
 MiscReg
-ISA::readMiscRegNoEffect(int misc_reg, ThreadID tid)
+ISA::readMiscRegNoEffect(int misc_reg, ThreadID tid) const
 {
 switch (misc_reg) {
   case MISCREG_FPCR:
diff -r 71c40e5c8bd4 -r 829adc48e175 src/arch/alpha/isa.hh
--- a/src/arch/alpha/isa.hh Fri Jan 16 14:12:03 2015 -0600
+++ b/src/arch/alpha/isa.hh Mon Feb 16 03:33:28 2015 -0500
@@ -73,7 +73,7 @@
 
   public:
 
-MiscReg readMiscRegNoEffect(int misc_reg, ThreadID tid = 0);
+MiscReg readMiscRegNoEffect(int misc_reg, ThreadID tid = 0) const;
 MiscReg readMiscReg(int misc_reg, ThreadContext *tc, ThreadID tid = 0);
 
 void setMiscRegNoEffect(int misc_reg, const MiscReg &val,
diff -r 71c40e5c8bd4 -r 829adc48e175 src/arch/mips/isa.cc
--- a/src/arch/mips/isa.cc  Fri Jan 16 14:12:03 2015 -0600
+++ b/src/arch/mips/isa.cc  Mon Feb 16 03:33:28 2015 -0500
@@ -410,14 +410,14 @@
 }
 
 inline unsigned
-ISA::getVPENum(ThreadID tid)
+ISA::getVPENum(ThreadID tid) const
 {
 TCBindReg tcBind = miscRegFile[MISCREG_TC_BIND][tid];
 return tcBind.curVPE;
 }
 
 MiscReg
-ISA::readMiscRegNoEffect(int misc_reg, ThreadID tid)
+ISA::readMiscRegNoEffect(int misc_reg, ThreadID tid) const
 {
 unsigned reg_sel = (bankType[misc_reg] == perThreadContext)
 ? tid : getVPENum(tid);
diff -r 71c40e5c8bd4 -r 829adc48e175 src/arch/mips/isa.hh
--- a/src/arch/mips/isa.hh  Fri Jan 16 14:12:03 2015 -0600
+++ b/src/arch/mips/isa.hh  Mon Feb 16 03:33:28 2015 -0500
@@ -76,7 +76,7 @@
 
 void configCP();
 
-unsigned getVPENum(ThreadID tid);
+unsigned getVPENum(ThreadID tid) const;
 
 //
 //
@@ -87,7 +87,7 @@
 //@TODO: MIPS MT's register view automatically connects
 //   Status to TCStatus depending on current thread
 void updateCP0ReadView(int misc_reg, ThreadID tid) { }
-MiscReg readMiscRegNoEffect(int misc_reg, ThreadID tid = 0);
+MiscReg readMiscRegNoEffect(int misc_reg, ThreadID tid = 0) const;
 
 //template 
 MiscReg readMiscReg(int misc_reg,
diff -r 71c40e5c8bd4 -r 829adc48e175 src/arch/power/isa.hh
--- a/src/arch/power/isa.hh Fri Jan 16 14:12:03 2015 -0600
+++ b/src/arch/power/isa.hh Mon Feb 16 03:33:28 2015 -0500
@@ -61,7 +61,7 @@
 }
 
 MiscReg
-readMiscRegNoEffect(int misc_reg)
+readMiscRegNoEffect(int misc_reg) const
 {
 fatal("Power does not currently have any misc regs defined\n");
 return dummy;
diff -r 71c40e5c8bd4 -r 829adc48e175 src/arch/sparc/isa.cc
--- a/src/arch/sparc/isa.cc Fri Jan 16 14:12:03 2015 -0600
+++ b/src/arch/sparc/isa.cc Mon Feb 16 03:33:28 2015 -0500
@@ -173,7 +173,7 @@
 }
 
 MiscReg
-ISA::readMiscRegNoEffect(int miscReg)
+ISA::readMiscRegNoEffect(int miscReg) const
 {
 
   // The three miscRegs are moved up from the switch statement
diff -r 71c40e5c8bd4 -r 829adc48e175 src/arch/sparc/isa.hh
--- a/src/arch/sparc/isa.hh Fri Jan 16 14:12:03 2015 -0600
+++ b/src/arch/sparc/isa.hh Mon Feb 16 03:33:28 2015 -0500
@@ -183,7 +183,7 @@
 
   public:
 
-MiscReg readMiscRegNoEffect(int miscReg);
+MiscReg readMiscRegNoEffect(int miscReg) const;
 MiscReg readMiscReg(int miscReg, ThreadContext *tc);
 
 void setMiscRegNoEffect(int miscReg, const MiscReg val);
diff -r 71c40e5c8bd4 -r 829adc48e175 src/arch/x86/isa.cc
--- a/src/arch/x86/isa.cc   Fri Jan 16 14:12:03 2015 -0600
+++ b/src/arch/x86/isa.cc   Mon Feb 16 03:33:28 2015 -0500
@@ -124,7 +124,7 @@
 }
 
 MiscReg
-ISA::readMiscRegNoEffect(int miscReg)
+ISA::readMiscRegNoEffect(int miscReg) const
 {

[gem5-dev] changeset in gem5: mem: mmap the backing store with MAP_NORESERVE

2015-02-16 Thread Andreas Hansson via gem5-dev
changeset 417ba77dedb4 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=417ba77dedb4
description:
mem: mmap the backing store with MAP_NORESERVE

This patch ensures we can run simulations with very large simulated
memories (at least 64 TB based on some quick runs on a Linux
workstation). In essence this allows us to efficiently deal with
sparse address maps without having to implement a redirection layer in
the backing store.

This opens up for run-time errors if we eventually exhaust the host's
memory and swap space, but this should hopefully never happen.
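
As a stand-alone illustration of the mmap call involved, the sketch below maps an anonymous private region and optionally adds MAP_NORESERVE. The helper names createBackingStore and destroyBackingStore are hypothetical, and defining MAP_NORESERVE as 0 wherever it is missing is a simplification of the platform check in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

// Simplification: treat the flag as a no-op where it does not exist.
#ifndef MAP_NORESERVE
#define MAP_NORESERVE 0
#endif

// Map an anonymous, private, zero-filled region. With use_noreserve
// set, the host is asked not to reserve swap space up front, so pages
// are only backed once they are touched.
std::uint8_t* createBackingStore(std::size_t size, bool use_noreserve)
{
    int map_flags = MAP_ANON | MAP_PRIVATE;
    if (use_noreserve)
        map_flags |= MAP_NORESERVE;

    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     map_flags, -1, 0);
    if (mem == MAP_FAILED)
        return nullptr;
    return static_cast<std::uint8_t*>(mem);
}

// Unmap a region previously returned by createBackingStore.
void destroyBackingStore(std::uint8_t* mem, std::size_t size)
{
    if (mem != nullptr) {
        const int ret = munmap(mem, size);
        assert(ret == 0);
        (void)ret;
    }
}
```

Only the pages that are actually written consume host memory, which is what makes a large but sparse simulated address map affordable. The trade-off named in the description remains: if the host later runs out of memory, the failure shows up on first touch rather than at mmap time.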

diffstat:

 src/mem/physical.cc |  28 ++++++++++++++++++++++++++--
 src/mem/physical.hh |   6 +++++-
 src/sim/System.py   |   7 +++++++
 src/sim/system.cc   |   2 +-
 4 files changed, 39 insertions(+), 4 deletions(-)

diffs (101 lines):

diff -r d0004c12d024 -r 417ba77dedb4 src/mem/physical.cc
--- a/src/mem/physical.cc   Mon Feb 16 03:33:37 2015 -0500
+++ b/src/mem/physical.cc   Mon Feb 16 03:33:47 2015 -0500
@@ -56,12 +56,29 @@
 #include "mem/abstract_mem.hh"
 #include "mem/physical.hh"
 
+/**
+ * On Linux, MAP_NORESERVE allow us to simulate a very large memory
+ * without committing to actually providing the swap space on the
+ * host. On OSX the MAP_NORESERVE flag does not exist, so simply make
+ * it 0.
+ */
+#if defined(__APPLE__)
+#ifndef MAP_NORESERVE
+#define MAP_NORESERVE 0
+#endif
+#endif
+
 using namespace std;
 
 PhysicalMemory::PhysicalMemory(const string& _name,
-   const vector& _memories) :
-_name(_name), rangeCache(addrMap.end()), size(0)
+   const vector& _memories,
+   bool mmap_using_noreserve) :
+_name(_name), rangeCache(addrMap.end()), size(0),
+mmapUsingNoReserve(mmap_using_noreserve)
 {
+if (mmap_using_noreserve)
+warn("Not reserving swap space. May cause SIGSEGV on actual usage\n");
+
 // add the memories from the system to the address map as
 // appropriate
 for (const auto& m : _memories) {
@@ -148,6 +165,13 @@
 DPRINTF(AddrRanges, "Creating backing store for range %s with size %d\n",
 range.to_string(), range.size());
 int map_flags = MAP_ANON | MAP_PRIVATE;
+
+// to be able to simulate very large memories, the user can opt to
+// pass noreserve to mmap
+if (mmapUsingNoReserve) {
+map_flags |= MAP_NORESERVE;
+}
+
 uint8_t* pmem = (uint8_t*) mmap(NULL, range.size(),
 PROT_READ | PROT_WRITE,
 map_flags, -1, 0);
diff -r d0004c12d024 -r 417ba77dedb4 src/mem/physical.hh
--- a/src/mem/physical.hh   Mon Feb 16 03:33:37 2015 -0500
+++ b/src/mem/physical.hh   Mon Feb 16 03:33:47 2015 -0500
@@ -85,6 +85,9 @@
 // The total memory size
 uint64_t size;
 
+// Let the user choose if we reserve swap space when calling mmap
+const bool mmapUsingNoReserve;
+
 // The physical memory used to provide the memory in the simulated
 // system
 std::vector> backingStore;
@@ -112,7 +115,8 @@
  * Create a physical memory object, wrapping a number of memories.
  */
 PhysicalMemory(const std::string& _name,
-   const std::vector& _memories);
+   const std::vector& _memories,
+   bool mmap_using_noreserve);
 
 /**
  * Unmap all the backing store we have used.
diff -r d0004c12d024 -r 417ba77dedb4 src/sim/System.py
--- a/src/sim/System.py Mon Feb 16 03:33:37 2015 -0500
+++ b/src/sim/System.py Mon Feb 16 03:33:47 2015 -0500
@@ -59,6 +59,13 @@
   "All memories in the system")
 mem_mode = Param.MemoryMode('atomic', "The mode the memory system is in")
 
+# When reserving memory on the host, we have the option of
+# reserving swap space or not (by passing MAP_NORESERVE to
+# mmap). By enabling this flag, we accomodate cases where a large
+# (but sparse) memory is simulated.
+mmap_using_noreserve = Param.Bool(False, "mmap the backing store " \
+  "without reserving swap")
+
 # The memory ranges are to be populated when creating the system
 # such that these can be passed from the I/O subsystem through an
 # I/O bridge or cache
diff -r d0004c12d024 -r 417ba77dedb4 src/sim/system.cc
--- a/src/sim/system.cc Mon Feb 16 03:33:37 2015 -0500
+++ b/src/sim/system.cc Mon Feb 16 03:33:47 2015 -0500
@@ -88,7 +88,7 @@
   loadAddrMask(p->load_addr_mask),
   loadAddrOffset(p->load_offset),
   nextPID(0),
-  physmem(name() + ".physmem", p->memories),
+  physmem(name() + ".physmem", p->memories, p->mmap_using_noreserve),
   memoryMode(p->mem_mode),
   _cacheLineSize(p->cache_line_size),
   workItemsBegin(0),
___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5s

Re: [gem5-dev] Review Request 2655: config: Fix for 'android' lookup in disk name

2015-02-15 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2655/#review5897
---



configs/common/FSConfig.py


Thanks for this.

Could you perhaps add a comment on the line above stating what this check 
is doing?



- Andreas Hansson


On Feb. 13, 2015, 9:10 p.m., Rizwana Begum wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2655/
> ---
> 
> (Updated Feb. 13, 2015, 9:10 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10695:542395f62f88
> ---
> config: Fix for 'android' lookup in disk name
> 
> This patch modifies FSConfig.py to look for 'android' only in disk
> image name. Before this patch, 'android' was searched in full
> disk path.
> 
> 
> Diffs
> -
> 
>   configs/common/FSConfig.py 1a6785e37d81 
> 
> Diff: http://reviews.gem5.org/r/2655/diff/
> 
> 
> Testing
> ---
> 
> All quick and long regressions for ARM passed
> 
> 
> Thanks,
> 
> Rizwana Begum
> 
>
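
The difference between searching the full path and searching only the image name can be shown with a short sketch. The actual check lives in FSConfig.py; the C++ helper isAndroidImage and the example paths are hypothetical and only illustrate the logic.

```cpp
#include <cassert>
#include <string>

// Look for "android" in the file name only, ignoring the directories
// that lead to it.
bool isAndroidImage(const std::string& disk_path)
{
    const std::string::size_type slash = disk_path.find_last_of('/');
    const std::string name = (slash == std::string::npos)
        ? disk_path
        : disk_path.substr(slash + 1);
    return name.find("android") != std::string::npos;
}
```

A Linux image stored under a directory that happens to be called android is then no longer mistaken for an Android image.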



[gem5-dev] Patches ready to go

2015-02-13 Thread Andreas Hansson via gem5-dev
Hi everyone,

The following patches are ready to go, and I intend to push them early next 
week. If you need more time to review please let me know.

Thanks,

Andreas

# CPU tracing
http://reviews.gem5.org/r/2563/

# Miscellaneous cleanups and fixes
http://reviews.gem5.org/r/2567/
http://reviews.gem5.org/r/2621/
http://reviews.gem5.org/r/2622/
http://reviews.gem5.org/r/2623/
http://reviews.gem5.org/r/2624/
http://reviews.gem5.org/r/2630/

# Memcheck and testing
http://reviews.gem5.org/r/2625/
http://reviews.gem5.org/r/2626/


-- IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered 
in England & Wales, Company No: 2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, 
Registered in England & Wales, Company No: 2548782


Re: [gem5-dev] Review Request 2626: config: Add memcheck stress test

2015-02-13 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2626/
---

(Updated Feb. 13, 2015, 8:55 a.m.)


Review request for Default.


Repository: gem5


Description
---

Changeset 10705:bc21bfd38cbd
---
config: Add memcheck stress test

This is a rather unfortunate copy of the memtest.py example script,
that actually stresses the system with true sharing as opposed to the
false sharing of the MemTest. To do so it uses TrafficGen instances to
generate the reads/writes, and MemCheckerMonitor combined with the
MemChecker to check the validity of the read/written values.

As a bonus, this script also enables the addition of prefetchers, and
the traffic is created to have a mix of random addresses and linear
strides. We use the TaggedPrefetcher since the packets do not have a
request with a PC.

At the moment the code is almost identical to the memtest.py script,
and no effort has been made to factor out the construction of the
tree. The challenge is that the instantiation and connection of the
testers and monitors is done as part of the tree building.


Diffs (updated)
-

  configs/example/memcheck.py PRE-CREATION 
  configs/example/memtest.py 1a6785e37d81 

Diff: http://reviews.gem5.org/r/2626/diff/


Testing
---


Thanks,

Andreas Hansson



Re: [gem5-dev] Review Request 2626: config: Add memcheck stress test

2015-02-13 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2626/
---

(Updated Feb. 13, 2015, 8:54 a.m.)


Review request for Default.


Repository: gem5


Description (updated)
---

Changeset 10705:bc21bfd38cbd
---
config: Add memcheck stress test

This is a rather unfortunate copy of the memtest.py example script,
that actually stresses the system with true sharing as opposed to the
false sharing of the MemTest. To do so it uses TrafficGen instances to
generate the reads/writes, and MemCheckerMonitor combined with the
MemChecker to check the validity of the read/written values.

As a bonus, this script also enables the addition of prefetchers, and
the traffic is created to have a mix of random addresses and linear
strides. We use the TaggedPrefetcher since the packets do not have a
request with a PC.

At the moment the code is almost identical to the memtest.py script,
and no effort has been made to factor out the construction of the
tree. The challenge is that the instantiation and connection of the
testers and monitors is done as part of the tree building.


Diffs (updated)
-

  configs/example/memcheck.py PRE-CREATION 
  configs/example/memtest.py 1a6785e37d81 

Diff: http://reviews.gem5.org/r/2626/diff/


Testing
---


Thanks,

Andreas Hansson



Re: [gem5-dev] Review Request 2636: mem: fix prefetcher bug regarding write buffer hits

2015-02-13 Thread Andreas Hansson via gem5-dev


> On Feb. 10, 2015, 5:37 p.m., Stephan Diestelhorst wrote:
> > I have had a similar impulse, when inspecting this code.  However, the 
> > prefetch hitting a write-back in an upper cache is actually already handled 
> > in Cache::getTimingPacket():
> > 
> > // Check if the prefetch hit a writeback in an upper cache
> > // and if so we will eventually get a HardPFResp from
> > // above
> > if (snoop_pkt.memInhibitAsserted()) {
> > // If we are getting a non-shared response it is dirty
> > bool pending_dirty_resp = !snoop_pkt.sharedAsserted();
> > markInService(mshr, pending_dirty_resp);
> > DPRINTF(Cache, "Upward snoop of prefetch for addr"
> > " %#x (%s) hit\n",
> > tgt_pkt->getAddr(), tgt_pkt->isSecure()? "s": "ns");
> > return NULL;
> > }
> > 
> > We are currently testing a patch that shuffles this thing upwards; we will 
> > have more detail tomorrow.  Either way, if we want to go with this patch, it 
> > should address this seemingly dead bit of logic as well.  My suggestion is 
> > to give this an additional day, and in the meantime test as Andreas has 
> > suggested.
> 
> Steve Reinhardt wrote:
> Good catch, I hadn't noticed that before.  I believe what's happening in 
> George's case is that there are multiple L1s above a shared L2, and one of 
> them has/had the block in O state (dirty shared) and is in the process of 
> writing it back, while others have the block in the shared state.  So the one 
> with the block in writeback goes to write back the data (in accordance with 
> the code snippet you have there), but meanwhile other caches have squashed 
> the prefetch, so your code never gets executed, because this code immediately 
> above it gets executed instead:
> 
> // Check to see if the prefetch was squashed by an upper
> // cache (to prevent us from grabbing the line) or if a
> // writeback arrived between the time the prefetch was
> // placed in the MSHRs and when it was selected to be sent.
> if (snoop_pkt.prefetchSquashed() || blk != NULL) {
> DPRINTF(Cache, "Prefetch squashed by cache.  "
>"Deallocating mshr target %#x.\n", 
> mshr->addr);
> [...]
> return NULL;
> }
> 
> Interestingly the point where the prefetches get squashed is here, in 
> handleSnoop():
> 
> // Invalidate any prefetch's from below that would strip write permissions
> // MemCmd::HardPFReq is only observed by upstream caches.  After missing
> // above and in it's own cache, a new MemCmd::ReadReq is created that
> // downstream caches observe.
> if (pkt->cmd == MemCmd::HardPFReq) {
> DPRINTF(Cache, "Squashing prefetch from lower cache %#x\n",
> pkt->getAddr());
> pkt->setPrefetchSquashed();
> return;
> }
> 
> where despite the language about "strip write permissions", the prefetch 
> appears to get squashed as long as the block is valid, regardless of the 
> state.
> 
> Steve Reinhardt wrote:
> So if nothing else, my commit message describes the bug incorrectly... 
> it's not just a matter of hitting in the write buffer, it's handling the case 
> where it *both* hits in the write buffer of one upper-level cache, and also 
> gets squashed because of a hit in another upper-level cache.  The actual 
> symptom we were seeing was that the response from the cache with the 
> write-buffer copy was causing an assertion, since the receiving cache wasn't 
> expecting a response because it had squashed the prefetch.

Here is the fix: http://reviews.gem5.org/r/2654/


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2636/#review5885
---


On Feb. 6, 2015, 12:38 a.m., Steve Reinhardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2636/
> ---
> 
> (Updated Feb. 6, 2015, 12:38 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10683:3147f3a868f7
> ---
> mem: fix prefetcher bug regarding write buffer hits
> 
> Prefetches are supposed to be squashed if the block is already
> present in a higher-level cache.  We squash appropriately if
> the block is in a higher-level cache or MSHR, but did not
> properly handle the case where the block is in the write buffer.
> 
> Thanks to George Michelogiannakis  for
> help i

[gem5-dev] Review Request 2654: mem: Fix prefetchSquash + memInhibitAsserted bug

2015-02-13 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2654/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10706:7aa00b79bc99
---
mem: Fix prefetchSquash + memInhibitAsserted bug

This patch resolves a bug with hardware prefetches. Before a hardware
prefetch is sent towards the memory, the system generates a snoop
request to check all caches above the prefetch generating cache for
the presence of the prefetch target. If the prefetch target is found
in the tags or the MSHRs of the upper caches, the cache sets the
prefetchSquashed flag in the snoop packet. When the snoop packet
returns with the prefetchSquashed flag set, the prefetch generating
cache deallocates the MSHR reserved for the prefetch. If the prefetch
target is found in the writeback buffer of the upper cache, the cache
sets the memInhibit flag, which signals the prefetch generating cache
to expect the data from the writeback. When the snoop packet returns
with the memInhibitAsserted flag set, it marks the allocated MSHR as
inService and waits for the data from the writeback.

If the prefetch target is found in multiple upper level caches,
specifically in the tags or MSHRs of one upper level cache and the
writeback buffer of another, the snoop packet will return with both
prefetchSquashed and memInhibitAsserted set, and the current code is
not written to handle such an outcome. The current code checks for the
prefetchSquashed flag first, and if it finds the flag, it deallocates
the reserved MSHR. This leads to an assertion failure when the data
from the writeback arrives at the cache. In this fix, we simply switch
the order of the checks: we first check for memInhibitAsserted and
then for prefetchSquashed.
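
The effect of the check order can be sketched with a small model in Python. This is not the code in cache_impl.hh: the class, the function names and the returned action strings are all hypothetical, and the model ignores every detail except the two flags on the returning snoop packet.

```python
from dataclasses import dataclass


@dataclass
class SnoopResult:
    """Hypothetical stand-in for the flags carried by the snoop packet."""
    prefetch_squashed: bool = False
    mem_inhibit_asserted: bool = False


def handle_before_fix(snoop):
    """Old order: the squash check wins when both flags are set."""
    if snoop.prefetch_squashed:
        return "deallocate_mshr"
    if snoop.mem_inhibit_asserted:
        return "mark_in_service"
    return "send_prefetch"


def handle_after_fix(snoop):
    """New order: a pending writeback response is honoured first."""
    if snoop.mem_inhibit_asserted:
        return "mark_in_service"
    if snoop.prefetch_squashed:
        return "deallocate_mshr"
    return "send_prefetch"


if __name__ == "__main__":
    both = SnoopResult(prefetch_squashed=True, mem_inhibit_asserted=True)
    # Before the fix the MSHR is freed even though data is on its way.
    print(handle_before_fix(both))
    # After the fix the MSHR stays allocated and waits for the data.
    print(handle_after_fix(both))
```

The two orderings only differ when both flags are set, which is exactly the multi-cache case described above.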


Diffs
-

  src/mem/cache/cache_impl.hh 1a6785e37d81 

Diff: http://reviews.gem5.org/r/2654/diff/


Testing
---


Thanks,

Andreas Hansson



Re: [gem5-dev] Using gprof in SE mode

2015-02-12 Thread Andreas Hansson via gem5-dev
Hi Raul,

gem5.perf and gem5.prof are targets used to profile gem5, not what is
running as a guest in gem5. gem5.perf is using google perftools, and
gem5.prof is using gprof. Thus, if you are optimising gem5 itself these
build targets are very helpful. Note that the stats.txt has nothing to do
with profiling gem5 itself, and you get exactly the same stats with
gem5.fast, gem5.debug etc.

gem5 ARM full-system lets you run a ‘normal’ profiler in the simulated
guest OS. I do not dare say for the other ISAs.

Andreas

On 12/02/2015 22:34, "Raul Garcia via gem5-dev"  wrote:

>Hello Steve,
>
>Thank you for your reply :). I am a new user of gem5 and so far I know
>that running gem5.prof enables the stats.txt to report a number of
>metrics like number of ops simulated, etc. Now I have some questions:
>
>1. I want to know the percentage of the program's total running time
>that is spent in a particular function, as the gprof tool would report, e.g.:
>
>Each sample counts as 0.01 seconds.
>  %   cumulative   self              self     total
> time   seconds   seconds    calls   s/call   s/call  name
> 33.86     15.52     15.52        1    15.52    15.52  func2
> 33.82     31.02     15.50        1    15.50    15.50  new_func1
> 33.29     46.27     15.26        1    15.26    30.75  func1
>  0.07     46.30      0.03                             main
>
>Does the stats.txt file report these metrics?
>
>2. If I take the gmon.out file that was generated before and the binary
>and I execute the gprof command:
>
>$ arm-linux-gnueabi-gprof hello gmon.out
>arm-linux-gnueabi-gprof: gmon.out: unexpected EOF after reading 4196091
>of 25458728 samples
>
>I get an error. Am I doing it wrong? How can I read the generated
>gmon.out file using gprof and the binary?
>
>3. The stats.txt reports the number of committed ops, but how can I know
>the opcodes of all of those ops that were committed? I know that the
>number of instructions of each class is reported, but how can I dump the
>exact number of times that each particular instruction was executed?
>
>
>Best Regards,
>Raul.
>
>From: gem5-dev [gem5-dev-boun...@gem5.org] on behalf of Steve Reinhardt
>via gem5-dev [gem5-dev@gem5.org]
>Sent: Thursday, February 12, 2015 8:37 PM
>To: gem5 Developer List
>Subject: Re: [gem5-dev] Using gprof in SE mode
>
>Are you looking to run gprof specifically, or are you just wanting profile
>information?  As was discussed in this earlier thread, those are very
>different things, and only the latter one is really practical.
>
>I don't know that anyone has followed up on the ideas discussed on that
>earlier thread (I hope anyone who has speaks up), but if you need help on
>some of the things that were mentioned (using the BaseCPU profile or
>function_trace features) please ask away.
>
>Steve
>
>
>On Thu, Feb 12, 2015 at 10:32 AM, Raul Garcia via gem5-dev <
>gem5-dev@gem5.org> wrote:
>
>> Hello All,
>>
>> I know that this thread is a little old, but I am interested in using
>> gprof for applications profiling in Syscall emulation mode. So my
>> question is, is this feasible to do? Would it work in FS mode or should
>> I look for a different alternative?
>>
>> Cheers,
>> Raul.


-- IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium.  Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered 
in England & Wales, Company No:  2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, 
Registered in England & Wales, Company No:  2548782


[gem5-dev] changeset in gem5: stats: Bump the MemTest regression stats

2015-02-11 Thread Andreas Hansson via gem5-dev
changeset 12632859858a in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=12632859858a
description:
stats: Bump the MemTest regression stats

Reflect changes in the tester behaviour.

diffstat:

 
 tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/stats.txt      |  1919 ++--
 tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MOESI_CMP_directory/stats.txt |  2575 +++---
 tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MOESI_CMP_token/stats.txt     |  2861
 tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MOESI_hammer/stats.txt        |  2671 +++---
 tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby/stats.txt                     |  1258 +-
 tests/quick/se/50.memtest/ref/null/none/memtest-filter/stats.txt                     |  3343 -
 tests/quick/se/50.memtest/ref/null/none/memtest/stats.txt                            |  3321
 7 files changed, 8938 insertions(+), 9010 deletions(-)

diffs (truncated from 18771 to 300 lines):

diff -r 22452667fd5c -r 12632859858a tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/stats.txt
--- a/tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/stats.txt   Wed Feb 11 10:23:28 2015 -0500
+++ b/tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/stats.txt   Wed Feb 11 10:23:31 2015 -0500
@@ -1,48 +1,48 @@
 
 -- Begin Simulation Statistics --
-sim_seconds  0.010101  # Number of seconds simulated
-sim_ticks  10100518  # Number of ticks simulated
-final_tick  10100518  # Number of ticks from beginning of simulation (restored from checkpoints and never reset)
+sim_seconds  0.010140  # Number of seconds simulated
+sim_ticks  10139920  # Number of ticks simulated
+final_tick  10139920  # Number of ticks from beginning of simulation (restored from checkpoints and never reset)
 sim_freq  10  # Frequency of simulated ticks
-host_tick_rate  129349  # Simulator tick rate (ticks/s)
-host_mem_usage  663928  # Number of bytes of host memory used
-host_seconds  78.09  # Real time elapsed on the host
+host_tick_rate  145167  # Simulator tick rate (ticks/s)
+host_mem_usage  481408  # Number of bytes of host memory used
+host_seconds  69.85  # Real time elapsed on the host
 system.voltage_domain.voltage  1  # Voltage in Volts
 system.clk_domain.clock  1  # Clock period in ticks
-system.mem_ctrls.bytes_read::ruby.dir_cntrl0  39550848  # Number of bytes read from this memory
-system.mem_ctrls.bytes_read::total  39550848  # Number of bytes read from this memory
-system.mem_ctrls.bytes_written::ruby.dir_cntrl0  14145024  # Number of bytes written to this memory
-system.mem_ctrls.bytes_written::total  14145024  # Number of bytes written to this memory
-system.mem_ctrls.num_reads::ruby.dir_cntrl0  617982  # Number of read requests responded to by this memory
-system.mem_ctrls.num_reads::total  617982  # Number of read requests responded to by this memory
-system.mem_ctrls.num_writes::ruby.dir_cntrl0  221016  # Number of write requests responded to by this memory
-system.mem_ctrls.num_writes::total  221016  # Number of write requests responded to by this memory
-system.mem_ctrls.bw_read::ruby.dir_cntrl0  3915724718  # Total read bandwidth from this memory (bytes/s)
-system.mem_ctrls.bw_read::total  3915724718  # Total read bandwidth from this memory (bytes/s)
-system.mem_ctrls.bw_write::ruby.dir_cntrl0  1400425602  # Write bandwidth from this memory (bytes/s)
-system.mem_ctrls.bw_write::total  1400425602  # Write bandwidth from this memory (bytes/s)
-system.mem_ctrls.bw_total::ruby.dir_cntrl0  5316150320  # Total bandwidth to/from this memory (bytes/s)
-system.mem_ctrls.bw_total::total  5316150320  # Total bandwidth to/from this memory 

[gem5-dev] changeset in gem5: config: Revamp memtest to allow testers on an...

2015-02-11 Thread Andreas Hansson via gem5-dev
changeset 4972ada74310 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=4972ada74310
description:
config: Revamp memtest to allow testers on any level

This patch revamps the memtest example script and allows for the
insertion of testers at any level in the cache hierarchy. Previously
all created topologies placed testers only at the very top, and the
tree was thus entirely symmetric. With the changes made, it is possible
to not only place testers at the leaf caches (L1), but also to connect
testers at the L2, L3 etc.

As part of the changes the object hierarchy is also simplified to
ensure that the visual representation from the DOT printing looks
sensible. Using SubSystems to group the objects is one of the key
features.
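
As a rough illustration of the colon-separated specifications taken by the new --caches and --testers options, the sketch below parses and sanity-checks a pair of strings. It is not the code in memtest.py: the function names are hypothetical, and the tester count reflects one reading of the script comments (each cache at a level fans out to its own subtree, and testers are attached per node at each level).

```python
def parse_spec(spec):
    """Turn a colon-separated string such as '2:2:1' into [2, 2, 1]."""
    return [int(field) for field in spec.split(":")]


def check_tree(caches, testers):
    """Validate a cache/tester specification pair (hypothetical helper).

    The script comments say the tester string should have one element
    more than the cache string, and that testers should always be
    attached to the uppermost caches.
    """
    cachespec = parse_spec(caches)
    testerspec = parse_spec(testers)
    if len(testerspec) != len(cachespec) + 1:
        raise ValueError("tester spec needs one element more than cache spec")
    if testerspec[-1] < 1:
        raise ValueError("the uppermost caches need at least one tester")
    return cachespec, testerspec


def count_testers(cachespec, testerspec):
    """Count testers, assuming every cache fans out to its own subtree."""
    nodes = 1  # attachment points at the current level (memory first)
    total = 0
    for level, per_node in enumerate(testerspec):
        total += nodes * per_node
        if level < len(cachespec):
            nodes *= cachespec[level]
    return total


if __name__ == "__main__":
    caches, testers = check_tree("2:2:1", "1:1:0:2")  # script defaults
    print(caches, testers, count_testers(caches, testers))
```

Under these assumptions the default specification places one tester next to main memory, one at each of the two caches above it, none at the next level, and two at each of the four uppermost caches.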

diffstat:

 configs/example/memtest.py |  256 +---
 1 files changed, 170 insertions(+), 86 deletions(-)

diffs (truncated from 340 to 300 lines):

diff -r 12632859858a -r 4972ada74310 configs/example/memtest.py
--- a/configs/example/memtest.py   Wed Feb 11 10:23:31 2015 -0500
+++ b/configs/example/memtest.py   Wed Feb 11 10:23:31 2015 -0500
@@ -1,3 +1,15 @@
+# Copyright (c) 2015 ARM Limited
+# All rights reserved.
+#
+# The license below extends only to copyright in the software and shall
+# not be construed as granting a license to any other intellectual
+# property including but not limited to intellectual property relating
+# to a hardware implementation of the functionality of the software
+# licensed hereunder.  You may use the software subject to the license
+# terms below provided that you ensure that this notice is replicated
+# unmodified and in its entirety in all distributions of the software,
+# modified or unmodified, in source code or in binary form.
+#
 # Copyright (c) 2006-2007 The Regents of The University of Michigan
 # All rights reserved.
 #
@@ -25,6 +37,7 @@
 # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 #
 # Authors: Ron Dreslinski
+#  Andreas Hansson
 
 import optparse
 import sys
@@ -44,32 +57,33 @@
   metavar="T",
   help="Stop after T ticks")
 
+# This example script stress tests the memory system by creating false
+# sharing in a tree topology. At the bottom of the tree is a shared
+# memory, and then at each level a number of testers are attached,
+# along with a number of caches that them selves fan out to subtrees
+# of testers and caches. Thus, it is possible to create a system with
+# arbitrarily deep cache hierarchies, sharing or no sharing of caches,
+# and testers not only at the L1s, but also at the L2s, L3s etc.
 #
-# The "tree" specification is a colon-separated list of one or more
-# integers.  The first integer is the number of caches/testers
-# connected directly to main memory.  The last integer in the list is
-# the number of testers associated with the uppermost level of memory
-# (L1 cache, if there are caches, or main memory if no caches).  Thus
-# if there is only one integer, there are no caches, and the integer
-# specifies the number of testers connected directly to main memory.
-# The other integers (if any) specify the number of caches at each
-# level of the hierarchy between.
-#
-# Examples:
-#
-#  "2:1"Two caches connected to memory with a single tester behind each
-#   (single-level hierarchy, two testers total)
-#
-#  "2:2:1"  Two-level hierarchy, 2 L1s behind each of 2 L2s, 4 testers total
-#
-parser.add_option("-t", "--treespec", type="string", default="8:1",
-  help="Colon-separated multilevel tree specification, "
+# The tree specification consists of two colon-separated lists of one
+# or more integers, one for the caches, and one for the testers. The
+# first integer is the number of caches/testers closest to main
+# memory. Each cache then fans out to a subtree. The last integer in
+# the list is the number of caches/testers associated with the
+# uppermost level of memory. The other integers (if any) specify the
+# number of caches/testers connected at each level of the crossbar
+# hierarchy. The tester string should have one element more than the
+# cache string as there should always be testers attached to the
+# uppermost caches.
+
+parser.add_option("-c", "--caches", type="string", default="2:2:1",
+  help="Colon-separated cache hierarchy specification, "
   "see script comments for details "
   "[default: %default]")
-
-parser.add_option("--force-bus", action="store_true",
-  help="Use bus between levels even with single cache")
-
+parser.add_option("-t", "--testers", type="string", default="1:1:0:2",
+  help="Colon-separated tester hierarchy specification, "
+  "see script comments for details "
+  "[default: %default]")
 parser.add_option("-f", "--functional

[gem5-dev] changeset in gem5: cpu: Tidy up the MemTest and make false shari...

2015-02-11 Thread Andreas Hansson via gem5-dev
changeset 22452667fd5c in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=22452667fd5c
description:
cpu: Tidy up the MemTest and make false sharing more obvious

The MemTest class really only tests false sharing, and as such there
was a lot of old cruft that could be removed. This patch cleans up the
tester, and also makes it more clear what the assumptions are. As part
of this simplification the reference functional memory is also
removed.

The regression configs using MemTest are updated to reflect the
changes, and the stats will be bumped in a separate patch. The example
config will be updated in a separate patch due to more extensive
re-work.

In a follow-on patch a new tester will be introduced that uses the
MemChecker to implement true sharing.

diffstat:

 configs/example/memtest.py |   15 +-
 src/cpu/testers/memtest/MemTest.py |   57 +++--
 src/cpu/testers/memtest/memtest.cc |  363 +++-
 src/cpu/testers/memtest/memtest.hh |  170 
 tests/configs/memtest-filter.py|   12 +-
 tests/configs/memtest-ruby.py  |   16 +-
 tests/configs/memtest.py   |   11 +-
 7 files changed, 289 insertions(+), 355 deletions(-)

diffs (truncated from 1018 to 300 lines):

diff -r 276da6265ab8 -r 22452667fd5c configs/example/memtest.py
--- a/configs/example/memtest.py   Wed Feb 11 10:23:27 2015 -0500
+++ b/configs/example/memtest.py   Wed Feb 11 10:23:28 2015 -0500
@@ -124,7 +124,7 @@
 
 # build a list of prototypes, one for each level of treespec, starting
 # at the end (last entry is tester objects)
-prototypes = [ MemTest(atomic=options.atomic, max_loads=options.maxloads,
+prototypes = [ MemTest(max_loads=options.maxloads,
percent_functional=options.functional,
percent_uncacheable=options.uncacheable,
progress_interval=options.progress) ]
@@ -146,12 +146,9 @@
  prototypes.insert(0, next)
 
 # system simulated
-system = System(funcmem = SimpleMemory(in_addr_map = False),
-funcbus = NoncoherentXBar(),
-physmem = SimpleMemory(latency = "100ns"),
+system = System(physmem = SimpleMemory(latency = "100ns"),
 cache_line_size = block_size)
 
-
 system.voltage_domain = VoltageDomain(voltage = '1V')
 
 system.clk_domain = SrcClockDomain(clock =  options.sys_clock,
@@ -182,14 +179,10 @@
   # we just built the MemTest objects
   parent.cpu = objs
   for t in objs:
-   t.test = getattr(attach_obj, attach_port)
-   t.functional = system.funcbus.slave
+   t.port = getattr(attach_obj, attach_port)
 
 make_level(treespec, prototypes, system.physmem, "port")
 
-# connect reference memory to funcbus
-system.funcbus.master = system.funcmem.port
-
 # ---
 # run simulation
 # ---
@@ -202,7 +195,7 @@
 
 # The system port is never used in the tester so merely connect it
 # to avoid problems
-root.system.system_port = root.system.funcbus.slave
+root.system.system_port = root.system.physmem.cpu_side_bus.slave
 
 # Not much point in this being higher than the L1 latency
 m5.ticks.setGlobalFrequency('1ns')
diff -r 276da6265ab8 -r 22452667fd5c src/cpu/testers/memtest/MemTest.py
--- a/src/cpu/testers/memtest/MemTest.py   Wed Feb 11 10:23:27 2015 -0500
+++ b/src/cpu/testers/memtest/MemTest.py   Wed Feb 11 10:23:28 2015 -0500
@@ -1,3 +1,15 @@
+# Copyright (c) 2015 ARM Limited
+# All rights reserved.
+#
+# The license below extends only to copyright in the software and shall
+# not be construed as granting a license to any other intellectual
+# property including but not limited to intellectual property relating
+# to a hardware implementation of the functionality of the software
+# licensed hereunder.  You may use the software subject to the license
+# terms below provided that you ensure that this notice is replicated
+# unmodified and in its entirety in all distributions of the software,
+# modified or unmodified, in source code or in binary form.
+#
 # Copyright (c) 2005-2007 The Regents of The University of Michigan
 # All rights reserved.
 #
@@ -25,6 +37,7 @@
 # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 #
 # Authors: Nathan Binkert
+#  Andreas Hansson
 
 from MemObject import MemObject
 from m5.params import *
@@ -33,26 +46,30 @@
 class MemTest(MemObject):
 type = 'MemTest'
 cxx_header = "cpu/testers/memtest/memtest.hh"
-max_loads = Param.Counter(0, "number of loads to execute")
-atomic = Param.Bool(False, "Execute tester in atomic mode? (or timing)\n")
-memory_size = Param.Int(65536, "memory size")
-percent_dest_unaligned = Param.Percent(50,
-"percent of copy dest address that are unaligned")
-percent_reads = Param.Percent(65, "target read percentage")
- 

[gem5-dev] changeset in gem5: base: Do not dereference NULL in CompoundFlag...

2015-02-11 Thread Andreas Hansson via gem5-dev
changeset a24286e33318 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=a24286e33318
description:
base: Do not dereference NULL in CompoundFlag creation

This patch fixes the CompoundFlag constructor, ensuring that it does
not dereference NULL. Doing so has undefined behaviour, and both clang
and gcc's undefined-behaviour sanitisers were rather unhappy.

diffstat:

 src/SConscript|   4 ++--
 src/base/debug.hh |  26 +-
 2 files changed, 15 insertions(+), 15 deletions(-)

diffs (56 lines):

diff -r d1d95f0f4563 -r a24286e33318 src/SConscript
--- a/src/SConscript   Wed Feb 11 10:23:22 2015 -0500
+++ b/src/SConscript   Wed Feb 11 10:23:23 2015 -0500
@@ -852,9 +852,9 @@
 last = len(compound) - 1
 for i,flag in enumerate(compound):
 if i != last:
-comp_code('$flag,')
+comp_code('&$flag,')
 else:
-comp_code('$flag);')
+comp_code('&$flag);')
 comp_code.dedent()
 
 code.append(comp_code)
diff -r d1d95f0f4563 -r a24286e33318 src/base/debug.hh
--- a/src/base/debug.hh Wed Feb 11 10:23:22 2015 -0500
+++ b/src/base/debug.hh Wed Feb 11 10:23:23 2015 -0500
@@ -81,24 +81,24 @@
 {
   protected:
 void
-addFlag(Flag &f)
+addFlag(Flag *f)
 {
-if (&f != NULL)
-_kids.push_back(&f);
+if (f != nullptr)
+_kids.push_back(f);
 }
 
   public:
 CompoundFlag(const char *name, const char *desc,
-Flag &f00 = *(Flag *)0, Flag &f01 = *(Flag *)0,
-Flag &f02 = *(Flag *)0, Flag &f03 = *(Flag *)0,
-Flag &f04 = *(Flag *)0, Flag &f05 = *(Flag *)0,
-Flag &f06 = *(Flag *)0, Flag &f07 = *(Flag *)0,
-Flag &f08 = *(Flag *)0, Flag &f09 = *(Flag *)0,
-Flag &f10 = *(Flag *)0, Flag &f11 = *(Flag *)0,
-Flag &f12 = *(Flag *)0, Flag &f13 = *(Flag *)0,
-Flag &f14 = *(Flag *)0, Flag &f15 = *(Flag *)0,
-Flag &f16 = *(Flag *)0, Flag &f17 = *(Flag *)0,
-Flag &f18 = *(Flag *)0, Flag &f19 = *(Flag *)0)
+Flag *f00 = nullptr, Flag *f01 = nullptr,
+Flag *f02 = nullptr, Flag *f03 = nullptr,
+Flag *f04 = nullptr, Flag *f05 = nullptr,
+Flag *f06 = nullptr, Flag *f07 = nullptr,
+Flag *f08 = nullptr, Flag *f09 = nullptr,
+Flag *f10 = nullptr, Flag *f11 = nullptr,
+Flag *f12 = nullptr, Flag *f13 = nullptr,
+Flag *f14 = nullptr, Flag *f15 = nullptr,
+Flag *f16 = nullptr, Flag *f17 = nullptr,
+Flag *f18 = nullptr, Flag *f19 = nullptr)
 : SimpleFlag(name, desc)
 {
 addFlag(f00); addFlag(f01); addFlag(f02); addFlag(f03); addFlag(f04);


[gem5-dev] gem5 User Workshop 2015

2015-02-11 Thread Andreas Hansson via gem5-dev
Hi everyone,

At ISCA-42 this year we are organising the second gem5 user workshop,
bringing together groups and individuals across the gem5
community. The goal of the workshop is to provide a forum to
discuss what is going on in the community, how we can best
leverage each other's contributions, and how we can continue to
make gem5 a successful community-supported simulation
framework. The workshop is taking place on Sunday June 14th, 2015.

The key part of the workshop is a set of presentations from the
community about how individuals or groups are using the
simulator, any features you have added that might be useful to
others, and any major pain points, and what can be done to make
gem5 better and more broadly adopted. The hope is that this will
provide a forum for people with similar uses or needs to connect
with each other. If you would like to give a short
presentation (~15 minutes) please send a short (few paragraphs)
abstract to workshop2015  gem5.org by March 31st, 2015.

Additionally, we plan to dedicate one hour of the workshop for
gem5 developers to address particular issues or questions in
depth. If you would like to suggest a topic for one of these in-depth
presentations please email workshop2015  gem5.org (also by March 31st).

Tentative Program
* Introduction [15 minutes]
* Overview of key changes and additions [30 minutes]
* Solicited presentations (~2 in-depth presentations) [1 hour]
* User Perspectives (~12 short presentations)  [3 hours]
* Break-out sessions on progressing identified pain points [1.5 hours]
* Wrap-up and next steps [45 minutes]

More information will be available at: http://www.gem5.org/User_workshop_2015

We look forward to your contributions and hope to see you there.




Re: [gem5-dev] Review Request 2647: cpu: o3 register renaming request handling improved

2015-02-10 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2647/
---

(Updated Feb. 10, 2015, 5:43 p.m.)


Review request for Default.


Repository: gem5


Description (updated)
---

Changeset 10707:0872f196797d
---
cpu: o3 register renaming request handling improved

Now, prior to renaming, the instruction requests the exact number of
registers it will need, and the rename_map decides whether the instruction is
allowed to proceed or not.
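
The request-then-decide pattern described here can be sketched as follows. This is a toy model in Python, not the o3 implementation: the class, the method names and the two register classes are assumptions made for illustration only.

```python
class ToyRenameMap:
    """Hypothetical rename map tracking free physical registers per class."""

    def __init__(self, free_int, free_float):
        self.free = {"int": free_int, "float": free_float}

    def can_rename(self, needed):
        """True if every requested register class has enough free entries."""
        return all(self.free.get(cls, 0) >= count
                   for cls, count in needed.items())

    def rename(self, needed):
        """Reserve the registers, or refuse without side effects."""
        if not self.can_rename(needed):
            return False
        for cls, count in needed.items():
            self.free[cls] -= count
        return True


if __name__ == "__main__":
    rename_map = ToyRenameMap(free_int=2, free_float=1)
    # An instruction announces exactly what it needs before renaming.
    print(rename_map.rename({"int": 2}))              # proceeds
    print(rename_map.rename({"int": 1, "float": 1}))  # must wait
    print(rename_map.free)
```

Because the instruction states its exact requirement up front, the map can refuse without having partially allocated anything.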


Diffs (updated)
-

  src/cpu/base_dyn_inst.hh 94901e131a7f 
  src/cpu/o3/rename_impl.hh 94901e131a7f 
  src/cpu/o3/rename_map.hh 94901e131a7f 
  src/cpu/static_inst.hh 94901e131a7f 

Diff: http://reviews.gem5.org/r/2647/diff/


Testing
---


Thanks,

Andreas Hansson



Re: [gem5-dev] changeset in gem5: ruby: dma sequencer: remove RubyPort as paren...

2015-02-10 Thread Andreas Hansson via gem5-dev
Hi all,

To me it sounds like the whole ruby philosophy of “two, or even three
views of the same data” might need an overhaul.

Andreas

On 10/02/2015 04:44, "Beckmann, Brad via gem5-dev" 
wrote:

>Thanks Jason.  I didn't notice your patch until after I sent out my
>email.  It looks like we are encountering the same problem.
>
>I'm a bit concerned that your patch doesn't modify
>DMASequencer::ackCallback(), thus it doesn't get us very close to
>correctly supporting multi-cache block dma.  It seems like we need a
>fundamentally different approach that separates updating the backing store
>from sending the response packet to the dma device.
>
>Brad
>
>
>
>-Original Message-
>From: gem5-dev [mailto:gem5-dev-boun...@gem5.org] On Behalf Of Jason
>Power via gem5-dev
>Sent: Monday, February 09, 2015 6:06 PM
>To: gem5 Developer List; gem5-...@m5sim.org
>Subject: Re: [gem5-dev] changeset in gem5: ruby: dma sequencer: remove
>RubyPort as paren...
>
>Hey Brad,
>
>I think this change is in my patch to fix the backing store. It's on
>reviewboard now.
>http://reviews.gem5.org/r/2627/
>
>I'm not sure if that patch supports multi cache block DMA, but it's
>definitely a step in the right direction.
>
>There's also another patch for unaligned DMA.
>http://reviews.gem5.org/r/2653/. That patch was required for us to get
>our copy engine working.
>
>
>
>Let me know if you have any feedback or if it solves your problem.
>
>Cheers,
>
>
>Jason
>
>On Mon Feb 09 2015 at 7:03:54 PM Beckmann, Brad via gem5-dev <
>gem5-dev@gem5.org> wrote:
>
>I should clarify.  Is the simple way of supporting the backing store and
>DMA just adding those 6 lines back to the DMA sequencer?
>
>Also have you considered the case for multi-cache block DMA and updating
>the backing store?  It appears that only the final cache block of a large
>multi-cache block write will even call
>DMASequencer::MemSlavePort::hitCallback().  How can we get the other DMA
>writes to update the backing store?
>
>Thanks,
>
>Brad
>
>-Original Message-
>From: gem5-dev [mailto:gem5-dev-boun...@gem5.org] On Behalf Of Beckmann,
>Brad via gem5-dev
>Sent: Monday, February 09, 2015 4:58 PM
>To: gem5 Developer List; gem5-...@m5sim.org
>Subject: Re: [gem5-dev] changeset in gem5: ruby: dma sequencer: remove
>RubyPort as paren...
>
>Hi Nilay,
>
>Did you consider this patch when you added the backing store back to Ruby?
>The following lines in "DMASequencer::MemSlavePort::hitCallback(PacketPtr
>pkt)" initially updated the backing store, but I believe they were
>removed in a later patch (7a3ad4b09ce4).
>
>+if (accessPhysMem) {
>+DMASequencer *seq = static_cast<DMASequencer *>(&owner);
>+seq->system->getPhysMem().access(pkt);
>+} else if (needsResponse) {
>+pkt->makeResponse();
>+}
>+
>
>Is it as simple as adding those 6 lines back into the DMA sequencer?  Are
>there other issues we should consider?
>
>Thanks,
>
>Brad
>
>-Original Message-
>From: gem5-dev [mailto:gem5-dev-boun...@gem5.org] On Behalf Of Nilay
>Vaish via gem5-dev
>Sent: Thursday, November 06, 2014 3:37 AM
>To: gem5-...@m5sim.org
>Subject: [gem5-dev] changeset in gem5: ruby: dma sequencer: remove
>RubyPort as paren...
>
>changeset 30e3715c9405 in /z/repo/gem5
>details: http://repo.gem5.org/gem5?cmd=changeset;node=30e3715c9405
>description:
>ruby: dma sequencer: remove RubyPort as parent class
>As of now DMASequencer inherits from the RubyPort class.  But the
>code in
>RubyPort class is heavily tailored for the CPU Sequencer.  There
>are parts of
>the code that are not required at all for the DMA sequencer.
>Moreover, the
>next patch uses the dma sequencer for carrying out memory
>accesses for all the
>io devices.  Hence, it is better to have a leaner dma sequencer.
>
>diffstat:
>
> src/mem/ruby/system/DMASequencer.cc |  195
>+++-
> src/mem/ruby/system/DMASequencer.hh |   75 +-
> src/mem/ruby/system/Sequencer.py|   13 +-
> 3 files changed, 274 insertions(+), 9 deletions(-)
>
>diffs (truncated from 374 to 300 lines):
>
>diff -r ba51f8572571 -r 30e3715c9405 src/mem/ruby/system/DMASequencer.cc
>--- a/src/mem/ruby/system/DMASequencer.cc   Mon Nov 03 10:14:42 2014
>-0600
>+++ b/src/mem/ruby/system/DMASequencer.cc   Thu Nov 06 00:55:09 2014
>-0600
>@@ -28,26 +28,212 @@
>
> #include 
>
>+#include "debug/Config.hh"
>+#include "debug/Drain.hh"
> #include "debug/RubyDma.hh"
> #include "debug/RubyStats.hh"
> #include "mem/protocol/SequencerMsg.hh"
>-#include "mem/protocol/SequencerRequestType.hh"
> #include "mem/ruby/system/DMASequencer.hh"
> #include "mem/ruby/system/System.hh"
>+#include "sim/system.hh"
>
> DMASequencer::DMASequencer(const Params *p)
>-: RubyPort(p)
>+: MemObject(p), m_version(p->version), m_controller(NULL),
>+  m_mandatory_q_ptr(NULL), m_usingRubyTester(p->using_ruby_tester),
>+  slave_port(csprintf("%s.slave", name()), this, access_phys_mem, 0),
>+  drainManag

Re: [gem5-dev] memory per channel

2015-02-09 Thread Andreas Hansson via gem5-dev
Hi Giorgos,

The default behaviour in gem5 is to interleave the memory channels on a
128-byte granularity and use an XOR-based hash to avoid skewing the
channel usage.

That said, you can do whatever you please :-). The basic functionality is
controlled by configs/common/MemConfig.py, and you can change it as you
see fit. Take your pick: no interleaving or arbitrary-granularity
interleaving, hashing or no hashing. It is all supported.
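
As a rough illustration of the interleaving described above, the following Python sketch maps a physical address to a channel index. It is not the gem5 implementation: the function name `channel_for_address` and the `xor_shift` parameter (which higher-order bits get folded in) are assumptions made for this example, and it only handles power-of-two channel counts and granularities.

```python
def channel_for_address(addr, num_channels=4, granularity=128, xor_shift=None):
    """Return the channel index that a physical address maps to.

    xor_shift is a hypothetical knob: when set, the address bits starting
    at that position are XORed into the channel-select bits.
    """
    if num_channels < 1 or num_channels & (num_channels - 1):
        raise ValueError("num_channels must be a power of two")
    if granularity < 1 or granularity & (granularity - 1):
        raise ValueError("granularity must be a power of two")

    mask = num_channels - 1
    low_bit = granularity.bit_length() - 1   # log2(granularity)

    # Plain interleaving: the bits just above the granularity pick the channel.
    select = (addr >> low_bit) & mask

    # Optional hashing: fold in higher-order bits to spread strided accesses.
    if xor_shift is not None:
        select ^= (addr >> xor_shift) & mask
    return select
```

With four channels and 128-byte granularity, consecutive 128-byte blocks land on channels 0, 1, 2, 3 in turn. Setting `xor_shift` changes the mapping only for addresses whose upper bits differ, which is the point of hashing: regular strides no longer pile up on one channel.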

Andreas

On 09/02/2015 12:47, "Giorgos K. via gem5-dev"  wrote:

>Hi! In Benchmarks.py we add a src file and choose 1024MB of RAM. If the
>memory system consists of 4 channels, does the 1024MB refer to one channel
>or to all channels in total?
>
>___
>gem5-dev mailing list
>gem5-dev@gem5.org
>http://m5sim.org/mailman/listinfo/gem5-dev
>


___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] Review Request 2652: dev: Fix undefined behaviour in i8254xGBe

2015-02-09 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2652/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10705:6bb47e6b28d2
---
dev: Fix undefined behaviour in i8254xGBe

This patch fixes a rather unfortunate oversight where the annotation
pointer was used even though it is null. Somehow the code still works,
but UBSan is rather unhappy. The use is now guarded, and the variable
is initialised in the constructor (as well as init()).


Diffs
-

  src/dev/i8254xGBe.hh 94901e131a7f 
  src/dev/i8254xGBe.cc 94901e131a7f 

Diff: http://reviews.gem5.org/r/2652/diff/


Testing
---


Thanks,

Andreas Hansson



Re: [gem5-dev] Review Request 2624: mem: mmap the backing store with MAP_NORESERVE

2015-02-09 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2624/
---

(Updated Feb. 9, 2015, 2:23 p.m.)


Review request for Default.


Repository: gem5


Description (updated)
---

Changeset 10693:40aaf1161841
---
mem: mmap the backing store with MAP_NORESERVE

This patch ensures we can run simulations with very large simulated
memories (at least 64 TB based on some quick runs on a Linux
workstation). In essence this allows us to efficiently deal with
sparse address maps without having to implement a redirection layer in
the backing store.

This opens the door to run-time errors if we eventually exhaust the host's
memory and swap space, but this should hopefully never happen.
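
To make the effect of the flag concrete, here is a minimal Python sketch of an anonymous mapping that optionally sets MAP_NORESERVE. It is only an illustration, not the code in src/mem/physical.cc: `map_backing_store` and its `noreserve` argument are hypothetical names, the sketch assumes a Unix host, and it falls back to no extra flag when the Python `mmap` module does not expose `MAP_NORESERVE`.

```python
import mmap

# Not every Python version exposes MAP_NORESERVE; 0 means "no extra flag".
MAP_NORESERVE = getattr(mmap, "MAP_NORESERVE", 0)


def map_backing_store(size, noreserve=False):
    """Create a private anonymous mapping of `size` bytes (Unix only).

    With noreserve=True the kernel is asked not to reserve swap space up
    front, so pages are only backed by host memory once they are touched.
    """
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    if noreserve:
        flags |= MAP_NORESERVE
    return mmap.mmap(-1, size, flags=flags,
                     prot=mmap.PROT_READ | mmap.PROT_WRITE)
```

Untouched pages of such a mapping read back as zero, and only the pages that are written consume host memory. The flip side is the one noted above: if the host runs out of memory later, the failure shows up at the time of the access rather than at allocation time.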


Diffs (updated)
-

  src/mem/physical.hh 94901e131a7f 
  src/mem/physical.cc 94901e131a7f 
  src/sim/System.py 94901e131a7f 
  src/sim/system.cc 94901e131a7f 

Diff: http://reviews.gem5.org/r/2624/diff/


Testing
---


Thanks,

Andreas Hansson



Re: [gem5-dev] Review Request 2651: cache: remove redundant test in recvTimingResp()

2015-02-09 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2651/#review5876
---

Ship it!


mem:

- Andreas Hansson


On Feb. 9, 2015, 6:05 a.m., Steve Reinhardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2651/
> ---
> 
> (Updated Feb. 9, 2015, 6:05 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10686:7fc1f2b733cf
> ---
> cache: remove redundant test in recvTimingResp()
> 
> For some reason we were checking mshr->hasTargets() even though
> we had already called mshr->getTarget() unconditionally earlier
> in the same function (which asserts if there are no targets).
> Get rid of this useless check, and while we're at it get rid
> of the redundant call to mshr->getTarget(), since we still have
> the value saved in a local var.
> 
> 
> Diffs
> -
> 
>   src/mem/cache/cache_impl.hh 94901e131a7f2452b3cf2b2740be866b4a6f404d 
> 
> Diff: http://reviews.gem5.org/r/2651/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Steve Reinhardt
> 
>



Re: [gem5-dev] Review Request 2650: cache: add local var in recvTimingResp()

2015-02-09 Thread Andreas Hansson via gem5-dev


> On Feb. 9, 2015, 8:28 a.m., Andreas Hansson wrote:
> > Ship It!

I guess mem: Add local var in Cache::recvTimingResp


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2650/#review5874
---


On Feb. 9, 2015, 6:05 a.m., Steve Reinhardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2650/
> ---
> 
> (Updated Feb. 9, 2015, 6:05 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10685:5be4344f16fe
> ---
> cache: add local var in recvTimingResp()
> 
> The main loop in recvTimingResp() uses target->pkt all over
> the place.  Create a local tgt_pkt to help keep lines
> under the line length limit.
> 
> 
> Diffs
> -
> 
>   src/mem/cache/cache_impl.hh 94901e131a7f2452b3cf2b2740be866b4a6f404d 
> 
> Diff: http://reviews.gem5.org/r/2650/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Steve Reinhardt
> 
>



Re: [gem5-dev] Review Request 2650: cache: add local var in recvTimingResp()

2015-02-09 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2650/#review5874
---

Ship it!


Ship It!

- Andreas Hansson


On Feb. 9, 2015, 6:05 a.m., Steve Reinhardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2650/
> ---
> 
> (Updated Feb. 9, 2015, 6:05 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10685:5be4344f16fe
> ---
> cache: add local var in recvTimingResp()
> 
> The main loop in recvTimingResp() uses target->pkt all over
> the place.  Create a local tgt_pkt to help keep lines
> under the line length limit.
> 
> 
> Diffs
> -
> 
>   src/mem/cache/cache_impl.hh 94901e131a7f2452b3cf2b2740be866b4a6f404d 
> 
> Diff: http://reviews.gem5.org/r/2650/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Steve Reinhardt
> 
>



Re: [gem5-dev] Review Request 2649: mem: restructure Packet cmd initialization a bit more

2015-02-09 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2649/#review5873
---

Ship it!


Ship It!

- Andreas Hansson


On Feb. 8, 2015, 10:34 p.m., Steve Reinhardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2649/
> ---
> 
> (Updated Feb. 8, 2015, 10:34 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10684:39b06818b9a6
> ---
> mem: restructure Packet cmd initialization a bit more
> 
> Refactor the way that specific MemCmd values are generated for packets.
> The new approach is a little more elegant in that we assign the right
> value up front, and it's also more amenable to non-heap-allocated
> Packet objects.
> 
> Also replaced the code in the Minor model that was still doing it the
> ad-hoc way.
> 
> This is basically a refinement of http://repo.gem5.org/gem5/rev/711eb0e64249
> 
> 
> Diffs
> -
> 
>   src/cpu/inorder/resources/cache_unit.hh 
> 94901e131a7f2452b3cf2b2740be866b4a6f404d 
>   src/cpu/inorder/resources/cache_unit.cc 
> 94901e131a7f2452b3cf2b2740be866b4a6f404d 
>   src/cpu/minor/lsq.cc 94901e131a7f2452b3cf2b2740be866b4a6f404d 
>   src/cpu/simple/atomic.cc 94901e131a7f2452b3cf2b2740be866b4a6f404d 
>   src/mem/packet.hh 94901e131a7f2452b3cf2b2740be866b4a6f404d 
> 
> Diff: http://reviews.gem5.org/r/2649/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Steve Reinhardt
> 
>



Re: [gem5-dev] Review Request 2646: mem: Split port retry for all different packet classes

2015-02-08 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2646/
---

(Updated Feb. 8, 2015, 5:48 p.m.)


Review request for Default.


Repository: gem5


Description (updated)
---

Changeset 10705:1fec273b2946
---
mem: Split port retry for all different packet classes

This patch fixes a long-standing issue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.

The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.

The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with its own
flow control, but sharing the same physical MasterPort. These changes fix
the previously seen deadlocks.
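
The idea of one retry per packet class can be sketched in a few lines of Python. This is a toy model, not the gem5 port code: `Peer`, `SplitQueuePort` and the class labels `"req"` and `"snoop_resp"` are names invented for the example. Each class gets its own queue and its own "waiting for retry" flag, so a stalled request cannot hold up a snoop response.

```python
from collections import deque


class Peer:
    """Hypothetical receiver that can refuse each packet class independently."""

    def __init__(self):
        self.accept = {"req": True, "snoop_resp": True}
        self.received = []

    def recv(self, kind, pkt):
        if not self.accept[kind]:
            return False          # sender must wait for a retry of this class
        self.received.append((kind, pkt))
        return True


class SplitQueuePort:
    """Hypothetical sender with one queue and one retry flag per class."""

    def __init__(self, peer):
        self.peer = peer
        self.queues = {"req": deque(), "snoop_resp": deque()}
        self.waiting_for_retry = {"req": False, "snoop_resp": False}

    def send(self, kind, pkt):
        self.queues[kind].append(pkt)
        self._drain(kind)

    def recv_retry(self, kind):
        # The peer signals that this class (and only this class) may resend.
        self.waiting_for_retry[kind] = False
        self._drain(kind)

    def _drain(self, kind):
        queue = self.queues[kind]
        while queue and not self.waiting_for_retry[kind]:
            if self.peer.recv(kind, queue[0]):
                queue.popleft()
            else:
                self.waiting_for_retry[kind] = True
```

If the peer refuses requests, a queued request sets only the `"req"` flag; a snoop response sent afterwards is still delivered at once. With a single shared flag the snoop response would sit behind the request, which is the deadlock pattern the description refers to.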


Diffs (updated)
-

  src/arch/x86/pagetable_walker.hh 94901e131a7f 
  src/arch/x86/pagetable_walker.cc 94901e131a7f 
  src/cpu/kvm/base.hh 94901e131a7f 
  src/cpu/minor/fetch1.hh 94901e131a7f 
  src/cpu/minor/fetch1.cc 94901e131a7f 
  src/cpu/minor/lsq.hh 94901e131a7f 
  src/cpu/minor/lsq.cc 94901e131a7f 
  src/cpu/o3/cpu.hh 94901e131a7f 
  src/cpu/o3/cpu.cc 94901e131a7f 
  src/cpu/o3/fetch.hh 94901e131a7f 
  src/cpu/o3/fetch_impl.hh 94901e131a7f 
  src/cpu/o3/lsq.hh 94901e131a7f 
  src/cpu/o3/lsq_impl.hh 94901e131a7f 
  src/cpu/simple/atomic.hh 94901e131a7f 
  src/cpu/simple/timing.hh 94901e131a7f 
  src/cpu/simple/timing.cc 94901e131a7f 
  src/cpu/testers/directedtest/RubyDirectedTester.hh 94901e131a7f 
  src/cpu/testers/memtest/memtest.hh 94901e131a7f 
  src/cpu/testers/memtest/memtest.cc 94901e131a7f 
  src/cpu/testers/networktest/networktest.hh 94901e131a7f 
  src/cpu/testers/networktest/networktest.cc 94901e131a7f 
  src/cpu/testers/rubytest/RubyTester.hh 94901e131a7f 
  src/cpu/testers/traffic_gen/traffic_gen.hh 94901e131a7f 
  src/cpu/testers/traffic_gen/traffic_gen.cc 94901e131a7f 
  src/dev/dma_device.hh 94901e131a7f 
  src/dev/dma_device.cc 94901e131a7f 
  src/mem/addr_mapper.hh 94901e131a7f 
  src/mem/addr_mapper.cc 94901e131a7f 
  src/mem/bridge.hh 94901e131a7f 
  src/mem/bridge.cc 94901e131a7f 
  src/mem/cache/base.hh 94901e131a7f 
  src/mem/cache/base.cc 94901e131a7f 
  src/mem/cache/cache.hh 94901e131a7f 
  src/mem/cache/cache_impl.hh 94901e131a7f 
  src/mem/coherent_xbar.hh 94901e131a7f 
  src/mem/coherent_xbar.cc 94901e131a7f 
  src/mem/comm_monitor.hh 94901e131a7f 
  src/mem/comm_monitor.cc 94901e131a7f 
  src/mem/dram_ctrl.hh 94901e131a7f 
  src/mem/dram_ctrl.cc 94901e131a7f 
  src/mem/dramsim2.hh 94901e131a7f 
  src/mem/dramsim2.cc 94901e131a7f 
  src/mem/external_slave.cc 94901e131a7f 
  src/mem/mem_checker_monitor.hh 94901e131a7f 
  src/mem/mem_checker_monitor.cc 94901e131a7f 
  src/mem/mport.hh 94901e131a7f 
  src/mem/noncoherent_xbar.hh 94901e131a7f 
  src/mem/noncoherent_xbar.cc 94901e131a7f 
  src/mem/packet_queue.hh 94901e131a7f 
  src/mem/packet_queue.cc 94901e131a7f 
  src/mem/port.hh 94901e131a7f 
  src/mem/port.cc 94901e131a7f 
  src/mem/qport.hh 94901e131a7f 
  src/mem/ruby/slicc_interface/AbstractController.hh 94901e131a7f 
  src/mem/ruby/slicc_interface/AbstractController.cc 94901e131a7f 
  src/mem/ruby/structures/RubyMemoryControl.hh 94901e131a7f 
  src/mem/ruby/system/DMASequencer.hh 94901e131a7f 
  src/mem/ruby/system/DMASequencer.cc 94901e131a7f 
  src/mem/ruby/system/RubyPort.hh 94901e131a7f 
  src/mem/ruby/system/RubyPort.cc 94901e131a7f 
  src/mem/simple_mem.hh 94901e131a7f 
  src/mem/simple_mem.cc 94901e131a7f 
  src/mem/tport.hh 94901e131a7f 
  src/mem/tport.cc 94901e131a7f 
  src/mem/xbar.hh 94901e131a7f 
  src/mem/xbar.cc 94901e131a7f 
  src/sim/system.hh 94901e131a7f 

Diff: http://reviews.gem5.org/r/2646/diff/


Testing
---


Thanks,

Andreas Hansson



Re: [gem5-dev] Review Request 2624: mem: mmap the backing store with MAP_NORESERVE

2015-02-08 Thread Andreas Hansson via gem5-dev


> On Feb. 3, 2015, 9:37 p.m., Brandon Potter wrote:
> > The patch seems like a landmine for an unsuspecting user as it would be 
> > difficult to diagnose if swap space is exhausted.  It will probably be 
> > evident that memory exhaustion caused the runtime error (segmentation 
> > fault), but tracking down the cause to this patch will probably be 
> > non-trivial (maybe?).  For a user, the memory allocation failure from 
> > insufficient swap space would probably be preferable to a hard to diagnose 
> > segmentation fault.
> > 
> > Even though it is ugly, maybe it's best to add #if 0 / #endif around the 
> > code to allow someone to find this later on and enable the feature if they 
> > need it (with a warning about the segmentation fault).  The memory 
> > allocation failure will point the user to the code in this patch.
> 
> Andreas Hansson wrote:
> Fair point. An option would be to not use the flag for "sensible" sizes, 
> perhaps <16 GB or so, and adopt it for larger sizes with a warning?
> 
> Brandon Potter wrote:
> Perhaps that's fine.  Although, I think that a value for "sensible" will 
> vary from person to person.
> 
> Personally, I still feel that it is better to leave it completely 
> disabled and allow someone to enable it if needed (solely to prevent 
> confusion later).  It is trivial to enable the functionality by modifying the 
> condition.  Another avenue might be whether it is worthwhile to use a 
> preprocessor define and the #ifdef to enable/disable this during compilation; 
> might be overkill to go through the trouble though.
> 
> Overall, maybe this is just paranoia on my part; it does seem unlikely to 
> cause a problem in practice.
> 
> Andreas Hansson wrote:
> The input is much appreciated.
> 
> My hesitation about asking the user to enable it if needed is that it 
> means yet another parameter to keep track of (or yet another patch to 
> manage). What about going for the "limit approach" with a large value like 
> 1 TB?
> 
> I also think this is unlikely to cause problems the way it is, and as 
> such I would prefer not adding more configurability unless you really think 
> it is needed.
>
> 
> Brandon Potter wrote:
> Seems reasonable.  There is a tradeoff either way.
> 
> Andreas Hansson wrote:
> So what do people think?
> 
> 1) Good as it is (avoid more switches and magic limits)?
> 2) Add a limit, e.g. 1 TB and use above that?
> 3) Add a switch and force the user to explicitly set it?
> 
> In the spirit of keeping things simple I'd suggest (1).
> 
> Steve Reinhardt wrote:
> Hi Andreas,
> 
> I'm curious about the scenarios you have where you want to simulate a 
> system with TB of physical memory but don't expect to touch more than a few 
> GB of that.  That seems like a special case to me.  I sympathize with 
> Brandon's point, and I don't want to make things more confusing for what I 
> would guess is the majority of users (in terms of them getting weird delayed 
> faults when they could have been told up front they need more swap) just to 
> make the default case work for what I would guess is a minority of users who 
> probably know they're unusual by trying to simulate really large memories.
> 
> My inclination would be to make the use of MAP_NORESERVE for the backing 
> store a SimObject option that's false by default (maintaining the current 
> behavior).  At AMD, we end up writing custom top-level python scripts for our 
> major configurations anyway, so if we wanted to hardwire that to true for 
> some of those configs, we could just do that in our custom script.
> 
> If, in some configurations, people really want to define a threshold to 
> do this automatically, then that can be done in python.
> 
> Or to put it another way, I definitely want to enable you to use 
> MAP_NORESERVE without having to maintain it as a patch on the gem5 tree, but 
> since I haven't heard any clamoring from the broader user population for this 
> change (did I miss it?), I'm not too enthusiastic about enabling it by 
> default.

I'll add a switch.

This patch allows a sparse address map without actually implementing a sparse 
memory in gem5 and adding another level of indirection. This patch simply 
leaves it to the OS to allocate only those pages we touch, and does not 
complain if they are very far apart in the physical address space.
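
The options raised in this thread (an explicit switch that is off by default, or an automatic size limit) could be combined in a configuration script roughly as follows. This is a sketch only: `want_noreserve`, `switch` and `limit` are hypothetical names, not gem5 parameters, and the 1 TB figure is just the value floated in the discussion.

```python
TiB = 1 << 40


def want_noreserve(mem_size, switch=None, limit=None):
    """Decide whether to map the backing store with MAP_NORESERVE.

    switch: explicit user choice (True or False), or None if unset.
    limit:  size in bytes above which the flag is used automatically,
            or None to never enable it automatically.
    """
    if switch is not None:
        return switch             # an explicit setting always wins
    if limit is not None:
        return mem_size > limit   # "limit approach"
    return False                  # default: keep the existing behaviour
```

For example, `want_noreserve(64 * TiB, limit=TiB)` enables the flag for a very large simulated memory, while a 16 GB system keeps the up-front reservation unless the user sets the switch.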


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2624/#review5821
---


On Feb. 3, 2015, 7:57 p.m., Andreas Hansson wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2624/
> ---
> 
> (Updated Feb. 3, 

Re: [gem5-dev] Review Request 2646: mem: Split port retry for all different packet classes

2015-02-08 Thread Andreas Hansson via gem5-dev


> On Feb. 7, 2015, 6:14 p.m., Steve Reinhardt wrote:
> > Awesome!  This lack of separate channels for requests & responses has 
> > bugged me for years, I'm really glad to see it addressed.
> > 
> > One minor suggestion: to me, terms like recvRetryReq and recvRetryResp seem 
> > backward, since 'retry' is the noun and 'req'/'resp' are adjectives.  So in 
> > isolation, recvReqRetry makes more sense, because I'm receiving a retry 
> > that's specific to requests (i.e., a request-flavored retry), not receiving 
> > a request for a retry (as recvRetryReq implies).  On the other hand, I can 
> > see how there is some parallelism between sendTimingReq and recvRetryReq.  
> > But to me the readability in context trumps the parallelism in the abstract 
> > (which you only see in places like the commit message where the function 
> > names are juxtaposed).
> > 
> > Just to overcome the effort hurdle, I think this issue can be fixed with:
> > perl -pi -e 's/([Rr])(etry)R(eq|esp)/$1$3R$2/'
> 
> Steve Reinhardt wrote:
> might want to tack a 'g' onto that substitution just in case

Makes sense. We might want to drop "Timing" altogether, and simply have 
sendReq, sendResp, recvSnoopReq etc.

I'll bump the patch with the suggested changes.
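
For anyone who wants to check what the suggested substitution does before running it over the tree, the same pattern can be tried in Python. The helper name `rename_retry` is made up for this example; `re.sub` replaces every match by default, which corresponds to the Perl version with the 'g' modifier added.

```python
import re

# Same pattern as the Perl one-liner quoted above:
#   group 1 = 'R' or 'r', group 2 = 'etry', group 3 = 'eq' or 'esp'
_RETRY_PATTERN = re.compile(r"([Rr])(etry)R(eq|esp)")


def rename_retry(text):
    """Turn ...RetryReq / ...RetryResp into ...ReqRetry / ...RespRetry."""
    return _RETRY_PATTERN.sub(r"\1\3R\2", text)
```

So `recvRetryReq` becomes `recvReqRetry` and `recvRetryResp` becomes `recvRespRetry`, while names such as `sendTimingReq` are left alone because they do not contain "Retry" followed by "Req" or "Resp".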


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2646/#review5867
---


On Feb. 7, 2015, 5:24 p.m., Andreas Hansson wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2646/
> ---
> 
> (Updated Feb. 7, 2015, 5:24 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10707:ec8bd9a8e1e2
> ---
> mem: Split port retry for all different packet classes
> 
> This patch fixes a long-standing issue with the port flow
> control. Before this patch the retry mechanism was shared between all
> different packet classes. As a result, a snoop response could get
> stuck behind a request waiting for a retry, even if the send/recv
> functions were split. This caused message-dependent deadlocks in
> stress-test scenarios.
> 
> The patch splits the retry into one per packet (message) class. Thus,
> sendTimingReq has a corresponding recvRetryReq, sendTimingResp has
> recvRetryResp etc. Most of the changes to the code involve simply
> clarifying what type of request a specific object was accepting.
> 
> The biggest change in functionality is in the cache downstream packet
> queue, facing the memory. This queue was shared by requests and snoop
> responses, and it is now split into two queues, each with its own
> flow control, but sharing the same physical MasterPort. These changes fix
> the previously seen deadlocks.
> 
> 
> Diffs
> -
> 
>   src/cpu/minor/fetch1.hh 94901e131a7f 
>   src/cpu/kvm/base.hh 94901e131a7f 
>   src/arch/x86/pagetable_walker.hh 94901e131a7f 
>   src/arch/x86/pagetable_walker.cc 94901e131a7f 
>   src/cpu/minor/fetch1.cc 94901e131a7f 
>   src/cpu/minor/lsq.hh 94901e131a7f 
>   src/cpu/minor/lsq.cc 94901e131a7f 
>   src/cpu/o3/cpu.hh 94901e131a7f 
>   src/cpu/o3/cpu.cc 94901e131a7f 
>   src/cpu/o3/fetch.hh 94901e131a7f 
>   src/cpu/o3/fetch_impl.hh 94901e131a7f 
>   src/cpu/o3/lsq.hh 94901e131a7f 
>   src/cpu/o3/lsq_impl.hh 94901e131a7f 
>   src/cpu/simple/atomic.hh 94901e131a7f 
>   src/cpu/simple/timing.hh 94901e131a7f 
>   src/cpu/simple/timing.cc 94901e131a7f 
>   src/cpu/testers/directedtest/RubyDirectedTester.hh 94901e131a7f 
>   src/cpu/testers/memtest/memtest.hh 94901e131a7f 
>   src/cpu/testers/memtest/memtest.cc 94901e131a7f 
>   src/cpu/testers/networktest/networktest.hh 94901e131a7f 
>   src/cpu/testers/networktest/networktest.cc 94901e131a7f 
>   src/cpu/testers/rubytest/RubyTester.hh 94901e131a7f 
>   src/cpu/testers/traffic_gen/traffic_gen.hh 94901e131a7f 
>   src/cpu/testers/traffic_gen/traffic_gen.cc 94901e131a7f 
>   src/dev/dma_device.hh 94901e131a7f 
>   src/dev/dma_device.cc 94901e131a7f 
>   src/mem/addr_mapper.hh 94901e131a7f 
>   src/mem/addr_mapper.cc 94901e131a7f 
>   src/mem/bridge.hh 94901e131a7f 
>   src/mem/bridge.cc 94901e131a7f 
>   src/mem/cache/base.hh 94901e131a7f 
>   src/mem/cache/base.cc 94901e131a7f 
>   src/mem/cache/cache.hh 94901e131a7f 
>   src/mem/cache/cache_impl.hh 94901e131a7f 
>   src/mem/coherent_xbar.hh 94901e131a7f 
>   src/mem/coherent_xbar.cc 94901e131a7f 
>   src/mem/comm_monitor.hh 94901e131a7f 
>   src/mem/comm_monitor.cc 94901e131a7f 
>   src/mem/dram_ctrl.hh 94901e131a7f 
>   src/mem/dram_ctrl.cc 94901e131a7f 
>   src/mem/dramsim2.hh 94901e131a7f 
>   src/mem/dramsim2.cc 94901e131a7f 
>   src/mem/external_slave.cc 94901e131a7f 
>   src/mem/mem_checker_monitor.hh 94901e131a7f 
>   src/mem/mem_checker_monitor.cc 9

[gem5-dev] Review Request 2647: cpu: o3 register renaming request handling improved

2015-02-07 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2647/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10709:a6a8ea1781e8
---
cpu: o3 register renaming request handling improved

Now, prior to renaming, the instruction requests the exact number of
registers it will need, and the rename_map decides whether the instruction is
allowed to proceed or not.
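
The check described above can be pictured with a small Python model. It is an assumption-laden sketch rather than the code in src/cpu/o3/rename_map.hh: the class below tracks only a count of free physical registers, and the method names `can_rename` and `rename` are chosen for the example.

```python
class RenameMap:
    """Hypothetical rename map that tracks only the free-register count."""

    def __init__(self, num_free):
        self.num_free = num_free

    def can_rename(self, num_dest_regs):
        # The instruction states up front how many registers it needs.
        return num_dest_regs <= self.num_free

    def rename(self, num_dest_regs):
        if not self.can_rename(num_dest_regs):
            raise RuntimeError("not enough free registers; stall instead")
        self.num_free -= num_dest_regs
```

The rename stage would call `can_rename` with the instruction's destination-register count and stall when it returns False, instead of starting the rename and discovering part-way through that the free list is empty.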


Diffs
-

  src/cpu/base_dyn_inst.hh 94901e131a7f 
  src/cpu/o3/rename_impl.hh 94901e131a7f 
  src/cpu/o3/rename_map.hh 94901e131a7f 
  src/cpu/static_inst.hh 94901e131a7f 

Diff: http://reviews.gem5.org/r/2647/diff/


Testing
---


Thanks,

Andreas Hansson



[gem5-dev] Review Request 2646: mem: Split port retry for all different packet classes

2015-02-07 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2646/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10707:ec8bd9a8e1e2
---
mem: Split port retry for all different packet classes

This patch fixes a long-standing issue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.

The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvRetryReq, sendTimingResp has
recvRetryResp etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.

The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with its own
flow control, but sharing the same physical MasterPort. These changes fix
the previously seen deadlocks.


Diffs
-

  src/cpu/minor/fetch1.hh 94901e131a7f 
  src/cpu/kvm/base.hh 94901e131a7f 
  src/arch/x86/pagetable_walker.hh 94901e131a7f 
  src/arch/x86/pagetable_walker.cc 94901e131a7f 
  src/cpu/minor/fetch1.cc 94901e131a7f 
  src/cpu/minor/lsq.hh 94901e131a7f 
  src/cpu/minor/lsq.cc 94901e131a7f 
  src/cpu/o3/cpu.hh 94901e131a7f 
  src/cpu/o3/cpu.cc 94901e131a7f 
  src/cpu/o3/fetch.hh 94901e131a7f 
  src/cpu/o3/fetch_impl.hh 94901e131a7f 
  src/cpu/o3/lsq.hh 94901e131a7f 
  src/cpu/o3/lsq_impl.hh 94901e131a7f 
  src/cpu/simple/atomic.hh 94901e131a7f 
  src/cpu/simple/timing.hh 94901e131a7f 
  src/cpu/simple/timing.cc 94901e131a7f 
  src/cpu/testers/directedtest/RubyDirectedTester.hh 94901e131a7f 
  src/cpu/testers/memtest/memtest.hh 94901e131a7f 
  src/cpu/testers/memtest/memtest.cc 94901e131a7f 
  src/cpu/testers/networktest/networktest.hh 94901e131a7f 
  src/cpu/testers/networktest/networktest.cc 94901e131a7f 
  src/cpu/testers/rubytest/RubyTester.hh 94901e131a7f 
  src/cpu/testers/traffic_gen/traffic_gen.hh 94901e131a7f 
  src/cpu/testers/traffic_gen/traffic_gen.cc 94901e131a7f 
  src/dev/dma_device.hh 94901e131a7f 
  src/dev/dma_device.cc 94901e131a7f 
  src/mem/addr_mapper.hh 94901e131a7f 
  src/mem/addr_mapper.cc 94901e131a7f 
  src/mem/bridge.hh 94901e131a7f 
  src/mem/bridge.cc 94901e131a7f 
  src/mem/cache/base.hh 94901e131a7f 
  src/mem/cache/base.cc 94901e131a7f 
  src/mem/cache/cache.hh 94901e131a7f 
  src/mem/cache/cache_impl.hh 94901e131a7f 
  src/mem/coherent_xbar.hh 94901e131a7f 
  src/mem/coherent_xbar.cc 94901e131a7f 
  src/mem/comm_monitor.hh 94901e131a7f 
  src/mem/comm_monitor.cc 94901e131a7f 
  src/mem/dram_ctrl.hh 94901e131a7f 
  src/mem/dram_ctrl.cc 94901e131a7f 
  src/mem/dramsim2.hh 94901e131a7f 
  src/mem/dramsim2.cc 94901e131a7f 
  src/mem/external_slave.cc 94901e131a7f 
  src/mem/mem_checker_monitor.hh 94901e131a7f 
  src/mem/mem_checker_monitor.cc 94901e131a7f 
  src/mem/mport.hh 94901e131a7f 
  src/mem/noncoherent_xbar.hh 94901e131a7f 
  src/mem/noncoherent_xbar.cc 94901e131a7f 
  src/mem/packet_queue.hh 94901e131a7f 
  src/mem/packet_queue.cc 94901e131a7f 
  src/mem/port.hh 94901e131a7f 
  src/mem/port.cc 94901e131a7f 
  src/mem/qport.hh 94901e131a7f 
  src/mem/ruby/slicc_interface/AbstractController.hh 94901e131a7f 
  src/mem/ruby/slicc_interface/AbstractController.cc 94901e131a7f 
  src/mem/ruby/structures/RubyMemoryControl.hh 94901e131a7f 
  src/mem/ruby/system/DMASequencer.hh 94901e131a7f 
  src/mem/ruby/system/DMASequencer.cc 94901e131a7f 
  src/mem/ruby/system/RubyPort.hh 94901e131a7f 
  src/mem/ruby/system/RubyPort.cc 94901e131a7f 
  src/mem/simple_mem.hh 94901e131a7f 
  src/mem/simple_mem.cc 94901e131a7f 
  src/mem/tport.hh 94901e131a7f 
  src/mem/tport.cc 94901e131a7f 
  src/mem/xbar.hh 94901e131a7f 
  src/mem/xbar.cc 94901e131a7f 
  src/sim/system.hh 94901e131a7f 

Diff: http://reviews.gem5.org/r/2646/diff/


Testing
---


Thanks,

Andreas Hansson



Re: [gem5-dev] Review Request 2611: cpu: Tidy up the MemTest and make false sharing more obvious

2015-02-07 Thread Andreas Hansson via gem5-dev


> On Jan. 30, 2015, 9:47 p.m., Nilay Vaish wrote:
> > src/cpu/testers/memtest/MemTest.py, line 55
> > 
> >
> > Are you sure this should be dropped?  I think the coherence protocols 
> > that provide a dma controller need this for testing.
> 
> Andreas Hansson wrote:
> All regressions work just fine (with stats updates).
> 
> I am about to post a separate test script that actually shares data (not 
> just false sharing), doing so based on the TrafficGen and MemChecker.
> 
> Andreas Hansson wrote:
> You're ok with this Nilay? I'll post the true-sharing checker later this 
> week.
> 
> Andreas Hansson wrote:
> Pretty please :-)
> 
> Andreas Hansson wrote:
> http://reviews.gem5.org/r/2626/ actually tests true sharing

Would really appreciate getting this pushed, so if there are any more comments 
please let me know.


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2611/#review5809
---


On Jan. 21, 2015, 1:23 p.m., Andreas Hansson wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2611/
> ---
> 
> (Updated Jan. 21, 2015, 1:23 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10671:94bc71e83168
> ---
> cpu: Tidy up the MemTest and make false sharing more obvious
> 
> The MemTest class really only tests false sharing, and as such there
> was a lot of old cruft that could be removed. This patch cleans up the
> tester, and also makes it more clear what the assumptions are. As part
> of this simplification the reference functional memory is also
> removed.
> 
> The regression configs using MemTest are updated to reflect the
> changes, and the stats will be bumped in a separate patch. The example
> config will be updated in a separate patch due to more extensive
> re-work.
> 
> In a follow-on patch a new tester will be introduced that uses the
> MemChecker to implement true sharing.
> 
> 
> Diffs
> -
> 
>   configs/example/memtest.py a6fe75e8296b 
>   src/cpu/testers/memtest/MemTest.py a6fe75e8296b 
>   src/cpu/testers/memtest/memtest.hh a6fe75e8296b 
>   src/cpu/testers/memtest/memtest.cc a6fe75e8296b 
>   tests/configs/memtest-filter.py a6fe75e8296b 
>   tests/configs/memtest-ruby.py a6fe75e8296b 
>   tests/configs/memtest.py a6fe75e8296b 
> 
> Diff: http://reviews.gem5.org/r/2611/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andreas Hansson
> 
>



Re: [gem5-dev] Review Request 2624: mem: mmap the backing store with MAP_NORESERVE

2015-02-07 Thread Andreas Hansson via gem5-dev


> On Feb. 3, 2015, 9:37 p.m., Brandon Potter wrote:
> > The patch seems like a landmine for an unsuspecting user as it would be 
> > difficult to diagnose if swap space is exhausted.  It will probably be 
> > evident that memory exhaustion caused the runtime error (segmentation 
> > fault), but tracking down the cause to this patch will probably be 
> > non-trivial (maybe?).  For a user, the memory allocation failure from 
> > insufficient swap space would probably be preferable to a hard to diagnose 
> > segmentation fault.
> > 
> > Even though it is ugly, maybe it's best to add #if 0 / #endif around the 
> > code to allow someone to find this later on and enable the feature if they 
> > need it (with a warning about the segmentation fault).  The memory 
> > allocation failure will point the user to the code in this patch.
> 
> Andreas Hansson wrote:
> Fair point. An option would be to not use the flag for "sensible" sizes, 
> perhaps <16 GB or so, and adopt it for larger sizes with a warning?
> 
> Brandon Potter wrote:
> Perhaps that's fine.  Although, I think that a value for "sensible" will 
> vary from person to person.
> 
> Personally, I still feel that it is better to leave it completely 
> disabled and allow someone to enable it if needed (solely to prevent 
> confusion later).  It is trivial to enable the functionality by modifying the 
> condition.  Another avenue might be whether it is worthwhile to use a 
> preprocessor define and the #ifdef to enable/disable this during compilation; 
> might be overkill to go through the trouble though.
> 
> Overall, maybe this is just paranoia on my part; it does seem unlikely to 
> cause a problem in practice.
> 
> Andreas Hansson wrote:
> The input is much appreciated.
> 
> My hesitation about asking the user to enable it if needed is that it 
> means yet another parameter to keep track of (or yet another patch to 
> manage). If we go for the "limit approach", perhaps a large value like 
> 1 TB?
> 
> I also think this is unlikely to cause problems the way it is, and as 
> such I would prefer not adding more configurability unless you really think 
> it is needed.
>
> 
> Brandon Potter wrote:
> Seems reasonable.  There is a tradeoff either way.

So what do people think?

1) Good as it is (avoid more switches and magic limits)?
2) Add a limit, e.g. 1 TB and use above that?
3) Add a switch and force the user to explicitly set it?

In the spirit of keeping things simple I'd suggest (1).
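
For reference, the three options differ only in how the mmap flags are chosen. A minimal sketch, assuming an anonymous private mapping on a Linux host; the helper names, the alwaysNoReserve switch and the 1 TB constant are illustrative and not taken from src/mem/physical.cc:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

// Hypothetical limit for option (2); 1 TB is the value floated in this thread.
constexpr std::uint64_t kNoReserveThreshold = 1ULL << 40;

// Choose the mmap flags for a backing store of `size` bytes.
// alwaysNoReserve == true models option (1), or option (3) with the
// switch turned on; otherwise the flag is only used above the limit.
int backingStoreFlags(std::uint64_t size, bool alwaysNoReserve)
{
    int flags = MAP_ANON | MAP_PRIVATE;
    if (alwaysNoReserve || size >= kNoReserveThreshold)
        flags |= MAP_NORESERVE;  // do not reserve swap space up front
    return flags;
}

// Map the store; returns nullptr on failure rather than MAP_FAILED.
std::uint8_t *mapBackingStore(std::size_t size, int flags)
{
    assert(size > 0);
    void *p = mmap(nullptr, size, PROT_READ | PROT_WRITE, flags, -1, 0);
    return p == MAP_FAILED ? nullptr : static_cast<std::uint8_t *>(p);
}
```

With the flag set, the mmap call itself succeeds even for very large sizes, and a shortage of host memory only shows up later, when pages are actually touched. That is the trade-off Brandon describes above.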


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2624/#review5821
---


On Feb. 3, 2015, 7:57 p.m., Andreas Hansson wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2624/
> ---
> 
> (Updated Feb. 3, 2015, 7:57 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10689:568e9a533cf0
> ---
> mem: mmap the backing store with MAP_NORESERVE
> 
> This patch ensures we can run simulations with very large simulated
> memories (at least 64 TB based on some quick runs on a Linux
> workstation). In essence this allows us to efficiently deal with
> sparse address maps without having to implement a redirection layer in
> the backing store.
> 
> This opens the door to run-time errors if we eventually exhaust the host's
> memory and swap space, but this should hopefully never happen.
> 
> 
> Diffs
> -
> 
>   src/mem/physical.cc 7639c17357dc 
> 
> Diff: http://reviews.gem5.org/r/2624/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andreas Hansson
> 
>



Re: [gem5-dev] Review Request 2626: config: Add memcheck stress test

2015-02-06 Thread Andreas Hansson via gem5-dev


> On Feb. 6, 2015, 5:10 p.m., Steve Reinhardt wrote:
> > So would this replace memtest.py?  If so, then factoring out the common 
> > code would not be an issue.  If not, why not?
> 
> Andreas Hansson wrote:
> The two are complementary:
> 
> memtest.py uses MemTest and only tests false sharing, with some progress 
> detection built in
> 
> memcheck.py uses TrafficGen and MemCheck and tests actual sharing and 
> prefetching, and has a mix of random and strided access patterns
>
> 
> Steve Reinhardt wrote:
> But do you expect that there are errors that memtest.py would find that 
> memcheck.py would not, even allowing for some tweaking of the TrafficGen 
> access patterns?  The overhead of not just maintaining but also running two 
> different memory testers seems pretty high, so even if they have 
> complementary approaches, it seems worthwhile only if they have significant 
> non-overlap in their coverage of the memory system state/protocol transition 
> space.

I do believe that they both fill a purpose at the moment.

memtest.py has very little state, and can work on a much larger scale (100s of 
testers with 10s of cache levels). I have run some 100 trillion cycles with 
this (once the message-dependent deadlocks are fixed).

memcheck.py relies on the MemCheck tracking of what the allowed read values 
are, and the state tracking does not scale very well at this point. I believe 
we can do some more aggressive pruning (and this is planned work), but even 
running a few billion cycles is currently challenging

Long term the memcheck is probably all we need


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2626/#review5855
---


On Feb. 3, 2015, 7:57 p.m., Andreas Hansson wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2626/
> ---
> 
> (Updated Feb. 3, 2015, 7:57 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10692:f0a93f672561
> ---
> config: Add memcheck stress test
> 
> This is a rather unfortunate copy of the memtest.py example script,
> that actually stresses the system with true sharing as opposed to the
> false sharing of the MemTest. To do so it uses TrafficGen instances to
> generate the reads/writes, and MemCheckerMonitor combined with the
> MemChecker to check the validity of the read/written values.
> 
> As a bonus, this script also enables the addition of prefetchers, and
> the traffic is created to have a mix of random addresses and linear
> strides. We use the TaggedPrefetcher since the packets do not have a
> request with a PC.
> 
> At the moment the code is almost identical to the memtest.py script,
> and no effort has been made to factor out the construction of the
> tree. The challenge is that the instantiation and connection of the
> testers and monitors is done as part of the tree building.
> 
> 
> Diffs
> -
> 
>   configs/example/memcheck.py PRE-CREATION 
> 
> Diff: http://reviews.gem5.org/r/2626/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andreas Hansson
> 
>



Re: [gem5-dev] Review Request 2626: config: Add memcheck stress test

2015-02-06 Thread Andreas Hansson via gem5-dev


> On Feb. 6, 2015, 5:10 p.m., Steve Reinhardt wrote:
> > So would this replace memtest.py?  If so, then factoring out the common 
> > code would not be an issue.  If not, why not?

The two are complementary:

memtest.py uses MemTest and only tests false sharing, with some progress 
detection built in

memcheck.py uses TrafficGen and MemCheck and tests actual sharing and 
prefetching, and has a mix of random and strided access patterns
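
To make the distinction concrete, here is a rough sketch of how addresses might be chosen in the two cases. It assumes a 64-byte block and that each false-sharing tester owns the byte whose offset equals its id; the function names are hypothetical and this is not the MemTest or TrafficGen code:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kBlockSize = 64;  // assumed cache block size in bytes

// False sharing: testers hit the same block, but each one only ever
// touches "its own" byte, so no two testers access the same address.
std::uint64_t falseSharingAddr(std::uint64_t blockIndex, unsigned testerId)
{
    assert(testerId < kBlockSize);
    return blockIndex * kBlockSize + testerId;
}

// True sharing: the address does not depend on the tester, so several
// testers can read and write the very same byte.
std::uint64_t trueSharingAddr(std::uint64_t blockIndex, unsigned offset)
{
    assert(offset < kBlockSize);
    return blockIndex * kBlockSize + offset;
}

// Block that an address falls into.
std::uint64_t blockOf(std::uint64_t addr) { return addr / kBlockSize; }
```

In the first case a tester can predict every value it reads back without any shared reference state, which is why it scales. In the second case something has to track which values are legal for a read to observe, which is the MemChecker's job.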


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2626/#review5855
---


On Feb. 3, 2015, 7:57 p.m., Andreas Hansson wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2626/
> ---
> 
> (Updated Feb. 3, 2015, 7:57 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10692:f0a93f672561
> ---
> config: Add memcheck stress test
> 
> This is a rather unfortunate copy of the memtest.py example script,
> that actually stresses the system with true sharing as opposed to the
> false sharing of the MemTest. To do so it uses TrafficGen instances to
> generate the reads/writes, and MemCheckerMonitor combined with the
> MemChecker to check the validity of the read/written values.
> 
> As a bonus, this script also enables the addition of prefetchers, and
> the traffic is created to have a mix of random addresses and linear
> strides. We use the TaggedPrefetcher since the packets do not have a
> request with a PC.
> 
> At the moment the code is almost identical to the memtest.py script,
> and no effort has been made to factor out the construction of the
> tree. The challenge is that the instantiation and connection of the
> testers and monitors is done as part of the tree building.
> 
> 
> Diffs
> -
> 
>   configs/example/memcheck.py PRE-CREATION 
> 
> Diff: http://reviews.gem5.org/r/2626/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andreas Hansson
> 
>



Re: [gem5-dev] Review Request 2637: mem: clean up write buffer check in Cache::handleSnoop()

2015-02-06 Thread Andreas Hansson via gem5-dev


> On Feb. 6, 2015, 8:16 a.m., Andreas Hansson wrote:
> > Ship It!
> 
> Andreas Hansson wrote:
> Here it would be good to test with the new memtest: 
> http://reviews.gem5.org/r/2612/
> 
> Note that the caches suffer from message deadlock at the moment, and the 
> test dies very quickly. I shall post a patch that fixes the message deadlock 
> in the next few days.
> 
>
> 
> Steve Reinhardt wrote:
> I think I can prove that this patch does not affect program semantics.  
> Combined with passing the current regressions, I have a lot of confidence 
> that this patch doesn't break anything.  It's really just code cleanup.

I'm not saying it is changing anything, simply that there is an awesome way to 
test cache changes...and it still needs a "ship it" :-)


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2637/#review5848
---


On Feb. 6, 2015, 12:38 a.m., Steve Reinhardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2637/
> ---
> 
> (Updated Feb. 6, 2015, 12:38 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10684:ded42ff6f410
> ---
> mem: clean up write buffer check in Cache::handleSnoop()
> 
> The 'if (writebacks.size)' check was redundant, because
> writeBuffer.findMatches() would return false if the
> writebacks list was empty.
> 
> Also renamed 'mshr' to 'wb_entry' in this context since
> we are pointing at a writebuffer entry and not an MSHR
> (even though it's the same C++ class).
> 
> 
> Diffs
> -
> 
>   src/mem/cache/cache_impl.hh 3d17366c0423a59478ae63d40c8feeea34df218a 
> 
> Diff: http://reviews.gem5.org/r/2637/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Steve Reinhardt
> 
>



Re: [gem5-dev] Review Request 2636: mem: fix prefetcher bug regarding write buffer hits

2015-02-06 Thread Andreas Hansson via gem5-dev


> On Feb. 6, 2015, 8:18 a.m., Andreas Hansson wrote:
> > I think there are more (and bigger) issues at play here. I'd suggest 
> > trying this out with: http://reviews.gem5.org/r/2626/
> > 
> > If you run it with "-p", prefetches enabled, it doesn't take many seconds 
> > before it hits an assert. We're working on fixing it, and #2626 is a step 
> > on the way.
> 
> Steve Reinhardt wrote:
> Hi Andreas,
> 
> Are you saying that you will be coming out with a patch soon that will 
> subsume this one?  I don't doubt that there are more bugs to be fixed, but 
> this patch most definitely fixes one of them (the one that George was running 
> into, and the only one he's run into so far).  I'm fine with holding off if a 
> more comprehensive fix is imminent, but on the other hand I don't want to not 
> fix one bug just because the patch fails to fix all the bugs...

A more comprehensive fix is on the way and I think we should have it on RB next 
week.


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2636/#review5849
---


On Feb. 6, 2015, 12:38 a.m., Steve Reinhardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2636/
> ---
> 
> (Updated Feb. 6, 2015, 12:38 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10683:3147f3a868f7
> ---
> mem: fix prefetcher bug regarding write buffer hits
> 
> Prefetches are supposed to be squashed if the block is already
> present in a higher-level cache.  We squash appropriately if
> the block is in a higher-level cache or MSHR, but did not
> properly handle the case where the block is in the write buffer.
> 
> Thanks to George Michelogiannakis  for
> help in tracking down and testing this fix.
> 
> 
> Diffs
> -
> 
>   src/mem/cache/cache_impl.hh 3d17366c0423a59478ae63d40c8feeea34df218a 
> 
> Diff: http://reviews.gem5.org/r/2636/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Steve Reinhardt
> 
>



Re: [gem5-dev] Review Request 2637: mem: clean up write buffer check in Cache::handleSnoop()

2015-02-06 Thread Andreas Hansson via gem5-dev


> On Feb. 6, 2015, 8:16 a.m., Andreas Hansson wrote:
> > Ship It!

Here it would be good to test with the new memtest: 
http://reviews.gem5.org/r/2612/

Note that the caches suffer from message deadlock at the moment, and the test 
dies very quickly. I shall post a patch that fixes the message deadlock in the 
next few days.


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2637/#review5848
---


On Feb. 6, 2015, 12:38 a.m., Steve Reinhardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2637/
> ---
> 
> (Updated Feb. 6, 2015, 12:38 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10684:ded42ff6f410
> ---
> mem: clean up write buffer check in Cache::handleSnoop()
> 
> The 'if (writebacks.size)' check was redundant, because
> writeBuffer.findMatches() would return false if the
> writebacks list was empty.
> 
> Also renamed 'mshr' to 'wb_entry' in this context since
> we are pointing at a writebuffer entry and not an MSHR
> (even though it's the same C++ class).
> 
> 
> Diffs
> -
> 
>   src/mem/cache/cache_impl.hh 3d17366c0423a59478ae63d40c8feeea34df218a 
> 
> Diff: http://reviews.gem5.org/r/2637/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Steve Reinhardt
> 
>



Re: [gem5-dev] Review Request 2636: mem: fix prefetcher bug regarding write buffer hits

2015-02-06 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2636/#review5849
---


I think there are more (and bigger) issues at play here. I'd suggest trying 
this out with: http://reviews.gem5.org/r/2626/

If you run it with "-p", prefetches enabled, it doesn't take many seconds 
before it hits an assert. We're working on fixing it, and #2626 is a step on 
the way.

- Andreas Hansson


On Feb. 6, 2015, 12:38 a.m., Steve Reinhardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2636/
> ---
> 
> (Updated Feb. 6, 2015, 12:38 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10683:3147f3a868f7
> ---
> mem: fix prefetcher bug regarding write buffer hits
> 
> Prefetches are supposed to be squashed if the block is already
> present in a higher-level cache.  We squash appropriately if
> the block is in a higher-level cache or MSHR, but did not
> properly handle the case where the block is in the write buffer.
> 
> Thanks to George Michelogiannakis  for
> help in tracking down and testing this fix.
> 
> 
> Diffs
> -
> 
>   src/mem/cache/cache_impl.hh 3d17366c0423a59478ae63d40c8feeea34df218a 
> 
> Diff: http://reviews.gem5.org/r/2636/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Steve Reinhardt
> 
>



Re: [gem5-dev] Review Request 2637: mem: clean up write buffer check in Cache::handleSnoop()

2015-02-06 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2637/#review5848
---

Ship it!


Ship It!

- Andreas Hansson


On Feb. 6, 2015, 12:38 a.m., Steve Reinhardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2637/
> ---
> 
> (Updated Feb. 6, 2015, 12:38 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10684:ded42ff6f410
> ---
> mem: clean up write buffer check in Cache::handleSnoop()
> 
> The 'if (writebacks.size)' check was redundant, because
> writeBuffer.findMatches() would return false if the
> writebacks list was empty.
> 
> Also renamed 'mshr' to 'wb_entry' in this context since
> we are pointing at a writebuffer entry and not an MSHR
> (even though it's the same C++ class).
> 
> 
> Diffs
> -
> 
>   src/mem/cache/cache_impl.hh 3d17366c0423a59478ae63d40c8feeea34df218a 
> 
> Diff: http://reviews.gem5.org/r/2637/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Steve Reinhardt
> 
>



Re: [gem5-dev] Review Request 2627: Ruby: Update backing store option to propagate through to all RubyPorts

2015-02-05 Thread Andreas Hansson via gem5-dev


> On Feb. 5, 2015, 4:16 p.m., Joel Hestness wrote:
> > LGTM, also.
> > 
> > I don't think it should be addressed in this patch, but I'm making notes 
> > here as a vision for how I think we should proceed with these backing store 
> > and protocol developments:
> > 
> > Really what we're capturing with the backing store option is whether a Ruby 
> > coherence protocol requires the backing store, but *not* that the 
> > RubySystem needs it for any fundamental reason. In other words, requiring a 
> > backing store is a property of a particular coherence protocol 
> > implementation rather than the overall RubySystem. We've noted previously 
> > that protocol implementations can require backing store if they are not 
> > known to correctly buffer data in the cache controllers, or if they 
> > implement some functionality that cannot be easily modeled without a 
> > backing store (e.g. slim GPU atomics). The last couple changes to backing 
> > stores are major steps in the right direction: we've reduced the backing 
> > store specifier to a single option, rather than the prior structure, which 
> > disaggregated the variable and required that the programmer set the option 
> > on each of the RubySequencers (previously 'access_phys_mem'). However, the 
> > ultimate solution *should* be that each coherence protocol implementation 
> > specifies whether it requires the backing store.
> > 
> > To this end, I propose that the next step for development on this should be 
> > to create an abstract Python wrapper object for Ruby protocol 
> > implementations that embeds this "requires backing store" property as a 
> > variable. All Ruby protocols could descend from that wrapper object, and we 
> > could handle the protocol as a Python object in config files rather than 
> > using the ugly Python exec calling convention we have in Ruby.py currently. 
> > If the protocol requires the backing store, Ruby should check for the 
> > requirement (e.g. accessor function in the abstract Python protocol object) 
> > and set up the backing store without the need for any command line 
> > parameters. If a developer decides to validate a protocol as "data 
> > correct", s/he would simply set the "requires backing store" variable to 
> > False for that protocol when the validation is completed.
> >

I have a question:

With this enabled, we now have two backing stores, the 1) "real" one accessed 
through the interconnect, or directly through the system pointer 
(system->getPhysMem()->access(...)), 2) the "fake" one that is used by the 
sequencers (RubySystem->getPhysMem() ...). Does Ruby properly update (1), or 
are we saying that (1) is occasionally wrong and you shouldn't trust it?


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2627/#review5840
---


On Feb. 5, 2015, 4:02 p.m., Jason Power wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2627/
> ---
> 
> (Updated Feb. 5, 2015, 4:02 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10681:b2dc4aee1a0d
> ---
> Ruby: Update backing store option to propagate through to all RubyPorts
> 
> Previously, the user would have to manually set access_backing_store=True
> on all RubyPorts (Sequencers) in the config files.
> Now, instead there is one global option that each RubyPort checks on
> initialization.
> 
> 
> Diffs
> -
> 
>   configs/ruby/Ruby.py 7639c17357dc 
>   src/mem/ruby/system/DMASequencer.hh 7639c17357dc 
>   src/mem/ruby/system/DMASequencer.cc 7639c17357dc 
>   src/mem/ruby/system/RubyPort.cc 7639c17357dc 
>   src/mem/ruby/system/RubySystem.py 7639c17357dc 
>   src/mem/ruby/system/Sequencer.py 7639c17357dc 
>   src/mem/ruby/system/System.hh 7639c17357dc 
>   src/mem/ruby/system/System.cc 7639c17357dc 
> 
> Diff: http://reviews.gem5.org/r/2627/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jason Power
> 
>



Re: [gem5-dev] Review Request 2627: Ruby: Update backing store option to propagate through to all RubyPorts

2015-02-05 Thread Andreas Hansson via gem5-dev


> On Feb. 5, 2015, 8:25 a.m., Andreas Hansson wrote:
> > src/mem/ruby/system/RubySystem.py, line 51
> > 
> >
> > I find the description rather misleading.
> > 
> > In essence it is not the case of bypassing the interconnect, it is more 
> > a case of duplicating the memory, and ignoring the "normal" one.
> > 
> > Thinking about it, why do we not just use the actual memory, but do so 
> > by faking it through system->getPhysMem().access(...)
> 
> Jason Power wrote:
> I've updated this description. I don't follow your last statement, though.
> 
> Andreas Hansson wrote:
> Why do we need this additional backing store, and why do we not simply 
> bypass the interconnect and use C++ magic, going through the 
> system->getPhysMem(), but to the same backing store we would have used had we 
> gone through the interconnect. Perhaps I am missing something...
> 
> Jason Power wrote:
> Well, on the one hand, this patch is just fixing the support Nilay added 
> in patch 10525:77787650cbb (http://reviews.gem5.org/r/2466/). IIRC, that 
> patch was added after Nilay removed this support although both gem5-gpu and 
> AMD depend on it. It turns out that patch was totally broken, and without 
> this patch that support didn't work. I'd like to get this patch pushed in 
> since it's fixing something that we depend on.
> 
> On the other hand, if I follow what you're saying, yes, there is a much 
> better way to provide this support. At the high level all we want is to use 
> Ruby for timing and a backing store for the data. It's very hard to implement 
> a cache coherence protocol that deals with data correctly, and the backing 
> store is useful both in debugging protocols while building them and for 
> protocols with subtle bugs in how data is handled.
> 
> Does this make sense?

Yup, makes sense.

Let's save the overhaul for another patch (or perhaps a new spin on Nilay's 
previous patch).


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2627/#review5828
---


On Feb. 5, 2015, 4:02 p.m., Jason Power wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2627/
> ---
> 
> (Updated Feb. 5, 2015, 4:02 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10681:b2dc4aee1a0d
> ---
> Ruby: Update backing store option to propagate through to all RubyPorts
> 
> Previously, the user would have to manually set access_backing_store=True
> on all RubyPorts (Sequencers) in the config files.
> Now, instead there is one global option that each RubyPort checks on
> initialization.
> 
> 
> Diffs
> -
> 
>   configs/ruby/Ruby.py 7639c17357dc 
>   src/mem/ruby/system/DMASequencer.hh 7639c17357dc 
>   src/mem/ruby/system/DMASequencer.cc 7639c17357dc 
>   src/mem/ruby/system/RubyPort.cc 7639c17357dc 
>   src/mem/ruby/system/RubySystem.py 7639c17357dc 
>   src/mem/ruby/system/Sequencer.py 7639c17357dc 
>   src/mem/ruby/system/System.hh 7639c17357dc 
>   src/mem/ruby/system/System.cc 7639c17357dc 
> 
> Diff: http://reviews.gem5.org/r/2627/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jason Power
> 
>



Re: [gem5-dev] Review Request 2627: Ruby: Update backing store option to propagate through to all RubyPorts

2015-02-05 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2627/#review5839
---

Ship it!


LGTM

- Andreas Hansson


On Feb. 5, 2015, 4:02 p.m., Jason Power wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2627/
> ---
> 
> (Updated Feb. 5, 2015, 4:02 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10681:b2dc4aee1a0d
> ---
> Ruby: Update backing store option to propagate through to all RubyPorts
> 
> Previously, the user would have to manually set access_backing_store=True
> on all RubyPorts (Sequencers) in the config files.
> Now, instead there is one global option that each RubyPort checks on
> initialization.
> 
> 
> Diffs
> -
> 
>   configs/ruby/Ruby.py 7639c17357dc 
>   src/mem/ruby/system/DMASequencer.hh 7639c17357dc 
>   src/mem/ruby/system/DMASequencer.cc 7639c17357dc 
>   src/mem/ruby/system/RubyPort.cc 7639c17357dc 
>   src/mem/ruby/system/RubySystem.py 7639c17357dc 
>   src/mem/ruby/system/Sequencer.py 7639c17357dc 
>   src/mem/ruby/system/System.hh 7639c17357dc 
>   src/mem/ruby/system/System.cc 7639c17357dc 
> 
> Diff: http://reviews.gem5.org/r/2627/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jason Power
> 
>



Re: [gem5-dev] Review Request 2627: Ruby: Update backing store option to propagate through to all RubyPorts

2015-02-05 Thread Andreas Hansson via gem5-dev


> On Feb. 5, 2015, 8:25 a.m., Andreas Hansson wrote:
> > src/mem/ruby/system/DMASequencer.hh, line 76
> > 
> >
> > Can we not get this through the sequencer instead?
> 
> Jason Power wrote:
> I can make this change, if you want. But I was following the same format 
> as is in the other Ruby ports (i.e. line 82 in RubyPort.hh). IMO, it's better 
> to have consistency across similar objects than minimizing the code.

Don't worry if it is too invasive. It merely seems unnecessary to pass the 
pointer around everywhere if it is already in the "bigger" objects.


> On Feb. 5, 2015, 8:25 a.m., Andreas Hansson wrote:
> > src/mem/ruby/system/DMASequencer.cc, line 216
> > 
> >
> > Should this really be functional, or should it just be "access"?
> 
> Jason Power wrote:
> I can make this change. What is the difference between access() and 
> functionalAccess()? When should one be used and not the other? I might have 
> missed it, but I can't find any guide to this on the wiki or in the code.
> 
> Also, do you want me to go back through and make this update in the 
> RubyPort as well?

functional -> do not change any state (only read/write and update data), should 
only be used by send/recvFunctional imho

normal -> anything from atomic or timing where we also need to deal with Swap, 
LLSC etc and the state (in addition to data) changes

I think in all these cases we should use access()
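
As a toy illustration of the difference (not gem5's memory classes; every name below is made up for the example), a functional access only moves data, whereas a normal access also maintains state such as LL/SC reservations:

```cpp
#include <cstdint>
#include <map>
#include <set>

// Toy memory model, purely hypothetical.
struct ToyMemory {
    std::map<std::uint64_t, std::uint8_t> data;
    std::set<std::uint64_t> reservations;  // stand-in for LL/SC state

    // Functional accesses: read or update the data and nothing else.
    std::uint8_t functionalRead(std::uint64_t addr) const {
        auto it = data.find(addr);
        return it == data.end() ? 0 : it->second;
    }
    void functionalWrite(std::uint64_t addr, std::uint8_t v) { data[addr] = v; }

    // Normal accesses: update the data and the associated state.
    std::uint8_t loadLocked(std::uint64_t addr) {
        reservations.insert(addr);
        return functionalRead(addr);
    }
    void write(std::uint64_t addr, std::uint8_t v) {
        reservations.erase(addr);  // an ordinary store breaks the reservation
        data[addr] = v;
    }
    bool storeConditional(std::uint64_t addr, std::uint8_t v) {
        if (reservations.erase(addr) == 0)
            return false;  // reservation lost, the store fails
        data[addr] = v;
        return true;
    }
};
```

In this sketch a functionalWrite() between a loadLocked() and a storeConditional() leaves the reservation in place, while a write() clears it. Using the functional path for atomic or timing traffic would therefore skip that kind of state update.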


> On Feb. 5, 2015, 8:25 a.m., Andreas Hansson wrote:
> > src/mem/ruby/system/RubySystem.py, line 51
> > 
> >
> > I find the description rather misleading.
> > 
> > In essence it is not the case of bypassing the interconnect, it is more 
> > a case of duplicating the memory, and ignoring the "normal" one.
> > 
> > Thinking about it, why do we not just use the actual memory, but do so 
> > by faking it through system->getPhysMem().access(...)
> 
> Jason Power wrote:
> I've updated this description. I don't follow your last statement, though.

Why do we need this additional backing store, and why do we not simply bypass 
the interconnect and use C++ magic, going through the system->getPhysMem(), but 
to the same backing store we would have used had we gone through the 
interconnect. Perhaps I am missing something...


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2627/#review5828
---


On Feb. 5, 2015, 3:38 p.m., Jason Power wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2627/
> ---
> 
> (Updated Feb. 5, 2015, 3:38 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10681:b2dc4aee1a0d
> ---
> Ruby: Update backing store option to propagate through to all RubyPorts
> 
> Previously, the user would have to manually set access_backing_store=True
> on all RubyPorts (Sequencers) in the config files.
> Now, instead there is one global option that each RubyPort checks on
> initialization.
> 
> 
> Diffs
> -
> 
>   configs/ruby/Ruby.py 7639c17357dc 
>   src/mem/ruby/system/DMASequencer.hh 7639c17357dc 
>   src/mem/ruby/system/DMASequencer.cc 7639c17357dc 
>   src/mem/ruby/system/RubyPort.cc 7639c17357dc 
>   src/mem/ruby/system/RubySystem.py 7639c17357dc 
>   src/mem/ruby/system/Sequencer.py 7639c17357dc 
>   src/mem/ruby/system/System.hh 7639c17357dc 
>   src/mem/ruby/system/System.cc 7639c17357dc 
> 
> Diff: http://reviews.gem5.org/r/2627/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jason Power
> 
>

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] Review Request 2635: mem: Clarification of packet crossbar timings

2015-02-05 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2635/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10701:7aa9b19210a0
---
mem: Clarification of packet crossbar timings

This patch clarifies the packet timings annotated
when going through a crossbar.

The old 'firstWordDelay' is replaced by 'headerDelay' that represents
the delay associated with the delivery of the header of the packet.

The old 'lastWordDelay' is replaced by 'payloadDelay' that represents
the delay needed to process the payload of the packet.

For now the uses and values remain identical. However, going forward
the payloadDelay will be additive, and not include the
headerDelay. Follow-on patches will make the headerDelay capture the
pipeline latency incurred in the crossbar, whereas the payloadDelay
will capture the additional serialisation delay.
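
As a rough illustration of the additive model described above, the
following sketch keeps the two annotations separate and only sums them
when the completion time is needed. Everything except the names
headerDelay and payloadDelay is an assumption made for the example
(PacketTiming, annotateCrossing, completionDelay, the Tick alias and the
bytes-per-tick serialisation model); it is not the code in
src/mem/packet.hh or src/mem/xbar.cc.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for simulated time; not gem5's own Tick type.
using Tick = std::uint64_t;

// Illustrative packet that carries only the two timing annotations.
struct PacketTiming
{
    Tick headerDelay = 0;   // delay until the header has been delivered
    Tick payloadDelay = 0;  // extra delay to transfer the payload
};

// Annotate a packet crossing one (hypothetical) crossbar using the
// additive model: the pipeline latency goes to headerDelay and the
// serialisation delay goes to payloadDelay.
inline void
annotateCrossing(PacketTiming &pkt, Tick pipelineLatency,
                 std::uint64_t payloadBytes, std::uint64_t bytesPerTick)
{
    assert(bytesPerTick > 0);
    // Round up so that a partially filled beat still costs a full tick.
    const Tick serialisation =
        (payloadBytes + bytesPerTick - 1) / bytesPerTick;
    pkt.headerDelay += pipelineLatency;
    pkt.payloadDelay += serialisation;
}

// Delay until the complete packet has arrived under the additive model.
inline Tick
completionDelay(const PacketTiming &pkt)
{
    return pkt.headerDelay + pkt.payloadDelay;
}
```

With a pipeline latency of 2 ticks and 64 bytes moved at 16 bytes per
tick, the sketch yields a headerDelay of 2, a payloadDelay of 4 and a
completion delay of 6. Since the patch keeps the values identical to the
old firstWordDelay and lastWordDelay for now, this split only reflects
the behaviour announced for the follow-on patches.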


Diffs
-

  src/arch/x86/pagetable_walker.cc 7639c17357dc 
  src/dev/io_device.cc 7639c17357dc 
  src/dev/pcidev.cc 7639c17357dc 
  src/dev/x86/intdev.hh 7639c17357dc 
  src/mem/bridge.cc 7639c17357dc 
  src/mem/cache/cache_impl.hh 7639c17357dc 
  src/mem/coherent_xbar.cc 7639c17357dc 
  src/mem/dram_ctrl.cc 7639c17357dc 
  src/mem/dramsim2.cc 7639c17357dc 
  src/mem/external_slave.cc 7639c17357dc 
  src/mem/noncoherent_xbar.cc 7639c17357dc 
  src/mem/packet.hh 7639c17357dc 
  src/mem/simple_mem.cc 7639c17357dc 
  src/mem/xbar.hh 7639c17357dc 
  src/mem/xbar.cc 7639c17357dc 

Diff: http://reviews.gem5.org/r/2635/diff/


Testing
---


Thanks,

Andreas Hansson

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] Review Request 2634: mem: Clarify usage of latency in the cache

2015-02-05 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2634/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10700:5499f76d3356
---
mem: Clarify usage of latency in the cache

This patch adds some much-needed clarity in the specification of the
cache timing. For now, hit_latency and response_latency are kept as
top-level parameters, but the cache itself has a number of local
variables to better map the individual timing variables to different
behaviours (and sub-components).

The introduced variables are:
- lookupLatency: latency of tag lookup, occurring on any access
- forwardLatency: latency that occurs in case of an outbound miss
- fillLatency: latency to fill a cache block
We keep the existing responseLatency.

The forwardLatency is used by allocateInternalBuffer() for:
- MSHR allocateWriteBuffer (uncached write forwarded to WriteBuffer);
- MSHR allocateMissBuffer (cacheable miss in MSHR queue);
- MSHR allocateUncachedReadBuffer (uncached read allocated in MSHR
  queue)
It is our assumption that the time for the above three buffers is the
same. Similarly, for snoop responses passing through the cache we use
forwardLatency.
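
The mapping from cache behaviour to timing variable can be summarised in
a small sketch. The four latency names come from the description above;
the Cycles alias, the CacheLatencies struct, the CacheEvent enum and
latencyFor() are hypothetical names used only for illustration, and the
use of responseLatency when sending a response is an assumption based on
its name. How the cache combines these values on a given path is not
covered here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a cycle count; not gem5's own Cycles class.
using Cycles = std::uint64_t;

// The local timing variables named in the patch description.
struct CacheLatencies
{
    Cycles lookupLatency;    // tag lookup, on any access
    Cycles forwardLatency;   // outbound miss and forwarded snoop responses
    Cycles fillLatency;      // filling a cache block
    Cycles responseLatency;  // existing parameter, kept as is
};

// Illustrative list of the behaviours mentioned in the description.
enum class CacheEvent
{
    TagLookup,
    AllocateWriteBuffer,         // uncached write
    AllocateMissBuffer,          // cacheable miss
    AllocateUncachedReadBuffer,  // uncached read
    ForwardSnoopResponse,
    FillBlock,
    SendResponse                 // assumed use of responseLatency
};

// Return the latency variable that applies to a given event.
inline Cycles
latencyFor(const CacheLatencies &lat, CacheEvent ev)
{
    switch (ev) {
      case CacheEvent::TagLookup:
        return lat.lookupLatency;
      // The three buffer allocations are assumed to take the same time,
      // and snoop responses passing through use the same value.
      case CacheEvent::AllocateWriteBuffer:
      case CacheEvent::AllocateMissBuffer:
      case CacheEvent::AllocateUncachedReadBuffer:
      case CacheEvent::ForwardSnoopResponse:
        return lat.forwardLatency;
      case CacheEvent::FillBlock:
        return lat.fillLatency;
      case CacheEvent::SendResponse:
        return lat.responseLatency;
    }
    assert(false && "unknown cache event");
    return 0;
}
```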


Diffs
-

  src/mem/cache/base.hh 7639c17357dc 
  src/mem/cache/base.cc 7639c17357dc 
  src/mem/cache/cache_impl.hh 7639c17357dc 
  src/mem/cache/tags/base.hh 7639c17357dc 
  src/mem/cache/tags/base.cc 7639c17357dc 
  src/mem/cache/tags/base_set_assoc.hh 7639c17357dc 
  src/mem/cache/tags/base_set_assoc.cc 7639c17357dc 
  src/mem/cache/tags/fa_lru.hh 7639c17357dc 
  src/mem/cache/tags/fa_lru.cc 7639c17357dc 

Diff: http://reviews.gem5.org/r/2634/diff/


Testing
---


Thanks,

Andreas Hansson

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] Review Request 2633: base: Do not dereference NULL in CompoundFlag creation

2015-02-05 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2633/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10694:c953f9d91908
---
base: Do not dereference NULL in CompoundFlag creation

This patch fixes the CompoundFlag constructor, ensuring that it does
not dereference NULL. Doing so has undefined behaviour, and both clang
and gcc's undefined-behaviour sanitisers were rather unhappy.
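
This message does not show the actual change, so the following is only a
sketch of one common way to avoid this class of problem: optional
constructor arguments are passed as pointers that default to nullptr and
are checked before use, rather than as references bound to a
dereferenced null pointer. Flag and FlagGroup are hypothetical stand-ins
and do not reproduce the classes in src/base/debug.hh.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal flag type, used only for this example.
struct Flag
{
    std::string name;
};

// Hypothetical stand-in for a flag that groups other flags.
class FlagGroup
{
  public:
    // Absent members are represented by nullptr and never dereferenced.
    explicit FlagGroup(Flag *f0 = nullptr, Flag *f1 = nullptr,
                       Flag *f2 = nullptr)
    {
        add(f0);
        add(f1);
        add(f2);
    }

    std::size_t size() const { return members.size(); }

    const Flag &
    at(std::size_t i) const
    {
        assert(i < members.size());
        return *members[i];
    }

  private:
    // Only store flags that were actually supplied.
    void
    add(Flag *f)
    {
        if (f != nullptr)
            members.push_back(f);
    }

    std::vector<Flag *> members;
};
```

The check in add() is the point of the pattern: a missing argument stays
a null pointer and is skipped, so no reference to a non-existent object
is ever formed.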


Diffs
-

  src/SConscript 7639c17357dc 
  src/base/debug.hh 7639c17357dc 

Diff: http://reviews.gem5.org/r/2633/diff/


Testing
---


Thanks,

Andreas Hansson

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] Review Request 2629: sim: Move the BaseTLB to src/arch/generic/

2015-02-05 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2629/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10692:51ce51ce19d6
---
sim: Move the BaseTLB to src/arch/generic/

The TLB-related code is generally architecture dependent and should
live in the arch directory to signify that.


Diffs
-

  src/arch/alpha/tlb.hh 7639c17357dc 
  src/arch/arm/stage2_lookup.hh 7639c17357dc 
  src/arch/arm/tlb.hh 7639c17357dc 
  src/arch/generic/BaseTLB.py PRE-CREATION 
  src/arch/generic/SConscript 7639c17357dc 
  src/arch/generic/tlb.hh PRE-CREATION 
  src/arch/generic/tlb.cc PRE-CREATION 
  src/arch/mips/tlb.hh 7639c17357dc 
  src/arch/power/tlb.hh 7639c17357dc 
  src/arch/sparc/tlb.hh 7639c17357dc 
  src/arch/x86/faults.hh 7639c17357dc 
  src/arch/x86/tlb.hh 7639c17357dc 
  src/cpu/base_dyn_inst.hh 7639c17357dc 
  src/cpu/checker/cpu.cc 7639c17357dc 
  src/cpu/translation.hh 7639c17357dc 
  src/sim/BaseTLB.py 7639c17357dc 
  src/sim/SConscript 7639c17357dc 
  src/sim/tlb.hh 7639c17357dc 
  src/sim/tlb.cc 7639c17357dc 

Diff: http://reviews.gem5.org/r/2629/diff/


Testing
---


Thanks,

Andreas Hansson

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] Review Request 2632: style: Fix broken m5format command

2015-02-05 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2632/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10698:004519179fc0
---
style: Fix broken m5format command

The m5format command didn't actually work due to parameter handling
issues and missing language detection. This changeset fixes those
issues and cleans up some of the code shared between the style
checker and the format checker.


Diffs
-

  util/style.py 7639c17357dc 

Diff: http://reviews.gem5.org/r/2632/diff/


Testing
---


Thanks,

Andreas Hansson

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] Review Request 2628: base: Add compiler macros to add deprecation warnings

2015-02-05 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2628/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10691:d5f62b1204ad
---
base: Add compiler macros to add deprecation warnings

Gcc and clang both provide an attribute that can be used to flag a
function as deprecated at compile time. This changeset adds a gem5
compiler macro for that compiler feature. The macro can be used to
indicate that a legacy API within gem5 has been deprecated and provide
a graceful migration to the new API.
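
A minimal sketch of such a macro is shown below. It relies on the
deprecated attribute that both GCC and clang accept; the macro names
EXAMPLE_DEPRECATED and EXAMPLE_DEPRECATED_MSG and the two functions are
hypothetical, since the names added to src/base/compiler.hh are not
shown in this message.

```cpp
// Hypothetical macro names, used only for this illustration.
#if defined(__GNUC__) || defined(__clang__)
#  define EXAMPLE_DEPRECATED __attribute__((deprecated))
#  define EXAMPLE_DEPRECATED_MSG(msg) __attribute__((deprecated(msg)))
#else
// Unknown compiler: expand to nothing so that the code still builds.
#  define EXAMPLE_DEPRECATED
#  define EXAMPLE_DEPRECATED_MSG(msg)
#endif

// New API (hypothetical).
inline int
scaledLatency(int cycles, int factor)
{
    return cycles * factor;
}

// Legacy API kept as a thin wrapper. Callers still compile, but the
// compiler emits a deprecation warning that points them to the new API.
EXAMPLE_DEPRECATED_MSG("use scaledLatency() instead")
inline int
oldScaledLatency(int cycles)
{
    return scaledLatency(cycles, 2);
}
```

Keeping the old entry point as a wrapper around the new one is what makes
the migration graceful: existing callers keep working and are warned at
compile time, and the wrapper can be removed once they have moved over.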


Diffs
-

  src/SConscript 7639c17357dc 
  src/base/compiler.hh 7639c17357dc 

Diff: http://reviews.gem5.org/r/2628/diff/


Testing
---


Thanks,

Andreas Hansson

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] Review Request 2631: style: Fix incorrect style checker option name

2015-02-05 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2631/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10697:abe0bc9cbee2
---
style: Fix incorrect style checker option name

The style checker used to support the option -w to automatically fix
white space issues. However, this option was actually wired up to fix
all style issues the checker encountered. This changeset cleans up the
code that handles automatic fixing and adds an option to fix all
issues, and separate options for white spaces and include ordering.


Diffs
-

  util/style.py 7639c17357dc 

Diff: http://reviews.gem5.org/r/2631/diff/


Testing
---


Thanks,

Andreas Hansson

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


[gem5-dev] Review Request 2630: arm: Wire up the GIC with the platform in the base class

2015-02-05 Thread Andreas Hansson via gem5-dev

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2630/
---

Review request for Default.


Repository: gem5


Description
---

Changeset 10693:cc4bc8869ff8
---
arm: Wire up the GIC with the platform in the base class

Move the (common) GIC initialization code that notifies the platform
code of the new GIC to the base class (BaseGic) instead of the Pl390
implementation.
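
The idea can be sketched as follows: the base class constructor
registers the new interrupt controller with the platform, so that every
derived implementation gets this behaviour without repeating it. The
Platform type, its setGic() method and the constructor signatures are
assumptions for the example and do not reproduce the code in
src/dev/arm/base_gic.cc or src/dev/arm/gic_pl390.cc.

```cpp
#include <cassert>

class BaseGic;

// Hypothetical platform that records which GIC it has been given.
struct Platform
{
    BaseGic *gic = nullptr;

    void
    setGic(BaseGic *g)
    {
        assert(g != nullptr);
        gic = g;
    }
};

// Common base class: notifying the platform happens here, once.
class BaseGic
{
  public:
    explicit BaseGic(Platform &p) : platform(p)
    {
        platform.setGic(this);
    }

    virtual ~BaseGic() = default;

  protected:
    Platform &platform;
};

// A concrete implementation no longer needs its own registration code.
class Pl390 : public BaseGic
{
  public:
    explicit Pl390(Platform &p) : BaseGic(p) {}
};
```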


Diffs
-

  src/dev/arm/base_gic.cc 7639c17357dc 
  src/dev/arm/gic_pl390.cc 7639c17357dc 

Diff: http://reviews.gem5.org/r/2630/diff/


Testing
---


Thanks,

Andreas Hansson

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


Re: [gem5-dev] Review Request 2607: O3CPU: Idle CPU status logic revised

2015-02-05 Thread Andreas Hansson via gem5-dev


> On Feb. 3, 2015, 6:40 p.m., Stephan Diestelhorst wrote:
> > Looks fine to me, too.  However, when following the logic here, I found 
> > that this is a generally messy part of the O3 core.  I think it would make 
> > sense to (1) restructure similar to the Minor CPU core (where there is a 
> > state per context) and then (2) unify this whole handling in BaseCPU and 
> > pull it out of the respective cores.  AFAICS, this (active / idle state vs 
> > threading) should be generic enough to be pulled out, or at least be 
> > similar across the cores.
> 
> Steve Reinhardt wrote:
> Hi Stephan, I agree with you and Andreas that (1) this whole part of the 
> code is overly complicated and (2) it would be good to abstract as much as 
> possible and move it to BaseCPU.  However, this is just a simple bug fix, so 
> it would be nice to get it committed and move on, and defer the restructuring 
> to some future date.  Does that sound OK?

I'd say the current patch is fine and good to go (just update the summary 
keyword to cpu and make sure all regressions pass).

That said, it would be good to see a few follow-up patches addressing the 
aforementioned issues. Since you started down this path it seems like a natural 
progression.


- Andreas


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2607/#review5818
---


On Jan. 21, 2015, midnight, Alexandru Dutu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2607/
> ---
> 
> (Updated Jan. 21, 2015, midnight)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 10652:1dd33bc912af
> ---
> O3CPU: Idle CPU status logic revised
> This patch sets the CPU status to idle when the last active thread gets
> suspended.
> 
> 
> Diffs
> -
> 
>   src/cpu/o3/cpu.cc a6fe75e8296b0dd2293596d11b2ac0f842b10463 
> 
> Diff: http://reviews.gem5.org/r/2607/diff/
> 
> 
> Testing
> ---
> 
> Tested ALPHA and X86 quick SE regressions. Also, tested ALPHA multi-program 
> SMT with small applications.
> 
> 
> Thanks,
> 
> Alexandru Dutu
> 
>

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev

