[m5-dev] Checkpointing

2011-03-27 Thread Ali Saidi
Is there any reason to have a serialize function in the timing and o3 cpus? 
Creating a checkpoint from them will be broken, since if you're using caches the 
dirty data won't be saved. Shouldn't we change their implementation to fatal()?

Thanks,
Ali

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Early Branch Resolution in O3

2011-03-27 Thread Ali Saidi

On Mar 26, 2011, at 4:48 PM, Korey Sewell wrote:

 I'm bumping the below e-mail from the users lists to dev. I believe it
 is a legit problem with decode not actually passing back the correct
 value for taken/not taken to the branch predictor when it detects a
 pc-relative, unconditional control branch in decode.
 
 The relevant line in decode_impl.hh is this:
 toFetch->decodeInfo[tid].branchTaken = inst->pcState().branching();

Since it's just resolving unconditional branches (and actually just when there 
is a mis-predict), doesn't that mean branchTaken should = true?


 
 However, the branch isn't technically resolved at that point so you
 won't get the right resolution back to BpHistory.
 
 I'm thinking the fix might be to:
 1. Call execute on the branch in decode so that the PC/NPC/NNPC values
 can be updated.
 2. Call squash in decode
 3. Bypass execution of the unconditional, pc-relative branch in IEW...
 (this may be trickier than it sounds, but I'm not fond of the idea of
 just executing the branch 2x)
 
 Any thoughts?


The actual instruction isn't sent back, it's just the PC and taken/not-taken so 
I don't think other things need to change. In the case of a successful branch, 
the bp is updated when that branch commits (in commit).

Ali


 
 
 -- Forwarded message --
 From: reena panda reena.pa...@gmail.com
 Date: Sat, Mar 26, 2011 at 4:17 PM
 Subject: [m5-users] Help with Branch Misprediction recovery in m5.
 To: M5 users mailing list m5-us...@m5sim.org
 
 
 Hi,
 
 I am using m5 in ALPHA FS mode, with O3 CPU model. I was going through
 the fetch/decode stage implementation in m5. But I can't understand
 properly, the way branch misprediction is handled in the decode/fetch
 stage of pipeline. Please correct me if I am wrong, but the way it is
 currently implemented in m5 is as follows:-
 
 Suppose an unconditional branch(PC = x, say) is fetched in the fetch
 cycle, its branch prediction history is immediately updated as a
 taken branch. Now let's say its actual target is Y. But suppose
 the entry corresponding to the branch PC (x) is not found in the BTB,
 then the next PC and nextNPC are still updated to x+4, x+8
 respectively. Since unconditional branches can be resolved in the
 decode stage, the following check is correctly performed in
 decodeInsts function (in decode_impl.hh):-
 
 if (inst->branchTarget() != inst->readPredPC()) {
     ++decodeBranchMispred;
     squash(inst, inst->threadNumber);
 }
 
 But what is odd is that in squash function, the following information
 is sent back to fetch stage:-
 
 toFetch->decodeInfo[tid].nextPC  = inst->branchTarget();
 toFetch->decodeInfo[tid].nextNPC = inst->branchTarget() +
     sizeof(TheISA::MachInst);
 toFetch->decodeInfo[tid].branchTaken = inst->readNextPC() !=
     (inst->readPC() + sizeof(TheISA::MachInst));
 
 The third statement is odd because it compares nextPC with PC + 4 (i.e.,
 x+4 with x+4), which yields a branch direction of not-taken. That is wrong
 and would update the branch predictor incorrectly, since the branch in the
 example is an unconditional, taken branch. Shouldn't the last line instead
 be something like this:
 toFetch->decodeInfo[tid].branchTaken = inst->branchTarget()
     != (inst->readPC() + sizeof(TheISA::MachInst));
 
 Please point out if I am missing something here; I can't work out how this
 is supposed to behave. Also, can someone give me pointers on how to infer
 total branch misprediction statistics from the stats.txt file? The
 stats seem to be scattered across the different pipeline stages. Are
 they all disjoint, or is there any degree of overlap between them?
 
 Thanks,
 Reena
 
 ___
 m5-users mailing list
 m5-us...@m5sim.org
 http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
 
 
 
 -- 
 - Korey


[m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression --scratch all

2011-03-27 Thread Cron Daemon
M5 exited with non-zero status
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby FAILED!
* build/ALPHA_SE/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby FAILED!
* build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby FAILED!
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby FAILED!
* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing-ruby FAILED!
* build/SPARC_SE/tests/fast/quick/00.hello/sparc/linux/simple-timing-ruby FAILED!
* build/X86_SE/tests/fast/quick/00.hello/x86/linux/simple-timing-ruby FAILED!
* build/X86_SE/tests/fast/long/20.parser/x86/linux/o3-timing FAILED!
scons: *** [build/POWER_SE/kern/linux/linux.fo] Error 1
scons: *** [build/POWER_SE/arch/power/insts/branch.fo] Error 1
scons: *** [build/POWER_SE/arch/power/insts/mem.fo] Error 1
scons: *** [build/POWER_SE/arch/power/insts/integer.fo] Error 1
scons: *** [build/POWER_SE/arch/power/insts/floating.fo] Error 1
scons: *** [build/POWER_SE/arch/power/insts/condition.fo] Error 1
scons: *** [build/POWER_SE/arch/power/insts/static_inst.fo] Error 1
scons: *** [build/POWER_SE/arch/power/pagetable.fo] Error 1
scons: *** [build/POWER_SE/arch/power/utility.fo] Error 1
scons: *** [build/POWER_SE/arch/power/tlb.fo] Error 1
scons: *** [build/POWER_SE/arch/power/process.fo] Error 1
scons: *** [build/POWER_SE/arch/power/linux/process.fo] Error 1
scons: *** [build/POWER_SE/arch/power/decoder.fo] Error 1
scons: *** [build/POWER_SE/arch/power/atomic_simple_cpu_exec.fo] Error 1
scons: *** [build/POWER_SE/arch/power/o3_cpu_exec.fo] Error 1
scons: *** [build/POWER_SE/arch/power/timing_simple_cpu_exec.fo] Error 1
scons: *** [build/POWER_SE/sim/stat_control.fo] Error 1
scons: *** [build/POWER_SE/sim/faults.fo] Error 1
scons: *** [build/POWER_SE/sim/pseudo_inst.fo] Error 1
scons: *** [build/POWER_SE/sim/system.fo] Error 1
scons: *** [build/POWER_SE/sim/tlb.fo] Error 1
scons: *** [build/POWER_SE/sim/process.fo] Error 1
scons: *** [build/POWER_SE/sim/syscall_emul.fo] Error 1
scons: *** [build/POWER_SE/mem/physical.fo] Error 1
scons: *** [build/POWER_SE/mem/page_table.fo] Error 1
scons: *** [build/POWER_SE/mem/translating_port.fo] Error 1
scons: *** [build/POWER_SE/mem/cache/base.fo] Error 1
scons: *** [build/POWER_SE/mem/cache/prefetch/base.fo] Error 1
scons: *** [build/POWER_SE/cpu/base.fo] Error 1
scons: *** [build/POWER_SE/cpu/exetrace.fo] Error 1
scons: *** [build/POWER_SE/cpu/inteltrace.fo] Error 1
scons: *** [build/POWER_SE/cpu/nativetrace.fo] Error 1
scons: *** [build/POWER_SE/cpu/quiesce_event.fo] Error 1
scons: *** [build/POWER_SE/cpu/pc_event.fo] Error 1
scons: *** [build/POWER_SE/cpu/static_inst.fo] Error 1
scons: *** [build/POWER_SE/cpu/thread_context.fo] Error 1
scons: *** [build/POWER_SE/cpu/simple_thread.fo] Error 1
scons: *** [build/POWER_SE/cpu/thread_state.fo] Error 1
scons: *** [build/POWER_SE/cpu/simple/atomic.fo] Error 1
scons: *** [build/POWER_SE/cpu/simple/timing.fo] Error 1
scons: *** [build/POWER_SE/cpu/simple/base.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/base_dyn_inst.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/bpred_unit.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/commit.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/cpu.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/cpu_builder.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/decode.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/dyn_inst.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/fetch.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/free_list.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/iew.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/inst_queue.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/lsq.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/lsq_unit.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/mem_dep_unit.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/rename.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/rename_map.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/rob.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/scoreboard.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/thread_context.fo] Error 1
scons: *** [build/POWER_SE/cpu/pred/btb.fo] Error 1
scons: *** [build/POWER_SE/cpu/pred/ras.fo] Error 1
scons: *** [build/POWER_SE/base/remote_gdb.fo] Error 1
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-atomic passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-atomic passed.
* build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple-timing-mp passed.
* build/ALPHA_SE/tests/fast/quick/01.hello-2T-smt/alpha/linux/o3-timing passed.
* build/ALPHA_SE/tests/fast/long/50.vortex/alpha/tru64/simple-atomic passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/o3-timing passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/inorder-timing passed.
* build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest passed.

Re: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression --scratch all

2011-03-27 Thread Gabe Black
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/z/m5/regression/zizzer/m5/src/python/m5/main.py", line 348, in main
    exec filecode in scope
  File "tests/run.py", line 70, in <module>
    execfile(joinpath(tests_root, 'configs', test_filename + '.py'))
  File "tests/configs/simple-timing-ruby.py", line 77, in <module>
    system.ruby = Ruby.create_system(options, system)
  File "/z/m5/regression/zizzer/m5/configs/ruby/Ruby.py", line 70, in create_system
    % protocol)
  File "<string>", line 1, in <module>
  File "/z/m5/regression/zizzer/m5/configs/ruby/MI_example.py", line 63, in create_system
    block_size_bits = int(math.log(options.cacheline_size, 2))
NameError: name 'math' is not defined

This is the changeset that added a call to the log function in the math
package without actually importing it:

changeset:   8180:d8587c913ccf
user:        Brad Beckmann brad.beckm...@amd.com
date:        Fri Mar 25 10:13:50 2011 -0700
summary:     ruby: fixed cache index setting

I'm not sure how it would have worked in testing since it really isn't
imported or defined anywhere else, unless there was some other change
ahead of this originally or this was modified somehow. It could also be
the case that something got imported indirectly through "from m5.objects
import *".
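
For reference, a minimal sketch of the fix the traceback points to: import
math explicitly in the module that calls math.log rather than relying on a
wildcard import to supply it. The helper name block_size_bits_for and the
power-of-two check are illustrative assumptions, not code from
MI_example.py, which computes the value inline.

```python
import math  # the explicit import the traceback shows was missing


def block_size_bits_for(cacheline_size):
    """Return log2(cacheline_size) for a power-of-two cache line size.

    Hypothetical helper for illustration only.
    """
    if cacheline_size <= 0 or cacheline_size & (cacheline_size - 1):
        raise ValueError("cache line size must be a power of two")
    # round() guards against floating-point error in math.log(x, 2).
    return int(round(math.log(cacheline_size, 2)))
```

With the import in place the original expression no longer depends on
whatever names happen to leak in from other modules.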

Gabe


Re: [m5-dev] Review Request: config: revamp x86 config to avoid appending to SimObjectVectors

2011-03-27 Thread Gabe Black


 On 2011-03-26 12:31:49, Gabe Black wrote:
  src/arch/x86/bios/E820.py, line 53
  http://reviews.m5sim.org/r/609/diff/1/?file=11254#file11254line53
 
  I think at least most of these lists should be allowed to be empty 
  regardless of if they're being appended to.
 
 Steve Reinhardt wrote:
 They're still allowed to be empty... the question is, does that make 
 sense as a default value?  Would the system work with these parameters left 
 as the empty list?  If so, I can put that back.

For int_lines it definitely does since that's just for making sure interrupt 
lines are instantiated by making sure they have a parent. Those are objects 
that avoid cycles of simobjects by pointing outwards to their two ends. The 
ends don't refer to the line object, so no cycle can cross it. The side effect is 
that nothing inherently refers to the line itself, so those need to be 
explicitly parented somehow. I put them all in a VectorParam to get them all at 
once instead of making up junk names for everything individually.

The E820 table entries make less sense to be empty since you don't have much of 
a table with no entries. There may be a situation where you want to have a 
place holder E820 table with nothing in it since you have to have it as a 
parameter to something but don't want to actually have a table. The E820 table 
is provided by the BIOS through an interrupt call where I think ax is set to 
e820 or something like that. It provides entries which describe regions of 
memory that exist and labels them as available for use, reserved, or one of two 
types related to ACPI, if I remember correctly. A table like that is pretty 
important, but as I said I can imagine when you'd want to be able to have an 
empty one.
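
A rough sketch of what such a table holds, assuming the standard E820
range type codes (1 usable, 2 reserved, 3 ACPI reclaimable, 4 ACPI NVS).
The E820Entry tuple, the usable_bytes helper, and the example addresses
are made up for illustration and are not the SimObjects defined in
E820.py.

```python
from collections import namedtuple

# Standard E820 range type codes.
E820_USABLE = 1
E820_RESERVED = 2
E820_ACPI_RECLAIMABLE = 3
E820_ACPI_NVS = 4

# Hypothetical stand-in for one table entry: a region and its label.
E820Entry = namedtuple("E820Entry", ["addr", "size", "range_type"])


def usable_bytes(entries):
    """Total size of the regions marked as available for use."""
    return sum(e.size for e in entries if e.range_type == E820_USABLE)


# Illustrative layout only; the addresses are not taken from any config.
table = [
    E820Entry(0x0, 0x9FC00, E820_USABLE),
    E820Entry(0x9FC00, 0x60400, E820_RESERVED),
    E820Entry(0x100000, 0x7F00000, E820_USABLE),
]
```

An empty list is still a valid input here and simply reports zero usable
bytes, which is the placeholder case described above.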


- Gabe


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/609/#review1019
---


On 2011-03-26 12:17:28, Steve Reinhardt wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 http://reviews.m5sim.org/r/609/
 ---
 
 (Updated 2011-03-26 12:17:28)
 
 
 Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and 
 Nathan Binkert.
 
 
 Summary
 ---
 
 config: revamp x86 config to avoid appending to SimObjectVectors
 A significant contributor to the need for adoptOrphanParams()
 is the practice of appending to SimObjectVectors which have
 already been assigned as children.  This practice sidesteps the
 assignment operation for those appended SimObjects, which is
 where parent/child relationships are typically established.
 
 This patch reworks the config scripts that use append() on
 SimObjectVectors, which all happen to be in the x86 system
 configuration.  At some point in the future, I hope to make
 SimObjectVectors immutable (by deriving from tuple rather than
 list), at which time this patch will be necessary for correct
 operation.  For now, it just avoids some of the warning
 messages that get printed in adoptOrphanParams().
 
 
 Diffs
 -
 
   configs/common/FSConfig.py d8587c913ccf 
   src/arch/x86/bios/E820.py d8587c913ccf 
   src/dev/x86/SouthBridge.py d8587c913ccf 
 
 Diff: http://reviews.m5sim.org/r/609/diff
 
 
 Testing
 ---
 
 
 Thanks,
 
 Steve
 




Re: [m5-dev] Early Branch Resolution in O3

2011-03-27 Thread Gabe Black
On 03/27/11 13:13, Ali Saidi wrote:
 On Mar 26, 2011, at 4:48 PM, Korey Sewell wrote:

 I'm bumping the below e-mail from the users lists to dev. I believe it
 is a legit problem with decode not actually passing back the correct
 value for taken/not taken to the branch predictor when it detects a
 pc-relative, unconditional control branch in decode.

 The relevant line in decode_impl.hh is this:
 toFetch->decodeInfo[tid].branchTaken = inst->pcState().branching();
 Since it's just resolving unconditional branches (and actually just when 
 there is a mis-predict), doesn't that mean branchTaken should = true?

Possibly, but that makes the code less general. You could use the
advancePC function to get the straight line PC of the next instruction
and then compare that with the branch target. If they don't match the
branch is taken. Calling execute in decode just to straighten this out
doesn't seem like a good idea. Changes from that that ripple into
execute seem worse.
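
To make the difference concrete, here is a small Python model of the two
ways of deriving the taken flag, using the example from the forwarded
message (branch at x, target Y, BTB miss so the predicted next PC is x+4).
It assumes a fixed 4-byte instruction width; the function names and the
addresses are illustrative, not O3 code.

```python
INST_SIZE = 4  # assumed fixed instruction width, as on Alpha


def taken_from_predicted_next_pc(pc, predicted_next_pc):
    """Formula quoted in the forwarded message.

    Compares the (possibly stale) predicted next PC with the
    fall-through address.
    """
    return predicted_next_pc != pc + INST_SIZE


def taken_from_target(pc, branch_target):
    """Compare the decoded branch target with the fall-through address."""
    return branch_target != pc + INST_SIZE


x = 0x1000                         # branch PC (illustrative)
y = 0x2000                         # actual target (illustrative)
predicted_next_pc = x + INST_SIZE  # BTB miss: fetch fell through

stale = taken_from_predicted_next_pc(x, predicted_next_pc)  # reports not taken
fixed = taken_from_target(x, y)                             # reports taken
```

Note that a branch whose target happens to equal the fall-through address
would still come out as not taken under the target comparison.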

Gabe


Re: [m5-dev] Checkpointing

2011-03-27 Thread nathan binkert
 Is there any reason to have a serialize function in the timing and o3 cpus? 
 Creating a checkpoint from them will be broken since if you're using cache 
 the dirty data won't be saved? Shouldn't we change their implementation to 
 fatal()?

Is the implementation of the CPUs correct?  Arguably, it should be the
caches that cause fatal() if they're what cause the problem, no?

  Nate


Re: [m5-dev] Checkpointing

2011-03-27 Thread Steve Reinhardt
On Sun, Mar 27, 2011 at 1:27 PM, nathan binkert n...@binkert.org wrote:
 Is there any reason to have a serialize function in the timing and o3 cpus? 
 Creating a checkpoint from them will be broken since if you're using cache 
 the dirty data won't be saved? Shouldn't we change their implementation to 
 fatal()?

 Is the implementation of the CPUs correct?  Arguably, it should be the
 caches that cause fatal() if they're what cause the problem, no?

I agree with Nate... in fact the Ruby caches do have a warm-up
facility that Brad is working on porting (or says he will), so we
don't want to assume that caches can't be checkpointed.  Also it's
possible to have a timing CPU with no caches (even though it doesn't
make a lot of sense).
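
As a sketch of the division of responsibility being argued for, the toy
model below lets the CPU state serialize unconditionally and has the cache
refuse a checkpoint only when it actually holds dirty data. Every name
here (ToyCache, CheckpointError, take_checkpoint) is hypothetical, and
CheckpointError merely stands in for fatal().

```python
class CheckpointError(Exception):
    """Stands in for the simulator's fatal() in this sketch."""


class ToyCache:
    def __init__(self):
        self.lines = {}  # line address -> (data, dirty flag)

    def write(self, addr, data):
        self.lines[addr] = (data, True)

    def writeback_all(self, memory):
        """Copy every dirty line to memory and mark it clean."""
        for addr, (data, dirty) in list(self.lines.items()):
            if dirty:
                memory[addr] = data
                self.lines[addr] = (data, False)

    def serialize(self):
        dirty = sorted(a for a, (_, d) in self.lines.items() if d)
        if dirty:
            raise CheckpointError("dirty lines would be lost: %r" % dirty)
        return {}  # clean cache: nothing that memory does not already have


def take_checkpoint(cpu_state, caches):
    """The CPU part always serializes; only the caches may object."""
    return {"cpu": dict(cpu_state),
            "caches": [c.serialize() for c in caches]}
```

A timing CPU configured without caches passes an empty list and
checkpoints without complaint.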

What I would like to see is to have the O3 unserialize function fixed
so that we can avoid the silly switch_cpus thing when you want to
restore directly into O3... at least my understanding is that that's
why we don't do it that way.

Steve


Re: [m5-dev] Early Branch Resolution in O3

2011-03-27 Thread Ali Saidi

On Mar 27, 2011, at 3:19 PM, Gabe Black wrote:

 On 03/27/11 13:13, Ali Saidi wrote:
 On Mar 26, 2011, at 4:48 PM, Korey Sewell wrote:
 
 I'm bumping the below e-mail from the users lists to dev. I believe it
 is a legit problem with decode not actually passing back the correct
 value for taken/not taken to the branch predictor when it detects a
 pc-relative, unconditional control branch in decode.
 
 The relevant line in decode_impl.hh is this:
 toFetch->decodeInfo[tid].branchTaken = inst->pcState().branching();
 Since it's just resolving unconditional branches (and actually just when 
 there is a mis-predict), doesn't that mean branchTaken should = true?
 
 Possibly, but that makes the code less general. You could use the
 advancePC function to get the straight line PC of the next instruction
 and then compare that with the branch target. If they don't match the
 branch is taken. Calling execute in decode just to straighten this out
 doesn't seem like a good idea. Changes from that that ripple into
 execute seem worse.
The code doesn't need to be general. It only gets called if the branch is 
unconditional; by definition that means it must be taken. If you're violently 
opposed to that, I think Gabe is right, the simplest thing to do is:
PCState nPc = inst->pcState();
nPc.advance();
branchTaken = nPc.pc() != inst->branchTarget();

Ali




Re: [m5-dev] Early Branch Resolution in O3

2011-03-27 Thread Gabe Black
On 03/27/11 20:14, Ali Saidi wrote:
 On Mar 27, 2011, at 3:19 PM, Gabe Black wrote:

 On 03/27/11 13:13, Ali Saidi wrote:
 On Mar 26, 2011, at 4:48 PM, Korey Sewell wrote:

 I'm bumping the below e-mail from the users lists to dev. I believe it
 is a legit problem with decode not actually passing back the correct
 value for taken/not taken to the branch predictor when it detects a
 pc-relative, unconditional control branch in decode.

 The relevant line in decode_impl.hh is this:
 toFetch->decodeInfo[tid].branchTaken = inst->pcState().branching();
 Since it's just resolving unconditional branches (and actually just when 
 there is a mis-predict), doesn't that mean branchTaken should = true?
 Possibly, but that makes the code less general. You could use the
 advancePC function to get the straight line PC of the next instruction
 and then compare that with the branch target. If they don't match the
 branch is taken. Calling execute in decode just to straighten this out
 doesn't seem like a good idea. Changes from that that ripple into
 execute seem worse.
 The code doesn't need to be general. It only gets called if the branch is 
 unconditional; by definition that means it must be taken. If you're violently 
 opposed to that, I think Gabe is right, the simplest thing to do is:
 PCState nPc = inst->pcState();
 nPc.advance();
 branchTaken = nPc.pc() != inst->branchTarget();

 Ali



Except you don't want to call advance on the pc, you want to call
advancePC(nPc) on the inst. The inst knows how to advance the PC
properly (next microop, next instruction, both) and the functions that
advance the pc in particular ways aren't going to be available on every
different PCState type.
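
A toy Python model of that point, under simplifying assumptions: the PC
state carries only a macro PC and a micro PC, and the instruction knows
whether it is the last microop. ToyPCState, ToyInst and their fields are
invented for this sketch and do not mirror the real PCState classes.

```python
class ToyPCState:
    """Simplified PC state: macro PC plus micro PC."""

    def __init__(self, pc, upc=0, inst_size=4):
        self.pc = pc
        self.upc = upc
        self.inst_size = inst_size

    def copy(self):
        return ToyPCState(self.pc, self.upc, self.inst_size)


class ToyInst:
    def __init__(self, pc_state, branch_target, is_last_microop=True):
        self.pc_state = pc_state
        self.branch_target = branch_target
        self.is_last_microop = is_last_microop

    def advance_pc(self, pc_state):
        """The instruction decides how the PC moves past it."""
        if self.is_last_microop:
            pc_state.pc += pc_state.inst_size  # next macro instruction
            pc_state.upc = 0
        else:
            pc_state.upc += 1                  # next microop, same macro PC

    def branch_taken(self):
        next_pc = self.pc_state.copy()         # leave the inst's own state alone
        self.advance_pc(next_pc)
        return next_pc.pc != self.branch_target
```

The caller never needs to know which flavor of advance applies; it only
asks the instruction to advance a copy of the PC state and compares the
result with the target.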


[m5-dev] Ruby Store/Coalescing Buffer Implementation for TimingSimpleCPU

2011-03-27 Thread Malek Musleh
Hello,

I am interested in implementing a storebuffer (coalescing buffer) for
Ruby's Memory Model in M5/GEM5 for use in my current research.

I wish to be able to coalesce speculative stores + non-speculative
stores to the same cache line and then flush them to the cache during
certain acquire/release constructs.
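
The intended behavior can be sketched roughly as follows. This is a toy
model, not Ruby code: the class name, the 64-byte line size, and the
dictionary used as the cache are all assumptions, and it does not
distinguish speculative from non-speculative stores.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


class CoalescingStoreBuffer:
    def __init__(self, line_size=LINE_SIZE):
        self.line_size = line_size
        self.pending = {}  # line address -> {byte offset: value}

    def store(self, addr, value):
        """Merge a store into the pending entry for its cache line."""
        line = addr - (addr % self.line_size)
        self.pending.setdefault(line, {})[addr - line] = value

    def flush(self, cache):
        """Write each pending line to the cache once, e.g. on a release.

        Returns the number of line-sized writes issued.
        """
        writes = 0
        for line in sorted(self.pending):
            cache.setdefault(line, {}).update(self.pending[line])
            writes += 1
        self.pending.clear()
        return writes
```

Two stores to the same line cost one cache write at flush time, which is
the saving the buffer is meant to provide.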

I see that there was an existing directory called storebuffer, but it was
removed not too long ago. Reading the associated thread on the mailing
list, it seems that it was removed because it is not in use (given that
O3 is not yet functional with Ruby), nor was it ever actually used
in the original GEM implementation.

Here is the link to that thread:
http://www.mail-archive.com/m5-dev@m5sim.org/msg10575.html

In further reading of that thread, I see that there is/was general
consensus that the Ruby Store Buffer will be merged with M5 O3's LSQ.

For my research, O3 CPU Model is not a requirement, although
storebuffers tend to be used typically only in O3 execution.

For what I need to do, my specific question is as follows:

A) Would it be better/easier to implement a new Buffer (similar to the
MessageBuffer class) from the Ruby Side
or
B) actually reuse M5's existing O3's LSQ buffer in the Timing CPU Model.

I think that A) might be the easier method to go for the following reasons:

1) It seems that the Sequencer class already has functionality to
support coalescing stores to the same cache line (in reading the
previous storebuffer thread)

2) This would make the coalescing buffer CPU Model independent

3) Avoid having to change the Timing CPU Code which may make it more
likely to mess up how the CPU Model handles other memory related
things (ISA-Dependent Memory references, split data requests,
prefetching, etc).

4) Allows me to make it a Ruby Only change on the Ruby Code side of
things as opposed to the M5 side of things.

However, my hesitation with this approach is because

1) the way the Sequencer operates, it is the interface between the CPU
Core and the Ruby Memory Model (converting M5 requests to Ruby
Requests and what not), so 'logically' I guess it might make more
sense to implement the store buffer before Ruby sees the store
requests, and just have the sequencer do its thing with the
coalescing?

2) The conclusion of the previous storebuffer thread was that work is
currently?/will be done implementing the store buffer on the M5 side
of things.

Depending on if I go with Approach A), I know I would have to change
which message buffer L1 communicates with L2, such that instead of
sending stores through the L2 Request Buffer, I would send it as
follows:

L1 -> Coalescing Buffer -> L2 Request Network Buffer -> L2
instead of
L1 -> L2 Request Network Buffer -> L2

But I am not sure how exactly I would go about this if I want to add
this coalescing buffer to sit between the CPU Core and L1 as well?

Could those familiar with Ruby comment on my thoughts/offer suggestions?

Thanks

Malek