[m5-dev] Checkpointing
Is there any reason to have a serialize function in the timing and o3 cpus? Creating a checkpoint from them will be broken since if you're using cache the dirty data won't be saved? Shouldn't we change their implementation to fatal()? Thanks, Ali ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Early Branch Resolution in O3
On Mar 26, 2011, at 4:48 PM, Korey Sewell wrote:

I'm bumping the below e-mail from the users list to dev. I believe it is a legit problem with decode not actually passing back the correct value for taken/not taken to the branch predictor when it detects a pc-relative, unconditional control branch in decode. The relevant line in decode_impl.hh is this:

    toFetch->decodeInfo[tid].branchTaken = inst->pcState().branching();

Since it's just resolving unconditional branches (and actually just when there is a mis-predict), doesn't that mean branchTaken should = true?

However, the branch isn't technically resolved at that point, so you won't get the right resolution back to BpHistory. I'm thinking the fix might be to:

1. Call execute on the branch in decode so that the PC/NPC/NNPC values can be updated.
2. Call squash in decode.
3. Bypass execution of the unconditional, pc-relative branch in IEW... (this may be trickier than it sounds, but I'm not fond of the idea of just executing the branch 2x)

Any thoughts?

The actual instruction isn't sent back, it's just the PC and taken/not-taken, so I don't think other things need to change. In the case of a successful branch, the bp is updated when that branch commits (in commit).

Ali

-- Forwarded message --
From: reena panda reena.pa...@gmail.com
Date: Sat, Mar 26, 2011 at 4:17 PM
Subject: [m5-users] Help with Branch Misprediction recovery in m5.
To: M5 users mailing list m5-us...@m5sim.org

Hi,

I am using m5 in ALPHA FS mode, with the O3 CPU model. I was going through the fetch/decode stage implementation in m5, but I can't properly understand the way branch misprediction is handled in the decode/fetch stages of the pipeline. Please correct me if I am wrong, but the way it is currently implemented in m5 is as follows:

Suppose an unconditional branch (PC = x, say) is fetched in the fetch cycle; its branch prediction history is immediately updated as a taken branch. Now let's say its actual target is Y.
But suppose the entry corresponding to the branch PC (x) is not found in the BTB; then the next PC and nextNPC are still updated to x+4 and x+8 respectively. Since unconditional branches can be resolved in the decode stage, the following check is correctly performed in the decodeInsts function (in decode_impl.hh):

    if (inst->branchTarget() != inst->readPredPC()) {
        ++decodeBranchMispred;
        squash(inst, inst->threadNumber);
    }

But what is odd is that in the squash function, the following information is sent back to the fetch stage:

    toFetch->decodeInfo[tid].nextPC = inst->branchTarget();
    toFetch->decodeInfo[tid].nextNPC = inst->branchTarget() + sizeof(TheISA::MachInst);
    toFetch->decodeInfo[tid].branchTaken = inst->readNextPC() != (inst->readPC() + sizeof(TheISA::MachInst));

The third statement is odd because it compares nextPC with PC + sizeof(TheISA::MachInst) (i.e., x+4 with x+4), which yields a branch direction of not-taken. That is wrong and would update the branch predictor incorrectly, since the branch in the example is actually an unconditional taken branch. Should the last line not be something like this instead?

    toFetch->decodeInfo[tid].branchTaken = inst->branchTarget() != (inst->readPC() + sizeof(TheISA::MachInst));

Please point out if I am missing something here; I can't understand how this works correctly.

Also, can someone give me pointers on how to infer total branch misprediction statistics from the stats.txt file? The stats seem to be scattered across the different pipeline stages. Are they all disjoint, or is there any degree of overlap between them?

Thanks,
Reena

___
m5-users mailing list
m5-us...@m5sim.org
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users

--
- Korey
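The comparison questioned in this message can be modelled outside of M5. The following is a minimal Python sketch, not M5 code: it assumes fixed 4-byte instructions (as on Alpha), and the function names and example addresses are invented for illustration.

```python
INST_SIZE = 4  # assumption: fixed-width 4-byte instructions, as on Alpha


def taken_from_predicted_next_pc(pc, predicted_next_pc):
    """Models the expression quoted from squash(): uses the predicted next PC."""
    return predicted_next_pc != pc + INST_SIZE


def taken_from_branch_target(pc, branch_target):
    """Models the proposed replacement: uses the decoded branch target."""
    return branch_target != pc + INST_SIZE


# Unconditional branch at x with target y that missed in the BTB,
# so fetch predicted the fall-through address x + 4.
x, y = 0x1000, 0x2000
predicted_next_pc = x + INST_SIZE

print(taken_from_predicted_next_pc(x, predicted_next_pc))  # False
print(taken_from_branch_target(x, y))                      # True
```

In this model the first expression reports not-taken for exactly the BTB-miss case described above, while the second reports taken. Note that the second expression would still report not-taken for a branch whose target happens to equal x+4.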
[m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression --scratch all
M5 exited with non-zero status
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby FAILED!
* build/ALPHA_SE/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby FAILED!
* build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby FAILED!
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby FAILED!
* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing-ruby FAILED!
* build/SPARC_SE/tests/fast/quick/00.hello/sparc/linux/simple-timing-ruby FAILED!
* build/X86_SE/tests/fast/quick/00.hello/x86/linux/simple-timing-ruby FAILED!
* build/X86_SE/tests/fast/long/20.parser/x86/linux/o3-timing FAILED!
scons: *** [build/POWER_SE/kern/linux/linux.fo] Error 1
scons: *** [build/POWER_SE/arch/power/insts/branch.fo] Error 1
scons: *** [build/POWER_SE/arch/power/insts/mem.fo] Error 1
scons: *** [build/POWER_SE/arch/power/insts/integer.fo] Error 1
scons: *** [build/POWER_SE/arch/power/insts/floating.fo] Error 1
scons: *** [build/POWER_SE/arch/power/insts/condition.fo] Error 1
scons: *** [build/POWER_SE/arch/power/insts/static_inst.fo] Error 1
scons: *** [build/POWER_SE/arch/power/pagetable.fo] Error 1
scons: *** [build/POWER_SE/arch/power/utility.fo] Error 1
scons: *** [build/POWER_SE/arch/power/tlb.fo] Error 1
scons: *** [build/POWER_SE/arch/power/process.fo] Error 1
scons: *** [build/POWER_SE/arch/power/linux/process.fo] Error 1
scons: *** [build/POWER_SE/arch/power/decoder.fo] Error 1
scons: *** [build/POWER_SE/arch/power/atomic_simple_cpu_exec.fo] Error 1
scons: *** [build/POWER_SE/arch/power/o3_cpu_exec.fo] Error 1
scons: *** [build/POWER_SE/arch/power/timing_simple_cpu_exec.fo] Error 1
scons: *** [build/POWER_SE/sim/stat_control.fo] Error 1
scons: *** [build/POWER_SE/sim/faults.fo] Error 1
scons: *** [build/POWER_SE/sim/pseudo_inst.fo] Error 1
scons: *** [build/POWER_SE/sim/system.fo] Error 1
scons: *** [build/POWER_SE/sim/tlb.fo] Error 1
scons: *** [build/POWER_SE/sim/process.fo] Error 1
scons: *** [build/POWER_SE/sim/syscall_emul.fo] Error 1
scons: *** [build/POWER_SE/mem/physical.fo] Error 1
scons: *** [build/POWER_SE/mem/page_table.fo] Error 1
scons: *** [build/POWER_SE/mem/translating_port.fo] Error 1
scons: *** [build/POWER_SE/mem/cache/base.fo] Error 1
scons: *** [build/POWER_SE/mem/cache/prefetch/base.fo] Error 1
scons: *** [build/POWER_SE/cpu/base.fo] Error 1
scons: *** [build/POWER_SE/cpu/exetrace.fo] Error 1
scons: *** [build/POWER_SE/cpu/inteltrace.fo] Error 1
scons: *** [build/POWER_SE/cpu/nativetrace.fo] Error 1
scons: *** [build/POWER_SE/cpu/quiesce_event.fo] Error 1
scons: *** [build/POWER_SE/cpu/pc_event.fo] Error 1
scons: *** [build/POWER_SE/cpu/static_inst.fo] Error 1
scons: *** [build/POWER_SE/cpu/thread_context.fo] Error 1
scons: *** [build/POWER_SE/cpu/simple_thread.fo] Error 1
scons: *** [build/POWER_SE/cpu/thread_state.fo] Error 1
scons: *** [build/POWER_SE/cpu/simple/atomic.fo] Error 1
scons: *** [build/POWER_SE/cpu/simple/timing.fo] Error 1
scons: *** [build/POWER_SE/cpu/simple/base.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/base_dyn_inst.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/bpred_unit.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/commit.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/cpu.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/cpu_builder.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/decode.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/dyn_inst.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/fetch.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/free_list.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/iew.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/inst_queue.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/lsq.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/lsq_unit.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/mem_dep_unit.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/rename.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/rename_map.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/rob.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/scoreboard.fo] Error 1
scons: *** [build/POWER_SE/cpu/o3/thread_context.fo] Error 1
scons: *** [build/POWER_SE/cpu/pred/btb.fo] Error 1
scons: *** [build/POWER_SE/cpu/pred/ras.fo] Error 1
scons: *** [build/POWER_SE/base/remote_gdb.fo] Error 1
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-atomic passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-atomic passed.
* build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple-timing-mp passed.
* build/ALPHA_SE/tests/fast/quick/01.hello-2T-smt/alpha/linux/o3-timing passed.
* build/ALPHA_SE/tests/fast/long/50.vortex/alpha/tru64/simple-atomic passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/o3-timing passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/inorder-timing passed.
* build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest passed.
Re: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression --scratch all
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/z/m5/regression/zizzer/m5/src/python/m5/main.py", line 348, in main
    exec filecode in scope
  File "tests/run.py", line 70, in <module>
    execfile(joinpath(tests_root, 'configs', test_filename + '.py'))
  File "tests/configs/simple-timing-ruby.py", line 77, in <module>
    system.ruby = Ruby.create_system(options, system)
  File "/z/m5/regression/zizzer/m5/configs/ruby/Ruby.py", line 70, in create_system
    % protocol)
  File "<string>", line 1, in <module>
  File "/z/m5/regression/zizzer/m5/configs/ruby/MI_example.py", line 63, in create_system
    block_size_bits = int(math.log(options.cacheline_size, 2))
NameError: name 'math' is not defined

This is the changeset that added a call to the log function in the math package without actually importing it:

changeset: 8180:d8587c913ccf
user: Brad Beckmann brad.beckm...@amd.com
date: Fri Mar 25 10:13:50 2011 -0700
summary: ruby: fixed cache index setting

I'm not sure how it would have worked in testing, since math really isn't imported or defined anywhere else, unless there was some other change ahead of this originally or this was modified somehow. It could also be the case that something got imported indirectly through

    from m5.objects import *

Gabe

On 03/27/11 13:15, Cron Daemon wrote:
M5 exited with non-zero status
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby FAILED!
* build/ALPHA_SE/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby FAILED!
* build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby FAILED!
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby FAILED!
* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing-ruby FAILED!
* build/SPARC_SE/tests/fast/quick/00.hello/sparc/linux/simple-timing-ruby FAILED!
* build/X86_SE/tests/fast/quick/00.hello/x86/linux/simple-timing-ruby FAILED!
* build/X86_SE/tests/fast/long/20.parser/x86/linux/o3-timing FAILED!
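The NameError in the traceback can be reproduced and fixed outside of M5. The sketch below is plain Python rather than the actual MI_example.py: `Options` is a hypothetical stand-in for the parsed options object, and 64 is only an example cache line size.

```python
import math  # the import that the failing module lacked


class Options:
    """Hypothetical stand-in for the parsed command-line options."""
    cacheline_size = 64  # example value, in bytes


def block_size_bits(options):
    # Same expression as the failing line in the traceback:
    # the base-2 logarithm of the cache line size.
    return int(math.log(options.cacheline_size, 2))


print(block_size_bits(Options()))
```

Importing `math` explicitly in the module that uses it avoids relying on a name leaking in through a wildcard import elsewhere.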
Re: [m5-dev] Review Request: config: revamp x86 config to avoid appending to SimObjectVectors
On 2011-03-26 12:31:49, Gabe Black wrote:
src/arch/x86/bios/E820.py, line 53
http://reviews.m5sim.org/r/609/diff/1/?file=11254#file11254line53

I think at least most of these lists should be allowed to be empty regardless of whether they're being appended to.

Steve Reinhardt wrote:
They're still allowed to be empty... the question is, does that make sense as a default value? Would the system work with these parameters left as the empty list? If so, I can put that back.

For int_lines it definitely does, since that's just for making sure interrupt lines are instantiated by making sure they have a parent. Those are objects that avoid cycles of simobjects by pointing outwards to their two ends. The ends don't refer to the line object, so no cycle can cross it. The side effect is that nothing inherently refers to the line itself, so those need to be explicitly parented somehow. I put them all in a VectorParam to get them all at once instead of making up junk names for everything individually.

The E820 table entries make less sense to be empty, since you don't have much of a table with no entries. There may be a situation where you want to have a placeholder E820 table with nothing in it, since you have to have it as a parameter to something but don't want to actually have a table. The E820 table is provided by the BIOS through an interrupt call where I think ax is set to e820 or something like that. It provides entries which describe regions of memory that exist and labels them as available for use, reserved, or one of two types related to ACPI, if I remember correctly. A table like that is pretty important, but as I said, I can imagine when you'd want to be able to have an empty one.

- Gabe

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/609/#review1019
---

On 2011-03-26 12:17:28, Steve Reinhardt wrote:

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/609/
---

(Updated 2011-03-26 12:17:28)

Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert.

Summary
---
config: revamp x86 config to avoid appending to SimObjectVectors

A significant contributor to the need for adoptOrphanParams() is the practice of appending to SimObjectVectors which have already been assigned as children. This practice sidesteps the assignment operation for those appended SimObjects, which is where parent/child relationships are typically established.

This patch reworks the config scripts that use append() on SimObjectVectors, which all happen to be in the x86 system configuration. At some point in the future, I hope to make SimObjectVectors immutable (by deriving from tuple rather than list), at which time this patch will be necessary for correct operation. For now, it just avoids some of the warning messages that get printed in adoptOrphanParams().

Diffs
---
configs/common/FSConfig.py d8587c913ccf
src/arch/x86/bios/E820.py d8587c913ccf
src/dev/x86/SouthBridge.py d8587c913ccf

Diff: http://reviews.m5sim.org/r/609/diff

Testing
---

Thanks,
Steve
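To illustrate why appending sidesteps parenting, here is a toy Python model. It is not the real M5 SimObject code: the class below is hypothetical and only assumes, as the summary states, that parent/child relationships are established at assignment time.

```python
class SimObject:
    """Toy stand-in for an M5 SimObject; it only tracks its parent."""

    def __init__(self, name):
        self.name = name
        self.parent = None

    def __setattr__(self, attr, value):
        # In this toy model, assigning a list of SimObjects adopts them.
        if isinstance(value, list) and all(
                isinstance(v, SimObject) for v in value):
            for v in value:
                v.parent = self
        object.__setattr__(self, attr, value)


system = SimObject("system")
a, b, c = SimObject("a"), SimObject("b"), SimObject("c")

system.entries = [a, b]    # assignment: a and b are given a parent
system.entries.append(c)   # append: bypasses __setattr__, c stays orphaned
orphans = [o.name for o in system.entries if o.parent is None]
print(orphans)  # ['c']

# Building the complete list first and assigning it once avoids the orphan.
system.entries = [a, b, c]
```

In this model the appended object never passes through the assignment hook, which is the situation the patch avoids by constructing each vector fully before assigning it.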
Re: [m5-dev] Early Branch Resolution in O3
On 03/27/11 13:13, Ali Saidi wrote:
On Mar 26, 2011, at 4:48 PM, Korey Sewell wrote:

I'm bumping the below e-mail from the users list to dev. I believe it is a legit problem with decode not actually passing back the correct value for taken/not taken to the branch predictor when it detects a pc-relative, unconditional control branch in decode. The relevant line in decode_impl.hh is this:

    toFetch->decodeInfo[tid].branchTaken = inst->pcState().branching();

Since it's just resolving unconditional branches (and actually just when there is a mis-predict), doesn't that mean branchTaken should = true?

Possibly, but that makes the code less general. You could use the advancePC function to get the straight-line PC of the next instruction and then compare that with the branch target. If they don't match, the branch is taken. Calling execute in decode just to straighten this out doesn't seem like a good idea. Changes from that which ripple into execute seem worse.

Gabe
Re: [m5-dev] Checkpointing
Is there any reason to have a serialize function in the timing and o3 cpus? Creating a checkpoint from them will be broken since if you're using cache the dirty data won't be saved? Shouldn't we change their implementation to fatal()?

Is the implementation of the CPUs correct? Arguably, it should be the caches that call fatal() if they're what causes the problem, no?

Nate
Re: [m5-dev] Checkpointing
On Sun, Mar 27, 2011 at 1:27 PM, nathan binkert n...@binkert.org wrote:

Is there any reason to have a serialize function in the timing and o3 cpus? Creating a checkpoint from them will be broken since if you're using cache the dirty data won't be saved? Shouldn't we change their implementation to fatal()?

Is the implementation of the CPUs correct? Arguably, it should be the caches that call fatal() if they're what causes the problem, no?

I agree with Nate... in fact the Ruby caches do have a warm-up facility that Brad is working on porting (or says he will), so we don't want to assume that caches can't be checkpointed. Also, it's possible to have a timing CPU with no caches (even though it doesn't make a lot of sense).

What I would like to see is to have the O3 unserialize function fixed so that we can avoid the silly switch_cpus thing when you want to restore directly into O3... at least my understanding is that that's why we don't do it that way.

Steve
Re: [m5-dev] Early Branch Resolution in O3
On Mar 27, 2011, at 3:19 PM, Gabe Black wrote:
On 03/27/11 13:13, Ali Saidi wrote:
On Mar 26, 2011, at 4:48 PM, Korey Sewell wrote:

I'm bumping the below e-mail from the users list to dev. I believe it is a legit problem with decode not actually passing back the correct value for taken/not taken to the branch predictor when it detects a pc-relative, unconditional control branch in decode. The relevant line in decode_impl.hh is this:

    toFetch->decodeInfo[tid].branchTaken = inst->pcState().branching();

Since it's just resolving unconditional branches (and actually just when there is a mis-predict), doesn't that mean branchTaken should = true?

Possibly, but that makes the code less general. You could use the advancePC function to get the straight-line PC of the next instruction and then compare that with the branch target. If they don't match, the branch is taken. Calling execute in decode just to straighten this out doesn't seem like a good idea. Changes from that which ripple into execute seem worse.

The code doesn't need to be general. It only gets called if the branch is unconditional; by definition that means it must be taken. If you're violently opposed to that, I think Gabe is right, and the simplest thing to do is:

    PCState nPc = inst->pcState();
    nPc.advance();
    branchTaken = nPc.pc() != inst->branchTarget();

Ali
Re: [m5-dev] Early Branch Resolution in O3
On 03/27/11 20:14, Ali Saidi wrote:
On Mar 27, 2011, at 3:19 PM, Gabe Black wrote:
On 03/27/11 13:13, Ali Saidi wrote:
On Mar 26, 2011, at 4:48 PM, Korey Sewell wrote:

I'm bumping the below e-mail from the users list to dev. I believe it is a legit problem with decode not actually passing back the correct value for taken/not taken to the branch predictor when it detects a pc-relative, unconditional control branch in decode. The relevant line in decode_impl.hh is this:

    toFetch->decodeInfo[tid].branchTaken = inst->pcState().branching();

Since it's just resolving unconditional branches (and actually just when there is a mis-predict), doesn't that mean branchTaken should = true?

Possibly, but that makes the code less general. You could use the advancePC function to get the straight-line PC of the next instruction and then compare that with the branch target. If they don't match, the branch is taken. Calling execute in decode just to straighten this out doesn't seem like a good idea. Changes from that which ripple into execute seem worse.

The code doesn't need to be general. It only gets called if the branch is unconditional; by definition that means it must be taken. If you're violently opposed to that, I think Gabe is right, and the simplest thing to do is:

    PCState nPc = inst->pcState();
    nPc.advance();
    branchTaken = nPc.pc() != inst->branchTarget();

Ali

Except you don't want to call advance on the pc, you want to call advancePC(nPc) on the inst. The inst knows how to advance the PC properly (next microop, next instruction, both), and the functions that advance the pc in particular ways aren't going to be available on every different PCState type.
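The point about letting the instruction advance the PC can be sketched with a toy Python model. This is not M5 code: the `PCState` and `Inst` classes, their fields, and `branch_taken` are hypothetical and only mimic the idea that the instruction decides whether to step to the next microop or to the next instruction.

```python
from dataclasses import dataclass


@dataclass
class PCState:
    pc: int
    upc: int = 0  # micro-op index within the current macro-instruction


@dataclass
class Inst:
    size: int
    is_microop: bool = False
    is_last_microop: bool = False

    def advance_pc(self, state):
        """Advance `state` past this instruction; only the inst knows how."""
        if self.is_microop and not self.is_last_microop:
            state.upc += 1          # stay within the same macro-instruction
        else:
            state.pc += self.size   # fall through to the next instruction
            state.upc = 0


def branch_taken(inst, pc_state, target):
    # Work on a copy so the instruction's own PC state is left untouched.
    fallthrough = PCState(pc_state.pc, pc_state.upc)
    inst.advance_pc(fallthrough)
    return fallthrough.pc != target


print(branch_taken(Inst(size=4), PCState(0x1000), 0x2000))  # True
print(branch_taken(Inst(size=4), PCState(0x1000), 0x1004))  # False
```

Because the fall-through address comes from the instruction itself, the same comparison works for variable-length and microcoded instructions without the caller knowing which kind of PC state it holds.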
[m5-dev] Ruby Store/Coalescing Buffer Implementation for TimingSimpleCPU
Hello,

I am interested in implementing a store buffer (coalescing buffer) for Ruby's memory model in M5/GEM5 for use in my current research. I wish to be able to coalesce speculative and non-speculative stores to the same cache line and then flush them to the cache during certain acquire/release constructs.

I see that there was an existing directory called storebuffer, but it was removed not too long ago. Reading the associated thread on the mailing list, it seems that it was removed because it is not in use (given that O3 is not yet functional with Ruby), nor was it ever actually used in the original GEM implementation. Here is the link to that thread: http://www.mail-archive.com/m5-dev@m5sim.org/msg10575.html

In further reading of that thread, I see that there is/was general consensus that the Ruby store buffer will be merged with M5 O3's LSQ. For my research, the O3 CPU model is not a requirement, although store buffers tend to be used only in O3 execution.

For what I need to do, my specific question is as follows. Would it be better/easier to:

A) implement a new buffer (similar to the MessageBuffer class) on the Ruby side, or
B) reuse M5's existing O3 LSQ buffer in the Timing CPU model?

I think that A) might be the easier method, for the following reasons:

1) It seems that the Sequencer class already has functionality to support coalescing stores to the same cache line (from reading the previous storebuffer thread).
2) This would make the coalescing buffer independent of the CPU model.
3) It avoids having to change the Timing CPU code, which could make it more likely to mess up how the CPU model handles other memory-related things (ISA-dependent memory references, split data requests, prefetching, etc.).
4) It allows me to make it a Ruby-only change on the Ruby side of the code, as opposed to the M5 side.
However, I am hesitant about this approach because:

1) The Sequencer is the interface between the CPU core and the Ruby memory model (converting M5 requests to Ruby requests and so on), so 'logically' I guess it might make more sense to implement the store buffer before Ruby sees the store requests, and just have the Sequencer do its thing with the coalescing?
2) The conclusion of the previous storebuffer thread was that work is currently being done, or will be done, on implementing the store buffer on the M5 side of things.

If I go with approach A), I know I would have to change which message buffer L1 uses to communicate with L2, such that instead of sending stores through the L2 request buffer, I would send them as follows:

    L1 -> Coalescing Buffer -> L2 Request Network Buffer -> L2

instead of

    L1 -> L2 Request Network Buffer -> L2

But I am not sure how exactly I would go about this if I want this coalescing buffer to sit between the CPU core and L1 as well.

Could those familiar with Ruby comment on my thoughts or offer suggestions?

Thanks,
Malek
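As a rough illustration of the coalescing behaviour described in this message, here is a minimal Python sketch. It is not Ruby or M5 code: the class name, the 64-byte line size, and the `write_line` callback are all assumptions made for the example, and it does not distinguish speculative from non-speculative stores.

```python
LINE_SIZE = 64  # assumption: 64-byte cache lines


class CoalescingBuffer:
    """Hypothetical buffer that merges stores to the same cache line."""

    def __init__(self, line_size=LINE_SIZE):
        self.line_size = line_size
        self.lines = {}  # line address -> {byte offset: byte value}

    def store(self, addr, data):
        """Record the bytes of `data` at `addr`, grouped by cache line."""
        for i, byte in enumerate(data):
            a = addr + i
            line = a - a % self.line_size
            self.lines.setdefault(line, {})[a - line] = byte

    def flush(self, write_line):
        """Drain every buffered line, e.g. at an acquire/release point."""
        for line in sorted(self.lines):
            write_line(line, self.lines[line])
        self.lines.clear()


buf = CoalescingBuffer()
buf.store(0x100, b"\x01\x02")
buf.store(0x102, b"\x03")   # same line as 0x100: coalesced into one entry
buf.store(0x140, b"\xff")   # next line: separate entry

written = []
buf.flush(lambda line, data: written.append((line, dict(data))))
print(written)
```

Here three stores result in two line writes at flush time. Where such a buffer would sit (before the Sequencer, or between L1 and the L2 request network buffer) is exactly the open question raised above, and the sketch takes no position on it.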