[gem5-dev] Cron m5test@zizzer /z/m5/regression/do-regression quick
scons: *** [build/ALPHA/mem/ruby/structures/RubyMemoryControl.do] Error 1
scons: *** [build/ALPHA/mem/protocol/DMARequestMsg.do] Error 1
scons: *** [build/ALPHA/mem/protocol/DMA_Controller.do] Error 1
scons: *** [build/ALPHA/mem/protocol/DMA_Transitions.do] Error 1
scons: *** [build/ALPHA/mem/protocol/DMA_Wakeup.do] Error 1
scons: *** [build/ALPHA/mem/protocol/Directory_Controller.do] Error 1
scons: *** [build/ALPHA/mem/protocol/Directory_TBE.do] Error 1
scons: *** [build/ALPHA/mem/protocol/Directory_Transitions.do] Error 1
scons: *** [build/ALPHA/mem/protocol/Directory_Wakeup.do] Error 1
scons: *** [build/ALPHA/mem/protocol/L1Cache_Controller.do] Error 1
scons: *** [build/ALPHA/mem/protocol/L1Cache_Transitions.do] Error 1
scons: *** [build/ALPHA/mem/protocol/L1Cache_Wakeup.do] Error 1
scons: *** [build/ALPHA/mem/protocol/MachineType.do] Error 1
scons: *** [build/ALPHA/mem/protocol/MemoryMsg.do] Error 1
scons: *** [build/ALPHA/mem/protocol/RequestMsg.do] Error 1
scons: *** [build/ALPHA/mem/protocol/ResponseMsg.do] Error 1
scons: *** [build/ALPHA/python/m5/internal/param_RubyMemoryControl_wrap.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/ruby/structures/RubyMemoryControl.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/DMARequestMsg.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/DMA_Controller.do] Error 1
scons: *** [build/ALPHA/python/m5/internal/param_L1Cache_Controller_wrap.do] Error 1
scons: *** [build/ALPHA/python/m5/internal/param_Directory_Controller_wrap.do] Error 1
scons: *** [build/ALPHA/python/m5/internal/param_DMA_Controller_wrap.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/DMA_Transitions.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/DMA_Wakeup.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/Directory_Controller.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/Directory_PfEntry.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/Directory_TBE.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/Directory_Transitions.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/Directory_Wakeup.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/L1Cache_Controller.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/L1Cache_TBE.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/L1Cache_Transitions.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/L1Cache_Wakeup.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/MachineType.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/MemoryMsg.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/RequestMsg.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/mem/protocol/ResponseMsg.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/python/m5/internal/param_RubyMemoryControl_wrap.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/ruby/structures/RubyMemoryControl.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/DMA_Controller.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/DMA_Transitions.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/python/m5/internal/param_L1Cache_Controller_wrap.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/python/m5/internal/param_Directory_Controller_wrap.do] Error 1
scons: *** [build/ALPHA_MOESI_hammer/python/m5/internal/param_DMA_Controller_wrap.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/DMA_Wakeup.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/Directory_Controller.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/Directory_Entry.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/Directory_Transitions.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/Directory_Wakeup.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/L1Cache_Controller.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/L1Cache_Transitions.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/L1Cache_Wakeup.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/L2Cache_Controller.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/L2Cache_Entry.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/L2Cache_TBE.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/L2Cache_Transitions.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/L2Cache_Wakeup.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/MachineType.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/MemoryMsg.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/RequestMsg.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/mem/protocol/ResponseMsg.do] Error 1
scons: *** [build/ALPHA_MESI_Two_Level/python/m5/internal/param_RubyMemoryControl_wrap.do] Error 1
scons: *** [build/ALPHA_MOESI_CMP_directory/mem/ruby/structures/RubyMemoryControl.do] Error 1
scons: ***
Re: [gem5-dev] Cron m5test@zizzer /z/m5/regression/do-regression quick
Would someone be able to fry the build directory?

Thanks,

Andreas

On 03/09/2014 08:11, Cron Daemon via gem5-dev gem5-dev@gem5.org wrote:

> scons: *** [build/ALPHA/mem/ruby/structures/RubyMemoryControl.do] Error 1
> [...]
Re: [gem5-dev] Ruby regression tests and null isa
Hi Nilay,

That all sounds good. I am not averse to the idea of including a Ruby protocol in the NULL build, but I'd like it to be for a good reason, as it does indeed add quite some time to the build. That's all...

Andreas

On 03/09/2014 05:44, Nilay Vaish ni...@cs.wisc.edu wrote:

> On Mon, 1 Sep 2014, Andreas Hansson wrote:
>
>> Hi Nilay,
>>
>> That is a very good point, and thanks for spending some cycles on this. I'm not pushing for a transition, I merely thought it made more sense, but I forgot about the hello world tests.
>>
>> Does the "hello world" actually add any value to the regressions? Would it not be better to:
>>
>> 1) run a more extensive regression using Ruby + an o3 CPU model (linux boot etc), or
>> 2) use a more extensive synthetic tester (e.g. memtester with actual sharing, which is something we're working on...) for some of these protocols?
>
> I am fine with adding more tests. I do sometimes test by booting Linux so as to ensure things are in a working state. I am not sure if we would like to see the time for regressions going up. I am unable to recall the inner workings of the testers that we use for ruby, but I am sure they test sharing.
>
>> As a side note, I've managed to make the memory system (src/mem) completely ISA independent, so we could compile the entire memory directory once for all ISAs. Unfortunately we also need to compile it once for every coherency protocol in Ruby. I'm not sure there is any sensible way around it, but it would be good to get your thoughts on this.
>
> If I remember correctly, there is one particular file (MachineType.hh) that is the stumbling block in compiling all protocols together. I might look at this again once I am done with another ruby thing I am working on currently.
>
> Thanks
> Nilay

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev
[gem5-dev] changeset in gem5: arm: Fix ExtMachInst hash operator underlying...
changeset d2850235e31c in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=d2850235e31c
description:
arm: Fix ExtMachInst hash operator underlying type

This patch fixes the hash operator used for ARM ExtMachInst, which
incorrectly was still using uint32_t. Instead of changing it to
uint64_t it is now using the underlying data type of the BitUnion.

diffstat:

 src/arch/arm/types.hh | 17 +++--
 1 files changed, 11 insertions(+), 6 deletions(-)

diffs (27 lines):

diff -r 9e02c14446bb -r d2850235e31c src/arch/arm/types.hh
--- a/src/arch/arm/types.hh  Mon Sep 01 16:55:52 2014 -0500
+++ b/src/arch/arm/types.hh  Wed Sep 03 07:42:19 2014 -0400
@@ -727,12 +727,17 @@
 } // namespace ArmISA
 
 __hash_namespace_begin
-template<>
-struct hash<ArmISA::ExtMachInst> : public hash<uint32_t> {
-    size_t operator()(const ArmISA::ExtMachInst &emi) const {
-        return hash<uint32_t>::operator()((uint32_t)emi);
-    };
-};
+
+template<>
+struct hash<ArmISA::ExtMachInst> :
+        public hash<ArmISA::ExtMachInst::__DataType> {
+
+    size_t operator()(const ArmISA::ExtMachInst &emi) const {
+        return hash<ArmISA::ExtMachInst::__DataType>::operator()(emi);
+    }
+
+};
+
 __hash_namespace_end
 
 #endif
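For readers who want to see the idea in isolation, here is a minimal sketch of a hash specialization that delegates to the hash of a wrapper's underlying storage type. It assumes standard std::hash rather than gem5's __hash_namespace macros, and WideInst, DataType and hashOf are made-up names for illustration, not gem5 types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for a BitUnion-style wrapper: it exposes its storage
// type as a nested typedef and converts implicitly to that type.
struct WideInst {
    typedef uint64_t DataType;
    DataType bits;
    operator DataType() const { return bits; }
};

namespace std {
// Delegate to the hash of the storage type so that the full width of the
// value takes part, instead of truncating it to 32 bits first.
template <>
struct hash<WideInst> : public hash<WideInst::DataType> {
    size_t operator()(const WideInst &inst) const {
        return hash<WideInst::DataType>::operator()(inst);
    }
};
} // namespace std

// Helper used only to exercise the specialization.
inline size_t hashOf(uint64_t bits) {
    WideInst inst = {bits};
    return std::hash<WideInst>()(inst);
}
```

Because the specialization names the nested typedef rather than a fixed integer type, it should keep working if the storage width of the wrapper changes again.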
[gem5-dev] changeset in gem5: arch: Cleanup unused ISA traits constants
changeset 98771a936b61 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=98771a936b61
description:
arch: Cleanup unused ISA traits constants

This patch prunes unused values, and also unifies how the values are
defined (not using an enum for ALPHA), aligning the use of int vs Addr
etc.

The patch also removes the duplication of PageBytes/PageShift and
VMPageSize/LogVMPageSize. For all ISAs the two pairs had identical
values and the latter has been removed.

diffstat:

 src/arch/alpha/isa_traits.hh           | 14 +-
 src/arch/alpha/process.cc              | 6 +++---
 src/arch/arm/isa_traits.hh             | 11 ---
 src/arch/arm/process.cc                | 10 +-
 src/arch/arm/utility.cc                | 1 +
 src/arch/mips/isa_traits.hh            | 10 --
 src/arch/mips/process.cc               | 6 +++---
 src/arch/null/isa_traits.hh            | 3 ---
 src/arch/power/isa_traits.hh           | 3 ---
 src/arch/power/process.cc              | 6 +++---
 src/arch/sparc/isa_traits.hh           | 14 ++
 src/arch/sparc/process.cc              | 8
 src/arch/x86/isa_traits.hh             | 10 ++
 src/arch/x86/process.cc                | 14 +++---
 src/kern/tru64/tru64.hh                | 8
 src/mem/cache/prefetch/base.cc         | 2 +-
 src/mem/multi_level_page_table_impl.hh | 24
 src/mem/page_table.hh                  | 6 +++---
 src/mem/ruby/common/Address.cc         | 2 +-
 src/mem/se_translating_port_proxy.cc   | 14 +++---
 src/sim/process.cc                     | 4 ++--
 src/sim/syscall_emul.cc                | 12 ++--
 src/sim/syscall_emul.hh                | 10 +-
 src/sim/system.cc                      | 6 +++---
 24 files changed, 75 insertions(+), 129 deletions(-)

diffs (truncated from 730 to 300 lines):

diff -r 19f5df7ac6a1 -r 98771a936b61 src/arch/alpha/isa_traits.hh
--- a/src/arch/alpha/isa_traits.hh  Wed Sep 03 07:42:20 2014 -0400
+++ b/src/arch/alpha/isa_traits.hh  Wed Sep 03 07:42:21 2014 -0400
@@ -109,19 +109,7 @@
     mode_number // number of modes
 };
 
-// Constants Related to the number of registers
-
-enum {
-    LogVMPageSize = 13, // 8K bytes
-    VMPageSize = (1 << LogVMPageSize),
-
-    BranchPredAddrShiftAmt = 2, // instructions are 4-byte aligned
-
-    MachineBytes = 8,
-    WordBytes = 4,
-    HalfwordBytes = 2,
-    ByteBytes = 1
-};
+const int MachineBytes = 8;
 
 // return a no-op instruction... used for instruction fetch faults
 // Alpha UNOP (ldq_u r31,0(r0))
diff -r 19f5df7ac6a1 -r 98771a936b61 src/arch/alpha/process.cc
--- a/src/arch/alpha/process.cc  Wed Sep 03 07:42:20 2014 -0400
+++ b/src/arch/alpha/process.cc  Wed Sep 03 07:42:21 2014 -0400
@@ -49,7 +49,7 @@
     : LiveProcess(params, objFile)
 {
     brk_point = objFile->dataBase() + objFile->dataSize() + objFile->bssSize();
-    brk_point = roundUp(brk_point, VMPageSize);
+    brk_point = roundUp(brk_point, PageBytes);
 
     // Set up stack. On Alpha, stack goes below text section. This
     // code should get moved to some architecture-specific spot.
@@ -83,7 +83,7 @@
     // seem to be a problem.
     // check out _dl_aux_init() in glibc/elf/dl-support.c for details
     // --Lisa
-    auxv.push_back(auxv_t(M5_AT_PAGESZ, AlphaISA::VMPageSize));
+    auxv.push_back(auxv_t(M5_AT_PAGESZ, AlphaISA::PageBytes));
     auxv.push_back(auxv_t(M5_AT_CLKTCK, 100));
     auxv.push_back(auxv_t(M5_AT_PHDR, elfObject->programHeaderTable()));
     DPRINTF(Loader, "auxv at PHDR %08p\n", elfObject->programHeaderTable());
@@ -193,7 +193,7 @@
 
     LiveProcess::initState();
 
-    argsInit(MachineBytes, VMPageSize);
+    argsInit(MachineBytes, PageBytes);
 
     ThreadContext *tc = system->getThreadContext(contextIds[0]);
     tc->setIntReg(GlobalPointerReg, objFile->globalPointer());
diff -r 19f5df7ac6a1 -r 98771a936b61 src/arch/arm/isa_traits.hh
--- a/src/arch/arm/isa_traits.hh  Wed Sep 03 07:42:20 2014 -0400
+++ b/src/arch/arm/isa_traits.hh  Wed Sep 03 07:42:21 2014 -0400
@@ -51,8 +51,6 @@
 
 namespace LittleEndianGuest {}
 
-#define TARGET_ARM
-
 namespace ArmISA
 {
     using namespace LittleEndianGuest;
@@ -101,16 +99,7 @@
     // return a no-op instruction... used for instruction fetch faults
     const ExtMachInst NoopMachInst = 0x01E320F000ULL;
 
-    const int LogVMPageSize = 12; // 4K bytes
-    const int VMPageSize = (1 << LogVMPageSize);
-
-    // Shouldn't this be 1 because of Thumb?! Dynamic? --Ali
-    const int BranchPredAddrShiftAmt = 2; // instructions are 4-byte aligned
-
     const int MachineBytes = 4;
-    const int WordBytes = 4;
-    const int HalfwordBytes = 2;
-    const int ByteBytes = 1;
 
     const uint32_t HighVecs = 0x;
 
diff -r 19f5df7ac6a1 -r 98771a936b61 src/arch/arm/process.cc
---
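As a rough illustration of why one PageShift/PageBytes pair is enough, the sketch below derives the page size from the shift and rounds an address up to a page boundary, much as the process.cc hunks above do with roundUp(brk_point, PageBytes). The value 13 comes from the removed Alpha enum; roundUpPow2 is a simplified stand-in written for this example and is not gem5's roundUp.

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t Addr;

// Shift value taken from the removed Alpha enum above (8K pages). Defining
// the byte count in terms of the shift keeps the pair consistent.
const Addr PageShift = 13;
const Addr PageBytes = Addr(1) << PageShift;

// Round val up to the next multiple of align. Assumes align is a power of
// two; this is an illustrative helper, not the gem5 implementation.
inline Addr roundUpPow2(Addr val, Addr align) {
    return (val + align - 1) & ~(align - 1);
}
```

With a single pair of constants, call sites such as the brk_point setup no longer have to choose between two names for the same value.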
[gem5-dev] changeset in gem5: sim: Fix checkpoint restore for Ticked
changeset 82a4fa2d19a0 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=82a4fa2d19a0
description:
sim: Fix checkpoint restore for Ticked

This patch makes restoring the 'lastStopped' value for Ticked-containing
objects (including MinorCPU) optional so that Ticked-containing objects
can be restored from non-Ticked-containing objects (such as
AtomicSimpleCPU).

diffstat:

 src/sim/ticked_object.cc | 10 --
 1 files changed, 8 insertions(+), 2 deletions(-)

diffs (21 lines):

diff -r 4207f9bfcceb -r 82a4fa2d19a0 src/sim/ticked_object.cc
--- a/src/sim/ticked_object.cc  Wed Sep 03 07:42:22 2014 -0400
+++ b/src/sim/ticked_object.cc  Wed Sep 03 07:42:25 2014 -0400
@@ -82,9 +82,15 @@
 void
 Ticked::unserialize(Checkpoint *cp, const std::string &section)
 {
-    uint64_t lastStoppedUint;
+    uint64_t lastStoppedUint = 0;
 
-    paramIn(cp, section, "lastStopped", lastStoppedUint);
+    /* lastStopped is optional on checkpoint restore as this object may be
+     * being restored from one which has a common base (and so possibly
+     * many common checkpointed values) but where Ticked is used in the
+     * checkpointed object but not this one.
+     * An example would be a CPU model using Ticked restores from a
+     * simple CPU without without Ticked */
+    optParamIn(cp, section, "lastStopped", lastStoppedUint);
 
     lastStopped = Cycles(lastStoppedUint);
 }
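The pattern in this patch, initialise to a default and then read the value only if the checkpoint has it, can be sketched without any gem5 machinery. Everything below is an assumption made for illustration: Section is a plain map standing in for a checkpoint section, and optRead and restoreLastStopped are made-up names, not the gem5 optParamIn API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, much simplified checkpoint section: name -> value.
typedef std::map<std::string, uint64_t> Section;

// Optional read: leaves 'value' untouched and returns false if 'name' is
// absent, instead of treating the missing entry as an error.
inline bool optRead(const Section &sec, const std::string &name,
                    uint64_t &value) {
    Section::const_iterator it = sec.find(name);
    if (it == sec.end())
        return false;
    value = it->second;
    return true;
}

// Restore lastStopped, falling back to 0 when the checkpoint was written by
// an object that never had such a field.
inline uint64_t restoreLastStopped(const Section &sec) {
    uint64_t lastStopped = 0;
    optRead(sec, "lastStopped", lastStopped);
    return lastStopped;
}
```

The explicit initialisation matters here: with an optional read, the variable is otherwise left uninitialised whenever the entry is missing.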
[gem5-dev] changeset in gem5: arch, cpu: Factor out the ExecContext into a ...
changeset 4207f9bfcceb in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=4207f9bfcceb
description:
arch, cpu: Factor out the ExecContext into a proper base class

We currently generate and compile one version of the ISA code per CPU
model. This is obviously wasting a lot of resources at compile
time. This changeset factors out the interface into a separate
ExecContext class, which also serves as documentation for the
interface between CPUs and the ISA code. While doing so, this
changeset also fixes up interface inconsistencies between the
different CPU models.

The main argument for using one set of ISA code per CPU model has
always been performance, as this avoids indirect branches in the
generated code. However, this argument does not hold water. Booting
Linux on a simulated ARM system running in atomic mode
(opt/10.linux-boot/realview-simple-atomic) is actually 2% faster
(compiled using clang 3.4) after applying this patch. Additionally,
compilation time is decreased by 35%.

diffstat:

 SConstruct                          | 12 +-
 src/arch/SConscript                 | 13 +-
 src/arch/arm/isa/includes.isa       | 1 +
 src/arch/isa_parser.py              | 22 ++-
 src/cpu/SConscript                  | 64 +
 src/cpu/base_dyn_inst.hh            | 43 +
 src/cpu/checker/SConsopts           | 4 +-
 src/cpu/checker/cpu.hh              | 27 ++-
 src/cpu/exec_context.cc             | 40 +
 src/cpu/exec_context.hh             | 264 +++
 src/cpu/inorder/SConsopts           | 5 +-
 src/cpu/inorder/inorder_dyn_inst.cc | 5 +-
 src/cpu/inorder/inorder_dyn_inst.hh | 46 -
 src/cpu/minor/SConsopts             | 5 +-
 src/cpu/minor/exec_context.hh       | 25 +-
 src/cpu/nocpu/SConsopts             | 2 +-
 src/cpu/o3/SConsopts                | 5 +-
 src/cpu/o3/dyn_inst.hh              | 15 +-
 src/cpu/ozone/SConsopts             | 8 +-
 src/cpu/simple/SConsopts            | 10 +-
 src/cpu/simple/base.hh              | 30 ++--
 src/cpu/simple_thread.cc            | 16 ++
 src/cpu/static_inst.hh              | 38 ++--
 23 files changed, 406 insertions(+), 294 deletions(-)

diffs (truncated from 1355 to 300 lines):

diff -r 98771a936b61 -r 4207f9bfcceb SConstruct
--- a/SConstruct  Wed Sep 03 07:42:21 2014 -0400
+++ b/SConstruct  Wed Sep 03 07:42:22 2014 -0400
@@ -1025,17 +1025,10 @@
 
     # Dict of available CPU model objects. Accessible as CpuModel.dict.
     dict = {}
-    list = []
-    defaults = []
 
     # Constructor. Automatically adds models to CpuModel.dict.
-    def __init__(self, name, filename, includes, strings, default=False):
+    def __init__(self, name, default=False):
         self.name = name # name of model
-        self.filename = filename # filename for output exec code
-        self.includes = includes # include files needed in exec file
-        # The 'strings' dict holds all the per-CPU symbols we can
-        # substitute into templates etc.
-        self.strings = strings
 
         # This cpu is enabled by default
         self.default = default
@@ -1044,7 +1037,6 @@
         if name in CpuModel.dict:
             raise AttributeError, "CpuModel '%s' already registered" % name
         CpuModel.dict[name] = self
-        CpuModel.list.append(name)
 
 Export('CpuModel')
 
@@ -1086,7 +1078,7 @@
     EnumVariable('TARGET_ISA', 'Target ISA', 'alpha', all_isa_list),
     ListVariable('CPU_MODELS', 'CPU models',
                  sorted(n for n,m in CpuModel.dict.iteritems() if m.default),
-                 sorted(CpuModel.list)),
+                 sorted(CpuModel.dict.keys())),
     BoolVariable('EFENCE', 'Link with Electric Fence malloc debugger',
                  False),
     BoolVariable('SS_COMPATIBLE_FP',
diff -r 98771a936b61 -r 4207f9bfcceb src/arch/SConscript
--- a/src/arch/SConscript  Wed Sep 03 07:42:21 2014 -0400
+++ b/src/arch/SConscript  Wed Sep 03 07:42:22 2014 -0400
@@ -95,13 +95,11 @@
 # The emitter patches up the sources & targets to include the
 # autogenerated files as targets and isa parser itself as a source.
 def isa_desc_emitter(target, source, env):
-    cpu_models = list(env['CPU_MODELS'])
-    cpu_models.append('CheckerCPU')
-
     # List the isa parser as a source.
-    source += [ isa_parser ]
-    # Add in the CPU models.
-    source += [ Value(m) for m in cpu_models ]
+    source += [
+        isa_parser,
+        Value("ExecContext"),
+        ]
 
     # Specify different targets depending on if we're running the ISA
     # parser for its dependency information, or for the generated files.
@@ -137,8 +135,7 @@
 
     # Skip over the ISA description itself and the parser to the CPU models.
     models = [ s.get_contents() for s in source[2:] ]
-    cpu_models = [CpuModel.dict[cpu] for cpu in models]
-
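To make the trade-off in the description concrete, here is a small sketch of ISA-style code compiled once against an abstract execution-context interface, with a CPU model supplying the implementation through virtual calls. The class and method names (ExecContextSketch, SimpleCpuSketch, executeAdd) are invented for this example; the real ExecContext in src/cpu/exec_context.hh has a far larger interface that is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, heavily reduced interface between ISA code and CPU models.
class ExecContextSketch {
  public:
    virtual ~ExecContextSketch() {}
    virtual uint64_t readIntReg(int idx) const = 0;
    virtual void setIntReg(int idx, uint64_t val) = 0;
};

// "ISA code": written and compiled once against the interface rather than
// being regenerated for every CPU model.
inline void executeAdd(ExecContextSketch &xc, int dest, int src1, int src2) {
    xc.setIntReg(dest, xc.readIntReg(src1) + xc.readIntReg(src2));
}

// One possible CPU model; any other model would implement the same methods.
class SimpleCpuSketch : public ExecContextSketch {
  public:
    SimpleCpuSketch() : regs() {}
    uint64_t readIntReg(int idx) const { return regs[idx]; }
    void setIntReg(int idx, uint64_t val) { regs[idx] = val; }

  private:
    uint64_t regs[8];
};
```

The cost is an indirect call per register access, which is the overhead the description argues is not measurable in practice; the gain is that executeAdd exists in one compiled copy.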
[gem5-dev] changeset in gem5: mem: Packet queue clean up
changeset 7f4059e4f2d5 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=7f4059e4f2d5
description:
mem: Packet queue clean up

No change in functionality, just a bit of tidying up.

diffstat:

 src/mem/packet_queue.cc | 20 +++-
 src/mem/packet_queue.hh | 5 ++---
 2 files changed, 9 insertions(+), 16 deletions(-)

diffs (84 lines):

diff -r 72890a571a7b -r 7f4059e4f2d5 src/mem/packet_queue.cc
--- a/src/mem/packet_queue.cc  Wed Sep 03 07:42:27 2014 -0400
+++ b/src/mem/packet_queue.cc  Wed Sep 03 07:42:28 2014 -0400
@@ -71,11 +71,10 @@
 {
     pkt->pushLabel(label);
 
-    DeferredPacketIterator i = transmitList.begin();
-    DeferredPacketIterator end = transmitList.end();
+    auto i = transmitList.begin();
     bool found = false;
 
-    while (!found && i != end) {
+    while (!found && i != transmitList.end()) {
         // If the buffered packet contains data, and it overlaps the
         // current packet, then update data
         found = pkt->checkFunctional(i->pkt);
@@ -140,7 +139,7 @@
     }
 
     // this belongs in the middle somewhere, insertion sort
-    DeferredPacketIterator i = transmitList.begin();
+    auto i = transmitList.begin();
     ++i; // already checked for insertion at front
     while (i != transmitList.end() && when >= i->tick)
         ++i;
@@ -151,21 +150,16 @@
 {
     assert(deferredPacketReady());
 
-    // take the next packet off the list here, as we might return to
-    // ourselves through the sendTiming call below
     DeferredPacket dp = transmitList.front();
-    transmitList.pop_front();
 
     // use the appropriate implementation of sendTiming based on the
     // type of port associated with the queue, and whether the packet
     // is to be sent as a snoop or not
     waitingOnRetry = !sendTiming(dp.pkt, dp.sendAsSnoop);
 
-    if (waitingOnRetry) {
-        // put the packet back at the front of the list (packet should
-        // not have changed since it wasn't accepted)
-        assert(!sendEvent.scheduled());
-        transmitList.push_front(dp);
+    if (!waitingOnRetry) {
+        // take the packet off the list
+        transmitList.pop_front();
     }
 }
 
@@ -216,7 +210,7 @@
 unsigned int
 PacketQueue::drain(DrainManager *dm)
 {
-    if (transmitList.empty() && !sendEvent.scheduled())
+    if (transmitList.empty())
         return 0;
     DPRINTF(Drain, "PacketQueue not drained\n");
     drainManager = dm;
diff -r 72890a571a7b -r 7f4059e4f2d5 src/mem/packet_queue.hh
--- a/src/mem/packet_queue.hh  Wed Sep 03 07:42:27 2014 -0400
+++ b/src/mem/packet_queue.hh  Wed Sep 03 07:42:28 2014 -0400
@@ -78,7 +78,6 @@
     };
 
     typedef std::list<DeferredPacket> DeferredPacketList;
-    typedef std::list<DeferredPacket>::iterator DeferredPacketIterator;
 
     /** A list of outgoing timing response packets that haven't been
      * serviced yet. */
@@ -109,10 +108,10 @@
     bool waitingOnRetry;
 
     /** Check whether we have a packet ready to go on the transmit list. */
-    bool deferredPacketReady()
+    bool deferredPacketReady() const
     { return !transmitList.empty() && transmitList.front().tick <= curTick(); }
 
-    Tick deferredPacketReadyTime()
+    Tick deferredPacketReadyTime() const
     { return transmitList.empty() ? MaxTick : transmitList.front().tick; }
 
     /**
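The main behavioural detail in the third hunk is the order of operations: the packet now stays at the head of the list while the send is attempted and is popped only once it has been accepted. A reduced sketch of that ordering follows; MiniQueue and trySend are hypothetical names, ints stand in for packets, and the accept flag stands in for the result of sendTiming.

```cpp
#include <cassert>
#include <list>

// Hypothetical miniature of the transmit list handling.
struct MiniQueue {
    std::list<int> transmitList;
    bool waitingOnRetry;

    MiniQueue() : waitingOnRetry(false) {}

    // Attempt to send the head packet. It is removed only if the send was
    // accepted, so a rejected packet never has to be pushed back.
    void trySend(bool accept) {
        assert(!transmitList.empty());
        waitingOnRetry = !accept;
        if (!waitingOnRetry)
            transmitList.pop_front();
    }
};
```

Compared with pop-then-push-back, this keeps the list contents unchanged on the failure path, which is presumably why the drain check in the fourth hunk can rely on transmitList.empty() alone.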
[gem5-dev] changeset in gem5: cpu: Change writeback modeling for outstandin...
changeset 5b6279635c49 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=5b6279635c49
description:
cpu: Change writeback modeling for outstanding instructions

As highlighted on the mailing list, gem5's writeback modeling can impact
performance. This patch removes the limitation on maximum outstanding
issued instructions; however, the number that can write back in a single
cycle is still respected in instToCommit().

diffstat:

 configs/common/O3_ARM_v7a.py  | 1 -
 src/cpu/o3/O3CPU.py           | 1 -
 src/cpu/o3/iew.hh             | 53 ---
 src/cpu/o3/iew_impl.hh        | 10
 src/cpu/o3/inst_queue_impl.hh | 2 -
 src/cpu/o3/lsq_unit.hh        | 7 -
 src/cpu/o3/lsq_unit_impl.hh   | 5 +---
 7 files changed, 1 insertions(+), 78 deletions(-)

diffs (210 lines):

diff -r 43516d8eabe9 -r 5b6279635c49 configs/common/O3_ARM_v7a.py
--- a/configs/common/O3_ARM_v7a.py  Wed Sep 03 07:42:32 2014 -0400
+++ b/configs/common/O3_ARM_v7a.py  Wed Sep 03 07:42:33 2014 -0400
@@ -126,7 +126,6 @@
     dispatchWidth = 6
     issueWidth = 8
     wbWidth = 8
-    wbDepth = 1
     fuPool = O3_ARM_v7a_FUP()
     iewToCommitDelay = 1
     renameToROBDelay = 1
diff -r 43516d8eabe9 -r 5b6279635c49 src/cpu/o3/O3CPU.py
--- a/src/cpu/o3/O3CPU.py  Wed Sep 03 07:42:32 2014 -0400
+++ b/src/cpu/o3/O3CPU.py  Wed Sep 03 07:42:33 2014 -0400
@@ -84,7 +84,6 @@
     dispatchWidth = Param.Unsigned(8, "Dispatch width")
     issueWidth = Param.Unsigned(8, "Issue width")
     wbWidth = Param.Unsigned(8, "Writeback width")
-    wbDepth = Param.Unsigned(1, "Writeback depth")
     fuPool = Param.FUPool(DefaultFUPool(), "Functional Unit pool")
 
     iewToCommitDelay = Param.Cycles(1, "Issue/Execute/Writeback to commit "
diff -r 43516d8eabe9 -r 5b6279635c49 src/cpu/o3/iew.hh
--- a/src/cpu/o3/iew.hh  Wed Sep 03 07:42:32 2014 -0400
+++ b/src/cpu/o3/iew.hh  Wed Sep 03 07:42:33 2014 -0400
@@ -219,49 +219,6 @@
     /** Returns if the LSQ has any stores to writeback. */
     bool hasStoresToWB(ThreadID tid) { return ldstQueue.hasStoresToWB(tid); }
 
-    void incrWb(InstSeqNum sn)
-    {
-        ++wbOutstanding;
-        if (wbOutstanding == wbMax)
-            ableToIssue = false;
-        DPRINTF(IEW, "wbOutstanding: %i [sn:%lli]\n", wbOutstanding, sn);
-        assert(wbOutstanding <= wbMax);
-#ifdef DEBUG
-        wbList.insert(sn);
-#endif
-    }
-
-    void decrWb(InstSeqNum sn)
-    {
-        if (wbOutstanding == wbMax)
-            ableToIssue = true;
-        wbOutstanding--;
-        DPRINTF(IEW, "wbOutstanding: %i [sn:%lli]\n", wbOutstanding, sn);
-        assert(wbOutstanding >= 0);
-#ifdef DEBUG
-        assert(wbList.find(sn) != wbList.end());
-        wbList.erase(sn);
-#endif
-    }
-
-#ifdef DEBUG
-    std::set<InstSeqNum> wbList;
-
-    void dumpWb()
-    {
-        std::set<InstSeqNum>::iterator wb_it = wbList.begin();
-        while (wb_it != wbList.end()) {
-            cprintf("[sn:%lli]\n",
-                    (*wb_it));
-            wb_it++;
-        }
-    }
-#endif
-
-    bool canIssue() { return ableToIssue; }
-
-    bool ableToIssue;
-
     /** Check misprediction */
     void checkMisprediction(DynInstPtr inst);
 
@@ -452,19 +409,9 @@
      */
     unsigned wbCycle;
 
-    /** Number of instructions in flight that will writeback. */
-
-    /** Number of instructions in flight that will writeback. */
-    int wbOutstanding;
-
     /** Writeback width. */
     unsigned wbWidth;
 
-    /** Writeback width * writeback depth, where writeback depth is
-     * the number of cycles of writing back instructions that can be
-     * buffered. */
-    unsigned wbMax;
-
     /** Number of active threads. */
     ThreadID numThreads;
 
diff -r 43516d8eabe9 -r 5b6279635c49 src/cpu/o3/iew_impl.hh
--- a/src/cpu/o3/iew_impl.hh  Wed Sep 03 07:42:32 2014 -0400
+++ b/src/cpu/o3/iew_impl.hh  Wed Sep 03 07:42:33 2014 -0400
@@ -76,7 +76,6 @@
       issueToExecuteDelay(params->issueToExecuteDelay),
       dispatchWidth(params->dispatchWidth),
       issueWidth(params->issueWidth),
-      wbOutstanding(0),
       wbWidth(params->wbWidth),
       numThreads(params->numThreads)
 {
@@ -109,12 +108,8 @@
         fetchRedirect[tid] = false;
     }
 
-    wbMax = wbWidth * params->wbDepth;
-
     updateLSQNextCycle = false;
 
-    ableToIssue = true;
-
     skidBufferMax = (3 * (renameToIEWDelay * params->renameWidth)) + issueWidth;
 }
 
@@ -635,8 +630,6 @@
             ++wbCycle;
             wbNumInst = 0;
         }
-
-        assert((wbCycle * wbWidth + wbNumInst) <= wbMax);
     }
 
     DPRINTF(IEW, "Current wb cycle: %i, width: %i, numInst: %i\nwbActual:%i\n",
@@ -1263,7 +1256,6 @@
 
             ++iewExecSquashedInsts;
 
-            decrWb(inst->seqNum);
             continue;
         }
 
@@ -1502,8 +1494,6 @@
                 }
                 writebackCount[tid]++;
             }
-
-            decrWb(inst->seqNum);
         }
     }
 
diff -r 43516d8eabe9 -r
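A rough sketch of what remains after this patch: there is no cap on the total number of outstanding instructions, but each writeback cycle still holds at most wbWidth of them. The allocator below reuses the wbCycle and wbNumInst names visible in the hunks above, but the WbSlots type and nextSlot function are made up for illustration and this is a simplification, not the actual instToCommit() code. It assumes a width of at least 1.

```cpp
#include <cassert>

// Hypothetical slot allocator: each call returns the cycle offset in which
// the next instruction may write back.
struct WbSlots {
    unsigned wbWidth;    // maximum writebacks per cycle
    unsigned wbCycle;    // cycle offset currently being filled
    unsigned wbNumInst;  // instructions already placed in that cycle

    explicit WbSlots(unsigned width)
        : wbWidth(width), wbCycle(0), wbNumInst(0) {}

    unsigned nextSlot() {
        if (wbNumInst == wbWidth) {  // current cycle is full, use the next
            ++wbCycle;
            wbNumInst = 0;
        }
        ++wbNumInst;
        return wbCycle;
    }
};
```

Note that nothing here bounds wbCycle, which mirrors the removal of wbMax and of the assertion that compared against it.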
[gem5-dev] changeset in gem5: cache: Fix handling of LL/SC requests under c...
changeset 7aacec2a247d in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=7aacec2a247d
description:
cache: Fix handling of LL/SC requests under contention

If a set of LL/SC requests contend on the same cache block we can get
into a situation where CPUs will deadlock if they expect a failed SC to
supply them data. This case happens where 3 or more cores are contending
for a cache block using LL/SC and the system is configured where 2 cores
are connected to a local bus and the third is connected to a remote bus.
If a core on the local bus sends an SCUpgrade and the core on the remote
bus sends an SCUpgrade they will race to see who will win the SC access.
In the meantime if the other core appends a read to one of the
SCUpgrades it will expect to be supplied data by that SCUpgrade
transaction. If it happens that the SCUpgrade that was picked to supply
the data is failed, it will drop the appended request for data and never
respond, leaving the requesting core to deadlock.

This patch makes all SC's behave as normal stores to prevent this case
but still makes sure to check whether it can perform the update.

diffstat:

 src/mem/cache/cache_impl.hh | 28 ++--
 src/mem/packet.cc           | 15 ---
 2 files changed, 22 insertions(+), 21 deletions(-)

diffs (96 lines):

diff -r f40134eb3f85 -r 7aacec2a247d src/mem/cache/cache_impl.hh
--- a/src/mem/cache/cache_impl.hh  Tue May 27 11:00:56 2014 -0500
+++ b/src/mem/cache/cache_impl.hh  Wed Sep 03 07:42:31 2014 -0400
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2010-2013 ARM Limited
+ * Copyright (c) 2010-2014 ARM Limited
  * All rights reserved.
  *
  * The license below extends only to copyright in the software and shall
@@ -166,8 +166,12 @@
     } else if (pkt->isWrite()) {
         if (blk->checkWrite(pkt)) {
             pkt->writeDataToBlock(blk->data, blkSize);
-            blk->status |= BlkDirty;
         }
+        // Always mark the line as dirty even if we are a failed
+        // StoreCond so we supply data to any snoops that have
+        // appended themselves to this cache before knowing the store
+        // will fail.
+        blk->status |= BlkDirty;
     } else if (pkt->isRead()) {
         if (pkt->isLLSC()) {
             blk->trackLoadLocked(pkt);
@@ -658,6 +662,13 @@
         // (read-only) and we need exclusive
         assert(needsExclusive && !blk->isWritable());
         cmd = cpu_pkt->isLLSC() ? MemCmd::SCUpgradeReq : MemCmd::UpgradeReq;
+    } else if (cpu_pkt->cmd == MemCmd::SCUpgradeFailReq ||
+               cpu_pkt->cmd == MemCmd::StoreCondFailReq) {
+        // Even though this SC will fail, we still need to send out the
+        // request and get the data to supply it to other snoopers in the case
+        // where the determination the StoreCond fails is delayed due to
+        // all caches not being on the same local bus.
+        cmd = MemCmd::SCUpgradeFailReq;
     } else {
         // block is invalid
         cmd = needsExclusive ? MemCmd::ReadExReq : MemCmd::ReadReq;
@@ -1724,18 +1735,7 @@
     DPRINTF(CachePort, "%s %s for address %x size %d\n", __func__,
             tgt_pkt->cmdString(), tgt_pkt->getAddr(), tgt_pkt->getSize());
 
-    if (tgt_pkt->cmd == MemCmd::SCUpgradeFailReq ||
-        tgt_pkt->cmd == MemCmd::StoreCondFailReq) {
-        // SCUpgradeReq or StoreCondReq saw invalidation while queued
-        // in MSHR, so now that we are getting around to processing
-        // it, just treat it as if we got a failure response
-        pkt = new Packet(tgt_pkt);
-        pkt->cmd = MemCmd::UpgradeFailResp;
-        pkt->senderState = mshr;
-        pkt->busFirstWordDelay = pkt->busLastWordDelay = 0;
-        recvTimingResp(pkt);
-        return NULL;
-    } else if (mshr->isForwardNoResponse()) {
+    if (mshr->isForwardNoResponse()) {
         // no response expected, just forward packet as it is
         assert(tags->findBlock(mshr->addr, mshr->isSecure) == NULL);
         pkt = tgt_pkt;
diff -r f40134eb3f85 -r 7aacec2a247d src/mem/packet.cc
--- a/src/mem/packet.cc  Tue May 27 11:00:56 2014 -0500
+++ b/src/mem/packet.cc  Wed Sep 03 07:42:31 2014 -0400
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2011-2013 ARM Limited
+ * Copyright (c) 2011-2014 ARM Limited
  * All rights reserved
  *
  * The license below extends only to copyright in the software and shall
@@ -115,12 +115,13 @@
     /* UpgradeResp */
     { SET3(NeedsExclusive, IsUpgrade, IsResponse),
             InvalidCmd, "UpgradeResp" },
-    /* SCUpgradeFailReq: generates UpgradeFailResp ASAP */
-    { SET5(IsInvalidate, NeedsExclusive, IsLlsc,
-           IsRequest, NeedsResponse),
+    /* SCUpgradeFailReq: generates UpgradeFailResp but still gets the data */
+    { SET6(IsRead, NeedsExclusive, IsInvalidate,
+           IsLlsc, IsRequest, NeedsResponse),
             UpgradeFailResp, "SCUpgradeFailReq" },
-
[gem5-dev] changeset in gem5: mem: Add utility script to plot DRAM efficien...
changeset 5169ebd26163 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=5169ebd26163
description:
mem: Add utility script to plot DRAM efficiency sweep

This patch adds basic functionality to quickly visualise the output from the DRAM efficiency script. There are some unfortunate hacks needed to communicate the needed information from one script to the other, and we fall back on (ab)using the simout to do this.

As part of this patch we also trim the efficiency sweep to stop at 512 bytes, as this should be sufficient for all foreseeable DRAMs.

diffstat:

 configs/dram/sweep.py   |  13 +++-
 util/dram_sweep_plot.py | 151
 2 files changed, 161 insertions(+), 3 deletions(-)

diffs (185 lines):

diff -r 7f4059e4f2d5 -r 5169ebd26163 configs/dram/sweep.py
--- a/configs/dram/sweep.py	Wed Sep 03 07:42:28 2014 -0400
+++ b/configs/dram/sweep.py	Wed Sep 03 07:42:29 2014 -0400
@@ -124,12 +124,16 @@
 # assume we start at 0
 max_addr = mem_range.end

+# use min of the page size and 512 bytes as that should be more than
+# enough
+max_stride = min(512, page_size)
+
 # now we create the state by iterating over the stride size from burst
-# size to min of the page size and 1 kB, and from using only a single
-# bank up to the number of banks available
+# size to the max stride, and from using only a single bank up to the
+# number of banks available
 nxt_state = 0
 for bank in range(1, nbr_banks + 1):
-    for stride_size in range(burst_size, min(1024, page_size) + 1, burst_size):
+    for stride_size in range(burst_size, max_stride + 1, burst_size):
         cfg_file.write("STATE %d %d DRAM 100 0 %d "
                        "%d %d %d %d %d %d %d %d 1\n" %
                        (nxt_state, period, max_addr, burst_size, itt, itt, 0,
@@ -168,3 +172,6 @@
 m5.instantiate()
 m5.simulate(nxt_state * period)
+
+print "DRAM sweep with burst: %d, banks: %d, max stride: %d" % \
+    (burst_size, nbr_banks, max_stride)
diff -r 7f4059e4f2d5 -r 5169ebd26163 util/dram_sweep_plot.py
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/util/dram_sweep_plot.py	Wed Sep 03
07:42:29 2014 -0400 @@ -0,0 +1,151 @@ +#!/usr/bin/env python + +# Copyright (c) 2014 ARM Limited +# All rights reserved +# +# The license below extends only to copyright in the software and shall +# not be construed as granting a license to any other intellectual +# property including but not limited to intellectual property relating +# to a hardware implementation of the functionality of the software +# licensed hereunder. You may use the software subject to the license +# terms below provided that you ensure that this notice is replicated +# unmodified and in its entirety in all distributions of the software, +# modified or unmodified, in source code or in binary form. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions are +# met: redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer; +# redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution; +# neither the name of the copyright holders nor the names of its +# contributors may be used to endorse or promote products derived from +# this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +# +# Authors: Andreas Hansson + +try: + +from mpl_toolkits.mplot3d import Axes3D +from matplotlib import cm +from matplotlib.ticker import LinearLocator, FormatStrFormatter +import matplotlib.pyplot as plt +import numpy as np +except ImportError: +print Failed to import matplotlib and numpy +exit(-1) + +import sys +import re + +# Determine the parameters of the sweep from the simout output, and +# then parse the stats and plot the 3D surface corresponding to the +# different combinations of parallel banks, and stride size, as +# generated by the config/dram/sweep.py script +def main(): + +
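For readers who want to see which sweep states the trimmed range in configs/dram/sweep.py produces, the iteration can be sketched on its own. This is only an illustration: the function name `sweep_points` and the example values for `burst_size`, `page_size` and `nbr_banks` are assumptions, and only the `min(512, page_size)` cap and the two nested ranges are taken from the diff above.

```python
def sweep_points(burst_size, page_size, nbr_banks, stride_cap=512):
    """Return (max_stride, [(bank, stride_size), ...]) in sweep order.

    Mirrors the nested loops in the diff: banks from 1 to nbr_banks,
    strides from burst_size up to max_stride in steps of burst_size.
    """
    max_stride = min(stride_cap, page_size)
    points = [
        (bank, stride_size)
        for bank in range(1, nbr_banks + 1)
        for stride_size in range(burst_size, max_stride + 1, burst_size)
    ]
    return max_stride, points


if __name__ == "__main__":
    # Example values only; a real run takes them from the memory config.
    max_stride, points = sweep_points(burst_size=64, page_size=1024,
                                      nbr_banks=8)
    print("max stride: %d, states: %d" % (max_stride, len(points)))
```

Each (bank, stride) pair corresponds to one STATE line written to the traffic generator config, so capping the stride directly shortens the simulation.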
[gem5-dev] changeset in gem5: arm: support 16kb vm granules
changeset f40134eb3f85 in /z/repo/gem5 details: http://repo.gem5.org/gem5?cmd=changeset;node=f40134eb3f85 description: arm: support 16kb vm granules diffstat: src/arch/arm/miscregs.hh | 26 - src/arch/arm/table_walker.cc | 125 +- src/arch/arm/table_walker.hh | 56 +- 3 files changed, 139 insertions(+), 68 deletions(-) diffs (truncated from 363 to 300 lines): diff -r 5169ebd26163 -r f40134eb3f85 src/arch/arm/miscregs.hh --- a/src/arch/arm/miscregs.hh Wed Sep 03 07:42:29 2014 -0400 +++ b/src/arch/arm/miscregs.hh Tue May 27 11:00:56 2014 -0500 @@ -1,5 +1,5 @@ /* - * Copyright (c) 2010-2013 ARM Limited + * Copyright (c) 2010-2014 ARM Limited * All rights reserved * * The license below extends only to copyright in the software and shall @@ -1715,6 +1715,30 @@ Bitfield20 tbi; EndBitUnion(TTBCR) +// Fields of TCR_EL{1,2,3} (mostly overlapping) +// TCR_EL1 is natively 64 bits, the others are 32 bits +BitUnion64(TCR) +Bitfield5, 0 t0sz; +Bitfield7 epd0; // EL1 +Bitfield9, 8 irgn0; +Bitfield11, 10 orgn0; +Bitfield13, 12 sh0; +Bitfield15, 14 tg0; +Bitfield18, 16 ps; +Bitfield20 tbi; // EL2/EL3 +Bitfield21, 16 t1sz; // EL1 +Bitfield22 a1; // EL1 +Bitfield23 epd1; // EL1 +Bitfield25, 24 irgn1; // EL1 +Bitfield27, 26 orgn1; // EL1 +Bitfield29, 28 sh1; // EL1 +Bitfield31, 30 tg1; // EL1 +Bitfield34, 32 ips; // EL1 +Bitfield36 as; // EL1 +Bitfield37 tbi0; // EL1 +Bitfield38 tbi1; // EL1 +EndBitUnion(TCR) + BitUnion32(HTCR) Bitfield2, 0 t0sz; Bitfield9, 8 irgn0; diff -r 5169ebd26163 -r f40134eb3f85 src/arch/arm/table_walker.cc --- a/src/arch/arm/table_walker.cc Wed Sep 03 07:42:29 2014 -0400 +++ b/src/arch/arm/table_walker.cc Tue May 27 11:00:56 2014 -0500 @@ -1,5 +1,5 @@ /* - * Copyright (c) 2010, 2012-2013 ARM Limited + * Copyright (c) 2010, 2012-2014 ARM Limited * All rights reserved * * The license below extends only to copyright in the software and shall @@ -220,18 +220,18 @@ case EL0: case EL1: currState-sctlr = currState-tc-readMiscReg(MISCREG_SCTLR_EL1); -currState-ttbcr = 
currState-tc-readMiscReg(MISCREG_TCR_EL1); +currState-tcr = currState-tc-readMiscReg(MISCREG_TCR_EL1); break; // @todo: uncomment this to enable Virtualization // case EL2: // assert(haveVirtualization); // currState-sctlr = currState-tc-readMiscReg(MISCREG_SCTLR_EL2); - // currState-ttbcr = currState-tc-readMiscReg(MISCREG_TCR_EL2); + // currState-tcr = currState-tc-readMiscReg(MISCREG_TCR_EL2); // break; case EL3: assert(haveSecurity); currState-sctlr = currState-tc-readMiscReg(MISCREG_SCTLR_EL3); -currState-ttbcr = currState-tc-readMiscReg(MISCREG_TCR_EL3); +currState-tcr = currState-tc-readMiscReg(MISCREG_TCR_EL3); break; default: panic(Invalid exception level); @@ -625,8 +625,7 @@ currState-longDesc.lookupLevel = start_lookup_level; currState-longDesc.aarch64 = false; -currState-longDesc.largeGrain = false; -currState-longDesc.grainSize = 12; +currState-longDesc.grainSize = Grain4KB; Event *event = start_lookup_level == L1 ? (Event *) doL1LongDescEvent : (Event *) doL2LongDescEvent; @@ -663,13 +662,18 @@ { assert(currState-aarch64); -DPRINTF(TLB, Beginning table walk for address %#llx, TTBCR: %#llx\n, -currState-vaddr_tainted, currState-ttbcr); +DPRINTF(TLB, Beginning table walk for address %#llx, TCR: %#llx\n, +currState-vaddr_tainted, currState-tcr); + +static const GrainSize GrainMapDefault[] = + { Grain4KB, Grain64KB, Grain16KB, ReservedGrain }; +static const GrainSize GrainMap_EL1_tg1[] = + { ReservedGrain, Grain16KB, Grain4KB, Grain64KB }; // Determine TTBR, table size, granule size and phys. 
address range Addr ttbr = 0; int tsz = 0, ps = 0; -bool large_grain = false; +GrainSize tg = Grain4KB; // grain size computed from tg* field bool fault = false; switch (currState-el) { case EL0: @@ -678,44 +682,44 @@ case 0: DPRINTF(TLB, - Selecting TTBR0 (AArch64)\n); ttbr = currState-tc-readMiscReg(MISCREG_TTBR0_EL1); -tsz = adjustTableSizeAArch64(64 - currState-ttbcr.t0sz); -large_grain = currState-ttbcr.tg0; +tsz = adjustTableSizeAArch64(64 - currState-tcr.t0sz); +tg = GrainMapDefault[currState-tcr.tg0]; if (bits(currState-vaddr, 63, tsz) != 0x0 || -currState-ttbcr.epd0) +
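The two lookup tables added in table_walker.cc show that the TG0 and TG1 fields use different encodings for the same granule sizes. A minimal sketch of that decoding follows, in Python rather than C++ for brevity. The function name `decode_granule` and the use of byte counts (with `None` for the reserved encoding) are assumptions for illustration; only the table ordering is taken from the diff.

```python
RESERVED = None  # stands in for ReservedGrain

# Index is the 2-bit tg field value; order copied from the diff.
GRAIN_MAP_DEFAULT = (4096, 65536, 16384, RESERVED)   # GrainMapDefault
GRAIN_MAP_EL1_TG1 = (RESERVED, 16384, 4096, 65536)   # GrainMap_EL1_tg1


def decode_granule(tg_field, is_el1_tg1=False):
    """Map a 2-bit TCR tg field to a granule size in bytes.

    Returns None for a reserved encoding, which the walker would have
    to treat as a fault or fall back on some defined behaviour.
    """
    table = GRAIN_MAP_EL1_TG1 if is_el1_tg1 else GRAIN_MAP_DEFAULT
    return table[tg_field & 0x3]


if __name__ == "__main__":
    for tg in range(4):
        print(tg, decode_granule(tg), decode_granule(tg, is_el1_tg1=True))
```

Note that the value 2 selects 16 KB through the default table but 4 KB through the EL1 tg1 table, which is why a single boolean `large_grain` was no longer enough.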
[gem5-dev] changeset in gem5: config: Change parsing of Addr so hex values ...
changeset 19f5df7ac6a1 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=19f5df7ac6a1
description:
config: Change parsing of Addr so hex values work from scripts

When passed from a configuration script with a hexadecimal value (like 0x8000), gem5 would error out. This is because it would call toMemorySize which requires the argument to end with a size specifier (like 1MB, etc).

This modification makes it so raw hex values can be passed through Addr parameters from the configuration scripts.

diffstat:

 src/arch/arm/ArmSystem.py |   2 +-
 src/python/m5/params.py   |  12 ++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diffs (35 lines):

diff -r d2850235e31c -r 19f5df7ac6a1 src/arch/arm/ArmSystem.py
--- a/src/arch/arm/ArmSystem.py	Wed Sep 03 07:42:19 2014 -0400
+++ b/src/arch/arm/ArmSystem.py	Wed Sep 03 07:42:20 2014 -0400
@@ -65,7 +65,7 @@
     highest_el_is_64 = Param.Bool(False,
         "True if the register width of the highest implemented exception level "
         "is 64 bits (ARMv8)")
-    reset_addr_64 = Param.UInt64(0x0,
+    reset_addr_64 = Param.Addr(0x0,
         "Reset address if the highest implemented exception level is 64 bits "
         "(ARMv8)")
     phys_addr_range_64 = Param.UInt8(40,
diff -r d2850235e31c -r 19f5df7ac6a1 src/python/m5/params.py
--- a/src/python/m5/params.py	Wed Sep 03 07:42:19 2014 -0400
+++ b/src/python/m5/params.py	Wed Sep 03 07:42:20 2014 -0400
@@ -626,9 +626,17 @@
             self.value = value.value
         else:
             try:
+                # Often addresses are referred to with sizes. Ex: A device
+                # base address is at "512MB".  Use toMemorySize() to convert
+                # these into addresses. If the address is not specified with a
+                # "size", an exception will occur and numeric translation will
+                # proceed below.
                 self.value = convert.toMemorySize(value)
-            except TypeError:
-                self.value = long(value)
+            except (TypeError, ValueError):
+                # Convert number to string and use long() to do automatic
+                # base conversion (requires base=0 for auto-conversion)
+                self.value = long(str(value), base=0)
+
         self._check()
     def __add__(self, other):
         if isinstance(other, Addr):
_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev
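The fallback order in this changeset (try a size string first, then fall back to base-0 integer parsing) can be tried outside gem5 with a small sketch. Everything here is a stand-in: `to_memory_size` is a hypothetical, much simplified substitute for `convert.toMemorySize`, `parse_addr` is an invented name, and the sketch uses Python 3 `int` where the patch uses Python 2 `long`.

```python
_UNITS = {"kB": 1 << 10, "MB": 1 << 20, "GB": 1 << 30}


def to_memory_size(value):
    """Hypothetical stand-in for convert.toMemorySize.

    Accepts only strings that end in a unit suffix; anything else
    raises, which is what triggers the numeric fallback below.
    """
    if not isinstance(value, str):
        raise TypeError("expected a string, got %r" % type(value))
    for suffix, scale in _UNITS.items():
        if value.endswith(suffix):
            return int(value[:-len(suffix)]) * scale
    raise ValueError("no size suffix in %r" % value)


def parse_addr(value):
    """Parse an address given as a size string, a hex string or a number."""
    try:
        return to_memory_size(value)
    except (TypeError, ValueError):
        # Base 0 lets int() honour prefixes such as 0x, 0o and 0b.
        return int(str(value), 0)


if __name__ == "__main__":
    for example in ("512MB", "0x8000", 0x8000, "4096"):
        print(repr(example), "->", hex(parse_addr(example)))
```

Catching `ValueError` as well as `TypeError` is the key point: a string such as "0x8000" is the right type but has no size suffix, so the old `except TypeError` never reached the numeric path.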
[gem5-dev] changeset in gem5: dev: Avoid invalid sized reads in PL390 with ...
changeset 72890a571a7b in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=72890a571a7b
description:
dev: Avoid invalid sized reads in PL390 with DPRINTF enabled

The first DPRINTF() in PL390::writeDistributor always read a uint32_t, though a packet may have only been 1 or 2 bytes. This caused an assertion in packet->get().

diffstat:

 src/dev/arm/gic_pl390.cc |  19 ++-
 1 files changed, 18 insertions(+), 1 deletions(-)

diffs (30 lines):

diff -r 82a4fa2d19a0 -r 72890a571a7b src/dev/arm/gic_pl390.cc
--- a/src/dev/arm/gic_pl390.cc	Wed Sep 03 07:42:25 2014 -0400
+++ b/src/dev/arm/gic_pl390.cc	Wed Sep 03 07:42:27 2014 -0400
@@ -395,8 +395,25 @@
     assert(pkt->req->hasContextId());
     int ctx_id = pkt->req->contextId();

+    uint32_t pkt_data M5_VAR_USED;
+    switch (pkt->getSize())
+    {
+      case 1:
+        pkt_data = pkt->get<uint8_t>();
+        break;
+      case 2:
+        pkt_data = pkt->get<uint16_t>();
+        break;
+      case 4:
+        pkt_data = pkt->get<uint32_t>();
+        break;
+      default:
+        panic("Invalid size when writing to priority regs in Gic: %d\n",
+              pkt->getSize());
+    }
+
     DPRINTF(GIC, "gic distributor write register %#x size %#x value %#x \n",
-            daddr, pkt->getSize(), pkt_data);
+            daddr, pkt->getSize(), pkt_data);

     if (daddr >= ICDISER_ST && daddr < ICDISER_ED + 4) {
         assert((daddr-ICDISER_ST) >> 2 < 32);
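The idea behind the fix, choosing the read width from the payload size instead of always reading 32 bits, can be shown with a short Python sketch. This is not gem5 code: `read_packet_value` is a hypothetical helper, the payload is modelled as a `bytes` object, and little-endian byte order is an assumption made for the example.

```python
import struct

# Payload size in bytes -> struct format (little-endian is assumed here).
_FORMATS = {1: "<B", 2: "<H", 4: "<I"}


def read_packet_value(payload):
    """Return the payload as an unsigned integer, sized by its length.

    Sizes other than 1, 2 or 4 bytes are rejected, mirroring the
    default branch of the switch in the patch.
    """
    fmt = _FORMATS.get(len(payload))
    if fmt is None:
        raise ValueError("Invalid size when writing to priority regs: %d"
                         % len(payload))
    return struct.unpack(fmt, payload)[0]


if __name__ == "__main__":
    for data in (b"\x12", b"\x34\x12", b"\x78\x56\x34\x12"):
        print(len(data), hex(read_packet_value(data)))
```

The value is widened into one variable before logging, so the debug statement no longer performs a read that is larger than the packet.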
[gem5-dev] changeset in gem5: arch: Properly guess OpClass from optional St...
changeset 43516d8eabe9 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=43516d8eabe9
description:
arch: Properly guess OpClass from optional StaticInst flags

isa_parser.py guesses the OpClass if none were given based upon the StaticInst flags. The existing code does not take into account optionally set flags. This code hoists the setting of optional flags so OpClass is properly assigned.

diffstat:

 src/arch/isa_parser.py |  36 +---
 1 files changed, 25 insertions(+), 11 deletions(-)

diffs (57 lines):

diff -r 7aacec2a247d -r 43516d8eabe9 src/arch/isa_parser.py
--- a/src/arch/isa_parser.py	Wed Sep 03 07:42:31 2014 -0400
+++ b/src/arch/isa_parser.py	Wed Sep 03 07:42:32 2014 -0400
@@ -1,3 +1,15 @@
+# Copyright (c) 2014 ARM Limited
+# All rights reserved
+#
+# The license below extends only to copyright in the software and shall
+# not be construed as granting a license to any other intellectual
+# property including but not limited to intellectual property relating
+# to a hardware implementation of the functionality of the software
+# licensed hereunder.  You may use the software subject to the license
+# terms below provided that you ensure that this notice is replicated
+# unmodified and in its entirety in all distributions of the software,
+# modified or unmodified, in source code or in binary form.
+#
 # Copyright (c) 2003-2005 The Regents of The University of Michigan
 # Copyright (c) 2013 Advanced Micro Devices, Inc.
 # All rights reserved.
@@ -1119,17 +1131,7 @@

         self.flags = self.operands.concatAttrLists('flags')

-        # Make a basic guess on the operand class (function unit type).
-        # These are good enough for most cases, and can be overridden
-        # later otherwise.
-        if 'IsStore' in self.flags:
-            self.op_class = 'MemWriteOp'
-        elif 'IsLoad' in self.flags or 'IsPrefetch' in self.flags:
-            self.op_class = 'MemReadOp'
-        elif 'IsFloating' in self.flags:
-            self.op_class = 'FloatAddOp'
-        else:
-            self.op_class = 'IntAluOp'
+        self.op_class = None

         # Optional arguments are assumed to be either StaticInst flags
         # or an OpClass value.  To avoid having to import a complete
@@ -1144,6 +1146,18 @@
                 error('InstObjParams: optional arg %s not recognized '
                       'as StaticInst::Flag or OpClass.' % oa)

+        # Make a basic guess on the operand class if not set.
+        # These are good enough for most cases.
+        if not self.op_class:
+            if 'IsStore' in self.flags:
+                self.op_class = 'MemWriteOp'
+            elif 'IsLoad' in self.flags or 'IsPrefetch' in self.flags:
+                self.op_class = 'MemReadOp'
+            elif 'IsFloating' in self.flags:
+                self.op_class = 'FloatAddOp'
+            else:
+                self.op_class = 'IntAluOp'
+
         # add flag initialization to contructor here to include
         # any flags added via opt_args
         self.constructor += makeFlagConstructor(self.flags)
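The effect of the reordering is easiest to see in isolation: the guess has to run after the optional arguments have been merged into the flag list, and must not override an explicitly given OpClass. A minimal sketch follows, assuming hypothetical helper names (`guess_op_class`, `resolve`) and caller-supplied sets of known flags and op classes; the guessing rules themselves are the ones in the diff.

```python
def guess_op_class(flags, op_class=None):
    """Pick an OpClass from the flags unless one was given explicitly."""
    if op_class:
        return op_class
    if "IsStore" in flags:
        return "MemWriteOp"
    if "IsLoad" in flags or "IsPrefetch" in flags:
        return "MemReadOp"
    if "IsFloating" in flags:
        return "FloatAddOp"
    return "IntAluOp"


def resolve(operand_flags, opt_args, known_flags, known_op_classes):
    """Merge optional args first, then guess: the order the patch adopts."""
    flags = list(operand_flags)
    op_class = None
    for arg in opt_args:
        if arg in known_flags:
            flags.append(arg)
        elif arg in known_op_classes:
            op_class = arg
        else:
            raise ValueError("optional arg %s not recognized" % arg)
    return flags, guess_op_class(flags, op_class)


if __name__ == "__main__":
    known_flags = {"IsStore", "IsLoad", "IsPrefetch", "IsFloating"}
    known_ops = {"FloatMultOp"}
    # A store flagged only through an optional argument is now classified.
    print(resolve([], ["IsStore"], known_flags, known_ops))
```

With the old order, the first call above would have been classified as IntAluOp, because the guess ran before "IsStore" was added to the flags.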
[gem5-dev] changeset in gem5: cpu: Fix SMT scheduling issue with the O3 cpu
changeset ed05298e8566 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=ed05298e8566
description:
cpu: Fix SMT scheduling issue with the O3 cpu

The o3 cpu could attempt to schedule inactive threads under round-robin SMT mode.

This is because it maintained a priority list of threads that was independent of the active thread list. This priority list could become stale once threads were inactive, leading to the cpu trying to fetch/commit from inactive threads.

Additionally, the fetch queue is now forcibly flushed of instructions from the de-scheduled thread.

Relevant output:

24557000: system.cpu: [tid:1]: Calling deactivate thread.
24557000: system.cpu: [tid:1]: Removing from active threads list
24557500: system.cpu: FullO3CPU: Ticking main, FullO3CPU.
24557500: system.cpu.fetch: Running stage.
24557500: system.cpu.fetch: Attempting to fetch from [tid:1]

diffstat:

 src/cpu/o3/O3CPU.py       |    3 +-
 src/cpu/o3/commit.hh      |    5 +-
 src/cpu/o3/commit_impl.hh |   15 +-
 src/cpu/o3/cpu.cc         |    5 +-
 src/cpu/o3/fetch.hh       |    6 +-
 src/cpu/o3/fetch_impl.hh  |  109 +
 6 files changed, 99 insertions(+), 44 deletions(-)

diffs (truncated from 306 to 300 lines):

diff -r f54586c894e3 -r ed05298e8566 src/cpu/o3/O3CPU.py
--- a/src/cpu/o3/O3CPU.py	Wed Sep 03 07:42:36 2014 -0400
+++ b/src/cpu/o3/O3CPU.py	Wed Sep 03 07:42:37 2014 -0400
@@ -61,7 +61,8 @@
     commitToFetchDelay = Param.Cycles(1, "Commit to fetch delay")
     fetchWidth = Param.Unsigned(8, "Fetch width")
     fetchBufferSize = Param.Unsigned(64, "Fetch buffer size in bytes")
-    fetchQueueSize = Param.Unsigned(32, "Fetch queue size in micro-ops")
+    fetchQueueSize = Param.Unsigned(32, "Fetch queue size in micro-ops "
+                                    "per-thread")
     renameToDecodeDelay = Param.Cycles(1, "Rename to decode delay")
     iewToDecodeDelay = Param.Cycles(1, "Issue/Execute/Writeback to decode "
diff -r f54586c894e3 -r ed05298e8566 src/cpu/o3/commit.hh
--- a/src/cpu/o3/commit.hh	Wed Sep 03 07:42:36 2014 -0400
+++ b/src/cpu/o3/commit.hh	Wed Sep 03 07:42:37 2014 -0400
@@ -1,5 +1,5 @@
 /*
- * Copyright
(c) 2010-2012 ARM Limited + * Copyright (c) 2010-2012, 2014 ARM Limited * All rights reserved. * * The license below extends only to copyright in the software and shall @@ -218,6 +218,9 @@ /** Takes over from another CPU's thread. */ void takeOverFrom(); +/** Deschedules a thread from scheduling */ +void deactivateThread(ThreadID tid); + /** Ticks the commit stage, which tries to commit instructions. */ void tick(); diff -r f54586c894e3 -r ed05298e8566 src/cpu/o3/commit_impl.hh --- a/src/cpu/o3/commit_impl.hh Wed Sep 03 07:42:36 2014 -0400 +++ b/src/cpu/o3/commit_impl.hh Wed Sep 03 07:42:37 2014 -0400 @@ -1,5 +1,5 @@ /* - * Copyright (c) 2010-2013 ARM Limited + * Copyright (c) 2010-2014 ARM Limited * All rights reserved * * The license below extends only to copyright in the software and shall @@ -463,6 +463,19 @@ template class Impl void +DefaultCommitImpl::deactivateThread(ThreadID tid) +{ +listThreadID::iterator thread_it = std::find(priority_list.begin(), +priority_list.end(), tid); + +if (thread_it != priority_list.end()) { +priority_list.erase(thread_it); +} +} + + +template class Impl +void DefaultCommitImpl::updateStatus() { // reset ROB changed variable diff -r f54586c894e3 -r ed05298e8566 src/cpu/o3/cpu.cc --- a/src/cpu/o3/cpu.cc Wed Sep 03 07:42:36 2014 -0400 +++ b/src/cpu/o3/cpu.cc Wed Sep 03 07:42:37 2014 -0400 @@ -1,5 +1,5 @@ /* - * Copyright (c) 2011-2012 ARM Limited + * Copyright (c) 2011-2012, 2014 ARM Limited * Copyright (c) 2013 Advanced Micro Devices, Inc. * All rights reserved * @@ -728,6 +728,9 @@ tid); activeThreads.erase(thread_it); } + +fetch.deactivateThread(tid); +commit.deactivateThread(tid); } template class Impl diff -r f54586c894e3 -r ed05298e8566 src/cpu/o3/fetch.hh --- a/src/cpu/o3/fetch.hh Wed Sep 03 07:42:36 2014 -0400 +++ b/src/cpu/o3/fetch.hh Wed Sep 03 07:42:37 2014 -0400 @@ -255,6 +255,8 @@ /** Tells fetch to wake up from a quiesce instruction. 
*/ void wakeFromQuiesce(); +/** For priority-based fetch policies, need to keep update priorityList */ +void deactivateThread(ThreadID tid); private: /** Reset this pipeline stage */ void resetStage(); @@ -484,8 +486,8 @@ /** The size of the fetch queue in micro-ops */ unsigned fetchQueueSize; -/** Queue of fetched instructions */ -std::dequeDynInstPtr fetchQueue; +/** Queue of fetched instructions. Per-thread to prevent HoL blocking. */ +std::dequeDynInstPtr fetchQueue[Impl::MaxThreads]; /** Whether or not the fetch buffer data
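The stale-priority-list problem described in this changeset can be reproduced with a toy round-robin scheduler. The class below is an illustration only: `RoundRobinStage` and its methods are invented names, and the rotation policy is an assumption. The part that corresponds to the patch is `deactivate_thread`, which removes the thread from the priority list if it is present, as `DefaultCommit<Impl>::deactivateThread` does.

```python
class RoundRobinStage:
    """Toy pipeline stage that keeps its own round-robin priority list."""

    def __init__(self, thread_ids):
        self.priority_list = list(thread_ids)

    def deactivate_thread(self, tid):
        # Without this step the list goes stale and the stage keeps
        # selecting a thread that is no longer active.
        if tid in self.priority_list:
            self.priority_list.remove(tid)

    def next_thread(self):
        """Return the highest-priority thread and rotate it to the back."""
        if not self.priority_list:
            return None
        tid = self.priority_list.pop(0)
        self.priority_list.append(tid)
        return tid


if __name__ == "__main__":
    stage = RoundRobinStage([0, 1])
    stage.deactivate_thread(1)
    print([stage.next_thread() for _ in range(3)])
```

In the patch, the CPU calls the deactivate hook on both fetch and commit right after erasing the thread from its active list, so every stage-local list is updated at the same point.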
[gem5-dev] changeset in gem5: cpu: Add a fetch queue to the o3 cpu
changeset 12e3be8203a5 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=12e3be8203a5
description:
cpu: Add a fetch queue to the o3 cpu

This patch adds a fetch queue that sits between fetch and decode to the o3 cpu. This effectively decouples fetch from decode stalls, allowing it to be more aggressive and run further ahead in the instruction stream.

diffstat:

 src/cpu/o3/O3CPU.py      |   1 +
 src/cpu/o3/fetch.hh      |  14 +++---
 src/cpu/o3/fetch_impl.hh |  61 ++-
 3 files changed, 55 insertions(+), 21 deletions(-)

diffs (201 lines):

diff -r 867b536a68be -r 12e3be8203a5 src/cpu/o3/O3CPU.py
--- a/src/cpu/o3/O3CPU.py	Wed Sep 03 07:42:34 2014 -0400
+++ b/src/cpu/o3/O3CPU.py	Wed Sep 03 07:42:35 2014 -0400
@@ -61,6 +61,7 @@
     commitToFetchDelay = Param.Cycles(1, "Commit to fetch delay")
     fetchWidth = Param.Unsigned(8, "Fetch width")
     fetchBufferSize = Param.Unsigned(64, "Fetch buffer size in bytes")
+    fetchQueueSize = Param.Unsigned(32, "Fetch queue size in micro-ops")
     renameToDecodeDelay = Param.Cycles(1, "Rename to decode delay")
     iewToDecodeDelay = Param.Cycles(1, "Issue/Execute/Writeback to decode "
diff -r 867b536a68be -r 12e3be8203a5 src/cpu/o3/fetch.hh
--- a/src/cpu/o3/fetch.hh	Wed Sep 03 07:42:34 2014 -0400
+++ b/src/cpu/o3/fetch.hh	Wed Sep 03 07:42:35 2014 -0400
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2010-2012 ARM Limited
+ * Copyright (c) 2010-2012, 2014 ARM Limited
  * All rights reserved
  *
  * The license below extends only to copyright in the software and shall
@@ -401,9 +401,6 @@
     /** Wire to get commit's information from backwards time buffer. */
     typename TimeBuffer<TimeStruct>::wire fromCommit;

-    /** Internal fetch instruction queue. */
-    TimeBuffer<FetchStruct> *fetchQueue;
-
     //Might be annoying how this name is different than the queue.
     /** Wire used to write any information heading to decode. */
     typename TimeBuffer<FetchStruct>::wire toDecode;
@@ -455,6 +452,9 @@
     /** The width of fetch in instructions. */
     unsigned fetchWidth;

+    /** The width of decode in instructions.
*/ +unsigned decodeWidth; + /** Is the cache blocked? If so no threads can access it. */ bool cacheBlocked; @@ -481,6 +481,12 @@ /** The PC of the first instruction loaded into the fetch buffer. */ Addr fetchBufferPC[Impl::MaxThreads]; +/** The size of the fetch queue in micro-ops */ +unsigned fetchQueueSize; + +/** Queue of fetched instructions */ +std::dequeDynInstPtr fetchQueue; + /** Whether or not the fetch buffer data is valid. */ bool fetchBufferValid[Impl::MaxThreads]; diff -r 867b536a68be -r 12e3be8203a5 src/cpu/o3/fetch_impl.hh --- a/src/cpu/o3/fetch_impl.hh Wed Sep 03 07:42:34 2014 -0400 +++ b/src/cpu/o3/fetch_impl.hh Wed Sep 03 07:42:35 2014 -0400 @@ -82,11 +82,13 @@ iewToFetchDelay(params-iewToFetchDelay), commitToFetchDelay(params-commitToFetchDelay), fetchWidth(params-fetchWidth), + decodeWidth(params-decodeWidth), retryPkt(NULL), retryTid(InvalidThreadID), cacheBlkSize(cpu-cacheLineSize()), fetchBufferSize(params-fetchBufferSize), fetchBufferMask(fetchBufferSize - 1), + fetchQueueSize(params-fetchQueueSize), numThreads(params-numThreads), numFetchingThreads(params-smtNumFetchingThreads), finishTranslationEvent(this) @@ -313,12 +315,10 @@ templateclass Impl void -DefaultFetchImpl::setFetchQueue(TimeBufferFetchStruct *fq_ptr) +DefaultFetchImpl::setFetchQueue(TimeBufferFetchStruct *ftb_ptr) { -fetchQueue = fq_ptr; - -// Create wire to write information to proper place in fetch queue. -toDecode = fetchQueue-getWire(0); +// Create wire to write information to proper place in fetch time buf. +toDecode = ftb_ptr-getWire(0); } templateclass Impl @@ -342,6 +342,7 @@ cacheBlocked = false; priorityList.clear(); +fetchQueue.clear(); // Setup PC and nextPC with initial state. 
for (ThreadID tid = 0; tid numThreads; ++tid) { @@ -454,6 +455,10 @@ return false; } +// Not drained if fetch queue contains entries +if (!fetchQueue.empty()) +return false; + /* The pipeline might start up again in the middle of the drain * cycle if the finish translation event is scheduled, so make * sure that's not the case. @@ -673,11 +678,8 @@ fetchStatus[tid] = IcacheWaitResponse; } } else { -// Don't send an instruction to decode if it can't handle it. -// Asynchronous nature of this function's calling means we have to -// check 2 signals to see if decode is stalled. -if (!(numInst fetchWidth) || stalls[tid].decode || -fromDecode-decodeBlock[tid]) { +// Don't send an instruction to decode if we can't handle it. +if (!(numInst
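To make the decoupling concrete, here is a small queue model of the fetch/decode boundary. It is a sketch under stated assumptions, not the gem5 implementation: `FetchQueueModel` and its method names are invented, and the per-cycle policy (fetch fills up to `fetch_width` while there is room, decode drains up to `decode_width` unless stalled) is a simplification of what fetch_impl.hh does. The parameter names echo `fetchWidth`, `decodeWidth` and `fetchQueueSize` from the diff.

```python
from collections import deque


class FetchQueueModel:
    """Bounded queue sitting between a fetch stage and a decode stage."""

    def __init__(self, fetch_width, decode_width, queue_size):
        self.fetch_width = fetch_width
        self.decode_width = decode_width
        self.queue_size = queue_size
        self.queue = deque()

    def fetch(self, insts):
        """Enqueue as many instructions as width and free space allow."""
        room = self.queue_size - len(self.queue)
        accepted = insts[:min(self.fetch_width, room)]
        self.queue.extend(accepted)
        return len(accepted)

    def send_to_decode(self, decode_stalled):
        """Hand up to decode_width instructions to decode, if it can take them."""
        if decode_stalled:
            return []
        count = min(self.decode_width, len(self.queue))
        return [self.queue.popleft() for _ in range(count)]

    def is_drained(self):
        # Mirrors the new drain check: not drained while entries remain.
        return not self.queue


if __name__ == "__main__":
    model = FetchQueueModel(fetch_width=4, decode_width=2, queue_size=6)
    print(model.fetch(list(range(10))))      # fetch proceeds
    print(model.send_to_decode(True))        # decode stalled, nothing leaves
    print(model.fetch(list(range(10, 20))))  # fetch still makes progress
```

While decode is stalled, fetch keeps filling the queue until it is full, which is the "running further ahead" behaviour the description refers to.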
[gem5-dev] changeset in gem5: cpu: Fix o3 front-end pipeline interlock beha...
changeset 867b536a68be in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=867b536a68be
description:
cpu: Fix o3 front-end pipeline interlock behavior

The o3 pipeline interlock/stall logic is incorrect. o3 unnecessarily stalled fetch and decode due to later stages in the pipeline. In general, a stage should only consider whether it is stalled by the adjacent, downstream stage. Forcing stalls due to later stages creates bubbles in the pipeline.

Additionally, o3 stalled the entire frontend (fetch, decode, rename) on a branch mispredict while the ROB was being serially walked to update the RAT (robSquashing). It should only have stalled at rename.

diffstat:

 src/cpu/o3/comm.hh        |   2 -
 src/cpu/o3/commit.hh      |  11
 src/cpu/o3/commit_impl.hh |  40 -
 src/cpu/o3/decode.hh      |   4 +--
 src/cpu/o3/decode_impl.hh |  55 +++-
 src/cpu/o3/fetch.hh       |   3 --
 src/cpu/o3/fetch_impl.hh  |  64 --
 src/cpu/o3/iew.hh         |  11
 src/cpu/o3/iew_impl.hh    |  23 +---
 src/cpu/o3/rename_impl.hh |  25 +
 10 files changed, 26 insertions(+), 212 deletions(-)

diffs (truncated from 525 to 300 lines):

diff -r 5b6279635c49 -r 867b536a68be src/cpu/o3/comm.hh
--- a/src/cpu/o3/comm.hh	Wed Sep 03 07:42:33 2014 -0400
+++ b/src/cpu/o3/comm.hh	Wed Sep 03 07:42:34 2014 -0400
@@ -229,8 +229,6 @@
     bool renameUnblock[Impl::MaxThreads];
     bool iewBlock[Impl::MaxThreads];
     bool iewUnblock[Impl::MaxThreads];
-    bool commitBlock[Impl::MaxThreads];
-    bool commitUnblock[Impl::MaxThreads];
 };

 #endif //__CPU_O3_COMM_HH__
diff -r 5b6279635c49 -r 867b536a68be src/cpu/o3/commit.hh
--- a/src/cpu/o3/commit.hh	Wed Sep 03 07:42:33 2014 -0400
+++ b/src/cpu/o3/commit.hh	Wed Sep 03 07:42:34 2014 -0400
@@ -185,9 +185,6 @@
     /** Sets the pointer to the IEW stage. */
     void setIEWStage(IEW *iew_stage);

-    /** Skid buffer between rename and commit. */
-    std::queue<DynInstPtr> skidBuffer;
-
     /** The pointer to the IEW stage. Used solely to ensure that
      * various events (traps, interrupts, syscalls) do not occur until
      * all stores have written back.
@@ -251,11 +248,6 @@ */ void setNextStatus(); -/** Checks if the ROB is completed with squashing. This is for the case - * where the ROB can take multiple cycles to complete squashing. - */ -bool robDoneSquashing(); - /** Returns if any of the threads have the number of ROB entries changed * on this cycle. Used to determine if the number of free ROB entries needs * to be sent back to previous stages. @@ -321,9 +313,6 @@ /** Gets instructions from rename and inserts them into the ROB. */ void getInsts(); -/** Insert all instructions from rename into skidBuffer */ -void skidInsert(); - /** Marks completed instructions using information sent from IEW. */ void markCompletedInsts(); diff -r 5b6279635c49 -r 867b536a68be src/cpu/o3/commit_impl.hh --- a/src/cpu/o3/commit_impl.hh Wed Sep 03 07:42:33 2014 -0400 +++ b/src/cpu/o3/commit_impl.hh Wed Sep 03 07:42:34 2014 -0400 @@ -1335,29 +1335,6 @@ template class Impl void -DefaultCommitImpl::skidInsert() -{ -DPRINTF(Commit, Attempting to any instructions from rename into -skidBuffer.\n); - -for (int inst_num = 0; inst_num fromRename-size; ++inst_num) { -DynInstPtr inst = fromRename-insts[inst_num]; - -if (!inst-isSquashed()) { -DPRINTF(Commit, Inserting PC %s [sn:%i] [tid:%i] into , -skidBuffer.\n, inst-pcState(), inst-seqNum, -inst-threadNumber); -skidBuffer.push(inst); -} else { -DPRINTF(Commit, Instruction PC %s [sn:%i] [tid:%i] was -squashed, skipping.\n, -inst-pcState(), inst-seqNum, inst-threadNumber); -} -} -} - -template class Impl -void DefaultCommitImpl::markCompletedInsts() { // Grab completed insts out of the IEW instruction queue, and mark @@ -1380,23 +1357,6 @@ } template class Impl -bool -DefaultCommitImpl::robDoneSquashing() -{ -listThreadID::iterator threads = activeThreads-begin(); -listThreadID::iterator end = activeThreads-end(); - -while (threads != end) { -ThreadID tid = *threads++; - -if (!rob-isDoneSquashing(tid)) -return false; -} - -return true; -} - -template class Impl void 
DefaultCommitImpl::updateComInstStats(DynInstPtr inst) { diff -r 5b6279635c49 -r 867b536a68be src/cpu/o3/decode.hh --- a/src/cpu/o3/decode.hh Wed Sep 03 07:42:33 2014 -0400 +++ b/src/cpu/o3/decode.hh Wed Sep 03 07:42:34 2014 -0400 @@ -126,7 +126,7 @@ void drainSanityCheck() const; /** Has the stage
[gem5-dev] changeset in gem5: cpu: Fix o3 drain bug
changeset 40d24a672351 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=40d24a672351
description:
    cpu: Fix o3 drain bug

    For X86, the o3 CPU would get stuck with the commit stage not being
    drained if an interrupt arrived while drain was pending. isDrained()
    makes sure that pcState.microPC() == 0, thus ensuring that we are at
    an instruction boundary. However, when we take an interrupt we execute:

        pcState.upc(romMicroPC(entry));
        pcState.nupc(romMicroPC(entry) + 1);
        tc->pcState(pcState);

    As a result, the MicroPC is no longer zero. This patch ensures the drain
    is delayed until no interrupts are present. Once draining,
    non-synchronous interrupts are deferred until after the switch.

diffstat:

 src/cpu/o3/commit.hh | 11 ++-
 src/cpu/o3/commit_impl.hh | 15 ---
 2 files changed, 22 insertions(+), 4 deletions(-)

diffs (72 lines):

diff -r 53278be85b40 -r 40d24a672351 src/cpu/o3/commit.hh
--- a/src/cpu/o3/commit.hh Wed Sep 03 07:42:44 2014 -0400
+++ b/src/cpu/o3/commit.hh Wed Sep 03 07:42:45 2014 -0400
@@ -438,9 +438,18 @@
     /** Number of Active Threads */
     ThreadID numThreads;
 
-    /** Is a drain pending. */
+    /** Is a drain pending? Commit is looking for an instruction boundary while
+     *  there are no pending interrupts
+     */
     bool drainPending;
 
+    /** Is a drain imminent? Commit has found an instruction boundary while no
+     *  interrupts were present or in flight. This was the last architecturally
+     *  committed instruction. Interrupts disabled and pipeline flushed.
+     *  Waiting for structures to finish draining.
+     */
+    bool drainImminent;
+
     /** The latency to handle a trap. Used when scheduling trap
      * squash event.
      */
diff -r 53278be85b40 -r 40d24a672351 src/cpu/o3/commit_impl.hh
--- a/src/cpu/o3/commit_impl.hh Wed Sep 03 07:42:44 2014 -0400
+++ b/src/cpu/o3/commit_impl.hh Wed Sep 03 07:42:45 2014 -0400
@@ -104,6 +104,7 @@
       commitWidth(params->commitWidth),
       numThreads(params->numThreads),
       drainPending(false),
+      drainImminent(false),
       trapLatency(params->trapLatency),
       canHandleInterrupts(true),
       avoidQuiesceLiveLock(false)
@@ -406,6 +407,7 @@
 DefaultCommit<Impl>::drainResume()
 {
     drainPending = false;
+    drainImminent = false;
 }
 
 template <class Impl>
@@ -816,8 +818,10 @@
 void
 DefaultCommit<Impl>::propagateInterrupt()
 {
+    // Don't propagate intterupts if we are currently handling a trap or
+    // in draining and the last observable instruction has been committed.
     if (commitStatus[0] == TrapPending || interrupt || trapSquash[0] ||
-        tcSquash[0])
+        tcSquash[0] || drainImminent)
         return;
 
     // Process interrupts if interrupts are enabled, not in PAL
@@ -1089,10 +1093,15 @@
                     squashAfter(tid, head_inst);
 
                 if (drainPending) {
-                    DPRINTF(Drain, "Draining: %i:%s\n", tid, pc[tid]);
-                    if (pc[tid].microPC() == 0 && interrupt == NoFault) {
+                    if (pc[tid].microPC() == 0 && interrupt == NoFault &&
+                        !thread[tid]->trapPending) {
+                        // Last architectually committed instruction.
+                        // Squash the pipeline, stall fetch, and use
+                        // drainImminent to disable interrupts
+                        DPRINTF(Drain, "Draining: %i:%s\n", tid, pc[tid]);
                         squashAfter(tid, head_inst);
                         cpu->commitDrained(tid);
+                        drainImminent = true;
                     }
                 }
 
___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev
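The two-flag drain handshake described in the changeset above can be modelled in isolation. What follows is only a minimal sketch, not the gem5 code: the struct DrainModel and its method names are hypothetical, and the commit stage is reduced to three inputs (the micro-PC, a pending interrupt, and a pending trap). It assumes that "drain pending" means a drain was requested and that "drain imminent" is set at the first instruction boundary with no interrupt or trap outstanding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the drain bookkeeping in the commit stage.
// Only the two flag names are taken from the changeset.
struct DrainModel {
    bool drainPending = false;   // a drain was requested; looking for a safe point
    bool drainImminent = false;  // safe point found; interrupts are now held off

    void requestDrain() { drainPending = true; }

    // Called after an instruction commits. Returns true when the commit
    // stage may report itself as drained.
    bool onInstCommitted(uint16_t microPC, bool interruptPending,
                         bool trapPending)
    {
        if (!drainPending)
            return false;
        // Require an instruction boundary (micro-PC of zero) and no
        // interrupt or trap in flight, mirroring the patched condition.
        if (microPC == 0 && !interruptPending && !trapPending) {
            drainImminent = true;
            return true;
        }
        return false;
    }

    // Interrupts are not propagated while a trap is being handled or
    // once the last observable instruction has committed.
    bool mayPropagateInterrupt(bool handlingTrap) const
    {
        return !handlingTrap && !drainImminent;
    }

    // Both flags are cleared when the CPU resumes after the drain.
    void drainResume()
    {
        drainPending = false;
        drainImminent = false;
    }
};
```

In this model a drain requested in the middle of a microcoded instruction, or while an interrupt is outstanding, simply waits. Once the safe point is reached, later interrupts are held back until drainResume() is called.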
[gem5-dev] changeset in gem5: arm: Fix v8 neon latency issue for loads/stores
changeset 53278be85b40 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=53278be85b40
description:
    arm: Fix v8 neon latency issue for loads/stores

    Neon memory ops that operate on multiple registers currently have very
    poor performance because of interleave/deinterleave micro-ops.

    This patch marks the deinterleave/interleave micro-ops as No_OpClass
    such that they take minimum cycles to execute and are never resource
    constrained.

    Additionally the micro-ops over-read registers. Although one form may
    need to read up to 20 sources, not all do. This adds in new forms so
    false dependencies are not modeled. Instructions read their minimum
    number of sources.

diffstat:

 src/arch/arm/insts/macromem.cc | 47 +-
 src/arch/arm/isa/insts/neon64_mem.isa | 24 +++-
 2 files changed, 56 insertions(+), 15 deletions(-)

diffs (140 lines):

diff -r 8bee5f4edb92 -r 53278be85b40 src/arch/arm/insts/macromem.cc
--- a/src/arch/arm/insts/macromem.cc Tue Apr 29 16:05:02 2014 -0500
+++ b/src/arch/arm/insts/macromem.cc Wed Sep 03 07:42:44 2014 -0400
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2010-2013 ARM Limited
+ * Copyright (c) 2010-2014 ARM Limited
  * All rights reserved
  *
  * The license below extends only to copyright in the software and shall
@@ -1107,9 +1107,26 @@
     }
 
     for (int i = 0; i < numMarshalMicroops; ++i) {
-        microOps[uopIdx++] = new MicroDeintNeon64(
-            machInst, vd + (RegIndex) (2 * i), vx, eSize, dataSize,
-            numStructElems, numRegs, i /* step */);
+        switch(numRegs) {
+            case 1: microOps[uopIdx++] = new MicroDeintNeon64_1Reg(
+                        machInst, vd + (RegIndex) (2 * i), vx, eSize, dataSize,
+                        numStructElems, 1, i /* step */);
+                    break;
+            case 2: microOps[uopIdx++] = new MicroDeintNeon64_2Reg(
+                        machInst, vd + (RegIndex) (2 * i), vx, eSize, dataSize,
+                        numStructElems, 2, i /* step */);
+                    break;
+            case 3: microOps[uopIdx++] = new MicroDeintNeon64_3Reg(
+                        machInst, vd + (RegIndex) (2 * i), vx, eSize, dataSize,
+                        numStructElems, 3, i /* step */);
+                    break;
+            case 4: microOps[uopIdx++] = new MicroDeintNeon64_4Reg(
+                        machInst, vd + (RegIndex) (2 * i), vx, eSize, dataSize,
+                        numStructElems, 4, i /* step */);
+                    break;
+            default: panic("Invalid number of registers");
+        }
+
     }
 
     assert(uopIdx == numMicroops);
@@ -1150,9 +1167,25 @@
     unsigned uopIdx = 0;
 
     for(int i = 0; i < numMarshalMicroops; ++i) {
-        microOps[uopIdx++] = new MicroIntNeon64(
-            machInst, vx + (RegIndex) (2 * i), vd, eSize, dataSize,
-            numStructElems, numRegs, i /* step */);
+        switch (numRegs) {
+            case 1: microOps[uopIdx++] = new MicroIntNeon64_1Reg(
+                        machInst, vx + (RegIndex) (2 * i), vd, eSize, dataSize,
+                        numStructElems, 1, i /* step */);
+                    break;
+            case 2: microOps[uopIdx++] = new MicroIntNeon64_2Reg(
+                        machInst, vx + (RegIndex) (2 * i), vd, eSize, dataSize,
+                        numStructElems, 2, i /* step */);
+                    break;
+            case 3: microOps[uopIdx++] = new MicroIntNeon64_3Reg(
+                        machInst, vx + (RegIndex) (2 * i), vd, eSize, dataSize,
+                        numStructElems, 3, i /* step */);
+                    break;
+            case 4: microOps[uopIdx++] = new MicroIntNeon64_4Reg(
+                        machInst, vx + (RegIndex) (2 * i), vd, eSize, dataSize,
+                        numStructElems, 4, i /* step */);
+                    break;
+            default: panic("Invalid number of registers");
+        }
     }
 
     uint32_t memaccessFlags = TLB::MustBeOne | (TLB::ArmFlags) eSize |
diff -r 8bee5f4edb92 -r 53278be85b40 src/arch/arm/isa/insts/neon64_mem.isa
--- a/src/arch/arm/isa/insts/neon64_mem.isa Tue Apr 29 16:05:02 2014 -0500
+++ b/src/arch/arm/isa/insts/neon64_mem.isa Wed Sep 03 07:42:44 2014 -0400
@@ -1,6 +1,6 @@
 // -*- mode: c++ -*-
 
-// Copyright (c) 2012-2013 ARM Limited
+// Copyright (c) 2012-2014 ARM Limited
 // All rights reserved
 //
 // The license below extends only to copyright in the software and shall
@@ -163,11 +163,11 @@
     header_output += MicroNeonMemDeclare64.subst(loadIop) + \
         MicroNeonMemDeclare64.subst(storeIop)
 
-    def mkMarshalMicroOp(name, Name):
+    def mkMarshalMicroOp(name, Name, numRegs=4):
         global header_output, decoder_output, exec_output
         getInputCodeOp1L = ''
-        for v in range(4):
+        for v in range(numRegs):
             for p in
[gem5-dev] changeset in gem5: cpu: fix bimodal predictor to use correct glo...
changeset 1b627a6ddac0 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=1b627a6ddac0
description:
    cpu: fix bimodal predictor to use correct global history reg

    A small bug in the bimodal predictor caused significant degradation in
    performance on some benchmarks. This was caused by using the wrong
    globalHistoryReg during the update phase. This patch fixes the bug and
    brings the performance back to a normal level.

diffstat:

 src/cpu/pred/bi_mode.cc | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diffs (12 lines):

diff -r 5e424aa952c5 -r 1b627a6ddac0 src/cpu/pred/bi_mode.cc
--- a/src/cpu/pred/bi_mode.cc Wed Sep 03 07:42:40 2014 -0400
+++ b/src/cpu/pred/bi_mode.cc Wed Sep 03 07:42:41 2014 -0400
@@ -167,7 +167,7 @@
     unsigned choiceHistoryIdx = ((branchAddr >> instShiftAmt)
                                 & choiceHistoryMask);
     unsigned globalHistoryIdx = (((branchAddr >> instShiftAmt)
-                                ^ globalHistoryReg)
+                                ^ history->globalHistoryReg)
                                 & globalHistoryMask);
 
     assert(choiceHistoryIdx < choicePredictorSize);
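To make the one-line fix above concrete, here is a small sketch of the two index computations that appear in the hunk. It is an illustration under assumptions, not the predictor itself: the struct BiModeGeometry and both function names are hypothetical, and the reading that history->globalHistoryReg is the history value saved with the branch at prediction time is inferred from the diff rather than stated in the description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the predictor geometry. The field names
// mirror identifiers in the diff; the struct itself is an assumption.
struct BiModeGeometry {
    unsigned instShiftAmt;       // low address bits dropped before indexing
    unsigned choiceHistoryMask;  // choice table size minus one
    unsigned globalHistoryMask;  // direction table size minus one
};

// The choice predictor is indexed by the branch address alone.
inline unsigned choiceIndex(const BiModeGeometry &g, uint64_t branchAddr)
{
    return static_cast<unsigned>(
        (branchAddr >> g.instShiftAmt) & g.choiceHistoryMask);
}

// The direction predictors are indexed by address XOR global history.
// The update must pass the same history value that was used to predict,
// otherwise it trains a different counter from the one that was read.
inline unsigned globalIndex(const BiModeGeometry &g, uint64_t branchAddr,
                            unsigned historyAtPredict)
{
    return static_cast<unsigned>(
        ((branchAddr >> g.instShiftAmt) ^ historyAtPredict)
        & g.globalHistoryMask);
}
```

With a shift of 2 and 8-bit masks, address 0x1234 gives a choice index of 0x8D. A history of 0x0F then selects direction entry 0x82, while a history of 0xF0 selects entry 0x7D, so a stale or mismatched history register updates the wrong counter.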
[gem5-dev] changeset in gem5: mem: Refactor assignment of Packet types
changeset 711eb0e64249 in /z/repo/gem5 details: http://repo.gem5.org/gem5?cmd=changeset;node=711eb0e64249 description: mem: Refactor assignment of Packet types Put the packet type swizzling (that is currently done in a lot of places) into a refineCommand() member function. diffstat: src/cpu/checker/cpu.cc | 5 +--- src/cpu/inorder/resources/cache_unit.cc | 14 +- src/cpu/o3/lsq_unit.hh | 7 ++--- src/cpu/o3/lsq_unit_impl.hh | 9 ++ src/cpu/ozone/lw_lsq.hh | 5 +--- src/cpu/ozone/lw_lsq_impl.hh| 5 +--- src/cpu/simple/atomic.cc| 5 +-- src/cpu/simple/timing.cc| 15 +--- src/mem/packet.hh | 41 - 9 files changed, 54 insertions(+), 52 deletions(-) diffs (223 lines): diff -r 0b4d10f53c2d -r 711eb0e64249 src/cpu/checker/cpu.cc --- a/src/cpu/checker/cpu.ccWed Sep 03 07:42:46 2014 -0400 +++ b/src/cpu/checker/cpu.ccTue May 13 12:20:48 2014 -0500 @@ -170,10 +170,7 @@ // Now do the access if (fault == NoFault !memReq-getFlags().isSet(Request::NO_ACCESS)) { -PacketPtr pkt = new Packet(memReq, - memReq-isLLSC() ? - MemCmd::LoadLockedReq : - MemCmd::ReadReq); +PacketPtr pkt = Packet::createRead(memReq); pkt-dataStatic(data); diff -r 0b4d10f53c2d -r 711eb0e64249 src/cpu/inorder/resources/cache_unit.cc --- a/src/cpu/inorder/resources/cache_unit.cc Wed Sep 03 07:42:46 2014 -0400 +++ b/src/cpu/inorder/resources/cache_unit.cc Tue May 13 12:20:48 2014 -0500 @@ -812,21 +812,11 @@ void CacheUnit::buildDataPacket(CacheRequest *cache_req) { -// Check for LL/SC and if so change command -if (cache_req-memReq-isLLSC() cache_req-pktCmd == MemCmd::ReadReq) { -cache_req-pktCmd = MemCmd::LoadLockedReq; -} - -if (cache_req-pktCmd == MemCmd::WriteReq) { -cache_req-pktCmd = -cache_req-memReq-isSwap() ? MemCmd::SwapReq : -(cache_req-memReq-isLLSC() ? MemCmd::StoreCondReq - : MemCmd::WriteReq); -} - cache_req-dataPkt = new CacheReqPacket(cache_req, cache_req-pktCmd, cache_req-instIdx); +cache_req-dataPkt-refineCommand(); // handle LL/SC, etc. 
+ DPRINTF(InOrderCachePort, [slot:%i]: Slot marked for %x\n, cache_req-getSlot(), cache_req-dataPkt-getAddr()); diff -r 0b4d10f53c2d -r 711eb0e64249 src/cpu/o3/lsq_unit.hh --- a/src/cpu/o3/lsq_unit.hhWed Sep 03 07:42:46 2014 -0400 +++ b/src/cpu/o3/lsq_unit.hhTue May 13 12:20:48 2014 -0500 @@ -776,8 +776,7 @@ // if we the cache is not blocked, do cache access bool completedFirst = false; -MemCmd command = req-isLLSC() ? MemCmd::LoadLockedReq : MemCmd::ReadReq; -PacketPtr data_pkt = new Packet(req, command); +PacketPtr data_pkt = Packet::createRead(req); PacketPtr fst_data_pkt = NULL; PacketPtr snd_data_pkt = NULL; @@ -794,8 +793,8 @@ fst_data_pkt = data_pkt; } else { // Create the split packets. -fst_data_pkt = new Packet(sreqLow, command); -snd_data_pkt = new Packet(sreqHigh, command); +fst_data_pkt = Packet::createRead(sreqLow); +snd_data_pkt = Packet::createRead(sreqHigh); fst_data_pkt-dataStatic(load_inst-memData); snd_data_pkt-dataStatic(load_inst-memData + sreqLow-getSize()); diff -r 0b4d10f53c2d -r 711eb0e64249 src/cpu/o3/lsq_unit_impl.hh --- a/src/cpu/o3/lsq_unit_impl.hh Wed Sep 03 07:42:46 2014 -0400 +++ b/src/cpu/o3/lsq_unit_impl.hh Tue May 13 12:20:48 2014 -0500 @@ -839,9 +839,6 @@ else memcpy(inst-memData, storeQueue[storeWBIdx].data, req-getSize()); -MemCmd command = -req-isSwap() ? MemCmd::SwapReq : -(req-isLLSC() ? MemCmd::StoreCondReq : MemCmd::WriteReq); PacketPtr data_pkt; PacketPtr snd_data_pkt = NULL; @@ -853,13 +850,13 @@ if (!TheISA::HasUnalignedMemAcc || !storeQueue[storeWBIdx].isSplit) { // Build a single data packet if the store isn't split. -data_pkt = new Packet(req, command); +data_pkt = Packet::createWrite(req); data_pkt-dataStatic(inst-memData); data_pkt-senderState = state; } else { // Create two packets if the store is split in two. 
-data_pkt = new Packet(sreqLow, command); -snd_data_pkt = new Packet(sreqHigh, command); +data_pkt = Packet::createWrite(sreqLow); +snd_data_pkt = Packet::createWrite(sreqHigh); data_pkt-dataStatic(inst-memData); snd_data_pkt-dataStatic(inst-memData +
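The command selection that this changeset centralises can be read off the removed lines: a read becomes a load-locked when the request is LL/SC, and a write becomes a swap or a store-conditional. The sketch below restates that rule as free functions. It is a simplified model, not the Packet API: the enum Cmd, the struct ReqFlags and the function names are hypothetical stand-ins for MemCmd, Request and the new member functions.

```cpp
#include <cassert>

// Hypothetical, reduced command set named after the MemCmd values
// that appear in the diff.
enum class Cmd { ReadReq, WriteReq, LoadLockedReq, StoreCondReq, SwapReq };

// Hypothetical stand-in for the two request properties the diff queries
// through isLLSC() and isSwap().
struct ReqFlags {
    bool llsc;
    bool swap;
};

// Reads: LL/SC requests become load-locked, everything else stays a read.
inline Cmd refineRead(const ReqFlags &r)
{
    return r.llsc ? Cmd::LoadLockedReq : Cmd::ReadReq;
}

// Writes: swap takes precedence over LL/SC, as in the removed code.
inline Cmd refineWrite(const ReqFlags &r)
{
    if (r.swap)
        return Cmd::SwapReq;
    return r.llsc ? Cmd::StoreCondReq : Cmd::WriteReq;
}

// Refine a generic command in place of the per-call-site ternaries.
// Commands that are already specific pass through unchanged.
inline Cmd refine(Cmd base, const ReqFlags &r)
{
    switch (base) {
      case Cmd::ReadReq:
        return refineRead(r);
      case Cmd::WriteReq:
        return refineWrite(r);
      default:
        return base;
    }
}
```

Keeping the rule in one place means that the CPU models touched by the diff (checker, inorder, o3, ozone, simple) cannot disagree about which command an LL/SC or swap request should carry.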
[gem5-dev] changeset in gem5: x86: Flag instructions that call suspend as I...
changeset 0b4d10f53c2d in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=0b4d10f53c2d
description:
    x86: Flag instructions that call suspend as IsQuiesce

    The o3 cpu relies upon instructions that suspend a thread context being
    flagged as IsQuiesce. If they are not, unpredictable behavior can occur.
    This patch fixes that for the x86 ISA.

diffstat:

 src/arch/x86/isa/decoder/two_byte_opcodes.isa | 6 +++---
 src/arch/x86/isa/microops/specop.isa | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)

diffs (33 lines):

diff -r 40d24a672351 -r 0b4d10f53c2d src/arch/x86/isa/decoder/two_byte_opcodes.isa
--- a/src/arch/x86/isa/decoder/two_byte_opcodes.isa Wed Sep 03 07:42:45 2014 -0400
+++ b/src/arch/x86/isa/decoder/two_byte_opcodes.isa Wed Sep 03 07:42:46 2014 -0400
@@ -141,13 +141,13 @@
                     }}, IsNonSpeculative);
                     0x01: m5quiesce({{
                         PseudoInst::quiesce(xc->tcBase());
-                    }}, IsNonSpeculative);
+                    }}, IsNonSpeculative, IsQuiesce);
                     0x02: m5quiesceNs({{
                         PseudoInst::quiesceNs(xc->tcBase(), Rdi);
-                    }}, IsNonSpeculative);
+                    }}, IsNonSpeculative, IsQuiesce);
                     0x03: m5quiesceCycle({{
                         PseudoInst::quiesceCycles(xc->tcBase(), Rdi);
-                    }}, IsNonSpeculative);
+                    }}, IsNonSpeculative, IsQuiesce);
                     0x04: m5quiesceTime({{
                         Rax = PseudoInst::quiesceTime(xc->tcBase());
                     }}, IsNonSpeculative);
diff -r 40d24a672351 -r 0b4d10f53c2d src/arch/x86/isa/microops/specop.isa
--- a/src/arch/x86/isa/microops/specop.isa Wed Sep 03 07:42:45 2014 -0400
+++ b/src/arch/x86/isa/microops/specop.isa Wed Sep 03 07:42:46 2014 -0400
@@ -63,7 +63,8 @@
         MicroHalt(ExtMachInst _machInst, const char * instMnem,
                 uint64_t setFlags) :
             X86MicroopBase(_machInst, "halt", instMnem,
-                    setFlags | (ULL(1) << StaticInst::IsNonSpeculative),
+                    setFlags | (ULL(1) << StaticInst::IsNonSpeculative) |
+                    (ULL(1) << StaticInst::IsQuiesce),
                     No_OpClass)
         {
         }
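The MicroHalt change above builds a flag word by OR-ing single-bit masks. A minimal sketch of that pattern follows. The bit positions are made up for illustration (the real StaticInst flag enumeration is not shown in this message), and the helper names bit, haltFlags and isQuiesce are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag positions. Only the two flag names come from the
// changeset; their numeric values here are assumptions.
enum Flag : unsigned { IsNonSpeculative = 0, IsQuiesce = 1 };

// One-hot mask for a flag, the equivalent of ULL(1) << flag in the diff.
constexpr uint64_t bit(Flag f)
{
    return uint64_t{1} << f;
}

// Flags for a halt-like micro-op after the patch: whatever the caller
// asked for, plus both IsNonSpeculative and IsQuiesce.
constexpr uint64_t haltFlags(uint64_t setFlags)
{
    return setFlags | bit(IsNonSpeculative) | bit(IsQuiesce);
}

// The kind of query a CPU model would make before suspending a thread.
constexpr bool isQuiesce(uint64_t flags)
{
    return (flags & bit(IsQuiesce)) != 0;
}
```

Before the patch only the IsNonSpeculative bit was set, so a query like isQuiesce() would have returned false for an instruction that does suspend the thread context.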
[gem5-dev] changeset in gem5: arm: ISA X31 destination register fix
changeset 85001c018d4c in /z/repo/gem5 details: http://repo.gem5.org/gem5?cmd=changeset;node=85001c018d4c description: arm: ISA X31 destination register fix This patch substituted the zero register for X31 used as a destination register. This prevents false dependencies based on X31. diffstat: src/arch/arm/intregs.hh |9 ++- src/arch/arm/isa/formats/aarch64.isa | 120 ++ 2 files changed, 72 insertions(+), 57 deletions(-) diffs (truncated from 349 to 300 lines): diff -r 60dddc0a6f78 -r 85001c018d4c src/arch/arm/intregs.hh --- a/src/arch/arm/intregs.hh Wed Sep 03 07:42:41 2014 -0400 +++ b/src/arch/arm/intregs.hh Wed Sep 03 07:42:43 2014 -0400 @@ -1,5 +1,5 @@ /* - * Copyright (c) 2010-2013 ARM Limited + * Copyright (c) 2010-2014 ARM Limited * All rights reserved * * The license below extends only to copyright in the software and shall @@ -510,6 +510,13 @@ return reg; } +static inline IntRegIndex +makeZero(IntRegIndex reg) +{ +if (reg == INTREG_X31) +reg = INTREG_ZERO; +return reg; +} static inline bool isSP(IntRegIndex reg) diff -r 60dddc0a6f78 -r 85001c018d4c src/arch/arm/isa/formats/aarch64.isa --- a/src/arch/arm/isa/formats/aarch64.isa Wed Sep 03 07:42:41 2014 -0400 +++ b/src/arch/arm/isa/formats/aarch64.isa Wed Sep 03 07:42:43 2014 -0400 @@ -1,4 +1,4 @@ -// Copyright (c) 2011-2013 ARM Limited +// Copyright (c) 2011-2014 ARM Limited // All rights reserved // // The license below extends only to copyright in the software and shall @@ -63,6 +63,7 @@ { IntRegIndex rd = (IntRegIndex)(uint32_t)bits(machInst, 4, 0); IntRegIndex rdsp = makeSP(rd); +IntRegIndex rdzr = makeZero(rd); IntRegIndex rn = (IntRegIndex)(uint32_t)bits(machInst, 9, 5); IntRegIndex rnsp = makeSP(rn); @@ -79,9 +80,9 @@ uint64_t immhi = bits(machInst, 23, 5); uint64_t imm = (immlo 0) | (immhi 2); if (bits(machInst, 31) == 0) -return new AdrXImm(machInst, rd, INTREG_ZERO, sext21(imm)); +return new AdrXImm(machInst, rdzr, INTREG_ZERO, sext21(imm)); else -return new AdrpXImm(machInst, rd, INTREG_ZERO, 
+return new AdrpXImm(machInst, rdzr, INTREG_ZERO, sext33(imm 12)); } case 0x2: @@ -100,11 +101,11 @@ case 0x0: return new AddXImm(machInst, rdsp, rnsp, imm); case 0x1: -return new AddXImmCc(machInst, rd, rnsp, imm); +return new AddXImmCc(machInst, rdzr, rnsp, imm); case 0x2: return new SubXImm(machInst, rdsp, rnsp, imm); case 0x3: -return new SubXImmCc(machInst, rd, rnsp, imm); +return new SubXImmCc(machInst, rdzr, rnsp, imm); } } case 0x4: @@ -146,23 +147,24 @@ case 0x2: return new EorXImm(machInst, rdsp, rn, imm); case 0x3: -return new AndXImmCc(machInst, rd, rn, imm); +return new AndXImmCc(machInst, rdzr, rn, imm); } } case 0x5: { IntRegIndex rd = (IntRegIndex)(uint32_t)bits(machInst, 4, 0); +IntRegIndex rdzr = makeZero(rd); uint32_t imm16 = bits(machInst, 20, 5); uint32_t hw = bits(machInst, 22, 21); switch (opc) { case 0x0: -return new Movn(machInst, rd, imm16, hw * 16); +return new Movn(machInst, rdzr, imm16, hw * 16); case 0x1: return new Unknown64(machInst); case 0x2: -return new Movz(machInst, rd, imm16, hw * 16); +return new Movz(machInst, rdzr, imm16, hw * 16); case 0x3: -return new Movk(machInst, rd, imm16, hw * 16); +return new Movk(machInst, rdzr, imm16, hw * 16); } } case 0x6: @@ -170,11 +172,11 @@ return new Unknown64(machInst); switch (opc) { case 0x0: -return new Sbfm64(machInst, rd, rn, immr, imms); +return new Sbfm64(machInst, rdzr, rn, immr, imms); case 0x1: -return new Bfm64(machInst, rd, rn, immr, imms); +return new Bfm64(machInst, rdzr, rn, immr, imms); case 0x2: -return new Ubfm64(machInst, rd, rn, immr, imms); +return new Ubfm64(machInst, rdzr, rn, immr, imms); case 0x3: return new Unknown64(machInst); } @@ -184,7 +186,7 @@ if (opc || bits(machInst, 21)) return new Unknown64(machInst); else -return new
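The effect of makeZero() on dependency tracking can be shown with a toy scoreboard. This is a sketch under stated assumptions: the register numbering is invented, the Scoreboard struct is hypothetical and far simpler than a real rename map, and only makeZero() itself is taken from the diff.

```cpp
#include <array>
#include <cassert>

// Hypothetical numbering. The real IntRegIndex values are not shown
// in this message, so only the names are taken from the diff.
enum IntRegIndex {
    INTREG_X0 = 0,
    INTREG_X30 = 30,
    INTREG_X31 = 31,
    INTREG_ZERO = 32
};

// As added by the changeset: encoding 31 used as a destination means
// the zero register, so map it to the dedicated zero index.
static inline IntRegIndex makeZero(IntRegIndex reg)
{
    if (reg == INTREG_X31)
        reg = INTREG_ZERO;
    return reg;
}

// Toy scoreboard that remembers the last writer of each register.
// Writes to the zero register are discarded and create no producer.
struct Scoreboard {
    std::array<int, 33> lastWriter;

    Scoreboard() { lastWriter.fill(-1); }

    void recordWrite(IntRegIndex dest, int seqNum)
    {
        if (dest != INTREG_ZERO)
            lastWriter[dest] = seqNum;
    }

    // Returns the sequence number a reader must wait for, or -1.
    int producerOf(IntRegIndex src) const
    {
        return src == INTREG_ZERO ? -1 : lastWriter[src];
    }
};
```

Without the mapping, a flag-setting instruction that discards its result into encoding 31 would appear as the producer of X31, and a later instruction naming X31 would wait on it for no architectural reason.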
[gem5-dev] changeset in gem5: alpha: Stop using 'inorder' and rely entirely...
changeset 35241e33c38f in /z/repo/gem5 details: http://repo.gem5.org/gem5?cmd=changeset;node=35241e33c38f description: alpha: Stop using 'inorder' and rely entirely on 'minor' This patch avoids building the 'inorder' CPU model for any permutation of ALPHA, and also removes the ALPHA regressions using the 'inorder' CPU. The 'minor' CPU is already providing a broader test coverage. diffstat: build_opts/ALPHA |2 +- build_opts/ALPHA_MESI_Two_Level |2 +- build_opts/ALPHA_MOESI_CMP_directory |2 +- build_opts/ALPHA_MOESI_CMP_token |2 +- build_opts/ALPHA_MOESI_hammer |2 +- build_opts/ALPHA_Network_test |2 +- tests/SConscript |1 - tests/configs/tsunami-inorder.py | 43 - tests/long/se/30.eon/ref/alpha/tru64/inorder-timing/config.ini| 346 tests/long/se/30.eon/ref/alpha/tru64/inorder-timing/simerr| 51 - tests/long/se/30.eon/ref/alpha/tru64/inorder-timing/simout| 14 - tests/long/se/30.eon/ref/alpha/tru64/inorder-timing/stats.txt | 721 - tests/long/se/50.vortex/ref/alpha/tru64/inorder-timing/config.ini | 346 tests/long/se/50.vortex/ref/alpha/tru64/inorder-timing/simerr |5 - tests/long/se/50.vortex/ref/alpha/tru64/inorder-timing/simout | 11 - tests/long/se/50.vortex/ref/alpha/tru64/inorder-timing/smred.msg | 158 -- tests/long/se/50.vortex/ref/alpha/tru64/inorder-timing/smred.out | 258 --- tests/long/se/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt | 752 - tests/long/se/60.bzip2/ref/alpha/tru64/inorder-timing/config.ini | 346 tests/long/se/60.bzip2/ref/alpha/tru64/inorder-timing/simerr |5 - tests/long/se/60.bzip2/ref/alpha/tru64/inorder-timing/simout | 26 - tests/long/se/60.bzip2/ref/alpha/tru64/inorder-timing/stats.txt | 759 -- tests/long/se/70.twolf/ref/alpha/tru64/inorder-timing/config.ini | 346 tests/long/se/70.twolf/ref/alpha/tru64/inorder-timing/simerr |5 - tests/long/se/70.twolf/ref/alpha/tru64/inorder-timing/simout | 26 - tests/long/se/70.twolf/ref/alpha/tru64/inorder-timing/smred.out | 276 --- tests/long/se/70.twolf/ref/alpha/tru64/inorder-timing/smred.pin | 17 - 
tests/long/se/70.twolf/ref/alpha/tru64/inorder-timing/smred.pl1 | 11 - tests/long/se/70.twolf/ref/alpha/tru64/inorder-timing/smred.pl2 |2 - tests/long/se/70.twolf/ref/alpha/tru64/inorder-timing/smred.sav | 18 - tests/long/se/70.twolf/ref/alpha/tru64/inorder-timing/smred.sv2 | 19 - tests/long/se/70.twolf/ref/alpha/tru64/inorder-timing/smred.twf | 29 - tests/long/se/70.twolf/ref/alpha/tru64/inorder-timing/stats.txt | 722 - tests/quick/se/00.hello/ref/alpha/linux/inorder-timing/config.ini | 346 tests/quick/se/00.hello/ref/alpha/linux/inorder-timing/simerr |1 - tests/quick/se/00.hello/ref/alpha/linux/inorder-timing/simout | 12 - tests/quick/se/00.hello/ref/alpha/linux/inorder-timing/stats.txt | 699 - 37 files changed, 6 insertions(+), 6377 deletions(-) diffs (truncated from 6560 to 300 lines): diff -r 939094c17866 -r 35241e33c38f build_opts/ALPHA --- a/build_opts/ALPHA Wed Sep 03 07:42:55 2014 -0400 +++ b/build_opts/ALPHA Wed Sep 03 07:42:56 2014 -0400 @@ -1,4 +1,4 @@ TARGET_ISA = 'alpha' SS_COMPATIBLE_FP = 1 -CPU_MODELS = 'AtomicSimpleCPU,TimingSimpleCPU,O3CPU,InOrderCPU,MinorCPU' +CPU_MODELS = 'AtomicSimpleCPU,TimingSimpleCPU,O3CPU,MinorCPU' PROTOCOL = 'MI_example' diff -r 939094c17866 -r 35241e33c38f build_opts/ALPHA_MESI_Two_Level --- a/build_opts/ALPHA_MESI_Two_Level Wed Sep 03 07:42:55 2014 -0400 +++ b/build_opts/ALPHA_MESI_Two_Level Wed Sep 03 07:42:56 2014 -0400 @@ -1,3 +1,3 @@ SS_COMPATIBLE_FP = 1 -CPU_MODELS = 'AtomicSimpleCPU,TimingSimpleCPU,O3CPU,InOrderCPU' +CPU_MODELS = 'AtomicSimpleCPU,TimingSimpleCPU,O3CPU,MinorCPU' PROTOCOL = 'MESI_Two_Level' diff -r 939094c17866 -r 35241e33c38f build_opts/ALPHA_MOESI_CMP_directory --- a/build_opts/ALPHA_MOESI_CMP_directory Wed Sep 03 07:42:55 2014 -0400 +++ b/build_opts/ALPHA_MOESI_CMP_directory Wed Sep 03 07:42:56 2014 -0400 @@ -1,3 +1,3 @@ SS_COMPATIBLE_FP = 1 -CPU_MODELS = 'AtomicSimpleCPU,TimingSimpleCPU,O3CPU,InOrderCPU' +CPU_MODELS = 'AtomicSimpleCPU,TimingSimpleCPU,O3CPU,MinorCPU' PROTOCOL = 
'MOESI_CMP_directory' diff -r 939094c17866 -r 35241e33c38f build_opts/ALPHA_MOESI_CMP_token --- a/build_opts/ALPHA_MOESI_CMP_token Wed Sep 03 07:42:55 2014 -0400 +++ b/build_opts/ALPHA_MOESI_CMP_token Wed Sep 03 07:42:56 2014 -0400 @@ -1,3 +1,3 @@ SS_COMPATIBLE_FP = 1 -CPU_MODELS = 'AtomicSimpleCPU,TimingSimpleCPU,O3CPU,InOrderCPU' +CPU_MODELS =
[gem5-dev] changeset in gem5: config: Update Streamline scripts and configs
changeset 2d6d7a056a38 in /z/repo/gem5 details: http://repo.gem5.org/gem5?cmd=changeset;node=2d6d7a056a38 description: config: Update Streamline scripts and configs Updated the stat_config.ini files to reflect new structure. Moved to a more generic stat naming scheme that can easily handle multiple CPUs and L2s by letting the script replace pre-defined # symbols to CPU or L2 ids. Removed the previous per_switch_cpus sections. Still can be used by spelling out the stat names if necessary. (Resuming from checkpoints no longer use switch_cpus. Only fast-forwarding does.) diffstat: util/streamline/atomic_stat_config.ini | 55 --- util/streamline/m5stats2streamline.py | 53 +++--- util/streamline/o3_stat_config.ini | 78 +++-- 3 files changed, 81 insertions(+), 105 deletions(-) diffs (truncated from 343 to 300 lines): diff -r dfebd39c48a7 -r 2d6d7a056a38 util/streamline/atomic_stat_config.ini --- a/util/streamline/atomic_stat_config.iniWed Sep 03 07:43:01 2014 -0400 +++ b/util/streamline/atomic_stat_config.iniWed Sep 03 07:43:02 2014 -0400 @@ -40,54 +40,55 @@ # Stats grouped together will show as grouped in Streamline. # E.g., # -# icache = -#icache.overall_hits::total -#icache.overall_misses::total +# commit_inst_count = +# system.cluster.cpu#.commit.committedInsts +# system.cluster.cpu#.commit.commitSquashedInsts # -# will display the icache as a stacked line chart. +# will display the inst counts (committed/squashed) as a stacked line chart. # Charts will still be configurable in Streamline. [PER_CPU_STATS] -# system.cpu#. will automatically prepended for per-CPU stats +# '#' will be automatically replaced with the correct CPU id. 
+ +commit_inst_count = +system.cluster.cpu#.committedInsts cycles = -num_busy_cycles -num_idle_cycles +system.cluster.cpu#.num_busy_cycles +system.cluster.cpu#.num_idle_cycles register_access = -num_int_register_reads -num_int_register_writes +system.cluster.cpu#.num_int_register_reads +system.cluster.cpu#.num_int_register_writes mem_refs = -num_mem_refs +system.cluster.cpu#.num_mem_refs inst_breakdown = -num_conditional_control_insts -num_int_insts -num_fp_insts -num_load_insts -num_store_insts +system.cluster.cpu#.num_conditional_control_insts +system.cluster.cpu#.num_int_insts +system.cluster.cpu#.num_fp_insts +system.cluster.cpu#.num_load_insts +system.cluster.cpu#.num_store_insts icache = -icache.overall_hits::total -icache.overall_misses::total +system.cluster.il1_cache#.overall_hits::total +system.cluster.il1_cache#.overall_misses::total dcache = -dcache.overall_hits::total -dcache.overall_misses::total - -[PER_SWITCHCPU_STATS] -# If starting from checkpoints, gem5 keeps CPU stats in system.switch_cpus# structures. -# List per-switchcpu stats here if any -# system.switch_cpus# will automatically prepended for per-CPU stats +system.cluster.dl1_cache#.overall_hits::total +system.cluster.dl1_cache#.overall_misses::total [PER_L2_STATS] +# '#' will be automatically replaced with the correct L2 id. 
l2_cache = -overall_hits::total -overall_misses::total +system.cluster.l2_cache#.overall_hits::total +system.cluster.l2_cache#.overall_misses::total [OTHER_STATS] +# Anything that doesn't belong to CPU or L2 caches physmem = -system.physmem.bw_total::total +system.memsys.mem_ctrls.bytes_read::total +system.memsys.mem_ctrls.bytes_written::total diff -r dfebd39c48a7 -r 2d6d7a056a38 util/streamline/m5stats2streamline.py --- a/util/streamline/m5stats2streamline.py Wed Sep 03 07:43:01 2014 -0400 +++ b/util/streamline/m5stats2streamline.py Wed Sep 03 07:43:02 2014 -0400 @@ -1,6 +1,6 @@ #!/usr/bin/env python -# Copyright (c) 2012 ARM Limited +# Copyright (c) 2012, 2014 ARM Limited # All rights reserved # # The license below extends only to copyright in the software and shall @@ -142,18 +142,18 @@ print ERROR: config file ', config_file, ' not found sys.exit(1) -if config.has_section(system.cpu): +if config.has_section(system.cluster.cpu): num_cpus = 1 else: num_cpus = 0 -while config.has_section(system.cpu + str(num_cpus)): +while config.has_section(system.cluster.cpu + str(num_cpus)): num_cpus += 1 -if config.has_section(system.l2): +if config.has_section(system.cluster.l2_cache): num_l2 = 1 else: num_l2 = 0 -while config.has_section(system.l2 + str(num_l2)): +while config.has_section(system.cluster.l2_cache + str(num_l2)): num_l2 += 1 print Num CPUs:, num_cpus @@ -713,7 +713,7 @@ # StatsEntry that contains individual statistics class StatsEntry(object): -def __init__(self, name,
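The '#' placeholder scheme introduced in the config files above amounts to a per-id string substitution. The actual script is Python and its implementation is not visible in this truncated diff. The sketch below is a C++ illustration, chosen to match the other examples in this thread, of what such an expansion could look like. The function name expandStatName is hypothetical, and the handling of a single unnumbered CPU or L2 is deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Expand a stat name pattern such as "system.cluster.cpu#.committedInsts"
// into one concrete name per id in [0, count). Every '#' in the pattern
// is replaced by the decimal id. This is an assumed behaviour, modelled
// on the comments in the config file, not the script's own code.
std::vector<std::string> expandStatName(const std::string &pattern,
                                        unsigned count)
{
    std::vector<std::string> names;
    for (unsigned id = 0; id < count; ++id) {
        const std::string idStr = std::to_string(id);
        std::string name = pattern;
        std::string::size_type pos = 0;
        while ((pos = name.find('#', pos)) != std::string::npos) {
            name.replace(pos, 1, idStr);
            pos += idStr.size();  // continue after the inserted digits
        }
        names.push_back(name);
    }
    return names;
}
```

With two CPUs, the pattern system.cluster.cpu#.committedInsts would expand to the cpu0 and cpu1 variants, which is what allows one config entry to cover any number of CPUs or L2 caches.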
[gem5-dev] changeset in gem5: arm: use condition code registers for ARM ISA
changeset 8bee5f4edb92 in /z/repo/gem5 details: http://repo.gem5.org/gem5?cmd=changeset;node=8bee5f4edb92 description: arm: use condition code registers for ARM ISA Analogous to ee049bf (for x86). Requires a bump of the checkpoint version and corresponding upgrader code to move the condition code register values to the new register file. diffstat: src/arch/arm/ccregs.hh| 85 +++ src/arch/arm/faults.cc| 18 src/arch/arm/insts/static_inst.cc | 5 +- src/arch/arm/intregs.hh | 5 -- src/arch/arm/isa.cc | 14 +++--- src/arch/arm/isa.hh | 6 +- src/arch/arm/isa/operands.isa | 54 src/arch/arm/miscregs.hh | 19 src/arch/arm/nativetrace.cc | 12 ++-- src/arch/arm/registers.hh | 14 -- src/arch/arm/utility.cc | 6 +- src/cpu/o3/O3CPU.py | 2 +- src/cpu/simple_thread.hh | 1 + src/sim/serialize.hh | 2 +- util/cpt_upgrader.py | 28 15 files changed, 184 insertions(+), 87 deletions(-) diffs (truncated from 538 to 300 lines): diff -r 85001c018d4c -r 8bee5f4edb92 src/arch/arm/ccregs.hh --- /dev/null Thu Jan 01 00:00:00 1970 + +++ b/src/arch/arm/ccregs.hhTue Apr 29 16:05:02 2014 -0500 @@ -0,0 +1,85 @@ +/* + * Copyright (c) 2014 ARM Limited + * All rights reserved + * + * The license below extends only to copyright in the software and shall + * not be construed as granting a license to any other intellectual + * property including but not limited to intellectual property relating + * to a hardware implementation of the functionality of the software + * licensed hereunder. You may use the software subject to the license + * terms below provided that you ensure that this notice is replicated + * unmodified and in its entirety in all distributions of the software, + * modified or unmodified, in source code or in binary form. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are + * met: redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer; + * redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution; + * neither the name of the copyright holders nor the names of its + * contributors may be used to endorse or promote products derived from + * this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ * + * Authors: Curtis Dunham + */ +#ifndef __ARCH_ARM_CCREGS_HH__ +#define __ARCH_ARM_CCREGS_HH__ + +namespace ArmISA +{ + +enum ccRegIndex { +CCREG_NZ, +CCREG_C, +CCREG_V, +CCREG_GE, +CCREG_FP, +CCREG_ZERO, +NUM_CCREGS +}; + +const char * const ccRegName[NUM_CCREGS] = { +nz, +c, +v, +ge, +fp, +zero +}; + +enum ConditionCode { +COND_EQ = 0, +COND_NE, // 1 +COND_CS, // 2 +COND_CC, // 3 +COND_MI, // 4 +COND_PL, // 5 +COND_VS, // 6 +COND_VC, // 7 +COND_HI, // 8 +COND_LS, // 9 +COND_GE, // 10 +COND_LT, // 11 +COND_GT, // 12 +COND_LE, // 13 +COND_AL, // 14 +COND_UC // 15 +}; + +} + +#endif // __ARCH_ARM_CCREGS_HH__ diff -r 85001c018d4c -r 8bee5f4edb92 src/arch/arm/faults.cc --- a/src/arch/arm/faults.ccWed Sep 03 07:42:43 2014 -0400 +++ b/src/arch/arm/faults.ccTue Apr 29 16:05:02 2014 -0500 @@ -1,5 +1,5 @@ /* - * Copyright (c) 2010, 2012-2013 ARM Limited + * Copyright (c) 2010, 2012-2014 ARM Limited * All rights reserved * * The license below extends only to copyright in the software and shall @@ -466,10 +466,10 @@ SCTLR sctlr = tc-readMiscReg(MISCREG_SCTLR); SCR scr = tc-readMiscReg(MISCREG_SCR); CPSR saved_cpsr = tc-readMiscReg(MISCREG_CPSR); -saved_cpsr.nz = tc-readIntReg(INTREG_CONDCODES_NZ); -saved_cpsr.c = tc-readIntReg(INTREG_CONDCODES_C); -
[gem5-dev] changeset in gem5: cpu: Fix cache blocked load behavior in o3 cpu
changeset 6be8945d226b in /z/repo/gem5 details: http://repo.gem5.org/gem5?cmd=changeset;node=6be8945d226b description: cpu: Fix cache blocked load behavior in o3 cpu This patch fixes the load blocked/replay mechanism in the o3 cpu. Rather than flushing the entire pipeline, this patch replays loads once the cache becomes unblocked. Additionally, deferred memory instructions (loads which had conflicting stores), when replayed would not respect the number of functional units (only respected issue width). This patch also corrects that. Improvements over 20% have been observed on a microbenchmark designed to exercise this behavior. diffstat: src/cpu/o3/iew.hh | 13 +- src/cpu/o3/iew_impl.hh | 57 ++ src/cpu/o3/inst_queue.hh| 25 - src/cpu/o3/inst_queue_impl.hh | 68 ++--- src/cpu/o3/lsq.hh | 27 +- src/cpu/o3/lsq_impl.hh | 23 +--- src/cpu/o3/lsq_unit.hh | 198 --- src/cpu/o3/lsq_unit_impl.hh | 40 ++- src/cpu/o3/mem_dep_unit.hh |4 +- src/cpu/o3/mem_dep_unit_impl.hh |4 +- 10 files changed, 203 insertions(+), 256 deletions(-) diffs (truncated from 846 to 300 lines): diff -r 1ba825974ee6 -r 6be8945d226b src/cpu/o3/iew.hh --- a/src/cpu/o3/iew.hh Wed Sep 03 07:42:38 2014 -0400 +++ b/src/cpu/o3/iew.hh Wed Sep 03 07:42:39 2014 -0400 @@ -1,5 +1,5 @@ /* - * Copyright (c) 2010-2012 ARM Limited + * Copyright (c) 2010-2012, 2014 ARM Limited * All rights reserved * * The license below extends only to copyright in the software and shall @@ -181,6 +181,12 @@ /** Re-executes all rescheduled memory instructions. */ void replayMemInst(DynInstPtr inst); +/** Moves memory instruction onto the list of cache blocked instructions */ +void blockMemInst(DynInstPtr inst); + +/** Notifies that the cache has become unblocked */ +void cacheUnblocked(); + /** Sends an instruction to commit through the time buffer. 
*/ void instToCommit(DynInstPtr inst); @@ -233,11 +239,6 @@ */ void squashDueToMemOrder(DynInstPtr inst, ThreadID tid); -/** Sends commit proper information for a squash due to memory becoming - * blocked (younger issued instructions must be retried). - */ -void squashDueToMemBlocked(DynInstPtr inst, ThreadID tid); - /** Sets Dispatch to blocked, and signals back to other stages to block. */ void block(ThreadID tid); diff -r 1ba825974ee6 -r 6be8945d226b src/cpu/o3/iew_impl.hh --- a/src/cpu/o3/iew_impl.hhWed Sep 03 07:42:38 2014 -0400 +++ b/src/cpu/o3/iew_impl.hhWed Sep 03 07:42:39 2014 -0400 @@ -530,29 +530,6 @@ templateclass Impl void -DefaultIEWImpl::squashDueToMemBlocked(DynInstPtr inst, ThreadID tid) -{ -DPRINTF(IEW, [tid:%i]: Memory blocked, squashing load and younger insts, -PC: %s [sn:%i].\n, tid, inst-pcState(), inst-seqNum); -if (!toCommit-squash[tid] || -inst-seqNum toCommit-squashedSeqNum[tid]) { -toCommit-squash[tid] = true; - -toCommit-squashedSeqNum[tid] = inst-seqNum; -toCommit-pc[tid] = inst-pcState(); -toCommit-mispredictInst[tid] = NULL; - -// Must include the broadcasted SN in the squash. -toCommit-includeSquashInst[tid] = true; - -ldstQueue.setLoadBlockedHandled(tid); - -wroteToTimeBuffer = true; -} -} - -templateclass Impl -void DefaultIEWImpl::block(ThreadID tid) { DPRINTF(IEW, [tid:%u]: Blocking.\n, tid); @@ -610,6 +587,20 @@ templateclass Impl void +DefaultIEWImpl::blockMemInst(DynInstPtr inst) +{ +instQueue.blockMemInst(inst); +} + +templateclass Impl +void +DefaultIEWImpl::cacheUnblocked() +{ +instQueue.cacheUnblocked(); +} + +templateclass Impl +void DefaultIEWImpl::instToCommit(DynInstPtr inst) { // This function should not be called after writebackInsts in a @@ -1376,15 +1367,6 @@ squashDueToMemOrder(violator, tid); ++memOrderViolationEvents; -} else if (ldstQueue.loadBlocked(tid) - !ldstQueue.isLoadBlockedHandled(tid)) { -fetchRedirect[tid] = true; - -DPRINTF(IEW, Load operation couldn't execute because the -memory system is blocked. 
PC: %s [sn:%lli]\n, -inst-pcState(), inst-seqNum); - -squashDueToMemBlocked(inst, tid); } } else { // Reset any state associated with redirects that will not @@ -1403,17 +1385,6 @@ ++memOrderViolationEvents; } -if (ldstQueue.loadBlocked(tid) -!ldstQueue.isLoadBlockedHandled(tid)) { -DPRINTF(IEW, Load operation couldn't execute because the -memory system is blocked. PC: %s [sn:%lli]\n, -
[gem5-dev] changeset in gem5: tests: Use medium dataset for perlbmk regress...
changeset ee383b8e4d3f in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=ee383b8e4d3f
description:
    tests: Use medium dataset for perlbmk regressions

    This patch changes the perlbmk regression script from the large to
    the medium dataset to reduce the regression run time. For all ISAs
    and CPU models, the total perlbmk host CPU time with the large
    dataset is roughly 12 hours (constituting 30% of the total
    regression host time). There is, most likely, almost no added value
    in terms of code coverage for this rather excessive run time.

diffstat:

 tests/long/se/40.perlbmk/test.py |  2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diffs (10 lines):

diff -r 35241e33c38f -r ee383b8e4d3f tests/long/se/40.perlbmk/test.py
--- a/tests/long/se/40.perlbmk/test.py	Wed Sep 03 07:42:56 2014 -0400
+++ b/tests/long/se/40.perlbmk/test.py	Wed Sep 03 07:42:57 2014 -0400
@@ -29,5 +29,5 @@
 m5.util.addToPath('../configs/common')
 from cpu2000 import perlbmk_makerand
 
-workload = perlbmk_makerand(isa, opsys, 'lgred')
+workload = perlbmk_makerand(isa, opsys, 'mdred')
 root.system.cpu[0].workload = workload.makeLiveProcess()
___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev
[gem5-dev] changeset in gem5: config: Refactor RealviewEMM to fit into new ...
changeset dfebd39c48a7 in /z/repo/gem5 details: http://repo.gem5.org/gem5?cmd=changeset;node=dfebd39c48a7 description: config: Refactor RealviewEMM to fit into new config system This eliminates some default devices and adds in helper functions to connect the devices defined here to associate with the proper clock domains. diffstat: configs/common/FSConfig.py |3 + src/dev/arm/RealView.py| 169 +++- 2 files changed, 152 insertions(+), 20 deletions(-) diffs (283 lines): diff -r 5f1f92bf76ee -r dfebd39c48a7 configs/common/FSConfig.py --- a/configs/common/FSConfig.pyWed Sep 03 07:42:59 2014 -0400 +++ b/configs/common/FSConfig.pyWed Sep 03 07:43:01 2014 -0400 @@ -221,6 +221,9 @@ self.cf0 = CowIdeDisk(driveID='master') self.cf0.childImage(mdesc.disk()) + +# Attach any PCI devices this platform supports +self.realview.attachPciDevices() # default to an IDE controller rather than a CF one # assuming we've got one; EMM64 is an exception for the moment if machine_type != VExpress_EMM64: diff -r 5f1f92bf76ee -r dfebd39c48a7 src/dev/arm/RealView.py --- a/src/dev/arm/RealView.py Wed Sep 03 07:42:59 2014 -0400 +++ b/src/dev/arm/RealView.py Wed Sep 03 07:43:01 2014 -0400 @@ -1,4 +1,4 @@ -# Copyright (c) 2009-2013 ARM Limited +# Copyright (c) 2009-2014 ARM Limited # All rights reserved. 
# # The license below extends only to copyright in the software and shall @@ -44,7 +44,7 @@ from m5.proxy import * from Device import BasicPioDevice, PioDevice, IsaFake, BadAddr, DmaDevice from Pci import PciConfigAll -from Ethernet import NSGigE, IGbE_e1000, IGbE_igb +from Ethernet import NSGigE, IGbE_igb, IGbE_e1000 from Ide import * from Platform import Platform from Terminal import Terminal @@ -184,6 +184,18 @@ mem_start_addr = Param.Addr(0, Start address of main memory) max_mem_size = Param.Addr('256MB', Maximum amount of RAM supported by platform) +def attachPciDevices(self): +pass + +def enableMSIX(self): +pass + +def onChipIOClkDomain(self, clkdomain): +pass + +def offChipIOClkDomain(self, clkdomain): +pass + def setupBootLoader(self, mem_bus, cur_sys, loc): self.nvmem = SimpleMemory(range = AddrRange('2GB', size = '64MB'), conf_table_reported = False) @@ -250,6 +262,14 @@ self.flash_fake.pio_addr + \ self.flash_fake.pio_size - 1)] +# Set the clock domain for IO objects that are considered +# to be close to the cores. +def onChipIOClkDomain(self, clkdomain): +self.gic.clk_domain = clkdomain +self.l2x0_fake.clk_domain = clkdomain +self.a9scu.clkdomain= clkdomain +self.local_cpu_timer.clk_domain = clkdomain + # Attach I/O devices to specified bus object. Can't do this # earlier, since the bus object itself is typically defined at the # System level. @@ -282,12 +302,40 @@ self.rtc.pio = bus.master self.flash_fake.pio= bus.master +# Set the clock domain for IO objects that are considered +# to be far away from the cores. 
+def offChipIOClkDomain(self, clkdomain): +self.uart.clk_domain = clkdomain +self.realview_io.clk_domain = clkdomain +self.timer0.clk_domain= clkdomain +self.timer1.clk_domain= clkdomain +self.clcd.clk_domain = clkdomain +self.kmi0.clk_domain = clkdomain +self.kmi1.clk_domain = clkdomain +self.cf_ctrl.clk_domain = clkdomain +self.dmac_fake.clk_domain = clkdomain +self.uart1_fake.clk_domain= clkdomain +self.uart2_fake.clk_domain= clkdomain +self.uart3_fake.clk_domain= clkdomain +self.smc_fake.clk_domain = clkdomain +self.sp810_fake.clk_domain= clkdomain +self.watchdog_fake.clk_domain = clkdomain +self.gpio0_fake.clk_domain= clkdomain +self.gpio1_fake.clk_domain= clkdomain +self.gpio2_fake.clk_domain= clkdomain +self.ssp_fake.clk_domain = clkdomain +self.sci_fake.clk_domain = clkdomain +self.aaci_fake.clk_domain = clkdomain +self.mmc_fake.clk_domain = clkdomain +self.rtc.clk_domain = clkdomain +self.flash_fake.clk_domain= clkdomain + # Reference for memory map and interrupt number # RealView Emulation Baseboard User Guide (ARM DUI 0143B) # Chapter 4: Programmer's Reference class RealViewEB(RealView): uart = Pl011(pio_addr=0x10009000, int_num=44) -realview_io = RealViewCtrl(pio_addr=0x1000) +realview_io = RealViewCtrl(pio_addr=0x1000, idreg=0x01400500) gic = Pl390(dist_addr=0x10041000, cpu_addr=0x1004) timer0
[gem5-dev] changeset in gem5: arm: Assume we have a kernel that supports pc...
changeset 1aff1376921e in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=1aff1376921e
description:
    arm: Assume we have a kernel that supports pci devices

    Change the default kernel for AArch64 and since it supports PCI
    devices remove the hack that made it use CF. Unfortunately, there
    isn't really a half-way here and we need to switch. Current users
    will get an error message that the kernel isn't found and hopefully
    go download a new kernel that supports PCI.

diffstat:

 configs/common/FSConfig.py |  12 ++++--------
 1 files changed, 4 insertions(+), 8 deletions(-)

diffs (29 lines):

diff -r 198dfef33403 -r 1aff1376921e configs/common/FSConfig.py
--- a/configs/common/FSConfig.py	Wed Sep 03 07:43:04 2014 -0400
+++ b/configs/common/FSConfig.py	Wed Sep 03 07:43:04 2014 -0400
@@ -225,13 +225,9 @@
     # Attach any PCI devices this platform supports
     self.realview.attachPciDevices()
     # default to an IDE controller rather than a CF one
-    # assuming we've got one; EMM64 is an exception for the moment
-    if machine_type != "VExpress_EMM64":
-        try:
-            self.realview.ide.disks = [self.cf0]
-        except:
-            self.realview.cf_ctrl.disks = [self.cf0]
-    else:
+    try:
+        self.realview.ide.disks = [self.cf0]
+    except:
         self.realview.cf_ctrl.disks = [self.cf0]
 
     if bare_metal:
@@ -241,7 +237,7 @@
                                      size = mdesc.mem())]
     else:
         if machine_type == "VExpress_EMM64":
-            self.kernel = binary('vmlinux-3.14-aarch64-vexpress-emm64')
+            self.kernel = binary('vmlinux-3.16-aarch64-vexpress-emm64-pcie')
         elif machine_type == "VExpress_EMM":
             self.kernel = binary('vmlinux-3.3-arm-vexpress-emm-pcie')
         else:
[gem5-dev] changeset in gem5: arm: Mark v7 cbz instructions as direct branches
changeset 5e424aa952c5 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=5e424aa952c5
description:
    arm: Mark v7 cbz instructions as direct branches

    v7 cbz/cbnz instructions were improperly marked as indirect branches.

diffstat:

 src/arch/arm/isa/insts/branch.isa     |  11 +++++++----
 src/arch/arm/isa/templates/branch.isa |   6 +++++-
 2 files changed, 12 insertions(+), 5 deletions(-)

diffs (52 lines):

diff -r 6be8945d226b -r 5e424aa952c5 src/arch/arm/isa/insts/branch.isa
--- a/src/arch/arm/isa/insts/branch.isa	Wed Sep 03 07:42:39 2014 -0400
+++ b/src/arch/arm/isa/insts/branch.isa	Wed Sep 03 07:42:40 2014 -0400
@@ -1,6 +1,6 @@
 // -*- mode:c++ -*-
 
-// Copyright (c) 2010-2012 ARM Limited
+// Copyright (c) 2010-2012, 2014 ARM Limited
 // All rights reserved
 //
 // The license below extends only to copyright in the software and shall
@@ -174,12 +174,15 @@
     #CBNZ, CBZ. These are always unconditional as far as predicates
     for (mnem, test) in (("cbz", "=="), ("cbnz", "!=")):
         code = 'NPC = (uint32_t)(PC + imm);\n'
+        br_tgt_code = '''pcs.instNPC((uint32_t)(branchPC.instPC() + imm));'''
         predTest = "Op1 %(test)s 0" % {"test": test}
         iop = InstObjParams(mnem, mnem.capitalize(), "BranchImmReg",
-                            {"code": code, "predicate_test": predTest},
-                            ["IsIndirectControl"])
+                            {"code": code, "predicate_test": predTest,
+                             "brTgtCode" : br_tgt_code},
+                            ["IsDirectControl"])
         header_output += BranchImmRegDeclare.subst(iop)
-        decoder_output += BranchImmRegConstructor.subst(iop)
+        decoder_output += BranchImmRegConstructor.subst(iop) + \
+                          BranchTarget.subst(iop)
         exec_output += PredOpExecute.subst(iop)
 
     #TBB, TBH
diff -r 6be8945d226b -r 5e424aa952c5 src/arch/arm/isa/templates/branch.isa
--- a/src/arch/arm/isa/templates/branch.isa	Wed Sep 03 07:42:39 2014 -0400
+++ b/src/arch/arm/isa/templates/branch.isa	Wed Sep 03 07:42:40 2014 -0400
@@ -1,6 +1,6 @@
 // -*- mode:c++ -*-
 
-// Copyright (c) 2010 ARM Limited
+// Copyright (c) 2010, 2014 ARM Limited
 // All rights reserved
 //
 // The license below extends only to copyright in the software and shall
@@ -212,6 +212,10 @@
     %(class_name)s(ExtMachInst machInst,
                    int32_t imm, IntRegIndex _op1);
     %(BasicExecDeclare)s
+    ArmISA::PCState branchTarget(const ArmISA::PCState &branchPC) const;
+
+    /// Explicitly import the otherwise hidden branchTarget
+    using StaticInst::branchTarget;
 };
 }};
 
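What makes cbz/cbnz a direct branch is that the target depends only on the instruction's own PC and its immediate, so it can be computed without executing anything. The snippet below is a minimal Python sketch of that arithmetic, mirroring the `(uint32_t)(PC + imm)` expression in the diff. The function name is hypothetical and this is not gem5 code.

```python
def branch_target(pc, imm):
    """Return the 32-bit target of a PC-relative branch.

    Models the (uint32_t)(PC + imm) cast from the patch: the sum is
    truncated to 32 bits, so it wraps around at 2**32.
    """
    return (pc + imm) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(branch_target(0x1000, 8)))
```

Because no register value is involved in the target itself (the register only decides whether the branch is taken), a predictor can resolve the target at decode time.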
[gem5-dev] changeset in gem5: cpu: Fix o3 quiesce fetch bug
changeset 1ba825974ee6 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=1ba825974ee6
description:
    cpu: Fix o3 quiesce fetch bug

    O3 is supposed to stop fetching instructions once a quiesce is
    encountered. However, due to a bug, it would continue fetching
    instructions from the current fetch buffer. This is because of a
    break statement that only broke out of the first of 2 nested loops.
    It should have broken out of both.

diffstat:

 src/cpu/o3/fetch_impl.hh |  8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diffs (34 lines):

diff -r ed05298e8566 -r 1ba825974ee6 src/cpu/o3/fetch_impl.hh
--- a/src/cpu/o3/fetch_impl.hh	Wed Sep 03 07:42:37 2014 -0400
+++ b/src/cpu/o3/fetch_impl.hh	Wed Sep 03 07:42:38 2014 -0400
@@ -1236,6 +1236,9 @@
     // ended this fetch block.
     bool predictedBranch = false;
 
+    // Need to halt fetch if quiesce instruction detected
+    bool quiesce = false;
+
     TheISA::MachInst *cacheInsts =
         reinterpret_cast<TheISA::MachInst *>(fetchBuffer[tid]);
 
@@ -1246,7 +1249,7 @@
     // Keep issuing while fetchWidth is available and branch is not
     // predicted taken
     while (numInst < fetchWidth && fetchQueue[tid].size() < fetchQueueSize
-           && !predictedBranch) {
+           && !predictedBranch && !quiesce) {
         // We need to process more memory if we aren't going to get a
         // StaticInst from the rom, the current macroop, or what's already
         // in the decoder.
@@ -1363,9 +1366,10 @@
 
             if (instruction->isQuiesce()) {
                 DPRINTF(Fetch,
-                        "Quiesce instruction encountered, halting fetch!");
+                        "Quiesce instruction encountered, halting fetch!\n");
                 fetchStatus[tid] = QuiescePending;
                 status_change = true;
+                quiesce = true;
                 break;
             }
         } while ((curMacroop || decoder[tid]->instReady()) &&
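The quiesce bug is an instance of a general nested-loop pitfall: a `break` leaves only the innermost loop. The following is a minimal Python sketch of the fix pattern used in the patch, namely a flag set in the inner loop and tested in the outer loop condition. All names and the data layout are hypothetical; this models the control flow only and is not gem5 code.

```python
def fetch(blocks, fetch_width):
    """Model a two-level fetch loop that must stop entirely at a quiesce.

    `blocks` is a list of lists of instruction names; the string
    "quiesce" stands in for a quiesce instruction.
    """
    fetched = []
    quiesce = False  # set by the inner loop, checked by the outer loop
    block_iter = iter(blocks)

    while len(fetched) < fetch_width and not quiesce:
        block = next(block_iter, None)
        if block is None:
            break  # nothing left to fetch
        for inst in block:
            if len(fetched) >= fetch_width:
                break
            fetched.append(inst)
            if inst == "quiesce":
                # The break below exits only the inner loop; the flag is
                # what stops the outer loop from fetching more blocks.
                quiesce = True
                break
    return fetched


if __name__ == "__main__":
    print(fetch([["add", "quiesce", "sub"], ["mul"]], fetch_width=8))
```

Dropping `and not quiesce` from the outer condition reproduces the reported behaviour in this model: the loop would go on to fetch `mul` from the next block.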
[gem5-dev] changeset in gem5: arm: Make memory ops work on 64bit/128-bit qu...
changeset d96b61d843b2 in /z/repo/gem5 details: http://repo.gem5.org/gem5?cmd=changeset;node=d96b61d843b2 description: arm: Make memory ops work on 64bit/128-bit quantities Multiple instructions assume only 32-bit load operations are available, this patch increases load sizes to 64-bit or 128-bit for many load pair and load multiple instructions. diffstat: src/arch/arm/insts/macromem.cc | 388 ++- src/arch/arm/insts/macromem.hh | 22 +- src/arch/arm/isa/insts/ldr64.isa| 90 +++--- src/arch/arm/isa/insts/macromem.isa | 24 +- src/arch/arm/isa/insts/mem.isa |4 +- src/arch/arm/isa/templates/macromem.isa | 35 ++- 6 files changed, 355 insertions(+), 208 deletions(-) diffs (truncated from 864 to 300 lines): diff -r b5bef3c8e070 -r d96b61d843b2 src/arch/arm/insts/macromem.cc --- a/src/arch/arm/insts/macromem.ccFri Jun 27 12:29:00 2014 -0500 +++ b/src/arch/arm/insts/macromem.ccWed Sep 03 07:42:52 2014 -0400 @@ -61,14 +61,29 @@ { uint32_t regs = reglist; uint32_t ones = number_of_ones(reglist); -// Remember that writeback adds a uop or two and the temp register adds one -numMicroops = ones + (writeback ? (load ? 2 : 1) : 0) + 1; +uint32_t mem_ops = ones; -// It's technically legal to do a lot of nothing -if (!ones) +// Copy the base address register if we overwrite it, or if this instruction +// is basically a no-op (we have to do something) +bool copy_base = (bits(reglist, rn) load) || !ones; +bool force_user = user !bits(reglist, 15); +bool exception_ret = user bits(reglist, 15); +bool pc_temp = load writeback bits(reglist, 15); + +if (!ones) { numMicroops = 1; +} else if (load) { +numMicroops = ((ones + 1) / 2) ++ ((ones % 2 == 0 exception_ret) ? 1 : 0) ++ (copy_base ? 1 : 0) ++ (writeback? 1 : 0) ++ (pc_temp ? 1 : 0); +} else { +numMicroops = ones + (writeback ? 1 : 0); +} microOps = new StaticInstPtr[numMicroops]; + uint32_t addr = 0; if (!up) @@ -81,94 +96,129 @@ // Add 0 to Rn and stick it in ureg0. // This is equivalent to a move. 
-*uop = new MicroAddiUop(machInst, INTREG_UREG0, rn, 0); +if (copy_base) +*uop++ = new MicroAddiUop(machInst, INTREG_UREG0, rn, 0); unsigned reg = 0; -unsigned regIdx = 0; -bool force_user = user !bits(reglist, 15); -bool exception_ret = user bits(reglist, 15); +while (mem_ops != 0) { +// Do load operations in pairs if possible +if (load mem_ops = 2 +!(mem_ops == 2 bits(regs,INTREG_PC) exception_ret)) { +// 64-bit memory operation +// Find 2 set register bits (clear them after finding) +unsigned reg_idx1; +unsigned reg_idx2; -for (int i = 0; i ones; i++) { -// Find the next register. -while (!bits(regs, reg)) -reg++; -replaceBits(regs, reg, 0); +// Find the first register +while (!bits(regs, reg)) reg++; +replaceBits(regs, reg, 0); +reg_idx1 = force_user ? intRegInMode(MODE_USER, reg) : reg; -regIdx = reg; -if (force_user) { -regIdx = intRegInMode(MODE_USER, regIdx); -} +// Find the second register +while (!bits(regs, reg)) reg++; +replaceBits(regs, reg, 0); +reg_idx2 = force_user ? intRegInMode(MODE_USER, reg) : reg; -if (load) { -if (writeback i == ones - 1) { -// If it's a writeback and this is the last register -// do the load into a temporary register which we'll move -// into the final one later -*++uop = new MicroLdrUop(machInst, INTREG_UREG1, INTREG_UREG0, -up, addr); -} else { -// Otherwise just do it normally -if (reg == INTREG_PC exception_ret) { -// This must be the exception return form of ldm. -*++uop = new MicroLdrRetUop(machInst, regIdx, - INTREG_UREG0, up, addr); +// Load into temp reg if necessary +if (reg_idx2 == INTREG_PC pc_temp) +reg_idx2 = INTREG_UREG1; + +// Actually load both registers from memory +*uop = new MicroLdr2Uop(machInst, reg_idx1, reg_idx2, +copy_base ? INTREG_UREG0 : rn, up, addr); + +if (!writeback reg_idx2 == INTREG_PC) { +// No writeback if idx==pc, set appropriate flags +(*uop)-setFlag(StaticInst::IsControl); +(*uop)-setFlag(StaticInst::IsIndirectControl); + +if (!(condCode == COND_AL || condCode == COND_UC)) +
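The new micro-op count for load/store multiple is easier to follow when written out as a function. The sketch below is one Python reading of the `numMicroops` computation in the diff above. It assumes that the boolean operators missing from the flattened text were `&&`; the function name and argument layout are hypothetical, and this is not the gem5 implementation.

```python
def num_microops(reglist, rn, load, writeback, user):
    """Count micro-ops for a load/store multiple, per the patched formula.

    reglist is the 16-bit register mask, rn the base register index.
    Loads are paired, so roughly half as many memory micro-ops are needed.
    """
    ones = bin(reglist & 0xFFFF).count("1")
    pc_in_list = bool(reglist & (1 << 15))

    # Assumed reading of the flattened diff (operators restored as &&):
    copy_base = (bool(reglist & (1 << rn)) and load) or ones == 0
    exception_ret = user and pc_in_list
    pc_temp = load and writeback and pc_in_list

    if ones == 0:
        return 1  # a no-op still needs one micro-op
    if load:
        return ((ones + 1) // 2
                + (1 if (ones % 2 == 0 and exception_ret) else 0)
                + (1 if copy_base else 0)
                + (1 if writeback else 0)
                + (1 if pc_temp else 0))
    return ones + (1 if writeback else 0)


if __name__ == "__main__":
    # Four registers loaded, base register r13 not in the list.
    print(num_microops(0b1111, rn=13, load=True, writeback=False, user=False))
```

In this reading, a four-register load without writeback takes two paired loads, whereas the old formula (`ones + ... + 1`) charged at least five micro-ops.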
[gem5-dev] changeset in gem5: tests: Use O3_ARM_v7a config for full-system ...
changeset 60dddc0a6f78 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=60dddc0a6f78
description:
    tests: Use O3_ARM_v7a config for full-system ARM regressions

    This patch changes the CPU configuration used for the full-system
    ARM regressions to increase the test coverage. Note that it is only
    the core configuration, and not the caches etc.

diffstat:

 tests/configs/realview-o3-checker.py |  3 ++-
 tests/configs/realview-o3-dual.py    |  3 ++-
 tests/configs/realview-o3.py         |  3 ++-
 3 files changed, 6 insertions(+), 3 deletions(-)

diffs (41 lines):

diff -r 1b627a6ddac0 -r 60dddc0a6f78 tests/configs/realview-o3-checker.py
--- a/tests/configs/realview-o3-checker.py	Wed Sep 03 07:42:41 2014 -0400
+++ b/tests/configs/realview-o3-checker.py	Wed Sep 03 07:42:41 2014 -0400
@@ -37,8 +37,9 @@
 
 from m5.objects import *
 from arm_generic import *
+from O3_ARM_v7a import O3_ARM_v7a_3
 
 root = LinuxArmFSSystemUniprocessor(mem_mode='timing',
                                     mem_class=DDR3_1600_x64,
-                                    cpu_class=DerivO3CPU,
+                                    cpu_class=O3_ARM_v7a_3,
                                     checker=True).create_root()
diff -r 1b627a6ddac0 -r 60dddc0a6f78 tests/configs/realview-o3-dual.py
--- a/tests/configs/realview-o3-dual.py	Wed Sep 03 07:42:41 2014 -0400
+++ b/tests/configs/realview-o3-dual.py	Wed Sep 03 07:42:41 2014 -0400
@@ -37,8 +37,9 @@
 
 from m5.objects import *
 from arm_generic import *
+from O3_ARM_v7a import O3_ARM_v7a_3
 
 root = LinuxArmFSSystem(mem_mode='timing',
                         mem_class=DDR3_1600_x64,
-                        cpu_class=DerivO3CPU,
+                        cpu_class=O3_ARM_v7a_3,
                         num_cpus=2).create_root()
diff -r 1b627a6ddac0 -r 60dddc0a6f78 tests/configs/realview-o3.py
--- a/tests/configs/realview-o3.py	Wed Sep 03 07:42:41 2014 -0400
+++ b/tests/configs/realview-o3.py	Wed Sep 03 07:42:41 2014 -0400
@@ -37,7 +37,8 @@
 
 from m5.objects import *
 from arm_generic import *
+from O3_ARM_v7a import O3_ARM_v7a_3
 
 root = LinuxArmFSSystemUniprocessor(mem_mode='timing',
                                     mem_class=DDR3_1600_x64,
-                                    cpu_class=DerivO3CPU).create_root()
+                                    cpu_class=O3_ARM_v7a_3).create_root()
[gem5-dev] changeset in gem5: dev: seperate legacy io offsets from PCI offset
changeset 1e2f39859382 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=1e2f39859382
description:
    dev: seperate legacy io offsets from PCI offset

    The PC platform has a single IO range that is used for both legacy
    IO and PCI IO, while other platforms may use separate regions.
    Provide another mechanism to configure the legacy IO base address
    range and set it to the PCI IO address range for x86.

diffstat:

 src/dev/Pci.py             |  1 +
 src/dev/pcidev.cc          |  2 +-
 src/dev/x86/SouthBridge.py |  1 +
 3 files changed, 3 insertions(+), 1 deletions(-)

diffs (34 lines):

diff -r 644b615fbe6a -r 1e2f39859382 src/dev/Pci.py
--- a/src/dev/Pci.py	Wed Sep 03 07:43:05 2014 -0400
+++ b/src/dev/Pci.py	Wed Sep 03 07:43:06 2014 -0400
@@ -98,6 +98,7 @@
     BAR3LegacyIO = Param.Bool(False, "Whether BAR3 is hardwired legacy IO")
     BAR4LegacyIO = Param.Bool(False, "Whether BAR4 is hardwired legacy IO")
     BAR5LegacyIO = Param.Bool(False, "Whether BAR5 is hardwired legacy IO")
+    LegacyIOBase = Param.Addr(0x0, "Base Address for Legacy IO")
 
     CardbusCIS = Param.UInt32(0x00, "Cardbus Card Information Structure")
     SubsystemID = Param.UInt16(0x00, "Subsystem ID")
diff -r 644b615fbe6a -r 1e2f39859382 src/dev/pcidev.cc
--- a/src/dev/pcidev.cc	Wed Sep 03 07:43:05 2014 -0400
+++ b/src/dev/pcidev.cc	Wed Sep 03 07:43:06 2014 -0400
@@ -213,7 +213,7 @@
 
     for (int i = 0; i < 6; ++i) {
         if (legacyIO[i]) {
-            BARAddrs[i] = platform->calcPciIOAddr(letoh(config.baseAddr[i]));
+            BARAddrs[i] = p->LegacyIOBase + letoh(config.baseAddr[i]);
             config.baseAddr[i] = 0;
         } else {
             BARAddrs[i] = 0;
diff -r 644b615fbe6a -r 1e2f39859382 src/dev/x86/SouthBridge.py
--- a/src/dev/x86/SouthBridge.py	Wed Sep 03 07:43:05 2014 -0400
+++ b/src/dev/x86/SouthBridge.py	Wed Sep 03 07:43:06 2014 -0400
@@ -84,6 +84,7 @@
     ide.ProgIF = 0x80
     ide.InterruptLine = 14
     ide.InterruptPin = 1
+    ide.LegacyIOBase = x86IOAddress(0)
 
     def attachIO(self, bus, dma_ports):
         # Route interupt signals
[gem5-dev] changeset in gem5: arm: Support 2GB of memory for AArch64 systems
changeset 644b615fbe6a in /z/repo/gem5 details: http://repo.gem5.org/gem5?cmd=changeset;node=644b615fbe6a description: arm: Support 2GB of memory for AArch64 systems diffstat: configs/common/FSConfig.py | 27 +++ src/dev/arm/RealView.py| 9 + 2 files changed, 24 insertions(+), 12 deletions(-) diffs (75 lines): diff -r 1aff1376921e -r 644b615fbe6a configs/common/FSConfig.py --- a/configs/common/FSConfig.pyWed Sep 03 07:43:04 2014 -0400 +++ b/configs/common/FSConfig.pyWed Sep 03 07:43:05 2014 -0400 @@ -246,19 +246,30 @@ if dtb_filename: self.dtb_filename = binary(dtb_filename) self.machine_type = machine_type -if convert.toMemorySize(mdesc.mem()) int(self.realview.max_mem_size): -print The currently selected ARM platforms doesn't support -print the amount of DRAM you've selected. Please try -print another platform -sys.exit(1) - # Ensure that writes to the UART actually go out early in the boot boot_flags = 'earlyprintk=pl011,0x1c09 console=ttyAMA0 ' + \ 'lpj=19988480 norandmaps rw loglevel=8 ' + \ 'mem=%s root=/dev/sda1' % mdesc.mem() -self.mem_ranges = [AddrRange(self.realview.mem_start_addr, - size = mdesc.mem())] +self.mem_ranges = [] +size_remain = long(Addr(mdesc.mem())) +for region in self.realview._mem_regions: +if size_remain long(region[1]): +self.mem_ranges.append(AddrRange(region[0], size=region[1])) +size_remain = size_remain - long(region[1]) +else: +self.mem_ranges.append(AddrRange(region[0], size=size_remain)) +size_remain = 0 +break +warn(Memory size specified spans more than one region. Creating \ + another memory controller for that range.) + +if size_remain 0: +fatal(The currently selected ARM platforms doesn't support \ + the amount of DRAM you've selected. 
Please try \ + another platform) + + self.realview.setupBootLoader(self.membus, self, binary) self.gic_cpu_addr = self.realview.gic.cpu_addr self.flags_addr = self.realview.realview_io.pio_addr + 0x30 diff -r 1aff1376921e -r 644b615fbe6a src/dev/arm/RealView.py --- a/src/dev/arm/RealView.py Wed Sep 03 07:43:04 2014 -0400 +++ b/src/dev/arm/RealView.py Wed Sep 03 07:43:05 2014 -0400 @@ -184,8 +184,7 @@ pci_cfg_base = Param.Addr(0, Base address of PCI Configuraiton Space) pci_cfg_gen_offsets = Param.Bool(False, Should the offsets used for PCI cfg access be compatible with the pci-generic-host or the legacy host bridge?) -mem_start_addr = Param.Addr(0, Start address of main memory) -max_mem_size = Param.Addr('256MB', Maximum amount of RAM supported by platform) +_mem_regions = [(Addr(0), Addr('256MB'))] def attachPciDevices(self): pass @@ -444,8 +443,7 @@ self.smcreg_fake.clk_domain = clkdomain class VExpress_EMM(RealView): -mem_start_addr = '2GB' -max_mem_size = '2GB' +_mem_regions = [(Addr('2GB'), Addr('2GB'))] pci_cfg_base = 0x3000 uart = Pl011(pio_addr=0x1c09, int_num=37) realview_io = RealViewCtrl(proc_id0=0x1400, proc_id1=0x1400, \ @@ -602,6 +600,9 @@ class VExpress_EMM64(VExpress_EMM): pci_io_base = 0x2f00 pci_cfg_gen_offsets = True +# Three memory regions are specified totalling 512GB +_mem_regions = [(Addr('2GB'), Addr('2GB')), (Addr('34GB'), Addr('30GB')), +(Addr('512GB'), Addr('480GB'))] def setupBootLoader(self, mem_bus, cur_sys, loc): self.nvmem = SimpleMemory(range = AddrRange(0, size = '64MB')) self.nvmem.port = mem_bus.master ___ gem5-dev mailing list gem5-dev@gem5.org http://m5sim.org/mailman/listinfo/gem5-dev
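The region-filling loop in this patch can be tried in isolation. The sketch below is a standalone Python model of it: the requested memory size is laid out across a platform's `(start, size)` regions in order, and an error is raised if it does not fit. The comparison operators were lost in the flattened diff, so the `>` tests are an assumed reading; the function name and the use of `ValueError` in place of `fatal()` are hypothetical.

```python
GB = 1 << 30

# Region list as given for VExpress_EMM64 in the diff: (start, size).
EMM64_REGIONS = [(2 * GB, 2 * GB), (34 * GB, 30 * GB), (512 * GB, 480 * GB)]


def split_memory(total, regions):
    """Spread `total` bytes over `regions`, filling each one in order."""
    ranges = []
    remaining = total
    for start, size in regions:
        if remaining > size:
            # Region is used completely; carry the rest to the next one.
            ranges.append((start, size))
            remaining -= size
        else:
            # The rest fits here, so this is the last range.
            ranges.append((start, remaining))
            remaining = 0
            break
    if remaining > 0:
        raise ValueError("platform does not support this much DRAM")
    return ranges


if __name__ == "__main__":
    for start, size in split_memory(4 * GB, EMM64_REGIONS):
        print("start=%dGB size=%dGB" % (start // GB, size // GB))
```

With 4GB requested, the model yields a 2GB range at 2GB and a second 2GB range at 34GB, which matches the "spans more than one region" case the patch warns about.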
[gem5-dev] changeset in gem5: mem: Fix a bug in the cache port flow control
changeset fa9ef374075f in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=fa9ef374075f
description:
    mem: Fix a bug in the cache port flow control

    This patch fixes a bug in the cache port where the retry flag was
    reset too early, allowing new requests to arrive before the retry
    was actually sent, but with the event already scheduled. This caused
    a deadlock in the interactions with the O3 LSQ.

    The patch fixes the underlying issue by shifting the resetting of
    the flag to be done by the event that also calls sendRetry(). The
    patch also tidies up the flow control in recvTimingReq and ensures
    that we also check if we already have a retry outstanding.

diffstat:

 src/mem/cache/base.cc       |  11 +++++++++--
 src/mem/cache/base.hh       |   5 ++++-
 src/mem/cache/cache_impl.hh |  27 +++++++++++++++++++--------
 3 files changed, 32 insertions(+), 11 deletions(-)

diffs (80 lines):

diff -r a1eea45928e6 -r fa9ef374075f src/mem/cache/base.cc
--- a/src/mem/cache/base.cc	Tue May 13 12:20:49 2014 -0500
+++ b/src/mem/cache/base.cc	Wed Sep 03 07:42:50 2014 -0400
@@ -106,13 +106,20 @@
     DPRINTF(CachePort, "Cache port %s accepting new requests\n", name());
     blocked = false;
     if (mustSendRetry) {
-        DPRINTF(CachePort, "Cache port %s sending retry\n", name());
-        mustSendRetry = false;
         // @TODO: need to find a better time (next bus cycle?)
         owner.schedule(sendRetryEvent, curTick() + 1);
     }
 }
 
+void
+BaseCache::CacheSlavePort::processSendRetry()
+{
+    DPRINTF(CachePort, "Cache port %s sending retry\n", name());
+
+    // reset the flag and call retry
+    mustSendRetry = false;
+    sendRetry();
+}
 
 void
 BaseCache::init()
diff -r a1eea45928e6 -r fa9ef374075f src/mem/cache/base.hh
--- a/src/mem/cache/base.hh	Tue May 13 12:20:49 2014 -0500
+++ b/src/mem/cache/base.hh	Wed Sep 03 07:42:50 2014 -0400
@@ -182,7 +182,10 @@
 
       private:
 
-        EventWrapper<SlavePort, &SlavePort::sendRetry> sendRetryEvent;
+        void processSendRetry();
+
+        EventWrapper<CacheSlavePort,
+                     &CacheSlavePort::processSendRetry> sendRetryEvent;
 
     };
 
diff -r a1eea45928e6 -r fa9ef374075f src/mem/cache/cache_impl.hh
--- a/src/mem/cache/cache_impl.hh	Tue May 13 12:20:49 2014 -0500
+++ b/src/mem/cache/cache_impl.hh	Wed Sep 03 07:42:50 2014 -0400
@@ -1937,16 +1937,27 @@
 bool
 Cache<TagStore>::CpuSidePort::recvTimingReq(PacketPtr pkt)
 {
-    // always let inhibited requests through even if blocked
-    if (!pkt->memInhibitAsserted() && blocked) {
-        assert(!cache->system->bypassCaches());
-        DPRINTF(Cache,"Scheduling a retry while blocked\n");
-        mustSendRetry = true;
-        return false;
+    assert(!cache->system->bypassCaches());
+
+    bool success = false;
+
+    // always let inhibited requests through, even if blocked
+    if (pkt->memInhibitAsserted()) {
+        // this should always succeed
+        success = cache->recvTimingReq(pkt);
+        assert(success);
+    } else if (blocked || mustSendRetry) {
+        // either already committed to send a retry, or blocked
+        success = false;
+    } else {
+        // for now this should always succeed
+        success = cache->recvTimingReq(pkt);
+        assert(success);
     }
 
-    cache->recvTimingReq(pkt);
-    return true;
+    // remember if we have to retry
+    mustSendRetry = !success;
+    return success;
 }
 
 template<class TagStore>
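The ordering problem behind this fix can be reproduced with a toy model. The sketch below is a small Python stand-in for the patched port behaviour: the retry flag stays set until the scheduled event actually sends the retry, so a request arriving in the window between unblocking and the retry is still refused. The class, its methods, and the list used as an event queue are all hypothetical, and the inhibited-request path is left out for brevity.

```python
class CachePortModel:
    """Toy model of the patched retry handling on a cache port."""

    def __init__(self):
        self.blocked = False
        self.must_send_retry = False
        self.pending_events = []  # stand-in for the simulator event queue
        self.retries_sent = 0

    def recv_timing_req(self):
        # Refuse while blocked or while a retry is still outstanding.
        success = not (self.blocked or self.must_send_retry)
        self.must_send_retry = not success
        return success

    def set_blocked(self):
        self.blocked = True

    def clear_blocked(self):
        self.blocked = False
        if self.must_send_retry:
            # Only schedule here; the flag is cleared by the event itself.
            self.pending_events.append(self.process_send_retry)

    def process_send_retry(self):
        self.must_send_retry = False
        self.retries_sent += 1

    def run_events(self):
        events, self.pending_events = self.pending_events, []
        for event in events:
            event()


if __name__ == "__main__":
    port = CachePortModel()
    port.set_blocked()
    print(port.recv_timing_req())  # refused while blocked
    port.clear_blocked()
    print(port.recv_timing_req())  # still refused: retry not yet sent
    port.run_events()
    print(port.recv_timing_req())  # accepted after the retry
```

Clearing the flag inside `clear_blocked` instead would model the old behaviour, in which the second request is accepted even though a retry event is already queued.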
[gem5-dev] changeset in gem5: base: Use the global Mersenne twister throughout
changeset c91b23c72d5e in /z/repo/gem5 details: http://repo.gem5.org/gem5?cmd=changeset;node=c91b23c72d5e description: base: Use the global Mersenne twister throughout This patch tidies up random number generation to ensure that it is done consistently throughout the code base. In essence this involves a clean-up of Ruby, and some code simplifications in the traffic generator. As part of this patch a bunch of skewed distributions (off-by-one etc) have been fixed. Note that a single global random number generator is used, and that the object instantiation order will impact the behaviour (the sequence of numbers will be unaffected, but if module A calles random before module B then they would obviously see a different outcome). The dependency on the instantiation order is true in any case due to the execution-model of gem5, so we leave it as is. Also note that the global ranom generator is not thread safe at this point. Regressions using the memtest, TrafficGen or any Ruby tester are affected and will be updated accordingly. 
diffstat: src/cpu/testers/directedtest/SeriesRequestGenerator.cc | 3 +- src/cpu/testers/memtest/memtest.cc | 18 -- src/cpu/testers/networktest/networktest.cc | 7 +++-- src/cpu/testers/rubytest/Check.cc | 22 + src/cpu/testers/rubytest/CheckTable.cc | 3 +- src/cpu/testers/traffic_gen/generators.cc | 15 +-- src/cpu/testers/traffic_gen/traffic_gen.cc | 2 +- src/mem/ruby/common/NetDest.cc | 7 - src/mem/ruby/common/NetDest.hh | 1 - src/mem/ruby/common/Set.cc | 16 - src/mem/ruby/common/Set.hh | 1 - src/mem/ruby/network/MessageBuffer.cc | 7 +++-- src/mem/ruby/network/simple/PerfectSwitch.cc | 4 ++- src/mem/ruby/slicc_interface/RubySlicc_Util.hh | 6 src/mem/ruby/structures/RubyMemoryControl.cc | 5 ++- 15 files changed, 48 insertions(+), 69 deletions(-) diffs (truncated from 450 to 300 lines): diff -r d548d1d7597c -r c91b23c72d5e src/cpu/testers/directedtest/SeriesRequestGenerator.cc --- a/src/cpu/testers/directedtest/SeriesRequestGenerator.ccWed Sep 03 07:42:53 2014 -0400 +++ b/src/cpu/testers/directedtest/SeriesRequestGenerator.ccWed Sep 03 07:42:54 2014 -0400 @@ -27,6 +27,7 @@ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ +#include base/random.hh #include cpu/testers/directedtest/DirectedGenerator.hh #include cpu/testers/directedtest/RubyDirectedTester.hh #include cpu/testers/directedtest/SeriesRequestGenerator.hh @@ -60,7 +61,7 @@ Request *req = new Request(m_address, 1, flags, masterId); Packet::Command cmd; -bool do_write = ((random() % 100) m_percent_writes); +bool do_write = (random_mt.random(0, 100) m_percent_writes); if (do_write) { cmd = MemCmd::WriteReq; } else { diff -r d548d1d7597c -r c91b23c72d5e src/cpu/testers/memtest/memtest.cc --- a/src/cpu/testers/memtest/memtest.ccWed Sep 03 07:42:53 2014 -0400 +++ b/src/cpu/testers/memtest/memtest.ccWed Sep 03 07:42:54 2014 -0400 @@ -37,6 +37,7 @@ #include vector #include base/misc.hh +#include base/random.hh #include base/statistics.hh #include cpu/testers/memtest/memtest.hh #include debug/MemTest.hh @@ -261,14 +262,14 @@ } //make new request -unsigned cmd = random() % 100; -unsigned offset = random() % size; -unsigned base = random() % 2; -uint64_t data = random(); -unsigned access_size = random() % 4; -bool uncacheable = (random() % 100) percentUncacheable; +unsigned cmd = random_mt.random(0, 100); +unsigned offset = random_mt.randomunsigned(0, size - 1); +unsigned base = random_mt.random(0, 1); +uint64_t data = random_mt.randomuint64_t(); +unsigned access_size = random_mt.random(0, 3); +bool uncacheable = random_mt.random(0, 100) percentUncacheable; -unsigned dma_access_size = random() % 4; +unsigned dma_access_size = random_mt.random(0, 3); //If we aren't doing copies, use id as offset, and do a false sharing //mem tester @@ -296,7 +297,8 @@ return; } -bool do_functional = (random() % 100 percentFunctional) !uncacheable; +bool do_functional = (random_mt.random(0, 100) percentFunctional) +!uncacheable; Request *req = new Request(); uint8_t *result = new uint8_t[8]; diff -r d548d1d7597c -r c91b23c72d5e src/cpu/testers/networktest/networktest.cc --- a/src/cpu/testers/networktest/networktest.ccWed Sep 03 07:42:53 2014 -0400 
+++ b/src/cpu/testers/networktest/networktest.ccWed Sep 03 07:42:54 2014 -0400 @@ -35,6
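Two points from this commit message can be illustrated outside gem5: the replacement calls use inclusive bounds (for example `random() % 4` becomes `random_mt.random(0, 3)`), and a single shared generator produces the same sequence regardless of who asks first, while each caller's values depend on call order. The snippet below is a minimal Python sketch using the standard `random` module, which is itself a Mersenne Twister; the function names are hypothetical and nothing here is gem5's `random_mt` API.

```python
import random


def draw_access_size(rng):
    """Draw from 0..3 inclusive, the same value set as `x % 4`."""
    return rng.randint(0, 3)  # randint includes both end points


def run(order, seed=42):
    """Let each named 'module' take one number from a shared generator.

    `order` is the sequence in which the modules call the generator.
    """
    rng = random.Random(seed)  # one generator shared by every module
    return {name: rng.getrandbits(32) for name in order}


if __name__ == "__main__":
    first = run(["A", "B"])
    second = run(["B", "A"])
    # Same two numbers are produced, but they land on different modules.
    print(first["A"] == second["B"], first["B"] == second["A"])
```

Passing an exclusive upper bound to an inclusive-range call (or the reverse) is the kind of off-by-one skew the commit message says was cleaned up.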
[gem5-dev] changeset in gem5: stats: Update stats for CPU and cache changes
changeset 5f1f92bf76ee in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=5f1f92bf76ee
description:
	stats: Update stats for CPU and cache changes

	This patch updates the stats to reflect the fixes and changes to the
	CPU (mainly the o3), and the caches.

diffstat:

 tests/long/fs/10.linux-boot/ref/alpha/linux/tsunami-minor/stats.txt | 1532 +-
 tests/long/fs/10.linux-boot/ref/alpha/linux/tsunami-o3-dual/stats.txt | 3819
 tests/long/fs/10.linux-boot/ref/alpha/linux/tsunami-o3/stats.txt | 2187 ++--
 tests/long/fs/10.linux-boot/ref/alpha/linux/tsunami-switcheroo-full/stats.txt | 3122 +++--
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-minor-dual/stats.txt | 2214 ++--
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-minor/stats.txt | 1311 +-
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-o3-checker/stats.txt | 2244 ++--
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-o3-dual/stats.txt | 3684
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-o3/stats.txt | 2214 ++--
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-full/stats.txt | 3077 +++--
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-o3/stats.txt | 3339 +++--
 tests/long/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-timing/stats.txt | 2117 ++--
 tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/stats.txt | 2461 ++--
 tests/long/fs/10.linux-boot/ref/x86/linux/pc-switcheroo-full/stats.txt | 3138 +++--
 tests/long/se/10.mcf/ref/arm/linux/minor-timing/stats.txt | 1128 +-
 tests/long/se/10.mcf/ref/arm/linux/o3-timing/stats.txt | 1460 +-
 tests/long/se/10.mcf/ref/arm/linux/simple-atomic/stats.txt |  130 +-
 tests/long/se/10.mcf/ref/arm/linux/simple-timing/stats.txt |  430 +-
 tests/long/se/10.mcf/ref/x86/linux/o3-timing/stats.txt | 1438 +-
 tests/long/se/20.parser/ref/alpha/tru64/minor-timing/stats.txt |  838 +-
 tests/long/se/20.parser/ref/arm/linux/minor-timing/stats.txt | 1321 +-
 tests/long/se/20.parser/ref/arm/linux/o3-timing/stats.txt | 1700 +-
 tests/long/se/20.parser/ref/arm/linux/simple-atomic/stats.txt |  130 +-
 tests/long/se/20.parser/ref/arm/linux/simple-timing/stats.txt |  456 +-
 tests/long/se/20.parser/ref/x86/linux/o3-timing/stats.txt | 1599 +-
 tests/long/se/30.eon/ref/alpha/tru64/minor-timing/stats.txt |  468 +-
 tests/long/se/30.eon/ref/alpha/tru64/o3-timing/stats.txt | 1328 +-
 tests/long/se/30.eon/ref/alpha/tru64/simple-atomic/stats.txt |   18 +-
 tests/long/se/30.eon/ref/alpha/tru64/simple-timing/stats.txt |   18 +-
 tests/long/se/30.eon/ref/arm/linux/minor-timing/stats.txt | 1152 +-
 tests/long/se/30.eon/ref/arm/linux/o3-timing/stats.txt | 1493 +-
 tests/long/se/30.eon/ref/arm/linux/simple-atomic/stats.txt |  160 +-
 tests/long/se/30.eon/ref/arm/linux/simple-timing/stats.txt |  492 +-
 tests/long/se/40.perlbmk/ref/alpha/tru64/minor-timing/stats.txt |  948 +-
 tests/long/se/40.perlbmk/ref/alpha/tru64/o3-timing/stats.txt | 1584 +-
 tests/long/se/40.perlbmk/ref/alpha/tru64/simple-atomic/stats.txt |  194 +-
 tests/long/se/40.perlbmk/ref/alpha/tru64/simple-timing/stats.txt |  806 +-
 tests/long/se/40.perlbmk/ref/arm/linux/minor-timing/stats.txt | 1248 +-
 tests/long/se/40.perlbmk/ref/arm/linux/o3-timing/stats.txt | 1645 +-
 tests/long/se/40.perlbmk/ref/arm/linux/simple-atomic/stats.txt |  172 +-
 tests/long/se/40.perlbmk/ref/arm/linux/simple-timing/stats.txt |  848 +-
 tests/long/se/50.vortex/ref/alpha/tru64/minor-timing/stats.txt |  888 +-
 tests/long/se/50.vortex/ref/alpha/tru64/o3-timing/stats.txt | 1558 +-
 tests/long/se/50.vortex/ref/alpha/tru64/simple-atomic/stats.txt |   14 +-
 tests/long/se/50.vortex/ref/alpha/tru64/simple-timing/stats.txt |   14 +-
 tests/long/se/50.vortex/ref/arm/linux/minor-timing/stats.txt |  940 +-
[gem5-dev] changeset in gem5: dev, arm: Add support for linux generic pci h...
changeset 198dfef33403 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=198dfef33403
description:
	dev, arm: Add support for linux generic pci host driver

	This change adds support for a generic pci host bus driver that has
	been included in recent Linux kernel instead of the more bespoke one
	we've been using to date. It also works with aarch64 so it provides
	PCI support for 64-bit ARM Linux.

	To make this work a new configuration option pci_io_base is added to
	the RealView platform that should be set to the start of the memory
	used as memory mapped IO ports (IO ports that are memory mapped, not
	regular memory mapped IO). And a parameter pci_cfg_gen_offsets which
	specifies if the config space offsets should be used that the generic
	driver expects.

	To use the pci-host-generic device you need to:
	    pci_io_base = 0x2f00 (Valid for VExpress EMM)
	    pci_cfg_gen_offsets = True

	and add the following to your device tree:

	pci {
	    compatible = "pci-host-ecam-generic";
	    device_type = "pci";
	    #address-cells = <0x3>;
	    #size-cells = <0x2>;
	    #interrupt-cells = <0x1>;
	    //bus-range = <0x0 0x1>;

	    // CPU_PHYSICAL(2) SIZE(2)
	    // Note, some DTS blobs only support 1 size
	    reg = <0x0 0x3000 0x0 0x1000>;

	    // IO (1), no bus address (2), cpu address (2), size (2)
	    // MMIO (1), at address (2), cpu address (2), size (2)
	    ranges = <0x0100 0x0 0x 0x0 0x2f00 0x0 0x1>,
	             <0x0200 0x0 0x4000 0x0 0x4000 0x0 0x1000>;

	    // With gem5 we typically use INTA/B/C/D one per device
	    interrupt-map = <0x 0x0 0x0 0x1 0x1 0x0 0x11 0x1
	                     0x 0x0 0x0 0x2 0x1 0x0 0x12 0x1
	                     0x 0x0 0x0 0x3 0x1 0x0 0x13 0x1
	                     0x 0x0 0x0 0x4 0x1 0x0 0x14 0x1>;

	    // Only match INTA/B/C/D and not BDF
	    interrupt-map-mask = <0x 0x0 0x0 0x7>;
	};

diffstat:

 src/dev/arm/RealView.py |   5 +
 src/dev/arm/realview.cc |  27 ---
 src/dev/arm/realview.hh |   3 +++
 3 files changed, 32 insertions(+), 3 deletions(-)

diffs (90 lines):

diff -r 7565dcd505a4 -r 198dfef33403 src/dev/arm/RealView.py
--- a/src/dev/arm/RealView.py	Wed Sep 03 07:43:03 2014 -0400
+++ b/src/dev/arm/RealView.py	Wed Sep 03 07:43:04 2014 -0400
@@ -180,7 +180,10 @@
     type = 'RealView'
     cxx_header = "dev/arm/realview.hh"
     system = Param.System(Parent.any, "system")
+    pci_io_base = Param.Addr(0, "Base address of PCI IO Space")
     pci_cfg_base = Param.Addr(0, "Base address of PCI Configuraiton Space")
+    pci_cfg_gen_offsets = Param.Bool(False, "Should the offsets used for PCI cfg access"
+            " be compatible with the pci-generic-host or the legacy host bridge?")
     mem_start_addr = Param.Addr(0, "Start address of main memory")
     max_mem_size = Param.Addr('256MB', "Maximum amount of RAM supported by platform")
 
@@ -597,6 +600,8 @@
         self.mmc_fake.clk_domain = clkdomain
 
 class VExpress_EMM64(VExpress_EMM):
+    pci_io_base = 0x2f00
+    pci_cfg_gen_offsets = True
     def setupBootLoader(self, mem_bus, cur_sys, loc):
         self.nvmem = SimpleMemory(range = AddrRange(0, size = '64MB'))
         self.nvmem.port = mem_bus.master
diff -r 7565dcd505a4 -r 198dfef33403 src/dev/arm/realview.cc
--- a/src/dev/arm/realview.cc	Wed Sep 03 07:43:03 2014 -0400
+++ b/src/dev/arm/realview.cc	Wed Sep 03 07:43:04 2014 -0400
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2009 ARM Limited
+ * Copyright (c) 2009, 2014 ARM Limited
  * All rights reserved
  *
  * The license below extends only to copyright in the software and shall
@@ -63,6 +63,21 @@
 {}
 
 void
+RealView::initState()
+{
+    Addr junk;
+    bool has_gen_pci_host;
+    has_gen_pci_host = system->kernelSymtab->findAddress("gen_pci_setup", junk);
+
+    if (has_gen_pci_host && !params()->pci_cfg_gen_offsets)
+        warn("Kernel supports generic PCI host but PCI Config offsets "
+                "configured for legacy. Set pci_cfg_gen_offsets to True");
+    if (has_gen_pci_host && !params()->pci_io_base)
+        warn("Kernel supports generic PCI host but PCI IO base is set "
+                "to 0. Set pci_io_base to the start of PCI IO space");
+}
+
+void
 RealView::postConsoleInt()
 {
     warn_once("Don't know what interrupt to post for console.\n");
@@ -100,13 +115,19 @@
 {
     if (bus != 0)
         return ULL(-1);
-    return params()->pci_cfg_base | ((func & 7) << 16) | ((dev & 0x1f) << 19);
+
+    Addr cfg_offset = 0;
+    if (params()->pci_cfg_gen_offsets)
+        cfg_offset |= ((func & 7) << 12) | ((dev & 0x1f) << 15);
+    else
+        cfg_offset |= ((func & 7) << 16) | ((dev
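
The offset selection in the last realview.cc hunk can be illustrated with a small standalone sketch. It is not gem5 code: the function name pci_cfg_offset and its arguments are made up for illustration, and it assumes the operators lost in the archived diff were `&` (mask) and `<<` (shift), with the legacy branch matching the line the patch removes.

```python
# Illustrative sketch only; names are hypothetical, not gem5 API.
# Assumes the archived diff originally read ((func & 7) << N) | ((dev & 0x1f) << M).

def pci_cfg_offset(dev: int, func: int, generic: bool) -> int:
    """Return the device/function part of a PCI config-space offset."""
    if generic:
        # Layout expected by the generic host driver:
        # function number at bit 12, device number at bit 15.
        return ((func & 0x7) << 12) | ((dev & 0x1F) << 15)
    # Legacy host bridge layout:
    # function number at bit 16, device number at bit 19.
    return ((func & 0x7) << 16) | ((dev & 0x1F) << 19)


if __name__ == "__main__":
    # Compare the two layouts for a few device/function pairs.
    for dev, func in [(0, 0), (1, 0), (1, 2), (31, 7)]:
        gen = pci_cfg_offset(dev, func, generic=True)
        leg = pci_cfg_offset(dev, func, generic=False)
        print(f"dev={dev:2d} func={func} generic={gen:#x} legacy={leg:#x}")
```

In both layouts the result is only the offset within config space; the platform would still combine it with pci_cfg_base, which this sketch leaves out.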
Re: [gem5-dev] Review Request 2372: style: add .clang-format file
> On Sept. 1, 2014, 6:14 p.m., Andreas Sandberg wrote:
> > .clang-format, line 18
> > http://reviews.gem5.org/r/2372/diff/1/?file=41128#file41128line18
> >
> > Has this changed name? The clang documentation lists DerivePointerAlignment, but not DerivePointerBinding.
>
> Nilay Vaish wrote:
>     Documentation for version 3.4 lists DerivePointerBinding.

That explains it. I was looking at the 3.6 documentation. (http://clang.llvm.org/docs/ClangFormatStyleOptions.html)

- Andreas

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2372/#review5321
---

On Sept. 3, 2014, 5:50 a.m., Nilay Vaish wrote:
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2372/
> ---
>
> (Updated Sept. 3, 2014, 5:50 a.m.)
>
> Review request for Default.
>
> Repository: gem5
>
> Description
> ---
>
> Changeset 10318:34b549ec182b
> ---
> style: add .clang-format file
> The format specified in this file is used by clang-format to fix the formatting
> of a given file. Hopefully, this will ease the burden on the developers as
> they no longer need to manually format things.
>
> Diffs
> -
>
>   .clang-format PRE-CREATION
>
> Diff: http://reviews.gem5.org/r/2372/diff/
>
> Testing
> ---
>
> Thanks,
>
> Nilay Vaish

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev
Re: [gem5-dev] bi-mode branch predictor miss prediction rate is high
A bug was recently found in the bimodal predictor. If you are still looking at this, you might want to try a new checkout. Hope this helps.

On Wed, Jul 2, 2014 at 4:52 PM, Zi Yan via gem5-dev <gem5-dev@gem5.org> wrote:

> I get 5 100-million-instruction simpoints for each benchmark in SPEC CPU
> 2006 with *ref input*. I compile with the cross-tool
> arm-cortex_a15-linux-gnueabi-gcc, version 4.8.2.
>
> For gcc, I got a miss rate of 0.2% to 5% with tournament, but 3% to 22%
> with bi-mode across all simpoints.
>
> The weirdest part is hmmer: I got a miss rate of 0.3% to 0.5% with
> tournament, but 52% to 60% with bi-mode.
>
> --
> Best Regards
> Yan Zi
>
> On 2 Jul 2014, at 17:11, Anthony Gutierrez via gem5-dev wrote:
>
> > This could depend on a lot of factors. How are you running the
> > benchmarks? E.g., running SPEC 2k6's gcc to completion with the train
> > input set in FS mode yields a 6.45% miss rate for bi-mode, while the
> > tournament predictor yields a 7.12% miss rate.
> >
> > Anthony Gutierrez
> > http://web.eecs.umich.edu/~atgutier
> >
> > On Wed, Jul 2, 2014 at 4:37 PM, Zi Yan via gem5-dev <gem5-dev@gem5.org> wrote:
> >
> > > Hi,
> > >
> > > I just updated gem5-dev and got bi-mode as ARM's default branch
> > > predictor. I got a misprediction rate
> > > (system.cpu.branchPred.condIncorrect/system.cpu.branchPred.condPredicted)
> > > ranging from 10% to 60%, whereas I saw a misprediction rate ranging
> > > from 1% to 9% with tournament for SPEC CPU 2006 benchmarks.
> > >
> > > Should I expect this from bi-mode?
> > >
> > > Thanks.
> > >
> > > --
> > > Best Regards
> > > Yan Zi
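
The misprediction rate quoted in this thread is the ratio system.cpu.branchPred.condIncorrect / system.cpu.branchPred.condPredicted. A minimal sketch of computing it from stats.txt-style text follows. The helper name mispredict_rate and the sample numbers are made up for illustration, and the sketch assumes each stat sits on its own line as a name followed by a value.

```python
# Illustrative sketch only; the helper name and sample values are hypothetical.
# Assumes one stat per line in the form "<name> <value> [# description]".

def mispredict_rate(stats_text: str, prefix: str = "system.cpu.branchPred") -> float:
    """Return condIncorrect / condPredicted for the given predictor prefix."""
    values = {}
    for line in stats_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or not parts[0].startswith(prefix + "."):
            continue
        try:
            values[parts[0]] = float(parts[1])
        except ValueError:
            continue  # skip lines whose second field is not a number
    predicted = values[prefix + ".condPredicted"]
    incorrect = values[prefix + ".condIncorrect"]
    return incorrect / predicted if predicted else 0.0


if __name__ == "__main__":
    # Made-up sample values, not taken from any real run.
    sample = (
        "system.cpu.branchPred.condPredicted 1000 # conditional branches predicted\n"
        "system.cpu.branchPred.condIncorrect 50 # conditional branches incorrect\n"
    )
    print(f"{mispredict_rate(sample):.2%}")  # prints 5.00%
```

Comparing this ratio per simpoint for the two predictors, with everything else held fixed, is one way to reproduce the numbers being discussed.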
Re: [gem5-dev] workaround: Ruby functional read failed error
BTW, in the scenario above, the functionalWrite will not work correctly: data updated by functionalWrite in controllers will be replaced with old data from a queued packet several ticks later.

functionalWrite works well in this scenario; the functional write failed bug is already fixed. Sorry for my misunderstanding, I need to improve my C++ reading skills.

When I read the code carefully, it seems to me that the same logic that is implemented for writing in RubySystem::functionalWrite(PacketPtr pkt) and RubyMemoryControl::functionalWriteBuffers(Packet *pkt) is already prepared for reading as well, in RubyMemoryControl::functionalReadBuffers(Packet *pkt).

Is there any reason why it is not used in the RubySystem::functionalRead(PacketPtr pkt) function? Is adding the following code before the "return false" the right solution?

Regards,
Jiri Kaspar

---

    for (unsigned int i = 0; i < num_controllers; ++i) {
        if (m_abs_cntrl_vec[i]->functionalReadBuffers(pkt)) return true;
    }
    for (unsigned int i = 0; i < m_memory_controller_vec.size(); ++i) {
        if (m_memory_controller_vec[i]->functionalReadBuffers(pkt)) return true;
    }
    if (m_network_ptr->functionalWrite(pkt)) return true;