Re: [m5-dev] changeset in m5: ruby: Ruby support for LLSC
Comments below... On Sun, Mar 21, 2010 at 10:47 PM, Brad Beckmann brad.beckm...@amd.com wrote: changeset 185ad61a4117 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=185ad61a4117 description: ruby: Ruby support for LLSC diffstat: 4 files changed, 102 insertions(+), 20 deletions(-) src/mem/ruby/system/CacheMemory.cc | 17 +- src/mem/ruby/system/RubyPort.cc | 58 +--- src/mem/ruby/system/SConscript | 2 + src/mem/ruby/system/Sequencer.cc | 45 ++- diffs (215 lines): [...] - type = RubyRequestType_RMW_Write; } else { - panic(Unsupported ruby packet type\n); + if (pkt-isRead()) { + if (pkt-req-isInstFetch()) { + type = RubyRequestType_IFETCH; + } else { + type = RubyRequestType_LD; + } + } else if (pkt-isWrite()) { + type = RubyRequestType_ST; + } else if (pkt-isReadWrite()) { + // + // Fix me. Just because the packet is a read/write request does not + // necessary mean it is a read-modify-write atomic operation. + // + type = RubyRequestType_RMW_Write; + } else { + panic(Unsupported ruby packet type\n); + } } Did you miss my comment about isReadWrite() on this patch? Or did you decide not to do anything since it's no worse than before? Seems like it's at least worth adding a comment for posterity, but better yet to fix it now if it needs fixing. + + g_system_ptr-getProfiler()-profileTransition(Seq, + m_version, + Address(request.paddr), + , + SC Fail, + , + RubyRequestType_to_string(request.type)); + I'm still not thrilled with the line-count explosion your parameter per line formatting gives, especially when we're dedicating entire lines to parameters like ''. As far as parsing the call, there are enough args to this function that it's opaque to me what it's supposed to be doing regardless of how many lines it takes. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] [PATCH] CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc
Thanks to Brad's prompting, I took another look at this tonight. I think the fundamental problem is that the transition to split-transaction translation didn't move all the operations that depend on the translation completing into the translation callback or something that was guaranteed to run after the translation callback. The req-isUncacheable() test that Brad moved around in his recent patch (which I commented on just a few minutes ago) is one example of that. Another case is that we're calling traceData-setData() in read() even though the translation may not be done yet (how does that work? are we calling setData() again when the real data comes back, so we haven't noticed that this is broken?). The other traceData examples (like setAddr() or setData() in write()) are more like anti-dependences; there's no direct dependence on the translation finishing, but there is a dependence on it not deleting the traceData object. So in other words the problem isn't that the finishTranslation() or completeAcc() calls are happening too soon, it's that the code that runs after the translation is initiated in read()/write() (or in initiateAcc() from which it is called) should not be making assumptions about whether finishTranslation() or completeAcc() has or has not been called. I think it's pretty easy to fix this, in some cases by moving independent code up above where the translation is initiated, and in other cases by moving dependent code into finishTranslation(). Interestingly, there are a number of cases where we trace the store data twice, once in the CPU's write() function and again in the ISA's initiateAcc() function. I propose to just get rid of the latter. As I mentioned in the other email, I'm tempted to just get rid of the recordEvent() calls, unless someone actually uses these. I see that Nate created the basic recordEvent() infrastructure in 2004, and right now it's only uncached accesses and faults that get recorded, which seems very ad hoc and irregular. I'll try and make these changes and test them as soon as I can. I thought I'd send this note out first though in case anyone has feedback before I get started. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] Cron m5t...@zizzer /z/m5/regression/do-regression quick
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/o3-timing passed. * build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple-timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-atomic passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby passed. * build/ALPHA_SE/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-atomic passed. * build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple-atomic passed. * build/ALPHA_SE/tests/fast/quick/01.hello-2T-smt/alpha/linux/o3-timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/inorder-timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/o3-timing passed. * build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple-atomic-mp passed. * build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple-timing-mp passed. * build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby passed. * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MOESI_hammer passed. = Statistics differences =* build/ALPHA_SE_MOESI_hammer/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MOESI_hammer passed. * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MOESI_hammer passed. * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MOESI_hammer passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MESI_CMP_directory passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MESI_CMP_directory passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MESI_CMP_directory passed. * build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MOESI_CMP_token passed. * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MOESI_CMP_token passed. * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MOESI_CMP_token passed. * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MOESI_CMP_token passed. * build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-atomic passed. * build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-atomic-dual passed. * build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-timing passed. * build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-timing-dual passed. * build/ALPHA_FS/tests/fast/quick/80.netperf-stream/alpha/linux/twosys-tsunami-simple-atomic passed. = Statistics differences =* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/o3-timing passed. * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing-ruby passed. * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing passed. * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-atomic passed. * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/inorder-timing passed. * build/POWER_SE/tests/fast/quick/00.hello/power/linux/simple-atomic passed. * build/POWER_SE/tests/fast/quick/00.hello/power/linux/o3-timing passed. * build/SPARC_SE/tests/fast/quick/00.hello/sparc/linux/simple-timing passed. * build/SPARC_SE/tests/fast/quick/02.insttest/sparc/linux/simple-atomic passed. = Statistics differences =* build/SPARC_SE/tests/fast/quick/00.hello/sparc/linux/simple-atomic passed. = Statistics differences =* build/SPARC_SE/tests/fast/quick/02.insttest/sparc/linux/o3-timing passed. * build/SPARC_SE/tests/fast/quick/00.hello/sparc/linux/simple-timing-ruby passed. *
Re: [m5-dev] [PATCH] CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc
It's been a while since I've looked at this, but I want to make sure I remember to respond so I don't want to wait until I have a chance to re-research all this. Isn't the problem that initiateAcc ends up calling completeAcc mid-function, that cleans up after the access, then initateAcc gets control back and tries to finish initiating the access? The problem to me seems to be that initiateAcc doesn't know whether it's just letting the rest of the CPU know something needs to happen later, or if it's actually causing that thing to happen right away. The other part, completeAcc, doesn't know if it's happening in the middle of initiateAcc, or if its in a callback. If it's in initiateAcc there are things it (or maybe initiateAcc, is that what you're saying?) shouldn't touch. If it's in a callback, it's likely the last thing to handle the instruction/translation/packet/whatever, so it needs to be sure to clean up everything. These seem like contradictory responsibilities. What I had been advocating before (if I remember right) was to disambiguate what the circumstances each function can expect by breaking up the call chains so that the functions don't end up nested inside each other unpredictably. Like I said, though, I'm making sure I respond rather than getting my facts straight, so please let me know if I'm missing your point (likely) or misunderstanding something. Gabe Steve Reinhardt wrote: Thanks to Brad's prompting, I took another look at this tonight. I think the fundamental problem is that the transition to split-transaction translation didn't move all the operations that depend on the translation completing into the translation callback or something that was guaranteed to run after the translation callback. The req-isUncacheable() test that Brad moved around in his recent patch (which I commented on just a few minutes ago) is one example of that. Another case is that we're calling traceData-setData() in read() even though the translation may not be done yet (how does that work? are we calling setData() again when the real data comes back, so we haven't noticed that this is broken?). The other traceData examples (like setAddr() or setData() in write()) are more like anti-dependences; there's no direct dependence on the translation finishing, but there is a dependence on it not deleting the traceData object. So in other words the problem isn't that the finishTranslation() or completeAcc() calls are happening too soon, it's that the code that runs after the translation is initiated in read()/write() (or in initiateAcc() from which it is called) should not be making assumptions about whether finishTranslation() or completeAcc() has or has not been called. I think it's pretty easy to fix this, in some cases by moving independent code up above where the translation is initiated, and in other cases by moving dependent code into finishTranslation(). Interestingly, there are a number of cases where we trace the store data twice, once in the CPU's write() function and again in the ISA's initiateAcc() function. I propose to just get rid of the latter. As I mentioned in the other email, I'm tempted to just get rid of the recordEvent() calls, unless someone actually uses these. I see that Nate created the basic recordEvent() infrastructure in 2004, and right now it's only uncached accesses and faults that get recorded, which seems very ad hoc and irregular. I'll try and make these changes and test them as soon as I can. I thought I'd send this note out first though in case anyone has feedback before I get started. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] [PATCH] CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc
Hi Gabe, On Sun, Mar 21, 2010 at 5:22 PM, Gabe Black gbl...@eecs.umich.edu wrote: It's been a while since I've looked at this, but I want to make sure I remember to respond so I don't want to wait until I have a chance to re-research all this. Isn't the problem that initiateAcc ends up calling completeAcc mid-function, that cleans up after the access, then initateAcc gets control back and tries to finish initiating the access? Yes and no... you're right that that's how we viewed it before (so your memory is good), and that's one way of looking at it, and it's not incorrect. I'm saying that if we look at the problem differently I think there's an easier solution. The problem to me seems to be that initiateAcc doesn't know whether it's just letting the rest of the CPU know something needs to happen later, or if it's actually causing that thing to happen right away. The other part, completeAcc, doesn't know if it's happening in the middle of initiateAcc, or if its in a callback. If it's in initiateAcc there are things it (or maybe initiateAcc, is that what you're saying?) shouldn't touch. If it's in a callback, it's likely the last thing to handle the instruction/translation/packet/whatever, so it needs to be sure to clean up everything. These seem like contradictory responsibilities. You've got the right idea there... basically initiateAcc() actually initiates the acc when it calls read() or write(), and it can't count on the translation being complete when read()/write() returns, but conversely it also shouldn't count on the translation *not* being complete when read()/write() returns. So there really shouldn't be any code after read()/write() that should care one way or the other. If we get rid of the code that does, then we'll solve the problem without adding any complexity to the control flow. As a specific example, the one thing that appears to give us trouble in initiateAcc() is this: fault = xc-write((uint%(mem_acc_size)d_t)Mem, EA, memAccessFlags, NULL); if (traceData) { traceData-setData(Mem); } ...and the root problem is that if the access completes in write(), traceData may not be valid any more. The ironic thing here is that (1) the value of Mem doesn't depend on write(), so the line could be moved before the call; and (2) traceData-setData() is also called in write() itself, so this whole line is redundant. So we can fix this specific problem by either (1) moving the traceData line above write() or (2) preferably, deleting the line entirely. Similarly, read()/write() initiate the translation when they call translateTiming(), and there shouldn't be any code in read()/write() after that point that depends on the translation having completed *or not*. Interestingly there is code (the req-isUncacheable() check, and a setData() call in read()) that seems like it requires the translation to have completed, which is just as incorrect. That code should be moved somewhere else (like finishTranslation()) where it can be guaranteed that the translation has finished. What I had been advocating before (if I remember right) was to disambiguate what the circumstances each function can expect by breaking up the call chains so that the functions don't end up nested inside each other unpredictably. Like I said, though, I'm making sure I respond rather than getting my facts straight, so please let me know if I'm missing your point (likely) or misunderstanding something. That would work, but this approach is simpler since it doesn't depend on figuring out what circumstances you're in. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] [PATCH] Cpu: Make TimingSimpleCPU use new DTB translation code
Hi Ali, Yes, certainly. I'll try to get round to doing this today or tomorrow. Cheers Tim On Sun, 21 Mar 2010 11:13:22 -0400, Ali Saidi sa...@umich.edu wrote: Hi Tim, When you have a moment, would you mind adding some comments to src/cpu/ translation.hh? Thanks, Ali On Nov 25, 2009, at 3:20 PM, Timothy M. Jones wrote: # HG changeset patch # User Timothy M. Jones tjon...@inf.ed.ac.uk # Date 1259064162 0 # Node ID 77dffd16710c4ad8e308b7dfee75cce4219b30a0 # Parent b431ed08a6b7d809df9bfed51365e4d3bdf81f93 Cpu: Make TimingSimpleCPU use new DTB translation code. This patch removes the duplicate code from cpu/simple/timing.hh and makes TimingSimpleCPU use the new code in cpu/translation.hh. I've had to make a few additional changes to get it to work ok for both O3 and Timing. diff --git a/src/cpu/base.hh b/src/cpu/base.hh --- a/src/cpu/base.hh +++ b/src/cpu/base.hh @@ -53,6 +53,7 @@ class ThreadContext; class System; class Port; +class WholeTranslationState; namespace TheISA { @@ -276,6 +277,12 @@ virtual Counter totalInstructions() const { return 0; } +/** + * Finish a DTB translation. + * @param state The DTB translation state. + */ +virtual void finishTranslation(WholeTranslationState *state) {} + // Function tracing private: bool functionTracingEnabled; diff --git a/src/cpu/base_dyn_inst.hh b/src/cpu/base_dyn_inst.hh --- a/src/cpu/base_dyn_inst.hh +++ b/src/cpu/base_dyn_inst.hh @@ -979,9 +979,9 @@ BaseTLB::Mode mode) { WholeTranslationState *state = -new WholeTranslationState(req, res, mode); -DataTranslationImpl *trans = -new DataTranslationImpl(this, state); +new WholeTranslationState(req, NULL, res, mode); +DataTranslationInstImpl *trans = +new DataTranslationInstImpl(this, state); cpu-dtb-translateTiming(req, thread-getTC(), trans, mode); } @@ -993,11 +993,11 @@ { // Set up the translation state. WholeTranslationState *state = -new WholeTranslationState(req, sreqLow, sreqHigh, res, mode); -DataTranslationImpl *stransLow = -new DataTranslationImpl(this, state, 0); -DataTranslationImpl *stransHigh = -new DataTranslationImpl(this, state, 1); +new WholeTranslationState(req, sreqLow, sreqHigh, NULL, res, mode); +DataTranslationInstImpl *stransLow = +new DataTranslationInstImpl(this, state, 0); +DataTranslationInstImpl *stransHigh = +new DataTranslationInstImpl(this, state, 1); // Perform the translation. cpu-dtb-translateTiming(sreqLow, thread-getTC(), stransLow, mode); diff --git a/src/cpu/simple/timing.cc b/src/cpu/simple/timing.cc --- a/src/cpu/simple/timing.cc +++ b/src/cpu/simple/timing.cc @@ -268,19 +268,9 @@ } void -TimingSimpleCPU::sendData(Fault fault, RequestPtr req, -uint8_t *data, uint64_t *res, bool read) +TimingSimpleCPU::sendData(RequestPtr req, uint8_t *data, uint64_t *res, + bool read) { -_status = Running; -if (fault != NoFault) { -if (req-isPrefetch()) -fault = NoFault; -delete data; -delete req; - -translationFault(fault); -return; -} PacketPtr pkt; buildPacket(pkt, req, read); pkt-dataDynamicuint8_t(data); @@ -311,25 +301,9 @@ } void -TimingSimpleCPU::sendSplitData(Fault fault1, Fault fault2, -RequestPtr req1, RequestPtr req2, RequestPtr req, -uint8_t *data, bool read) +TimingSimpleCPU::sendSplitData(RequestPtr req1, RequestPtr req2, + RequestPtr req, uint8_t *data, bool read) { -_status = Running; -if (fault1 != NoFault || fault2 != NoFault) { -if (req1-isPrefetch()) -fault1 = NoFault; -if (req2-isPrefetch()) -fault2 = NoFault; -delete data; -delete req1; -delete req2; -if (fault1 != NoFault) -translationFault(fault1); -else if (fault2 != NoFault) -translationFault(fault2); -return; -} PacketPtr pkt1, pkt2; buildSplitPacket(pkt1, pkt2, req1, req2, req, data, read); if (req-getFlags().isSet(Request::NO_ACCESS)) { @@ -450,6 +424,7 @@ const Addr pc = thread-readPC(); unsigned block_size = dcachePort.peerBlockSize(); int data_size = sizeof(T); +BaseTLB::Mode mode = BaseTLB::Read; RequestPtr req = new Request(asid, addr, data_size, flags, pc, _cpuId, tid); @@ -457,24 +432,27 @@ Addr split_addr = roundDown(addr + data_size - 1, block_size); assert(split_addr = addr || split_addr - addr block_size); - _status = DTBWaitResponse; if (split_addr addr) { RequestPtr req1, req2; assert(!req-isLLSC() !req-isSwap()); req-splitOnVaddr(split_addr, req1, req2); -
Re: [m5-dev] changeset in m5: ruby: Ruby support for LLSC
I did see your comments on the isReadWrite check and that is why I added the Fix Me comment. The problem is I did not originally add the line and I have no way of testing any changes to this functionality. I assume this is needed by the libruby interface and I'm expecting the folks at Wisconsin to fix this. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Sunday, March 21, 2010 11:05 PM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: ruby: Ruby support for LLSC Comments below... On Sun, Mar 21, 2010 at 10:47 PM, Brad Beckmann brad.beckm...@amd.com wrote: changeset 185ad61a4117 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=185ad61a4117 description: ruby: Ruby support for LLSC diffstat: 4 files changed, 102 insertions(+), 20 deletions(-) src/mem/ruby/system/CacheMemory.cc | 17 +- src/mem/ruby/system/RubyPort.cc | 58 +--- src/mem/ruby/system/SConscript | 2 + src/mem/ruby/system/Sequencer.cc | 45 ++- diffs (215 lines): [...] - type = RubyRequestType_RMW_Write; } else { - panic(Unsupported ruby packet type\n); + if (pkt-isRead()) { + if (pkt-req-isInstFetch()) { + type = RubyRequestType_IFETCH; + } else { + type = RubyRequestType_LD; + } + } else if (pkt-isWrite()) { + type = RubyRequestType_ST; + } else if (pkt-isReadWrite()) { + // + // Fix me. Just because the packet is a read/write request does not + // necessary mean it is a read-modify-write atomic operation. + // + type = RubyRequestType_RMW_Write; + } else { + panic(Unsupported ruby packet type\n); + } } Did you miss my comment about isReadWrite() on this patch? Or did you decide not to do anything since it's no worse than before? Seems like it's at least worth adding a comment for posterity, but better yet to fix it now if it needs fixing. + + g_system_ptr-getProfiler()- profileTransition(Seq, + m_version, + Address(request.paddr), + , + SC Fail, + , + RubyRequestType_to_string(request.type)); + I'm still not thrilled with the line-count explosion your parameter per line formatting gives, especially when we're dedicating entire lines to parameters like ''. As far as parsing the call, there are enough args to this function that it's opaque to me what it's supposed to be doing regardless of how many lines it takes. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] changeset in m5: ruby: Ruby support for LLSC
OK... it does need a fix me comment, but the text doesn't really identify the issue; it should include something like this code never gets executed because isReadWrite() implies both isRead() and isWrite() are also true. For the record, we'll need to move away from sending RMW operations to the memory system to sending a locked read followed by a locked write to support generic x86 locked operations anyway. Steve On Mon, Mar 22, 2010 at 9:03 AM, Beckmann, Brad brad.beckm...@amd.com wrote: I did see your comments on the isReadWrite check and that is why I added the Fix Me comment. The problem is I did not originally add the line and I have no way of testing any changes to this functionality. I assume this is needed by the libruby interface and I'm expecting the folks at Wisconsin to fix this. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Sunday, March 21, 2010 11:05 PM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: ruby: Ruby support for LLSC Comments below... On Sun, Mar 21, 2010 at 10:47 PM, Brad Beckmann brad.beckm...@amd.com wrote: changeset 185ad61a4117 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=185ad61a4117 description: ruby: Ruby support for LLSC diffstat: 4 files changed, 102 insertions(+), 20 deletions(-) src/mem/ruby/system/CacheMemory.cc | 17 +- src/mem/ruby/system/RubyPort.cc | 58 +--- src/mem/ruby/system/SConscript | 2 + src/mem/ruby/system/Sequencer.cc | 45 ++- diffs (215 lines): [...] - type = RubyRequestType_RMW_Write; } else { - panic(Unsupported ruby packet type\n); + if (pkt-isRead()) { + if (pkt-req-isInstFetch()) { + type = RubyRequestType_IFETCH; + } else { + type = RubyRequestType_LD; + } + } else if (pkt-isWrite()) { + type = RubyRequestType_ST; + } else if (pkt-isReadWrite()) { + // + // Fix me. Just because the packet is a read/write request does not + // necessary mean it is a read-modify-write atomic operation. + // + type = RubyRequestType_RMW_Write; + } else { + panic(Unsupported ruby packet type\n); + } } Did you miss my comment about isReadWrite() on this patch? Or did you decide not to do anything since it's no worse than before? Seems like it's at least worth adding a comment for posterity, but better yet to fix it now if it needs fixing. + + g_system_ptr-getProfiler()- profileTransition(Seq, + m_version, + Address(request.paddr), + , + SC Fail, + , + RubyRequestType_to_string(request.type)); + I'm still not thrilled with the line-count explosion your parameter per line formatting gives, especially when we're dedicating entire lines to parameters like ''. As far as parsing the call, there are enough args to this function that it's opaque to me what it's supposed to be doing regardless of how many lines it takes. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Cron m5t...@zizzer /z/m5/regression/do-regression quick
I'm curious why a few tests last night showed stats differences. When I locally run the regression tester, the only large statistically differences I see are in the host_inst_rate. I don't have direct access to the regression machine and I just want to make sure that I haven't missed anything or made ruby fundamentally slower. Could someone forward me the statsdiff files? Thanks, Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Cron Daemon Sent: Monday, March 22, 2010 12:23 AM To: m5-dev@m5sim.org Subject: [m5-dev] Cron m5t...@zizzer /z/m5/regression/do-regression quick * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/o3-timing passed. * build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple- timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple- atomic passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple- timing-ruby passed. * build/ALPHA_SE/tests/fast/quick/60.rubytest/alpha/linux/rubytest- ruby passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple- timing-ruby passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple- timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple- timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple- atomic passed. * build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple- atomic passed. * build/ALPHA_SE/tests/fast/quick/01.hello-2T-smt/alpha/linux/o3- timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/inorder- timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/o3-timing passed. * build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple- atomic-mp passed. * build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple- timing-mp passed. * build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest- ruby passed. * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/60.rubytest/alpha/linux/ru bytest-ruby-MOESI_hammer passed. = Statistics differences =* build/ALPHA_SE_MOESI_hammer/tests/fast/quick/00.hello/alpha/tru64/simpl e-timing-ruby-MOESI_hammer passed. * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/00.hello/alpha/linux/simpl e-timing-ruby-MOESI_hammer passed. * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/50.memtest/alpha/linux/mem test-ruby-MOESI_hammer passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/60.rubytest/alpha/li nux/rubytest-ruby-MESI_CMP_directory passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/00.hello/alpha/tru64 /simple-timing-ruby-MESI_CMP_directory passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/00.hello/alpha/linux /simple-timing-ruby-MESI_CMP_directory passed. * build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/50.memtest/alpha/lin ux/memtest-ruby-MESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/00.hello/alpha/tru6 4/simple-timing-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/00.hello/alpha/linu x/simple-timing-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/60.rubytest/alpha/l inux/rubytest-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/50.memtest/alpha/li nux/memtest-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/00.hello/alpha/tru64/si mple-timing-ruby-MOESI_CMP_token passed. * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/00.hello/alpha/linux/si mple-timing-ruby-MOESI_CMP_token passed. * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/60.rubytest/alpha/linux /rubytest-ruby-MOESI_CMP_token passed. * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/50.memtest/alpha/linux/ memtest-ruby-MOESI_CMP_token passed. * build/ALPHA_FS/tests/fast/quick/10.linux- boot/alpha/linux/tsunami-simple-atomic passed. * build/ALPHA_FS/tests/fast/quick/10.linux- boot/alpha/linux/tsunami-simple-atomic-dual passed. * build/ALPHA_FS/tests/fast/quick/10.linux- boot/alpha/linux/tsunami-simple-timing passed. * build/ALPHA_FS/tests/fast/quick/10.linux- boot/alpha/linux/tsunami-simple-timing-dual passed. * build/ALPHA_FS/tests/fast/quick/80.netperf- stream/alpha/linux/twosys-tsunami-simple-atomic passed. = Statistics differences =* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/o3-timing passed. * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple- timing-ruby passed. * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing passed. * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-atomic passed. *
Re: [m5-dev] changeset in m5: ruby: Ruby support for LLSC
Yes, I only identified that the programmer's intent was ill specified, I should have also identified that the code never gets executed. I'll check in a change now. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Monday, March 22, 2010 9:13 AM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: ruby: Ruby support for LLSC OK... it does need a fix me comment, but the text doesn't really identify the issue; it should include something like this code never gets executed because isReadWrite() implies both isRead() and isWrite() are also true. For the record, we'll need to move away from sending RMW operations to the memory system to sending a locked read followed by a locked write to support generic x86 locked operations anyway. Steve On Mon, Mar 22, 2010 at 9:03 AM, Beckmann, Brad brad.beckm...@amd.com wrote: I did see your comments on the isReadWrite check and that is why I added the Fix Me comment. The problem is I did not originally add the line and I have no way of testing any changes to this functionality. I assume this is needed by the libruby interface and I'm expecting the folks at Wisconsin to fix this. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Sunday, March 21, 2010 11:05 PM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: ruby: Ruby support for LLSC Comments below... On Sun, Mar 21, 2010 at 10:47 PM, Brad Beckmann brad.beckm...@amd.com wrote: changeset 185ad61a4117 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=185ad61a4117 description: ruby: Ruby support for LLSC diffstat: 4 files changed, 102 insertions(+), 20 deletions(-) src/mem/ruby/system/CacheMemory.cc | 17 +- src/mem/ruby/system/RubyPort.cc | 58 +--- src/mem/ruby/system/SConscript | 2 + src/mem/ruby/system/Sequencer.cc | 45 ++-- --- diffs (215 lines): [...] - type = RubyRequestType_RMW_Write; } else { - panic(Unsupported ruby packet type\n); + if (pkt-isRead()) { + if (pkt-req-isInstFetch()) { + type = RubyRequestType_IFETCH; + } else { + type = RubyRequestType_LD; + } + } else if (pkt-isWrite()) { + type = RubyRequestType_ST; + } else if (pkt-isReadWrite()) { + // + // Fix me. Just because the packet is a read/write request does not + // necessary mean it is a read-modify-write atomic operation. + // + type = RubyRequestType_RMW_Write; + } else { + panic(Unsupported ruby packet type\n); + } } Did you miss my comment about isReadWrite() on this patch? Or did you decide not to do anything since it's no worse than before? Seems like it's at least worth adding a comment for posterity, but better yet to fix it now if it needs fixing. + + g_system_ptr-getProfiler()- profileTransition(Seq, + m_version, + Address(request.paddr), + , + SC Fail, + , + RubyRequestType_to_string(request.type)); + I'm still not thrilled with the line-count explosion your parameter per line formatting gives, especially when we're dedicating entire lines to parameters like ''. As far as parsing the call, there are enough args to this function that it's opaque to me what it's supposed to be doing regardless of how many lines it takes. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Cron m5t...@zizzer /z/m5/regression/do-regression quick
There aren't any statistics differences... the lines like: = Statistics differences =* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/o3-timing passed. are just output getting mixed up because jobs are running concurrently, combined with cheesy grep-based summary generation. If there was a difference in a significant statistic (e.g., not host_inst_rate) then the test would be flagged as failing. Steve On Mon, Mar 22, 2010 at 11:00 AM, Beckmann, Brad brad.beckm...@amd.com wrote: I'm curious why a few tests last night showed stats differences. When I locally run the regression tester, the only large statistically differences I see are in the host_inst_rate. I don't have direct access to the regression machine and I just want to make sure that I haven't missed anything or made ruby fundamentally slower. Could someone forward me the statsdiff files? Thanks, Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Cron Daemon Sent: Monday, March 22, 2010 12:23 AM To: m5-dev@m5sim.org Subject: [m5-dev] Cron m5t...@zizzer /z/m5/regression/do-regression quick * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/o3-timing passed. * build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple- timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple- atomic passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple- timing-ruby passed. * build/ALPHA_SE/tests/fast/quick/60.rubytest/alpha/linux/rubytest- ruby passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple- timing-ruby passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple- timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple- timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple- atomic passed. * build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple- atomic passed. * build/ALPHA_SE/tests/fast/quick/01.hello-2T-smt/alpha/linux/o3- timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/inorder- timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/o3-timing passed. * build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple- atomic-mp passed. * build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple- timing-mp passed. * build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest- ruby passed. * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/60.rubytest/alpha/linux/ru bytest-ruby-MOESI_hammer passed. = Statistics differences =* build/ALPHA_SE_MOESI_hammer/tests/fast/quick/00.hello/alpha/tru64/simpl e-timing-ruby-MOESI_hammer passed. * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/00.hello/alpha/linux/simpl e-timing-ruby-MOESI_hammer passed. * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/50.memtest/alpha/linux/mem test-ruby-MOESI_hammer passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/60.rubytest/alpha/li nux/rubytest-ruby-MESI_CMP_directory passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/00.hello/alpha/tru64 /simple-timing-ruby-MESI_CMP_directory passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/00.hello/alpha/linux /simple-timing-ruby-MESI_CMP_directory passed. * build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/50.memtest/alpha/lin ux/memtest-ruby-MESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/00.hello/alpha/tru6 4/simple-timing-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/00.hello/alpha/linu x/simple-timing-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/60.rubytest/alpha/l inux/rubytest-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/50.memtest/alpha/li nux/memtest-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/00.hello/alpha/tru64/si mple-timing-ruby-MOESI_CMP_token passed. * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/00.hello/alpha/linux/si mple-timing-ruby-MOESI_CMP_token passed. * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/60.rubytest/alpha/linux /rubytest-ruby-MOESI_CMP_token passed. * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/50.memtest/alpha/linux/ memtest-ruby-MOESI_CMP_token passed. * build/ALPHA_FS/tests/fast/quick/10.linux- boot/alpha/linux/tsunami-simple-atomic passed. * build/ALPHA_FS/tests/fast/quick/10.linux- boot/alpha/linux/tsunami-simple-atomic-dual passed. * build/ALPHA_FS/tests/fast/quick/10.linux- boot/alpha/linux/tsunami-simple-timing passed. * build/ALPHA_FS/tests/fast/quick/10.linux- boot/alpha/linux/tsunami-simple-timing-dual passed.
Re: [m5-dev] changeset in m5: ruby: Ruby support for LLSC
Yes, it's good to have an exhaustive list of all the ways that code is broken ;-). Thanks! On Mon, Mar 22, 2010 at 11:12 AM, Beckmann, Brad brad.beckm...@amd.com wrote: Yes, I only identified that the programmer's intent was ill specified, I should have also identified that the code never gets executed. I'll check in a change now. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Monday, March 22, 2010 9:13 AM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: ruby: Ruby support for LLSC OK... it does need a fix me comment, but the text doesn't really identify the issue; it should include something like this code never gets executed because isReadWrite() implies both isRead() and isWrite() are also true. For the record, we'll need to move away from sending RMW operations to the memory system to sending a locked read followed by a locked write to support generic x86 locked operations anyway. Steve On Mon, Mar 22, 2010 at 9:03 AM, Beckmann, Brad brad.beckm...@amd.com wrote: I did see your comments on the isReadWrite check and that is why I added the Fix Me comment. The problem is I did not originally add the line and I have no way of testing any changes to this functionality. I assume this is needed by the libruby interface and I'm expecting the folks at Wisconsin to fix this. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Sunday, March 21, 2010 11:05 PM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: ruby: Ruby support for LLSC Comments below... On Sun, Mar 21, 2010 at 10:47 PM, Brad Beckmann brad.beckm...@amd.com wrote: changeset 185ad61a4117 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=185ad61a4117 description: ruby: Ruby support for LLSC diffstat: 4 files changed, 102 insertions(+), 20 deletions(-) src/mem/ruby/system/CacheMemory.cc | 17 +- src/mem/ruby/system/RubyPort.cc | 58 +--- src/mem/ruby/system/SConscript | 2 + src/mem/ruby/system/Sequencer.cc | 45 ++-- --- diffs (215 lines): [...] - type = RubyRequestType_RMW_Write; } else { - panic(Unsupported ruby packet type\n); + if (pkt-isRead()) { + if (pkt-req-isInstFetch()) { + type = RubyRequestType_IFETCH; + } else { + type = RubyRequestType_LD; + } + } else if (pkt-isWrite()) { + type = RubyRequestType_ST; + } else if (pkt-isReadWrite()) { + // + // Fix me. Just because the packet is a read/write request does not + // necessary mean it is a read-modify-write atomic operation. + // + type = RubyRequestType_RMW_Write; + } else { + panic(Unsupported ruby packet type\n); + } } Did you miss my comment about isReadWrite() on this patch? Or did you decide not to do anything since it's no worse than before? Seems like it's at least worth adding a comment for posterity, but better yet to fix it now if it needs fixing. + + g_system_ptr-getProfiler()- profileTransition(Seq, + m_version, + Address(request.paddr), + , + SC Fail, + , + RubyRequestType_to_string(request.type)); + I'm still not thrilled with the line-count explosion your parameter per line formatting gives, especially when we're dedicating entire lines to parameters like ''. As far as parsing the call, there are enough args to this function that it's opaque to me what it's supposed to be doing regardless of how many lines it takes. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] changeset in m5: ruby: improved isReadWrite fix me comment
changeset b78b3a9e205f in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=b78b3a9e205f description: ruby: improved isReadWrite fix me comment diffstat: 1 file changed, 4 insertions(+), 2 deletions(-) src/mem/ruby/system/RubyPort.cc |6 -- diffs (16 lines): diff -r 6bf327b128c6 -r b78b3a9e205f src/mem/ruby/system/RubyPort.cc --- a/src/mem/ruby/system/RubyPort.cc Sun Mar 21 21:22:22 2010 -0700 +++ b/src/mem/ruby/system/RubyPort.cc Mon Mar 22 11:19:17 2010 -0700 @@ -230,8 +230,10 @@ type = RubyRequestType_ST; } else if (pkt-isReadWrite()) { // -// Fix me. Just because the packet is a read/write request does not -// necessary mean it is a read-modify-write atomic operation. +// Fix me. This conditional will never be executed because +// isReadWrite() is just an OR of isRead() and isWrite(). +// Furthermore, just because the packet is a read/write request does +// not necessary mean it is a read-modify-write atomic operation. // type = RubyRequestType_RMW_Write; } else { ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Cron m5t...@zizzer /z/m5/regression/do-regression --scratch all
We're still getting those 'invalid IPR' errors... On Sun, Mar 21, 2010 at 3:46 AM, Cron Daemon r...@zizzer.eecs.umich.edu wrote: * build/ALPHA_SE/tests/fast/long/70.twolf/alpha/tru64/inorder-timing FAILED! * build/ALPHA_SE/tests/fast/long/50.vortex/alpha/tru64/inorder-timing FAILED! ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] [PATCH] CPU: Make the CPU wait until initiateAcc finishes before calling completeAcc
Ok. This sounds plausible to me, and I'm glad it's not going to be so much work. I'm sure you already thought of this, but along the lines of what we were talking about in that other thread, something that important and subtle will deserve a big obvious comment or two. Gabe Steve Reinhardt wrote: Hi Gabe, On Sun, Mar 21, 2010 at 5:22 PM, Gabe Black gbl...@eecs.umich.edu wrote: It's been a while since I've looked at this, but I want to make sure I remember to respond so I don't want to wait until I have a chance to re-research all this. Isn't the problem that initiateAcc ends up calling completeAcc mid-function, that cleans up after the access, then initateAcc gets control back and tries to finish initiating the access? Yes and no... you're right that that's how we viewed it before (so your memory is good), and that's one way of looking at it, and it's not incorrect. I'm saying that if we look at the problem differently I think there's an easier solution. The problem to me seems to be that initiateAcc doesn't know whether it's just letting the rest of the CPU know something needs to happen later, or if it's actually causing that thing to happen right away. The other part, completeAcc, doesn't know if it's happening in the middle of initiateAcc, or if its in a callback. If it's in initiateAcc there are things it (or maybe initiateAcc, is that what you're saying?) shouldn't touch. If it's in a callback, it's likely the last thing to handle the instruction/translation/packet/whatever, so it needs to be sure to clean up everything. These seem like contradictory responsibilities. You've got the right idea there... basically initiateAcc() actually initiates the acc when it calls read() or write(), and it can't count on the translation being complete when read()/write() returns, but conversely it also shouldn't count on the translation *not* being complete when read()/write() returns. So there really shouldn't be any code after read()/write() that should care one way or the other. If we get rid of the code that does, then we'll solve the problem without adding any complexity to the control flow. As a specific example, the one thing that appears to give us trouble in initiateAcc() is this: fault = xc-write((uint%(mem_acc_size)d_t)Mem, EA, memAccessFlags, NULL); if (traceData) { traceData-setData(Mem); } ...and the root problem is that if the access completes in write(), traceData may not be valid any more. The ironic thing here is that (1) the value of Mem doesn't depend on write(), so the line could be moved before the call; and (2) traceData-setData() is also called in write() itself, so this whole line is redundant. So we can fix this specific problem by either (1) moving the traceData line above write() or (2) preferably, deleting the line entirely. Similarly, read()/write() initiate the translation when they call translateTiming(), and there shouldn't be any code in read()/write() after that point that depends on the translation having completed *or not*. Interestingly there is code (the req-isUncacheable() check, and a setData() call in read()) that seems like it requires the translation to have completed, which is just as incorrect. That code should be moved somewhere else (like finishTranslation()) where it can be guaranteed that the translation has finished. What I had been advocating before (if I remember right) was to disambiguate what the circumstances each function can expect by breaking up the call chains so that the functions don't end up nested inside each other unpredictably. Like I said, though, I'm making sure I respond rather than getting my facts straight, so please let me know if I'm missing your point (likely) or misunderstanding something. That would work, but this approach is simpler since it doesn't depend on figuring out what circumstances you're in. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] changeset in m5: inorder: fix address list bug
changeset 7739d67ca64f in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=7739d67ca64f description: inorder: fix address list bug diffstat: 3 files changed, 14 insertions(+), 13 deletions(-) src/cpu/inorder/SConscript |2 +- src/cpu/inorder/cpu.cc |2 +- src/cpu/inorder/resources/cache_unit.cc | 23 --- diffs (88 lines): diff -r b78b3a9e205f -r 7739d67ca64f src/cpu/inorder/SConscript --- a/src/cpu/inorder/SConscriptMon Mar 22 11:19:17 2010 -0700 +++ b/src/cpu/inorder/SConscriptMon Mar 22 15:38:28 2010 -0400 @@ -61,7 +61,7 @@ 'InOrderMDU', 'InOrderAGEN', 'InOrderFetchSeq', 'InOrderTLB', 'InOrderBPred', 'InOrderDecode', 'InOrderExecute', 'InOrderInstBuffer', 'InOrderUseDef', 'InOrderGraduation', 'InOrderCachePort', 'RegDepMap', 'Resource', - 'ThreadModel']) + 'ThreadModel', 'AddrDep']) Source('pipeline_traits.cc') Source('inorder_dyn_inst.cc') diff -r b78b3a9e205f -r 7739d67ca64f src/cpu/inorder/cpu.cc --- a/src/cpu/inorder/cpu.ccMon Mar 22 11:19:17 2010 -0700 +++ b/src/cpu/inorder/cpu.ccMon Mar 22 15:38:28 2010 -0400 @@ -1335,7 +1335,7 @@ while (!reqRemoveList.empty()) { ResourceRequest *res_req = reqRemoveList.front(); -DPRINTF(InOrderCPU, [tid:%i] [sn:%lli]: Removing Request +DPRINTF(Resource, [tid:%i] [sn:%lli]: Removing Request [stage_num:%i] [res:%s] [slot:%i] [completed:%i].\n, res_req-inst-threadNumber, res_req-inst-seqNum, diff -r b78b3a9e205f -r 7739d67ca64f src/cpu/inorder/resources/cache_unit.cc --- a/src/cpu/inorder/resources/cache_unit.cc Mon Mar 22 11:19:17 2010 -0700 +++ b/src/cpu/inorder/resources/cache_unit.cc Mon Mar 22 15:38:28 2010 -0400 @@ -188,12 +188,18 @@ addrList[tid].push_back(req_addr); addrMap[tid][req_addr] = inst-seqNum; -DPRINTF(InOrderCachePort, -[tid:%i]: [sn:%i]: Address %08p added to dependency list\n, -inst-readTid(), inst-seqNum, req_addr); + DPRINTF(AddrDep, [tid:%i]: [sn:%i]: Address %08p added to dependency list\n, inst-readTid(), inst-seqNum, req_addr); + +//@NOTE: 10 is an arbitrarily high number here, but to be exact +// we would need to know the # of outstanding accesses +// a priori. Information like fetch width, stage width, +// and the branch resolution stage would be useful for the +// icache_port (among other things). For the dcache, the # +// of outstanding cache accesses might be sufficient. +assert(addrList[tid].size() 10); } void @@ -203,6 +209,8 @@ Addr mem_addr = inst-getMemAddr(); +inst-unsetMemAddr(); + // Erase from Address List vectorAddr::iterator vect_it = find(addrList[tid].begin(), addrList[tid].end(), mem_addr); @@ -1106,8 +1114,6 @@ tid, cache_req-inst-readPC()); cache_req-setMemAccCompleted(); } - -inst-unsetMemAddr(); } void @@ -1225,10 +1231,6 @@ // Mark slot for removal from resource slot_remove_list.push_back(req_ptr-getSlot()); - -DPRINTF(InOrderCachePort, -[tid:%i] Squashing request from [sn:%i]\n, -req_ptr-getInst()-readTid(), req_ptr-getInst()-seqNum); } else { DPRINTF(InOrderCachePort, [tid:%i] Request from [sn:%i] squashed, but still pending completion.\n, @@ -1246,8 +1248,7 @@ req_ptr-getInst()-getMemAddr()); removeAddrDependency(req_ptr-getInst()); -} - +} } map_it++; ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] changeset in m5: ruby: improved isReadWrite fix me comment
Given that the code is unreachable, using warn() seems somewhat irrelevant. :-) I can see your point though in terms of being able to grep for possible problems. Steve On Mon, Mar 22, 2010 at 1:57 PM, nathan binkert n...@binkert.org wrote: IMHO, things like this shouldn't be comments, but should rather use hack() or warn() to notify the user that something fishy might be going on. Nate On Mon, Mar 22, 2010 at 11:22 AM, Brad Beckmann brad.beckm...@amd.com wrote: changeset b78b3a9e205f in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=b78b3a9e205f description: ruby: improved isReadWrite fix me comment diffstat: 1 file changed, 4 insertions(+), 2 deletions(-) src/mem/ruby/system/RubyPort.cc | 6 -- diffs (16 lines): diff -r 6bf327b128c6 -r b78b3a9e205f src/mem/ruby/system/RubyPort.cc --- a/src/mem/ruby/system/RubyPort.cc Sun Mar 21 21:22:22 2010 -0700 +++ b/src/mem/ruby/system/RubyPort.cc Mon Mar 22 11:19:17 2010 -0700 @@ -230,8 +230,10 @@ type = RubyRequestType_ST; } else if (pkt-isReadWrite()) { // - // Fix me. Just because the packet is a read/write request does not - // necessary mean it is a read-modify-write atomic operation. + // Fix me. This conditional will never be executed because + // isReadWrite() is just an OR of isRead() and isWrite(). + // Furthermore, just because the packet is a read/write request does + // not necessary mean it is a read-modify-write atomic operation. // type = RubyRequestType_RMW_Write; } else { ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] changeset in m5: ruby: improved isReadWrite fix me comment
More seriously, yes, something like: if (pkt-isReadWrite()) warn(this is probably broken); does seem reasonable. On Mon, Mar 22, 2010 at 2:01 PM, Steve Reinhardt ste...@gmail.com wrote: Given that the code is unreachable, using warn() seems somewhat irrelevant. :-) I can see your point though in terms of being able to grep for possible problems. Steve On Mon, Mar 22, 2010 at 1:57 PM, nathan binkert n...@binkert.org wrote: IMHO, things like this shouldn't be comments, but should rather use hack() or warn() to notify the user that something fishy might be going on. Nate On Mon, Mar 22, 2010 at 11:22 AM, Brad Beckmann brad.beckm...@amd.com wrote: changeset b78b3a9e205f in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=b78b3a9e205f description: ruby: improved isReadWrite fix me comment diffstat: 1 file changed, 4 insertions(+), 2 deletions(-) src/mem/ruby/system/RubyPort.cc | 6 -- diffs (16 lines): diff -r 6bf327b128c6 -r b78b3a9e205f src/mem/ruby/system/RubyPort.cc --- a/src/mem/ruby/system/RubyPort.cc Sun Mar 21 21:22:22 2010 -0700 +++ b/src/mem/ruby/system/RubyPort.cc Mon Mar 22 11:19:17 2010 -0700 @@ -230,8 +230,10 @@ type = RubyRequestType_ST; } else if (pkt-isReadWrite()) { // - // Fix me. Just because the packet is a read/write request does not - // necessary mean it is a read-modify-write atomic operation. + // Fix me. This conditional will never be executed because + // isReadWrite() is just an OR of isRead() and isWrite(). + // Furthermore, just because the packet is a read/write request does + // not necessary mean it is a read-modify-write atomic operation. // type = RubyRequestType_RMW_Write; } else { ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] changeset in m5: ruby: improved isReadWrite fix me comment
Right, just because it's unreachable as written doesn't mean that a warn can't be added somehow. More seriously, yes, something like: if (pkt-isReadWrite()) warn(this is probably broken); does seem reasonable. On Mon, Mar 22, 2010 at 2:01 PM, Steve Reinhardt ste...@gmail.com wrote: Given that the code is unreachable, using warn() seems somewhat irrelevant. :-) I can see your point though in terms of being able to grep for possible problems. Steve On Mon, Mar 22, 2010 at 1:57 PM, nathan binkert n...@binkert.org wrote: IMHO, things like this shouldn't be comments, but should rather use hack() or warn() to notify the user that something fishy might be going on. Nate On Mon, Mar 22, 2010 at 11:22 AM, Brad Beckmann brad.beckm...@amd.com wrote: changeset b78b3a9e205f in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=b78b3a9e205f description: ruby: improved isReadWrite fix me comment diffstat: 1 file changed, 4 insertions(+), 2 deletions(-) src/mem/ruby/system/RubyPort.cc | 6 -- diffs (16 lines): diff -r 6bf327b128c6 -r b78b3a9e205f src/mem/ruby/system/RubyPort.cc --- a/src/mem/ruby/system/RubyPort.cc Sun Mar 21 21:22:22 2010 -0700 +++ b/src/mem/ruby/system/RubyPort.cc Mon Mar 22 11:19:17 2010 -0700 @@ -230,8 +230,10 @@ type = RubyRequestType_ST; } else if (pkt-isReadWrite()) { // - // Fix me. Just because the packet is a read/write request does not - // necessary mean it is a read-modify-write atomic operation. + // Fix me. This conditional will never be executed because + // isReadWrite() is just an OR of isRead() and isWrite(). + // Furthermore, just because the packet is a read/write request does + // not necessary mean it is a read-modify-write atomic operation. // type = RubyRequestType_RMW_Write; } else { ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] changeset in m5: inorder: import name for addtl. bpred stats
changeset 5ae66d5f5ca2 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=5ae66d5f5ca2 description: inorder: import name for addtl. bpred stats diffstat: 3 files changed, 22 insertions(+), 19 deletions(-) src/cpu/inorder/resources/bpred_unit.cc | 32 +++-- src/cpu/inorder/resources/bpred_unit.hh |7 - src/cpu/inorder/resources/branch_predictor.cc |2 - diffs (121 lines): diff -r c207d418514e -r 5ae66d5f5ca2 src/cpu/inorder/resources/bpred_unit.cc --- a/src/cpu/inorder/resources/bpred_unit.cc Mon Mar 22 16:59:12 2010 -0400 +++ b/src/cpu/inorder/resources/bpred_unit.cc Mon Mar 22 17:19:48 2010 -0400 @@ -39,10 +39,9 @@ using namespace std; using namespace ThePipeline; -BPredUnit::BPredUnit(ThePipeline::Params *params) - : BTB(params-BTBEntries, -params-BTBTagSize, -params-instShiftAmt) +BPredUnit::BPredUnit(Resource *_res, ThePipeline::Params *params) +: res(_res), + BTB(params-BTBEntries, params-BTBTagSize, params-instShiftAmt) { // Setup the selected predictor. if (params-predType == local) { @@ -70,48 +69,47 @@ RAS[i].init(params-RASSize); } +std::string +BPredUnit::name() +{ +return res-name(); +} void BPredUnit::regStats() { lookups -.name(name() + .BPredUnit.lookups) +.name(name() + .lookups) .desc(Number of BP lookups) ; condPredicted -.name(name() + .BPredUnit.condPredicted) +.name(name() + .condPredicted) .desc(Number of conditional branches predicted) ; condIncorrect -.name(name() + .BPredUnit.condIncorrect) +.name(name() + .condIncorrect) .desc(Number of conditional branches incorrect) ; BTBLookups -.name(name() + .BPredUnit.BTBLookups) +.name(name() + .BTBLookups) .desc(Number of BTB lookups) ; BTBHits -.name(name() + .BPredUnit.BTBHits) +.name(name() + .BTBHits) .desc(Number of BTB hits) ; -BTBCorrect -.name(name() + .BPredUnit.BTBCorrect) -.desc(Number of correct BTB predictions (this stat may not - work properly.) -; - usedRAS -.name(name() + .BPredUnit.usedRAS) +.name(name() + .usedRAS) .desc(Number of times the RAS was used to get a target.) ; RASIncorrect -.name(name() + .BPredUnit.RASInCorrect) +.name(name() + .RASInCorrect) .desc(Number of incorrect RAS predictions.) ; } diff -r c207d418514e -r 5ae66d5f5ca2 src/cpu/inorder/resources/bpred_unit.hh --- a/src/cpu/inorder/resources/bpred_unit.hh Mon Mar 22 16:59:12 2010 -0400 +++ b/src/cpu/inorder/resources/bpred_unit.hh Mon Mar 22 17:19:48 2010 -0400 @@ -39,6 +39,7 @@ #include cpu/inst_seq.hh #include cpu/inorder/inorder_dyn_inst.hh #include cpu/inorder/pipeline_traits.hh +#include cpu/inorder/resource.hh #include cpu/pred/2bit_local.hh #include cpu/pred/btb.hh #include cpu/pred/ras.hh @@ -65,7 +66,9 @@ /** * @param params The params object, that has the size of the BP and BTB. */ -BPredUnit(ThePipeline::Params *params); +BPredUnit(Resource *_res, ThePipeline::Params *params); + +std::string name(); /** * Registers statistics. @@ -169,6 +172,8 @@ void dump(); private: +Resource *res; + struct PredictorHistory { /** * Makes a predictor history struct that contains any diff -r c207d418514e -r 5ae66d5f5ca2 src/cpu/inorder/resources/branch_predictor.cc --- a/src/cpu/inorder/resources/branch_predictor.cc Mon Mar 22 16:59:12 2010 -0400 +++ b/src/cpu/inorder/resources/branch_predictor.cc Mon Mar 22 17:19:48 2010 -0400 @@ -39,7 +39,7 @@ BranchPredictor::BranchPredictor(std::string res_name, int res_id, int res_width, int res_latency, InOrderCPU *_cpu, ThePipeline::Params *params) : Resource(res_name, res_id, res_width, res_latency, _cpu), - branchPred(params) + branchPred(this, params) { instSize = sizeof(MachInst); } ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] changeset in m5: ruby: style pass
changeset bc0b6ea676b5 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=bc0b6ea676b5 description: ruby: style pass diffstat: 92 files changed, 6869 insertions(+), 8071 deletions(-) src/mem/ruby/buffers/MessageBuffer.cc| 490 +- src/mem/ruby/buffers/MessageBuffer.hh| 214 - src/mem/ruby/buffers/MessageBufferNode.cc| 11 src/mem/ruby/buffers/MessageBufferNode.hh| 80 src/mem/ruby/common/Address.cc | 48 src/mem/ruby/common/Address.hh | 300 - src/mem/ruby/common/Consumer.hh | 132 src/mem/ruby/common/DataBlock.cc | 20 src/mem/ruby/common/DataBlock.hh | 185 - src/mem/ruby/common/Debug.cc | 427 +- src/mem/ruby/common/Debug.hh | 418 +- src/mem/ruby/common/Driver.cc| 40 src/mem/ruby/common/Driver.hh| 77 src/mem/ruby/common/Global.cc|1 src/mem/ruby/common/Global.hh| 46 src/mem/ruby/common/Histogram.cc | 218 - src/mem/ruby/common/Histogram.hh | 96 src/mem/ruby/common/Message.cc | 34 src/mem/ruby/common/NetDest.cc | 294 - src/mem/ruby/common/NetDest.hh | 150 src/mem/ruby/common/SConscript |1 src/mem/ruby/common/SubBlock.cc | 51 src/mem/ruby/common/SubBlock.hh | 87 src/mem/ruby/eventqueue/RubyEventQueue.cc| 43 src/mem/ruby/eventqueue/RubyEventQueue.hh| 69 src/mem/ruby/eventqueue/RubyEventQueueNode.cc| 23 src/mem/ruby/eventqueue/RubyEventQueueNode.hh| 78 src/mem/ruby/filters/AbstractBloomFilter.hh | 57 src/mem/ruby/filters/BlockBloomFilter.cc | 207 - src/mem/ruby/filters/BlockBloomFilter.hh | 69 src/mem/ruby/filters/BulkBloomFilter.cc | 383 +- src/mem/ruby/filters/BulkBloomFilter.hh | 75 src/mem/ruby/filters/GenericBloomFilter.cc | 136 src/mem/ruby/filters/GenericBloomFilter.hh | 82 src/mem/ruby/filters/H3BloomFilter.cc| 1791 ++ src/mem/ruby/filters/H3BloomFilter.hh| 100 src/mem/ruby/filters/LSB_CountingBloomFilter.cc | 195 - src/mem/ruby/filters/LSB_CountingBloomFilter.hh | 69 src/mem/ruby/filters/MultiBitSelBloomFilter.cc | 287 - src/mem/ruby/filters/MultiBitSelBloomFilter.hh | 94 src/mem/ruby/filters/MultiGrainBloomFilter.cc| 167 src/mem/ruby/filters/MultiGrainBloomFilter.hh| 81 src/mem/ruby/filters/NonCountingBloomFilter.cc | 193 - src/mem/ruby/filters/NonCountingBloomFilter.hh | 79 src/mem/ruby/libruby.cc | 267 - src/mem/ruby/libruby.hh | 84 src/mem/ruby/libruby_internal.hh | 10 src/mem/ruby/slicc_interface/AbstractCacheEntry.cc | 19 src/mem/ruby/slicc_interface/AbstractCacheEntry.hh | 53 src/mem/ruby/slicc_interface/AbstractController.hh | 45 src/mem/ruby/slicc_interface/AbstractEntry.cc|9 src/mem/ruby/slicc_interface/AbstractEntry.hh| 49 src/mem/ruby/slicc_interface/AbstractProtocol.hh | 36 src/mem/ruby/slicc_interface/Message.hh | 80 src/mem/ruby/slicc_interface/NetworkMessage.hh | 101 src/mem/ruby/slicc_interface/RubySlicc_ComponentMapping.cc |2 src/mem/ruby/slicc_interface/RubySlicc_ComponentMapping.hh | 131 src/mem/ruby/slicc_interface/RubySlicc_Profiler_interface.cc | 84 src/mem/ruby/slicc_interface/RubySlicc_Profiler_interface.hh | 44 src/mem/ruby/slicc_interface/RubySlicc_Util.hh | 167 src/mem/ruby/slicc_interface/RubySlicc_includes.hh |6 src/mem/ruby/system/AbstractMemOrCache.hh| 56 src/mem/ruby/system/AbstractReplacementPolicy.hh | 76 src/mem/ruby/system/CacheMemory.cc | 618 +-- src/mem/ruby/system/CacheMemory.hh | 219 - src/mem/ruby/system/DMASequencer.cc | 207 - src/mem/ruby/system/DMASequencer.hh | 62 src/mem/ruby/system/DirectoryMemory.cc | 303 - src/mem/ruby/system/DirectoryMemory.hh |
[m5-dev] changeset in m5: inorder: update vortex regression
changeset 0039707f915e in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=0039707f915e description: inorder: update vortex regression diffstat: 3 files changed, 164 insertions(+), 145 deletions(-) tests/long/50.vortex/ref/alpha/tru64/inorder-timing/config.ini |3 tests/long/50.vortex/ref/alpha/tru64/inorder-timing/simout |6 tests/long/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt | 300 +- diffs (truncated from 472 to 300 lines): diff -r 5ae66d5f5ca2 -r 0039707f915e tests/long/50.vortex/ref/alpha/tru64/inorder-timing/config.ini --- a/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/config.iniMon Mar 22 17:19:48 2010 -0400 +++ b/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/config.iniMon Mar 22 23:39:23 2010 -0400 @@ -79,6 +79,7 @@ latency=1000 max_miss_count=0 mshrs=10 +num_cpus=1 prefetch_data_accesses_only=false prefetch_degree=1 prefetch_latency=1 @@ -113,6 +114,7 @@ latency=1000 max_miss_count=0 mshrs=10 +num_cpus=1 prefetch_data_accesses_only=false prefetch_degree=1 prefetch_latency=1 @@ -147,6 +149,7 @@ latency=1 max_miss_count=0 mshrs=10 +num_cpus=1 prefetch_data_accesses_only=false prefetch_degree=1 prefetch_latency=10 diff -r 5ae66d5f5ca2 -r 0039707f915e tests/long/50.vortex/ref/alpha/tru64/inorder-timing/simout --- a/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/simoutMon Mar 22 17:19:48 2010 -0400 +++ b/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/simoutMon Mar 22 23:39:23 2010 -0400 @@ -5,9 +5,9 @@ All Rights Reserved -M5 compiled Jan 30 2010 14:58:44 -M5 revision 4b602939e245 6707 default inorder_vortex_alpha qtip tip -M5 started Jan 30 2010 14:58:45 +M5 compiled Mar 22 2010 20:37:53 +M5 revision 8f0f6a2f2f48 7039 default qtip tip inorder_twolf qbase +M5 started Mar 22 2010 20:37:54 M5 executing on zooks command line: build/ALPHA_SE/m5.fast -d build/ALPHA_SE/tests/fast/long/50.vortex/alpha/tru64/inorder-timing -re tests/run.py build/ALPHA_SE/tests/fast/long/50.vortex/alpha/tru64/inorder-timing Global frequency set at 1 ticks per second diff -r 5ae66d5f5ca2 -r 0039707f915e tests/long/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt --- a/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt Mon Mar 22 17:19:48 2010 -0400 +++ b/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt Mon Mar 22 23:39:23 2010 -0400 @@ -1,53 +1,60 @@ -- Begin Simulation Statistics -- -host_inst_rate 51950 # Simulator instruction rate (inst/s) -host_mem_usage 166756 # Number of bytes of host memory used -host_seconds 1700.48 # Real time elapsed on the host -host_tick_rate 63220517 # Simulator tick rate (ticks/s) +host_inst_rate 52039 # Simulator instruction rate (inst/s) +host_mem_usage 166880 # Number of bytes of host memory used +host_seconds 1697.59 # Real time elapsed on the host +host_tick_rate 62436709 # Simulator tick rate (ticks/s) sim_freq 1 # Frequency of simulated ticks sim_insts88340673 # Number of instructions simulated -sim_seconds 0.107505 # Number of seconds simulated -sim_ticks107505320500 # Number of ticks simulated +sim_seconds 0.105992 # Number of seconds simulated +sim_ticks105992011500 # Number of ticks simulated system.cpu.AGEN-Unit.instReqsProcessed 35224018 # Number of Instructions Requests that completed in this resource. -system.cpu.Branch-Predictor.instReqsProcessed 88523379 # Number of Instructions Requests that completed in this resource. -system.cpu.Branch-Predictor.predictedNotTaken 10466150 # Number of Branches Predicted As Not Taken (False). -system.cpu.Branch-Predictor.predictedTaken 3314731 # Number of Branches Predicted As Taken (True). -system.cpu.Decode-Unit.instReqsProcessed 88523379 # Number of Instructions Requests that completed in this resource. +system.cpu.Branch-Predictor.BTBHits 4998012 # Number of BTB hits +system.cpu.Branch-Predictor.BTBLookups 12031092 # Number of BTB lookups
[m5-dev] changeset in m5: inorder: update twolf regression
changeset ba1ff0a71710 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=ba1ff0a71710 description: inorder: update twolf regression diffstat: 3 files changed, 143 insertions(+), 125 deletions(-) tests/long/70.twolf/ref/alpha/tru64/inorder-timing/config.ini |3 tests/long/70.twolf/ref/alpha/tru64/inorder-timing/simout |6 tests/long/70.twolf/ref/alpha/tru64/inorder-timing/stats.txt | 259 +- diffs (truncated from 446 to 300 lines): diff -r 0039707f915e -r ba1ff0a71710 tests/long/70.twolf/ref/alpha/tru64/inorder-timing/config.ini --- a/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/config.ini Mon Mar 22 23:39:23 2010 -0400 +++ b/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/config.ini Tue Mar 23 00:14:52 2010 -0400 @@ -79,6 +79,7 @@ latency=1000 max_miss_count=0 mshrs=10 +num_cpus=1 prefetch_data_accesses_only=false prefetch_degree=1 prefetch_latency=1 @@ -113,6 +114,7 @@ latency=1000 max_miss_count=0 mshrs=10 +num_cpus=1 prefetch_data_accesses_only=false prefetch_degree=1 prefetch_latency=1 @@ -147,6 +149,7 @@ latency=1 max_miss_count=0 mshrs=10 +num_cpus=1 prefetch_data_accesses_only=false prefetch_degree=1 prefetch_latency=10 diff -r 0039707f915e -r ba1ff0a71710 tests/long/70.twolf/ref/alpha/tru64/inorder-timing/simout --- a/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/simout Mon Mar 22 23:39:23 2010 -0400 +++ b/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/simout Tue Mar 23 00:14:52 2010 -0400 @@ -5,9 +5,9 @@ All Rights Reserved -M5 compiled Jan 29 2010 09:29:58 -M5 revision a196f8cf520a 6706 default qtip tip inorder_twolf_alpha -M5 started Jan 29 2010 09:31:14 +M5 compiled Mar 22 2010 23:40:09 +M5 revision ec3385b5d6df 7040 default qtip tip inorder_twolf_update +M5 started Mar 22 2010 23:40:10 M5 executing on zooks command line: build/ALPHA_SE/m5.fast -d build/ALPHA_SE/tests/fast/long/70.twolf/alpha/tru64/inorder-timing -re tests/run.py build/ALPHA_SE/tests/fast/long/70.twolf/alpha/tru64/inorder-timing Couldn't unlink build/ALPHA_SE/tests/fast/long/70.twolf/alpha/tru64/inorder-timing/smred.sav diff -r 0039707f915e -r ba1ff0a71710 tests/long/70.twolf/ref/alpha/tru64/inorder-timing/stats.txt --- a/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/stats.txt Mon Mar 22 23:39:23 2010 -0400 +++ b/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/stats.txt Tue Mar 23 00:14:52 2010 -0400 @@ -1,53 +1,60 @@ -- Begin Simulation Statistics -- -host_inst_rate 55182 # Simulator instruction rate (inst/s) -host_mem_usage 156168 # Number of bytes of host memory used -host_seconds 1665.47 # Real time elapsed on the host -host_tick_rate 59164617 # Simulator tick rate (ticks/s) +host_inst_rate 53958 # Simulator instruction rate (inst/s) +host_mem_usage 156280 # Number of bytes of host memory used +host_seconds 1703.24 # Real time elapsed on the host +host_tick_rate 57999569 # Simulator tick rate (ticks/s) sim_freq 1 # Frequency of simulated ticks sim_insts91903056 # Number of instructions simulated -sim_seconds 0.098537 # Number of seconds simulated -sim_ticks 98536744000 # Number of ticks simulated +sim_seconds 0.098787 # Number of seconds simulated +sim_ticks 98787075000 # Number of ticks simulated system.cpu.AGEN-Unit.instReqsProcessed 26537108 # Number of Instructions Requests that completed in this resource. -system.cpu.Branch-Predictor.instReqsProcessed 92657148 # Number of Instructions Requests that completed in this resource. -system.cpu.Branch-Predictor.predictedNotTaken 8232810 # Number of Branches Predicted As Not Taken (False). -system.cpu.Branch-Predictor.predictedTaken 2041716 # Number of Branches Predicted As Taken (True). -system.cpu.Decode-Unit.instReqsProcessed 92657148 # Number of Instructions Requests that completed in this resource. +system.cpu.Branch-Predictor.BTBHits 5943749 # Number of BTB hits +system.cpu.Branch-Predictor.BTBLookups9141724 # Number of BTB
[m5-dev] changeset in m5: m5merge: ruby + inorder
changeset 6ecffa1e1ba2 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=6ecffa1e1ba2 description: m5merge: ruby + inorder automerge of updated inorder regressions and ruby style pass diffstat: 6 files changed, 307 insertions(+), 270 deletions(-) tests/long/50.vortex/ref/alpha/tru64/inorder-timing/config.ini |3 tests/long/50.vortex/ref/alpha/tru64/inorder-timing/simout |6 tests/long/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt | 300 +- tests/long/70.twolf/ref/alpha/tru64/inorder-timing/config.ini |3 tests/long/70.twolf/ref/alpha/tru64/inorder-timing/simout |6 tests/long/70.twolf/ref/alpha/tru64/inorder-timing/stats.txt | 259 diffs (truncated from 918 to 300 lines): diff -r bc0b6ea676b5 -r 6ecffa1e1ba2 tests/long/50.vortex/ref/alpha/tru64/inorder-timing/config.ini --- a/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/config.iniMon Mar 22 18:43:53 2010 -0700 +++ b/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/config.iniTue Mar 23 00:21:19 2010 -0400 @@ -79,6 +79,7 @@ latency=1000 max_miss_count=0 mshrs=10 +num_cpus=1 prefetch_data_accesses_only=false prefetch_degree=1 prefetch_latency=1 @@ -113,6 +114,7 @@ latency=1000 max_miss_count=0 mshrs=10 +num_cpus=1 prefetch_data_accesses_only=false prefetch_degree=1 prefetch_latency=1 @@ -147,6 +149,7 @@ latency=1 max_miss_count=0 mshrs=10 +num_cpus=1 prefetch_data_accesses_only=false prefetch_degree=1 prefetch_latency=10 diff -r bc0b6ea676b5 -r 6ecffa1e1ba2 tests/long/50.vortex/ref/alpha/tru64/inorder-timing/simout --- a/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/simoutMon Mar 22 18:43:53 2010 -0700 +++ b/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/simoutTue Mar 23 00:21:19 2010 -0400 @@ -5,9 +5,9 @@ All Rights Reserved -M5 compiled Jan 30 2010 14:58:44 -M5 revision 4b602939e245 6707 default inorder_vortex_alpha qtip tip -M5 started Jan 30 2010 14:58:45 +M5 compiled Mar 22 2010 20:37:53 +M5 revision 8f0f6a2f2f48 7039 default qtip tip inorder_twolf qbase +M5 started Mar 22 2010 20:37:54 M5 executing on zooks command line: build/ALPHA_SE/m5.fast -d build/ALPHA_SE/tests/fast/long/50.vortex/alpha/tru64/inorder-timing -re tests/run.py build/ALPHA_SE/tests/fast/long/50.vortex/alpha/tru64/inorder-timing Global frequency set at 1 ticks per second diff -r bc0b6ea676b5 -r 6ecffa1e1ba2 tests/long/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt --- a/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt Mon Mar 22 18:43:53 2010 -0700 +++ b/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt Tue Mar 23 00:21:19 2010 -0400 @@ -1,53 +1,60 @@ -- Begin Simulation Statistics -- -host_inst_rate 51950 # Simulator instruction rate (inst/s) -host_mem_usage 166756 # Number of bytes of host memory used -host_seconds 1700.48 # Real time elapsed on the host -host_tick_rate 63220517 # Simulator tick rate (ticks/s) +host_inst_rate 52039 # Simulator instruction rate (inst/s) +host_mem_usage 166880 # Number of bytes of host memory used +host_seconds 1697.59 # Real time elapsed on the host +host_tick_rate 62436709 # Simulator tick rate (ticks/s) sim_freq 1 # Frequency of simulated ticks sim_insts88340673 # Number of instructions simulated -sim_seconds 0.107505 # Number of seconds simulated -sim_ticks107505320500 # Number of ticks simulated +sim_seconds 0.105992 # Number of seconds simulated +sim_ticks105992011500 # Number of ticks simulated system.cpu.AGEN-Unit.instReqsProcessed 35224018 # Number of Instructions Requests that completed in this resource. -system.cpu.Branch-Predictor.instReqsProcessed 88523379 # Number of Instructions Requests that completed in this resource. -system.cpu.Branch-Predictor.predictedNotTaken 10466150 # Number of Branches Predicted As Not Taken (False). -system.cpu.Branch-Predictor.predictedTaken 3314731 # Number of Branches Predicted As Taken (True). -system.cpu.Decode-Unit.instReqsProcessed 88523379 #
[m5-dev] changeset in m5: inorder: update hello world for alpha and mips
changeset 86558845c195 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=86558845c195 description: inorder: update hello world for alpha and mips diffstat: 6 files changed, 132 insertions(+), 118 deletions(-) tests/quick/00.hello/ref/alpha/linux/inorder-timing/config.ini |2 tests/quick/00.hello/ref/alpha/linux/inorder-timing/simout | 10 tests/quick/00.hello/ref/alpha/linux/inorder-timing/stats.txt | 213 +- tests/quick/00.hello/ref/mips/linux/inorder-timing/config.ini |2 tests/quick/00.hello/ref/mips/linux/inorder-timing/simout |8 tests/quick/00.hello/ref/mips/linux/inorder-timing/stats.txt | 15 diffs (truncated from 465 to 300 lines): diff -r ba1ff0a71710 -r 86558845c195 tests/quick/00.hello/ref/alpha/linux/inorder-timing/config.ini --- a/tests/quick/00.hello/ref/alpha/linux/inorder-timing/config.iniTue Mar 23 00:14:52 2010 -0400 +++ b/tests/quick/00.hello/ref/alpha/linux/inorder-timing/config.iniTue Mar 23 00:26:53 2010 -0400 @@ -191,7 +191,7 @@ env= errout=cerr euid=100 -executable=/proj/aatl_perfmod_arch/m5_system_files/regression/test-progs/hello/bin/alpha/linux/hello +executable=/dist/m5/regression/test-progs/hello/bin/alpha/linux/hello gid=100 input=cin max_stack_size=67108864 diff -r ba1ff0a71710 -r 86558845c195 tests/quick/00.hello/ref/alpha/linux/inorder-timing/simout --- a/tests/quick/00.hello/ref/alpha/linux/inorder-timing/simoutTue Mar 23 00:14:52 2010 -0400 +++ b/tests/quick/00.hello/ref/alpha/linux/inorder-timing/simoutTue Mar 23 00:26:53 2010 -0400 @@ -5,13 +5,13 @@ All Rights Reserved -M5 compiled Feb 24 2010 23:12:40 -M5 revision 1a33ca29ec29 6980 default share-aware-test-update.patch tip qtip -M5 started Feb 25 2010 02:21:00 -M5 executing on SC2B0619 +M5 compiled Mar 23 2010 00:24:02 +M5 revision ba1ff0a71710 7040 default tip +M5 started Mar 23 2010 00:24:03 +M5 executing on zooks command line: build/ALPHA_SE/m5.fast -d build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/inorder-timing -re tests/run.py build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/inorder-timing Global frequency set at 1 ticks per second info: Entering event queue @ 0. Starting simulation... info: Increasing stack size by one page. Hello world! -Exiting @ tick 31286000 because target called exit() +Exiting @ tick 31225500 because target called exit() diff -r ba1ff0a71710 -r 86558845c195 tests/quick/00.hello/ref/alpha/linux/inorder-timing/stats.txt --- a/tests/quick/00.hello/ref/alpha/linux/inorder-timing/stats.txt Tue Mar 23 00:14:52 2010 -0400 +++ b/tests/quick/00.hello/ref/alpha/linux/inorder-timing/stats.txt Tue Mar 23 00:26:53 2010 -0400 @@ -1,53 +1,60 @@ -- Begin Simulation Statistics -- -host_inst_rate 37021 # Simulator instruction rate (inst/s) -host_mem_usage 190468 # Number of bytes of host memory used -host_seconds 0.17 # Real time elapsed on the host -host_tick_rate 180549624 # Simulator tick rate (ticks/s) +host_inst_rate 30611 # Simulator instruction rate (inst/s) +host_mem_usage 153332 # Number of bytes of host memory used +host_seconds 0.21 # Real time elapsed on the host +host_tick_rate 149038484 # Simulator tick rate (ticks/s) sim_freq 1 # Frequency of simulated ticks sim_insts6404 # Number of instructions simulated sim_seconds 0.31 # Number of seconds simulated -sim_ticks31286000 # Number of ticks simulated +sim_ticks31225500 # Number of ticks simulated system.cpu.AGEN-Unit.instReqsProcessed 2050 # Number of Instructions Requests that completed in this resource. -system.cpu.Branch-Predictor.instReqsProcessed 6581 # Number of Instructions Requests that completed in this resource. -system.cpu.Branch-Predictor.predictedNotTaken 924 # Number of Branches Predicted As Not Taken (False). -system.cpu.Branch-Predictor.predictedTaken 143 # Number of Branches Predicted As Taken (True). -system.cpu.Decode-Unit.instReqsProcessed 6581 # Number of Instructions Requests that completed in this resource. +system.cpu.Branch-Predictor.BTBHits
Re: [m5-dev] Cron m5t...@zizzer /z/m5/regression/do-regression --scratch all
Steve, where are you finding the 'Invalid IPR' message? This bug is kind of elusive since everything is working fine on my local machine, but trying to run off zizzer is painfully slow when I ssh in. I was speculating that this might be a case of me different versions of gcc or maybe something subtle like 32 v. 64-bit machine causing some problem that I didnt account for. But again, that's just wild speculation. Anyway, I know I can solve the problem relatively quick I just need some help figuring out where to replicate the bug so I can knock it out. Ideas? On Mon, Mar 22, 2010 at 2:54 PM, Steve Reinhardt ste...@gmail.com wrote: We're still getting those 'invalid IPR' errors... On Sun, Mar 21, 2010 at 3:46 AM, Cron Daemon r...@zizzer.eecs.umich.edu wrote: * build/ALPHA_SE/tests/fast/long/70.twolf/alpha/tru64/inorder-timing FAILED! * build/ALPHA_SE/tests/fast/long/50.vortex/alpha/tru64/inorder-timing FAILED! ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Cron m5t...@zizzer /z/m5/regression/do-regression --scratch all
Lookat the simerr files in these directories on zizzer: /z/m5/regression/zizzer/m5/build/ALPHA_SE/tests/fast/long/50.vortex/alpha/tru64/inorder-timing /z/m5/regression/zizzer/m5/build/ALPHA_SE/tests/fast/long/70.twolf/alpha/tru64/inorder-timing Steve On Mon, Mar 22, 2010 at 9:32 PM, Korey Sewell ksew...@umich.edu wrote: Steve, where are you finding the 'Invalid IPR' message? This bug is kind of elusive since everything is working fine on my local machine, but trying to run off zizzer is painfully slow when I ssh in. I was speculating that this might be a case of me different versions of gcc or maybe something subtle like 32 v. 64-bit machine causing some problem that I didnt account for. But again, that's just wild speculation. Anyway, I know I can solve the problem relatively quick I just need some help figuring out where to replicate the bug so I can knock it out. Ideas? On Mon, Mar 22, 2010 at 2:54 PM, Steve Reinhardt ste...@gmail.com wrote: We're still getting those 'invalid IPR' errors... On Sun, Mar 21, 2010 at 3:46 AM, Cron Daemon r...@zizzer.eecs.umich.edu wrote: * build/ALPHA_SE/tests/fast/long/70.twolf/alpha/tru64/inorder-timing FAILED! * build/ALPHA_SE/tests/fast/long/50.vortex/alpha/tru64/inorder-timing FAILED! ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev