I actually looked at the code a bit this time, and I have a hypothesis that the problem arises from two similar but fundamentally different models of "bypassing" potential event-based delays:
    main() { x_will_callback = x(); if (!x_will_callback) y(); }
    x() { if (...) { sched_callback(&cb); return true; }
          else { return false; } }
    cb() { y(); }

as opposed to:

    main() { x(); }
    x() { if (...) { sched_callback(&cb); }
          else { y(); /* or cb(); */ } }
    cb() { y(); }

Both of these have the overall effect of calling x() and then y(), sometimes with a delay and sometimes not. However, in the latter case y() is called from inside the call to x(), which leads to problems when that's not expected... basically this is the root of the initiateAcc/completeAcc problem. Also, if there's a cycle (like there is in our pipeline) where you do x,y,z,x,y,z,x,y,z, then as Gabe points out you can run into stack overflow problems too.

My hypothesis is that the old TimingSimpleCPU code worked because it always used the former model, and Gabe has introduced two points that use the latter: one in timingTranslate() and one in fetch(). I think the right solution is, for each of these, either to change it to the first model or to eliminate the bypass option altogether and always do a separately scheduled callback.

I think the distinction of having main() call y() directly rather than cb() is potentially important, as it gives you points where you can do slightly different things depending on whether you took the event path or bypassed it. It also (to me) provides some logical separation between "what comes next" (the code in y()) and how you got there.

Coming at this from a different angle: while the code is getting increasingly messy (or maybe it's just inherently complex), I'd say a significant fraction of the complexity comes from dealing with cache/page-crossing memory operations, which I don't think would be significantly improved by a global restructuring. (Let me know if anyone thinks otherwise.) Thus I'm not too keen on doing a significant restructuring, since I think the code will still be messy afterward.
On Wed, May 6, 2009 at 11:42 AM, Gabriel Michael Black
<gbl...@eecs.umich.edu> wrote:
> The example I mentioned would be if you have a microcode loop that
> doesn't touch memory to, for instance, stall until you get an
> interrupt or a countdown expires for a small delay.

Although I agree that it's good to avoid this possibility altogether, I'd argue that any microcode loop like you describe is broken. If for no other reason than power dissipation, I don't think you'd ever want to busy-wait in a real system; and even if you did, we wouldn't want to write it that way in m5, for performance reasons.

Steve
_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev