Brad, your reply clears the air somewhat.

The current patch allows us to use the existing O3 CPU with Ruby. Since the O3 CPU already provides Alpha's memory model, we get that for free. Now that we would like to have TSO as well, we need to work out how the two models would co-exist. I'll think more about it, but we need a broader consensus before going further.

--
Nilay

On Wed, 9 Nov 2011, Beckmann, Brad wrote:

I see. It sounds like you're still worried about how the RubyPort can support multiple M5 CPU ports and still adhere to a stronger consistency model. Sorry for not directly responding to that question earlier, but to me that seems like an orthogonal issue that you've already solved. If I recall correctly, the patch you sent out for review essentially attaches the multiple M5 CPU ports, representing simultaneous CPU requests, to the single RubyPort that represents the CPU's connection to the L1 caches. That seems reasonable to me and I don't see any problem with it. The key is that the CPU's LSQ cannot blindly issue simultaneous requests to the memory system without expecting and acting upon probes that occur between issue and retirement. Furthermore, the CPU needs to communicate to Ruby when the instructions associated with the memory operations retire (for loads) or reach the head of the store buffer (for stores). Once Ruby receives that notification, it can stop monitoring that location and move the cache block to a base state.
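
To make the issue/probe/retire interaction above concrete, here is a minimal toy sketch in Python. It is not gem5 or Ruby code: ToyLSQ, issue_load, probe, replay and retire_load are hypothetical names invented for illustration, and the model assumes a probe simply marks every in-flight load to the probed block for replay.

```python
from dataclasses import dataclass, field


@dataclass
class InFlightLoad:
    seq: int               # program-order sequence number
    block: int             # cache block address the load reads
    needs_replay: bool = False


@dataclass
class ToyLSQ:
    """Hypothetical toy model; not the gem5 LSQ or RubyPort interface."""
    loads: list = field(default_factory=list)
    monitored: set = field(default_factory=set)  # blocks being watched

    def issue_load(self, seq, block):
        # Issuing a load starts monitoring of its block.
        self.loads.append(InFlightLoad(seq, block))
        self.monitored.add(block)

    def probe(self, block):
        # A probe between issue and retirement hits every in-flight
        # load to that block; those loads must be replayed.
        if block not in self.monitored:
            return []
        hit = [ld for ld in self.loads if ld.block == block]
        for ld in hit:
            ld.needs_replay = True
        return [ld.seq for ld in hit]

    def replay(self, seq):
        # Re-issue a probed load so it observes the new value.
        for ld in self.loads:
            if ld.seq == seq:
                ld.needs_replay = False

    def retire_load(self, seq):
        # Retirement is the notification point: once no in-flight load
        # uses the block any more, monitoring of it can stop.
        ld = next(l for l in self.loads if l.seq == seq)
        if ld.needs_replay:
            raise RuntimeError("load was probed and must be replayed first")
        self.loads.remove(ld)
        if all(other.block != ld.block for other in self.loads):
            self.monitored.discard(ld.block)
```

In this sketch a probe that arrives after retire_load returns an empty list, which corresponds to the point where the memory system no longer needs to watch that location.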

Now to answer your specific question: We are definitely interested in a TSO model and in my opinion that is the only consistency model that we have to implement. Remember TSO is a valid implementation of Alpha's or ARM's weaker models. We can certainly implement subsequent models, but that should not be a short term goal.

I know this can be a complicated subject, so please send me questions if you disagree or are confused. I certainly may be overlooking something, and my thoughts keep evolving as I page more of this back into my memory. For instance, I realize that my previous mail was incorrect because I confused the LSQ, which contains pre-retirement memory instructions, with the store buffer, which contains post-retirement store instruction values. If a probe hits in the store buffer, the CPU doesn't (it can't) reissue the store instruction. The store buffer shields the CPU from that probe. As long as the cache has write permission when the store reaches the head of the store buffer, stores have a global order and TSO is maintained. Of course, loads in the LSQ also need to be probed, and several other features are needed to support locks, fences, etc.
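
The store-buffer argument above can be sketched the same way. This is again a hypothetical toy model in Python, not gem5 code: ToyStoreBuffer and its methods are invented names, and the only assumption is that a store becomes globally visible when it leaves the head of a FIFO buffer while the cache holds write permission for its block.

```python
from collections import deque


class ToyStoreBuffer:
    """Hypothetical toy model of a post-retirement FIFO store buffer."""

    def __init__(self):
        self.pending = deque()   # retired stores, oldest at the left
        self.writable = set()    # blocks the L1 holds write permission for
        self.global_order = []   # stores in the order they became visible

    def retire_store(self, block, value):
        # The store instruction has retired; only its value is buffered.
        self.pending.append((block, value))

    def grant_write(self, block):
        self.writable.add(block)

    def probe(self, block):
        # A probe takes write permission away from the cache, but the
        # buffered stores are untouched: nothing is reissued by the CPU.
        self.writable.discard(block)

    def drain(self):
        # Only the head may leave, and only with write permission, so
        # stores become visible strictly in program order.
        drained = 0
        while self.pending and self.pending[0][0] in self.writable:
            self.global_order.append(self.pending.popleft())
            drained += 1
        return drained
```

Note that a younger store whose block is already writable still waits behind the head in this sketch; that FIFO constraint is what preserves store-to-store order.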

If you do have further questions, please be as specific as possible. It is hard to talk about this subject in generalities.

Brad


_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
