In sum, I think we all agree that Ruby is going to handle *only non-speculative stores*. The M5 CPU model(s) handle all speculative and non-speculative stores that are *yet to be revealed to the memory sub-system*.

To make it clearer, as I understand it, we now have the following:

1. All store buffering (speculative and non-speculative) is handled by the CPU model in M5.
2. Ruby needs to forward interventions/invalidations received at the L1 cache controller to the CPU model so that it can take appropriate action to provide the required memory consistency guarantees (e.g., it may need to flush the pipeline).
                        OR
   CPU models need to check the coherence permission at the L1 cache at commit time to know whether an intervening write has happened (this might be required to implement a stricter model like SC).

I think we need to provide one of these two functionalities from the Ruby side to satisfy the second condition above. Which one to provide depends on what the M5 CPU models want to do to guarantee consistency.
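
To make the two alternatives concrete, here is a rough sketch of how I picture them. Everything in it is hypothetical (ToyL1Controller, ToyLSQ, the Perm enum and all method names are invented for illustration); it is not the RubyPort or O3 LSQ interface, just a toy model of "forward invalidations to the CPU" versus "check permission at commit".

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// All names below are hypothetical; none of this is gem5 or Ruby API.
using Addr = std::uint64_t;

enum class Perm { Invalid, ReadOnly, ReadWrite };

// Toy stand-in for an L1 cache controller on the Ruby side.
class ToyL1Controller {
  public:
    using Listener = std::function<void(Addr)>;

    // Option A: the CPU registers a callback that fires on invalidation.
    void setInvalidationListener(Listener l) { listener_ = std::move(l); }

    // Grant a permission to a cache line (e.g., after a fill).
    void fill(Addr line, Perm p) { perms_[line] = p; }

    // A remote invalidation/intervention arrives at the L1.
    void receiveInvalidation(Addr line) {
        perms_[line] = Perm::Invalid;
        if (listener_) listener_(line);  // forward to the CPU model
    }

    // Option B: the CPU polls the current permission at commit time.
    Perm permission(Addr line) const {
        auto it = perms_.find(line);
        return it == perms_.end() ? Perm::Invalid : it->second;
    }

  private:
    std::unordered_map<Addr, Perm> perms_;
    Listener listener_;
};

// Toy stand-in for a CPU-side load/store queue.
class ToyLSQ {
  public:
    void recordSpeculativeLoad(Addr line) { specLoads_.insert(line); }

    // Option A handler: squash if a speculatively loaded line was invalidated.
    void onInvalidation(Addr line) {
        if (specLoads_.count(line) != 0) squashed_ = true;
    }

    // Option B check: a load may commit only if the line is still readable.
    bool canCommit(const ToyL1Controller &l1, Addr line) const {
        return l1.permission(line) != Perm::Invalid;
    }

    bool squashed() const { return squashed_; }

  private:
    std::unordered_set<Addr> specLoads_;
    bool squashed_ = false;
};

// Exercises both options on one line; returns true if both detect the write.
bool demoBothOptions() {
    ToyL1Controller l1;
    ToyLSQ lsq;
    const Addr line = 0x1000;

    l1.setInvalidationListener([&lsq](Addr a) { lsq.onInvalidation(a); });
    l1.fill(line, Perm::ReadOnly);
    lsq.recordSpeculativeLoad(line);
    assert(lsq.canCommit(l1, line));   // nothing has intervened yet

    l1.receiveInvalidation(line);      // a remote writer takes the line

    return lsq.squashed() && !lsq.canCommit(l1, line);
}
```

One difference the sketch hints at: the callback in option A sees every invalidation as it happens, while the commit-time check in option B only sees the permission at that instant, so a line that was invalidated and then refetched before commit would pass the check unless something else records the event.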

Please let me know if you disagree or if I am missing something.

Thanks
Arka




On 02/24/2011 05:22 PM, Beckmann, Brad wrote:
So I think Steve and I are in agreement here.  We both agree that both 
speculative and non-speculative store buffers should be on the CPU side of the 
RubyPort interface.  I believe that was the same line that existed when Ruby 
was tied to Opal in GEMS.  I believe the non-speculative store buffer was only a 
feature used when Opal was not attached, and it was just the simple 
SimicsProcessor driving Ruby.

The sequencer is a separate issue.  Certain functionality of the sequencer can 
probably be eliminated in gem5, but I think other functionality needs to remain 
or at least be moved to some other part of Ruby.  The sequencer performs a lot 
of protocol independent functionality including: updating the actual data 
block, performing synchronization with respect to the cache memory, translating 
m5 packets to ruby requests, checking for per-cacheblock deadlock, and 
coalescing requests to the same cache block.  The coalescing functionality can 
probably be eliminated, but I think the other functionality needs to remain.

Brad


From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
Steve Reinhardt
Sent: Thursday, February 24, 2011 1:52 PM
To: M5 Developer List
Subject: Re: [m5-dev] Store Buffer


On Thu, Feb 24, 2011 at 1:32 PM, Nilay Vaish <ni...@cs.wisc.edu> wrote:
On Thu, 24 Feb 2011, Beckmann, Brad wrote:
Steve, I think we are in agreement here and we may just be disagreeing on the 
definition of speculative.  From the Ruby perspective, I don't think it really 
matters...I don't think there is a difference between a speculative store address 
request and a prefetch-with-write-intent. Also, we agree that probes will need 
to be sent to the O3 LSQ to support the consistency model.
My point is that if we believe this functionality is required, what is the 
extra overhead of adding a non-speculative store buffer to the O3 model as 
well?  I think that will be easier than trying to incorporate the current Ruby 
non-speculative store buffer into each protocol.

I don't know the O3 LSQ model very well, but I assume it buffers both 
speculative and non-speculative stores.  Are there two different structures in 
Ruby for that?

I think the general issue here is that the dividing line between "processor" and "memory 
system" is different in M5 than it was with GEMS, with M5 assuming that write buffers, redundant request 
filtering, etc. all happen in the "processor".  For example, I know I've had you explain this to 
me multiple times already, but I still don't understand why we still need Ruby sequencers either :-).

Brad, I raise the same point that Arka raised earlier. Other processor models 
can also make use of a store buffer. So why should only O3 have a store buffer?

Nilay, I think that's a different issue... we're not saying that other CPU 
models can't have store buffers, but in practice, the simple CPU models block 
on memory accesses so they don't need one.  If the inorder model wants to add a 
store buffer (if it doesn't already have one), it would be an internal decision 
for them whether they want to write one from scratch or try to reuse the O3 
code.  There are already some shared structures in src/cpu like branch 
predictors that can be reused across CPU models.

So in other words we need to decide first where the store buffer should live 
(CPU or memory system) and then we can worry about how to reuse that code if 
that's useful.
Steve



_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev