On Thu, 10 Nov 2011, Steve Reinhardt wrote:
On Thu, Nov 10, 2011 at 8:12 PM, Nilay Vaish wrote:
On Thu, 10 Nov 2011, Steve Reinhardt wrote:
- While TSO is a valid implementation of the Alpha memory model, we should
not unnecessarily restrict performance by constraining memory order. Note
that even though Alpha is not used much, we have other ISAs (most notably
ARM but also Power) that have weak memory models. In the near term it's
fine to just work on implementing TSO without considering Alpha, but for
the final commit it would be good to find a minimal set of changes that
enforce TSO and condition them on the ISA being x86 (or if we want to get
fancy, we could introduce a "memory consistency model" flag and set it to
TSO when the ISA is x86).
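
To make the "memory consistency model" flag idea concrete, here is a rough sketch in Python. The names MemoryModel, DEFAULT_MODEL_BY_ISA and default_memory_model are hypothetical and are not existing gem5 parameters; the mapping only covers the ISAs mentioned in this thread.

```python
from enum import Enum


class MemoryModel(Enum):
    """Hypothetical consistency-model flag; not an existing gem5 option."""
    TSO = "tso"
    RELAXED = "relaxed"


# Assumed mapping, limited to the ISAs named in this thread.
DEFAULT_MODEL_BY_ISA = {
    "x86": MemoryModel.TSO,
    "alpha": MemoryModel.RELAXED,
    "arm": MemoryModel.RELAXED,
    "power": MemoryModel.RELAXED,
}


def default_memory_model(isa):
    """Return the default consistency model for an ISA name."""
    key = isa.lower()
    if key not in DEFAULT_MODEL_BY_ISA:
        # Refuse to guess for ISAs the thread does not discuss.
        raise ValueError("no default memory model known for ISA: " + isa)
    return DEFAULT_MODEL_BY_ISA[key]
```

With a flag like this, the TSO-specific code paths would test the model rather than the ISA, so a weak-model ISA could still opt in to TSO later if that turned out to be useful.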
I think the minimal change is to allow only one store to be in flight.
This means that once a store is issued to the memory system, the load store
queue waits until that store completes before issuing another store. The
load store queue already forwards values from prior stores to the loads.
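
A toy model of this one-store-in-flight policy, written in Python only for illustration. The class and method names are made up and do not correspond to the O3 LSQ code; forwarding here is a simple youngest-match lookup by address.

```python
from collections import deque


class SingleStoreInFlightQueue:
    """Toy store queue: at most one store outstanding in the memory system."""

    def __init__(self):
        self.pending = deque()   # committed stores not yet issued
        self.in_flight = None    # the single store currently outstanding
        self.issued = []         # order in which stores were sent to memory

    def commit(self, addr, value):
        """A store commits; issue it only if nothing else is in flight."""
        self.pending.append((addr, value))
        self._try_issue()

    def store_done(self):
        """The memory system acknowledged the outstanding store."""
        self.in_flight = None
        self._try_issue()

    def forward(self, addr):
        """Return the value of the youngest buffered store to addr, or None."""
        stores = list(self.pending)
        if self.in_flight is not None:
            stores.insert(0, self.in_flight)
        for store_addr, value in reversed(stores):
            if store_addr == addr:
                return value
        return None

    def _try_issue(self):
        if self.in_flight is None and self.pending:
            self.in_flight = self.pending.popleft()
            self.issued.append(self.in_flight)
```

Because the next store is issued only from store_done(), stores reach the memory system strictly in program order, which is the store-ordering part of TSO.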
Sorry, I didn't phrase that quite right... I was not implying that we
should make the minimal set of changes necessary to implement TSO. I meant
that once we make whatever set of changes we want to make for TSO
(preferably a reasonably aggressive one), then we should find a minimal
subset of those changes to condition on the ISA.
- Since we need to implement some consistency mechanism in O3 more or less
from scratch, I suggest we do a reasonably aggressive mechanism that
corresponds most closely with what modern processors do, without being
overly complicated. O3 is not intended to be an extremely accurate model
of any particular modern CPU, but we don't want to create unnecessary
differences between its behavior and that of a typical modern CPU either.
We need to decide on this aggressive mechanism. Brad outlined several in
one of his previous emails. The approach described above is essentially the
first proposal that Brad suggested.
Right, it looks like Brad's proposals were in order of aggressiveness, and
I was thinking that we should probably go for #2 or #3.
I am thinking of implementing a modified version of Brad's second proposal.
Each store accesses the cache at most twice. When the store is committed,
one of the following cases applies:
* The store is at the head of the store buffer: issue a write request to
the memory system.
* The store is not at the head of the store buffer: if the architecture has
TSO as its memory model, issue a read exclusive request to the memory
system. Otherwise, if a relaxed model is in place, send a write request to
the memory system.
Once the store reaches the head of the store buffer, the actual store is
issued in the TSO case (under a relaxed model the write was already sent at
commit). Even if the store buffer receives an invalidation for an address
that has a store in the buffer, the read exclusive request is not issued
again.
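
Here is a rough Python sketch of the scheme above, just to pin down the cases. The StoreBuffer class, its methods, and the request names "write" and "read_ex" are all made up for illustration and are not the O3 or Ruby interfaces. Completion handling is simplified to a lookup by address.

```python
class StoreBuffer:
    """Toy model of the commit-time policy: each store issues at most two requests."""

    def __init__(self, tso):
        self.tso = tso        # True for TSO, False for a relaxed model
        self.entries = []     # oldest store first
        self.requests = []    # log of (kind, addr) sent to the memory system

    def commit(self, addr):
        """Called when a store commits and enters the store buffer."""
        entry = {"addr": addr, "write_issued": False}
        self.entries.append(entry)
        at_head = self.entries[0] is entry
        if at_head or not self.tso:
            # Head of the buffer, or relaxed model: send the write right away.
            self._issue_write(entry)
        else:
            # TSO and not at the head: only fetch ownership for now.
            self.requests.append(("read_ex", addr))

    def write_done(self, addr):
        """The memory system completed the oldest issued write to addr."""
        for i, entry in enumerate(self.entries):
            if entry["addr"] == addr and entry["write_issued"]:
                del self.entries[i]
                break
        # Under TSO the new head may now perform its actual store.
        if self.tso and self.entries and not self.entries[0]["write_issued"]:
            self._issue_write(self.entries[0])

    def invalidate(self, addr):
        """An invalidation arrived; the read exclusive is deliberately not re-issued."""
        pass

    def _issue_write(self, entry):
        entry["write_issued"] = True
        self.requests.append(("write", entry["addr"]))
```

For example, committing stores to A and then B under TSO logs write A, read_ex B, and after A completes, write B. Under the relaxed setting both writes go out at commit.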
--
Nilay