On Thu, 10 Nov 2011, Steve Reinhardt wrote:
It sounds like you guys are doing a good job of working this out, but I
have a few comments. Sorry not to jump in sooner.
- It's absolutely true that O3 was written for Alpha and as such did not
have to worry about the CPU model enforcing any memory orderings beyond the
explicit barrier/fence ops. It's no surprise that O3 needs to be modified
to support stronger consistency models.
- While TSO is a valid implementation of the Alpha memory model, we should
not unnecessarily restrict performance by constraining memory order. Note
that even though Alpha is not used much, we have other ISAs (most notably
ARM but also Power) that have weak memory models. In the near term it's
fine to just work on implementing TSO without considering Alpha, but for
the final commit it would be good to find a minimal set of changes that
enforce TSO and condition them on the ISA being x86 (or if we want to get
fancy, we could introduce a "memory consistency model" flag and set it to
TSO when the ISA is x86).
I think the minimal change is to allow only one store to be in flight at a
time. That is, once a store is issued to the memory system, the load-store
queue waits until that store completes before issuing another store. The
load-store queue already forwards values from prior stores to later loads.
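As a rough sketch of the idea, and not actual O3 LSQ code: every name below
(ToyStoreQueue, ConsistencyModel, and so on) is made up for illustration, and
the model is much simpler than the real load-store queue. It assumes stores
are acknowledged in issue order and gates the one-store-in-flight rule on a
consistency-model setting, along the lines of the flag suggested above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical names throughout; none of these are gem5 classes or APIs.
enum class ConsistencyModel { Relaxed, TSO };

struct StoreEntry {
    uint64_t addr;
    uint64_t data;
};

class ToyStoreQueue {
  public:
    explicit ToyStoreQueue(ConsistencyModel m) : model(m) {}

    // A committed store enters the queue in program order.
    void commitStore(uint64_t addr, uint64_t data) {
        pending.push_back({addr, data});
    }

    // Try to send the oldest pending store to the memory system.
    // Under TSO, refuse while an earlier store is still outstanding,
    // so stores become visible in program order.
    bool tryIssueStore() {
        if (pending.empty())
            return false;
        if (model == ConsistencyModel::TSO && !inFlight.empty())
            return false;
        inFlight.push_back(pending.front());
        pending.pop_front();
        return true;
    }

    // The memory system acknowledges the oldest in-flight store.
    // (Simplification: acknowledgements arrive in issue order.)
    void completeStore() {
        assert(!inFlight.empty());
        inFlight.pop_front();
    }

    // Store-to-load forwarding: return the value of the youngest store
    // to this address that has not completed yet, if there is one.
    bool forward(uint64_t addr, uint64_t &data) const {
        // Pending stores are younger than in-flight ones; search them first.
        for (auto it = pending.rbegin(); it != pending.rend(); ++it) {
            if (it->addr == addr) { data = it->data; return true; }
        }
        for (auto it = inFlight.rbegin(); it != inFlight.rend(); ++it) {
            if (it->addr == addr) { data = it->data; return true; }
        }
        return false;
    }

    std::size_t numInFlight() const { return inFlight.size(); }

  private:
    ConsistencyModel model;
    std::deque<StoreEntry> pending;   // committed, not yet issued
    std::deque<StoreEntry> inFlight;  // issued, not yet acknowledged
};
```

With ConsistencyModel::TSO, a second tryIssueStore() call returns false until
completeStore() has been called for the first store; with
ConsistencyModel::Relaxed, both stores can be outstanding at once. Loads can
still pick up values through forward() in either case.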
- Since we need to implement some consistency mechanism in O3 more or less
from scratch, I suggest we do a reasonably aggressive mechanism that
corresponds most closely with what modern processors do, without being
overly complicated. O3 is not intended to be an extremely accurate model
of any particular modern CPU, but we don't want to create unnecessary
differences between its behavior and that of a typical modern CPU either.
We still need to decide what this aggressive mechanism should be. Brad
outlined several options in one of his previous emails. The approach
described above is essentially the first of the proposals Brad suggested.
- If we need to make some changes in the Port interface to make this work
well, that's OK. Someday I would still like to see Port and RubyPort
integrated so we don't have to do a translation between the two structs on
every memory access. That probably doesn't affect this directly, but it's
good to keep in mind as we evolve the code.
--
Nilay