Yeah, the protocol does assume that a full cache-block request is from another coherent entity. Has O3 always fetched a full cache block at a time? If so, then I'm also surprised we haven't seen it (or maybe we have seen it, but not recognized it).

Getting rid of this code would solve the problem but, as you say, would degrade performance in the case where an L2 could hand ownership to an L1 dcache and avoid a later upgrade transaction. That matters particularly because (IIRC) an L1 dcache miss that also misses in the L2 is handled as a pair of mostly independent misses, so getting rid of this optimization would mean that a cold read miss in a multilevel cache could never take full advantage of the E state. (Which is why I added it to begin with.)

Another question is whether it ever makes sense for an icache to be the owner of a dirty block... I'd think not. So a third possible solution would be to add a flag (or a distinct request type) to distinguish a read that is OK with getting an exclusive/owned copy from one that isn't, and factor that into the condition on this code. Then you could flag icaches to issue only the latter type of read, and the icache would never get a dirty copy.

Of course, you'd want to have the O3 fetch stage use this type of request too (for completeness), and you'd still need a parameter indicating that this is an icache and should use this different request type. So it's just as complicated as your option 1 (maybe a little more so) and may not have a really significant impact, but it would be more realistic in that the same L2 could perform this optimization for L1 dcaches but not L1 icaches, and the icache would never have a dirty block.

Steve

On Sun, Feb 20, 2011 at 9:05 AM, Ali Saidi <[email protected]> wrote:
> If you look at the attached annotated trace you can see that a cache block
> was written to, was dirty and then the dirty flag goes away at some point. I
> traced it down to this code in the cache:
>
>     // special considerations if we're owner:
>     if (!deferred_response) {
>         // if we are responding immediately and can
>         // signal that we're transferring ownership
>         // along with exclusivity, do so
>         pkt->assertMemInhibit();
>         blk->status &= ~BlkDirty;
>
> What seems to be happening is that the o3 cpu's fetch stage is grabbing an
> entire block at a time, which makes the L1I cache believe it can provide
> ownership to the cache above it (there isn't one) during the fetch. The
> fetch stage doesn't do anything special when it's provided ownership of the
> block, nor should it ever be provided ownership, so the information gets
> lost. I'm actually very surprised we haven't hit this before. Anyway, the
> question is how to fix it. I can think of two solutions and I'm going to
> implement (1) if no one has a better suggestion.
>
> 1) Add a parameter indicating that the cache is at the top level and
>    disable this stuff when the parameter is set
> 2) Not do anything special (remove the optimization above, which makes the
>    protocol less performant)
> 3) ????
>
> Ali
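
To make the flag idea (the third option) a bit more concrete, here is a rough sketch of how the ownership handoff might be gated on a per-request flag. This is a simplified standalone model and not m5 code: `Block`, `ReadReq`, `acceptsOwnership`, and `respondToRead` are all hypothetical names, and `ownershipPassed` only stands in for what `pkt->assertMemInhibit()` signals in the quoted snippet.

```cpp
#include <cassert>

// Hypothetical status bits, loosely modeled on the BlkDirty flag in the
// quoted snippet. The values are illustrative, not taken from m5.
constexpr unsigned BlkValid = 0x1;
constexpr unsigned BlkDirty = 0x4;

// Hypothetical, stripped-down cache block: just a status word.
struct Block {
    unsigned status;
};

// Hypothetical read request. 'acceptsOwnership' is the proposed flag:
// a dcache-side read would set it, an icache or fetch-stage read would not.
struct ReadReq {
    bool fullBlock;         // request covers a whole cache block
    bool acceptsOwnership;  // requester can take an exclusive/owned copy
    bool ownershipPassed;   // set by the responder (stand-in for assertMemInhibit)
};

// Respond to a read that hit in this cache. Ownership of a dirty block is
// handed over only when the response is immediate, the read is a full-block
// read, and the requester said it can accept ownership.
// Returns true if ownership was transferred.
bool respondToRead(Block &blk, ReadReq &req, bool deferredResponse)
{
    const bool weAreOwner = (blk.status & BlkDirty) != 0;
    if (weAreOwner && !deferredResponse &&
        req.fullBlock && req.acceptsOwnership) {
        req.ownershipPassed = true;  // tell the requester it now owns the block
        blk.status &= ~BlkDirty;     // we are no longer responsible for writeback
        return true;
    }
    return false;                    // keep the dirty bit; requester gets a shared copy
}
```

With something like this, a full-block read that leaves `acceptsOwnership` clear (the icache/fetch case) would leave the dirty bit where it is, while a dcache-side read that sets it would still get the handoff and skip the later upgrade.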

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
