Let me start off by saying that the cache code works fine in mainline
MARSSx86. However, I've come across an interesting corner case that can
arise when doing something like what I'm trying to do: writing a ridiculous
bus model that pushes as many requests to the memory system as fast as
possible. It might be something to think about, since this paradigm is used
throughout the ptlsim code.

I have an 8-core setup with a shared L2. All of the L1I/D caches and the L2
cache are on the mesiBus. When a request arrives at the L2 cache, the code
checks for dependencies. It finds the last dependence in the chain, stores
the idx value in CacheQueueEntry->depends, and stashes the request. Later,
when a request goes through clear_entry_cb, it checks whether anything was
dependent on it and, if so, schedules that dependent entry for cache access.
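
Roughly, the mechanism looks like this. This is just a sketch -- the pool
layout and the queue_dependent/wake_dependent helpers are simplified
stand-ins for illustration, not the actual MARSS code:

#include <cstddef>
#include <cstdint>
#include <vector>

struct CacheQueueEntry {
    uint64_t physAddr = 0;
    int      depends  = -1;   // idx of the entry chained behind us, -1 if none
    bool     free     = true;
};

struct PendingQueue {
    std::vector<CacheQueueEntry> pendingRequests_;

    explicit PendingQueue(std::size_t size) : pendingRequests_(size) {}

    // A new request finds an older in-flight request to the same line: hook
    // it onto the end of the chain by stashing its idx in the tail's depends.
    void queue_dependent(int tail_idx, int new_idx) {
        pendingRequests_[tail_idx].depends = new_idx;
    }

    // Called when entry 'idx' is cleared (the clear_entry_cb path): if
    // something was chained behind it, schedule that entry for cache access.
    void wake_dependent(int idx) {
        int dep = pendingRequests_[idx].depends;
        if (dep >= 0) {
            // schedule pendingRequests_[dep] for cache access here
        }
    }
};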

So far so good.

In a pathological case like the one I've built, requests pile up in the L2
so fast that the following happens (there's a boiled-down sketch of the same
sequence right after the list):

- Request B comes into the L2, finds a dependence on request A
- RequestA->depends is set to the idx of B (since B now depends on A, A will
wake B up when it finishes)
- Let's say request A is waiting on a memory request, so it takes at least
100 cycles
- RequestB is annulled because an L1 cache fulfilled the request (RequestB's
pending entry is freed)
- 100 more requests come in while A is waiting on memory
- Since B got annulled and 100 more requests came in, the slot that used to
be occupied by B (now annulled and dead) is occupied by some new request X
- A completes the memory access, executes, and wakes up the dependent entry
pendingRequests_[depends] -- which is the idx where B used to be but which
now holds X (and since X is valid, A can't tell the difference, because all
it has is the idx)
- X is incorrectly woken up and sent for cache access
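
Boiled down, the aliasing looks like this -- a toy, self-contained version
with made-up names, but the same sequence as the bullets above:

#include <cstdint>
#include <cassert>

struct Entry { uint64_t addr; bool free; };

static Entry pool[2] = {{0, true}, {0, true}};

static int alloc(uint64_t addr) {
    for (int i = 0; i < 2; ++i)
        if (pool[i].free) { pool[i] = {addr, false}; return i; }
    return -1;
}

int main() {
    int a = alloc(0x1000);   // request A heads off to memory
    int b = alloc(0x1000);   // request B arrives, chained behind A (A.depends = b)
    pool[b].free = true;     // B is annulled by an L1 fill; its slot is freed

    int x = alloc(0x9000);   // an unrelated request X grabs the recycled slot
    assert(x == b);          // the raw idx A recorded now names X, not B

    // When A finally completes, waking pool[A.depends] schedules X by mistake.
    (void)a; (void)x;
    return 0;
}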

It took me about 2 weeks to figure out what was happening. A normal mesiBus
won't encounter this corner case because it's way too slow: the request rate
will never be high enough for pendingRequests_ to cycle back around and
refill that slot with something else. Chances are the entry will just sit in
the queue marked free (invalid).

However, it's something to consider since this type of memory pooling is
very prevalent throughout the code.

The way I'm currently mitigating this is by storing not only the idx of the
dependent entry but also its address. That way, if I look up the idx and the
address doesn't match what I saved when the entry was queued, I can safely
assume the original entry got annulled and something else took its place, so
I just ignore the dependent entry.
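
In sketch form (again with simplified structures and a made-up dependsAddr
field rather than the real identifiers), the check looks something like this:

#include <cstdint>
#include <vector>

struct CacheQueueEntry {
    uint64_t physAddr    = 0;
    int      depends     = -1;   // idx of the dependent entry, -1 if none
    uint64_t dependsAddr = 0;    // address the dependent had when it queued up
    bool     free        = true;
};

struct PendingQueue {
    std::vector<CacheQueueEntry> pendingRequests_;

    // Record both the idx and the address of the dependent entry.
    void link_dependent(int tail, int dep) {
        pendingRequests_[tail].depends     = dep;
        pendingRequests_[tail].dependsAddr = pendingRequests_[dep].physAddr;
    }

    // On completion, only wake the dependent if it still looks like the
    // entry we chained: if the slot was recycled, the address won't match
    // what we saved, so the original dependent must have been annulled.
    void wake_dependent(int idx) {
        const CacheQueueEntry& done = pendingRequests_[idx];
        if (done.depends < 0)
            return;
        const CacheQueueEntry& dep = pendingRequests_[done.depends];
        if (dep.free || dep.physAddr != done.dependsAddr)
            return;   // stale idx: ignore it
        // otherwise schedule 'dep' for cache access here
    }
};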