Actually, I would hold off on changing that code. I realized the problem is a bit more subtle than I initially thought: storing the address is a good detection mechanism, but it doesn't fully fix the problem.

Going back to my example, if entry A will wake up B and X takes B's slot, you can end up building an incorrect dependence chain A --> X --> Y (where Y is some valid request that arrives while A is still waiting on memory). Now, if you simply detect that X is the wrong entry and drop it, Y (a legitimate dependent) will never be sent for cache access.

I have a fix that I'm currently testing: essentially, go through the pending list and clear any depends fields that point to freed entries. If it works, I'll let you know.
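To make that concrete, here is a rough sketch of the scrub I have in mind. The struct layout, the pool size, the -1 sentinel, and the scrub_depends name are simplified stand-ins for illustration, not the actual cacheController.cpp definitions:

```cpp
// Sketch only: simplified stand-ins, not the real MARSS structures.
#include <array>
#include <cstdint>

struct CacheQueueEntry {
    bool     free    = true;   // slot is unused / has been freed
    int      depends = -1;     // idx of the entry this one will wake up, -1 = none
    uint64_t address = 0;      // request address held by this entry
};

constexpr int PENDING_SIZE = 128;                  // assumed pool size
std::array<CacheQueueEntry, PENDING_SIZE> pendingRequests_;

// Run this when the entry in slot 'freed_idx' is annulled/freed without
// completing (it could equally be a periodic sweep of the pool). Any other
// pending entry whose 'depends' still points at that slot must be unlinked;
// otherwise, once the slot is reused, the new occupant gets woken up by
// mistake and the rest of the chain is lost.
void scrub_depends(int freed_idx) {
    for (CacheQueueEntry& e : pendingRequests_) {
        if (!e.free && e.depends == freed_idx)
            e.depends = -1;
    }
}
```

The obvious cost is a linear walk over the pending pool on every annul, but the pool is small, so that should be noise compared to the memory latencies that create these chains in the first place.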
On Sun, Apr 3, 2011 at 8:11 PM, avadh patel <[email protected]> wrote:
>
> On Sat, Apr 2, 2011 at 10:27 PM, DRAM Ninjas <[email protected]> wrote:
>
>> So actually I just noticed that annulled entries don't seem to get freed
>> at all ... the only .free() in cacheController.cpp is in clear_entry_cb,
>> which will only free an entry if it's not annulled ...
>>
> The 'annul_request' function calls 'clear_entry_cb' right before it sets the
> 'annuled' flag to 'true', so 'clear_entry_cb' will free this annulled entry.
>
> Regarding your previous email, thanks for finding this bug. Your idea of
> using the 'address' to verify that the dependent entry really needs to be
> woken up is good, and I'll change the code to fix this problem.
>
> Thanks,
> Avadh
>
>> Maybe that contradicts my whole theory ... or perhaps I'm going crazy at
>> this point ...
>>
>> On Sat, Apr 2, 2011 at 7:58 PM, DRAM Ninjas <[email protected]> wrote:
>>
>>> Let me start off by saying that the cache code works fine in mainline
>>> MARSSx86; however, I've come across an interesting corner case that can
>>> arise when doing something like what I'm trying to do: writing a
>>> ridiculous bus model that pushes as many requests into the memory system
>>> as fast as possible. It might be something to think about, since this
>>> pattern is used throughout the ptlsim code.
>>>
>>> I have an 8-core setup with a shared L2; all of the L1 I/D caches and the
>>> L2 are on the mesiBus. When a request arrives at the L2 cache, the code
>>> checks for dependencies: it finds the last dependence in the chain, stores
>>> the idx value in CacheQueueEntry->depends, and stashes the request for
>>> later. Later, when a request goes through clear_entry_cb, it checks
>>> whether something was dependent on it and, if so, schedules that dependent
>>> entry for cache access.
>>>
>>> So far so good.
>>>
>>> What happens in a pathological case like the one I've built is that
>>> requests pile up in the L2 so fast that the following sequence occurs:
>>>
>>> - Request B comes into the L2 and finds a dependence on request A.
>>> - RequestA->depends is set to the idx of B (since B now depends on A,
>>> A will wake up B when A finishes).
>>> - Say request A is waiting on a memory request, so it takes at least
>>> 100 cycles.
>>> - Request B is annulled because an L1 cache fulfilled the request, so
>>> Request B's pending entry is freed.
>>> - 100 more requests come in while A is waiting on memory.
>>> - Since B was annulled and 100 more requests came in, the slot that used
>>> to be occupied by B (now annulled and dead) is occupied by some new
>>> request X.
>>> - A completes the memory access, executes, and wakes up the dependent
>>> entry pendingRequests_[depends] -- which is the idx of where B was, but
>>> is now X (and since X is valid, A cannot tell the difference, because
>>> all it has is the idx).
>>> - X is incorrectly woken up and sent for cache access.
>>>
>>> It took me about 2 weeks to figure out what was happening.
>>> A normal mesiBus won't hit this corner case because it's far too slow:
>>> the request rate will never be high enough for pendingRequests_ to wrap
>>> around and refill that slot with something else. Chances are the entry
>>> will just sit in the queue marked free (invalid).
>>>
>>> However, it's something to consider, since this type of memory pooling
>>> is very prevalent throughout the code.
>>>
>>> The way I'm currently mitigating this error is by storing not only the
>>> idx of the dependent entry but also the address of that entry. That way,
>>> if I find the idx but the address doesn't match what I saved back when it
>>> was queued, I can safely assume that the entry got annulled and someone
>>> else took its place, so I just ignore the dependent entry.
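For reference, the idx+address check described in that quoted message looks roughly like this. Again, the types and names (dependsAddr, wake_dependent, the pool size) are simplified stand-ins for illustration, not the actual cacheController.cpp code:

```cpp
// Sketch only: simplified stand-ins, not the real MARSS structures.
#include <array>
#include <cstdint>

struct CacheQueueEntry {
    bool     free        = true;
    int      depends     = -1;   // idx of the dependent entry, -1 = none
    uint64_t dependsAddr = 0;    // address the dependent held when it was linked
    uint64_t address     = 0;    // address of this entry's own request
};

constexpr int PENDING_SIZE = 128;                  // assumed pool size
std::array<CacheQueueEntry, PENDING_SIZE> pendingRequests_;

// Stand-in for the wake-up that clear_entry_cb performs on the dependent entry.
void schedule_cache_access(CacheQueueEntry& /*entry*/) {
    // ... enqueue the entry for cache access ...
}

// Wake the dependent of a finishing entry, but only if the slot still holds
// the request that was there when the dependence was recorded.
void wake_dependent(const CacheQueueEntry& finished) {
    if (finished.depends < 0)
        return;
    CacheQueueEntry& dep = pendingRequests_[finished.depends];
    // If the slot was freed and reused, its current address won't match the
    // one saved when the link was created, so this is NOT our dependent.
    if (!dep.free && dep.address == finished.dependsAddr)
        schedule_cache_access(dep);
}
```

This catches the stale-slot case, but as noted at the top of this message, silently ignoring the mismatched entry can strand a legitimate dependent further down the chain, which is why I'm now leaning toward scrubbing the depends links when entries are freed.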
_______________________________________________
http://www.marss86.org
Marss86-Devel mailing list
[email protected]
https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
