[Simflex] A private L2 cache

lu peng Thu Apr 27 23:02:17 2006

An HTML attachment was scrubbed...
URL: 
http://sos.ece.cmu.edu/pipermail/simflex/attachments/20060428/c47d160c/attachment.html
From mrinal at ece.umn.edu  Fri Apr 28 16:55:52 2006
From: mrinal at ece.umn.edu (Mrinal Nath)
List-Post: [email protected]
Date: Fri Apr 28 16:56:10 2006
Subject: [Simflex] Questions about Simflex
Message-ID: <[email protected]>

Hi,
I had a few questions:

1. In ArchitecturalInstruction, the data gathered from Simics is kept in 
a "DoubleWord", which has a mask indicating the valid bytes in the 
double word, and the double word itself is unsigned. In MemoryMessage, 
however, the field which could store data is DataWord, which is 
supposed to be a 64-bit (v9) signed number.
    Why is this difference there? Why was the data in MemoryMessage not 
kept the same way as it was kept in ArchitecturalInstruction?

2. Why is the namespace "APIFwd" required? It looks like all it does is 
wrap the functions of namespace "API".

3. I have read somewhere that the L1 cache is exclusive, in that lines 
in L1 are not kept in L2. Could someone give some idea of which lines of 
code implement this functionality? Also, if I want to change the L2 
cache to inclusive, so that lines in L1 are also present in L2, how 
can I go about doing that?

Thanks a lot
- Mrinal
From twenisch at ece.cmu.edu  Fri Apr 28 18:42:03 2006
From: twenisch at ece.cmu.edu (Thomas Wenisch)
List-Post: [email protected]
Date: Fri Apr 28 18:42:15 2006
Subject: [Simflex] Re: A private L2 cache
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Hi Lu,

On Fri, 28 Apr 2006, lu peng wrote:

> Hi Tom,
> 
> How have you been recently?

Good, thanks.

> I am thinking of implementing a private L2 cache. According to an old 
> post, I should modify the wiring.cpp. Is there anything else I should do? 
> Will this violate the coherence protocol? In addition, is it reasonable 
> to have a CMP with exclusive L1s and private L2s?

Are you trying to implement a CMP with 3 levels of cache (Private L1, 
Private L2, Shared L3) or just two levels of cache (Private L1, Private 
L2, shared off-chip interface but no shared cache)?

The 3-level cache breaks assumptions of the CMP coherence protocol, but 
there is a hackish way to make it work with relatively few code 
changes.  The 2-level private hierarchy is straightforward, but requires 
writing a new (very simple) component to perform snooping coherence 
between the private L2s and multiplex the off-chip interface.

Implementing exclusive caches will require more significant code 
changes--exclusivity requires changing the meaning of a variety of 
messages (i.e., adding request messages that don't allocate, and instead 
allocating on evictions).
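
To make "allocating on evictions" concrete, here is a minimal sketch of 
an exclusive two-level pair. ExclusivePair, request, and evict_from_l1 
are hypothetical names for illustration only; nothing here is Flexus 
code, and no messages, timing, or capacity limits are modeled.

```cpp
#include <cstdint>
#include <unordered_set>

using Addr = std::uint64_t;

// Hypothetical toy model of an exclusive L1/L2 pair (not Flexus code).
struct ExclusivePair {
    std::unordered_set<Addr> l1;
    std::unordered_set<Addr> l2;

    // A request fills L1 only. If the block was in L2 it moves up to L1;
    // on an L2 miss the block bypasses L2 entirely (no allocation there).
    void request(Addr a) {
        if (l1.count(a) != 0) return;  // L1 hit, nothing to do
        l2.erase(a);                   // no-op when L2 does not hold the block
        l1.insert(a);
    }

    // L2 allocates only when L1 evicts the block.
    void evict_from_l1(Addr a) {
        if (l1.erase(a) != 0) l2.insert(a);
    }
};
```

With this behavior a block is never held at both levels at once, which 
is the property the changed message semantics would have to preserve.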

What design are you trying to implement?

Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University
From twenisch at ece.cmu.edu  Fri Apr 28 19:08:47 2006
From: twenisch at ece.cmu.edu (Thomas Wenisch)
List-Post: [email protected]
Date: Fri Apr 28 19:08:55 2006
Subject: [Simflex] Questions about Simflex
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Hi Mrinal,

On Fri, 28 Apr 2006, Mrinal Nath wrote:

> Hi,
> I had a few questions:
>
> 1. In ArchitecturalInstruction, the data gathered from Simics is kept in a 
> "DoubleWord", which has a mask indicating the valid bytes in the double word, 
> and the double word itself is unsigned. In MemoryMessage, however, the field 
> which could store data is DataWord, which is supposed to be a 64-bit (v9) 
> signed number.
>   Why is this difference there? Why was the data in MemoryMessage not kept 
> the same way as it was kept in ArchitecturalInstruction?

The DoubleWord class contains functionality for combining, extracting, and 
comparing non-64-bit-aligned values.  This functionality is used in the 
Execute component when running with the TSO memory model.  The Execute 
component must be able to check if a load operation can be satisfied by a 
store (or several coalesced stores) in the store buffer despite 
differences in their alignment (e.g., a byte load which forwards from a 
word store).
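
The forwarding check described above can be sketched roughly as follows. 
This is not the actual DoubleWord class: MaskedDoubleWord, byte_mask, 
coalesce_store, and try_forward are hypothetical names, and the sketch 
assumes byte 0 is the least significant byte purely to keep the shifts 
simple.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a masked doubleword (not the Flexus class).
struct MaskedDoubleWord {
    std::uint64_t value = 0;       // 8 data bytes; byte 0 = least significant (assumption)
    std::uint8_t  valid_mask = 0;  // bit i set => byte i of value is valid
};

// Mask covering `size` bytes starting at byte `offset` of an aligned doubleword.
inline std::uint8_t byte_mask(unsigned offset, unsigned size) {
    assert(size >= 1 && offset + size <= 8);
    return static_cast<std::uint8_t>(((1u << size) - 1u) << offset);
}

// Merge a store of `size` bytes at `offset` into a store-buffer entry,
// overwriting those bytes and marking them valid.
inline void coalesce_store(MaskedDoubleWord& entry, unsigned offset,
                           unsigned size, std::uint64_t data) {
    const std::uint8_t mask = byte_mask(offset, size);
    for (unsigned i = 0; i < size; ++i) {
        const std::uint64_t byte = (data >> (8 * i)) & 0xFFu;
        const unsigned shift = 8 * (offset + i);
        entry.value = (entry.value & ~(std::uint64_t{0xFF} << shift)) | (byte << shift);
    }
    entry.valid_mask |= mask;
}

// Return the loaded bytes (LSB-justified) if every requested byte is valid
// in the entry; otherwise the load cannot be satisfied from this entry.
inline std::optional<std::uint64_t> try_forward(const MaskedDoubleWord& entry,
                                                unsigned offset, unsigned size) {
    const std::uint8_t need = byte_mask(offset, size);
    if ((entry.valid_mask & need) != need) return std::nullopt;
    const std::uint64_t shifted = entry.value >> (8 * offset);
    if (size == 8) return shifted;
    return shifted & ((std::uint64_t{1} << (8 * size)) - 1);
}
```

For example, after a 4-byte store at offset 0, a 1-byte load at offset 1 
can forward, while a 1-byte load at offset 4 cannot until another store 
fills that byte.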

In the memory system, we figured that comparing differently-aligned values 
wouldn't be necessary, so we would just LSB-justify anything in the data 
field, and include a size field instead of a valid mask.  The 
DoubleWord class adds a lot of overhead.  However, I don't think it would 
cause any problems to use DoubleWord in MemoryMessage (note that most 
components do not maintain these data fields.)
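
For comparison, the LSB-justified value plus size field could look 
something like the sketch below. SizedData, make_sized, and same_payload 
are hypothetical names; the real MemoryMessage fields may be laid out 
differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical payload layout (an assumption, not the MemoryMessage definition).
struct SizedData {
    std::uint64_t data;  // value held in the low-order bytes (LSB-justified)
    std::uint8_t  size;  // number of meaningful bytes, 1..8
};

// Keep only the low `size` bytes of `raw` and record the size alongside them.
inline SizedData make_sized(std::uint64_t raw, unsigned size) {
    assert(size >= 1 && size <= 8);
    const std::uint64_t keep =
        size == 8 ? ~std::uint64_t{0} : (std::uint64_t{1} << (8 * size)) - 1;
    return SizedData{raw & keep, static_cast<std::uint8_t>(size)};
}

// With no alignment to reconcile, comparing two payloads is a plain field compare.
inline bool same_payload(const SizedData& a, const SizedData& b) {
    return a.size == b.size && a.data == b.data;
}
```

The trade-off is that the byte offset within the doubleword is lost, 
which is fine as long as differently-aligned values never need to be 
compared.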

>
> 2. Why is the namespace "APIFwd" required? It looks like all it does is wrap 
> the functions of namespace "API".

This indirection was added so that we could build simulation targets using 
Flexus that run stand-alone (without Simics).  For functionality we expect 
Simics to supply, alternate implementations can be provided.  The 
namespace API always refers to Simics, while APIFwd may either contain 
forwarding functions or proxy implementations.
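
As a rough illustration of the indirection, consider the sketch below. 
All names in it are hypothetical: STANDALONE_BUILD, get_cycle_count, and 
proxy_cycles are made up for the example and are not the real Flexus or 
Simics interfaces.

```cpp
#include <cstdint>

// Hypothetical build switch; this sketch defaults to a stand-alone build.
#ifndef STANDALONE_BUILD
#define STANDALONE_BUILD 1
#endif

#if !STANDALONE_BUILD
namespace API {
    // In a simulator-hosted build this would be supplied by the simulator.
    std::uint64_t get_cycle_count();
}
#endif

namespace APIFwd {
#if STANDALONE_BUILD
    // Proxy implementation: a local counter stands in for the simulator.
    inline std::uint64_t& proxy_cycles() {
        static std::uint64_t cycles = 0;
        return cycles;
    }
    inline std::uint64_t get_cycle_count() { return proxy_cycles()++; }
#else
    // Forwarding implementation: hand the call straight to the simulator.
    inline std::uint64_t get_cycle_count() { return API::get_cycle_count(); }
#endif
}

// Component code calls only APIFwd, so it builds unchanged in either mode.
inline std::uint64_t cycles_since(std::uint64_t start) {
    return APIFwd::get_cycle_count() - start;
}
```

The point of the pattern is that only the APIFwd layer knows which 
backend is present.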

>
> 3. I have read somewhere that the L1 cache is exclusive, in that lines in L1 
> are not kept in L2. Could someone give some idea of which lines of code 
> implement this functionality? Also, if I want to change the L2 cache to 
> inclusive, so that lines in L1 are also present in L2, how can I go 
> about doing that?
>

The Flexus cache hierarchy is not exclusive, but it is non-inclusive. 
'Exclusive' implies that the L1 and L2 caches never contain the same blocks 
(as in some AMD processors).  'Inclusive' implies that any block in L1 is 
also in L2.  By 'non-inclusive', I mean that the L1 and L2 perform their 
replacements independently, so blocks may or may not be present at both 
hierarchy levels.
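
Stated as checks over the sets of blocks held at each level, the three 
terms look like this. The function names are hypothetical and the sets 
simply stand in for cache contents at one instant.

```cpp
#include <algorithm>
#include <cstdint>
#include <set>

using BlockSet = std::set<std::uint64_t>;

// Inclusive: every block held in L1 is also held in L2.
inline bool satisfies_inclusion(const BlockSet& l1, const BlockSet& l2) {
    // std::includes needs sorted ranges, which std::set guarantees.
    return std::includes(l2.begin(), l2.end(), l1.begin(), l1.end());
}

// Exclusive: no block is held in both L1 and L2.
inline bool satisfies_exclusion(const BlockSet& l1, const BlockSet& l2) {
    return std::none_of(l1.begin(), l1.end(),
                        [&l2](std::uint64_t block) { return l2.count(block) != 0; });
}
```

A non-inclusive hierarchy enforces neither property, so at any given 
moment either check, both, or neither may happen to hold.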

To maintain inclusion, when L2 chooses a victim for replacement, it must 
force this block to be evicted from L1 as well (if present).  You can 
implement this by adding new messages that work like an Invalidation 
message, and having L2 send messages to L1 on each replacement (there are 
probably some messy race conditions here to work out--you will have to 
study the implementation of Invalidations).  If you want to avoid the 
overhead of sending messages to L1 on *every* L2 replacement, even when 
the block is not present in L1, you could add a state bit to L2 to track 
if the block is in L1. Then, L1 must notify L2 upon L1 evictions (via 
CleanEvict messages).
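
A toy sketch of the scheme just described, with the state bit used to 
skip unnecessary messages, is shown below. ToyL1, ToyL2, and their 
members are hypothetical; the function calls stand in for messages, and 
none of the race conditions mentioned above are modeled.

```cpp
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

using Addr = std::uint64_t;

// Hypothetical toy L1: just the set of blocks it currently holds.
struct ToyL1 {
    std::unordered_set<Addr> blocks;
    void back_invalidate(Addr a) { blocks.erase(a); }  // invalidation sent by L2
};

// Hypothetical toy L2 that keeps an "in L1" state bit per block.
struct ToyL2 {
    std::unordered_map<Addr, bool> blocks;  // block -> in-L1 bit
    unsigned back_invalidations_sent = 0;

    // Fill the block into both levels and set the in-L1 bit.
    void fill_to_l1(Addr a, ToyL1& l1) {
        blocks[a] = true;
        l1.blocks.insert(a);
    }

    // L1 tells L2 about an eviction (the role CleanEvict plays in the text).
    void on_l1_evict(Addr a) {
        auto it = blocks.find(a);
        if (it != blocks.end()) it->second = false;
    }

    // On an L2 replacement, invalidate L1 only if the bit says it may hold the block.
    void replace(Addr victim, ToyL1& l1) {
        auto it = blocks.find(victim);
        if (it == blocks.end()) return;
        if (it->second) {
            l1.back_invalidate(victim);
            ++back_invalidations_sent;
        }
        blocks.erase(it);
    }
};

// L1 eviction path: drop the block locally, then notify L2.
inline void l1_evict(ToyL1& l1, ToyL2& l2, Addr a) {
    if (l1.blocks.erase(a) != 0) l2.on_l1_evict(a);
}
```

Without the state bit, replace would have to send an invalidation for 
every victim, whether or not L1 holds it.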

Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University
