Hi all,

I am trying to implement a two-level CMP cache design with private L1 and private L2 caches, based on the components provided by Flexus. The existing CMPFlex simulator has a private L1 cache (Cache component) and a shared L2 cache (CmpCache component); both have cache controllers, but with different implementations. In that design, the shared L2 cache is responsible for coherence among the private L1 caches. I have read the source code of these components, and I think I can use the Cache component as my private L2 cache if I only modify the ports to connect to the L1 caches on the front side and to the shared bus on the back side. However, how can I maintain coherence among the private L2 caches? I noticed that TraceFlex has the structure I want, and it uses the FastBus component as the interconnect between the L2 caches to perform coherence. So I intend to focus on FastBus rather than CmpCache.

1. In CMPFlex, each Cache component has three ports (Request, Snoop, Out) on both the front and back sides, but FastBus has only two ports (FromCaches, ToSnoops) on the front side. How can I connect them? Or, what are the main functions of each of these ports?
2. In TraceFlex, FastBus isn't connected to the memory, so the back-side ports (Writes, Reads, Evictions, etc.) are not used, right? Why is that?
3. There are also two ports (DMA, NonAllocateWrite) in FastBus connected to the feeder. What are they used for? Do I need to use them in my implementation?

Most likely, I will implement a new component as an external shared bus connected to the L2 caches, just like what FastBus does in TraceFlex. But I am worried about the correctness of the coherence. Do you have any suggestions to simplify the implementation?

Any help would be appreciated!

Regards,
Lide

From jsmolens+ at ece.cmu.edu Tue Feb 27 14:29:12 2007
From: jsmolens+ at ece.cmu.edu (Jared C. Smolens)
Date: Tue Feb 27 14:29:17 2007
Subject: [Simflex] Private L2 caches design
Message-ID: <029747430.1172604...@miura>

Hi Lide,

If you only want two levels of cache (L1 and L2, both private to each core, with no shared cache), you might actually be able to use the DSMFlex simulators, after re-tuning for on-chip CMP latencies/bandwidth.

The Cache/CmpCache components are used for "timing" simulations, whereas the TraceFlex simulator's Fast* components are for "functional" simulations (where all cache transactions are atomic and have zero latency). If you want correct coherence with timing, you will have to use the Cache/CmpCache components.

1. The snoop/request channels exist to prevent races between requests and acknowledgements, which can occur in timing simulations.
The "snoop" channel is a high-priority channel for acknowledgement and eviction messages, while the request channel carries request messages. Prioritizing the snoop channel allows older requests to complete before new ones start, avoiding deadlock scenarios. The Fast components have no concurrency and therefore don't need these channels. Their implementation is also far simpler because of this.

2. I'm not sure on this one.

3. We have found that DMA and non-allocating writes are important for correctly modeling cache behaviors of I/O-intensive commercial workloads.

- Jared
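
To make the snoop/request prioritization from the reply concrete, here is a minimal sketch in Python. It is only a toy model under stated assumptions: the class `TwoChannelPort`, its methods, and the message strings are hypothetical names invented for illustration and are not part of the Flexus Cache or CmpCache components. It shows one port with two queues, where the snoop queue (acknowledgements, evictions) is always drained before the request queue, so an acknowledgement for an older transaction is serviced ahead of newer requests even if it arrived later.

```python
from collections import deque


class TwoChannelPort:
    """Toy cache port with a high-priority snoop channel.

    Hypothetical illustration only; not the Flexus component API.
    """

    def __init__(self):
        self.snoop = deque()    # acknowledgements and evictions
        self.request = deque()  # new read/write requests

    def push_snoop(self, msg):
        self.snoop.append(msg)

    def push_request(self, msg):
        self.request.append(msg)

    def next_message(self):
        """Return the message to service next, or None if both queues are empty."""
        if self.snoop:
            # Snoop traffic always wins, so older transactions can finish.
            return self.snoop.popleft()
        if self.request:
            return self.request.popleft()
        return None


def drain(port):
    """Service messages one at a time until the port is idle."""
    order = []
    while True:
        msg = port.next_message()
        if msg is None:
            return order
        order.append(msg)


port = TwoChannelPort()
port.push_request("ReadReq B")      # a new request arrives first
port.push_snoop("InvalidateAck A")  # ack for an older transaction arrives later
port.push_request("WriteReq C")

service_order = drain(port)
print(service_order)
# ['InvalidateAck A', 'ReadReq B', 'WriteReq C']
```

In a functional model such as the Fast* components, each transaction completes atomically before the next begins, so there is never an outstanding acknowledgement to prioritize and a single queue would suffice.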
