Hi Shan,

We base our L2 CMP cache design on Piranha. It may be helpful for you to
read the following paper to determine what we have done and why.

Piranha: A Scalable Architecture Based on Single-Chip Multiprocessing,
ISCA 2000
http://www.research.compaq.com/wrl/projects/Database/isca00.pdf

Regards,
-Tom Wenisch

On Wed, 26 Oct 2005, shan wrote:

> I see.
> Emm, much more clear now :).
>
> A last question about this cache module is: is this a common way to
> implement cache coherence in shared L2 CMP? I mean this one directory based
> implementation. I know another way is to put M-O-S-I or M-E-S-I state in
> each L1 cache's cache line, instead of sharing them. Which one is more
> popular? Or equally popular?
>
> Thanks very much
> Shan
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On
> Behalf Of Jared C. Smolens
> Sent: Tuesday, October 25, 2005 11:22 PM
> To: [email protected]
> Subject: RE: [Simflex] CMP cache coherence protocol question (fwd)
>
>
> Hi Shan,
>
> The idea behind "owner" is to track which cache is responsible for the
> last copy of the block in the CMP. This is independent of whether the
> line is modifiable or not.
>
> The meaning in the M state is clear, since there should only be one cache
> with permission to modify the block.
>
> In the S state, the owner isn't important, since all caches are equal.
> However, if the line only exists in an L1 (and not the shared L2), the
> owner field reflects that. When the line is eventually replaced by the
> owner's L1, instead of silently dropping the block, it is then placed in
> the L2 as a victim (this becomes the case you described below). As you
> correctly said, no caches in the CMP have permission to write to the line
> when the directory is in S state. To write, permission must first be
> requested from memory.
>
> There is no function named "SendRequest" in the Piranha* code, but you
> probably really meant "sendMessage".
> In all cases, messages in Piranha*
> are sent from the perspective of the shared L2 cache (so sending a
> message TO the L2 is nonsensical) and there is a well-defined
> destination. Here's a quick rundown of the methods that are used to send
> messages and what they do:
>
> 1) sendBroadcast() - send a message to all L1 sharers within the CMP,
> typically used for invalidations. The coreIdx is always set within
> sendBroadcast().
>
> 2) sendOwnerRequest() - send a message to the L1 owner of the block.
> This only happens when an L1 is the owner, and coreIdx is set to the
> owner.
>
> 3) setupExternalRequest() - send a message to external memory. Here, the
> L2 makes an off-chip request. There is no notion of an off-chip core in
> this system, therefore none is set.
>
> Cheers,
>
> Jared
>
> Excerpts From "shan" <[email protected]>:
> RE: [Simflex] CMP cache coherence p: "shan" <[email protected]>
> > I see.
> > Emm, I am not sure if I catch the idea.
> > My current understanding is, for example, the 'owner' means which L1 or L2
> >caches own the writable version of the block. If the current MESI state is
> >'S', then the owner is the L2 cache and the copies of this cache line in all
> >sharer do not have 'modifiable' privilege. Is my understanding correct?
> > B.T.W., usually how can I tell what's the receiver of a message from the
> >code? In the cache PiranhaCacheXXX files, there are a lot of SendRequest,
> >are those messages send to L2 cache itself, if the request's coreIndx is not
> >specially set?
> >
> >Thanks very much
> >shan
> >
> >
> >-----Original Message-----
> >From: [email protected] [mailto:[email protected]] On
> >Behalf Of Thomas Wenisch
> >Sent: Tuesday, October 25, 2005 1:50 PM
> >To: [email protected]
> >Subject: Re: [Simflex] CMP cache coherence protocol question (fwd)
> >
> >Hi Shan,
> >
> >See below. We will try to get the state diagram out soon.
> >
> >Regards,
> >-Tom Wenisch
> >
> >---------- Forwarded message ----------
> >Date: Tue, 25 Oct 2005 13:53:49 -0400 (EDT)
> >From: Jared C. Smolens
> >To: [email protected]
> >Subject: Re: [Simflex] CMP cache coherence protocol question
> >
> >
> >Excerpts From "shan" <[email protected]>:
> > [Simflex] CMP cache coherence proto: "shan" <[email protected]>
> >>Hi Tom,
> >> Is there some document or something explaining the cache coherence for
> >> CMP in the SimFlex? I know the general MOSI protocol and I read the
> >>PiranhaCache-Controller files, but maybe because I am not very familiar
> >>with the CMP and cache coherence, I still do not understand this module
> >>very well.
> >
> >I am preparing the state diagram for distribution. Stay tuned.
> >
> >> The document said the SimFlex CMP has private L1 and shared L2. Does
> >>that mean there are only one L2 and only one directory shared by all
> >>cores?
> >
> >Yes. We maintain a logical directory at a single, shared L2 in CMPFlex.
> >The directory covers all lines in the CMP core.
> >
> >>Is the M-E-S-I states shared by all cores?
> >
> >The coherence states (including transient states) are maintained in a
> >directory structure at the shared L2. The L1 caches have the same states
> >as in DSMFlex (permutations of Valid, Modifiable, and Dirty bits).
> >
> >> Does the L1 cache in CMP need to be configured somehow different from
> >> that in the Uni-processor scenario?
> >
> >Yes, there is one difference from the Uniprocessor/DSMFlex
> >configurations: the EvictClean option is set to true in CMPFlex. This
> >option causes clean lines to be sent to the L2, instead of being silently
> >dropped on replacement. This models Piranha's use of the shared cache as
> >a large "victim cache" for L1 replacements.
> >
> >>I didn't find the difference in the
> >>wiring.cpp. However, shouldn't the L1 cache at least be write-through
> >>instead of write-back?
> >
> >The private caches are still write-back, but both clean and dirty lines
> >are written back to the shared cache on replacement.
> >
> >> Sorry to take your time with so many questions.
> >>
> >>Thanks
> >>Shan
> >
> >
> > Jared Smolens ----------- Electrical and Computer Engineering
> www.rabidpenguin.org ------------- Carnegie Mellon University
> jsmolens AT ece.cmu.edu ------ HH A-313 ------ Pittsburgh, PA
>
> _______________________________________________
> SimFlex mailing list
> [email protected]
> https://sos.ece.cmu.edu/mailman/listinfo/simflex
> SimFlex web page: http://www.ece.cmu.edu/~simflex
>

From penglu01 at hotmail.com Wed Oct 26 19:32:04 2005
From: penglu01 at hotmail.com (lu peng)
List-Post: [email protected]
Date: Wed Oct 26 19:32:16 2005
Subject: [Simflex] Re: Installation setting problem
In-Reply-To: <pine.lnx.4.53l-ece.cmu.edu.0510251550380.10...@dalmore.ece.cmu.edu>
Message-ID: <[email protected]>

An HTML attachment was scrubbed...
URL: http://sos.ece.cmu.edu/pipermail/simflex/attachments/20051026/da26822b/attachment.html

From shanlu at cs.uiuc.edu Thu Oct 27 21:30:49 2005
From: shanlu at cs.uiuc.edu (shan)
List-Post: [email protected]
Date: Thu Oct 27 21:32:57 2005
Subject: FW: [Simflex] x86 CMP configuration
Message-ID: <[email protected]>

Skipped content of type multipart/alternative

From twenisch at ece.cmu.edu Fri Oct 28 22:05:06 2005
From: twenisch at ece.cmu.edu (Thomas Wenisch)
List-Post: [email protected]
Date: Fri Oct 28 22:04:31 2005
Subject: [Simflex] Re: Installation setting problem
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <pine.lnx.4.53l-ece.cmu.edu.0510282200040.16...@dalmore.ece.cmu.edu>

Hi Lu,

Based on the path names below, I suspect these are 64-bit libraries. Is
this an AMD machine running 64-bit Linux? If so, I am not sure I can help
you. I have no experience getting 32-bit and 64-bit libraries to play
nicely together. You may want to see if there is a 64-bit version of
Simics available. Alternatively, you will need to find a way to build
Flexus as a 32-bit library on this machine (and thus link against 32-bit
gcc libraries).

I am fairly certain that mixing 32-bit and 64-bit shared libraries (i.e.
32-bit Simics and 64-bit Flexus) will be difficult or impossible. I am
not sure if Flexus will link successfully with 64-bit libraries. You
could try adding the compiler option suggested in the error message below
(-fPIC; add it in makefile.defs).

Sorry I can't be of more help on this one.

Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University

On Wed, 26 Oct 2005, lu peng wrote:

>
> Hi Tom,
>
> Thanks for your reply. I tried to find the library files by ldd.
> They are in dir as follows:
>
> libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003cc3e00000)
>
> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003cc3c00000)
>
> Then I copied them to the dir of simics-2.0.28/x86-linux/sys/lib. After I ran
> 'make CMPFlex', still an error message:
>
> /usr/bin/ld: wiring.v9_iface_gcc_o: relocation R_X86_64_32S against
> `Flexus::Wiring::resolve_channel<Flexus::Core::ComponentHandle<Flexus::Core::ComponentInstance<InorderSimicsFeederInterface>,
> &(Flexus::Wiring::theFeeder_instance)>,
> InorderSimicsFeederInterface::InstructionOutputPort,
> Flexus::Core::aux_::pull, true>::invoke_available(unsigned int)' can
> not be used when making a shared object; recompile with -fPIC
> wiring.v9_iface_gcc_o: could not read symbols: Bad value
> collect2: ld returned 1 exit status
> make[6]: *** [libflexus_CMPFlex_v9_iface_gcc.so] Error 1
> make[5]: *** [simics-v9] Error 2
> make[4]: *** [CMPFlex] Error 2
> make[3]: *** [CMPFlex] Error 2
> make[2]: *** [CMPFlex] Error 2
> make[1]: *** [CMPFlex] Error 2
> make: *** [CMPFlex] Error 2
>
> Could you please take a look? After that, I also tried to install a gcc 3.4.4
> on my machine followed your online pdf as I did before.
> However it also stopped when
> I ran 'make profiledbootstrap':
>
> make[4]: Leaving directory `/pengdata/simflex/gcc-build/gcc'
> ./xgcc -B./ -B/pengdata/simflex/gcc-3.4.4/x86_64-unknown-linux-gnu/bin/
> -isystem /pengdata/simflex/gcc-3.4.4/x86_64-unknown-linux-gnu/include -isystem
> /pengdata/simflex/gcc-3.4.4/x86_64-unknown-linux-gnu/sys-include
> -L/pengdata/simflex/gcc-build/gcc/../ld -O2 -DIN_GCC -W -Wall
> -Wwrite-strings
> -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem
> ./include -fPIC -g -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -shared
> -nodefaultlibs
> -Wl,--soname=libgcc_s.so.1 -Wl,--version-script=libgcc/32/libgcc.map -o
> 32/libgcc_s.so.1.tmp -m32 libgcc/32/_muldi3.o libgcc/32/_negdi2.o
> libgcc/32/_lshrdi3.o
> libgcc/32/_ashldi3.o libgcc/32/_ashrdi3.o libgcc/32/_cmpdi2.o
> libgcc/32/_ucmpdi2.o libgcc/32/_floatdidf.o libgcc/32/_floatdisf.o
> libgcc/32/_fixunsdfsi.o
> libgcc/32/_fixunssfsi.o libgcc/32/_fixunsdfdi.o libgcc/32/_fixdfdi.o
> libgcc/32/_fixunssfdi.o libgcc/32/_fixsfdi.o libgcc/32/_fixxfdi.o
> libgcc/32/_fixunsxfdi.o
> libgcc/32/_floatdixf.o libgcc/32/_fixunsxfsi.o libgcc/32/_fixtfdi.o
> libgcc/32/_fixunstfdi.o libgcc/32/_floatditf.o libgcc/32/_clear_cache.o
> libgcc/32/_enable_execute_stack.o libgcc/32/_trampoline.o libgcc/32/__main.o
> libgcc/32/_absvsi2.o libgcc/32/_absvdi2.o libgcc/32/_addvsi3.o
> libgcc/32/_addvdi3.o
> libgcc/32/_subvsi3.o libgcc/32/_subvdi3.o libgcc/32/_mulvsi3.o
> libgcc/32/_mulvdi3.o libgcc/32/_negvsi2.o libgcc/32/_negvdi2.o
> libgcc/32/_ctors.o
> libgcc/32/_ffssi2.o libgcc/32/_ffsdi2.o libgcc/32/_clz.o libgcc/32/_clzsi2.o
> libgcc/32/_clzdi2.o libgcc/32/_ctzsi2.o libgcc/32/_ctzdi2.o
> libgcc/32/_popcount_tab.o
> libgcc/32/_popcountsi2.o libgcc/32/_popcountdi2.o libgcc/32/_paritysi2.o
> libgcc/32/_paritydi2.o libgcc/32/_divdi3.o libgcc/32/_moddi3.o
> libgcc/32/_udivdi3.o
> libgcc/32/_umoddi3.o libgcc/32/_udiv_w_sdiv.o libgcc/32/_udivmoddi4.o
> libgcc/32/unwind-dw2.o
> libgcc/32/unwind-dw2-fde-glibc.o
> libgcc/32/unwind-sjlj.o
> libgcc/32/gthr-gnat.o libgcc/32/unwind-c.o -lc && rm -f libgcc_s_32.so && if
> [ -f 32/libgcc_s.so.1 ]; then mv -f 32/libgcc_s.so.1 32/libgcc_s.so.1.backup;
> else
> true; fi && mv 32/libgcc_s.so.1.tmp 32/libgcc_s.so.1 && ln -s
> 32/libgcc_s.so.1 libgcc_s_32.so
> /usr/bin/ld: crti.o: No such file: No such file or directory
> collect2: ld returned 1 exit status
> make[3]: *** [32/libgcc_s_32.so] Error 1
> make[3]: Leaving directory `/pengdata/simflex/gcc-build/gcc'
> make[2]: *** [stmp-multilib] Error 2
> make[2]: Leaving directory `/pengdata/simflex/gcc-build/gcc'
> make[1]: *** [stage1_build] Error 2
> make[1]: Leaving directory `/pengdata/simflex/gcc-build/gcc'
> make: *** [profiledbootstrap] Error 2
>
> So the GCC installation also has problem. Please help.
>
> Thanks a lot,
>
> Lu
>
>

From twenisch at ece.cmu.edu Fri Oct 28 22:17:35 2005
From: twenisch at ece.cmu.edu (Thomas Wenisch)
List-Post: [email protected]
Date: Fri Oct 28 22:16:58 2005
Subject: FW: [Simflex] x86 CMP configuration
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <pine.lnx.4.53l-ece.cmu.edu.0510282205180.16...@dalmore.ece.cmu.edu>

On Thu, 27 Oct 2005, shan wrote:

>
> I am trying to run the x86 CMP module. I am not sure if what I did is
> correct.
>
> I used the simics enterprise-4p.simics macro and didn't change any
> configuration on the flexus side (should I change any configuration at the
> flexus side?)
>
> When I load the module, it shows:
>
> 1 <startup.cpp:107> {0}- Initializing Flexus. ...
> 4 <InorderSimicsFeederImpl.cpp:338> (feeder[<undefined>]) {0}- Connecting: cpu0
>
> 5 <InorderSimicsFeederImpl.cpp:338> (feeder[<undefined>]) {0}- Connecting: cpu1
>
> 6 <InorderSimicsFeederImpl.cpp:338> (feeder[<undefined>]) {0}- Connecting: cpu2
>
> 7 <InorderSimicsFeederImpl.cpp:338> (feeder[<undefined>]) {0}- Connecting: cpu3
>
> Why above feeder index is 'undefined', does it imply I made some mistake in
> configuration?

The <undefined> notation is harmless. It indicates that this particular
debug message does not have a {FlexusIdx} field associated with it,
because it is in a piece of code that is not associated with a particular
node. All debug messages actually consist of a bunch of fields, like the
sequence number, file, line, source component, etc. You can filter,
redirect, and reformat debug output based on these fields. Take a look at
debug.cfg to get an idea of how this works.

>
>
> Then I type 'continue'. After a while, the screen shows:
>
> 25 <MemoryMapImpl.cpp:319> {0}- Assigned 0 pages.
>
> 26 <flexus.cpp:240> {0}- Timestamp: 2005-Oct-25 19:49:10
>
> 27 <flexus.cpp:326> {108288}- Watchdog timer expired. No progress by CPU 0
> for 100215cycles
>
> 28 <flexus.cpp:330> (<undefined>[<undefined>]) {108288}- Assertion failed:
> ((!(theWatchdogCounts[i] < theWatchdogTimeout + 10))) : Watchdog timer
> expired. No progress by CPU 0 for 100215cycles
>
> *** Simics getting shaky, switching to 'safe' mode.
>
> *** Simics (main thread) received an abort signal, probably an assertion.
>
> <Simics is running in 'safe' mode>
> I checked the assertion at flexus.cpp line 330. It seems that if the cpu
> idles for some time, the assertion would fail. Yes? I temporarily removed
> the assertion and continues. And get following debug output:
>

This assertion indicates that one of the CPUs has not issued an
instruction or completed a memory request for 100k cycles.
This can only come about if there is a deadlock in a Flexus coherence
protocol (very unlikely, as we have run trillions of cycles since we last
found a deadlock bug, but perhaps there is something x86 specific that we
are missing), or for some reason Flexus has not connected itself properly
to your 4 CPUs. Given that you only see output from execute[2], the
latter is very likely.

I suggest you turn up the debug output to confirm that none of the CPUs
are doing anything. Issue the command "flexus.set-debug-severity iface"
before starting the simulation. You should get lots more output, from
which you can figure out if any of the other nodes are active.

The relevant pieces of code to study to figure out this bug are the
init() method in InorderSimicsFeeder/SimicsTracer.hpp and doInitialize()
in InorderSimicsFeeder/InorderSimicsFeederImpl.cpp. If there is a problem
with hooking up to Simics, it will be in one of these two places.

>
>
> 52 <ExecuteImpl.cpp:341> (execute[2]) {250546}- EX received Reply
> MemoryMessage[Store Reply]: Addr:0xp:00ff3ffac Size:8 Core: 0
>
> 53 <ExecuteImpl.cpp:1287> (execute[2]) {250548}- EX Issuing memory request:
> MemoryMessage[Store Request]: Addr:0xp:00ff3ffa8 Size:8 Core: 0
>
>
>
> Again, I fear there is something wrong. The reason is, based on the debug
> output only execute[2] is doing work, however, all MemoryMessage shows
> 'core 0'. Is this something abnormal?

Core 0 does not necessarily indicate anything is wrong, because the core
field is not set in the execute component, and would still be zero at the
point this debug message is generated.

> :-) sorry for the whole bunch of questions :-).
>
> Thanks very much
>
> Shan
>
>

Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University
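
For readers following the watchdog discussion above, here is a minimal
sketch of how a per-CPU progress watchdog of this kind might work. It is
not the code from flexus.cpp: the class and method names
(ProgressWatchdog, noteProgress, tick, idleCycles) are hypothetical, and
the stall check is a plain "count >= timeout" comparison rather than the
exact condition in the assertion quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a per-CPU progress watchdog. None of these names
// come from Flexus; they are assumptions made for illustration only.
class ProgressWatchdog {
public:
  ProgressWatchdog(std::size_t numCpus, std::uint64_t timeoutCycles)
      : theCounts(numCpus, 0), theTimeout(timeoutCycles) {}

  // Call when a CPU issues an instruction or completes a memory request.
  // This resets that CPU's idle counter.
  void noteProgress(std::size_t cpu) { theCounts.at(cpu) = 0; }

  // Advance every CPU's idle counter by one simulated cycle. Returns the
  // index of the lowest-numbered CPU whose idle count has reached the
  // timeout, or -1 if every CPU is still within the limit.
  int tick() {
    int stalled = -1;
    for (std::size_t i = 0; i < theCounts.size(); ++i) {
      ++theCounts[i];
      if (stalled < 0 && theCounts[i] >= theTimeout) {
        stalled = static_cast<int>(i);
      }
    }
    return stalled;
  }

  // Number of cycles since the given CPU last made progress.
  std::uint64_t idleCycles(std::size_t cpu) const { return theCounts.at(cpu); }

private:
  std::vector<std::uint64_t> theCounts;  // idle cycles per CPU
  std::uint64_t theTimeout;              // idle cycles tolerated before a stall
};
```

In the situation described in this thread, only one node (execute[2])
calls the equivalent of noteProgress(), so the counter for CPU 0 is never
reset and it is the first to be reported once the timeout is reached.
Removing the assertion hides the report but does not make the other CPUs
progress.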
