Hi Lu, all,

On Thu, 1 Sep 2005, lu peng wrote:

>
> Hi Tom,
>
> How are you doing? I ran the test program with the debug switch turned on
> and got some debug info.
>
> 97 <CacheImpl.cpp:246> (L1i[7]) {1}- Sent on Port BackSideOut(Request): MemoryMessage[Fetch Request]: Addr:0xp:00baa8fc0 Size:64 Core: 0

This debug statement indicates that the L1 instruction cache on node 7
sent a "Fetch Request" message out of its BackSideOut(Request) port for the
cache block with physical address 0xbaa8fc0 to L2 in cycle 1.

> 98 <CmpCacheImpl.cpp:182> (L2[0]) {1}- Received on Port FrontSideIn[14](Request): MemoryMessage[Fetch Request]: Addr:0xp:00baa8fc0 Size:64 Core: 14

This statement indicates that the L2 CMP cache received the message.  The
CMP Cache counts each of the L1 caches as a separate "core", so the L1-I
cache on node 7 is numbered as "Core: 14".  The L1D on node 7 would be
"Core: 15".  For node 0, the I and D are numbered "Core: 0" and "Core: 1",
respectively.
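
The numbering described above can be written as a small function. This is
only a sketch inferred from the two examples in this message (node 7 maps to
14/15, node 0 maps to 0/1); the function name and signature are hypothetical
and are not taken from the Flexus source.

```python
def cmp_core_index(node: int, is_data_cache: bool) -> int:
    """Index the CMP cache uses for an L1 cache, per the examples above.

    Assumption: each node contributes exactly two L1 caches, with the
    instruction cache at the even index and the data cache at the odd one.
    """
    return 2 * node + (1 if is_data_cache else 0)


def node_of_core_index(core: int) -> int:
    """Inverse mapping: which node a 'Core: N' value in the L2 log refers to."""
    return core // 2
```

With this mapping, the "Core: 14" in the L2 log line corresponds to
cmp_core_index(7, is_data_cache=False).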

>
> Could you please explain a little about the relationship between CmpCacheImpl
> and CacheImpl? And the meanings of BackSide and FrontSide? Another question:
> what does 'Core: 14' mean? If I'd like to set the number of cache banks to
> 16 or 32, where can I set this number?

flexus/components/Cache/CacheImpl.cpp contains the definition of the
Flexus Cache component.  Every component in Flexus has, at a minimum, one
declaration file and one implementation file.  Both files share the name
of the component - for the Cache component, they are
flexus/components/Cache/Cache.hpp and
flexus/components/Cache/CacheImpl.cpp.  Likewise for the CmpCache
component.  A component's declaration file contains two things: a list of
configuration parameters that the component supports; and a "component
interface" specification, which indicates how this component can be
connected to other components.  The implementation file then contains a
class which implements the component interface.  The two files, CacheImpl
and CmpCacheImpl, both contain the "glue" logic that connects our cache
data structures and FSMs to the rest of Flexus.  The underlying cache code
is shared between the two components, and lives in
flexus/components/Cache.

Throughout all of Flexus, we use "Front", "Top", and "Up" to refer to the
interface of a cache that is nearest the CPU.  Thus, the
"FrontSideIn(Request)" port of the L1 cache is connected to the CPU core
(the Execute component), and on the L2 cache, it is connected to the L1
cache(s).  We use "Back", "Bottom", and "Down" to refer to interfaces
closer to memory, so the L2 BackSideOut(Request) connects to memory.

Although we have a configuration parameter for the number of banks in a
cache, this parameter currently has no effect.  We are planning to add
banking support to the CmpCache, but it has not been implemented yet.

Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University

>
> Thanks,
>
> Lu
>
>
>
From twenisch at ece.cmu.edu  Mon Sep  5 14:20:36 2005
From: twenisch at ece.cmu.edu (Thomas Wenisch)
Date: Mon Sep  5 14:20:17 2005
Subject: [Simflex] Problem running CMPFlex.OoO and other simulators

Hi Mrinal,

On Tue, 30 Aug 2005, Mrinal Nath wrote:

> Hi Tom,
>
> To recap, I am able to run the Flexus test application using CMPFlex and
> it completes in 5-6 minutes. When I tried running the same application
> using CMPFlex.OoO (please refer to the mail below), the simulation kept
> running for several hours. I let it run for 4 days and then I got an
> error in the 'console window' (see below). This error (panic) seems to
> be from the OS running on Simics !
> Also, note that the simulation was stuck at "thread 2 iteration 0" for
> 3-4 days. Basically, I think that nothing is really happening in the
> simulator, but the time does advance.
>
> There are similar panic messages along with Flexus and Simics error
> messages. I have put these messages in the attached file
> 'error-messages.txt' for convenience.

The kernel panic you are seeing probably indicates that the Flexus
out-of-order core has managed to crash Solaris within Simics.  While we
don't see this all that often, it can happen that the Flexus OoO core
messes up corner cases, especially when interacting with peripheral
devices (e.g., the OoO core delays taking an interrupt, and a device
doesn't like it).  It looks like this particular kernel panic is in the
driver that talks to the "glm" disk controller.  We have also seen kernel
panics caused by bugs in Simics' checkpointing support (mostly when I/Os
are outstanding while a checkpoint is being written), but then I would
expect the panic to happen in the in-order simulation as well.  You could
try continuing the in-order simulation for another few hours past the end
of the test application (at the prompt) to see if it also panics.

Nevertheless, the test application should take far less than 4 days, and
should be completing long before you hit this kernel panic.

>
> Has anyone tried running the test application using simulators other
> than CMPFlex? Is it likely that the checkpoint created is valid only
> with CMPFlex, and that we would need to create different checkpoints
> with the other simulators?

You should not need different checkpoints for In-order and Out-of-order
simulation.  I have not personally run the test app with CMPFlex.OoO.  I
will have someone here test this and report back on how long it takes to
complete.  Give me a few days to get this done.

>
> Can you let me know which is the test application that we are trying to
> run, and how to run it directly from simics?

The source code for the flexus test app is in /flexus-test-app/src.  Take
a look at flexus-test-app/flexus-test.simics to see how we boot the system
and launch the test app.  You could comment out much of the python code in
here and walk through the steps manually to see how the checkpoint is
created, and to run the test app in Simics without flexus.

Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University

>
> Thanks
> - Mrinal
From twenisch at ece.cmu.edu  Tue Sep 13 18:36:22 2005
From: twenisch at ece.cmu.edu (Thomas Wenisch)
Date: Tue Sep 13 18:35:52 2005
Subject: [Simflex] Re: Timing in Flexus

Hi Arrvindh,

On Thu, 8 Sep 2005, Arrvindh Shriraman wrote:

> Hi Tom,
>
> I am getting a bit confused with the memory consistency model and timing
> issues in the system. Simics serializes execution when it simulates a
> multiprocessor system (that is, for every quantum set using
> cpu_switch_time, it goes through a loop simulating every processor). How
> exactly is Flexus able to simulate TSO?

Flexus separates the completion of a store instruction in Simics from
writing a value to the Simics memory image, by inhibiting the completion
of each store access in the InorderSimicsFeeder component.  We control
when the value gets written to Simics' memory.  At the point where the
value is globally visible (visible to all CPUs), we complete the write to
Simics' memory.  If the processor that issued the write reads the location
while the write is still pending, we ensure that it sees the value it
wrote (rather than the old value that is still visible to other
processors).

All of this is coordinated by the InorderSimicsFeeder and Execute
components.  You can enable/disable the TSO behavior with the
SequentialConsistency configuration parameter of the Execute component.
(Similar settings exist in the OoO code, but different code implements the
TSO vs. SC differences).
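
The store handling described above can be illustrated with a toy model. This
is a minimal sketch, not the InorderSimicsFeeder or Execute code: the class
and method names are hypothetical, the shared dictionary stands in for
Simics' memory image, and "globally visible" is reduced to an explicit
drain step.

```python
from collections import deque


class TsoMemory:
    """Toy TSO model: one FIFO of pending stores per CPU in front of memory."""

    def __init__(self, num_cpus):
        self.memory = {}  # stands in for the Simics memory image
        self.pending = [deque() for _ in range(num_cpus)]

    def store(self, cpu, addr, value):
        # The store is held back: it is not yet written to shared memory,
        # so other CPUs cannot see it.
        self.pending[cpu].append((addr, value))

    def load(self, cpu, addr):
        # The issuing CPU must see its own newest pending store, if any.
        for pending_addr, value in reversed(self.pending[cpu]):
            if pending_addr == addr:
                return value
        # Otherwise it sees whatever is globally visible (0 if never written).
        return self.memory.get(addr, 0)

    def drain_one(self, cpu):
        # The oldest pending store becomes globally visible, in program order.
        if self.pending[cpu]:
            addr, value = self.pending[cpu].popleft()
            self.memory[addr] = value
```

For example, after TsoMemory(2).store(0, 0x100, 42), CPU 0 reads 42 from
0x100 while CPU 1 still reads the old value until drain_one(0) is called.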

>
> Also, what is the notion of "time" in Flexus and SIMICS? For example,
> consider the following:
> Load R1,<mem>
> Load R2,<mem>
> Add R1,R2
>
> Let's say the first load misses and we need to stall (or account for the
> notion of a stall). How do SIMICS and Flexus handle this?

In in-order simulation, the processor stalls and will not fetch the
second load until the first completes.  Thus, neither Simics nor Flexus
will see the second load until the first is done. In OoO simulation, our
code will put both loads in its simulated ROB, and they can both execute
in parallel.  They will execute in Simics once we commit them in the OoO
simulation.
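
As rough arithmetic for the three-instruction example, the difference between
the two modes can be sketched as follows. This is an idealized illustration,
not how Flexus computes time: the latencies are made-up inputs, the two loads
are assumed independent, and pipeline details such as fetch and commit
bandwidth are ignored.

```python
def in_order_finish(load_latencies, add_latency=1):
    """Loads are serialized: the next load is not fetched until the
    previous one completes, and the add runs after the last load."""
    return sum(load_latencies) + add_latency


def out_of_order_finish(load_latencies, add_latency=1):
    """Idealized OoO: the independent loads sit in the ROB and overlap,
    so the add only waits for the slower of them."""
    return max(load_latencies) + add_latency
```

With a 100-cycle miss on the first load and a 2-cycle hit on the second,
the in-order estimate is 100 + 2 + 1 cycles and the overlapped estimate is
100 + 1 cycles.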

> I remember you briefly telling me that in the in-order world SIMICS drives
> Flexus, and in OoO Flexus drives SIMICS. How do these things actually
> work?

This is correct.  In in-order simulation, Simics iterates over all CPUs
every cycle, and calls Flexus by using the timing-model and snoop-memory
interfaces of each CPU object.  This kind of interaction is documented in
the Simics manuals (look up the generic cache component).

In OoO mode, Flexus iterates over all CPUs each cycle, and performs
"timing-first" simulation as described in the Wisconsin TFSim paper (I can
send you the citation if you need it).  We advance Simics at the point of
irrevocable commit of each instruction as an in-order checker (kind of
like DIVA) to confirm that our simulation did the right thing.
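
The checker arrangement can be sketched with a toy loop. Everything here is
hypothetical: the one-operation "ISA", the function names, and in particular
the recovery step (copying the checker's state back into the timing model on
a mismatch) are assumptions made for illustration and are not described in
this thread.

```python
def functional_step(regs, instr):
    """Reference semantics (the role Simics plays): execute one instruction.

    An instruction is (dest_register, immediate) and means dest += immediate.
    """
    dest, imm = instr
    regs[dest] = regs.get(dest, 0) + imm


def run_timing_first(program, timing_execute):
    """Commit each instruction in the timing model first, then advance the
    functional checker by one instruction and compare architectural state."""
    timing_regs, checker_regs = {}, {}
    mismatches = 0
    for instr in program:
        timing_execute(timing_regs, instr)    # timing model commits
        functional_step(checker_regs, instr)  # checker advances in order
        if timing_regs != checker_regs:
            mismatches += 1
            timing_regs = dict(checker_regs)  # assumed recovery: resync
    return checker_regs, mismatches


def buggy_execute(regs, instr):
    """A deliberately wrong timing model: it drops any immediate equal to 5."""
    dest, imm = instr
    regs[dest] = regs.get(dest, 0) + (0 if imm == 5 else imm)
```

Running run_timing_first with functional_step as the timing model reports no
mismatches; running it with buggy_execute flags the faulty instruction while
the final state still matches the checker.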


> With Regards,
> Arrvindh Shriraman


Hope that helps.

-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University
From twenisch at ece.cmu.edu  Tue Sep 13 18:44:06 2005
From: twenisch at ece.cmu.edu (Thomas Wenisch)
Date: Tue Sep 13 18:43:38 2005
Subject: [Simflex] Re: About the cache structure

Hello Lu,


On Tue, 13 Sep 2005, lu peng wrote:

>
> Hi Tom,
>
> How are you? I just built an 8-processor checkpoint and tried to run it with
> SimFlex. I used 'read-configuration' and 'load-module' (the .so file was
> copied into /x86-linux/lib). However, it seems that there is no interaction
> between the checkpoint and the module (I print a lot of debug info in the
> simflex module, but none of it appears).
> Could you please let me know why?

The process you described sounds correct to me.  It may be that you are
not getting debug output because the debugging system cannot find a
configuration file telling it where to write its output.  You can confirm
this is the case by changing one of your debug statements to a printf and
seeing if it shows up.

To fix this, ensure that the file flexus-test-app/config/debug.cfg is in the
current working directory when you start Simics.  Flexus reads this file
on startup to configure its debugging system to tell it where to write
output.  It should print a warning message if it doesn't find a
configuration file.
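
A quick way to rule this out before launching Simics is to check the working
directory for the file. The helper below is a hypothetical convenience
script, not part of Flexus; it only assumes the file name debug.cfg given
above.

```python
import os


def debug_cfg_present(workdir=None, name="debug.cfg"):
    """Return True if the debug configuration file is in the directory
    Simics will be started from (the current directory by default)."""
    workdir = os.getcwd() if workdir is None else workdir
    return os.path.isfile(os.path.join(workdir, name))


if __name__ == "__main__":
    if not debug_cfg_present():
        print("debug.cfg not found in", os.getcwd())
```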

If neither of these things helps, send me a code sample containing one of
your debug statements, and the commands you are using to make Flexus,
start simics, and start the simulation, and their output, and I will see
if all the steps look correct.

Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University

>
> Thanks,
>
> Lu
>
>
>
From twenisch at ece.cmu.edu  Tue Sep 13 18:48:37 2005
From: twenisch at ece.cmu.edu (Thomas Wenisch)
Date: Tue Sep 13 18:48:09 2005
Subject: [Simflex] Re: About the cache structure

Hi Lu,

On Tue, 13 Sep 2005, lu peng wrote:

>
> Hi Tom,
>
> Another two questions:
>
> 1. The class CacheArray has both theBanks and theBankNumber as initialization
> values. This confuses me a little. In my mind, theBankNumber should be
> determined by the physical address. If I need to declare a CacheArray with 64
> banks, what value should I assign to theBankNumber?

I believe that neither of these variables has any effect on the
CacheArray code.  Let me confirm that with the author of that code and get
back to you.

>
> 2. For simulation speed, suppose I just use CMPFlex instead of its OoO
> mode, but I'd like to know the timing of the cache, i.e., the cache
> hit/miss latency. Will the cycle information be meaningful? And is the
> cycle a bus cycle or a CPU cycle?

If you set latencies for the caches, the in-order simulation will use those
latencies (i.e., it will wait n cycles before telling Simics to proceed to
the next instruction).  Thus, the in-order simulation is a timing
simulation, but is constrained to in-order execution.

If you issue the command "flexus.fast-mode" within Simics, the simulator
will behave as if all latency settings were set to their minimum, and run
as fast as possible.  In this mode, Flexus is really a trace simulator,
and the cycle count doesn't mean anything (it's approximately the number
of instructions completed).
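
The effect of fast mode on the cycle count can be pictured with a toy
calculation. This is an assumption-laden sketch, not Flexus code: it charges
one latency value per instruction and takes the minimum latency to be one
cycle, neither of which is stated in this thread.

```python
def inorder_cycle_count(latencies, fast_mode=False, min_latency=1):
    """Cycles charged for a run of instructions in the in-order model.

    latencies: per-instruction delay before the next instruction proceeds.
    In fast mode every latency is treated as the minimum, so the result
    tracks the instruction count rather than any real timing.
    """
    if fast_mode:
        return min_latency * len(latencies)
    return sum(latencies)
```

For four instructions with latencies [1, 1, 20, 1], the timing run charges
23 cycles, while fast mode charges 4, which is just the instruction count.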

Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University
From twenisch at ece.cmu.edu  Wed Sep 14 17:19:12 2005
From: twenisch at ece.cmu.edu (Thomas Wenisch)
Date: Wed Sep 14 17:18:47 2005
Subject: [Simflex] Re: About the cache structure

Hi Lu,

You must have separate phys_memory and physical_io objects for each CPU
for Flexus to be able to connect to each of the CPUs individually, hence
the assertion message.

When using the Virtutech scripts, you can make them create separate memory
spaces for each cpu by setting:

@cpu_spaces = 1

You can add this right after the @boards line.  Then reboot, and
everything should work.


If you have created existing checkpoints that you want to "rescue" (to
avoid rebooting, or whatever), you can also manually edit the checkpoint
file that Simics creates to add in the memory space and io space objects
for each CPU.  This is a hassle, but can be useful sometimes.  For each
CPU, you need to add lines like the following:

OBJECT cpu0_mem TYPE memory-space {
        map: ((0, phys_mem0, 0, 0, -1, phys_mem0, 0, 0, 0))
}
OBJECT cpu0_io TYPE memory-space {
        map: ((0, phys_io0, 0, 0, -1, phys_io0, 0, 0, 0))
}

Then, within each CPU definition, update the memory and IO space
attributes:

        physical_memory: cpu0_mem
        physical_io: cpu0_io
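
For a checkpoint with many CPUs, the per-CPU text can be generated rather
than typed. The script below is a hypothetical helper, not a Simics or Flexus
tool. It reproduces the cpu0 stanzas above and substitutes the CPU number;
whether the other CPUs should also map to phys_mem0 and phys_io0 depends on
the checkpoint, so those targets are parameters and the defaults are an
assumption.

```python
def memory_space_objects(cpu, phys_mem="phys_mem0", phys_io="phys_io0"):
    """Return the two OBJECT stanzas for one CPU, in the layout shown above.

    Assumption: phys_mem and phys_io name objects that already exist in
    the checkpoint being edited.
    """
    return (
        f"OBJECT cpu{cpu}_mem TYPE memory-space {{\n"
        f"        map: ((0, {phys_mem}, 0, 0, -1, {phys_mem}, 0, 0, 0))\n"
        "}\n"
        f"OBJECT cpu{cpu}_io TYPE memory-space {{\n"
        f"        map: ((0, {phys_io}, 0, 0, -1, {phys_io}, 0, 0, 0))\n"
        "}\n"
    )


def cpu_attributes(cpu):
    """Return the two attribute lines to place inside the CPU's definition."""
    return (
        f"        physical_memory: cpu{cpu}_mem\n"
        f"        physical_io: cpu{cpu}_io\n"
    )
```

Printing memory_space_objects(n) for each CPU number gives the text to paste
into the checkpoint file; the attribute lines still have to be edited into
each CPU definition by hand.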

Hope that helps.

Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University



On Wed, 14 Sep 2005, lu peng wrote:

>
> Hi, Tom,
>
> Thanks for your reply. I have another question. Perhaps it's more related to
> Simics. I installed Solaris 8 following the Simics instructions. Then I
> set the number of processors before calling sol8-run.simics.
>
> # set up 8 processors with 512MB
> @boards = {0 : [[0, 4, 512], [1, 4, 0]]}
>
> run-command-file "sol8-run.simics"
>
> I got an 8-CPU system. However, I found that all of my CPUs use the same
> phys_mem0, which triggered the following complaint from SimFlex:
>
> Simics Flexus simulator - Built as CMPFlex v1.0
>
> 2 <ComponentManager.cpp:80> {0}- Instantiating system with a width factor of: 8
> 3 <InorderSimicsFeederImpl.cpp:290> (feeder[0]) {0}- Initializing InorderSimicsFeeder.
> 3 <InorderSimicsFeederImpl.cpp:290> (feeder[0]) {0}- Initializing InorderSimicsFeeder.
> 4 <InorderSimicsFeederImpl.cpp:338> (feeder[<undefined>]) {0}- Connecting: cpu0
> 5 <InorderSimicsFeederImpl.cpp:338> (feeder[<undefined>]) {0}- Connecting: cpu1
> 6 <SimicsTracer.hpp:103> (<undefined>[<undefined>]) {0}- Assertion failed: ((!(false))) : Two CPUs connected to the same memory timing_model: phys_mem0
> ***  Simics getting shaky, switching to 'safe' mode.
> ***  Simics (main thread) received an abort signal, probably an assertion.
> <Simics is running in 'safe' mode>
>
> I checked the sample checkpoint in the flexus-test-app directory. Your
> checkpoint uses a different phys_mem for each CPU. Did I miss something?
>
> Thanks a lot,
>
> Lu
>
>
>
>
>
>
From babak at ece.cmu.edu  Tue Sep 27 21:32:20 2005
From: babak at ece.cmu.edu (Babak Falsafi)
Date: Wed Sep 28 13:22:47 2005
Subject: [Simflex] Re: SimFlex and x86

Dear Joe,

Welcome to the SimFlex project. The place to post questions is the
simflex mailing list. It will get an expert response fairly quickly and it
would also benefit others. I am hereby forwarding your question to
the mailing list.

Regards,
Babak

On Sep 27, 2005, at 7:37 PM, Joseph A Tucek wrote:

> Hello,
>
> My name is Joe Tucek, and I am a student of YY Zhou's at UIUC.  We
> would like to use your SimFlex tool to do some timing simulations, and
> I had a few questions.
>
> Primarily, there is conflicting information about x86 support, and
> I was wondering what the actual status is.  Some places on the
> project site say that x86 is supported, and other places only
> mention SPARC.  How well is x86 currently supported for SMP/CMP
> simulation?  How accurate is the associated timing model?  Also,
> what level of support is there (again, for x86) for different
> coherence methods (i.e., snooping/directory based, MSI, MOSI, etc.)?
>
> Finally, is there a student or other place where I could send some  
> additional questions in the future as issues come up?
>
> Thank you very much for your time.
>
> -Joe
>
>
>

___________________________
Babak Falsafi
Associate Professor
Electrical & Computer Engineering
Computer Science
Carnegie Mellon University
http://www.ece.cmu.edu/~babak
