On Fri, Sep 3, 2010 at 4:21 PM, DRAM Ninjas <[email protected]> wrote:

> So I'm back to trying to figure out which statistics numbers I should be
> looking at. I ran 'ptlstats -snapshot user' and I'm looking for the L2 miss
> rate. There are a few sections labeled L2 { } (I'm not sure why there are
> several since I'm using a shared L2 configuration, but maybe 1 global + 1
> per core?) and each of those indicates about a 25% miss rate.
>
Because of PTLsim's stats collection mechanism, we can't create stats
structures dynamically to match the simulated core/cache configuration.
So the workaround used in PTLsim and Marss is to statically create stats for
each possible core/cache (for caches, we have cache stats for up to 8 cores
even if we simulate only one core) and to leave unused the ones that are not
simulated.

Also, for multiple caches we collect total cache stats by default. So for a
shared L2 configuration you will see two L2 stats, one in 'c0' and one in
'total'.
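
A toy model of the layout described above may make it easier to see why
several L2 { } sections show up. This is only a sketch in Python, not PTLsim's
actual C++ stats code: the slot names 'c0' and 'total' and the limit of 8
cores come from this message, while CacheStats, make_l2_stats and record are
hypothetical names invented for the illustration.

```python
from dataclasses import dataclass

MAX_CORES = 8  # upper bound mentioned in the message; slots exist even if unused


@dataclass
class CacheStats:
    """Hypothetical per-cache counters (not PTLsim's real structure)."""
    hits: int = 0
    misses: int = 0

    def miss_rate(self) -> float:
        refs = self.hits + self.misses
        return self.misses / refs if refs else 0.0


def make_l2_stats() -> dict:
    """Allocate one slot per possible core plus a 'total' slot, all up front."""
    stats = {f"c{i}": CacheStats() for i in range(MAX_CORES)}
    stats["total"] = CacheStats()
    return stats


def record(stats: dict, cache: str, hit: bool) -> None:
    """Update the per-cache slot and the aggregate slot together."""
    for slot in (stats[cache], stats["total"]):
        if hit:
            slot.hits += 1
        else:
            slot.misses += 1


l2 = make_l2_stats()
# A shared L2 is a single cache, so only the 'c0' slot is ever updated.
for hit in (True, False, False, False):
    record(l2, "c0", hit)

# Slots that actually hold data; the remaining ones stay at zero.
used = [name for name, s in l2.items() if s.hits + s.misses > 0]
print(used)
print(l2["c0"].miss_rate(), l2["total"].miss_rate())
```

In this model the 'c0' and 'total' slots carry identical numbers for a shared
L2, and the slots 'c1' to 'c7' exist but stay empty.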


> However there's another section labeled dcache {} which has ~87% miss rate
> (which is honestly closer to what I'm looking for for my workload). Can
> someone point me in the right direction of what I should be looking at ... ?
>
About dcache: these stats are deprecated and are not used anymore.

One more thing about stats: if they are showing weird numbers, I suggest you
clean the whole build directory manually and then recompile Marss. This is
because of a small bug I found lately in 'scons' and also because of
PTLsim's stats collection template system.

- Avadh

On Thu, Aug 12, 2010 at 11:24 PM, avadh patel <[email protected]> wrote:
>
>>
>>
>> On Thu, Aug 12, 2010 at 4:41 PM, DRAM Ninjas <[email protected]> wrote:
>>
>>> Greetings All,
>>>
>>> I am in the process of trying to tweak the cache hierarchy and I just had
>>> a few questions, since I believe the comments in cacheConstants.h don't match
>>> the actual code.
>>>
>>> So let's say I want a 32 KB L1D, 8-way set associative, 64-byte lines (I
>>> believe these are actual Core 2 Duo numbers for certain models).
>>>
>>> So the "way count" is the associativity, right?
>>>
>>> So I want WAY_COUNT=8, LINE_SIZE=64, SET_COUNT=64, i.e. 64 * 8 * 64 = 32 KB.
>>> Are those the correct parameters?
>>>
>>> The other thing I haven't heard of before is L1D_DCACHE_BANKS ... what
>>> does this parameter do? It seems like it is also factored into the total
>>> size of the L1D, but the comment says that the cache should be 32KB with
>>> LINE_SIZE=16, SET_COUNT=2048, WAY_COUNT=2, BANKS=8 ... which doesn't seem
>>> right. Is this a case of just forgetting to update the comment? If the
>>> comment is incorrect, given these parameters, what is the size of the L1D?
>>>
>> Yes, these numbers and comments are confusing. Bad job on our side.
>> I am looking for a better and easier way to configure the caches to some
>> pre-defined configuration like core2, corei5, corei7, etc.
>> Please send me some suggestions.
>>
>> I am also looking for a better way to configure the core parameters, as
>> with the caches: through pre-defined configurations.
>> The original PTLsim had a way to define a different 'ooocore*.h' file to do
>> that, but the biggest issue with this method is that there is a lot of
>> *shared* code between the base ooocore.h and the other configuration files.
>> We need to cleanly separate the configuration from the base pipeline code,
>> which will allow us to create separate configuration files.
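
The L1D size arithmetic from the question above can be checked with a few
lines of Python. This is only a sketch that assumes the usual relation
size = sets * ways * line size. cache_size_bytes is a hypothetical helper, not
something from cacheConstants.h, and since this thread does not say how
L1D_DCACHE_BANKS enters the size, banks are deliberately left out.

```python
def cache_size_bytes(set_count: int, way_count: int, line_size: int) -> int:
    """Capacity of a set-associative cache, ignoring any banking factor."""
    return set_count * way_count * line_size


# Parameters proposed in the question: 64 sets, 8 ways, 64-byte lines.
proposed = cache_size_bytes(set_count=64, way_count=8, line_size=64)

# Parameters quoted from the header comment: 2048 sets, 2 ways, 16-byte lines.
commented = cache_size_bytes(set_count=2048, way_count=2, line_size=16)

print(proposed // 1024, "KiB")   # 32 KiB
print(commented // 1024, "KiB")  # 64 KiB
```

Under that assumption the proposed parameters do give 32 KB, while the values
quoted from the comment give 64 KB before banks are considered at all, which
agrees with the suspicion that the comment is out of date.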
>>
>>
>>> As far as cache statistics are concerned: is there a way to get only
>>> userland L2 cache misses?
>>>
>>> I'm running a workload where the L2 cache hit rate seems excessively high
>>> even though the workload is designed to be extremely cache unfriendly
>>> (though I must say I'm a bit overwhelmed at the ptlstats output and am not
>>> quite sure where to look). Using performance counters I'm getting on the
>>> order of 75% miss rate on my laptop. I'm wondering if there's maybe a
>>> discrepancy in how the L2 hit rate is computed. Is it simply (total L2 hits
>>> / total L2 references)?
>>>
>>>
>> To get the user-mode stats you can use the '-snapshot user' option.
>>
>> About the comparison with the real machine: are you reading the perf.
>> counters after booting your machine in single-user mode with no X running?
>> In my personal experience, the results are otherwise skewed by a large
>> amount. The L2 hit rate is simply total L2 hits / total L2 references.
>>
>> - Avadh
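
The hit-rate definition given in the reply above is easy to turn into a
miss rate for comparison with performance-counter numbers. This is a minimal
sketch in Python. l2_hit_rate is a hypothetical helper and the counts are
made up for illustration. They are not taken from any ptlstats output.

```python
def l2_hit_rate(hits: int, references: int) -> float:
    """Hit rate as defined in the reply: total L2 hits / total L2 references."""
    if references == 0:
        return 0.0  # no accesses recorded, so report 0 rather than divide by zero
    return hits / references


# Made-up counts, chosen only to illustrate the arithmetic.
hits, references = 250, 1000

hit_rate = l2_hit_rate(hits, references)
miss_rate = 1.0 - hit_rate  # every reference is either a hit or a miss

print(f"hit rate {hit_rate:.0%}, miss rate {miss_rate:.0%}")
```

When comparing against hardware counters, it is worth checking that both
sides count the same set of references (for example user mode only) before
comparing the two rates.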
>>
>>
>>> Thanks.
>>> Paul
>>>
>>> _______________________________________________
>>> http://www.marss86.org
>>> Marss86-Devel mailing list
>>> [email protected]
>>> https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
>>>
>>>
>>
>
