On Thu, Apr 16, 2009 at 5:03 AM, Dominic Account
<[email protected]> wrote:
> Dear Nicholas
>
> I hope it is fine to bug you directly with my question about CacheGrind:

It's better to contact the lists, as other people may be able to
answer your questions better and/or faster than me.

>  First of all thank you for writing CacheGrind. It is a very good
> starting point for me!
>  I am currently trying to extend it to support multi-core cache
> simulation (MESI protocol, 1:1 thread/L1 cache mapping).
>
> However, there is one thing which I find puzzling:
>
> From what I understand CacheGrind tries to combine
> "read/write/instruction-events"
> in order to improve performance ("addEvent_Dw" e.g. merges writes with
> preceding reads Dr+Dw=Dm)
>
> The instrumentation in "flushEvents" - however - turns all Dm-events
> into Dr-instrumentation.
>
> I assume this hides all writes which get merged in "addEvent_Dw" and
> all writes that happen
> in Dm-events constructed in "cg_instrument".
>
> Thus the cache-statistics are partially wrong !? The number of writes
> should be too low.
>
> I stumbled upon this when I included memory bus event annotations in
> "InstrInfo".
> "log_0I_1Dw_cache_access" and "log_1I_1Dw_cache_access" never ever
> reported locked
> writes (locked = the Intel instruction prefix for cache exclusive
> reads) but "log_0I_1Dr_cache_access"
> would happily report locked reads. For my CMP-cache simulation I must not
> lose those writes...
>
> I temporarily disabled merging in "addEvent_Dw" and immediately saw
> locked writes!
>
> My current assumption is that CacheGrind was not designed to be really
> accurate - or would you consider it a bug?

It's been several years since I wrote that code and my memory of the
cache simulation stuff is hazy.  What I remember is that a "modify"
event is, for some reason, equivalent to a "read" event, in terms of
what the cache has to do.  So converting "modify" events into "read"
events is reasonable.  I.e., it's a deliberate decision.

I can't remember now the hardware details of why this is so;  Josef
might have a suggestion.  Whether it is true for multiprocessor
machines is another question.
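
One plausible reason, offered as an assumption and not as something
confirmed in this thread: in a single write-allocate cache, a write that
immediately follows a read of the same address always hits, so simulating
the pair as one read gives the same miss count.  The sketch below is a
hypothetical toy model, not Cachegrind code; ToyCache, Event and
count_misses are made-up names, and the direct-mapped geometry is chosen
only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model (not Cachegrind code): a direct-mapped,
   write-allocate cache with 64-byte lines and 256 sets. */
enum { LINE_BITS = 6, NUM_SETS = 256 };

typedef struct {
    uint64_t      tag[NUM_SETS];
    bool          valid[NUM_SETS];
    unsigned long misses;
} ToyCache;

typedef enum { EV_READ, EV_WRITE } EventKind;

typedef struct {
    EventKind kind;
    uint64_t  addr;
} Event;

static void toy_cache_init(ToyCache *c)
{
    for (size_t i = 0; i < NUM_SETS; i++)
        c->valid[i] = false;
    c->misses = 0;
}

/* Reads and writes take the same path here: look up the line and
   allocate it on a miss.  That is the write-allocate assumption. */
static void toy_cache_access(ToyCache *c, uint64_t addr)
{
    uint64_t line = addr >> LINE_BITS;
    size_t   set  = (size_t)(line % NUM_SETS);

    if (!c->valid[set] || c->tag[set] != line) {
        c->valid[set] = true;
        c->tag[set]   = line;
        c->misses++;
    }
}

/* Replay a trace and return the number of misses.  With merge_modify
   set, a write that directly follows a read of the same address is
   dropped, i.e. the read+write pair is simulated as a single read. */
unsigned long count_misses(const Event *ev, size_t n, bool merge_modify)
{
    ToyCache c;

    assert(ev != NULL || n == 0);
    toy_cache_init(&c);

    for (size_t i = 0; i < n; i++) {
        bool is_modify_write = merge_modify && i > 0
            && ev[i].kind == EV_WRITE
            && ev[i - 1].kind == EV_READ
            && ev[i - 1].addr == ev[i].addr;

        if (is_modify_write)
            continue;   /* the preceding read already fetched this line */
        toy_cache_access(&c, ev[i].addr);
    }
    return c.misses;
}
```

Calling count_misses on the same trace with merge_modify false and then
true should return the same number in this model.  The model only counts
misses in one cache.  It does not track dirty lines or coherence state,
and under a protocol such as MESI a write has to gain exclusive ownership
of the line, so the dropped write would matter there.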

Hope this helps.

Nick

_______________________________________________
Valgrind-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/valgrind-users
