https://bugs.kde.org/show_bug.cgi?id=465465

            Bug ID: 465465
           Summary: Eviction Emulation in Cachegrind for Detecting Cache
                    Conflicts
    Classification: Developer tools
           Product: valgrind
           Version: 3.20 GIT
          Platform: Other
                OS: All
            Status: REPORTED
          Severity: wishlist
          Priority: NOR
         Component: cachegrind
          Assignee: n...@valgrind.org
          Reporter: ibogosavlje...@gmail.com
  Target Milestone: ---

Created attachment 156067
  --> https://bugs.kde.org/attachment.cgi?id=156067&action=edit
Eviction Simulation Patch

Let me first give some background:

I am looking for a tool that can detect cache conflicts, but I am not
finding any. There are a few that are mostly academic, and thus not
maintained. I think it is important for the performance analysis
community to have a tool that to some extent can detect cache
conflicts. Is it possible to implement support for detecting source
code lines where cache conflicts occur? More info on cache conflicts
below.

=== What are cache conflicts? ===
Cache conflict happens when a cache line is brought up from the memory
to the cache, but very soon has to be evicted to the main memory
because another cache line is mapped to the same entry.

The problem with detecting cache conflicts is that it is normal that
one cache line gets evicted because it is replaced by another cache
line. Therefore, a cache conflict is an outlier: the cache line spent
very little time in the cache before it got evicted.

=== How to detect cache conflicts? ===
There are several ways to detect cache conflicts, but I here propose one
that is easy to implement in cachegrind.

When an instruction that touches the data cache executes, there is 
a possibility that this instruction evicts a certain line from the cache.
When the line enters cache, we timestamp it. When it gets evicted, we 
can measure how much time it the line spent in the cache.

Then, for each instruction we count the number of evictions and total
time line spent in cache to calculate the average time spent in cache
before eviction. The instructions where the average time before eviction is low
compared to the rest of the program are lines where cache conflicts
happen.

== Advantages and disadvantages ==
Advantages of this model is that it is very straightforward to implement
and the results are good enough. The disadvantage is that Valgrind is
not cycle-accurate, so the times spent in cache will not be given in 
cycles, but in memory accesses (e.g. the average cache line spent
800.000 loads in the LL data cache before it got evicted).

== Does it work ==

I already implemented this in cachegrind and ran it on a matrix
multiplication example with matrix sizes 1008, 1024 and 1040. The 
tool discovered the following anomaly:
average stay in cache, LL cache, data reads for different matrix size:
1008: 890,688 memory accesses
1024: 98,394 memory accesses
1040: 890,965 memory accesses


== Implementation ==
The first draft of the implementation is attached as a patch and 
should be applied on top of VALGRIND 3.20 tag in Github.

-- 
You are receiving this mail because:
You are watching all bug changes.

Reply via email to