https://bugs.kde.org/show_bug.cgi?id=465465
Bug ID: 465465 Summary: Eviction Emulation in Cachegrind for Detecting Cache Conflicts Classification: Developer tools Product: valgrind Version: 3.20 GIT Platform: Other OS: All Status: REPORTED Severity: wishlist Priority: NOR Component: cachegrind Assignee: n...@valgrind.org Reporter: ibogosavlje...@gmail.com Target Milestone: --- Created attachment 156067 --> https://bugs.kde.org/attachment.cgi?id=156067&action=edit Eviction Simulation Patch Let me first give some background: I am looking for a tool that can detect cache conflicts, but I am not finding any. There are a few that are mostly academic, and thus not maintained. I think it is important for the performance analysis community to have a tool that to some extent can detect cache conflicts. Is it possible to implement support for detecting source code lines where cache conflicts occur? More info on cache conflicts below. === What are cache conflicts? === Cache conflict happens when a cache line is brought up from the memory to the cache, but very soon has to be evicted to the main memory because another cache line is mapped to the same entry. The problem with detecting cache conflicts is that it is normal that one cache line gets evicted because it is replaced by another cache line. Therefore, a cache conflict is an outlier: the cache line spent very little time in the cache before it got evicted. === How to detect cache conflicts? === There are several ways to detect cache conflicts, but I here propose one that is easy to implement in cachegrind. When an instruction that touches the data cache executes, there is a possibility that this instruction evicts a certain line from the cache. When the line enters cache, we timestamp it. When it gets evicted, we can measure how much time it the line spent in the cache. Then, for each instruction we count the number of evictions and total time line spent in cache to calculate the average time spent in cache before eviction. The instructions where the average time before eviction is low compared to the rest of the program are lines where cache conflicts happen. == Advantages and disadvantages == Advantages of this model is that it is very straightforward to implement and the results are good enough. The disadvantage is that Valgrind is not cycle-accurate, so the times spent in cache will not be given in cycles, but in memory accesses (e.g. the average cache line spent 800.000 loads in the LL data cache before it got evicted). == Does it work == I already implemented this in cachegrind and ran it on a matrix multiplication example with matrix sizes 1008, 1024 and 1040. The tool discovered the following anomaly: average stay in cache, LL cache, data reads for different matrix size: 1008: 890,688 memory accesses 1024: 98,394 memory accesses 1040: 890,965 memory accesses == Implementation == The first draft of the implementation is attached as a patch and should be applied on top of VALGRIND 3.20 tag in Github. -- You are receiving this mail because: You are watching all bug changes.