Re: [Valgrind-users] Cache conflict detection support in cachegrind
On 1/30/2023 7:08 AM, Ivica B wrote:
> Can you please share the instructions on how to do it?
>
> On Sun, Jan 29, 2023, 9:07 PM Eliot Moss <m...@cs.umass.edu> wrote:
>> I have used lackey to get traces, which I have fed into a cache model
>> to detect conflicts and such. You could also start with the lackey code
>> and build the cache model into the tool (which a student of mine did at
>> one point).

Lackey is one of the built-in valgrind tools, and it comes with instructions. It produces a trace with one memory access per line, indicating whether the access is an instruction fetch, a memory read, a memory write, or both a read and a write, along with the address and size. You write a program to parse that and run your own model of whatever cache you're concerned with. That part is for you to figure out. You do need to know the details of the cache you're going to model.

There may be programs or libraries out there for analyzing address traces, but this would not be the list to find them. Sorry, but I'm not prepared to go through how to code a cache model ...

EM
___
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
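As a rough illustration of the approach described above, here is a minimal Python sketch that parses Lackey's `--trace-mem=yes` records (`I`, `L`, `S` or `M`, followed by a hex address and a size) and feeds the data accesses into a small set-associative LRU cache model. The cache geometry (32 KiB, 8-way, 64-byte lines), the LRU policy, and all function and class names are assumptions chosen for illustration; they are not taken from Lackey or Cachegrind, so substitute the details of the cache you actually want to model.

```python
from collections import OrderedDict


def parse_lackey_line(line):
    """Return (kind, address, size) for a Lackey trace record, else None."""
    parts = line.split()
    if len(parts) != 2 or parts[0] not in ("I", "L", "S", "M"):
        return None  # not a trace record, e.g. Valgrind's "==pid==" messages
    addr_text, _, size_text = parts[1].partition(",")
    try:
        return parts[0], int(addr_text, 16), int(size_text)
    except ValueError:
        return None


class SetAssociativeCache:
    """Hypothetical LRU cache model; the geometry defaults are assumptions."""

    def __init__(self, size_bytes=32 * 1024, ways=8, line_bytes=64):
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (ways * line_bytes)
        # One OrderedDict per set, kept in least-recently-used-first order.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, addr, size):
        # An access may straddle a line boundary, so touch every line it covers.
        first = addr // self.line_bytes
        last = (addr + max(size, 1) - 1) // self.line_bytes
        for line in range(first, last + 1):
            self._touch(line)

    def _touch(self, line):
        lru = self.sets[line % self.num_sets]
        if line in lru:
            lru.move_to_end(line)  # mark as most recently used
            self.hits += 1
            return
        self.misses += 1
        if len(lru) >= self.ways:
            lru.popitem(last=False)  # evict the least recently used line
        lru[line] = True


def simulate(trace_lines, cache=None):
    """Run the data accesses of a trace through the cache model."""
    if cache is None:
        cache = SetAssociativeCache()
    for text in trace_lines:
        record = parse_lackey_line(text)
        if record is None or record[0] == "I":
            continue  # this sketch models a data cache only
        kind, addr, size = record
        cache.access(addr, size)
        if kind == "M":
            cache.access(addr, size)  # modify = load followed by store
    return cache
```

Any iterable of text lines works as input, so a saved trace file can be passed to `simulate` directly as an open file object.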
Re: [Valgrind-users] Cache conflict detection support in cachegrind
I have used lackey to get traces, which I have fed into a cache model to detect conflicts and such. You could also start with the lackey code and build the cache model into the tool (which a student of mine did at one point).

Regards - Eliot Moss
Re: [Valgrind-users] Cache conflict detection support in cachegrind
Can you please share the instructions on how to do it?

On Sun, Jan 29, 2023, 9:07 PM Eliot Moss wrote:
> I have used lackey to get traces, which I have fed into
> a cache model to detect conflicts and such. You could
> also start with the lackey code and build the cache model
> into the tool (which a student of mine did at one point).
>
> Regards - Eliot Moss
Re: [Valgrind-users] Cache conflict detection support in cachegrind
Hi Paul!

I read the info you provided, but none of these tools actually supports detecting cache conflicts.

Performance counters can detect cache misses, similar to cachegrind, but they cannot distinguish between misses caused by cache conflicts and other cache misses.

pahole is a tool with a completely different purpose, which is to detect padding in data structures. This isn't related to cache conflicts in any way.

DHAT provides useful information by allowing you to assess which data is accessed more frequently, but you need additional data to verify that the hot data is not evicted from the cache too soon.

On Sun, Jan 29, 2023 at 4:25 PM Paul Floyd wrote:
>
> On 29-01-23 14:31, Ivica B wrote:
> > Hi!
> >
> > I am looking for a tool that can detect cache conflicts, but I am not
> > finding any. There are a few that are mostly academic, and thus not
> > maintained. I think it is important for the performance analysis
> > community to have a tool that to some extent can detect cache
> > conflicts. Is it possible to implement support for detecting source
> > code lines where cache conflicts occur? More info on cache conflicts
> > below.
> >
> [snip]
>
> I agree that this is an interesting topic. If anyone else has ideas I'm
> all ears.
>
> My recommendations for this are:
>
> 1/ PMU/PMC (performance monitoring unit/counter) event counting tools
> (perf record on Linux, pmcstat on FreeBSD, Oracle Studio collect on
> Solaris, don't know for macOS). These can record events such as cache
> misses with the associated callstacks. You can then use tools such as
> HotSpot and perfgrind/kcachegrind (I have used HotSpot but not perfgrind).
>
> The big advantage of this is that the PMCs are part of the hardware and
> the overhead of doing this is minor. The only slight limitation is that
> the number of counters is limited.
>
> 2/ pahole
> https://github.com/acmel/dwarves
> A really nice binary analysis tool. It will analyze your binary (with
> debuginfo) and generate a report for all structures showing holes,
> padding and cache lines. It can even generate modified source with
> members reordered to improve the packing. However, as this is a static
> tool working only on the data structures, it knows nothing about your
> access patterns.
>
> 3/ DHAT
> One of the Valgrind tools. This profiles heap memory. If the block is
> less than 1k it will also generate a kind of ascii-html heat map. That
> map is an aggregate, but you can usually guess which offsets get hit the
> most together.
>
> Cachegrind doesn't really do this with the kind of accuracy that PMCs
> do. It has a reduced model of the cache and has a basic branch
> predictor. I don't know if or how speculative execution affects the
> cache hit rate, but Valgrind doesn't do any of that.
>
> A+
> Paul
Re: [Valgrind-users] Cache conflict detection support in cachegrind
On 2023-01-29, Paul Floyd wrote:
> My recommendations for this are:
>
> 1/ PMU/PMC (performance monitoring unit/counter) event counting tools
> (perf record on Linux, pmcstat on FreeBSD, Oracle Studio collect on
> Solaris, don't know for macOS). These can record events such as cache
> misses with the associated callstacks. You can then use tools such as
> HotSpot and perfgrind/kcachegrind (I have used HotSpot but not perfgrind).
>
> The big advantage of this is that the PMCs are part of the hardware and
> the overhead of doing this is minor. The only slight limitation is that
> the number of counters is limited.

Another disadvantage: the hardware does not know which accesses belong to the target code versus which accesses belong to the code of valgrind itself.

Even if the hardware could separate accesses on that basis, it does not know about stack frames. Allocating a stack frame shortly after CALL, and discarding it shortly before RETURN, can be significant reasons for cache misses, either immediately or in the near future.

Then there are system calls, which might significantly alter cache contents. Sometimes the resulting cache misses should be included (they most certainly do affect wall clock time), but in other cases you may wish that the operating system were ignored.

If the target program uses threads, then using memory for inter-thread communication (semaphore, mutex, pipeline, etc.) becomes another factor.
Re: [Valgrind-users] Cache conflict detection support in cachegrind
On 29-01-23 14:31, Ivica B wrote:
> Hi!
>
> I am looking for a tool that can detect cache conflicts, but I am not
> finding any. There are a few that are mostly academic, and thus not
> maintained. I think it is important for the performance analysis
> community to have a tool that to some extent can detect cache
> conflicts. Is it possible to implement support for detecting source
> code lines where cache conflicts occur? More info on cache conflicts
> below.

[snip]

I agree that this is an interesting topic. If anyone else has ideas I'm all ears.

My recommendations for this are:

1/ PMU/PMC (performance monitoring unit/counter) event counting tools (perf record on Linux, pmcstat on FreeBSD, Oracle Studio collect on Solaris, don't know for macOS). These can record events such as cache misses with the associated callstacks. You can then use tools such as HotSpot and perfgrind/kcachegrind (I have used HotSpot but not perfgrind).

The big advantage of this is that the PMCs are part of the hardware and the overhead of doing this is minor. The only slight limitation is that the number of counters is limited.

2/ pahole
https://github.com/acmel/dwarves
A really nice binary analysis tool. It will analyze your binary (with debuginfo) and generate a report for all structures showing holes, padding and cache lines. It can even generate modified source with members reordered to improve the packing. However, as this is a static tool working only on the data structures, it knows nothing about your access patterns.

3/ DHAT
One of the Valgrind tools. This profiles heap memory. If the block is less than 1k it will also generate a kind of ascii-html heat map. That map is an aggregate, but you can usually guess which offsets get hit the most together.

Cachegrind doesn't really do this with the kind of accuracy that PMCs do. It has a reduced model of the cache and has a basic branch predictor. I don't know if or how speculative execution affects the cache hit rate, but Valgrind doesn't do any of that.

A+
Paul
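To make the "holes and padding" that pahole reports concrete, here is a small Python sketch using the standard `ctypes` module to lay out two versions of the same structure. The structure and its member names are hypothetical, and this only imitates the kind of information pahole derives from debuginfo; it does not use pahole itself. Exact sizes depend on the platform ABI.

```python
import ctypes


class Padded(ctypes.Structure):
    # A 1-byte member before a double typically forces alignment padding.
    _fields_ = [
        ("flag", ctypes.c_char),
        ("value", ctypes.c_double),
        ("tag", ctypes.c_char),
    ]


class Reordered(ctypes.Structure):
    # Same members, largest first, so the small ones pack together.
    _fields_ = [
        ("value", ctypes.c_double),
        ("flag", ctypes.c_char),
        ("tag", ctypes.c_char),
    ]


def layout(struct):
    """Return ([(member, offset, size), ...], total size in bytes)."""
    members = []
    for name, _ctype in struct._fields_:
        field = getattr(struct, name)  # ctypes field descriptor
        members.append((name, field.offset, field.size))
    return members, ctypes.sizeof(struct)


if __name__ == "__main__":
    for struct in (Padded, Reordered):
        members, total = layout(struct)
        print(struct.__name__, "total", total, "bytes")
        for name, offset, size in members:
            print(f"  {name:6s} offset {offset:2d} size {size}")
```

As noted above, this is purely static information: it says how members are packed into cache lines, not which of them are accessed together.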
[Valgrind-users] Cache conflict detection support in cachegrind
Hi!

I am looking for a tool that can detect cache conflicts, but I am not finding any. There are a few that are mostly academic, and thus not maintained. I think it is important for the performance analysis community to have a tool that to some extent can detect cache conflicts. Is it possible to implement support for detecting source code lines where cache conflicts occur? More info on cache conflicts below.

=== What are cache conflicts? ===

A cache conflict happens when a cache line is brought from memory into the cache, but very soon has to be evicted back to main memory because another cache line is mapped to the same entry.

The problem with detecting cache conflicts is that it is normal for one cache line to be evicted because it is replaced by another cache line. Therefore, a cache conflict is an outlier: the cache line spent very little time in the cache before it got evicted.

=== How to detect cache conflicts? ===

As I said, there are a few scientific papers that talk about it, and there are probably a few different approaches to doing it.

One approach is to measure how long a cache line has been sitting in the cache before it got evicted. For each instruction that causes an eviction, we record how much time the evicted cache line spent in the cache. Next we build a statistic. Instructions that evict mostly short-lived cache lines are the ones where cache conflicts are most likely to happen.

Please comment!

Ivica
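The residency-time idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a list of hypothetical `(instruction_address, data_address)` pairs (for example, derived from an address trace), "time" is simply the index of the access in that list, and the cache is a generic set-associative LRU model whose geometry and function names are made up for the example. Two addresses compete for the same entry when `(address // line_bytes) % num_sets` is equal.

```python
from collections import OrderedDict, defaultdict
from statistics import median


def eviction_residencies(accesses, num_sets=64, ways=8, line_bytes=64):
    """Map each evicting instruction to the residency times of its victims.

    accesses: iterable of (instruction_address, data_address) pairs.
    Residency is measured in accesses between the fill and the eviction.
    """
    # Per set: cache line -> time it was filled, least recently used first.
    sets = [OrderedDict() for _ in range(num_sets)]
    residencies = defaultdict(list)
    for now, (pc, addr) in enumerate(accesses):
        line = addr // line_bytes
        lru = sets[line % num_sets]
        if line in lru:
            lru.move_to_end(line)  # hit: refresh LRU position, keep fill time
            continue
        if len(lru) >= ways:
            _victim, filled_at = lru.popitem(last=False)
            residencies[pc].append(now - filled_at)
        lru[line] = now
    return residencies


def rank_suspects(residencies):
    """Sort instructions by the median residency of the lines they evict.

    Returns (median_residency, eviction_count, instruction_address) tuples;
    instructions evicting mostly short-lived lines come first.
    """
    return sorted(
        (median(times), len(times), pc) for pc, times in residencies.items()
    )
```

For example, with a tiny direct-mapped cache (`num_sets=2, ways=1`), an instruction that alternates between addresses 0 and 128 evicts a line on every access after the first, and each victim has lived for only one access, which is exactly the outlier pattern described above. What counts as "short-lived" in a real workload would still have to be judged against the overall residency distribution.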