Hi Gregg,

On Tue, Mar 31, 2015 at 2:31 PM, Brendan Gregg <brendan.d.gr...@gmail.com> wrote:
>
> On Tue, Mar 31, 2015 at 12:33 AM, Brendan Gregg
> <brendan.d.gr...@gmail.com> wrote:
> > G'Day Stephane,
> >
> > On Mon, Mar 30, 2015 at 3:19 PM, Stephane Eranian <eran...@google.com>
> > wrote:
> > [...]
> >> The current support only works when the runtime is monitored from
> >> start to finish: perf record java --agentpath:libpfmjvmti.so my_class.
> >>
> >> Once the run is completed, the jitdump file needs to be injected into
> >> the perf.data file. This is accomplished by using the perf inject command.
> >> This will also generate an ELF image for each jitted function. The
> >> inject MMAP records will point to those ELF images. The reasoning
> >> behind using ELF images is that it makes processing for perf report
> >> and annotate automatic and transparent. It also makes it easier to
> >> package and analyze on a remote machine.
> > [...]
> >
> > This is really impressive work. Do we have an idea of the overhead for
> > running the java agent?
> >
Thanks Gregg. Happy to see you find these patches useful. I think with
PeterZ's latest clock changes, things are easier to run now.

> > Today, I'm using perf-map-agent, loaded dynamically, to dump a
> > /tmp/perf*.map file as needed. My company has tens of thousands of
> > Linux instances running Java, but very few need profiling, and we
> > don't know which beforehand. So a snapshot-on-demand approach is
> > ideal. An always-on approach, well, we'd have to know the overhead (I
> > can build the agent and test...).
>
> I built the agent and tested with an application framework
> micro-benchmark, and saw the performance overhead drop after start
> from about 13% initially (measured as a reduction in maximum req/sec
> given fixed CPU capacity), to 1.1% after a minute, and then 0.13%
> (which is really just noise) after several minutes of high load.
>
If your JIT runtime does not keep recompiling, then yes, I expect the
overhead to be concentrated at startup and each time a new function is
executed. After that, no callback is really needed. And this is what you
observed.

>
> So the overhead is basically zero after (minutes of) warmup, at least
> for my test. My jit.dump file reached 8 Mbytes, and was growing by a
> tiny amount every 30 seconds or so (hence the near-zero overhead). I'm
> much less concerned about overheads now.
>
> I'll test with a production workload if I can... But I'm still curious
> about why we're even doing this, instead of the previous method of
> taking symbol snapshots. Is there a backstory? If it involves a case
> of high symbol churn, then this should also mean non-zero overhead to
> constantly log.
>
Yes, so either you have the JIT runtime activate that agent from startup,
or we need to have a mechanism to kick the agent when perf is running.

As for the fsync() question, yes, there is a race between JIT runtime
startup and dumping into the jitdump and perf inject.
One thing I will add is locking on the inject side, to make sure inject
reads a sane file (without truncated records). The layout of the jitdump is
such that it does not hold the number of records in the file. Inject just
reads until EOF, so that should be okay with locks.

If you run perf inject, then you are done with the collection. Pipe mode is
still not operational; I will look at it next. Hopefully we can also make it
work with the jitdump file.
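
For illustration only, here is a minimal sketch of the "read until EOF
under a lock" idea. It is not the perf inject code, and the record layout
is made up (a 4-byte little-endian length followed by the payload), not the
actual jitdump format. The names HEADER, append_record and read_records are
hypothetical. The writer takes an exclusive flock() per record, and the
reader takes a shared flock(), reads records until EOF, and drops a
truncated tail instead of trusting it.

```python
import fcntl
import struct

# Hypothetical record framing, NOT the real jitdump layout:
# a 4-byte little-endian payload length, then the payload bytes.
HEADER = struct.Struct("<I")


def append_record(path, payload):
    """Append one record while holding an exclusive advisory lock."""
    with open(path, "ab") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.write(HEADER.pack(len(payload)) + payload)
            f.flush()  # push the whole record out before unlocking
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def read_records(path):
    """Read records until EOF under a shared lock.

    The file carries no record count, so the reader simply loops until
    it hits EOF. A short header or short payload is treated as a
    truncated tail and ignored.
    """
    records = []
    with open(path, "rb") as f:
        fcntl.flock(f, fcntl.LOCK_SH)
        try:
            while True:
                header = f.read(HEADER.size)
                if len(header) < HEADER.size:
                    break  # clean EOF, or a truncated header
                (length,) = HEADER.unpack(header)
                payload = f.read(length)
                if len(payload) < length:
                    break  # truncated record body
                records.append(payload)
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return records
```

With cooperating lockers the reader never sees a half-written record. The
truncation checks still matter if the writer does not take the lock or dies
mid-write.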