[jira] [Commented] (LUCENE-10099) Add -Ptests.asyncprofile option.
[ https://issues.apache.org/jira/browse/LUCENE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416975#comment-17416975 ] Mark Robert Miller commented on LUCENE-10099: - Okay, this actually seems to be working as intended, it's just those native methods for some of these quick tests are just what happens to be on top. I tried out linking to the .so lib built in agent and the result is not really any different, it's just a bit harder to control some things. You can see the correct Java methods showing up in runs though. Some of what comes out will depend on the sampling rate, currently set to 10ms: {noformat} PROFILE SUMMARY from 140 events (total: 140) tests.profile.mode=cpu tests.profile.count=10 tests.profile.stacksize=1 tests.profile.linenumbers=false PERCENT CPU SAMPLES STACK 5.00% 7 #SymbolTable::lookup_dynamic() 2.86% 4 #IntervalWalker::walk_to() 2.14% 3 #no_Java_frame() 2.14% 3 #ValueStack::values_do() 1.43% 2 org.apache.lucene.store.ByteBuffersDataOutput#() 1.43% 2 #I2C/C2I adapters() 1.43% 2 java.util.Formatter$FormatSpecifier#print() 1.43% 2 #copy_user_generic_string() 1.43% 2 #vframe::sender() 0.71% 1 #ConstantPool::impl_klass_ref_index_at() {noformat} > Add -Ptests.asyncprofile option. > > > Key: LUCENE-10099 > URL: https://issues.apache.org/jira/browse/LUCENE-10099 > Project: Lucene - Core > Issue Type: Test > Components: modules/test-framework >Reporter: Mark Robert Miller >Assignee: Mark Robert Miller >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > Man, I don't even remember the last time I contributed an issue to Lucene > that was for Lucene. Feels like it has to be near a decade. > I recently finished some work that took quite an investment around automated > performance aids. It wasn't giving me that feeling of satisfaction I faintly > remember though. Yesterday, I was mostly feeling nostalgic about past times > when I would bring up some nasty inefficiency or ugly problem and not hear > about how it doesn't likely matter or doesn't sound right or 'meh' - I'd hear > "omg, you're right, this must be fixed" and then "you're an idiot, it's even > worse than that". > So new little project for today - the first of two things I'd like to > contribute - async profiler support for the tests.profile option. > I have a rough draft, it works roughly as tests.profile but instead of: > {noformat} > PROFILE SUMMARY from 174 events (total: 174) > tests.profile.mode=cpu > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT CPU SAMPLES STACK > 5.17% 9 java.util.zip.Inflater#inflateBytesBytes() > 3.45% 6 java.io.FileInputStream#open0() > 2.30% 4 java.io.UnixFileSystem#getBooleanAttributes0() > 2.30% 4 java.io.RandomAccessFile#readBytes() > 1.72% 3 java.io.UnixFileSystem#normalize() > 1.72% 3 > org.apache.lucene.util.automaton.RunAutomaton#() > 1.72% 3 java.util.Arrays#fill() > 1.15% 2 > java.lang.invoke.DirectMethodHandle$Holder#invokeStatic() > 1.15% 2 java.lang.StringLatin1#charAt() > 1.15% 2 java.io.DataOutputStream#writeUTF() > {noformat} > you get more like: > {noformat} > PROFILE SUMMARY from 368675 events (total: 368675) > tests.profile.mode=cpu > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT CPU SAMPLES STACK > 3.89% 14353 #SymbolTable::lookup_dynamic() > 1.67% 6171 #Interpreter() > 1.40% 5175 #InstanceKlass::find_method_index() > 1.34% 4958 #__tls_get_addr() > 1.13% 4177 #I2C/C2I adapters() > 1.09% 4022 #resource_allocate_bytes() > 0.69% 2528 #ValueStack::values_do() > 0.62% 2298 #no_Java_frame() > 0.62% 2281 #ScopeDesc::decode_body() > 0.55% 2017 #Dictionary::find() > {noformat} > as well as these additional files shown below. Sampling was a bit intense I > guess, sorry for the slow load on the flame graph ;) > Summary > https://elated-banach-e02e8b.netlify.app/asyncprofiler/hotspot-pid-686723-id-0-20210912-162040.txt > Results Tree > https://elated-banach-e02e8b.netlify.app/asyncprofiler/tree-hotspot-pid-686723-id-0-20210912-162044.html > Results Flame Graph > https://elated-banach-e02e8b.netlify.app/asyncprofiler/flame-hotspot-
[jira] [Commented] (LUCENE-10099) Add -Ptests.asyncprofile option.
[ https://issues.apache.org/jira/browse/LUCENE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415268#comment-17415268 ] Mark Robert Miller commented on LUCENE-10099: - {quote}Will it help this profiler? {quote} I think the reason the output above is not listing proper java methods may be how I am plugging in a custom agent (trying to keep the ability to allow filtering by java thread properties). I may need to just use the agent built into the .so library. > Add -Ptests.asyncprofile option. > > > Key: LUCENE-10099 > URL: https://issues.apache.org/jira/browse/LUCENE-10099 > Project: Lucene - Core > Issue Type: Test > Components: modules/test-framework >Reporter: Mark Robert Miller >Assignee: Mark Robert Miller >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > Man, I don't even remember the last time I contributed an issue to Lucene > that was for Lucene. Feels like it has to be near a decade. > I recently finished some work that took quite an investment around automated > performance aids. It wasn't giving me that feeling of satisfaction I faintly > remember though. Yesterday, I was mostly feeling nostalgic about past times > when I would bring up some nasty inefficiency or ugly problem and not hear > about how it doesn't likely matter or doesn't sound right or 'meh' - I'd hear > "omg, you're right, this must be fixed" and then "you're an idiot, it's even > worse than that". > So new little project for today - the first of two things I'd like to > contribute - async profiler support for the tests.profile option. > I have a rough draft, it works roughly as tests.profile but instead of: > {noformat} > PROFILE SUMMARY from 174 events (total: 174) > tests.profile.mode=cpu > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT CPU SAMPLES STACK > 5.17% 9 java.util.zip.Inflater#inflateBytesBytes() > 3.45% 6 java.io.FileInputStream#open0() > 2.30% 4 java.io.UnixFileSystem#getBooleanAttributes0() > 2.30% 4 java.io.RandomAccessFile#readBytes() > 1.72% 3 java.io.UnixFileSystem#normalize() > 1.72% 3 > org.apache.lucene.util.automaton.RunAutomaton#() > 1.72% 3 java.util.Arrays#fill() > 1.15% 2 > java.lang.invoke.DirectMethodHandle$Holder#invokeStatic() > 1.15% 2 java.lang.StringLatin1#charAt() > 1.15% 2 java.io.DataOutputStream#writeUTF() > {noformat} > you get more like: > {noformat} > PROFILE SUMMARY from 368675 events (total: 368675) > tests.profile.mode=cpu > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT CPU SAMPLES STACK > 3.89% 14353 #SymbolTable::lookup_dynamic() > 1.67% 6171 #Interpreter() > 1.40% 5175 #InstanceKlass::find_method_index() > 1.34% 4958 #__tls_get_addr() > 1.13% 4177 #I2C/C2I adapters() > 1.09% 4022 #resource_allocate_bytes() > 0.69% 2528 #ValueStack::values_do() > 0.62% 2298 #no_Java_frame() > 0.62% 2281 #ScopeDesc::decode_body() > 0.55% 2017 #Dictionary::find() > {noformat} > as well as these additional files shown below. Sampling was a bit intense I > guess, sorry for the slow load on the flame graph ;) > Summary > https://elated-banach-e02e8b.netlify.app/asyncprofiler/hotspot-pid-686723-id-0-20210912-162040.txt > Results Tree > https://elated-banach-e02e8b.netlify.app/asyncprofiler/tree-hotspot-pid-686723-id-0-20210912-162044.html > Results Flame Graph > https://elated-banach-e02e8b.netlify.app/asyncprofiler/flame-hotspot-pid-686723-id-0-20210912-162040.html > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10099) Add -Ptests.asyncprofile option.
[ https://issues.apache.org/jira/browse/LUCENE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415266#comment-17415266 ] Mark Robert Miller commented on LUCENE-10099: - Looks like the other item I was going to add doesn’t so much fit. I also wanted to add support for jmhs perfasm, which pulls out code region hotspots from perf and lines it up with assembly. It would be some dancing around to get it fully nice and integrated compared to this issue though, and more importantly doesn’t really seem to be that useful with these tests - it’s much more oriented towards running a single benchmark than tests. > Add -Ptests.asyncprofile option. > > > Key: LUCENE-10099 > URL: https://issues.apache.org/jira/browse/LUCENE-10099 > Project: Lucene - Core > Issue Type: Test > Components: modules/test-framework >Reporter: Mark Robert Miller >Assignee: Mark Robert Miller >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > Man, I don't even remember the last time I contributed an issue to Lucene > that was for Lucene. Feels like it has to be near a decade. > I recently finished some work that took quite an investment around automated > performance aids. It wasn't giving me that feeling of satisfaction I faintly > remember though. Yesterday, I was mostly feeling nostalgic about past times > when I would bring up some nasty inefficiency or ugly problem and not hear > about how it doesn't likely matter or doesn't sound right or 'meh' - I'd hear > "omg, you're right, this must be fixed" and then "you're an idiot, it's even > worse than that". > So new little project for today - the first of two things I'd like to > contribute - async profiler support for the tests.profile option. > I have a rough draft, it works roughly as tests.profile but instead of: > {noformat} > PROFILE SUMMARY from 174 events (total: 174) > tests.profile.mode=cpu > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT CPU SAMPLES STACK > 5.17% 9 java.util.zip.Inflater#inflateBytesBytes() > 3.45% 6 java.io.FileInputStream#open0() > 2.30% 4 java.io.UnixFileSystem#getBooleanAttributes0() > 2.30% 4 java.io.RandomAccessFile#readBytes() > 1.72% 3 java.io.UnixFileSystem#normalize() > 1.72% 3 > org.apache.lucene.util.automaton.RunAutomaton#() > 1.72% 3 java.util.Arrays#fill() > 1.15% 2 > java.lang.invoke.DirectMethodHandle$Holder#invokeStatic() > 1.15% 2 java.lang.StringLatin1#charAt() > 1.15% 2 java.io.DataOutputStream#writeUTF() > {noformat} > you get more like: > {noformat} > PROFILE SUMMARY from 368675 events (total: 368675) > tests.profile.mode=cpu > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT CPU SAMPLES STACK > 3.89% 14353 #SymbolTable::lookup_dynamic() > 1.67% 6171 #Interpreter() > 1.40% 5175 #InstanceKlass::find_method_index() > 1.34% 4958 #__tls_get_addr() > 1.13% 4177 #I2C/C2I adapters() > 1.09% 4022 #resource_allocate_bytes() > 0.69% 2528 #ValueStack::values_do() > 0.62% 2298 #no_Java_frame() > 0.62% 2281 #ScopeDesc::decode_body() > 0.55% 2017 #Dictionary::find() > {noformat} > as well as these additional files shown below. Sampling was a bit intense I > guess, sorry for the slow load on the flame graph ;) > Summary > https://elated-banach-e02e8b.netlify.app/asyncprofiler/hotspot-pid-686723-id-0-20210912-162040.txt > Results Tree > https://elated-banach-e02e8b.netlify.app/asyncprofiler/tree-hotspot-pid-686723-id-0-20210912-162044.html > Results Flame Graph > https://elated-banach-e02e8b.netlify.app/asyncprofiler/flame-hotspot-pid-686723-id-0-20210912-162040.html > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10099) Add -Ptests.asyncprofile option.
[ https://issues.apache.org/jira/browse/LUCENE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415264#comment-17415264 ] Mark Robert Miller commented on LUCENE-10099: - Their wiki has some info on that: “CPU profiling In this mode profiler collects stack trace samples that include Java methods, native calls, JVM code and kernel functions. The general approach is receiving call stacks generated by perf_events and matching them up with call stacks generated by AsyncGetCallTrace, in order to produce an accurate profile of both Java and native code. Additionally, async-profiler provides a workaround to recover stack traces in some corner cases where AsyncGetCallTrace fails. This approach has the following advantages compared to using perf_events directly with a Java agent that translates addresses to Java method names: Works on older Java versions because it doesn't require -XX:+PreserveFramePointer, which is only available in JDK 8u60 and later. Does not introduce the performance overhead from -XX:+PreserveFramePointer, which can in rare cases be as high as 10%. Does not require generating a map file to map Java code addresses to method names. Works with interpreter frames. Does not require writing out a perf.data file for further processing in user space scripts. If you wish to resolve frames within libjvm, the debug symbols are required.” > Add -Ptests.asyncprofile option. > > > Key: LUCENE-10099 > URL: https://issues.apache.org/jira/browse/LUCENE-10099 > Project: Lucene - Core > Issue Type: Test > Components: modules/test-framework >Reporter: Mark Robert Miller >Assignee: Mark Robert Miller >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > Man, I don't even remember the last time I contributed an issue to Lucene > that was for Lucene. Feels like it has to be near a decade. > I recently finished some work that took quite an investment around automated > performance aids. It wasn't giving me that feeling of satisfaction I faintly > remember though. Yesterday, I was mostly feeling nostalgic about past times > when I would bring up some nasty inefficiency or ugly problem and not hear > about how it doesn't likely matter or doesn't sound right or 'meh' - I'd hear > "omg, you're right, this must be fixed" and then "you're an idiot, it's even > worse than that". > So new little project for today - the first of two things I'd like to > contribute - async profiler support for the tests.profile option. > I have a rough draft, it works roughly as tests.profile but instead of: > {noformat} > PROFILE SUMMARY from 174 events (total: 174) > tests.profile.mode=cpu > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT CPU SAMPLES STACK > 5.17% 9 java.util.zip.Inflater#inflateBytesBytes() > 3.45% 6 java.io.FileInputStream#open0() > 2.30% 4 java.io.UnixFileSystem#getBooleanAttributes0() > 2.30% 4 java.io.RandomAccessFile#readBytes() > 1.72% 3 java.io.UnixFileSystem#normalize() > 1.72% 3 > org.apache.lucene.util.automaton.RunAutomaton#() > 1.72% 3 java.util.Arrays#fill() > 1.15% 2 > java.lang.invoke.DirectMethodHandle$Holder#invokeStatic() > 1.15% 2 java.lang.StringLatin1#charAt() > 1.15% 2 java.io.DataOutputStream#writeUTF() > {noformat} > you get more like: > {noformat} > PROFILE SUMMARY from 368675 events (total: 368675) > tests.profile.mode=cpu > tests.profile.count=10 > tests.profile.stacksize=1 > tests.profile.linenumbers=false > PERCENT CPU SAMPLES STACK > 3.89% 14353 #SymbolTable::lookup_dynamic() > 1.67% 6171 #Interpreter() > 1.40% 5175 #InstanceKlass::find_method_index() > 1.34% 4958 #__tls_get_addr() > 1.13% 4177 #I2C/C2I adapters() > 1.09% 4022 #resource_allocate_bytes() > 0.69% 2528 #ValueStack::values_do() > 0.62% 2298 #no_Java_frame() > 0.62% 2281 #ScopeDesc::decode_body() > 0.55% 2017 #Dictionary::find() > {noformat} > as well as these additional files shown below. Sampling was a bit intense I > guess, sorry for the slow load on the flame graph ;) > Summary > https://elated-banach-e02e8b.netlify.app/asyncprofiler/hotspot-pid-686723-id-0-20210912-162040.txt > Results Tree > https://elated-banach-e02e8b.netlify.app/asyncprofiler/tree-hotspot-pid-686723-id-0-20210912-162044.html > Results Flame Graph > https://elated-banach-e02e8b.netlify.app/asyncprofiler/flame-hotspot-pid-686723-id-0-2021091