[ https://issues.apache.org/jira/browse/SOLR-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397014#comment-17397014 ]
Mark Robert Miller commented on SOLR-15560:
-------------------------------------------

I'm mixing in a few things beyond JIT optimization (encode has other bottlenecks that make that mostly irrelevant), so I'm renaming the issue to just reference JavaBinCodec perf improvements.

> Look into JIT optimization in JavaBin.
> --------------------------------------
>
>                 Key: SOLR-15560
>                 URL: https://issues.apache.org/jira/browse/SOLR-15560
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Mark Robert Miller
>            Assignee: Mark Robert Miller
>            Priority: Minor
>         Attachments: javabin.decode.1.before.json,
> javabin.decode.2.after.json, javabin.decode.before.and.after.compare.png,
> javabin.decode.before.and.after.summary.png
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Javabin performance can be pretty impactful on search side scatter / gather
> and especially the /export handler.
>
> It turns out that the large switch JavaBin uses to dispatch on the type is a
> hot spot, and the method containing it is too large to be inlined.
> You can pull some less common paths out into another method to address this.
>
> I have not benchmarked this yet, and it’s possible other bottlenecks may
> dampen the win, but I noticed the following on ref branch (with a couple of
> other optimizations that were not nearly as wide-reaching or quite as hot):
>
> When you run the tests, you get the best results in “client” mode, e.g. when
> you prevent the C2 compiler from kicking in. Let’s say I could run the core
> nightly tests serially on my laptop in about 8 minutes with C1; C2 might
> take another 2 to 3 minutes on top. This is because the work it does
> optimizing, compiling and uncompiling on such a diverse task ends up being
> the dominant performance drag.
>
> With a bit of key optimization here, running the tests with C2 ends up about
> on par with stopping at C1, even though C2 still dominates everything else.
> That’s a pretty impactful win, to be able to move the needle like that.
>
> Why such a win on C2 without C1 also moving forward? It’s much more
> manageable to reduce the bytecode of a non-inlined hot method below C2’s
> size threshold for inlining than below C1’s.
>
> So this should be a decent win, I hope. There are a variety of differences
> that may outweigh it though.
> * javabin on master has tail recursion.
> * generates a tremendous number of byte arrays
> * converts between UTF-8 and UTF-16
> * manually does the encoding (the JVM can cheat)
> * Has a number of classes that extend it (vs 1 here)
> * lots of other things
>
> I’m optimistic we can see some gain though.
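
The method-splitting idea described in the issue can be sketched roughly as follows. This is only an illustration, not the actual JavaBinCodec code: the class name, the tag constants and the wire layout are all hypothetical. The common tags stay in a small hot method, and everything else is pushed into a second method, so the hot method's bytecode has a better chance of staying under the JIT's inlining size limits (in HotSpot these are governed by flags such as -XX:MaxInlineSize and -XX:FreqInlineSize).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class SplitDispatchSketch {
    // Hypothetical tag values for illustration; NOT the real JavaBin type tags.
    public static final byte INT = 1, LONG = 2, STRING = 3, LIST = 4, NULL = 5;

    // Hot path: handles only the most common tags so the method stays small.
    public static Object readValue(ByteBuffer in) {
        byte tag = in.get();
        switch (tag) {
            case INT:
                return in.getInt();
            case LONG:
                return in.getLong();
            default:
                // Everything less common is delegated to a separate method.
                return readRareValue(tag, in);
        }
    }

    // Cold path: bulkier cases live here and do not count toward the size
    // of readValue when the JIT decides whether to inline it.
    public static Object readRareValue(byte tag, ByteBuffer in) {
        switch (tag) {
            case STRING: {
                byte[] bytes = new byte[in.getInt()];
                in.get(bytes);
                return new String(bytes, StandardCharsets.UTF_8);
            }
            case LIST: {
                int size = in.getInt();
                List<Object> list = new ArrayList<>(size);
                for (int i = 0; i < size; i++) {
                    list.add(readValue(in));
                }
                return list;
            }
            case NULL:
                return null;
            default:
                throw new IllegalArgumentException("Unknown tag: " + tag);
        }
    }

    public static void main(String[] args) {
        byte[] text = "hi".getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(32);
        buf.put(INT).putInt(42);
        buf.put(STRING).putInt(text.length).put(text);
        buf.flip();
        System.out.println(readValue(buf)); // common tag, handled in readValue
        System.out.println(readValue(buf)); // rare tag, handled in readRareValue
    }
}
```

Whether a given method is actually inlined depends on the JVM version and its flags; running with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining is one way to check what HotSpot decided for a particular method.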