[ https://issues.apache.org/jira/browse/LUCENE-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402203#comment-17402203 ]
Greg Miller commented on LUCENE-9918:
-------------------------------------

[~gworah] could be worth trying. I wasn't aware of the concept of compiling/optimizing multiple times, but I'm no expert in this space. My understanding was that the decision to compile would happen once, and the code would get compiled. I was able to verify that the method under test is getting compiled by HotSpot, but you're suggesting that with more iterations it might get compiled into something different?

It wouldn't be too hard to try if you'd like to pull down the branch I was working against. For reference, it's set up to run 10 warmup iterations before running the test for 10 iterations:
https://github.com/gsmiller/lucene-pfor-benchmark/blob/main/src/main/java/gsmiller/DecodeBenchmark.java

> Can PForUtil be further auto-vectorized?
> ----------------------------------------
>
>                 Key: LUCENE-9918
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9918
>             Project: Lucene - Core
>          Issue Type: Task
>          Components: core/codecs
>    Affects Versions: main (9.0)
>            Reporter: Greg Miller
>            Priority: Minor
>
> While working on LUCENE-9850, we discovered the loop in PForUtil::prefixSumOf
> is not getting auto-vectorized by the HotSpot compiler. We tried a few
> different tweaks to see if we could change this, but came up empty. There are
> some additional suggestions in the related
> [PR|https://github.com/apache/lucene/pull/69#discussion_r608412309] that
> could still be experimented with, and it may be worth doing so to see if
> further improvements could be squeezed out.
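
For readers who want to try the "more warmup" experiment outside the linked benchmark, here is a minimal, self-contained sketch. It is a hand-rolled timing loop, not the harness the linked DecodeBenchmark uses, and the class name, method names, array size, and invocation counts are all hypothetical. The prefixSum method is a plain scalar prefix sum written for illustration, not Lucene's PForUtil::prefixSumOf. Each element depends on the previous one, which is the kind of loop-carried dependency that generally makes auto-vectorization hard; whether that is the actual reason HotSpot declines to vectorize the Lucene loop is not established in this thread.

```java
import java.util.Arrays;

public class PrefixSumWarmupSketch {

    // Illustrative scalar prefix sum (NOT Lucene's implementation):
    // arr[i] becomes base + arr[0] + ... + arr[i].
    static void prefixSum(long[] arr, long base) {
        if (arr.length == 0) {
            return;
        }
        arr[0] += base;
        for (int i = 1; i < arr.length; i++) {
            arr[i] += arr[i - 1]; // depends on the previous iteration's result
        }
    }

    // Runs one "iteration": many invocations on fresh copies of the input.
    // Returns elapsed nanoseconds; the sink keeps the JIT from discarding the work.
    static long runIteration(long[] template, long[] scratch, int invocations, long[] sink) {
        long start = System.nanoTime();
        for (int n = 0; n < invocations; n++) {
            System.arraycopy(template, 0, scratch, 0, template.length);
            prefixSum(scratch, n);
            sink[0] += scratch[scratch.length - 1];
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Hypothetical defaults: 10 warmup and 10 measured iterations.
        int warmupIterations = args.length > 0 ? Integer.parseInt(args[0]) : 10;
        int measuredIterations = 10;
        int invocations = 100_000; // arbitrary batch size per iteration

        long[] template = new long[128]; // arbitrary input size
        Arrays.fill(template, 1L);
        long[] scratch = new long[template.length];
        long[] sink = new long[1];

        for (int i = 0; i < warmupIterations; i++) {
            runIteration(template, scratch, invocations, sink);
        }
        for (int i = 0; i < measuredIterations; i++) {
            long nanos = runIteration(template, scratch, invocations, sink);
            System.out.println("iteration " + i + ": " + (nanos / (double) invocations) + " ns/op");
        }
        System.out.println("sink=" + sink[0]);
    }
}
```

Passing a larger first argument (for example 50 instead of 10) raises the warmup count, so the measured numbers can be compared across runs. With tiered compilation, HotSpot can compile the same method more than once at different optimization levels; running with the standard `-XX:+PrintCompilation` flag shows each compilation event for `prefixSum`, which is one way to check whether a later recompilation actually happens during the longer warmup.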