See this already-open issue: https://issues.apache.org/jira/browse/LUCENE-9480
No threadlocals are necessary. DataInput/IndexInput are already intended for use by one thread. For e.g. mmapdirectory this should be "+=". no buffer is required. On Wed, Feb 17, 2021 at 4:15 PM Greg Miller <gsmil...@gmail.com> wrote: > Hi folks- > > I work on a Lucene-based search system and we recently added Java Flight > Recorder to our benchmark tooling. When looking through results, we found > DataInput#skipBytes() to be a top contributor to garbage creation. We're > using Lucene84SkipReader and always skipping over Impacts in our use-case. > At first glance, it appeared pretty obvious that creating new instances of > the skipBuffer byte[] for each instance of DataInput was the culprit. > > It looks like alternatives were discussed originally in LUCENE-5583 > <https://issues.amazon.com/issues/LUCENE-4947>, one of which being a > thread-local implementation of the skip buffer (since it can't be a static > field without breaking delegating subclasses, like ChecksumIndexInput). At > the time, a thread-local was advised against > <https://issues.apache.org/jira/browse/LUCENE-5583?focusedCommentId=13964258&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13964258> > by > Uwe due to GC expense, but in our benchmarks, bringing in a thread-local > implementation reduced overall GC time by ~7%. > > I'd like to revisit this implementation decision and discuss ways in which > we can reduce this unnecessary garbage creation. It seems like moving to a > thread-local implementation is a win here, but I'd love to hear more > thoughts or alternative suggestions from the group. I'm new to this > community, so I'm not sure the best way to proceed. Should I open a Jira > issue as a next step? Thanks in advance! > > Cheers, > -Greg >