See this already-open issue:
https://issues.apache.org/jira/browse/LUCENE-9480

No threadlocals are necessary. DataInput/IndexInput are already intended
for use by one thread.

For e.g. mmapdirectory this should be "+=". no buffer is required.

On Wed, Feb 17, 2021 at 4:15 PM Greg Miller <gsmil...@gmail.com> wrote:

> Hi folks-
>
> I work on a Lucene-based search system and we recently added Java Flight
> Recorder to our benchmark tooling. When looking through results, we found
> DataInput#skipBytes() to be a top contributor to garbage creation. We're
> using Lucene84SkipReader and always skipping over Impacts in our use-case.
> At first glance, it appeared pretty obvious that creating new instances of
> the skipBuffer byte[] for each instance of DataInput was the culprit.
>
> It looks like alternatives were discussed originally in LUCENE-5583
> <https://issues.amazon.com/issues/LUCENE-4947>, one of which being a
> thread-local implementation of the skip buffer (since it can't be a static
> field without breaking delegating subclasses, like ChecksumIndexInput). At
> the time, a thread-local was advised against
> <https://issues.apache.org/jira/browse/LUCENE-5583?focusedCommentId=13964258&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13964258>
>  by
> Uwe due to GC expense, but in our benchmarks, bringing in a thread-local
> implementation reduced overall GC time by ~7%.
>
> I'd like to revisit this implementation decision and discuss ways in which
> we can reduce this unnecessary garbage creation. It seems like moving to a
> thread-local implementation is a win here, but I'd love to hear more
> thoughts or alternative suggestions from the group. I'm new to this
> community, so I'm not sure the best way to proceed. Should I open a Jira
> issue as a next step? Thanks in advance!
>
> Cheers,
> -Greg
>

Reply via email to