Thanks Robert for the detailed response. Let me try to address a few points
inline if I may...

There is zero concurrency issue. I think you are reading discussion about
old patches that didn't get committed, ignore that and look at the code.
Look at DataInput javadoc: "DataInput may only be used from one thread,
because it is not thread safe (it keeps internal state like file position)."
Look at IndexInput javadoc: "IndexInput may only be used from one thread,
because it is not thread safe (it keeps internal state like file position)."
This applies to ChecksumIndexInput, too, it is a subclass.


Right, I am looking at the code but maybe we're talking about two
different things, so let me clarify. I agree that there is no concurrency
issue with the current code and I apologise if my point was confusing. The
reason skipBytes was made an instance variable as opposed to a static one
was to *avoid *creating a concurrency issue (which certainly would exist if
it had been made static). Making it an instance variable is wasteful for GC
though, no? My suggestion of moving to a threadlocal hits a "happy medium"
where we're not allocating these silly buffers for each DataInput instance
but making sure each thread has a separate one. Does this make more sense
now?

Let's try to avoid adding more complex logic to the situation. Base classes
like DataInput shouldn't have "slow" methods that do things like read bytes
and copy them around just to increment a position pointer... that is not
good. It is supposed to just have some simple decoding helpers (like vint
decode). So it is at the wrong "level" to have such a method, as it can't
do it in an efficient way.


Sure, agree. But the more I think about the Jira you pointed me to, the
more I think we're talking about two separate problems (that could
certainly have a common solution). I agree that the current approach of
skipping bytes in general is wasteful and I don't disagree that it
shouldn't be in DataInput (or that DataInput and IndexInput should be
collapsed). Seems right. But... there's a potential needle-moving change by
just reducing garbage creation here. I just want to make sure we don't let
perfection block progress here.

A good start would be to rename DataInput.skipBytes() to a deprecated
DataInput.skipBytesSlowly() and add a new abstract DataInput.skipBytes().
Now you just have to implement skipBytes() with all the subclasses, but you
can always start with skipBytesSlowly() // TODO: fix this, so it allows
incremental progress. For IndexInput, you can make skipBytes() just call
seek(), that is an easy win. ByteArrayDataInput is already good to go,
perhaps it is the only one with a correct implementation :)

I like this thought. Let me see if I can run with your suggestion and come
up with an incremental approach to reap some short-term GC gains while
moving in a better direction overall. Just need to noodle on it a bit...

Thanks again for the discussion!

Cheers,
-Greg

On Wed, Feb 17, 2021 at 3:34 PM Robert Muir <rcm...@gmail.com> wrote:

> There is zero concurrency issue. I think you are reading discussion about
> old patches that didn't get committed, ignore that and look at the code.
> Look at DataInput javadoc: "DataInput may only be used from one thread,
> because it is not thread safe (it keeps internal state like file position)."
> Look at IndexInput javadoc: "IndexInput may only be used from one thread,
> because it is not thread safe (it keeps internal state like file position)."
> This applies to ChecksumIndexInput, too, it is a subclass.
>
> Let's try to avoid adding more complex logic to the situation. Base
> classes like DataInput shouldn't have "slow" methods that do things like
> read bytes and copy them around just to increment a position pointer...
> that is not good. It is supposed to just have some simple decoding helpers
> (like vint decode). So it is at the wrong "level" to have such a method, as
> it can't do it in an efficient way.
>
> A good start would be to rename DataInput.skipBytes() to a deprecated
> DataInput.skipBytesSlowly() and add a new abstract DataInput.skipBytes().
> Now you just have to implement skipBytes() with all the subclasses, but you
> can always start with skipBytesSlowly() // TODO: fix this, so it allows
> incremental progress. For IndexInput, you can make skipBytes() just call
> seek(), that is an easy win. ByteArrayDataInput is already good to go,
> perhaps it is the only one with a correct implementation :)
>
>
> On Wed, Feb 17, 2021 at 5:10 PM Greg Miller <gsmil...@gmail.com> wrote:
>
>> Thanks for pointing me to the existing issue! I think I generally agree
>> that no threadlocal would be needed if the functionality of skipBytes() and
>> seek() were collapsed as per the issue you pointed me to. In the current
>> state though, concurrency causes problems for delegating implementations of
>> DataInput, specifically ChecksumIndexInput (as detailed in LUCENE-5583
>> <https://issues.apache.org/jira/browse/LUCENE-5583>).
>>
>> I'll spend a little more time thinking about LUCENE-9480 and comment
>> there. Seems like a better approach to eliminate the need for reading in
>> bytes just to skip, assuming it can be made to play nice with checksums,
>> etc. Thanks again!
>>
>> Cheers,
>> -Greg
>>
>> On Wed, Feb 17, 2021 at 1:47 PM Robert Muir <rcm...@gmail.com> wrote:
>>
>>> See this already-open issue:
>>> https://issues.apache.org/jira/browse/LUCENE-9480
>>>
>>> No threadlocals are necessary. DataInput/IndexInput are already intended
>>> for use by one thread.
>>>
>>> For e.g. mmapdirectory this should be "+=". no buffer is required.
>>>
>>> On Wed, Feb 17, 2021 at 4:15 PM Greg Miller <gsmil...@gmail.com> wrote:
>>>
>>>> Hi folks-
>>>>
>>>> I work on a Lucene-based search system and we recently added Java
>>>> Flight Recorder to our benchmark tooling. When looking through results, we
>>>> found DataInput#skipBytes() to be a top contributor to garbage creation.
>>>> We're using Lucene84SkipReader and always skipping over Impacts in our
>>>> use-case. At first glance, it appeared pretty obvious that creating new
>>>> instances of the skipBuffer byte[] for each instance of DataInput was the
>>>> culprit.
>>>>
>>>> It looks like alternatives were discussed originally in LUCENE-5583
>>>> <https://issues.amazon.com/issues/LUCENE-4947>, one of which being a
>>>> thread-local implementation of the skip buffer (since it can't be a static
>>>> field without breaking delegating subclasses, like ChecksumIndexInput). At
>>>> the time, a thread-local was advised against
>>>> <https://issues.apache.org/jira/browse/LUCENE-5583?focusedCommentId=13964258&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13964258>
>>>>  by
>>>> Uwe due to GC expense, but in our benchmarks, bringing in a thread-local
>>>> implementation reduced overall GC time by ~7%.
>>>>
>>>> I'd like to revisit this implementation decision and discuss ways in
>>>> which we can reduce this unnecessary garbage creation. It seems like moving
>>>> to a thread-local implementation is a win here, but I'd love to hear more
>>>> thoughts or alternative suggestions from the group. I'm new to this
>>>> community, so I'm not sure the best way to proceed. Should I open a Jira
>>>> issue as a next step? Thanks in advance!
>>>>
>>>> Cheers,
>>>> -Greg
>>>>
>>>

Reply via email to