The issue is that, clone or not, both are instances of the same IndexInput
class. So if we go with your proposal, then *sometimes* the code will have
this base reference and *other times* it won't, and the reference will be
null. In that non-clone case, where would its resource description
(filename) come from? I predict too many bugs.

Making this part of the code complex would be the wrong tradeoff for 1%
performance.
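
To make the concern concrete, here is a rough sketch of the shape I
understand the proposal to have. This is not Lucene code: the class name
LazyDescriptionInput, its fields, and the " [slice=...]" format are
assumptions made up for illustration only.

```java
// Hypothetical stand-in for IndexInput; NOT the real Lucene class.
class LazyDescriptionInput {
    // For a top-level input this is the full resource description (filename).
    // For a clone/slice it holds only the slice's own description.
    private final String description;

    // Non-null only for clones/slices. Every code path that touches this
    // field has to handle both cases, which is the complexity in question.
    private final LazyDescriptionInput base;

    // Top-level (non-clone) input: there is no base reference.
    LazyDescriptionInput(String resourceDescription) {
        this.description = resourceDescription;
        this.base = null;
    }

    // Clone/slice: keep a reference to the base instead of building a String.
    private LazyDescriptionInput(LazyDescriptionInput base, String sliceDescription) {
        this.description = sliceDescription;
        this.base = base;
    }

    // No String concatenation happens when slicing.
    LazyDescriptionInput slice(String sliceDescription) {
        return new LazyDescriptionInput(this, sliceDescription);
    }

    // The full description is assembled only when somebody asks for it.
    @Override
    public String toString() {
        if (base == null) {
            return description; // non-clone case: the filename must live here
        }
        return base.toString() + " [slice=" + description + "]";
    }
}
```

Slicing allocates no String here, but toString() now has two branches, and
anything else that reads the description has to know which case it is in.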


On Fri, Feb 19, 2021 at 4:58 PM Viral Gandhi <viral.dev...@gmail.com> wrote:

> Hello everyone,
>
> We recently added Java Flight Recorder (JFR) to our internal benchmarking.
> While looking through the top heap allocations in the JFR output, we
> noticed the following to be among the top 5 contributors to garbage creation.
>
> org.apache.lucene.store.IndexInput#getFullSliceDescription()
>   at org.apache.lucene.store.ByteBufferIndexInput#buildSlice()
>   at org.apache.lucene.store.ByteBufferIndexInput#slice()
>   at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl#slice()
>   at org.apache.lucene.store.IndexInput#randomAccessSlice()
>
> It seems that we construct a new String resource description for each
> clone/slice of an IndexInput instance. The resource description for a
> clone/slice instance is a String concatenation of the original instance's
> resource description and the current slice description. Also, clone/slice
> can happen multiple times per query per segment.
>
> Can we avoid upfront String allocation by late-evaluating
> IndexInput.toString() on the cloned instance? One approach could be to
> hold a reference to the base IndexInput instance in the cloned instance.
> Then, while evaluating IndexInput.toString() on a cloned instance, we can
> construct a resource description by concatenating the base instance's
> toString() with the current resource description. My understanding is that
> IndexInput.toString() is primarily used for debugging purposes, hence we
> can benefit from the late evaluation.
>
> With this approach, we are seeing a sustained GC time reduction of ~6% and
> a gain of ~1% in red-line QPS (throughput). My intention in starting this
> thread is to collect feedback on this approach as well as to discuss any
> other ideas.
>
> Thanks,
> Viral Gandhi
>
