The issue is that, clone or not, both are instances of the same class (IndexInput.java). So if we go with your proposal, then *sometimes* an instance will have this base reference and *other times* it won't, and the reference will be null. In that non-clone case, where would the resource description (filename) come from? I predict too many bugs.
Making this part of the code complex would be the wrong tradeoff for 1% performance.

On Fri, Feb 19, 2021 at 4:58 PM Viral Gandhi <viral.dev...@gmail.com> wrote:

> Hello everyone,
>
> We recently added Java Flight Recorder (JFR) to our internal benchmarking.
> While looking through top heap allocations from JFR output, we noticed the
> following to be in top 5 contributors to garbage creation.
>
> org.apache.lucene.store.IndexInput#getFullSliceDescription()
>   at org.apache.lucene.store.ByteBufferIndexInput#buildSlice()
>   at org.apache.lucene.store.ByteBufferIndexInput#slice()
>   at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl#slice()
>   at org.apache.lucene.store.IndexInput#randomAccessSlice()
>
> It seems that we construct a new String resource description for each
> clone/slice of IndexInput instance. Resource description for a clone/slice
> instance is a String concatenation of resource description from original
> instance and current slice description. Also, clone/slice can happen
> multiple times per query per segment.
>
> Can we avoid upfront String allocation by late-evaluating
> IndexInput.toString() on the cloned instance? One approach could be to
> hold a reference of the base IndexInput instance in the cloned instance.
> Then while evaluating IndexInput.toString() on a cloned instance, we can
> construct a resource description by concatenating base instance's
> toString() with current resource description. My understanding is that
> IndexInput.toString() is primarily used for debugging purposes hence we
> can benefit from the late-evaluation.
>
> With this approach, we are seeing sustainable GC time reduction of ~6% and
> gain of ~1% to red-line QPS (throughput). My intention to start this thread
> is to collect feedback on this approach as well as to discuss any other
> ideas.
>
> Thanks,
> Viral Gandhi
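
To make the two cases under discussion concrete, here is a minimal sketch of the late-evaluation idea. It is not Lucene's code: the class LazyDescriptionSketch, its nested Input class, the field names, and the " [slice=...]" format are all hypothetical stand-ins chosen for illustration. It only shows that a slice can defer the String concatenation until toString() is called, and that the base reference is null for a non-clone instance.

```java
import java.util.Objects;

// Hypothetical stand-in for an IndexInput-like class; names and the
// description format are assumptions, not Lucene's actual implementation.
final class LazyDescriptionSketch {

    static final class Input {
        // Null for a top-level (non-clone) input, non-null for a slice.
        private final Input base;
        // Full resource description for a top-level input,
        // or only the slice's own description for a slice.
        private final String description;

        // Top-level input: the description is stored as-is.
        Input(String resourceDescription) {
            this.base = null;
            this.description = Objects.requireNonNull(resourceDescription);
        }

        // Slice: keeps a reference to its base instead of concatenating now.
        private Input(Input base, String sliceDescription) {
            this.base = Objects.requireNonNull(base);
            this.description = Objects.requireNonNull(sliceDescription);
        }

        // No String concatenation happens here.
        Input slice(String sliceDescription) {
            return new Input(this, sliceDescription);
        }

        // The description is built only when somebody asks for it.
        // Both branches must be kept correct: base is null for non-clones.
        @Override
        public String toString() {
            if (base == null) {
                return description;
            }
            return base + " [slice=" + description + "]";
        }
    }

    public static void main(String[] args) {
        Input top = new Input("MMapIndexInput(path=\"/idx/_0.cfs\")");
        Input slice = top.slice("_0.fdt");
        Input nested = slice.slice("block-7");
        System.out.println(top);
        System.out.println(slice);
        System.out.println(nested);
    }
}
```

Note that in this sketch every slice keeps its base instance reachable, and toString() has a separate code path for the null-base case, which is the extra complexity being weighed against the allocation savings.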