Late evaluation sounds like it would definitely be nice, but I worry that holding on to object instances longer than necessary might lead to memory leaks. This sounds like a good issue to open on JIRA.
On Fri, Feb 19, 2021 at 3:58 PM Viral Gandhi <viral.dev...@gmail.com> wrote:
> Hello everyone,
>
> We recently added Java Flight Recorder (JFR) to our internal benchmarking.
> While looking through top heap allocations from JFR output, we noticed the
> following to be in top 5 contributors to garbage creation.
>
> org.apache.lucene.store.IndexInput#getFullSliceDescription()
> at org.apache.lucene.store.ByteBufferIndexInput#buildSlice()
> at org.apache.lucene.store.ByteBufferIndexInput#slice()
> at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl#slice()
> at org.apache.lucene.store.IndexInput#randomAccessSlice()
>
> It seems that we construct a new String resource description for each
> clone/slice of IndexInput instance. Resource description for a clone/slice
> instance is a String concatenation of resource description from original
> instance and current slice description. Also, clone/slice can happen
> multiple times per query per segment.
>
> Can we avoid upfront String allocation by late-evaluating
> IndexInput.toString() on the cloned instance? One approach could be to
> hold a reference of the base IndexInput instance in the cloned instance.
> Then while evaluating IndexInput.toString() on a cloned instance, we can
> construct a resource description by concatenating base instance's
> toString() with current resource description. My understanding is that
> IndexInput.toString() is primarily used for debugging purposes hence we
> can benefit from the late-evaluation.
>
> With this approach, we are seeing sustainable GC time reduction of ~6% and
> gain of ~1% to red-line QPS (throughput). My intention to start this thread
> is to collect feedback on this approach as well as to discuss any other
> ideas.
>
> Thanks,
> Viral Gandhi
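
For discussion, here is a rough sketch of what the late-evaluation idea in the quoted message might look like. It is not Lucene's code: `LazySliceInput` is a hypothetical, stripped-down stand-in for IndexInput, and the `[slice=...]` format is only illustrative. The slice keeps a reference to its base instance and builds the full description only when toString() is called.

```java
/** Hypothetical, simplified stand-in for an IndexInput-like class (not Lucene's actual code). */
final class LazySliceInput {
    private final LazySliceInput parent;  // null for the root input
    private final String ownDescription;  // resource description (root) or slice description (slice)
    private final long offset;            // absolute start offset within the root input
    private final long length;

    /** Creates a root input covering [0, length). */
    LazySliceInput(String resourceDescription, long length) {
        this(null, resourceDescription, 0L, length);
    }

    private LazySliceInput(LazySliceInput parent, String ownDescription, long offset, long length) {
        this.parent = parent;
        this.ownDescription = ownDescription;
        this.offset = offset;
        this.length = length;
    }

    /** Creates a slice without concatenating any description String up front. */
    LazySliceInput slice(String sliceDescription, long sliceOffset, long sliceLength) {
        if (sliceOffset < 0 || sliceLength < 0 || sliceOffset + sliceLength > this.length) {
            throw new IllegalArgumentException("slice out of bounds: " + sliceDescription);
        }
        // Only the reference to the base instance is stored here.
        return new LazySliceInput(this, sliceDescription, this.offset + sliceOffset, sliceLength);
    }

    long length() {
        return length;
    }

    /** Builds the full description on demand by walking up the chain of base instances. */
    @Override
    public String toString() {
        if (parent == null) {
            return ownDescription;
        }
        return parent + " [slice=" + ownDescription + "]";
    }
}
```

The trade-off is the one raised at the top of this reply: each slice keeps its base instance reachable for as long as the slice itself lives, so whether this is safe depends on how long slices are retained.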