Thank you for your detailed feedback! I will explore the lambda approach and benchmark its benefits using our internal testing framework.
Viral Gandhi On Sat, 20 Feb 2021 at 03:23, Robert Muir <rcm...@gmail.com> wrote: > +1 to explore Uwe's lambda idea. I really hate lambdas (you should too, if > you know what happens behind the scenes), but this sounds like maybe an > actual justified use of lambdas, taking advantage of the capture facility > and not using them just to be "cute" > > On Sat, Feb 20, 2021 at 6:17 AM Uwe Schindler <u...@thetaphi.de> wrote: > >> Hi, >> >> >> >> I was thinking about something similar when working on the Java 17 >> replacement for MMapDirectory. My idea would still be late evaluation, but >> using a lambda. Same way how you can do lazy evaluation in Log4j2 when >> logging with debug/trace/… >> >> >> >> Instead of having “String resourceDescription“ as a field in each >> Indexinput, we can use “Supplier<String> resourceDescription” and >> initialize it like: () -> foo + bar + parent.toString +… >> >> >> >> Just because this might be an option to do, I fully agree with Mike Drob >> and Robert Muir: The risk of memory leaks for such a small improvement is >> too high. It also depends on the GC you use. If you use old >> ConcMarkSweepGC, I agree this is an overhead, but nowadays, Lucene should >> use G1GC. >> >> >> >> In addition, the strings are not for debugging, they are really useful >> when something goes wrong, e.g. when I/O errors occur. >> >> >> >> Uwe >> >> >> >> ----- >> >> Uwe Schindler >> >> Achterdiek 19, D-28357 Bremen >> >> https://www.thetaphi.de >> >> eMail: u...@thetaphi.de >> >> >> >> *From:* Robert Muir <rcm...@gmail.com> >> *Sent:* Saturday, February 20, 2021 11:56 AM >> *To:* dev@lucene.apache.org >> *Subject:* Re: GC cost of creating String resource description on >> IndexInput clone >> >> >> >> The issue is that clone or not, they are both IndexInput.java. So if we >> go with your proposal, then *sometimes* the code will have this reference >> and *other times* it won't and the reference will be null. In that >> non-clone case, where would its resource description (filename) come from? >> I predict too many bugs. >> >> >> >> Making this part of the code complex would be the wrong tradeoff for 1% >> performance. >> >> >> >> >> >> On Fri, Feb 19, 2021 at 4:58 PM Viral Gandhi <viral.dev...@gmail.com> >> wrote: >> >> Hello everyone, >> >> We recently added Java Flight Recorder (JFR) to our internal >> benchmarking. While looking through top heap allocations from JFR output, >> we noticed the following to be in top 5 contributors to garbage creation. >> >> org.apache.lucene.store.IndexInput#getFullSliceDescription() >> at org.apache.lucene.store.ByteBufferIndexInput#buildSlice() >> at org.apache.lucene.store.ByteBufferIndexInput#slice() >> at >> org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl#slice() >> at org.apache.lucene.store.IndexInput#randomAccessSlice() >> >> It seems that we construct a new String resource description for each >> clone/slice of IndexInput instance. Resource description for a clone/slice >> instance is a String concatenation of resource description from original >> instance and current slice description. Also, clone/slice can happen >> multiple times per query per segment. >> >> Can we avoid upfront String allocation by late-evaluating >> IndexInput.toString() on the cloned instance? One approach could be to >> hold a reference of the base IndexInput instance in the cloned instance. >> Then while evaluating IndexInput.toString() on a cloned instance, we can >> construct a resource description by concatenating base instance's >> toString() with current resource description. My understanding is that >> IndexInput.toString() is primarily used for debugging purposes hence we >> can benefit from the late-evaluation. >> >> With this approach, we are seeing sustainable GC time reduction of ~6% >> and gain of ~1% to red-line QPS (throughput). My intention to start this >> thread is to collect feedback on this approach as well as to discuss any >> other ideas. >> >> >> >> Thanks, >> >> Viral Gandhi >> >>