Thank you for your detailed feedback! I will explore the lambda approach
and benchmark its benefits using our internal testing framework.

Viral Gandhi

On Sat, 20 Feb 2021 at 03:23, Robert Muir <rcm...@gmail.com> wrote:

> +1 to explore Uwe's lambda idea. I really hate lambdas (you should too, if
> you know what happens behind the scenes), but this sounds like maybe an
> actual justified use of lambdas, taking advantage of the capture facility
> and not using them just to be "cute"
>
> On Sat, Feb 20, 2021 at 6:17 AM Uwe Schindler <u...@thetaphi.de> wrote:
>
>> Hi,
>>
>>
>>
>> I was thinking about something similar when working on the Java 17
>> replacement for MMapDirectory. My idea would still be late evaluation, but
>> using a lambda. Same way how you can do lazy evaluation in Log4j2 when
>> logging with debug/trace/…
>>
>>
>>
>> Instead of having “String resourceDescription“ as a field in each
>> Indexinput, we can use “Supplier<String> resourceDescription” and
>> initialize it like: () -> foo + bar + parent.toString +…
>>
>>
>>
>> Just because this might be an option to do, I fully agree with Mike Drob
>> and Robert Muir: The risk of memory leaks for such a small improvement is
>> too high. It also depends on the GC you use. If you use old
>> ConcMarkSweepGC, I agree this is an overhead, but nowadays, Lucene should
>> use G1GC.
>>
>>
>>
>> In addition, the strings are not for debugging, they are really useful
>> when something goes wrong, e.g. when I/O errors occur.
>>
>>
>>
>> Uwe
>>
>>
>>
>> -----
>>
>> Uwe Schindler
>>
>> Achterdiek 19, D-28357 Bremen
>>
>> https://www.thetaphi.de
>>
>> eMail: u...@thetaphi.de
>>
>>
>>
>> *From:* Robert Muir <rcm...@gmail.com>
>> *Sent:* Saturday, February 20, 2021 11:56 AM
>> *To:* dev@lucene.apache.org
>> *Subject:* Re: GC cost of creating String resource description on
>> IndexInput clone
>>
>>
>>
>> The issue is that clone or not, they are both IndexInput.java. So if we
>> go with your proposal, then *sometimes* the code will have this reference
>> and *other times* it won't and the reference will be null. In that
>> non-clone case, where would its resource description (filename) come from?
>> I predict too many bugs.
>>
>>
>>
>> Making this part of the code complex would be the wrong tradeoff for 1%
>> performance.
>>
>>
>>
>>
>>
>> On Fri, Feb 19, 2021 at 4:58 PM Viral Gandhi <viral.dev...@gmail.com>
>> wrote:
>>
>> Hello everyone,
>>
>> We recently added Java Flight Recorder (JFR) to our internal
>> benchmarking. While looking through top heap allocations from JFR output,
>> we noticed the following to be in top 5 contributors to garbage creation.
>>
>> org.apache.lucene.store.IndexInput#getFullSliceDescription()
>>   at org.apache.lucene.store.ByteBufferIndexInput#buildSlice()
>>    at org.apache.lucene.store.ByteBufferIndexInput#slice()
>>    at
>> org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl#slice()
>>    at org.apache.lucene.store.IndexInput#randomAccessSlice()
>>
>> It seems that we construct a new String resource description for each
>> clone/slice of IndexInput instance. Resource description for a clone/slice
>> instance is a String concatenation of resource description from original
>> instance and current slice description. Also, clone/slice can happen
>> multiple times per query per segment.
>>
>> Can we avoid upfront String allocation by late-evaluating
>> IndexInput.toString() on the cloned instance? One approach could be to
>> hold a reference of the base IndexInput instance in the cloned instance.
>> Then while evaluating IndexInput.toString() on a cloned instance, we can
>> construct a resource description by concatenating base instance's
>> toString() with current resource description. My understanding is that
>> IndexInput.toString() is primarily used for debugging purposes hence we
>> can benefit from the late-evaluation.
>>
>> With this approach, we are seeing sustainable GC time reduction of ~6%
>> and gain of ~1% to red-line QPS (throughput). My intention to start this
>> thread is to collect feedback on this approach as well as to discuss any
>> other ideas.
>>
>>
>>
>> Thanks,
>>
>> Viral Gandhi
>>
>>

Reply via email to