Hi Misha,

Using a GitHub fork and pull request is by far the best way to do it,
because it allows automated checks to run, makes reviews easier, etc. It is
also a good learning exercise to get up to speed with working with multiple
remotes in git, if you're willing to invest some time in it (you can have
both the upstream Lucene repo and your own fork as two remotes in the same
local git clone).
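
For example, a sketch of that two-remote setup (the fork URL assumes your
fork is named countmdm/lucene - substitute your own; a throwaway repo is
used here so the commands run anywhere, but in practice you would run the
"git remote add" lines inside your existing clone of apache/lucene):

```shell
# Sketch: one local clone tracking both the ASF repo and a personal fork.
cd "$(mktemp -d)"
git init -q demo && cd demo

# "upstream" is the canonical ASF repo (contributors cannot push there);
# "origin" is your personal fork, which is where feature branches go.
git remote add upstream https://github.com/apache/lucene.git
git remote add origin https://github.com/countmdm/lucene.git
git remote -v

# Typical flow from here (not run in this sketch):
#   git fetch upstream
#   git switch -c optimize-STEF-main upstream/main
#   git push -u origin optimize-STEF-main   # then open the PR from the fork
```

Pushing the branch to the fork (origin) rather than to apache/lucene is
what avoids the 403 "Permission denied" error.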

Dawid

On Fri, Jan 23, 2026 at 2:30 AM Misha Dmitriev via java-user <
[email protected]> wrote:

> Hi again Dawid,
>
> Forgive my ignorance, since I've never contributed to Lucene or Apache
> before. I created a git branch based on the Lucene main branch locally;
> made, built, checked, and committed my fix; and then tried to create a PR
> by pushing that branch, see below. Unfortunately, I get an error. I used a
> classic PAT as the password, so the problem seems to be not with the
> password itself, but with missing "access rights". Could you please take a
> look? I am using the GitHub login countmdm, email [email protected]
>
> Misha
>
>
> $ git push -u origin optimize-STEF-main
> Username for 'https://github.com': countmdm
> Password for 'https://[email protected]':
> remote: Permission to apache/lucene.git denied to countmdm.
> fatal: unable to access 'https://github.com/apache/lucene.git/': The
> requested URL returned error: 403
>
> ________________________________
> From: Dawid Weiss <[email protected]>
> Sent: Wednesday, January 21, 2026 10:21 PM
> To: [email protected] <[email protected]>
> Cc: Misha Dmitriev <[email protected]>
> Subject: Re: A deficiency in lucene code that affects memory footprint and
> GC
>
>
> Hi Misha,
>
> Please provide a pull request. Small, isolated improvements are easier for
> us to review and digest than large changes, but all are welcome.
>
> Also, a lot of (trained) eyes are looking at this code, and very often
> reports of Lucene not performing well are caused by incorrect usage rather
> than by problems in the implementation itself. It would be good to share
> the full context in which the problem occurs.
>
> Dawid
>
> On Wed, Jan 21, 2026 at 11:13 PM Misha Dmitriev via java-user <
> [email protected]> wrote:
> Hi Lucene community,
>
> At LinkedIn, we use Lucene in some important search apps. We recently
> found some problems with GC and memory footprint in one of them. We took a
> heap dump and analyzed it with JXRay (https://jxray.com). Unfortunately,
> we cannot share the entire JXRay analysis due to security restrictions,
> but we can share one excerpt from it, see below. It comes from section 11
> of the JXRay report, “Bad Primitive Arrays”, which tells us how much
> memory is wasted due to empty or under-utilized primitive arrays. That
> section says that nearly 4G of memory (25.6% of the used heap) is wasted,
> and it turns out that most of that is due to byte[] arrays managed by the
> SegmentTermsEnumFrame class.
>
> [X]
>
> To clarify: in the above screenshot, e.g. 80% of all arrays pointed to by
> the suffixBytes field are just empty, i.e. contain only zeroes, which
> likely means that they have never been used. Of the remaining arrays, 3%
> are “trail-0s”, i.e. more than half of their trailing elements are zero,
> meaning they were only partially utilized. So only 17% of these arrays
> have been utilized more or less fully. The same is true for all other
> byte[] arrays managed by SegmentTermsEnumFrame. Note that from other
> sections of the heap dump it’s clear that the majority of these objects
> are garbage, i.e. they have already been used and discarded. Thus, at
> least 80% of the memory that was allocated for these byte[] arrays was
> never used and was wasted. From separate memory allocation profiling, we
> estimated that these arrays are responsible for ~2 G/sec of memory
> allocation. If they were allocated lazily rather than eagerly, i.e. just
> before they are really used, we could potentially reduce their share of
> the allocation rate from 2 G/sec to (1 - 0.8) * 2 = 0.4 G/sec.
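>
> As a quick sanity check on the arithmetic above (the 2 G/sec rate and the
> 80% empty fraction are the workload-specific estimates quoted in this
> email, not universal figures):

```java
// Back-of-the-envelope check of the potential allocation-rate savings.
public class AllocationEstimate {
    public static void main(String[] args) {
        double rateGPerSec = 2.0;    // measured allocation rate of these arrays
        double emptyFraction = 0.8;  // share of arrays that are never written to

        // With lazy allocation, only the arrays that end up being used
        // would be allocated at all.
        double lazyRate = (1 - emptyFraction) * rateGPerSec;
        System.out.printf("%.1f G/sec%n", lazyRate); // prints "0.4 G/sec"
    }
}
```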
>
> A switch from eager to lazy allocation of some data structure is usually
> easy to implement. Let’s take a quick look at the source code:
> https://fossies.org/linux/www/lucene-10.3.2-src.tgz/lucene-10.3.2/lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene90/blocktree/SegmentTermsEnumFrame.java
> The suffixBytes array usage has the following pattern:
>
> // Eager construction with hardcoded size
> byte[] suffixBytes = new byte[128];
>
> …  // Fast forward to the loadBlock() method
> …
> if (suffixBytes.length < numSuffixBytes) {
>   // If we need to read more than 128 bytes, increase the array…
>   // … or more precisely, throw away the old array and allocate another one
>   suffixBytes = new byte[ArrayUtil.oversize(numSuffixBytes, 1)];
> }
>
> From this code, it’s clear that two negative things can happen:
>
>   1. suffixBytes may not be used at all (the loadBlock() method may not
>      be called, or may return early), in which case the memory used by
>      the array is completely wasted.
>   2. If numSuffixBytes happens to be greater than 128, the eagerly
>      allocated array is discarded, and the memory used by it is wasted.
>
> And as our heap dump illustrates, these things likely happen very often.
> To address this problem, it would be sufficient to change the code as
> follows:
>
> // Avoid eager construction
> byte[] suffixBytes;
> …
> if (suffixBytes == null || suffixBytes.length < numSuffixBytes) {
>   // Allocate the array lazily on first use, or replace it with a
>   // bigger one if it is too small for the current block
>   suffixBytes = new byte[ArrayUtil.oversize(numSuffixBytes, 1)];
> }
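>
> For illustration, here is a minimal, self-contained sketch of this lazy
> pattern (class and method names are hypothetical, and a simple stand-in
> replaces ArrayUtil.oversize so the snippet compiles on its own):

```java
// Hypothetical sketch of the lazy-allocation pattern proposed above.
// The real change would live in SegmentTermsEnumFrame and keep using
// ArrayUtil.oversize; the names below are made up for illustration.
public class LazyBuffer {
    private byte[] suffixBytes; // stays null until first use, instead of new byte[128]

    // Stand-in for ArrayUtil.oversize(minSize, 1): adds ~12.5% headroom.
    private static int oversize(int minSize) {
        return minSize + (minSize >>> 3);
    }

    byte[] ensureCapacity(int numSuffixBytes) {
        if (suffixBytes == null || suffixBytes.length < numSuffixBytes) {
            // Allocated only when actually needed, and sized to fit.
            suffixBytes = new byte[oversize(numSuffixBytes)];
        }
        return suffixBytes;
    }

    public static void main(String[] args) {
        LazyBuffer frame = new LazyBuffer();
        byte[] buf = frame.ensureCapacity(200);  // first real use triggers allocation
        System.out.println(buf.length >= 200);   // prints "true"
        System.out.println(frame.ensureCapacity(100) == buf); // reused: prints "true"
    }
}
```

> If loadBlock() is never called, no array is allocated at all, which is
> exactly where the estimated savings come from.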
>
> Note that reducing the memory allocation rate primarily yields reduced
> CPU usage and/or improved latency. That’s because each object allocation
> requires work from the JVM - updating pointers and setting all object
> bytes to zero. GCing these objects is also CPU-intensive and results in
> pausing app threads, which affects latency. Furthermore, once the memory
> allocation rate is reduced, it may be possible to also reduce the JVM
> heap size. So the ultimate win is going to be in both CPU and memory.
>
> Please let us know how we can proceed with this. The proposed change is
> trivial, and thus maybe it can be done quickly by some established Lucene
> contributor. If not, I guess I can make it myself and then hope that it
> goes through review and release in reasonable time.
>
> Misha
>
>
