[
https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220045#comment-15220045
]
Jeff Wartes commented on SOLR-8922:
-----------------------------------
Absolutely. Memory pools were my first thought, between when I saw that 60% and
when I looked at my hit rates and realized the allocation size could just
be changed. I had started poking around the internet for terms like "slab
allocators" and "direct byte buffers", but even an on-heap persistent pool
sounded good to me. Or, if you had persistent tracking of hit rates for the
optimization, perhaps the size of the scratch array could optimize itself over
time. All of that would be more complicated, of course.
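A minimal sketch of the self-sizing idea, assuming hit counts are fed back after each query. The class AdaptiveScratchSizer and its methods are hypothetical, not anything in Solr. The 1/16 decay step is an arbitrary choice for illustration. The ceiling is assumed to be the current 1/64th-of-index size.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper, not part of Solr: suggests a scratch array size from observed hit counts. */
final class AdaptiveScratchSizer {
    private final int minSize;
    private final int maxSize; // assumed ceiling, e.g. maxDoc >> 6 (1/64th of the index)
    private final AtomicInteger highWater;

    AdaptiveScratchSizer(int minSize, int maxSize) {
        if (minSize < 1 || maxSize < minSize) {
            throw new IllegalArgumentException("need 1 <= minSize <= maxSize");
        }
        this.minSize = minSize;
        this.maxSize = maxSize;
        this.highWater = new AtomicInteger(minSize);
    }

    /** Record the hit count of a finished query. */
    void recordHits(int hits) {
        int clamped = Math.max(minSize, Math.min(maxSize, hits));
        // Jump straight up to a new peak; otherwise decay 1/16 of the gap
        // (at least 1) toward the latest sample, never dropping below it.
        highWater.updateAndGet(cur ->
            clamped >= cur ? clamped : cur - Math.max(1, (cur - clamped) >> 4));
    }

    /** Size to use for the next scratch array. */
    int nextScratchSize() {
        return highWater.get();
    }
}
```

With the numbers from the issue, the ceiling would be 86,000,000 >> 6 = 1,343,750 ints (about 5.4 MB at 4 bytes each), while a workload whose hit counts stay under 56k would settle at a scratch size near 56,000.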
I did look one other place worth mentioning though. In Heliosearch the way the
DocSetCollector handles the "scratch" array isn't any different, but it's
interesting because it added a lifecycle with a close() method to the class, to
support the native bitset implementation. Knowing that it's possible to impose
a lifecycle on the class, checking things out and back into a persistent memory
pool should be easy.
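A rough sketch of what checking scratch arrays out of and back into a persistent on-heap pool could look like once a close() lifecycle exists. This is not Heliosearch's or Solr's code. ScratchPool and Lease are hypothetical names, and the pool is unbounded here for brevity.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical on-heap pool of fixed-size int[] scratch arrays. */
final class ScratchPool {
    private final ConcurrentLinkedQueue<int[]> free = new ConcurrentLinkedQueue<>();
    private final int arraySize;

    ScratchPool(int arraySize) {
        if (arraySize < 1) {
            throw new IllegalArgumentException("arraySize must be positive");
        }
        this.arraySize = arraySize;
    }

    /** Reuse a pooled array if one is free, otherwise allocate a new one. */
    Lease checkOut() {
        int[] pooled = free.poll();
        return new Lease(pooled != null ? pooled : new int[arraySize]);
    }

    /** Holds one array until close() hands it back to the pool. */
    final class Lease implements AutoCloseable {
        private int[] array;

        private Lease(int[] array) {
            this.array = array;
        }

        /** The leased array. Contents are stale, so the caller must track its own fill position. */
        int[] array() {
            if (array == null) {
                throw new IllegalStateException("lease already closed");
            }
            return array;
        }

        /** Return the array to the pool. Calling close() again is a no-op. */
        @Override
        public void close() {
            if (array != null) {
                free.offer(array);
                array = null;
            }
        }
    }
}
```

A collector with a close() method could hold a Lease for its lifetime and release it in close(), for example via try-with-resources around the search call. A real pool would probably also need a cap on how many arrays it retains.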
> DocSetCollector can allocate massive garbage on large indexes
> -------------------------------------------------------------
>
> Key: SOLR-8922
> URL: https://issues.apache.org/jira/browse/SOLR-8922
> Project: Solr
> Issue Type: Improvement
> Reporter: Jeff Wartes
> Attachments: SOLR-8922.patch
>
>
> After reaching a point of diminishing returns tuning the garbage collector, I
> decided to take a look at where the garbage was coming from. To my surprise,
> it turned out that for my index and query set, almost 60% of the garbage was
> coming from this single line:
> https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49
> This is due to the simple fact that I have 86M documents in my shards.
> Allocating a scratch array big enough to track a result set 1/64th of my
> index (1.3M) is also almost certainly excessive, considering my 99.9th
> percentile hit count is less than 56k.