Thanks David for the info!
-John
On Wed, May 27, 2020 at 8:03 PM David Smiley wrote:
> John: you may benefit from more eagerly merging small segments on commit.
> At Salesforce we have a *ton* of indexes, and we reduced the segment count
> by half from the default. [...]
Thank you!
Sounds like it is a bad idea to rely on Accountable; the best path forward
is for us to rethink how we manage our cache.
-John
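For context on the cache discussion above: a reader cache weighted by memory usage would typically sum per-leaf estimates, assuming the leaf readers implement Accountable (true for SegmentReader in the Lucene versions under discussion). A minimal sketch, not the poster's actual code; the method name is illustrative:

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.util.Accountable;

// Sketch: estimate a reader's heap footprint by summing ramBytesUsed()
// over its leaves. This is the kind of number the thread says undercounts
// for doc values, and that would go away if Accountable were removed
// from Lucene's reader classes.
static long heapEstimate(IndexReader reader) {
  long total = 0;
  for (LeafReaderContext ctx : reader.leaves()) {
    if (ctx.reader() instanceof Accountable) {
      total += ((Accountable) ctx.reader()).ramBytesUsed();
    }
  }
  return total;
}
```

This fragment depends on the Lucene library being on the classpath, so it is shown as an API sketch rather than a runnable program.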
On Thu, May 28, 2020 at 7:02 AM Adrien Grand wrote:
> I opened https://issues.apache.org/jira/browse/LUCENE-9387.
I opened https://issues.apache.org/jira/browse/LUCENE-9387.
On Thu, May 28, 2020 at 2:41 PM Michael McCandless <luc...@mikemccandless.com> wrote:
> +1 to remove Accountable from Lucene's reader classes. Let's open an
> issue and discuss there?
+1 to remove Accountable from Lucene's reader classes. Let's open an issue
and discuss there?
In the past, when we added Accountable, Lucene's Codec/LeafReaders used
quite a bit of heap, and the implementation was much closer to correct (as
measured by percentage difference).
But now that we've moved
To be clear, there is no plan to remove RAM accounting from readers yet;
this is just something that I have been thinking about recently, so your
use case caught my attention.
Given how low the memory usage is nowadays, I believe that it would be
extremely hard to make sure that RAM estimates are
John: you may benefit from more eagerly merging small segments on commit.
At Salesforce we have a *ton* of indexes, and we reduced the segment count
by half from the default. The large number of fields was a positive factor
in this being a desirable trade-off. You might look at this recent issue
Thanks Adrien!
It is surprising to learn that this is an invalid use case and that Lucene
is planning to get rid of memory accounting...
There are indeed many fields in our test: 1000 numeric doc values fields
and 5 million docs in 1 segment. (We will have many segments in our
A couple major versions ago, Lucene required tons of heap memory to keep a
reader open, e.g. norms were on heap and so on. To my knowledge, the only
thing that is now kept in memory and is a function of maxDoc is live docs,
all other codec components require very little memory. I'm actually
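As a back-of-the-envelope check of the live-docs point above: Lucene keeps live docs as a bit set with one bit per document in the segment, so the heap cost is roughly maxDoc / 8 bytes. A small sketch (the class and method names are illustrative):

```java
public class LiveDocsEstimate {
  // One bit per document, rounded up to whole 64-bit words,
  // mirroring how a fixed-size bit set is laid out on heap.
  static long liveDocsBytes(int maxDoc) {
    long words = (maxDoc + 63L) / 64; // number of 64-bit words
    return words * 8;                 // ~maxDoc / 8 bytes
  }

  public static void main(String[] args) {
    // For a 5 million-doc segment like the one mentioned in this thread:
    System.out.println(liveDocsBytes(5_000_000)); // 625000 bytes, ~610 KiB
  }
}
```

So even a large segment's live docs cost well under a megabyte of heap, consistent with the point that per-reader memory usage is now low.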
Hello,
We have a reader cache that depends on the memory usage of each reader. We
found the calculation of reader size for doc values to be undercounting.
See line: