Re: ram estimate for docvalues is incorrect

2020-05-27 Thread Adrien Grand
To be clear, there is no plan to remove RAM accounting from readers yet; this is just something that I have been thinking about recently, so your use-case caught my attention. Given how low the memory usage is nowadays, I believe that it would be extremely hard to make sure that RAM estimates are

Re: ram estimate for docvalues is incorrect

2020-05-27 Thread David Smiley
John: you may benefit from more eagerly merging small segments on commit. At Salesforce we have a *ton* of indexes, and we reduced the segment count in half from the default. The large number of fields was a positive factor in this being a desirable trade-off. You might look at this recent issue

Re: ram estimate for docvalues is incorrect

2020-05-27 Thread John Wang
Thanks Adrien! It is surprising to learn this is an invalid use case and that Lucene is planning to get rid of memory accounting... In our test, there are indeed many fields: 1000 numeric doc values fields and 5 million docs in 1 segment. (We will have many segments in our pr

Re: ram estimate for docvalues is incorrect

2020-05-27 Thread Adrien Grand
A couple of major versions ago, Lucene required tons of heap memory to keep a reader open, e.g. norms were on heap and so on. To my knowledge, the only thing that is now kept in memory and is a function of maxDoc is live docs; all other codec components require very little memory. I'm actually wonderi

ram estimate for docvalues is incorrect

2020-05-27 Thread John Wang
Hello, We have a reader cache that depends on the memory usage of each reader. We found that the calculation of reader size for doc values undercounts. See line: https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/codecs/lucene80/Lucene80DocValuesProducer

Re: Skip indexing facet drill down terms

2020-05-27 Thread Michael McCandless
Hi Ankur, Indeed, I don't think this is an option in FacetsConfig today. I think it makes sense to add one ... it should be fairly simple. Just follow how the "foo" level (dimension only) drilldown option was added in LUCENE-8367. Could you open an issue? Thanks! Mike McCandless http://blog.

Re: BadApple report

2020-05-27 Thread Jason Gerlowski
> Hoss’s rollups are here:
> http://fucit.org/solr-jenkins-reports/failure-report.html which show the
> rates, but not where they came from.
If I click on a particular test entry on "failure-report.html", I'm presented with a dialog with links for each failure. Clicking that link takes me to a fi