vigyasharma commented on PR #12013:
URL: https://github.com/apache/lucene/pull/12013#issuecomment-1367688658

   > ... but I don't understand what this "cache" is doing and why it actually 
documents that it never frees memory.
   
   My understanding is that `TaxonomyWriterCache` caches ordinals for all 
categories created in the index so far, so that categories use the same 
ordinals when facet labels are added. 
   
   It seems that we need such a cache to get ordinals for newly added 
categories from documents that are still pending flush.
   From the `TaxonomyWriterCache#put()` Javadoc:
   ```java
      * <p>The reason why the caller needs to know if part of the cache was cleared is that in that
      * case it will have to commit its on-disk index (so that all the latest category additions can be
      * searched on disk, if we can't rely on the cache to contain them).
   ```
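
To make that contract concrete, here is a minimal sketch of a bounded cache whose `put()` reports whether an older entry was evicted. It is an illustration only, not Lucene's implementation: the class `OrdinalCacheSketch`, its `String` keys, and the `-1` miss value are all assumptions made for this example, and eviction is plain LRU via `LinkedHashMap`.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

/** Hypothetical stand-in for a label-to-ordinal cache; not Lucene's TaxonomyWriterCache. */
class OrdinalCacheSketch {
  private final int maxSize;
  // accessOrder=true keeps the least recently used entry first in iteration order.
  private final LinkedHashMap<String, Integer> map = new LinkedHashMap<>(16, 0.75f, true);

  OrdinalCacheSketch(int maxSize) {
    this.maxSize = maxSize;
  }

  /** Returns the cached ordinal for the label, or -1 on a miss. */
  int get(String label) {
    Integer ordinal = map.get(label);
    return ordinal == null ? -1 : ordinal;
  }

  /**
   * Caches the label's ordinal. Returns true if an older entry had to be evicted,
   * which tells the caller that the cache alone no longer covers every category.
   */
  boolean put(String label, int ordinal) {
    map.put(label, ordinal);
    if (map.size() <= maxSize) {
      return false;
    }
    // Drop the least recently used entry.
    Iterator<String> it = map.keySet().iterator();
    it.next();
    it.remove();
    return true;
  }

  public static void main(String[] args) {
    OrdinalCacheSketch cache = new OrdinalCacheSketch(2);
    cache.put("Author/Bob", 1);
    cache.put("Author/Lisa", 2);
    boolean evicted = cache.put("Author/Susan", 3);
    if (evicted) {
      // In the real writer, this is the point where the on-disk index would need a commit
      // so that evicted categories can still be found on disk.
      System.out.println("entry evicted; caller should commit before relying on disk lookups");
    }
  }
}
```

A never-evicting cache always returns `false` from `put()` in this model, so the caller never needs the extra commit; the cost is memory that grows with the number of categories.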
   
   However, with faster BinaryDocValues fields, maybe we don't need the 
never-evicting `UTF8TaxonomyWriterCache` anymore, and we could make 
`LruTaxonomyWriterCache` the default?
   
   I can close this PR and make that change if it makes sense. Or we can merge 
this change and take it up in a separate issue. Are there faceting-specific 
benchmarks that could help validate the change?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

