Hello all,
Due to multiple languages and dirty OCR, our indexes have over 2 billion
unique terms (
http://www.hathitrust.org/blogs/large-scale-search/too-many-words-again).
In Solr 3.6 and earlier, we needed to reduce the memory used by
the in-memory representation of the tii file. We
Thanks Robert,
I'll have to spend some time understanding the default codec for Solr 4.0.
Did I miss something in the changes file?
I'll be digging into the default codec docs and testing sometime in the next
week or two (with a 2-billion-term index). If I understand it well enough,
I'll be happy
On Fri, Sep 7, 2012 at 2:19 PM, Tom Burton-West tburt...@umich.edu wrote:
Thanks Robert,
I'll have to spend some time understanding the default codec for Solr 4.0.
Did I miss something in the changes file?
http://lucene.apache.org/core/4_0_0-BETA/
see the file formats section, especially
Thanks Robert,
If not, just customize BlockTree's params with a CodecFactory in Solr,
or even pick another implementation (FixedGap, VariableGap, whatever).
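
For a rough sense of what a fixed-gap style terms index means at this scale, here is a back-of-the-envelope sketch in Python. It assumes the index simply keeps every Nth term in memory. The function name and the interval values are illustrative only and are not settings taken from this thread; 128 is the Lucene 3.x default term index interval, and the other values are arbitrary.

```python
def indexed_term_count(unique_terms: int, interval: int) -> int:
    """Terms held in memory if every `interval`-th term is sampled (hypothetical helper)."""
    if interval < 1:
        raise ValueError("interval must be >= 1")
    # Ceiling division: a partial final gap still contributes one index entry.
    return -(-unique_terms // interval)


UNIQUE_TERMS = 2_000_000_000  # "over 2 billion unique terms", rounded down

if __name__ == "__main__":
    # Illustrative intervals only; 128 is the Lucene 3.x default.
    for interval in (128, 256, 1024):
        count = indexed_term_count(UNIQUE_TERMS, interval)
        print(f"interval={interval:5d} -> {count:,} indexed terms")
```

This counts index entries only. Actual heap use depends on term length and on how the chosen terms index implementation stores them, so treat the numbers as a lower bound on what needs measuring, not as a memory estimate.
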
Still trying to get my head around 4.0 and flexible indexing. I'll take
another look at Mike's and your presentations. I'm trying to