It looks like this is a good starting point:

http://wiki.apache.org/solr/SolrConfigXml#codecFactory

-Mike

On 01/12/2015 03:37 PM, Tom Burton-West wrote:
Hello all,

Our indexes have around 3 billion unique terms, so for Solr 3, we set
TermIndexInterval to about 8 times the default.  The net effect of this is
to reduce the size of the in-memory index to about 1/8th.  (For background,
see http://www.hathitrust.org/blogs/large-scale-search/too-many-words-again.)
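As a rough illustration of that arithmetic, here is a minimal sketch. It
assumes the Lucene 3.x default termIndexInterval of 128 and that the in-memory
index holds roughly one entry per interval of terms; the class and method
names are hypothetical and not part of Lucene or Solr.

```java
public class TermIndexEstimate {

    // Approximate number of terms kept in the in-memory term index when
    // every `interval`-th term is sampled (rounded up).
    static long indexedTerms(long uniqueTerms, int interval) {
        return (uniqueTerms + interval - 1) / interval;
    }

    public static void main(String[] args) {
        long uniqueTerms = 3_000_000_000L;        // "around 3 billion" unique terms
        int defaultInterval = 128;                // assumed Lucene 3.x default
        int raisedInterval = defaultInterval * 8; // "about 8 times the default"

        long atDefault = indexedTerms(uniqueTerms, defaultInterval);
        long atRaised = indexedTerms(uniqueTerms, raisedInterval);

        System.out.println("In-memory terms at interval " + defaultInterval + ": " + atDefault);
        System.out.println("In-memory terms at interval " + raisedInterval + ": " + atRaised);
    }
}
```

Under these assumptions the in-memory term count drops from about 23.4
million to about 2.9 million, i.e. to roughly 1/8th.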

We would like to do something similar for Solr 4.  The Lucene 4.10.2 JavaDoc
for setTermIndexInterval suggests how this can be
done by setting the minimum and maximum size for a block in Lucene code (
http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/index/IndexWriterConfig.html#setTermIndexInterval%28int%29
)
"For example, Lucene41PostingsFormat
<http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/codecs/lucene41/Lucene41PostingsFormat.html>
implements the term index instead based upon how terms share prefixes. To
configure its parameters (the minimum and maximum size for a block), you
would instead use Lucene41PostingsFormat.Lucene41PostingsFormat(int, int)
<http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/codecs/lucene41/Lucene41PostingsFormat.html#Lucene41PostingsFormat%28int,%20int%29>.
which can also be configured on a per-field basis"

How can we configure Solr to use different (i.e. non-default) minimum and
maximum block sizes?

Tom

