Thanks Mike,

Do you know how I can configure Solr to use the min=200 and max=398 block sizes you suggested? Or should I ask on the Solr list?
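
As a quick sanity check on the numbers in the quoted reply, here is a minimal sketch of the stated constraint, max >= 2*(min-1). The class and method names (BlockSizeCheck, smallestAllowedMax, isValid) are hypothetical, and the code does not touch Lucene or Solr at all. It only reproduces the arithmetic from the thread: 8 times the default min of 25 is 200, and the smallest max the rule permits for min=200 is 398.

```java
// Hypothetical helper, not part of Lucene or Solr: it only encodes the
// constraint quoted in this thread, max >= 2*(min-1).
final class BlockSizeCheck {

    // Smallest max block size the quoted rule allows for a given min.
    static int smallestAllowedMax(int minBlockSize) {
        return 2 * (minBlockSize - 1);
    }

    // True when the (min, max) pair satisfies the quoted rule.
    static boolean isValid(int minBlockSize, int maxBlockSize) {
        return maxBlockSize >= smallestAllowedMax(minBlockSize);
    }

    public static void main(String[] args) {
        int defaultMin = 25; // default min block size, per the quoted reply
        int defaultMax = 48; // default max block size, per the quoted reply
        int scaledMin = 8 * defaultMin; // same 8X factor used for TermIndexInterval

        System.out.println("defaults valid: " + isValid(defaultMin, defaultMax));
        System.out.println("scaled min: " + scaledMin);
        System.out.println("smallest allowed max: " + smallestAllowedMax(scaledMin));
    }
}
```

So min=200 with max=398 sits exactly on the boundary of the rule, just as the defaults (25, 48) do. Whether Lucene enforces further limits on these values is not covered by this sketch.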
Tom

On Sat, Jan 10, 2015 at 4:46 AM, Michael McCandless <luc...@mikemccandless.com> wrote:

> The first int to Lucene41PostingsFormat is the min block size (default
> 25) and the second is the max (default 48) for the block tree terms
> dict.
>
> The max must be >= 2*(min-1).
>
> Since you were using 8X the default before, maybe try min=200 and
> max=398? However, block tree should have been more RAM efficient than
> 3.x's terms index... if you run CheckIndex with -verbose it will print
> additional details about the block structure of your terms indices...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Jan 9, 2015 at 4:15 PM, Tom Burton-West <tburt...@umich.edu> wrote:
> > Hello all,
> >
> > We have over 3 billion unique terms in our indexes and with Solr 3.x we set
> > the TermIndexInterval to about 8 times its default value in order to index
> > without OOMs. (
> > http://www.hathitrust.org/blogs/large-scale-search/too-many-words-again)
> >
> > We are now working with Solr 4 and running into memory issues and are
> > wondering if we need to do something analogous for Solr 4.
> >
> > The javadoc for IndexWriterConfig (
> > http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/index/IndexWriterConfig.html#setTermIndexInterval%28int%29
> > )
> > indicates that the lucene 4.1 postings format has some parameters which may
> > be set:
> > "..To configure its parameters (the minimum and maximum size for a block),
> > you would instead use Lucene41PostingsFormat.Lucene41PostingsFormat(int, int)
> > <https://lucene.apache.org/core/4_10_2/core/org/apache/lucene/codecs/lucene41/Lucene41PostingsFormat.html#Lucene41PostingsFormat%28int,%20int%29>
> > "
> >
> > Is there documentation or discussion somewhere about how to determine
> > appropriate parameters or some detail about what setting the maxBlockSize
> > and minBlockSize does?
> >
> > Tom Burton-West
> > http://www.hathitrust.org/blogs/large-scale-search