If you want to suppress merging, set 'mergeFactor' very high, perhaps
100. Note that Lucene opens many files (50? 100? 200?) for each
segment, and with merging suppressed the segments pile up. You would
have to raise the 'ulimit' for file descriptors to 'unlimited' or into
the millions.
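
As a rough sketch, the setting would go in the index section of the
bulk-load solrconfig.xml. I am assuming a 3.x-style config here, where
the section is called <indexDefaults>; newer releases call it
<indexConfig>. The value 100 is only the figure suggested above, not a
tested recommendation:

```xml
<!-- solrconfig.xml used only for the bulk indexing run (sketch) -->
<indexDefaults>
  <!-- A high mergeFactor lets many segments accumulate before any
       merge is triggered. 100 is an illustrative value. -->
  <mergeFactor>100</mergeFactor>
</indexDefaults>
```

Your runtime config can keep the usual default (10) so that normal
merging resumes once the load is done.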

Later, you can call optimize with a 'maxSegments' value. Optimize will
then stop once the index is down to maxSegments segments instead of
merging down to one. Lucene these days does not need to have one
segment, so merging down to 20 or 50 is fine.
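
For example, an update message along these lines, posted to the
core's /update handler, should do it. The value 20 is just one of the
numbers mentioned above, and the handler path assumes a default setup:

```xml
<!-- POST to the /update handler with Content-Type: text/xml -->
<!-- Merges down to at most 20 segments rather than to one. -->
<optimize maxSegments="20"/>
```

SolrJ clients can pass the same maxSegments value through the
optimize() call that takes it as an argument.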

On Wed, May 23, 2012 at 11:19 AM, Scott Preddy <scott.m.pre...@gmail.com> wrote:
> I am trying to do a very large insertion (about 68 million documents) into a
> Solr instance.
>
> Our schema is pretty simple. About 40 fields using these types:
>
>   <types>
>      <fieldType name="string" class="solr.StrField" sortMissingLast="true"
>                 omitNorms="true"/>
>      <fieldType name="text_general" class="solr.TextField"
>                 positionIncrementGap="100">
>         <analyzer type="index">
>            <tokenizer class="solr.StandardTokenizerFactory"/>
>            <filter class="solr.LowerCaseFilterFactory"/>
>         </analyzer>
>         <analyzer type="query">
>            <tokenizer class="solr.StandardTokenizerFactory"/>
>            <filter class="solr.LowerCaseFilterFactory"/>
>         </analyzer>
>      </fieldType>
>      <fieldType name="int" class="solr.TrieIntField" precisionStep="0"
>                 omitNorms="true" positionIncrementGap="0"/>
>   </types>
>
> We are running SolrJ clients from a Hadoop cluster, and are struggling with
> the merge process as time progresses.
> As the number of documents grows, merging will eventually hog everything.
>
> What we would really like to do is turn merging off and just do an index
> run with a sparse solrconfig and then
> start things back up with our runtime config which would kick off merging
> when it starts.
>
> Is there a way to do this?
>
> I came close to finding an answer in this post, but did not find out how to
> actually turn off merging.
>
> Post by Mike McCandless:
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html



-- 
Lance Norskog
goks...@gmail.com
