> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-restart-is-taking-more-than-1-hour-tp4054165p4054189.html
> Sent from the Solr - User mailing list archive at Nabble.com.

Nabble says that the original message hasn't made it to the mailing list
yet, which explains why I only saw the reply come in.  Good thing Nabble
sent along the URL so I could see the original question.

This is almost guaranteed to be caused by a huge updateLog - the tlog
directory added in version 4.0.  On Solr restart, all of the tlog data
that exists is replayed to ensure the index is fully up to date.  When
the tlog is huge, it takes a very long time.

A huge tlog is normally caused by one of two things: 1) using only soft
commits and never hard committing, or 2) doing a very large import with
the dataimport handler and not committing until the end.

The solution is to do hard commits on intervals that are short (but not
super short) with openSearcher set to false.  A hard commit starts a new
tlog and flushes index data to disk.  With openSearcher set to false, the
hard commit will not change document visibility - deleted documents are
still searchable, and new documents are not yet searchable.  You can
still make new content searchable with a commit (hard or soft) that has
openSearcher set to true.
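
As a rough sketch of that last point, these are the kinds of XML update
messages that could be posted to the update handler to make content
visible.  Each one is a separate request body.  I'm assuming the stock
XML update format and an update handler at the usual /update path, so
adjust for your setup.

<!-- hard commit; an explicit commit opens a new searcher by default -->
<commit/>

<!-- soft commit; makes changes visible, but does not start a new tlog -->
<commit softCommit="true"/>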

When a new tlog is started on a regular basis, no single tlog ever gets
very big.  Solr trims old tlogs, only keeping a few of them around.  If
you have only a few tlogs and they are small, it won't take very long to
replay them on startup.

The easiest way to do this hard commit is to have Solr do it for you
automatically with the autoCommit feature.

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>25000</maxDocs>
    <maxTime>300000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <updateLog />
</updateHandler>
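
If you also want new documents to become searchable automatically, one
option is to add autoSoftCommit inside the same updateHandler section.
This is only a sketch, and the 60 second interval is a number picked for
illustration, not a recommendation.

<autoSoftCommit>
  <maxTime>60000</maxTime>
</autoSoftCommit>

With both in place, the hard autoCommit keeps the tlog small and the
soft commit takes care of visibility.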

I've typed this often enough that I really need to just put it on the
wiki - when the question comes up, link the article. :)

Thanks,
Shawn
