Hello all,

We have been using Solr 4.0 for a while, and suddenly we couldn't get Solr to come up. As Solr was starting up, it hung after opening a Searcher. There wasn't anything else obvious in the logs. Eventually we realized that the problem was that the update log was being read at startup, and that the update log contained the entire text of all 800,000+ books that we indexed (about 837GB).
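
For anyone who wants to check whether they are in the same situation, here is a minimal sketch that compares the on-disk size of the transaction log directory with the size of the index. It assumes the default layout, where the Solr data directory contains "index" and "tlog" subdirectories; the function names are hypothetical helpers, not part of Solr.

```python
import os


def dir_size_bytes(path):
    """Total size in bytes of all regular files under path (0 if path is missing)."""
    total = 0
    # os.walk yields nothing for a missing directory, so the total stays 0.
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def compare_tlog_to_index(data_dir):
    """Return (tlog_bytes, index_bytes) for a Solr data directory.

    Assumption: transaction logs live in <data_dir>/tlog and the index
    in <data_dir>/index, as in a default setup.
    """
    tlog_bytes = dir_size_bytes(os.path.join(data_dir, "tlog"))
    index_bytes = dir_size_bytes(os.path.join(data_dir, "index"))
    return tlog_bytes, index_bytes
```

Calling compare_tlog_to_index("/path/to/solr/data") (path is a placeholder) returns the two sizes; a tlog figure that approaches or exceeds the index figure suggests the transaction logs are not being trimmed by hard commits.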

We looked and didn't find any obvious note in the Solr 4.0 release notes on upgrading from 3.6, or any documentation in the example solrconfig.xml, mentioning that if you have large documents and aren't using real-time get, you may want to turn the updateLog off (comment it out) to avoid transaction logs that can exceed the size of your index.

In the latest 4.0 example/solrconfig.xml (r1433064 <http://svn.apache.org/viewvc?view=revision&revision=1433064>), updateLog is enabled by default in the default Solr updateHandler, and the only comment is:

    <!-- Enables a transaction log, currently used for real-time get.
         "dir" - the target directory for transaction logs, defaults to the
         solr data directory -->

Some users who are either new to Solr or upgrading from earlier versions may not understand whether or not they need "real-time get", and they may not want to delve into the details of near-realtime search or using Solr as a NoSQL server in order to determine whether they should comment out the updateLog entry.

I think that either the updateLog should not be enabled by default (I don't know the pros and cons of this), or, at the very least, something should mention that it can lead to large transaction logs, with a pointer to documentation that would let the user decide whether to enable or disable it.

Is there documentation of this in some obvious place that I just missed?

I did find the text below on the wiki, http://wiki.apache.org/solr/SolrConfigXml#Update_Handler_Section, but a user-friendly translation, or a pointer to where someone could read up on what this means, would be helpful.

    <openSearcher>false</openSearcher>
    <!-- SOLR 4.0. Optionally don't open a searcher on hard commit.
         This is useful to minimize the size of transaction logs that keep
         track of uncommitted updates.
    -->

I did see that several new Solr 4 users created very large logs before they asked the mailing list how to avoid this:
http://lucene.472066.n3.nabble.com/Documentation-on-the-new-updateLog-transaction-log-feature-tc4000537.html#a4000538

Perhaps some of the information in this thread on the mailing list might be added to the documentation somewhere:
http://lucene.472066.n3.nabble.com/Testing-Solr4-first-impressions-and-problems-tc4013628.html#a4013814

I think I almost understand the hard-commit/soft-commit/autocommit/openSearcher discussion in the above thread, and it would seem that this could be put in the wiki or in the comments in the config file as appropriate.

Should I open a JIRA issue?

Tom

----
Log entry:

"Jan 14, 2013 12:40:31 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@59db9f45 main