Jerome,

See those "waitFlush=true,waitSearcher=true" settings? Do things improve if you make them false? (I'm not sure how to do that with autocommit without looking at the config, and I'm not sure it makes a difference when autocommit is what triggers the commits.)
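For an explicit commit (as opposed to one triggered by autocommit), those flags can be passed in the update message itself. A sketch of what that looks like, posted to the /update handler -- the host and port here are just the usual defaults, not taken from Jerome's setup:

```xml
<!-- POST this to http://localhost:8983/solr/update -->
<!-- waitFlush/waitSearcher=false: return before the flush completes
     and before the new searcher is registered, so the client isn't
     blocked for the duration of the commit work -->
<commit waitFlush="false" waitSearcher="false"/>
```

The trade-off is that the client gets its response back quickly, but documents are not guaranteed to be visible to searches until the new searcher actually opens.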
Re deleted docs: they are probably getting expunged; it's just that new deletes keep accumulating, so those two numbers will never be equal without an optimize.

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch

----- Original Message ----
> From: Jerome L Quinn <jlqu...@us.ibm.com>
> To: solr-user@lucene.apache.org
> Sent: Thu, January 14, 2010 9:59:12 PM
> Subject: [1.3] help with update timeout issue?
>
> Hi, folks,
>
> I am using Solr 1.3 pretty successfully, but am running into an issue that
> hits once in a long while. I'm still on 1.3 because I have some custom
> code I will have to port forward to 1.4.
>
> My basic setup is that I have data sources continually pushing data into
> Solr, around 20K adds per day. The index is currently around 100G, stored
> on local disk on a fast Linux server. I'm trying to make new docs
> searchable as quickly as possible, so I currently have autocommit set to
> 15s. I originally had 3s, but that seemed a little too unstable. I never
> optimize the index, since an optimize locks things up solid for 2 hours,
> dropping docs until it completes. I'm using the default segment merging
> settings.
>
> Every once in a while I get a socket timeout when trying to add a
> document. I traced it to a 20s timeout and then found the corresponding
> point in the Solr log:
>
> Jan 13, 2010 2:59:15 PM org.apache.solr.core.SolrCore execute
> INFO: [tales] webapp=/solr path=/update params={} status=0 QTime=2
> Jan 13, 2010 2:59:15 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true)
> Jan 13, 2010 2:59:56 PM org.apache.solr.search.SolrIndexSearcher
> INFO: Opening searc...@26e926e9 main
> Jan 13, 2010 2:59:56 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: end_commit_flush
>
> Solr locked up for 41 seconds here while doing some of the commit work.
> So, I have a few questions.
>
> Is this related to GC?
> Does Solr always lock up when merging segments, and do I just have to live
> with losing the doc I want to add?
> Is there a timeout that would guarantee a successful write?
> Should I just retry in this situation? If so, how do I distinguish between
> this and Solr actually being down?
> I have already had issues in the past with too many open files, so
> increasing the merge factor isn't an option.
>
> On a related note, I previously asked about optimizing and was told that
> segment merging would take care of cleaning up deleted docs. However, I
> have the following stats for my index:
>
> numDocs : 2791091
> maxDoc  : 4811416
>
> My understanding is that numDocs is the number of searchable docs and
> maxDoc is the count including docs that will disappear after an optimize.
> How do I get this cleanup without using optimize, since optimize locks
> Solr up for multiple hours? I'm deleting old docs daily as well.
>
> Thanks for all the help,
> Jerry
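On the "should I just retry?" question above: one common approach is a bounded retry with exponential backoff, sized so the total retry budget comfortably exceeds the observed commit pause (41s here vs. a 20s client timeout). If the retries are exhausted, the failure is more likely real downtime than a commit stall. A minimal sketch -- the function and parameter names are illustrative, not a Solr API, and the actual HTTP call is left as a callable you supply:

```python
import time

def add_with_retry(send_update, doc, retries=3, backoff=2.0):
    """Try to add a doc; on a timeout-style error, wait and retry.

    send_update is any callable that performs the actual update and
    raises on failure (e.g. a socket timeout). These names are
    hypothetical placeholders, not part of any Solr client library.
    """
    delay = backoff
    for attempt in range(retries + 1):
        try:
            return send_update(doc)
        except IOError:  # socket timeouts surface as IOError subclasses
            if attempt == retries:
                raise  # budget exhausted: Solr may genuinely be down
            time.sleep(delay)
            delay *= 2  # exponential backoff between attempts

# Example with a stub that fails twice, then succeeds:
calls = {"n": 0}
def flaky(doc):
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("timed out")
    return "ok"

print(add_with_retry(flaky, {"id": 1}, retries=3, backoff=0.01))  # → ok
```

With a backoff of 2s and 3 retries the wrapper waits up to 2+4+8 = 14 seconds across attempts; for a 41-second commit stall like the one logged above, the budget would need to be larger.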