I just grabbed another stack trace for a thread that has been blocked the same way for over an hour. Notice that there is no commit in this one:

http-8080-Processor67 [RUNNABLE] CPU time: 1:02:05
org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
org.apache.lucene.index.SegmentTermEnum.next()
org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
org.apache.lucene.index.TermInfosReader.get(Term, boolean)
org.apache.lucene.index.TermInfosReader.get(Term)
org.apache.lucene.index.SegmentTermDocs.seek(Term)
org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
org.apache.lucene.index.IndexWriter.applyDeletes()
org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
org.apache.lucene.index.IndexWriter.updateDocument(Term, Document, Analyzer)
org.apache.lucene.index.IndexWriter.updateDocument(Term, Document)
org.apache.solr.update.DirectUpdateHandler2.addDoc(AddUpdateCommand)
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(AddUpdateCommand)
org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(SolrContentHandler, AddUpdateCommand)
org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(SolrContentHandler)
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(SolrQueryRequest, SolrQueryResponse, ContentStream)
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
java.lang.Thread.run()

-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Monday, October 05, 2009 1:18 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Timeouts

OK... next step is to verify that SolrCell doesn't have a bug that
causes it to commit.
I'll try and verify today unless someone else beats me to it.

-Yonik
http://www.lucidimagination.com

On Mon, Oct 5, 2009 at 1:04 PM, Giovanni Fernandez-Kincade
<gfernandez-kinc...@capitaliq.com> wrote:
> I'm fairly certain that all of the indexing jobs are calling SOLR with
> commit=false. They all construct the indexing URLs using a CLR function I
> wrote, which takes in a Commit parameter, which is always set to false.
>
> Also, I don't see any calls to commit in the Tomcat logs (whereas normally
> when I make a commit call I do).
>
> This suggests that Solr is doing it automatically, but the extract handler
> doesn't seem to be the problem:
> <requestHandler name="/update/extract"
> class="org.apache.solr.handler.extraction.ExtractingRequestHandler"
> startup="lazy">
> <lst name="defaults">
> <str name="uprefix">ignored_</str>
> <str name="map.content">fileData</str>
> </lst>
> </requestHandler>
>
> There is no external config file specified, and I don't see anything about
> commits here.
>
> I've tried setting up more detailed indexer logging but haven't been able to
> get it to work:
> <infoStream file="c:\solr\indexer.log">true</infoStream>
>
> I tried relative and absolute paths, but no dice so far.
>
> Any other ideas?
>
> -Gio.
>
> -----Original Message-----
> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
> Sent: Monday, October 05, 2009 12:52 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Timeouts
>
>> This is what one of my SOLR requests look like:
>>
>> http://titans:8080/solr/update/extract/?literal.versionId=684936&literal.filingDate=1997-12-04T00:00:00Z&literal.formTypeId=95&literal.companyId=3567904&literal.sourceId=0&resource.name=684936.txt&commit=false
>
> Have you verified that all of your indexing jobs (you said you had 4
> or 5) have commit=false?
>
> Also make sure that your extract handler doesn't have a default of
> something that could cause a commit - like commitWithin or something.
>
> -Yonik
> http://www.lucidimagination.com
>
> On Mon, Oct 5, 2009 at 12:44 PM, Giovanni Fernandez-Kincade
> <gfernandez-kinc...@capitaliq.com> wrote:
>> Is there somewhere other than solrConfig.xml that the autoCommit feature is
>> enabled?
>> I've looked through that file and found autocommit to be commented
>> out:
>>
>> <!--
>> Perform a <commit/> automatically under certain conditions:
>> maxDocs - number of updates since last commit is greater than this
>> maxTime - oldest uncommited update (in ms) is this long ago
>> <autoCommit>
>> <maxDocs>10000</maxDocs>
>> <maxTime>1000</maxTime>
>> </autoCommit>
>> -->
>>
>> -----Original Message-----
>> From: Feak, Todd [mailto:todd.f...@smss.sony.com]
>> Sent: Monday, October 05, 2009 12:40 PM
>> To: solr-user@lucene.apache.org
>> Subject: RE: Solr Timeouts
>>
>> Actually, ignore my other response.
>>
>> I believe you are committing, whether you know it or not.
>>
>> This is in your provided stack trace
>> org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
>>
>> I think Yonik gave you additional information for how to make it faster.
>>
>> -Todd
>>
>> -----Original Message-----
>> From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com]
>> Sent: Monday, October 05, 2009 9:30 AM
>> To: solr-user@lucene.apache.org
>> Subject: RE: Solr Timeouts
>>
>> I'm not committing at all actually - I'm waiting for all 6 million to be
>> done.
>>
>> -----Original Message-----
>> From: Feak, Todd [mailto:todd.f...@smss.sony.com]
>> Sent: Monday, October 05, 2009 12:10 PM
>> To: solr-user@lucene.apache.org
>> Subject: RE: Solr Timeouts
>>
>> How often are you committing?
>>
>> Every time you commit, Solr will close the old index and open the new one.
>> If you are doing this in parallel from multiple jobs (4-5 you mention) then
>> eventually the server gets behind and you start to pile up commit requests.
>> Once this starts to happen, it will cascade out of control if the rate of
>> commits isn't slowed.
>>
>> -Todd
>>
>> ________________________________
>> From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com]
>> Sent: Monday, October 05, 2009 9:04 AM
>> To: solr-user@lucene.apache.org
>> Subject: Solr Timeouts
>>
>> Hi,
>>
>> I'm attempting to index approximately 6 million HTML/Text files using SOLR
>> 1.4/Tomcat6 on Windows Server 2003 x64. I'm running 64 bit Tomcat and JVM.
>> I've fired up 4-5 different jobs that are making indexing requests using the
>> ExtractionRequestHandler, and everything works well for about 30-40 minutes,
>> after which all indexing requests start timing out. I profiled the server
>> and found that all of the threads are getting blocked by this call to flush
>> the Lucene index to disk (see below).
>>
>> This leads me to a few questions:
>>
>> 1. Is this normal?
>>
>> 2. Can I reduce the frequency with which this happens somehow? I've
>> greatly increased the indexing options in SolrConfig.xml (attached here) to
>> no avail.
>>
>> 3. During these flushes, resource utilization (CPU, I/O, Memory
>> Consumption) is significantly down compared to when requests are being
>> handled. Is there any way to make this index go faster? I have plenty of
>> bandwidth on the machine.
>>
>> I appreciate any insight you can provide. We're currently using MS SQL 2005
>> as our full-text solution and are pretty much miserable. So far SOLR has
>> been a great experience.
>>
>> Thanks,
>> Gio.
>>
>> http-8080-Processor21 [RUNNABLE] CPU time: 9:51
>> java.io.RandomAccessFile.seek(long)
>> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(byte[], int, int)
>> org.apache.lucene.store.BufferedIndexInput.refill()
>> org.apache.lucene.store.BufferedIndexInput.readByte()
>> org.apache.lucene.store.IndexInput.readVInt()
>> org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
>> org.apache.lucene.index.SegmentTermEnum.next()
>> org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
>> org.apache.lucene.index.TermInfosReader.get(Term, boolean)
>> org.apache.lucene.index.TermInfosReader.get(Term)
>> org.apache.lucene.index.SegmentTermDocs.seek(Term)
>> org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
>> org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
>> org.apache.lucene.index.IndexWriter.applyDeletes()
>> org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
>> org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
>> org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
>> org.apache.lucene.index.IndexWriter.closeInternal(boolean)
>> org.apache.lucene.index.IndexWriter.close(boolean)
>> org.apache.lucene.index.IndexWriter.close()
>> org.apache.solr.update.SolrIndexWriter.close()
>> org.apache.solr.update.DirectUpdateHandler2.closeWriter()
>> org.apache.solr.update.DirectUpdateHandler2.commit(CommitUpdateCommand)
>> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(CommitUpdateCommand)
>> org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
>> org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
>> org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
>> org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
>> org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
>> org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
>> org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
>> org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
>> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
>> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
>> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
>> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
>> java.lang.Thread.run()
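
P.S. For anyone comparing the two traces in this thread by eye: below is a rough sketch of how the check could be automated when there are many threads to look at. It only does substring matching on frame names that appear in the traces above. The class name FlushTrigger, the Cause labels, and the classify method are hypothetical helpers of mine, not part of Solr or Lucene, and the sketch assumes one frame per string with the top of the stack first.

```java
import java.util.List;

/** Hypothetical helper (not part of Solr or Lucene): labels how a thread reached IndexWriter.flush. */
final class FlushTrigger {

    /** Labels invented for this sketch. */
    enum Cause { EXPLICIT_COMMIT, ADD_TRIGGERED_FLUSH, OTHER_FLUSH, NOT_FLUSHING }

    /** Classifies one thread's frames by substring matching on names seen in the traces. */
    static Cause classify(List<String> frames) {
        boolean flushing = false;
        boolean viaCommit = false;
        boolean viaAdd = false;
        for (String frame : frames) {
            if (frame.contains("IndexWriter.flush(")) flushing = true;
            if (frame.contains("DirectUpdateHandler2.commit(")) viaCommit = true;
            if (frame.contains("IndexWriter.updateDocument(")) viaAdd = true;
        }
        if (!flushing) return Cause.NOT_FLUSHING;
        if (viaCommit) return Cause.EXPLICIT_COMMIT;   // flush reached through a commit
        if (viaAdd) return Cause.ADD_TRIGGERED_FLUSH;  // flush reached through an add/update
        return Cause.OTHER_FLUSH;
    }

    public static void main(String[] args) {
        // Abbreviated frames copied from the two traces in this thread.
        List<String> processor21 = List.of(
            "org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)",
            "org.apache.lucene.index.IndexWriter.closeInternal(boolean)",
            "org.apache.solr.update.DirectUpdateHandler2.commit(CommitUpdateCommand)");
        List<String> processor67 = List.of(
            "org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)",
            "org.apache.lucene.index.IndexWriter.updateDocument(Term, Document, Analyzer)",
            "org.apache.solr.update.DirectUpdateHandler2.addDoc(AddUpdateCommand)");

        System.out.println("Processor21: " + classify(processor21)); // EXPLICIT_COMMIT
        System.out.println("Processor67: " + classify(processor67)); // ADD_TRIGGERED_FLUSH
    }
}
```

This only says which path led into the flush. It does not explain why applyDeletes takes so long once it gets there.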