It sounds like he is indexing on a local disk, but reading the files to be indexed from NFS - which would be fine.
You can get Lucene indexes to work on NFS (though it is still not recommended), but you need to use a custom IndexDeletionPolicy to keep older commit points around longer, and be sure not to use NIOFSDirectory.

Feak, Todd wrote:
> I seem to recall hearing something about *not* putting a Solr index directory
> on an NFS mount. Might want to search on that.
>
> That, of course, doesn't have anything to do with commits showing up
> unexpectedly in stack traces, per your original email.
>
> -Todd
>
> -----Original Message-----
> From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com]
> Sent: Tuesday, October 06, 2009 12:39 PM
> To: solr-user@lucene.apache.org; yo...@lucidimagination.com
> Subject: RE: Solr Timeouts
>
> That thread was blocking for an hour while all other threads were idle or
> blocked.
>
> -----Original Message-----
> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
> Sent: Tuesday, October 06, 2009 3:07 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Timeouts
>
> This specific thread was blocked for an hour?
> If so, I'd echo Lance... this is a local disk, right?
>
> -Yonik
> http://www.lucidimagination.com
>
> On Mon, Oct 5, 2009 at 2:11 PM, Giovanni Fernandez-Kincade
> <gfernandez-kinc...@capitaliq.com> wrote:
>
>> I just grabbed another stack trace for a thread that has been similarly
>> blocking for over an hour. Notice that there is no Commit in this one:
>>
>> http-8080-Processor67 [RUNNABLE] CPU time: 1:02:05
>> org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
>> org.apache.lucene.index.SegmentTermEnum.next()
>> org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
>> org.apache.lucene.index.TermInfosReader.get(Term, boolean)
>> org.apache.lucene.index.TermInfosReader.get(Term)
>> org.apache.lucene.index.SegmentTermDocs.seek(Term)
>> org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
>> org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
>> org.apache.lucene.index.IndexWriter.applyDeletes()
>> org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
>> org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
>> org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
>> org.apache.lucene.index.IndexWriter.updateDocument(Term, Document, Analyzer)
>> org.apache.lucene.index.IndexWriter.updateDocument(Term, Document)
>> org.apache.solr.update.DirectUpdateHandler2.addDoc(AddUpdateCommand)
>> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(AddUpdateCommand)
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(SolrContentHandler, AddUpdateCommand)
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(SolrContentHandler)
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(SolrQueryRequest, SolrQueryResponse, ContentStream)
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
>> org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
>> org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
>> org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
>> org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
>> org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
>> org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
>> org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
>> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
>> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
>> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
>> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
>> java.lang.Thread.run()
>>
>> -----Original Message-----
>> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
>> Sent: Monday, October 05, 2009 1:18 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr Timeouts
>>
>> OK... next step is to verify that SolrCell doesn't have a bug that
>> causes it to commit.
>> I'll try and verify today unless someone else beats me to it.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>> On Mon, Oct 5, 2009 at 1:04 PM, Giovanni Fernandez-Kincade
>> <gfernandez-kinc...@capitaliq.com> wrote:
>>
>>> I'm fairly certain that all of the indexing jobs are calling SOLR with
>>> commit=false. They all construct the indexing URLs using a CLR function I
>>> wrote, which takes a Commit parameter that is always set to false.
>>>
>>> Also, I don't see any calls to commit in the Tomcat logs (whereas normally
>>> when I make a commit call I do).
>>>
>>> This suggests that Solr is doing it automatically, but the extract handler
>>> doesn't seem to be the problem:
>>>
>>> <requestHandler name="/update/extract"
>>>     class="org.apache.solr.handler.extraction.ExtractingRequestHandler" startup="lazy">
>>>   <lst name="defaults">
>>>     <str name="uprefix">ignored_</str>
>>>     <str name="map.content">fileData</str>
>>>   </lst>
>>> </requestHandler>
>>>
>>> There is no external config file specified, and I don't see anything about
>>> commits here.
>>>
>>> I've tried setting up more detailed indexer logging but haven't been able
>>> to get it to work:
>>>
>>> <infoStream file="c:\solr\indexer.log">true</infoStream>
>>>
>>> I tried relative and absolute paths, but no dice so far.
>>>
>>> Any other ideas?
>>>
>>> -Gio.
>>>
>>> -----Original Message-----
>>> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
>>> Sent: Monday, October 05, 2009 12:52 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Solr Timeouts
>>>
>>>> This is what one of my SOLR requests looks like:
>>>>
>>>> http://titans:8080/solr/update/extract/?literal.versionId=684936&literal.filingDate=1997-12-04T00:00:00Z&literal.formTypeId=95&literal.companyId=3567904&literal.sourceId=0&resource.name=684936.txt&commit=false
>>>
>>> Have you verified that all of your indexing jobs (you said you had 4
>>> or 5) have commit=false?
>>>
>>> Also make sure that your extract handler doesn't have a default of
>>> something that could cause a commit - like commitWithin or something.
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>> On Mon, Oct 5, 2009 at 12:44 PM, Giovanni Fernandez-Kincade
>>> <gfernandez-kinc...@capitaliq.com> wrote:
>>>
>>>> Is there somewhere other than solrconfig.xml that the autoCommit feature
>>>> can be enabled? I've looked through that file and found autoCommit to be
>>>> commented out:
>>>>
>>>> <!--
>>>>   Perform a <commit/> automatically under certain conditions:
>>>>     maxDocs - number of updates since last commit is greater than this
>>>>     maxTime - oldest uncommitted update (in ms) is this long ago
>>>>   <autoCommit>
>>>>     <maxDocs>10000</maxDocs>
>>>>     <maxTime>1000</maxTime>
>>>>   </autoCommit>
>>>> -->
>>>>
>>>> -----Original Message-----
>>>> From: Feak, Todd [mailto:todd.f...@smss.sony.com]
>>>> Sent: Monday, October 05, 2009 12:40 PM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: RE: Solr Timeouts
>>>>
>>>> Actually, ignore my other response.
>>>>
>>>> I believe you are committing, whether you know it or not.
>>>>
>>>> This is in your provided stack trace:
>>>>
>>>> org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
>>>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
>>>>
>>>> I think Yonik gave you additional information for how to make it faster.
>>>>
>>>> -Todd
>>>>
>>>> -----Original Message-----
>>>> From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com]
>>>> Sent: Monday, October 05, 2009 9:30 AM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: RE: Solr Timeouts
>>>>
>>>> I'm not committing at all, actually - I'm waiting for all 6 million to be
>>>> done.
>>>>
>>>> -----Original Message-----
>>>> From: Feak, Todd [mailto:todd.f...@smss.sony.com]
>>>> Sent: Monday, October 05, 2009 12:10 PM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: RE: Solr Timeouts
>>>>
>>>> How often are you committing?
>>>>
>>>> Every time you commit, Solr will close the old index and open the new one.
>>>> If you are doing this in parallel from multiple jobs (the 4-5 you mention),
>>>> then eventually the server gets behind and commit requests start to pile
>>>> up. Once this starts to happen, it will cascade out of control if the rate
>>>> of commits isn't slowed.
>>>>
>>>> -Todd
>>>>
>>>> ________________________________
>>>> From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com]
>>>> Sent: Monday, October 05, 2009 9:04 AM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Solr Timeouts
>>>>
>>>> Hi,
>>>> I'm attempting to index approximately 6 million HTML/text files using SOLR
>>>> 1.4/Tomcat 6 on Windows Server 2003 x64. I'm running a 64-bit Tomcat and
>>>> JVM. I've fired up 4-5 different jobs that are making indexing requests
>>>> using the ExtractingRequestHandler, and everything works well for about
>>>> 30-40 minutes, after which all indexing requests start timing out. I
>>>> profiled the server and found that all of the threads are getting blocked
>>>> by this call to flush the Lucene index to disk (see below).
>>>>
>>>> This leads me to a few questions:
>>>>
>>>> 1. Is this normal?
>>>>
>>>> 2. Can I reduce the frequency with which this happens somehow? I've
>>>> greatly increased the indexing options in solrconfig.xml (attached here)
>>>> to no avail.
>>>>
>>>> 3. During these flushes, resource utilization (CPU, I/O, memory
>>>> consumption) is significantly down compared to when requests are being
>>>> handled. Is there any way to make this indexing go faster? I have plenty
>>>> of bandwidth on the machine.
>>>>
>>>> I appreciate any insight you can provide. We're currently using MS SQL
>>>> 2005 as our full-text solution and are pretty much miserable. So far SOLR
>>>> has been a great experience.
>>>>
>>>> Thanks,
>>>> Gio.
>>>>
>>>> http-8080-Processor21 [RUNNABLE] CPU time: 9:51
>>>> java.io.RandomAccessFile.seek(long)
>>>> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(byte[], int, int)
>>>> org.apache.lucene.store.BufferedIndexInput.refill()
>>>> org.apache.lucene.store.BufferedIndexInput.readByte()
>>>> org.apache.lucene.store.IndexInput.readVInt()
>>>> org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
>>>> org.apache.lucene.index.SegmentTermEnum.next()
>>>> org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
>>>> org.apache.lucene.index.TermInfosReader.get(Term, boolean)
>>>> org.apache.lucene.index.TermInfosReader.get(Term)
>>>> org.apache.lucene.index.SegmentTermDocs.seek(Term)
>>>> org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
>>>> org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
>>>> org.apache.lucene.index.IndexWriter.applyDeletes()
>>>> org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
>>>> org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
>>>> org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
>>>> org.apache.lucene.index.IndexWriter.closeInternal(boolean)
>>>> org.apache.lucene.index.IndexWriter.close(boolean)
>>>> org.apache.lucene.index.IndexWriter.close()
>>>> org.apache.solr.update.SolrIndexWriter.close()
>>>> org.apache.solr.update.DirectUpdateHandler2.closeWriter()
>>>> org.apache.solr.update.DirectUpdateHandler2.commit(CommitUpdateCommand)
>>>> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(CommitUpdateCommand)
>>>> org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
>>>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
>>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
>>>> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
>>>> org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
>>>> org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
>>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
>>>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
>>>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
>>>> org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
>>>> org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
>>>> org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
>>>> org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
>>>> org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
>>>> org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
>>>> org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
>>>> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
>>>> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
>>>> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
>>>> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
>>>> java.lang.Thread.run()

--
- Mark
http://www.lucidimagination.com
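For context on the custom IndexDeletionPolicy advice at the top of the thread: Lucene's IndexDeletionPolicy is a two-method callback (onInit/onCommit) that decides which commit points may be deleted, and the default policy deletes every commit except the newest - which is what breaks NFS readers still holding files from an older commit. The sketch below models the "keep older commit points around longer" idea without depending on Lucene: commits younger than a configurable age, plus the newest commit, survive. The class and method shapes here are illustrative stand-ins, not Lucene's actual API.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative model of an "expiration time" deletion policy for NFS:
 * keep the newest commit, and keep any older commit until it has aged
 * past expirationMillis, so readers on other NFS clients can finish
 * with files from older commit points before those files are deleted.
 */
class ExpirationTimePolicySketch {
    static final class Commit {
        final long timestampMillis;
        boolean deleted = false;
        Commit(long timestampMillis) { this.timestampMillis = timestampMillis; }
    }

    private final long expirationMillis;

    ExpirationTimePolicySketch(long expirationMillis) {
        this.expirationMillis = expirationMillis;
    }

    /** Mirrors the role of IndexDeletionPolicy.onCommit: commits arrive oldest-first. */
    void onCommit(List<Commit> commits, long nowMillis) {
        // Never delete the most recent commit (the last element).
        for (int i = 0; i < commits.size() - 1; i++) {
            Commit c = commits.get(i);
            if (nowMillis - c.timestampMillis > expirationMillis) {
                c.deleted = true; // old enough that NFS readers should be done with it
            }
        }
    }

    public static void main(String[] args) {
        // Keep commit points alive for one minute after they are superseded.
        ExpirationTimePolicySketch policy = new ExpirationTimePolicySketch(60_000);
        List<Commit> commits = new ArrayList<>();
        commits.add(new Commit(0));        // old commit
        commits.add(new Commit(90_000));   // recent commit
        commits.add(new Commit(100_000));  // newest commit
        policy.onCommit(commits, 100_000);
        System.out.println(commits.get(0).deleted); // true  (older than 60s)
        System.out.println(commits.get(1).deleted); // false (only 10s old)
        System.out.println(commits.get(2).deleted); // false (newest is always kept)
    }
}
```

With the real Lucene interface the same rule would go in onCommit over the supplied IndexCommit list, calling delete() on expired commits; pair it with a standard FSDirectory rather than NIOFSDirectory, as Mark notes.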