I have an index (different from the ones mentioned yesterday) that was working fine with about 3M docs, but when I added a bunch more docs, bringing it closer to 4M, the index seemed to get corrupted. In particular, now when I start Solr up, or when my indexing process tries to add a document, I get a complaint about missing index files.

The error on startup looks like this:

<record>
  <date>2008-08-15T10:18:54</date>
  <millis>1218820734592</millis>
  <sequence>92</sequence>
  <logger>org.apache.solr.core.MultiCore</logger>
  <level>SEVERE</level>
  <class>org.apache.solr.common.SolrException</class>
  <method>log</method>
  <thread>10</thread>
  <message>java.lang.RuntimeException: java.io.FileNotFoundException: /ssd/solr-9999/solr/exhibitcore/data/index/_p7.fdt (No such file or directory)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:733)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:387)
    at org.apache.solr.core.MultiCore.create(MultiCore.java:255)
    at org.apache.solr.core.MultiCore.load(MultiCore.java:139)
    at org.apache.solr.servlet.SolrDispatchFilter.initMultiCore(SolrDispatchFilter.java:147)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:75)
    at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
    at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
    at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
    at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
    at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
    at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
    at org.mortbay.jetty.Server.doStart(Server.java:210)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.mortbay.start.Main.invokeMain(Main.java:183)
    at org.mortbay.start.Main.start(Main.java:497)
    at org.mortbay.start.Main.main(Main.java:115)
Caused by: java.io.FileNotFoundException: /ssd/solr-9999/solr/exhibitcore/data/index/_p7.fdt (No such file or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
    at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
    at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:445)
    at org.apache.lucene.index.FieldsReader.<init>(FieldsReader.java:75)
    at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:308)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:197)
    at org.apache.lucene.index.MultiSegmentReader.<init>(MultiSegmentReader.java:55)
    at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:75)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636)
    at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
    at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:93)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:724)
    ... 29 more
  </message>
</record>

And the error on doc add looks like this:

<record>
  <date>2008-08-15T09:51:30</date>
  <millis>1218819090142</millis>
  <sequence>6571937</sequence>
  <logger>org.apache.solr.core.SolrCore</logger>
  <level>SEVERE</level>
  <class>org.apache.solr.common.SolrException</class>
  <method>log</method>
  <thread>14</thread>
  <message>java.io.FileNotFoundException: /ssd/solr-9999/solr/exhibitcore/data/index/_p7.fdt (No such file or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
    at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
    at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:445)
    at org.apache.lucene.index.FieldsReader.<init>(FieldsReader.java:75)
    at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:308)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:197)
    at org.apache.lucene.index.MultiSegmentReader.<init>(MultiSegmentReader.java:55)
    at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:75)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636)
    at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
    at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:93)
    at org.apache.solr.core.SolrCore.newSearcher(SolrCore.java:213)
    at org.apache.solr.update.DirectUpdateHandler2.openSearcher(DirectUpdateHandler2.java:207)
    at org.apache.solr.update.DirectUpdateHandler2.doDeletions(DirectUpdateHandler2.java:466)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:295)
    at org.apache.solr.handler.RichDocumentLoader.doAdd(RichDocumentRequestHandler.java:231)
    at org.apache.solr.handler.RichDocumentLoader.addDoc(RichDocumentRequestHandler.java:236)
    at org.apache.solr.handler.RichDocumentLoader.load(RichDocumentRequestHandler.java:278)
    at org.apache.solr.handler.RichDocumentRequestHandler.handleRequestBody(RichDocumentRequestHandler.java:80)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:228)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:965)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:274)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
  </message>
</record>

I just checked, and the files that Solr is complaining about are indeed not in the index directory. The earliest indication of trouble I found in my log was an error like this:

<record>
  <date>2008-08-15T09:47:48</date>
  <millis>1218818868528</millis>
  <sequence>6525387</sequence>
  <logger>org.apache.solr.update.UpdateHandler</logger>
  <level>SEVERE</level>
  <class>org.apache.solr.update.DirectUpdateHandler2$CommitTracker</class>
  <method>run</method>
  <thread>15</thread>
  <message>auto commit error...</message>
</record>

There may have been SEVERE errors before this, but my log doesn't go back to the very beginning. It's interesting that while adding documents seems to be usually failing now (yielding the "file not found" exception), I could add documents successfully for some time before things started to go wrong. What's more, some documents do seem to *still* get added successfully.

I'm using the rich document update handler, so the successful log entries look like this:

<record>
  <date>2008-08-15T09:50:54</date>
  <millis>1218819054600</millis>
  <sequence>6561534</sequence>
  <logger>org.apache.solr.core.SolrCore</logger>
  <level>INFO</level>
  <class>org.apache.solr.core.SolrCore</class>
  <method>execute</method>
  <thread>14</thread>
  <message>[exhibitcore] webapp=/solr path=/update/rich params={filenumber=333-112076-85&formtype=S-4/A&stream.fieldname=body&exhibittype=EX-3.99&date=2004-02-09T00:00:00Z&companyname=PROGRESSIVE+VENTURE+CAPITAL+CORP&exhibitdescription=EXHIBIT+3.99&id=37684831&cik=1275089&stream.type=html&filingkey=0001193125-04-017196/1275089/FILER&stateofincorporation=WV&fieldnames=key,filingkey,companyname,accessionnumber,cik,date,exhibitdescription,exhibittype,exhibittypeint,filenumber,filename,formtype,stateofheadquarters,stateofincorporation&filename=dex399.htm&exhibittypeint=3&accessionnumber=0001193125-04-017196&stateofheadquarters=~&key=0001193125-04-017196/1275089/FILER/dex399.htm} status=0 QTime=9 </message>
</record>

The deletes I'm seeing in my log also seem to be working fine; I get log entries like

<record>
  <date>2008-08-15T09:50:54</date>
  <millis>1218819054602</millis>
  <sequence>6561535</sequence>
  <logger>org.apache.solr.update.processor.UpdateRequestProcessor</logger>
  <level>INFO</level>
  <class>org.apache.solr.update.processor.LogUpdateProcessor</class>
  <method>finish</method>
  <thread>14</thread>
  <message>{delete=[0001193125-04-017196/1275096/FILER/dex231.htm]} 0 1</message>
</record>

and

<record>
  <date>2008-08-15T09:51:30</date>
  <millis>1218819090153</millis>
  <sequence>6571944</sequence>
  <logger>org.apache.solr.update.UpdateHandler</logger>
  <level>INFO</level>
  <class>org.apache.solr.update.DirectUpdateHandler2</class>
  <method>doDeletions</method>
  <thread>13</thread>
  <message>DirectUpdateHandler2 deleting and removing dups for 100788 ids</message>
</record>

After I noticed this corruption thing, I thought I'd see if I could
get it to happen again, so I went back to the original 3M-ish doc index and tried adding the new documents again. (If it matters, the new docs would have come into the index in a different order on this retry.) This too resulted in an index with "file not found" problems.

The following may or may not be relevant: I built the base 3M-ish doc index on a Windows machine, and it's a compound (.cfs) format index. (I actually created it not with Solr, but by using the index merging tool that comes with Lucene to merge three different non-compound format indexes, which I'd previously made with Solr, into a single index.) Before I started adding documents, I moved the index to a Linux machine running a newer version of Solr/Lucene than was on the Windows machine. Everything described above happened on Linux.

Any thoughts?

Thanks a bunch,
Chris
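
P.S. For anyone who wants to repeat the "are the files really gone?" check on their own logs, it could be scripted roughly as below. This is only a Python sketch, not something that ships with Solr or Lucene: the function names `referenced_files` and `missing_files` are made up for illustration, and the regular expression assumes the exact "(No such file or directory)" wording seen in the traces above.

```python
import os
import re

# Matches the path in messages such as
# "java.io.FileNotFoundException: /some/dir/_p7.fdt (No such file or directory)".
# Assumption: paths contain no whitespace.
MISSING_RE = re.compile(
    r"java\.io\.FileNotFoundException: (\S+) \(No such file or directory\)"
)


def referenced_files(log_text):
    """Return the distinct paths named in FileNotFoundException messages, sorted."""
    return sorted(set(MISSING_RE.findall(log_text)))


def missing_files(log_text):
    """Return the referenced paths that do not exist on disk at call time."""
    return [path for path in referenced_files(log_text) if not os.path.exists(path)]
```

Passing the full text of the log to `missing_files` lists each complained-about path once, however many times it shows up in stack traces, and keeps only those that are still absent from the filesystem.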