Re: IOException: read past EOF during optimize phase
I did see that bug, which is what made me suspect Lucene. In my case, though, I tracked the problem down to my own application. I was using Java's FileChannel.transferTo to copy my index from one location to another, and one of the files is bigger than 2^31-1 bytes. That file was corrupted (truncated) during the copy because I was only doing a single pass. I now loop the copy call until the entire file has been transferred, and everything works fine. DOH!
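For anyone hitting the same thing, here is a minimal sketch of the looping copy described above. It is not the original code from this thread; the class and method names are made up for illustration. The key point is that FileChannel.transferTo returns the number of bytes actually transferred, which can be less than the requested count (notably for files larger than 2^31-1 bytes), so the call has to be repeated until the whole file has been copied.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

public class LargeFileCopy {

    // Copies src to dest by looping FileChannel.transferTo until the whole
    // file has been transferred. A single transferTo call may move fewer
    // bytes than requested, so the loop is required for large segment files.
    public static void copy(File src, File dest) throws IOException {
        FileInputStream in = new FileInputStream(src);
        FileOutputStream out = new FileOutputStream(dest);
        try {
            FileChannel srcChannel = in.getChannel();
            FileChannel destChannel = out.getChannel();
            long size = srcChannel.size();
            long position = 0;
            while (position < size) {
                // transferTo returns the number of bytes actually transferred
                position += srcChannel.transferTo(position, size - position, destChannel);
            }
        } finally {
            in.close();
            out.close();
        }
    }
}

Some platforms also cap how many bytes a single transferTo call will move, so chunking each request (for example, a fixed number of megabytes per call) is a common variant of the same loop.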
Re: IOException: read past EOF during optimize phase
This may be a Lucene bug... IIRC, I saw at least one other Lucene user with a similar stack trace. I think the latest Lucene version (2.3 dev) should fix it if that's the case.

-Yonik

On Jan 16, 2008 3:07 PM, Kevin Osborn <[EMAIL PROTECTED]> wrote:
> I am using the embedded Solr API for my indexing process. [...]
Re: IOException: read past EOF during optimize phase
Our basic setup is master/slave. We just want to make sure that we are not syncing against an index that is in the middle of a large rebuild. But I think those issues are separate from what I am experiencing. I also tried this same scenario in a different development environment, and there were no problems there.

----- Original Message -----
From: Otis Gospodnetic <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Wednesday, January 16, 2008 2:33:03 PM
Subject: Re: IOException: read past EOF during optimize phase

> Kevin, Perhaps you want to look at how Solr can be used in a master-slave setup. [...]
Re: IOException: read past EOF during optimize phase
Kevin,

Perhaps you want to look at how Solr can be used in a master-slave setup. This will separate your indexing from searching. Don't have the URL, but it's on zee Wiki.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: Kevin Osborn <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Wednesday, January 16, 2008 5:25:34 PM
Subject: Re: IOException: read past EOF during optimize phase

> It is more of a file structure thing for our application. [...]
Re: IOException: read past EOF during optimize phase
It is more of a file structure thing for our application. We build in one place and do our index syncing in a different place. I doubt it is relevant to this issue, but figured I would include this information anyway.

----- Original Message -----
From: Otis Gospodnetic <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Wednesday, January 16, 2008 2:21:31 PM
Subject: Re: IOException: read past EOF during optimize phase

> Kevin, Don't have the answer to EOF [...]
Re: IOException: read past EOF during optimize phase
Kevin,

Don't have the answer to the EOF, but I'm wondering why the index is moving. You don't need to do that as far as Solr is concerned.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: Kevin Osborn <[EMAIL PROTECTED]>
To: Solr
Sent: Wednesday, January 16, 2008 3:07:23 PM
Subject: IOException: read past EOF during optimize phase

> I am using the embedded Solr API for my indexing process. [...]
IOException: read past EOF during optimize phase
I am using the embedded Solr API for my indexing process. I created a brand new index with my application without any problem. I then ran my indexer in incremental mode. This process copies the working index to a temporary Solr location, adds/updates any records, optimizes the index, and then copies it back to the working location. There are currently no instances of Solr reading this index. Also, I commit after every 10 rows. The schema.xml and solrconfig.xml files have not changed.

Here is my function call:

protected void optimizeProducts() throws IOException {
    UpdateHandler updateHandler = m_SolrCore.getUpdateHandler();
    CommitUpdateCommand commitCmd = new CommitUpdateCommand(true);
    commitCmd.optimize = true;

    updateHandler.commit(commitCmd);

    log.info("Optimized index");
}

So, during the optimize phase, I get the following stack trace:

java.io.IOException: read past EOF
        at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:89)
        at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:34)
        at org.apache.lucene.store.IndexInput.readChars(IndexInput.java:107)
        at org.apache.lucene.store.IndexInput.readString(IndexInput.java:93)
        at org.apache.lucene.index.FieldsReader.addFieldForMerge(FieldsReader.java:211)
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:119)
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:323)
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:206)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:96)
        at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:1835)
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1195)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:508)
        at ...

There are no exceptions or anything else that appears to be incorrect during the adds or commits. After this, the index files are still non-optimized.

I know there is not a whole lot to go on here. Anything in particular that I should look at?
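Not part of the original question, but given where the thread ended up (the truncated copy described in the first reply above), one quick sanity check when an index is copied between locations is to compare the length of every file in the copy against its source. A rough sketch, with hypothetical directory paths:

import java.io.File;

// Rough sketch: flags any file whose size differs between the source index
// directory and the copied index directory (the paths below are hypothetical).
public class IndexCopyCheck {
    public static void main(String[] args) {
        File srcDir = new File("/path/to/working/index");
        File destDir = new File("/path/to/temp/index");
        File[] files = srcDir.listFiles();
        if (files == null) {
            System.out.println("Source directory not found: " + srcDir);
            return;
        }
        for (File src : files) {
            File dest = new File(destDir, src.getName());
            long destLen = dest.exists() ? dest.length() : -1;
            if (destLen != src.length()) {
                System.out.println("Size mismatch: " + src.getName()
                        + " source=" + src.length() + " copy=" + destLen);
            }
        }
    }
}

A mismatch here would point at an incomplete copy rather than at Lucene or Solr, which is exactly what happened in this case.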