[ https://issues.apache.org/jira/browse/LUCENE-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kyle L. updated LUCENE-2553:
----------------------------

    Description: 
We have been getting an {{IOException}} with the following stack trace:
\\
\\
{noformat}
java.io.IOException: read past EOF
        at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:154)
        at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
        at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:69)
        at org.apache.lucene.store.IndexInput.readLong(IndexInput.java:92)
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:218)
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:901)
        at com.cargurus.search.IndexManager$AllHitsUnsortedCollector.collect(IndexManager.java:520)
        at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:275)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:212)
        at org.apache.lucene.search.Searcher.search(Searcher.java:67)
        ...
{noformat}
\\
\\
We have implemented a basic custom collector that collects all hits in an 
unordered manner:

{code}
    private class AllHitsUnsortedCollector extends Collector {

        private Log logger = LogFactory.getLog(AllHitsUnsortedCollector.class); 
        private IndexReader reader;
        private int baselineDocumentId;
        private List<Document> matchingDocuments = new ArrayList<Document>();
        
        @Override
        public boolean acceptsDocsOutOfOrder() {
            return true;
        }

        @Override
        public void collect(int docId) throws IOException {

            int documentId = baselineDocumentId + docId;
            Document document = reader.document(documentId, getFieldSelector());
            
            if (document == null) {
                logger.info("Null document from search results!");
            } else {
                matchingDocuments.add(document);
            }
        }

        @Override
        public void setNextReader(IndexReader segmentReader, int baseDocId) throws IOException {
            this.reader = segmentReader;
            this.baselineDocumentId = baseDocId;
        }

        @Override
        public void setScorer(Scorer scorer) throws IOException {
            // do nothing
        }

        public List<Document> getMatchingDocuments() {
            return matchingDocuments;
        }
    }

{code}

The exception occurs when users run searches while indexing or optimization is
in progress. Our {{IndexReader}} is read-only. From the documentation I have
read, a read-only {{IndexReader}} instance should be unaffected by uncommitted
index changes and should return consistent results during indexing and
optimization. Because the exception occurs during indexing or optimization, it
looks as though the read-only {{IndexReader}} is somehow reading uncommitted
content. Is that possible?

The problem is sporadic and therefore difficult to reproduce; so far it has
occurred only in production.

We have rebuilt the indexes a number of times, but that has not resolved the
issue.

Is there any other information I can provide to help isolate the issue?

The most likely other possibility is that the {{Collector}} we have written is 
doing something it shouldn't. Any pointers?
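
One possible cause worth ruling out, sketched under assumptions rather than confirmed for this index: in the Lucene 3.0 {{Collector}} API, the {{docId}} passed to {{collect}} is relative to the reader passed to {{setNextReader}}, and {{baseDocId}} is only needed to turn it into an index-wide ID. If that applies here, {{reader.document(baselineDocumentId + docId)}} asks a segment reader for a document beyond its range whenever the base is nonzero. That would happen with a multi-segment index (for example while indexing is under way) and not with a freshly optimized single-segment index. The plain-Java model below uses no Lucene classes; {{Segment}}, {{collectAll}}, and the sample data are hypothetical stand-ins that only illustrate the arithmetic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SegmentDocIdSketch {

    /** Hypothetical stand-in for one segment reader; not a Lucene class. */
    static final class Segment {
        final int docBase;          // index-wide ID of this segment's first document
        final String[] storedDocs;  // stored documents, addressed by segment-local ID

        Segment(int docBase, String... storedDocs) {
            this.docBase = docBase;
            this.storedDocs = storedDocs;
        }

        /** Looks up a document by segment-local ID; fails if the ID is out of range. */
        String document(int localId) {
            if (localId < 0 || localId >= storedDocs.length) {
                throw new IndexOutOfBoundsException("read past EOF: local id " + localId
                        + ", segment holds " + storedDocs.length + " docs");
            }
            return storedDocs[localId];
        }
    }

    /**
     * Visits every document of every segment, the way a collector is driven
     * segment by segment. With addDocBase set, the base is added before the
     * segment-local lookup, mirroring the collector in this report.
     */
    static List<String> collectAll(List<Segment> segments, boolean addDocBase) {
        List<String> matches = new ArrayList<String>();
        for (Segment segment : segments) {
            for (int docId = 0; docId < segment.storedDocs.length; docId++) {
                int lookupId = addDocBase ? segment.docBase + docId : docId;
                matches.add(segment.document(lookupId));
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Two segments: the second one starts at index-wide ID 2.
        List<Segment> segments = Arrays.asList(
                new Segment(0, "a", "b"),
                new Segment(2, "c"));

        // Segment-local lookups succeed for every segment.
        System.out.println(collectAll(segments, false));

        // Adding the base fails as soon as a segment with a nonzero base is reached.
        try {
            collectAll(segments, true);
        } catch (IndexOutOfBoundsException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this is what is happening, the corresponding change in the collector would be to pass {{docId}} unchanged to {{reader.document(...)}} and to use the base only where an index-wide ID is needed.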



> IOException: read past EOF
> --------------------------
>
>                 Key: LUCENE-2553
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2553
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 3.0.2
>            Reporter: Kyle L.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


