[ 
https://issues.apache.org/jira/browse/LUCENE-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891984#action_12891984
 ] 

Michael McCandless commented on LUCENE-2553:
--------------------------------------------

The IndexReader/Searcher.document call is itself fairly slow, regardless of 
whether you call it inside a custom Collector or outside.  If you need random 
access to certain field(s) across all docs, it's best to use 
FieldCache.DEFAULT.getXXX instead.
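
As a rough illustration of why the FieldCache route is cheaper, here is a 
minimal plain-Java sketch with no Lucene dependency. It assumes the cache 
hands back one array per segment, indexed by segment-relative doc ID, which is 
how FieldCache.DEFAULT.getStrings(reader, field) behaves when given a segment 
reader. Segment, ValueCollector, setNextSegment and runDemo are hypothetical 
names made up for this sketch; they are not Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model (no Lucene) of a per-segment field cache lookup. */
final class SegmentFieldCacheSketch {

    /** Hypothetical stand-in for one index segment: one field value per doc. */
    static final class Segment {
        final int docBase;      // index-wide ID of this segment's first document
        final String[] values;  // values[i] belongs to segment-relative doc i

        Segment(int docBase, String[] values) {
            this.docBase = docBase;
            this.values = values;
        }
    }

    /** Hypothetical collector: one array lookup per hit, no stored-field I/O. */
    static final class ValueCollector {
        private String[] current;  // array for the segment currently being searched
        private int docBase;
        private final List<String> values = new ArrayList<String>();
        private final List<Integer> globalIds = new ArrayList<Integer>();

        /** Models setNextReader: swap in the next segment's array once. */
        void setNextSegment(Segment segment) {
            this.current = segment.values;
            this.docBase = segment.docBase;
        }

        /** Models collect: docId is relative to the current segment. */
        void collect(int docId) {
            // The per-segment array is indexed by the segment-relative ID as is.
            values.add(current[docId]);
            // docBase is only added to form an index-wide ID.
            globalIds.add(docBase + docId);
        }

        List<String> getValues() { return values; }
        List<Integer> getGlobalIds() { return globalIds; }
    }

    /** Feeds two made-up segments and one hit per segment through the collector. */
    static ValueCollector runDemo() {
        Segment first = new Segment(0, new String[] {"ford", "honda", "bmw"});
        Segment second = new Segment(3, new String[] {"audi", "kia"});

        ValueCollector collector = new ValueCollector();
        collector.setNextSegment(first);
        collector.collect(1);   // second doc of the first segment
        collector.setNextSegment(second);
        collector.collect(0);   // first doc of the second segment
        return collector;
    }

    public static void main(String[] args) {
        ValueCollector collector = runDemo();
        System.out.println(collector.getValues());     // [honda, audi]
        System.out.println(collector.getGlobalIds());  // [1, 3]
    }
}
```

The point of the sketch is that each hit costs a single array read once the 
per-segment array is in memory, whereas document() has to seek into the stored 
fields files for every hit.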

> IOException: read past EOF
> --------------------------
>
>                 Key: LUCENE-2553
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2553
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 3.0.2
>            Reporter: Kyle L.
>
> We have been getting an {{IOException}} with the following stack trace:
> \\
> \\
> {noformat}
> java.io.IOException: read past EOF
>       at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:154)
>       at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
>       at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:69)
>       at org.apache.lucene.store.IndexInput.readLong(IndexInput.java:92)
>       at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:218)
>       at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:901)
>       at com.cargurus.search.IndexManager$AllHitsUnsortedCollector.collect(IndexManager.java:520)
>       at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:275)
>       at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:212)
>       at org.apache.lucene.search.Searcher.search(Searcher.java:67)
>         ...
> {noformat}
> \\
> \\
> We have implemented a basic custom collector that collects all hits in an 
> unordered manner:
> {code}
>     private class AllHitsUnsortedCollector extends Collector {
>         private Log logger = LogFactory.getLog(AllHitsUnsortedCollector.class);
>         private IndexReader reader;
>         private int baselineDocumentId;
>         private List<Document> matchingDocuments = new ArrayList<Document>();
>         
>         @Override
>         public boolean acceptsDocsOutOfOrder() {
>             return true;
>         }
>         @Override
>         public void collect(int docId) throws IOException {
>             int documentId = baselineDocumentId + docId;
>             Document document = reader.document(documentId, getFieldSelector());
>             
>             if (document == null) {
>                 logger.info("Null document from search results!");
>             } else {
>                 matchingDocuments.add(document);
>             }
>         }
>         @Override
>         public void setNextReader(IndexReader segmentReader, int baseDocId) throws IOException {
>             this.reader = segmentReader;
>             this.baselineDocumentId = baseDocId;
>         }
>         @Override
>         public void setScorer(Scorer scorer) throws IOException {
>             // do nothing
>         }
>         public List<Document> getMatchingDocuments() {
>             return matchingDocuments;
>         }
>     }
> {code}
> The exception arises when users perform searches while indexing/optimization 
> is occurring. Our {{IndexReader}} is read-only. From the documentation I have 
> read, a read-only {{IndexReader}} instance should be immune from any 
> uncommitted index changes and should return consistent results during 
> indexing and optimization. As this exception occurs during 
> indexing/optimization, it seems to me that the read-only {{IndexReader}} is 
> somehow stumbling upon the uncommitted content? 
> The problem is difficult to replicate as it is sporadic in nature and so far 
> has only occurred in Production.
> We have rebuilt the indexes a number of times, but that does not seem to 
> alleviate the issue.
> Any other information I can provide that will help isolate the issue? 
> The most likely other possibility is that the {{Collector}} we have written 
> is doing something it shouldn't. Any pointers?


