"fdx size mismatch" overflow causes RuntimeException
----------------------------------------------------

                 Key: LUCENE-2420
                 URL: https://issues.apache.org/jira/browse/LUCENE-2420
             Project: Lucene - Java
          Issue Type: Bug
          Components: Index
    Affects Versions: 3.0.1
         Environment: CentOS 5.4
            Reporter: Steven Bethard


I just saw the following error:

java.lang.RuntimeException: after flush: fdx size mismatch: -512764976 docs vs 30257618564 length in bytes of _0.fdx file exists?=true
        at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:97)
        at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:51)
        at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:371)
        at org.apache.lucene.index.IndexWriter.flushDocStores(IndexWriter.java:1724)
        at org.apache.lucene.index.IndexWriter.doFlushInternal(IndexWriter.java:3565)
        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3491)
        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3482)
        at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1658)
        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1621)
        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1585)

Note the negative SegmentWriteState.numDocsInStore. I assume this is because 
Lucene has a limit of 2^31 - 1 = 2147483647 (Integer.MAX_VALUE) documents per 
index, though I couldn't find this documented clearly anywhere. It would have 
been nice to get this error earlier, back when I exceeded the limit, rather 
than now, after a bunch of indexing that was apparently doomed to fail.
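
The numbers in the exception message can be checked by hand. The following is 
a minimal sketch, assuming the .fdx file consists of a 4-byte header followed 
by one 8-byte entry per document (an assumption about the file layout, though 
it is consistent with the figures reported above). The class and method names 
are hypothetical and are not part of Lucene.

```java
// Sketch: relate the .fdx length in the error message to the negative doc count.
// Assumption: .fdx = 4-byte header + one 8-byte entry per stored document.
final class FdxOverflowDemo {
    static final long HEADER_BYTES = 4L;
    static final long BYTES_PER_DOC = 8L;

    /** Document count implied by the .fdx file length under the assumed layout. */
    static long docsFromFdxLength(long fileLength) {
        return (fileLength - HEADER_BYTES) / BYTES_PER_DOC;
    }

    /** Value a 32-bit int counter holds after counting actualDocs documents. */
    static int wrappedCounter(long actualDocs) {
        return (int) actualDocs; // narrowing keeps only the low 32 bits
    }

    public static void main(String[] args) {
        long fdxLength = 30257618564L; // length reported in the exception
        long actualDocs = docsFromFdxLength(fdxLength);
        System.out.println("docs implied by file length: " + actualDocs);
        System.out.println("same count held in an int:   " + wrappedCounter(actualDocs));
        System.out.println("Integer.MAX_VALUE:           " + Integer.MAX_VALUE);
    }
}
```

Under that assumption the file length implies 3782202320 documents, which 
wraps to exactly the -512764976 shown in the message once it is stored in an 
int.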

Hence, two suggestions:
* State clearly somewhere that the maximum number of documents in a Lucene 
index is Integer.MAX_VALUE (2^31 - 1 = 2147483647).
* Throw an exception when an IndexWriter first exceeds this number rather than 
only on close.
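
To illustrate the second suggestion, here is a rough sketch of a fail-fast 
check on the document counter. DocCountGuard and its methods are hypothetical 
names invented for this example; this is not how IndexWriter is implemented, 
only one possible shape for such a check.

```java
// Sketch: refuse to count past Integer.MAX_VALUE instead of silently wrapping.
// DocCountGuard is a hypothetical helper, not a Lucene class.
final class DocCountGuard {
    static final int MAX_DOCS = Integer.MAX_VALUE;

    private int numDocs;

    DocCountGuard(int initialDocs) {
        if (initialDocs < 0) {
            throw new IllegalArgumentException("initialDocs must be >= 0");
        }
        this.numDocs = initialDocs;
    }

    /** Call once per added document; throws as soon as the limit would be exceeded. */
    void onDocumentAdded() {
        if (numDocs == MAX_DOCS) {
            throw new IllegalStateException(
                "too many documents: limit is " + MAX_DOCS);
        }
        numDocs++;
    }

    int numDocs() {
        return numDocs;
    }

    public static void main(String[] args) {
        // Start just below the limit so the demo does not loop two billion times.
        DocCountGuard guard = new DocCountGuard(MAX_DOCS - 1);
        guard.onDocumentAdded(); // reaches the limit, still fine
        try {
            guard.onDocumentAdded(); // would overflow, so it is rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a check like this the failure surfaces on the first document past the 
limit, rather than at close() after the remaining indexing work has been done.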
