Hi Oli,

Thanks for your reply.

I thought about this, but isn't that just a crude, inefficient reimplementation of what Lucene already has in CompositeReader? It would involve writing my own CompositeCompositeReader that forwards requests to the underlying CompositeReaders...

Is there a better way?

Thanks,
Artem.

On Fri, Mar 21, 2014 at 6:33 PM, Oliver Christ <ochr...@ebsco.com> wrote:

> Can you split your corpus across multiple Lucene instances?
>
> Cheers, Oli
>
> -----Original Message-----
> From: Artem Gayardo-Matrosov [mailto:ar...@gayardo.com]
> Sent: Friday, March 21, 2014 12:29 PM
> To: java-user@lucene.apache.org
> Subject: maxDoc/numDocs int fields
>
> Hi all,
>
> I am using Lucene to index a large corpus of text, with every word being a
> separate document (this is something I cannot change), and I am hitting the
> limitation that CompositeReader only supports Integer.MAX_VALUE
> documents.
>
> Is there any way to work around this limitation? For the moment I have
> implemented my own DirectoryReader and BaseCompositeReader to at least make
> them support docIDs from Integer.MIN_VALUE to -1 (doubling the number of
> supported documents). The problem is that all the APIs are restricted to
> the int type, and after the docID value wraps back to 0, I have no way
> to restore the original docID.
>
> --
> Thanks in advance,
> Artem.

--
Artem.
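
To make the "split across multiple instances" idea from this thread concrete, here is a minimal sketch of the bookkeeping it would need: a long "global" doc ID is translated to a pair (index number, local int docID) and back. The class name LongDocIdMapper and all of its methods are hypothetical and not part of Lucene; the only assumption is that each separate index reports its own int maxDoc. It does not search anything, it only shows that the original ID stays recoverable once it is kept as a long outside the int-based APIs.

```java
import java.util.Arrays;

/**
 * Hypothetical helper (not a Lucene class): maps a long "global" doc ID
 * onto (index number, local int docID) across several separate indexes.
 */
final class LongDocIdMapper {
    // starts[i] is the global ID of the first document of index i;
    // starts[n] is the total number of documents across all indexes.
    private final long[] starts;

    LongDocIdMapper(int[] maxDocs) {
        starts = new long[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            if (maxDocs[i] < 0) {
                throw new IllegalArgumentException("negative maxDoc: " + maxDocs[i]);
            }
            starts[i + 1] = starts[i] + maxDocs[i]; // long arithmetic, no int overflow
        }
    }

    long totalDocs() {
        return starts[starts.length - 1];
    }

    /** Returns which index holds the given global doc ID. */
    int indexOf(long globalId) {
        if (globalId < 0 || globalId >= totalDocs()) {
            throw new IllegalArgumentException("global doc ID out of range: " + globalId);
        }
        int pos = Arrays.binarySearch(starts, globalId);
        if (pos < 0) {
            pos = -pos - 2; // insertion point minus one: last start below globalId
        }
        // Empty indexes share a start value; step past them to the one that owns the doc.
        while (starts[pos + 1] == globalId) {
            pos++;
        }
        return pos;
    }

    /** Returns the int docID inside the index chosen by indexOf(globalId). */
    int localOf(long globalId) {
        return (int) (globalId - starts[indexOf(globalId)]);
    }

    /** Inverse mapping: rebuilds the global ID from (index, local docID). */
    long toGlobal(int index, int localDocId) {
        if (index < 0 || index >= starts.length - 1) {
            throw new IllegalArgumentException("no such index: " + index);
        }
        long size = starts[index + 1] - starts[index];
        if (localDocId < 0 || localDocId >= size) {
            throw new IllegalArgumentException("local doc ID out of range: " + localDocId);
        }
        return starts[index] + localDocId;
    }
}
```

With three indexes of sizes Integer.MAX_VALUE, Integer.MAX_VALUE and 10, global ID 2147483647 would land in the second index as local docID 0, and toGlobal(1, 0) would give the same ID back. Whether this counts as "a better way" is open: searching, scoring and merging results across the indexes would still have to be written on top of it.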