Maybe Lucene has a maximum size of 2^31 because result sets are Java arrays, whose length is an int. A suggestion for a possible future change is to return an Iterator instead of a Java array. An Iterator is a more scalable abstraction: it does not need to hold all the returned documents in memory at once.
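To illustrate the suggestion, here is a minimal sketch in plain Java. It is not Lucene code, and LongRangeIterator and IteratorSketch are hypothetical names made up for this example. It only shows that an Iterator can lazily walk a range of ids that crosses Integer.MAX_VALUE (2^31 - 1), which a single Java array cannot hold because its length is an int.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical example class, not part of Lucene.
// Lazily yields the ids in [start, end) as long values, one at a time.
final class LongRangeIterator implements Iterator<Long> {
    private final long end; // exclusive; may be larger than Integer.MAX_VALUE
    private long current;

    LongRangeIterator(long start, long end) {
        this.current = start;
        this.end = end;
    }

    @Override
    public boolean hasNext() {
        return current < end;
    }

    @Override
    public Long next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current++;
    }
}

final class IteratorSketch {
    public static void main(String[] args) {
        // Start just below the int limit and continue past it.
        long start = (long) Integer.MAX_VALUE - 1;
        long end = (long) Integer.MAX_VALUE + 2;
        Iterator<Long> ids = new LongRangeIterator(start, end);
        while (ids.hasNext()) {
            System.out.println(ids.next());
        }
    }
}
```

Only one id is held at a time, so memory use does not grow with the number of results. Whether Lucene could expose results this way is a separate question, since docids themselves are ints.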
2016-08-18 16:03 GMT+02:00 Glen Newton <glen.new...@gmail.com>:

> Or maybe it is time Lucene re-examined this limit.
>
> There are use cases out there where >2^31 does make sense in a single index
> (huge number of tiny docs).
>
> Also, I think the underlying hardware and the JDK have advanced to make
> this more defendable.
>
> Constructively,
> Glen
>
>
> On Thu, Aug 18, 2016 at 9:55 AM, Adrien Grand <jpou...@gmail.com> wrote:
>
> > No, IndexWriter enforces that the number of documents cannot go over
> > IndexWriter.MAX_DOCS (which is a bit less than 2^31) and
> > BaseCompositeReader computes the number of documents in a long variable
> > and ensures it is less than 2^31, so you cannot have indexes that contain
> > more than 2^31 documents.
> >
> > Larger collections should be written to multiple shards and use
> > TopDocs.merge to merge results.
> >
> > On Thu, Aug 18, 2016 at 15:38, Cristian Lorenzetto <
> > cristian.lorenze...@gmail.com> wrote:
> >
> > > docid is a signed int32 so it is not so big, but really docid seems
> > > not to be an unmodifiable primary key but a temporary id for the view
> > > related to a specific search.
> > >
> > > So a repository can contain more than 2^31 documents.
> > >
> > > Is my deduction correct? Is there a maximum size for a Lucene index?
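
As a rough sketch of the sharding approach Adrien describes in the quoted thread: each shard stays below the per-index limit and returns its own top hits, and the per-shard lists are then merged into one global ranking. The code below is a plain-Java stand-in for that idea, not Lucene's TopDocs.merge. ShardHit, ShardMergeSketch, mergeTopN and globalId are hypothetical names, and the tie-breaking order (shard, then doc) is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical value class: one hit from one shard (not a Lucene type).
final class ShardHit {
    final int shard;   // which shard the hit came from
    final int doc;     // shard-local docid, fits in an int
    final float score;

    ShardHit(int shard, int doc, float score) {
        this.shard = shard;
        this.doc = doc;
        this.score = score;
    }

    // The (shard, doc) pair packed into a long: a collection-wide id
    // that is not limited to 2^31 values.
    long globalId() {
        return ((long) shard << 32) | (doc & 0xFFFFFFFFL);
    }
}

final class ShardMergeSketch {
    // Merges per-shard hit lists into the topN best hits overall.
    // Order: highest score first, ties broken by shard, then by doc.
    static List<ShardHit> mergeTopN(int topN, List<List<ShardHit>> perShard) {
        List<ShardHit> all = new ArrayList<>();
        for (List<ShardHit> hits : perShard) {
            all.addAll(hits);
        }
        all.sort(Comparator.comparingDouble((ShardHit h) -> -h.score)
                .thenComparingInt(h -> h.shard)
                .thenComparingInt(h -> h.doc));
        return new ArrayList<>(all.subList(0, Math.min(topN, all.size())));
    }

    public static void main(String[] args) {
        List<ShardHit> shard0 = Arrays.asList(
                new ShardHit(0, 5, 0.9f), new ShardHit(0, 7, 0.4f));
        List<ShardHit> shard1 = Arrays.asList(
                new ShardHit(1, 5, 0.8f), new ShardHit(1, 2, 0.1f));
        for (ShardHit h : mergeTopN(3, Arrays.asList(shard0, shard1))) {
            System.out.println(h.shard + ":" + h.doc + " score=" + h.score
                    + " globalId=" + h.globalId());
        }
    }
}
```

The same local docid (5 here) can appear in several shards, so a hit is only identified by its shard together with its docid. That is why the whole collection can exceed 2^31 documents even though each index cannot.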