Databases normally support at least a long primary key. Ask the Twitter application, for example, which grows by more than 4 petabytes every year :) Maybe they use storage devices bigger than a PC's storage :) However, if you offer the possibility of using shards ... that is an option anyway :) For this reason my suggestion was a different one ... it was not about the size of the repository, but about the size of the search result :)
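
To illustrate the difference between repository size and result size, here is a minimal sketch in plain Java. It is not Lucene code: ShardedIds, globalId, shardOf, localDocIdOf and allIds are hypothetical names invented for this example. It assumes each shard keeps its own int doc id, packs the shard number and the doc id into one long, and hands the ids out through an Iterator instead of filling an array.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Plain-Java sketch; none of these names are Lucene APIs. */
public class ShardedIds {

    /** Packs a shard number and a per-shard int doc id into one 64-bit id. */
    static long globalId(int shard, int localDocId) {
        if (shard < 0 || localDocId < 0) {
            throw new IllegalArgumentException("shard and doc id must be non-negative");
        }
        return ((long) shard << 32) | localDocId;
    }

    /** Upper 32 bits: the shard number. */
    static int shardOf(long globalId) {
        return (int) (globalId >>> 32);
    }

    /** Lower 32 bits: the doc id inside that shard. */
    static int localDocIdOf(long globalId) {
        return (int) globalId;
    }

    /** Lazily yields one global id per document; no array is allocated. */
    static Iterator<Long> allIds(int shards, int docsPerShard) {
        return new Iterator<Long>() {
            private int shard = 0;
            private int doc = 0;

            @Override
            public boolean hasNext() {
                return docsPerShard > 0 && shard < shards;
            }

            @Override
            public Long next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                long id = globalId(shard, doc);
                // Move to the next shard once this one is exhausted.
                if (++doc == docsPerShard) {
                    doc = 0;
                    shard++;
                }
                return id;
            }
        };
    }

    public static void main(String[] args) {
        long id = globalId(3, Integer.MAX_VALUE);
        System.out.println(id + " -> shard " + shardOf(id) + ", doc " + localDocIdOf(id));

        Iterator<Long> it = allIds(2, 2);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

With this layout the number of addressable documents is no longer bound by 2^31, while every shard stays within the int range. Real result merging across shards would still go through TopDocs.merge, as Adrien suggests in the thread.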
"A suggestion for a possible future change is to use an Iterator rather than
a Java array. An Iterator is a more scalable ADT that does not consume memory
for the returned documents."

It is just a suggestion, anyway, for my beloved Lucene :)

2016-08-18 17:43 GMT+02:00 Greg Bowyer <gbow...@fastmail.co.uk>:

> What are you trying to index that has more than 3 billion documents per
> shard / index and cannot be split as Adrien suggests?
>
> On Thu, Aug 18, 2016, at 07:35 AM, Cristian Lorenzetto wrote:
> > Maybe Lucene has a maximum size of 2^31 because result sets are Java
> > arrays, whose length is an int.
> > A suggestion for a possible future change is to use an Iterator rather
> > than a Java array. An Iterator is a more scalable ADT that does not
> > consume memory for the returned documents.
> >
> > 2016-08-18 16:03 GMT+02:00 Glen Newton <glen.new...@gmail.com>:
> >
> > > Or maybe it is time Lucene re-examined this limit.
> > >
> > > There are use cases out there where >2^31 does make sense in a single
> > > index (huge number of tiny docs).
> > >
> > > Also, I think the underlying hardware and the JDK have advanced to make
> > > this more defendable.
> > >
> > > Constructively,
> > > Glen
> > >
> > > On Thu, Aug 18, 2016 at 9:55 AM, Adrien Grand <jpou...@gmail.com> wrote:
> > >
> > > > No, IndexWriter enforces that the number of documents cannot go over
> > > > IndexWriter.MAX_DOCS (which is a bit less than 2^31) and
> > > > BaseCompositeReader computes the number of documents in a long
> > > > variable and ensures it is less than 2^31, so you cannot have indexes
> > > > that contain more than 2^31 documents.
> > > >
> > > > Larger collections should be written to multiple shards and use
> > > > TopDocs.merge to merge results.
> > > >
> > > > On Thu, Aug 18, 2016 at 15:38, Cristian Lorenzetto
> > > > <cristian.lorenze...@gmail.com> wrote:
> > > >
> > > > > docid is a signed int32, so it is not so big, but docid really seems
> > > > > to be not an unmodifiable primary key but a temporary id for the view
> > > > > related to a specific search.
> > > > >
> > > > > So a repository can contain more than 2^31 documents.
> > > > >
> > > > > Is my deduction correct? Is there a maximum size for a Lucene index?
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org