The # of documents that we are going to index could potentially be more than 2G. So I guess I have to split the index into multiple indices, each containing up to 2G documents. Any other suggestions?
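In case it helps, this is roughly the splitting approach I have in mind (only a sketch, assuming the Lucene 2.3-era API; the shard paths, the round-robin routing and the fetchBatch() helper are placeholders for illustration):

import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class ShardedIndexer {

    // Hypothetical shard locations; each shard stays well below Integer.MAX_VALUE docs.
    static final String[] SHARDS = { "/indexes/shard0", "/indexes/shard1", "/indexes/shard2" };

    public static void main(String[] args) throws IOException {
        IndexWriter[] writers = new IndexWriter[SHARDS.length];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new IndexWriter(SHARDS[i], new StandardAnalyzer(), true);
        }

        String[] texts = fetchBatch();   // placeholder for the real document source
        for (long id = 0; id < texts.length; id++) {
            Document doc = new Document();
            doc.add(new Field("id", Long.toString(id), Field.Store.YES, Field.Index.UN_TOKENIZED));
            doc.add(new Field("body", texts[(int) id], Field.Store.NO, Field.Index.TOKENIZED));

            // Simple round-robin routing keeps every shard under the per-index limit.
            writers[(int) (id % writers.length)].addDocument(doc);
        }

        for (int i = 0; i < writers.length; i++) {
            writers[i].close();
        }
    }

    static String[] fetchBatch() {
        return new String[0];            // stand-in for the real data feed
    }
}

Since, as Karl notes below, even a MultiReader over all the shards is still capped at Integer.MAX_VALUE document ids, I assume I would have to search each shard with its own IndexSearcher and merge the hits in application code, rather than wrapping them all in a single MultiSearcher.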
Thanks.

-----Original Message-----
From: Karl Wettin [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 08, 2008 11:00 AM
To: java-user@lucene.apache.org
Subject: Re: Limit of Lucene

Michael Siu skrev:

> What is the limit of Lucene: # of docs per index?

Integer.MAX_VALUE

Multiple indices joined in a single MultiWhatNot are still limited to that number.

> RangeFilter.bits(), for example, initializes a bitset to the size of
> maxDoc from the IndexReader. I wonder what happens if the # of docs is
> huge, say MaxInt (4G in 32-bit or 2^63 in 64-bit)?

ArrayIndexOutOfBoundsException?

It should not be that difficult to upgrade ints to longs, but it is a rather large job.

How many documents do you have? You might want to consider alternative ways to represent your corpus in the index so that it takes fewer documents.

karl
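PS: to make sure I read the RangeFilter point correctly, this is roughly the pattern I understand Karl to be describing (a sketch only; the matchAll() helper is hypothetical, but BitSet indices and Lucene doc ids are both ints, hence the ceiling):

import java.util.BitSet;
import org.apache.lucene.index.IndexReader;

public class MaxDocCeiling {
    // Filters in 2.3 allocate a BitSet sized by maxDoc(); since maxDoc() returns
    // an int, no reader (or MultiReader) can address more than Integer.MAX_VALUE docs.
    static BitSet matchAll(IndexReader reader) {
        BitSet bits = new BitSet(reader.maxDoc());
        bits.set(0, reader.maxDoc());   // doc ids run from 0 to maxDoc() - 1
        return bits;
    }
}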