The # of documents that we are going to index could potentially be more than
2G. So I guess I have to split the index into multiple indexes, each
containing up to 2G documents. Any other suggestion?
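
Something like this partitioning is what I have in mind (just a sketch
against the 2.x API; the paths, field name and partition count are
hypothetical):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class PartitionedIndexer {
    // Spread documents over a few sub-indexes so that no single index
    // ever approaches the Integer.MAX_VALUE document limit.
    private static final int NUM_PARTITIONS = 4;

    public static void main(String[] args) throws Exception {
        IndexWriter[] writers = new IndexWriter[NUM_PARTITIONS];
        for (int i = 0; i < NUM_PARTITIONS; i++) {
            writers[i] = new IndexWriter("/data/index-part" + i,
                                         new StandardAnalyzer(), true);
        }

        // Route each document to a partition, e.g. by id modulo the
        // partition count.
        for (long id = 0; id < 1000; id++) {
            Document doc = new Document();
            doc.add(new Field("id", Long.toString(id),
                              Field.Store.YES, Field.Index.UN_TOKENIZED));
            writers[(int) (id % NUM_PARTITIONS)].addDocument(doc);
        }

        for (int i = 0; i < NUM_PARTITIONS; i++) {
            writers[i].close();
        }
    }
}

Each sub-index would then be searched separately and the hits merged in
the application.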

Thanks.

-----Original Message-----
From: Karl Wettin [mailto:[EMAIL PROTECTED] 
Sent: Thursday, May 08, 2008 11:00 AM
To: java-user@lucene.apache.org
Subject: Re: Limit of Lucene

Michael Siu wrote:
> What is the limit of Lucene:  # of docs per index?

Integer.MAX_VALUE

Multiple indices joined in a single MultiWhatNot are still limited to 
that number.
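
For example (a sketch only, 2.x API, the paths are made up): even if you 
open the parts through one MultiReader, the combined view still uses int 
doc ids.

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.search.IndexSearcher;

public class CombinedSearch {
    public static void main(String[] args) throws Exception {
        // Open each sub-index and present them as one logical index.
        IndexReader[] parts = new IndexReader[] {
            IndexReader.open("/data/index-part0"),
            IndexReader.open("/data/index-part1")
        };
        IndexReader combined = new MultiReader(parts);

        // maxDoc() returns an int, so the combined view is still capped
        // at Integer.MAX_VALUE documents.
        System.out.println("docs in combined view: " + combined.maxDoc());

        IndexSearcher searcher = new IndexSearcher(combined);
        // ... run queries against 'searcher' as usual ...

        searcher.close();
        combined.close();
    }
}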


> RangeFilter.bits(), for example, initializes a BitSet to the size of
> maxDoc from the IndexReader. I wonder what happens if the # of docs is
> huge, say MaxInt (about 2G for a 32-bit int, or 2^63 for a 64-bit long)?

ArrayIndexOutOfBoundsException ?

It should not be that difficult to upgrade the ints to longs, but it is a 
rather large job.
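
A toy illustration of the int problem (plain Java, not Lucene code): one 
step past Integer.MAX_VALUE wraps around to a negative value, and a 
java.util.BitSet like the one RangeFilter.bits() allocates rejects a 
negative index; array-backed code would throw 
ArrayIndexOutOfBoundsException instead.

import java.util.BitSet;

public class DocIdOverflow {
    public static void main(String[] args) {
        // Doc ids are ints; incrementing past Integer.MAX_VALUE wraps
        // around to a large negative number.
        int docId = Integer.MAX_VALUE;
        docId++;
        System.out.println(docId);   // prints -2147483648

        // A BitSet cannot address it: set() throws
        // IndexOutOfBoundsException for a negative index.
        BitSet bits = new BitSet(1024);
        bits.set(docId);
    }
}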

How many documents do you have? You might want to consider alternative 
ways to represent your corpus in the index so that it takes fewer documents.


           karl
