numDocs method of IndexReader
Hello Everyone,
I need to be able to iterate through the entire set of
documents within the index to perform some auditing. I
originally tried the following code snippet:
int ndoc = idxReader.numDocs();
for (int i = 0; i < ndoc; i++) {
    Document doc = idxReader.document(i);
    // ...
}
This works in my initial project here, where
the number of Documents is small, but I am concerned about
what happens when the indexes grow larger. Since this method
returns an int, does this mean that the number of documents
I can have is limited by the size of an integer?
Also, I wanted to search the archives of this list to
see if anyone had previously answered this question,
but I could not find a site that archives this list.
Can anyone point me to such a site?
Many thanks in advance,
Tom C.
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: numDocs method of IndexReader
Hi Otis,

Thanks for your answer on the integer issue. I was not sure whether the index itself was limited or just the numDocs method call. I guess it does not really matter which it is, and I don't think my index will ever get that large!

I do have a couple of questions from your response:

> Iterating the index like that is ok, but each call
> to reader.document(int) pulls the entire Document
> off the disk, which can get expensive.

Thanks for the clarification. Is there a 'better', i.e. less expensive, way to extract values from each document than to iterate through them in that fashion? I would appreciate any suggestions that you can offer.

> The link to list archives should be on
> lucene.apache.org.

I checked the lucene.apache.org site. The "Lucene-user" link (on the left under the Resources heading), which points to the eyebrowse apache site, is broken. The Archive link for lucene-user on the "mailing lists" page takes you to the mbox archive, which is not searchable. Hopefully these problems will be corrected soon. In the meantime, does anyone know of any other sites that archive the lucene-user list and provide a search capability?

Thanks,
Tom C.
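
A side note on the loop from the original question, offered as a rough sketch and not a definitive answer: as far as I know, in the IndexReader API of that era numDocs() counts only live documents, while document numbers run from 0 to maxDoc() - 1 and deleted slots are reported by isDeleted(int). A loop bounded by numDocs() could therefore miss documents (or hit a deleted one) once the index has deletions. The DocSource interface and the inMemory helper below are hypothetical stand-ins so the example runs without Lucene on the classpath; only the three method names mirror the reader calls.

```java
import java.util.ArrayList;
import java.util.List;

public class AuditSketch {

    /** Hypothetical stand-in for the three reader calls the loop relies on. */
    public interface DocSource {
        int maxDoc();              // one greater than the largest document number
        boolean isDeleted(int n);  // true if slot n holds a deleted document
        String document(int n);    // stored content of document n
    }

    /** In-memory DocSource for illustration; a null entry marks a deleted slot. */
    public static DocSource inMemory(final String... slots) {
        return new DocSource() {
            public int maxDoc() { return slots.length; }
            public boolean isDeleted(int n) { return slots[n] == null; }
            public String document(int n) {
                if (slots[n] == null) {
                    throw new IllegalArgumentException("deleted document: " + n);
                }
                return slots[n];
            }
        };
    }

    /** Visits every slot up to maxDoc() and collects only the live documents. */
    public static List<String> liveDocuments(DocSource reader) {
        List<String> out = new ArrayList<String>();
        int max = reader.maxDoc();
        for (int i = 0; i < max; i++) {
            if (reader.isDeleted(i)) {
                continue; // skip deleted slots instead of loading them
            }
            out.add(reader.document(i));
        }
        return out;
    }

    public static void main(String[] args) {
        // Slot 1 is deleted, so two live documents remain out of three slots.
        DocSource reader = inMemory("doc0", null, "doc2");
        System.out.println(liveDocuments(reader)); // prints [doc0, doc2]
    }
}
```

With a real reader the same shape would apply: bound the loop by idxReader.maxDoc() and check idxReader.isDeleted(i) before calling idxReader.document(i). This does not address the cost of loading each full Document, only the correctness of the iteration.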
