On Aug 3, 2007, at 9:47 AM, tierecke wrote:


Hi,

Can I know in how many documents a term appears (DF - Document Frequency)?
Does Lucene keep it? Can I retrieve it?


See the TermEnum class (IndexReader.terms()); TermEnum.docFreq() gives the document frequency of the current term.

Now - an even more advanced question:
Since I have a 77GB index, I cut it into 25 smaller indices of 3GB each, and I query them using MultiSearcher. Is it possible to get the DF of a term across the whole collection, or do I need to ask each index for the DF of that term (supposing that my first question is solvable)?


See the MultiReader and MultiReader.terms()

And the last question: Is there a way to know the total number of documents in a Lucene index? Is there a way to know the total number of documents in multiple indexes together?

IndexReader.numDocs()
MultiReader.numDocs()
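
To make the arithmetic behind the last two answers concrete, here is a minimal sketch in plain Java. It does not call Lucene at all: SubIndex, totalDocFreq and totalNumDocs are hypothetical names invented for illustration, and the numbers are made up. It assumes every document lives in exactly one sub-index, so the collection-wide DF is the sum of the per-index DFs and the collection-wide document count is the sum of the per-index counts. With real indices, the per-index values would come from IndexReader.docFreq(Term) and IndexReader.numDocs(), and a MultiReader over the sub-readers should report the combined figures directly.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CollectionStats {

    /** Hypothetical stand-in for one sub-index: a document count plus per-term DF. */
    static final class SubIndex {
        final int numDocs;
        final Map<String, Integer> docFreq = new HashMap<>();

        SubIndex(int numDocs) {
            this.numDocs = numDocs;
        }

        /** Records that `term` appears in `df` documents of this sub-index. */
        SubIndex with(String term, int df) {
            docFreq.put(term, df);
            return this;
        }

        /** DF of `term` in this sub-index; 0 if the term does not occur here. */
        int docFreq(String term) {
            return docFreq.getOrDefault(term, 0);
        }
    }

    /** Collection-wide DF: sum of the per-index DFs (documents are disjoint). */
    static int totalDocFreq(List<SubIndex> indices, String term) {
        int sum = 0;
        for (SubIndex idx : indices) {
            sum += idx.docFreq(term);
        }
        return sum;
    }

    /** Collection-wide document count: sum of the per-index counts. */
    static int totalNumDocs(List<SubIndex> indices) {
        int sum = 0;
        for (SubIndex idx : indices) {
            sum += idx.numDocs;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up figures for three sub-indices.
        List<SubIndex> indices = new ArrayList<>();
        indices.add(new SubIndex(1000).with("lucene", 40));
        indices.add(new SubIndex(1200).with("lucene", 25).with("index", 300));
        indices.add(new SubIndex(800).with("index", 90));

        System.out.println("DF(lucene) = " + totalDocFreq(indices, "lucene"));
        System.out.println("numDocs    = " + totalNumDocs(indices));
    }
}
```

The point of the sketch is only that nothing needs to be re-counted across indices: both totals are plain sums, which is why asking one combined reader is equivalent to asking each index in turn.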


I hope it's not too much. Suddenly I find myself dealing with stuff I never dealt with before.


Much better than doing the same stuff day after day for life, ain't it? :-)




--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ



