> Given a term say "apache", I want to look up the lucene index > programmatically to find out its frequency in the corpus.
I think you are asking collection frequency of a term. Term Frequency is defined between a document and a term which is printed in the loop in the following code. And at the end there is collection freq. which sum of tfs. String path = "E:\\ThesaurusSolrHome\\data\\index"; String field = "contents"; Term term = new Term(field, "apache"); IndexReader indexReader = IndexReader.open(path); TermDocs termDocs = indexReader.termDocs(term); int collectionFreq = 0; while (termDocs.next()) { System.out.print("Document " + termDocs.doc() + " contains the term " + term.text() + " "); System.out.println(termDocs.freq() + " times"); collectionFreq += termDocs.freq(); } indexReader.close(); System.out.println("Collection frequency of " + term.text() + " = " + collectionFreq); --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org