Hi Hummel,
Regarding df (document frequency), you can read it from the term statistics:
Term term = new Term(field, word);
TermStatistics termStatistics = searcher.termStatistics(term,
    TermContext.build(reader.getContext(), term));
System.out.println(query + "\t totalTermFreq \t " + termStatistics.totalTermFreq());
System.out.println(query + "\t docFreq \t " + termStatistics.docFreq());
Regarding tf (the frequency of the term in each document), you can walk the postings:
Term term = new Term(field, word);
// Live docs, so that deleted documents are skipped.
Bits bits = MultiFields.getLiveDocs(reader);
PostingsEnum postingsEnum = MultiFields.getTermDocsEnum(reader, bits, field,
    term.bytes());
// null means the field or the term does not exist in the index.
if (postingsEnum == null) return;
while (postingsEnum.nextDoc() != PostingsEnum.NO_MORE_DOCS) {
    // freq is tf(t, d) for the document identified by docID.
    final int freq = postingsEnum.freq();
    int docID = postingsEnum.docID();
}
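In case the definitions themselves are useful, here is a minimal plain-Java sketch of what the three numbers above mean. It is not Lucene code: the class name TfDfSketch, its methods, and the toy documents are made up for illustration, and the documents are assumed to be already tokenized.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from Lucene.
public class TfDfSketch {

    // tf(t, d): how many times term t occurs in the tokens of document d.
    public static int tf(String term, List<String> docTokens) {
        int count = 0;
        for (String token : docTokens) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    // df(t): how many documents contain term t at least once.
    public static int df(String term, List<List<String>> docs) {
        int count = 0;
        for (List<String> doc : docs) {
            if (doc.contains(term)) {
                count++;
            }
        }
        return count;
    }

    // Total term frequency: the sum of tf(t, d) over all documents d.
    public static long totalTermFreq(String term, List<List<String>> docs) {
        long total = 0;
        for (List<String> doc : docs) {
            total += tf(term, doc);
        }
        return total;
    }

    public static void main(String[] args) {
        // Toy collection of three pre-tokenized documents (assumed data).
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("lucene", "is", "a", "search", "library"),
            Arrays.asList("lucene", "indexes", "lucene", "documents"),
            Arrays.asList("hello", "world"));

        System.out.println("tf in doc 1   = " + tf("lucene", docs.get(1)));     // 2
        System.out.println("df            = " + df("lucene", docs));            // 2
        System.out.println("totalTermFreq = " + totalTermFreq("lucene", docs)); // 3
    }
}
```

With a real index you would not scan documents like this; the postings already store these counts, which is why the snippets above only read them.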
Ahmet
On Monday, June 15, 2015 9:12 AM, Shay Hummel wrote:
Hi
I was wondering, what is the easiest way to get the term frequency of a
term t in document d, namely tf(t,d) ?
In the same spirit - what is the easiest way to get the document
frequency of a term in the collection, i.e. how many documents contain the
term t, namely df(t) ?
Regards,
Shay