Erick and Ahmet - thank you

Shay
On Mon, Jun 15, 2015 at 6:19 PM Ahmet Arslan <iori...@yahoo.com.invalid> wrote:

> Hi,
>
> If you are interested in the summed tf values of multiple terms,
> I suggest extending the SimilarityBase class to return the raw freq as the score.
>
> float score(BasicStats stats, float freq, float docLen){
>   return freq;
> }
>
> When you use this similarity and search with a three-term query, the scores
> will be the summed tf values. You can also extract additional info from the
> explain feature.
>
> Ahmet
>
>
> On Monday, June 15, 2015 5:50 PM, Shay Hummel <shay.hum...@gmail.com>
> wrote:
> Hi Ahmet
>
> Thank you for the reply.
> Can the term reflect a multi-word expression?
> For example:
> I want to find the term frequency / document frequency of "united states"
> (two terms) or "free speech zones" (three terms).
>
> Shay
>
>
> On Mon, Jun 15, 2015 at 4:55 PM Ahmet Arslan <iori...@yahoo.com.invalid>
> wrote:
>
> > Hi Hummel,
> >
> > regarding df,
> >
> > Term term = new Term(field, word);
> > TermStatistics termStatistics = searcher.termStatistics(term,
> >     TermContext.build(reader.getContext(), term));
> > System.out.println(query + "\t totalTermFreq \t " +
> >     termStatistics.totalTermFreq());
> > System.out.println(query + "\t docFreq \t " + termStatistics.docFreq());
> >
> > regarding tf,
> >
> > Term term = new Term(field, word);
> > Bits bits = MultiFields.getLiveDocs(reader);
> > PostingsEnum postingsEnum = MultiFields.getTermDocsEnum(reader, bits,
> >     field, term.bytes());
> >
> > if (postingsEnum == null) return;
> >
> > int max = 0;
> > while (postingsEnum.nextDoc() != PostingsEnum.NO_MORE_DOCS) {
> >   final int freq = postingsEnum.freq();
> >   int docID = postingsEnum.docID();
> > }
> >
> >
> > Ahmet
> >
> >
> > On Monday, June 15, 2015 9:12 AM, Shay Hummel <shay.hum...@gmail.com>
> > wrote:
> > Hi
> >
> > I was wondering, what is the easiest way to get the term frequency of a
> > term t in document d, namely tf(t,d)?
> > In the same spirit - what is the easiest way to get the document
> > frequency of a term in the collection, i.e. how many documents contain
> > the term t, namely df(t)?
> >
> > Regards,
> > Shay
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
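
For the multi-word case asked about above ("united states", "free speech zones"), note that summing the tf of the individual terms is not the same as counting occurrences of the phrase itself. The following is only a plain-Java sketch of what tf and df mean for a phrase; it does not use any Lucene API. The class name PhraseStats, its methods, the whitespace tokenizer, and the sample documents are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: plain Java, no Lucene classes. All names are hypothetical.
class PhraseStats {

    // Lower-case and split on whitespace; a stand-in for a real analyzer.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase().trim().split("\\s+"));
    }

    // tf(phrase, d): number of positions in the document where the
    // phrase tokens occur consecutively and in order.
    static int phraseTf(List<String> doc, List<String> phrase) {
        if (phrase.isEmpty()) return 0;
        int count = 0;
        for (int i = 0; i + phrase.size() <= doc.size(); i++) {
            if (doc.subList(i, i + phrase.size()).equals(phrase)) count++;
        }
        return count;
    }

    // df(phrase): number of documents that contain the phrase at least once.
    static int phraseDf(List<List<String>> docs, List<String> phrase) {
        int df = 0;
        for (List<String> doc : docs) {
            if (phraseTf(doc, phrase) > 0) df++;
        }
        return df;
    }

    public static void main(String[] args) {
        // Made-up sample collection.
        List<List<String>> docs = Arrays.asList(
            tokenize("the united states and the united states senate"),
            tokenize("free speech zones in the united states"),
            tokenize("united we stand and states differ"));
        List<String> phrase = tokenize("united states");

        for (int d = 0; d < docs.size(); d++) {
            System.out.println("tf(\"united states\", doc " + d + ") = "
                + phraseTf(docs.get(d), phrase));
        }
        System.out.println("df(\"united states\") = " + phraseDf(docs, phrase));
    }
}
```

In the third sample document both "united" and "states" occur, so a summed per-term tf would be 2 there, while the phrase tf is 0. That difference is why the phrase has to be matched by position rather than by adding up single-term counts.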