get a list of terms sorted by total term frequency
hi, is there a simple way to get a list of all terms that occur in a field, sorted by their total term frequency within that field? TermsComponent (http://wiki.apache.org/solr/TermsComponent) provides fast field faceting over the whole index, but the counts it returns are document frequencies: the number of documents each term occurs in, for a given field or set of fields. instead of document counts, i want total term frequency counts, i.e. the total number of occurrences of each term across all documents. the ttf function (http://wiki.apache.org/solr/FunctionQuery#totaltermfreq) provides this, but only if you already know which term to pass to the function.

edward
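To make the distinction in the question concrete, here is a minimal sketch in plain Python, not Solr or Lucene code. It assumes a toy three-document corpus and whitespace tokenization, both invented for illustration, and shows how document frequency (what TermsComponent counts) can rank terms differently from total term frequency (what the ttf function returns for one term).

```python
from collections import Counter

def term_stats(docs):
    """Return (doc_freq, total_term_freq) counters for a list of texts.

    Assumption: naive lowercase + whitespace tokenization, standing in
    for whatever analyzer the real field uses.
    """
    doc_freq = Counter()
    total_term_freq = Counter()
    for text in docs:
        tokens = text.lower().split()
        total_term_freq.update(tokens)   # every occurrence counts
        doc_freq.update(set(tokens))     # each document counts once per term
    return doc_freq, total_term_freq

# Hypothetical corpus: "spam" is frequent overall but sits in one document.
docs = ["spam spam spam spam egg", "egg", "egg ham"]
df, ttf = term_stats(docs)

# Sort descending by count, breaking ties alphabetically.
by_df = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
by_ttf = sorted(ttf.items(), key=lambda kv: (-kv[1], kv[0]))

print("by document frequency:  ", by_df)
print("by total term frequency:", by_ttf)
```

Here "egg" leads the document-frequency ranking while "spam" leads the total-term-frequency ranking, which is the gap the question is about.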
Re: get a list of terms sorted by total term frequency
Lucene's misc module has a HighFreqTerms tool.

Mike McCandless
http://blog.mikemccandless.com

On Wed, Nov 7, 2012 at 1:15 PM, Edward Garrett heacu.mcint...@gmail.com wrote:
> hi, is there a simple way to get a list of all terms that occur in a field sorted by their total term frequency within that field?
Re: get a list of terms sorted by total term frequency
i see... using the -t flag.

it would be cool if TermsComponent had an option to sort by total term frequency, something like terms.sort={count|index|ttf}. surely that's a common enough use case.

On Wed, Nov 7, 2012 at 6:17 PM, Michael McCandless luc...@mikemccandless.com wrote:
> Lucene's misc module has HighFreqTerms tool.

--
edge
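The proposed sort modes can be sketched in plain Python as follows. This is only an illustration of the intended semantics, not Solr code: `ttf` is the hypothetical mode suggested in this thread and is not an existing `terms.sort` value, and the `(term, doc_freq, total_term_freq)` tuples and the `sort_terms` helper are made up for the example.

```python
def sort_terms(stats, mode="count"):
    """Order (term, doc_freq, total_term_freq) tuples by a sort mode.

    "count" and "index" mirror the existing terms.sort values;
    "ttf" is the hypothetical mode proposed in this thread.
    """
    if mode == "count":    # highest document frequency first
        return sorted(stats, key=lambda s: (-s[1], s[0]))
    if mode == "index":    # term order
        return sorted(stats, key=lambda s: s[0])
    if mode == "ttf":      # highest total term frequency first (proposed)
        return sorted(stats, key=lambda s: (-s[2], s[0]))
    raise ValueError("unknown sort mode: %r" % mode)

# Invented statistics: "spam" occurs 4 times but in a single document.
stats = [("ham", 1, 1), ("spam", 1, 4), ("egg", 3, 3)]

for mode in ("count", "index", "ttf"):
    print(mode, [term for term, _, _ in sort_terms(stats, mode)])
```

Only the `ttf` mode puts "spam" first, since it is the only mode that looks at occurrences rather than documents.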