get a list of terms sorted by total term frequency

2012-11-07 Thread Edward Garrett
Hi,

Is there a simple way to get a list of all terms that occur in a field,
sorted by their total term frequency within that field?

TermsComponent (http://wiki.apache.org/solr/TermsComponent) provides
fast field faceting over the whole index, but the counts it returns are
document frequencies: the number of documents that each term occurs in
(given a field or set of fields). Instead of document counts, I want
total term frequency counts. The ttf function
(http://wiki.apache.org/solr/FunctionQuery#totaltermfreq) provides
this, but only if you already know which term to pass to the function.
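
To make the distinction concrete, here is a minimal sketch in plain Python. It does not use Solr or Lucene at all; the toy corpus, the whitespace tokenizer, and the helper name term_stats are assumptions made up for illustration. It only shows how document frequency and total term frequency can rank the same terms differently.

```python
from collections import Counter

# Hypothetical toy corpus: each string stands in for one document's field value.
docs = [
    "solr solr lucene",
    "lucene index",
    "solr index index index",
]


def term_stats(documents):
    """Return (doc_freq, total_term_freq) counters for whitespace-split terms."""
    doc_freq = Counter()
    total_term_freq = Counter()
    for text in documents:
        tokens = text.split()
        # Every occurrence counts toward total term frequency.
        total_term_freq.update(tokens)
        # Each term counts at most once per document toward document frequency.
        doc_freq.update(set(tokens))
    return doc_freq, total_term_freq


doc_freq, total_term_freq = term_stats(docs)

# Sort by total term frequency, highest first; break ties alphabetically.
by_ttf = sorted(total_term_freq.items(), key=lambda kv: (-kv[1], kv[0]))

for term, ttf in by_ttf:
    print(f"{term}\tdocFreq={doc_freq[term]}\tttf={ttf}")
```

In this corpus every term occurs in exactly two documents, so a sort by document count cannot separate them, while a sort by total term frequency can.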

edward


Re: get a list of terms sorted by total term frequency

2012-11-07 Thread Michael McCandless
Lucene's misc module has a HighFreqTerms tool.

Mike McCandless

http://blog.mikemccandless.com


Re: get a list of terms sorted by total term frequency

2012-11-07 Thread Edward Garrett
I see... using the -t flag.

It would be cool if TermsComponent had an option to sort by total term
frequency, something like

terms.sort={count|index|ttf}

Surely that's a common enough use case.


-- 
edge