On 16 Aug 2007, at 15:17, Alf Eaton wrote:
- Is there a way to get a list of all the terms in the index (or maybe just the top n) ordered by descending frequency of usage? I imagine it's related to docFreq, but can't see how to get a list of terms in all documents.
Thanks to http://tinyurl.com/2gndww I worked out how to do this (to get a list of terms and their frequency) with PyLucene:
    terms = reader.terms()
    while terms.next():
        term = terms.term()
        if term.field() == 'title':
            print '%s - %d' % (term.text(), reader.docFreq(term))

alf.
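
P.S. The loop lists terms in the order the index returns them, not by descending frequency as the original question asked. A minimal sketch of the ordering step follows, assuming the (text, docFreq) pairs have first been collected into a plain Python list. The `top_terms` helper and the sample data are hypothetical and only illustrate the idea; they are not part of PyLucene.

```python
import heapq

def top_terms(term_freqs, n=10):
    """Return the n (text, doc_freq) pairs with the highest doc_freq."""
    # nlargest avoids sorting the whole list when only the top n are needed.
    return heapq.nlargest(n, term_freqs, key=lambda pair: pair[1])

# Hypothetical sample pairs, standing in for values gathered from
# term.text() and reader.docFreq(term) while walking the term enumeration.
sample = [('lucene', 12), ('index', 30), ('python', 7), ('search', 21)]

for text, freq in top_terms(sample, n=3):
    print('%s - %d' % (text, freq))
```

For a large index this keeps only n entries in the heap at a time if the pairs are fed from a generator rather than a list.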