On 16 Aug 2007, at 15:17, Alf Eaton wrote:
> - Is there a way to get a list of all the terms in the index (or
> maybe just the top n) ordered by descending frequency of usage? I
> imagine it's related to docFreq, but can't see how to get a list of
> terms in all documents.
Thanks to http://tinyurl.com/2gndww I worked out how to do this (to
get a list of terms in the 'title' field and their document
frequency, i.e. the number of documents containing each term) with
PyLucene:
    # reader is an already-open IndexReader
    terms = reader.terms()
    while terms.next():
        term = terms.term()
        # only report terms from the 'title' field
        if term.field() == 'title':
            print '%s - %d' % (term.text(), reader.docFreq(term))
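
That loop prints the terms in index order rather than by frequency. To get the top n by descending frequency, one option is to collect the (text, docFreq) pairs in the loop and then pick the largest with heapq from the standard library. A minimal sketch, assuming the pairs have already been collected into a list; top_terms and the sample data are made up for illustration and are not part of PyLucene:

```python
import heapq

def top_terms(term_freqs, n=10):
    """Return the n (text, doc_freq) pairs with the highest doc_freq."""
    # nlargest avoids sorting the whole term list when only the top n matter
    return heapq.nlargest(n, term_freqs, key=lambda pair: pair[1])

# Made-up pairs standing in for what the loop above would collect
sample = [('lucene', 12), ('index', 30), ('python', 7), ('search', 21)]
for text, freq in top_terms(sample, n=2):
    print('%s - %d' % (text, freq))
```

For a full listing rather than the top n, sorted(term_freqs, key=lambda pair: pair[1], reverse=True) would do the same job.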
alf.