Re: Stemmed terms/common terms

2007-08-16 Thread Alf Eaton
On 16 Aug 2007, at 15:17, Alf Eaton wrote:
- Is there a way to get a list of all the terms in the index (or maybe just the top n) ordered by descending frequency of usage? I imagine it's related to docFreq, but can't see how to get a list of terms in all documents.
Thanks to http://tinyu

Re: Stemmed terms/common terms

2007-08-16 Thread Alf Eaton
On 16 Aug 2007, at 17:06, Grant Ingersoll wrote:
On Aug 16, 2007, at 10:17 AM, Alf Eaton wrote:
A couple of questions about term frequencies and stemming:
- What's the best way to get the most common unstemmed form of a Porter-stemmed word from the index? For example given the stem 'wal

Re: Stemmed terms/common terms

2007-08-16 Thread Grant Ingersoll
On Aug 16, 2007, at 10:17 AM, Alf Eaton wrote:
A couple of questions about term frequencies and stemming:
- What's the best way to get the most common unstemmed form of a Porter-stemmed word from the index? For example given the stem 'walk', find that 'walking' is the most common full word

Stemmed terms/common terms

2007-08-16 Thread Alf Eaton
A couple of questions about term frequencies and stemming:
- What's the best way to get the most common unstemmed form of a Porter-stemmed word from the index? For example given the stem 'walk', find that 'walking' is the most common full word in the index.
- Is there a way to get a list of
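
For illustration only, the two lookups asked about above can be sketched in plain Python over an in-memory set of documents. This is not the Lucene API and not an answer taken from the thread: toy_stem is a hypothetical stand-in for a Porter stemmer, and doc_frequencies, top_terms and most_common_form are made-up helper names. The sketch also assumes the unstemmed words are still available to count, which a field holding only stems would not provide.

```python
from collections import Counter


def toy_stem(word):
    """Hypothetical stand-in for a Porter stemmer: strips a few common suffixes."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters of stem remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def doc_frequencies(docs):
    """Count, for each unstemmed term, how many documents contain it."""
    df = Counter()
    for text in docs:
        # set() ensures a term is counted once per document.
        df.update(set(text.lower().split()))
    return df


def top_terms(df, n):
    """Return the n (term, count) pairs with the highest document frequency.

    Ties are broken alphabetically so the result is deterministic.
    """
    return sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def most_common_form(df, stem):
    """Return the most frequent unstemmed term that reduces to `stem`, or None."""
    candidates = [(term, count) for term, count in df.items()
                  if toy_stem(term) == stem]
    if not candidates:
        return None
    # Highest count first, then alphabetical order for ties.
    return min(candidates, key=lambda tc: (-tc[1], tc[0]))[0]


# Made-up sample documents, whitespace-tokenised for simplicity.
docs = [
    "walking the dog",
    "she walked home",
    "walking is healthy",
    "they walk daily",
]
df = doc_frequencies(docs)
print(top_terms(df, 3))
print(most_common_form(df, "walk"))
```

With these sample documents, 'walking' occurs in two documents and every other word in one, so it is both the top term and the most common full form of the stem 'walk'.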