Maybe high-frequency terms that are not evenly distributed throughout
the corpus would be a better definition: discriminative terms. I'm
sure there is something in the machine learning literature about
unsupervised clustering that would help here. But I don't know what it
is :)
-Mike
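
One way to make "high frequency but not evenly distributed" concrete is a rough sketch like the one below. It is an assumption, not something proposed in this thread: it scores each term as its total frequency multiplied by ln(N / df), where N is the number of documents and df is the number of documents containing the term. The class name DiscriminativeTerms and the method score are hypothetical, and the code works on plain token lists rather than calling any Lucene API, so the terms would have to be read out of the index separately.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: a tf-idf style heuristic
// for "frequent but unevenly distributed" terms.
public class DiscriminativeTerms {

    /** Scores each term as totalFrequency * ln(numDocs / docFrequency). */
    public static Map<String, Double> score(List<List<String>> docs) {
        Map<String, Integer> totalFreq = new HashMap<>();
        Map<String, Integer> docFreq = new HashMap<>();
        for (List<String> doc : docs) {
            Set<String> seen = new HashSet<>();
            for (String term : doc) {
                // Count every occurrence across the corpus.
                totalFreq.merge(term, 1, Integer::sum);
                // Count each document at most once per term.
                if (seen.add(term)) {
                    docFreq.merge(term, 1, Integer::sum);
                }
            }
        }
        Map<String, Double> scores = new HashMap<>();
        int numDocs = docs.size();
        for (Map.Entry<String, Integer> e : totalFreq.entrySet()) {
            // A term present in every document gets ln(1) = 0.
            double spread = Math.log((double) numDocs / docFreq.get(e.getKey()));
            scores.put(e.getKey(), e.getValue() * spread);
        }
        return scores;
    }

    public static void main(String[] args) {
        // Made-up sample documents, for illustration only.
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("the", "lucene", "index", "lucene"),
            Arrays.asList("the", "query", "parser"),
            Arrays.asList("the", "lucene", "lucene", "lucene"));
        score(docs).entrySet().stream()
            .sorted((a, b) -> Double.compare(b.getValue(), a.getValue()))
            .forEach(e -> System.out.println(e.getKey() + " " + e.getValue()));
    }
}
```

With the sample documents, "the" appears in every document and scores zero, while "lucene" is frequent but missing from one document and ranks first. Other weightings (for example, variance of per-document frequency) might suit a real corpus better.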
On 06/27/2012 05:09 AM, Ian Lea wrote:
All words are important if they help people find what they want.
Maybe you want high frequency terms. See contrib class
org.apache.lucene.misc.HighFreqTerms.
--
Ian.
On Wed, Jun 27, 2012 at 3:04 AM, 齐保元<[email protected]> wrote:
Meaningful just means the word is more important than others, like keywords/keyphrases.
Please define meaningful.
--
Ian.
On Tue, Jun 26, 2012 at 10:39 AM,<[email protected]> wrote:
Hi, does anyone know how to extract meaningful words from a Lucene index?
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]