Top terms relevance from specific documents ?

Yannick Martel Tue, 15 Dec 2015 08:58:24 -0800

Hi!

I am using Lucene (Java) for indexing data, and I want to produce a kind
of tag cloud for specific data.


I've found HighFreqTerms, which gives a list of the top terms from *all
documents* (if I have understood correctly). By the way, I overrode it to
be able to filter on several fields instead of only one.

But it does not really match my need: I'd like to get the most
frequent terms in a single document (or in several specific documents).
For example, considering a document with the fields "Title", "Summary"
and "Description", I am trying to get the count of each term (excluding
the Analyzer's stop words).
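
To make the expected result concrete, here is a minimal plain-Java sketch of
the output I am after. It deliberately does not use Lucene: the class name
TermCountSketch, the methods countTerms and topTerms, and the tiny stop-word
list are all hypothetical and only stand in for what the Analyzer and the
index would provide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class TermCountSketch {
    // Hypothetical stop-word list; in Lucene this would come from the Analyzer.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "is", "for"));

    // Counts each term over the given field values of ONE document,
    // lower-casing tokens and skipping stop words.
    static Map<String, Integer> countTerms(String... fieldValues) {
        Map<String, Integer> counts = new HashMap<>();
        for (String value : fieldValues) {
            // Naive tokenization: split on anything that is not a letter or digit.
            for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
                if (token.isEmpty() || STOP_WORDS.contains(token)) {
                    continue;
                }
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns the n most frequent terms, ties broken alphabetically.
    static List<Map.Entry<String, Integer>> topTerms(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        return entries.subList(0, Math.min(n, entries.size()));
    }

    public static void main(String[] args) {
        // Title, Summary and Description of a single (made-up) document.
        Map<String, Integer> counts = countTerms(
                "Lucene index",
                "The index of Lucene terms",
                "terms and index");
        for (Map.Entry<String, Integer> e : topTerms(counts, 3)) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

What I am looking for is the Lucene way to obtain such per-document counts
from the index, restricted to one or several chosen documents.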

I cannot find a way to do that: I looked at TopFieldCollector and
other collectors, but they seem to give only document scores :/

Finding documentation is not easy, I think, because a lot of the
questions/answers either do not match my need or concern old versions
(3.x for example), and I'm feeling lost in all of this...


Hoping someone can guide me.

Regards,

-- 
Yannick Martel

