I think what you're describing as a "link map" is what I'd call a "tag cloud", where each tag is a "frequent" or "strong" term.

We did something like this as an experiment (without Lucene):
http://www.cognocys.com/prospector/news.html

If you're talking about something similar, then I think Lucene's TFVs will only give you frequency data for individual Documents, not for the results as a whole. I'm no expert, but I say this because I've only ever seen TermFreqVectors discussed in the context of an IndexReader, not in the context of Hits or TopDocs:
http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/index/class-use/TermFreqVector.html

The other thing, though, is that TF may not be sufficient to determine what to use for each tag/link. For example, given a set of Results, R, would you like to use:
1. the top N most frequent terms for each Document in R?
2. the top M most frequent terms that are common to all/many Documents in R?
3. the top O most frequent terms that are common in results built using the highlighter?
...
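
A minimal sketch of options 1 and 2, in Python rather than Lucene code. It assumes you have already extracted a term -> frequency map for each Document in R (for example from each Document's term vector). The function names and the `min_docs` cutoff are made up for illustration and are not part of Lucene.

```python
from collections import Counter


def top_terms_per_document(doc_term_freqs, n):
    """Option 1: the n most frequent terms of each document.

    doc_term_freqs maps a document id to a {term: frequency} dict.
    Ties are broken alphabetically so the output is deterministic.
    """
    result = {}
    for doc_id, freqs in doc_term_freqs.items():
        ranked = sorted(freqs.items(), key=lambda tf: (-tf[1], tf[0]))
        result[doc_id] = [term for term, _ in ranked[:n]]
    return result


def top_shared_terms(doc_term_freqs, m, min_docs=2):
    """Option 2: the m most frequent terms that occur in at least
    min_docs documents, ranked by total frequency across the set."""
    total = Counter()      # summed frequency of each term over all documents
    doc_count = Counter()  # number of documents containing each term
    for freqs in doc_term_freqs.values():
        total.update(freqs)
        doc_count.update(freqs.keys())
    shared = [(t, c) for t, c in total.items() if doc_count[t] >= min_docs]
    shared.sort(key=lambda tc: (-tc[1], tc[0]))
    return shared[:m]
```

Option 1 tends to surface terms that are specific to a single Document, while option 2 favours terms that tie the result set together, so the two can produce quite different clouds for the same R.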

To a certain extent, this is a clustering problem: given some set of Documents, R, which just happen to be the results of some search, represent R using a tag cloud/link map of the terms that best describe R.
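
Once the terms are chosen, turning them into a cloud is mostly a scaling exercise. A rough sketch in Python, assuming you already have one weight per term (a summed frequency, say); the function name, the linear scaling, and the pixel range are illustrative assumptions, not an established method:

```python
def font_sizes(term_weights, min_px=12, max_px=36):
    """Map each term's weight linearly onto a font size in pixels.

    The lightest term gets min_px and the heaviest gets max_px.
    If all weights are equal, every term gets the midpoint size.
    """
    if not term_weights:
        return {}
    lo = min(term_weights.values())
    hi = max(term_weights.values())
    span = hi - lo
    sizes = {}
    for term, weight in term_weights.items():
        # Position of this weight within [lo, hi], as a fraction from 0 to 1.
        scale = (weight - lo) / span if span else 0.5
        sizes[term] = round(min_px + scale * (max_px - min_px))
    return sizes
```

Linear scaling lets one very frequent term dwarf the rest; scaling the logarithm of the weight instead is a common alternative when the frequencies are heavily skewed.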

Have you looked at carrot2? I haven't seen the tag cloud visualization there, but you may find some ideas for clustering/document-set representation there:
http://project.carrot2.org/


Good luck!

-h

On 16-Oct-2008, at 3:21 PM, Darren Govoni wrote:

I guess a link map (as I understand it) is a collection of hyperlinked words/phrases where the dominant ones are shown in a bolder color and a larger font.
It's a relatively new scheme that some sites are using.

For example, someone searches for a person and a link map would show
them all the most frequent terms in the results they got back. Sort of
like latent relationships.

Does that help?

I thought this could be done using term frequency vectors in Lucene, but
I've never used TFVs before. The output could then be limited to just a
set of results.

HTH,
Darren

On Thu, 2008-10-16 at 14:09 -0400, Glen Newton wrote:
Sorry, could you explain what you mean by a "link map over lucene results"?

thanks,
-glen

2008/10/16 Darren Govoni <[EMAIL PROTECTED]>:
Hi,
 Has anyone created a link map over Lucene results, or does anyone know of
a link describing the process? If not, I would like to build one to contribute.

Also, I read about term frequencies in the book, but wanted to know if I
can extract the strongest occurring terms from a given result set or
result?

Thank you for any help. I will keep reading/looking.

Darren


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]