Hi,

 

I was wondering if there's an option to return statistics about distances
from the query terms to the most frequent terms in the result documents.

At present I return the most frequent terms using facetSearch, which returns,
for each word in the result documents, the number of occurrences (within the
results).

The additional information I'm looking for is the average distance between
these terms and my search term.

 

So let's say I have two docs

"the house is red"

"I live in a red house"

The search for "house" should also return the info

the:1

is:1

red:1.5

I:5

live:4

and so on...
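
To make the numbers above concrete, here is a minimal Python sketch of the statistic I mean. It is not Solr code: it assumes plain whitespace tokenization, and it assumes that when a term occurs several times, each occurrence is measured against the nearest occurrence of the search term. The function name average_distances is just made up for this illustration.

```python
from collections import defaultdict

def average_distances(docs, query):
    """Average token distance from each term to the nearest `query` occurrence.

    Assumptions (mine, for illustration only): whitespace tokenization,
    case-sensitive matching, nearest-occurrence distance.
    """
    totals = defaultdict(int)   # sum of distances per term
    counts = defaultdict(int)   # number of occurrences measured per term
    for doc in docs:
        tokens = doc.split()
        query_positions = [i for i, t in enumerate(tokens) if t == query]
        if not query_positions:
            continue  # documents without the search term contribute nothing
        for i, term in enumerate(tokens):
            if term == query:
                continue
            totals[term] += min(abs(i - q) for q in query_positions)
            counts[term] += 1
    return {term: totals[term] / counts[term] for term in totals}

docs = ["the house is red", "I live in a red house"]
result = average_distances(docs, "house")
```

For the two example docs this gives the values listed above, e.g. red is 2 positions away in the first doc and 1 in the second, hence 1.5.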

 

 

As I wasn't able to find such a function, I thought about two solutions for
the problem:

 

1) Use facetSearch and implement a different facet.method which calculates
the average distance of a word to the given search term.

Solr doesn't seem to provide an interface to implement a different method,
so I think this solution would be a bit dodgy and would lead to problems
with the next Solr upgrade.

 

2) Use the TermVectorComponent, which returns the position of each word
within a document, and calculate the distance based on this data in the
application.

But TermVectorComponent returns information per document, which means I would
need to return all documents of the result set, which is probably not
recommended.
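
A rough sketch of what the application side of solution 2) could look like, in Python. The input format here is an assumption: each document is represented as a simplified dict mapping a term to its list of positions, which is not the actual TermVectorComponent response layout. The class name DistanceStats is hypothetical. The point is only that the documents can be processed one at a time with running sums, so the per-document data need not be kept around.

```python
from collections import defaultdict

class DistanceStats:
    """Running average distance of each term to the search term.

    Hypothetical helper: expects one {term: [positions]} dict per document,
    a simplified stand-in for per-document term vector data.
    """

    def __init__(self, query):
        self.query = query
        self.totals = defaultdict(int)  # sum of distances per term
        self.counts = defaultdict(int)  # occurrences measured per term

    def add_document(self, positions_by_term):
        query_positions = positions_by_term.get(self.query, [])
        if not query_positions:
            return  # search term not in this document
        for term, positions in positions_by_term.items():
            if term == self.query:
                continue
            for p in positions:
                # distance to the nearest occurrence of the search term
                self.totals[term] += min(abs(p - q) for q in query_positions)
                self.counts[term] += 1

    def averages(self):
        return {t: self.totals[t] / self.counts[t] for t in self.totals}

stats = DistanceStats("house")
stats.add_document({"the": [0], "house": [1], "is": [2], "red": [3]})
stats.add_document({"I": [0], "live": [1], "in": [2], "a": [3],
                    "red": [4], "house": [5]})
averages = stats.averages()
```

This still requires position data for every document in the result set, so it does not remove the concern above; it only keeps the memory needed in the application small.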

 

 

My question is

a) Did I miss a function of Solr that already does what I'm looking for?

 

b) Is solution 2) feasible even if I always have to return all docs of the
result set (the content doesn't need to be returned though, just the
statistics)?

 

c) Are there interfaces to amend facetSearch in the way I described that I
might have missed?

 

 

 

Thanks

Jens
