Re: pruning search result with search score gradient

2011-01-20 Thread Toke Eskildsen
On Tue, 2011-01-11 at 12:12 +0100, Julien Piquot wrote:
 I would like to be able to prune my search result by removing the less
 relevant documents. I'm thinking about using the search score: I use
 the search scores of the document set (I assume they are sorted in
 descending order), normalise them (0 would be the lowest value and 1
 the greatest value) and then calculate the gradient of the normalised
 scores. The documents with a gradient below a threshold value would be
 rejected.

As part of experimenting with federated search, this is one approach
we'll be trying out to determine which results to discard when merging.

 If the scores are linearly decreasing, then no document is rejected. 
 However, if there is a brutal score drop, then the documents below the 
 drop are rejected.

So if we have the scores
1.0, 0.9, 0.2, 0.15, 0.1, 0.05
then the slopes will be
0.05, 0.4, 0.025, 0.025, 0.025
and with a slope threshold of 0.1, we would discard everything from
score 0.2 and below.
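
A minimal sketch of that cut, under stated assumptions: it uses the plain
difference between neighbouring scores, which gives 0.1, 0.7, 0.05, 0.05,
0.05 for the list above, a different scale from the slopes quoted but the
same cut point. The function names and the threshold of 0.2 are picked for
this illustration only.

```python
def adjacent_drops(scores):
    """Drop from each score to the next; scores must be sorted descending."""
    return [a - b for a, b in zip(scores, scores[1:])]


def prune_at_first_big_drop(scores, threshold):
    """Keep everything above the first drop larger than `threshold`.

    If no drop exceeds the threshold, the full list is kept.
    """
    for i, drop in enumerate(adjacent_drops(scores)):
        if drop > threshold:
            return scores[: i + 1]
    return list(scores)


scores = [1.0, 0.9, 0.2, 0.15, 0.1, 0.05]
kept = prune_at_first_big_drop(scores, threshold=0.2)
# kept == [1.0, 0.9]: everything from score 0.2 and below is discarded
```

With evenly spaced scores such as 1.0, 0.8, 0.6, 0.4 and the same kind of
threshold, no drop stands out and nothing is discarded.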

It makes sense if the scores are linear with the relevance (a document
with score 0.8 has double the relevance of one with 0.4). I don't know
if they are, so experiments must be made, and I fear that this is another
demonstration of the inherent problem with quantifying quality.
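
To illustrate why the linearity assumption matters, here is a small sketch
with made-up scores and an arbitrary square-root rescaling (both are
assumptions for the example, not measurements): a monotone transform keeps
the ranking unchanged but can change whether any drop crosses the threshold.

```python
import math


def first_big_drop(scores, threshold):
    """Index of the first document to discard, or None if nothing is cut."""
    for i in range(len(scores) - 1):
        if scores[i] - scores[i + 1] > threshold:
            return i + 1
    return None


raw = [1.0, 0.64, 0.36, 0.16, 0.04]
rescaled = [math.sqrt(s) for s in raw]  # same order, roughly even spacing

cut_raw = first_big_drop(raw, 0.3)            # cuts after the first document
cut_rescaled = first_big_drop(rescaled, 0.3)  # no drop exceeds 0.3
```

The same ranking is pruned to a single document on one scale and left
untouched on the other, so the threshold only means something once the
relation between score and relevance is known.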

- Toke



Re: pruning search result with search score gradient

2011-01-20 Thread Dennis Gearon
That's a pretty good idea, using 'delta score'.

 Dennis Gearon







Re: pruning search result with search score gradient

2011-01-12 Thread Erick Erickson
What's the use case you're trying to solve? Because if you're
still showing results to the user, you're taking information away
from them. Where are you expecting to get the list? If you try
to return the entire list, you're going to pay the penalty
of creating the entire list and transmitting it across the wire rather
than just a page's worth.

And if you're paging, the user will do this for you by deciding for
herself when she's getting less relevant results.

So I don't understand what value you're trying to provide to the
end user. Perhaps if you elaborate on that, I'll have a more useful
response.

Best
Erick




Re: pruning search result with search score gradient

2011-01-12 Thread Jonathan Rochkind
The times I've _considered_ trying to do this (but generally decided it
wasn't worth it) were when I didn't want the documents below the
threshold to show up in the facet values. In my application the facet
counts are sometimes very pertinent information, and they are not quite
as useful as they could be when they include barely-relevant hits.
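
One Solr-side option for this facet case is a function range filter over the
main query's score; facets are counted over the filtered result set, so
low-scoring hits drop out of the counts as well. A minimal sketch of the
request parameters, assuming the `frange` parser and the `query()` function
are available in the Solr version in use. The helper name, the field name
`category`, the query text and the cutoff 0.5 are made up for illustration.
Note that this is an absolute cutoff on the raw score, not the gradient rule
discussed in this thread.

```python
from urllib.parse import urlencode


def build_params(user_query, min_score, facet_field):
    """Build Solr request parameters that filter out low-scoring hits.

    The fq clause keeps only documents whose score for the main query
    ($q) is at least `min_score` (assumed syntax: {!frange l=...}).
    """
    return {
        "q": user_query,
        "fq": "{!frange l=%s}query($q)" % min_score,
        "facet": "true",
        "facet.field": facet_field,
    }


params = build_params("search score gradient", 0.5, "category")
query_string = urlencode(params)  # append to the core's /select URL
```

Whether a fixed cutoff like this is meaningful depends on the query, since
raw scores are not comparable across different queries.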





pruning search result with search score gradient

2011-01-11 Thread Julien Piquot

Hi everyone,

I would like to be able to prune my search result by removing the less
relevant documents. I'm thinking about using the search score: I use
the search scores of the document set (I assume they are sorted in
descending order), normalise them (0 would be the lowest value and 1
the greatest value) and then calculate the gradient of the normalised
scores. The documents with a gradient below a threshold value would be
rejected.
If the scores are linearly decreasing, then no document is rejected.
However, if there is a brutal score drop, then the documents below the
drop are rejected.
The threshold value would still have to be tuned but I believe it would
make a much stronger metric than an absolute search score.
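
One way to write this proposal down, as a sketch only: the symbols s_i
(score of the i-th document, sorted so that s_1 is the highest and s_n the
lowest), the normalised score, the gradient g_i and the threshold tau are
notation introduced here, and the scores are assumed not to be all equal.

```latex
\hat{s}_i = \frac{s_i - s_n}{s_1 - s_n}, \qquad
g_i = \hat{s}_{i+1} - \hat{s}_i \le 0 \quad (1 \le i < n), \qquad
k = \min \{\, i : g_i < -\tau \,\}
```

Documents 1 to k are kept and documents k+1 to n are rejected; if no g_i
falls below -tau, everything is kept. For linearly decreasing scores every
g_i equals -1/(n-1), so nothing is rejected as long as tau is at least
1/(n-1), which means the usable threshold depends on the result set size.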


What do you think about this approach? Do you see any problem with it?
Are there any Solr tools that could help me deal with that?


Thanks for your answer.

Julien


Re: pruning search result with search score gradient

2011-01-11 Thread Grijesh.singh

Look at Solr Function Queries; they might help you.

-
Grijesh