There is one case that I can think of where this 'constant' scoring
would be useful, and I think Chuck already mentioned it 1-2 months
ago.  For instance, having such scores would allow one to create alert
applications where queries run by some scheduler would trigger an alert
whenever the score is > X.  So that is where the absolute value of the
score would be useful.
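
To make the alert idea concrete, here is a rough sketch of what I have
in mind.  It is not Lucene code: cosine_score and alerts are
hypothetical helpers, and the term weights are plain dicts that I made
up for illustration.  It assumes scores are a pure cosine over
non-negative term weights, so they always fall between 0 and 1 and a
fixed threshold X can be applied to any query.

```python
import math


def cosine_score(query_weights, doc_weights):
    """Cosine between two sparse, non-negative term-weight vectors.

    Both arguments are dicts mapping term -> weight (hypothetical
    representation, not Lucene's). The result lies in [0, 1].
    """
    dot = sum(w * doc_weights.get(term, 0.0)
              for term, w in query_weights.items())
    q_norm = math.sqrt(sum(w * w for w in query_weights.values()))
    d_norm = math.sqrt(sum(w * w for w in doc_weights.values()))
    if q_norm == 0.0 or d_norm == 0.0:
        return 0.0  # an empty vector matches nothing
    return dot / (q_norm * d_norm)


def alerts(query_weights, docs, threshold):
    """Return ids of documents whose score is strictly above threshold."""
    return [doc_id for doc_id, weights in docs.items()
            if cosine_score(query_weights, weights) > threshold]
```

A scheduler would call alerts() with the stored query and X, and send
a notification whenever the returned list is non-empty.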

I believe Chuck submitted some code that fixes this, which also helps
with MultiSearcher, where you have to have this constant score in order
to properly order hits from different Searchers, but I didn't dare to
touch that code without studying it further, for which I didn't have
time.
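
The ordering problem can be sketched like this.  Again, this is not
the MultiSearcher API or Chuck's patch: merge_hits is a hypothetical
helper, and hits are just (doc_id, score) tuples.  The merge is only
meaningful under the assumption that scores from the different
searchers are on the same scale, which is exactly what the constant
score would give you.

```python
def merge_hits(*hit_lists, top_n=10):
    """Merge hit lists from several searchers into one ranked list.

    Each hit is a (doc_id, score) tuple. Assumes the scores are
    comparable across searchers; otherwise the order is arbitrary.
    """
    merged = [hit for hits in hit_lists for hit in hits]
    # Highest score first; the sort is stable, so ties keep input order.
    merged.sort(key=lambda hit: hit[1], reverse=True)
    return merged[:top_n]
```

If each searcher normalizes its scores by its own top hit instead,
every searcher's best hit ends up at 1.0 and the merged order no
longer says anything about relative quality.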

Otis


--- Doug Cutting <[EMAIL PROTECTED]> wrote:

> Chuck Williams wrote:
> > I believe the biggest problem with Lucene's approach relative to
> > the pure vector space model is that Lucene does not properly
> > normalize.  The pure vector space model implements a cosine in the
> > strictly positive sector of the coordinate space.  This is guaranteed
> > intrinsically to be between 0 and 1, and produces scores that can be
> > compared across distinct queries (i.e., "0.8" means something about
> > the result quality independent of the query).
> 
> I question whether such scores are more meaningful.  Yes, such scores
> would be guaranteed to be between zero and one, but would 0.8 really
> be meaningful?  I don't think so.  Do you have pointers to research
> which demonstrates this?  E.g., when such a scoring method is used,
> that thresholding by score is useful across queries?
> 
> Doug
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 


