> From: Doug Cutting [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, December 15, 2004 12:35 PM
> To: Lucene Users List
> Subject: Re: A question about scoring function in Lucene
>
Chris Hostetter wrote:
For example, using the current scoring equation, if I do a search for
"Doug Cutting" and the results/scores I get back are...
1: 0.9
2: 0.3
3: 0.21
4: 0.21
5: 0.1
...then there are at least two meaningful pieces of data I can glean:
Otis Gospodnetic wrote:
There is one case that I can think of where this 'constant' scoring
would be useful, and I think Chuck already mentioned this 1-2 months
ago. For instance, having such scores would allow one to create alert
applications where queries run by some scheduler would trigger an alert
whenever the score i
: I question whether such scores are more meaningful. Yes, such scores
: would be guaranteed to be between zero and one, but would 0.8 really be
: meaningful? I don't think so. Do you have pointers to research which
: demonstrates this? E.g., when such a scoring method is used, that
: threshold
Chuck Williams wrote:
I believe the biggest problem with Lucene's approach relative to the pure vector space model is that Lucene does not properly normalize. The pure vector space model implements a cosine in the strictly positive sector of the coordinate space. This is guaranteed intrinsically
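The normalization point can be illustrated with a small Python sketch (plain Python, not Lucene code). It assumes term weights are non-negative, which is what keeps the vectors in the positive sector, so the cosine falls between 0 and 1. The vectors and their weights are hypothetical values chosen for illustration.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_u * norm_v)

# Hypothetical non-negative term weights over a three-term vocabulary.
query = [1.2, 0.0, 0.7]
doc_a = [2.4, 0.0, 1.4]  # same direction as the query
doc_b = [0.0, 3.0, 0.0]  # shares no terms with the query
doc_c = [0.5, 1.0, 0.0]  # partial overlap

scores = [cosine(query, d) for d in (doc_a, doc_b, doc_c)]
print(scores)
```

A document pointing in the same direction as the query scores 1 (up to rounding), a document with no shared terms scores 0, and everything else lands in between.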
> From: Nhan Nguyen Dang [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, December 15, 2004 1:18 AM
> To: Lucene Users List
> Subject: RE: A question about scoring function in Lucene
>
> Thanks for your answer,
> In Lucene's scoring function, they use only norm_q,
> but f
w.emeraldinsight.com/rpsv/cgi-bin/emft.pl
> if you sign up for an eval.
>
> It's easy to correct for idf^2 by using a custom
> Similarity that takes a final square root.
>
> Chuck
>
> > -----Original Message-----
> > From: Vikas Gupta [mailto:[EMAIL PROTECTED]
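A minimal sketch of the idf^2 remark in the reply above, in plain Python and not against Lucene's Similarity class. Since idf_t appears once in the query weight and once in the document weight, each matching term contributes idf_t squared. One reading of "a final square root" is to return the square root from the idf function, which brings the contribution back to a single power of idf. The function names are hypothetical, the log form of idf is an assumption not taken from the thread, and the norms are left out.

```python
import math

def default_idf(doc_freq, num_docs):
    # Assumed idf form for illustration: ln(numDocs / (docFreq + 1)) + 1
    return math.log(num_docs / (doc_freq + 1)) + 1.0

def sqrt_idf(doc_freq, num_docs):
    # Hypothetical corrected idf: square root of the assumed default
    return math.sqrt(default_idf(doc_freq, num_docs))

def term_contribution(idf_fn, doc_freq, num_docs, tf_q=1.0, tf_d=1.0):
    # idf enters twice: once via the query weight, once via the document weight
    idf = idf_fn(doc_freq, num_docs)
    return (tf_q * idf) * (tf_d * idf)

squared = term_contribution(default_idf, doc_freq=9, num_docs=1000)
corrected = term_contribution(sqrt_idf, doc_freq=9, num_docs=1000)
print(squared, corrected, default_idf(9, 1000))
```

With the square-root variant the per-term contribution equals the plain idf value (up to rounding), while the default variant yields its square.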
> From: Vikas Gupta [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, December 14, 2004 9:32 PM
> To: Lucene Users List
> Subject: Re: A question about scoring function in Lucene
>
Lucene uses the vector space model. To understand that:
-Read section 2.1 of the "Space optimizations for Total Ranking" paper (linked
here: http://lucene.sourceforge.net/publications.html)
-Read sections 6 to 6.4 of
http://www.csee.umbc.edu/cadip/readings/IR.report.120600.book.pdf
-Read section 1 of
ht
Hi all,
Lucene scores a document based on the correlation between
the query q and document d:
(this is the raw function; I am ignoring the
boost_t and coord_q_d factors)
score_d = sum_t( tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t )  (*)
Could anybody explain it in detail? Or are there any
p
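For readers following along, formula (*) can be written out as a short Python sketch. This is only a literal transcription of the formula as posted, not Lucene's implementation. The dictionary layout, the per-term norm_d lookup, and all the numbers are assumptions made for illustration.

```python
def score(query_tf, doc_tf, idf, norm_q, norm_d):
    """Sum over query terms of (tf_q * idf_t / norm_q) * (tf_d * idf_t / norm_d_t)."""
    total = 0.0
    for term, tf_q in query_tf.items():
        tf_d = doc_tf.get(term, 0.0)  # a term absent from the document adds 0
        query_weight = tf_q * idf[term] / norm_q
        doc_weight = tf_d * idf[term] / norm_d[term]
        total += query_weight * doc_weight
    return total

# Hypothetical values for a two-term query.
query_tf = {"doug": 1.0, "cutting": 1.0}
doc_tf = {"doug": 2.0, "cutting": 1.0}
idf = {"doug": 2.0, "cutting": 3.0}
norm_q = 2.0
norm_d = {"doug": 4.0, "cutting": 4.0}

result = score(query_tf, doc_tf, idf, norm_q, norm_d)
print(result)
```

Note that idf_t is multiplied in twice per term, once on the query side and once on the document side, which is where the idf^2 behaviour discussed elsewhere in the thread comes from.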