Hi! I have a question about sorting: I don’t understand why, in one of my
tests, a hit with a lower score is ranked before hits with higher scores.

I am using Lucene 5.2.1.



I have two CustomScoreQuery subqueries on two fields, subquery 1 and
subquery 2, and two test cases:

case 1: at the end of the customScore method of CustomScoreProvider, the two
calculated custom scores are multiplied by the same factor, which depends on
the date of the match

case 2: the two calculated custom scores are *not* multiplied by the date
factor.



All tests use the same Sort: by score, then by date.
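
To make the expected ordering concrete, here is a minimal plain-Java sketch of "by score, then by date", with no Lucene classes involved. The `Hit` class, its field names, the numeric date representation, and the ascending date tie-break are assumptions made for illustration only; they are not taken from my actual Sort.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ScoreThenDateSort {

    /** Hypothetical stand-in for a search hit: a name, a score, and a date. */
    public static final class Hit {
        final String name;
        final double score;
        final long date; // e.g. epoch millis; the representation is an assumption

        public Hit(String name, double score, long date) {
            this.name = name;
            this.score = score;
            this.date = date;
        }
    }

    /** Higher score first; ties broken by earlier date first (assumed direction). */
    public static final Comparator<Hit> BY_SCORE_THEN_DATE =
            Comparator.comparingDouble((Hit h) -> h.score).reversed()
                      .thenComparingLong(h -> h.date);

    /** Returns the hit names in ranked order, leaving the input list untouched. */
    public static List<String> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(BY_SCORE_THEN_DATE);
        List<String> names = new ArrayList<>();
        for (Hit h : sorted) {
            names.add(h.name);
        }
        return names;
    }
}
```

Under this ordering the date only matters when two scores are equal, so a hit with a strictly higher score should always come first.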



Case 1: with date factor:



Test 1: subquery 1 only:

two hits, doc A (date A) gets the score A1, doc B (date B) gets the score
B1: score A1 > score B1, date A < date B, and doc A is ranked before doc B

Explanation:

doc A score A1 shardIndex=0 fields=[score A1, date A]

doc B score B1 shardIndex=0 fields=[score B1, date B]



That's correct.





Test 2: subquery 1 and subquery 2 combined as MUST clauses:

the same two docs match: doc A (date A) gets the score A2, doc B (date B)
gets the score B2: score A2 *<* score B2, date A < date B, and *doc A is
ranked before doc B*

Explanation:

doc A score A2 shardIndex=0 fields=[score A1, date A]

doc B score B2 shardIndex=0 fields=[score B1, date B]



*doc A is ranked before doc B although score A2 < score B2 and sorting
should use scores A2 and B2, not A1 and B1.*







Case 2: without date factor:



Test 1: subquery 1 only:

doc A (date A) gets the score A1, doc B (date B) gets the score B1: score
A1 > score B1, date A < date B, and doc A is ranked before doc B

Explanation:

doc A score A1 shardIndex=0 fields=[score A1, date A]

doc B score B1 shardIndex=0 fields=[score B1, date B]





Test 2: subquery 1 and subquery 2 combined as MUST clauses:

the same two docs match: doc A (date A) gets the score A2, doc B (date B)
gets the score B2: score A2 *>* score B2, date A < date B, and doc A is
ranked before doc B

Explanation:

doc A score A2 shardIndex=0 fields=[score A1, date A]

doc B score B2 shardIndex=0 fields=[score B1, date B]



Using score A1 here works: without the date factor, all the hits of test 2
match subquery 2 in the same way and get the same sub-score. The explanation
shows that in this case the score = field[0] score + the common sub-score of
the hits, so sorting by the current score gives the same order as sorting by
the field[0] score.



But with the date factor this is no longer true: the sort [score, date]
should use the current scores of test 2, not those of test 1.
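
To illustrate the point with numbers, here is a small plain-Java sketch. The raw sub-scores, the date factors, and the assumption that the combined score of test 2 is the sum of the two factored sub-scores are all hypothetical; they are not taken from my index and were only chosen to reproduce the pattern described above.

```java
public class DateFactorOrdering {

    /** Custom score of one subquery: raw score times the per-document date factor. */
    public static double customScore(double rawScore, double dateFactor) {
        return rawScore * dateFactor;
    }

    /** Combined score of test 2, assumed here to be the sum of both sub-scores. */
    public static double combined(double raw1, double raw2, double dateFactor) {
        return customScore(raw1, dateFactor) + customScore(raw2, dateFactor);
    }

    /** True when doc A outranks doc B by score alone (no tie-break needed). */
    public static boolean aBeforeB(double scoreA, double scoreB) {
        return scoreA > scoreB;
    }

    /** Ranking of test 1 (subquery 1 only) for the given raw scores and factors. */
    public static boolean aBeforeBInTest1(double rawA1, double rawB1,
                                          double factorA, double factorB) {
        return aBeforeB(customScore(rawA1, factorA), customScore(rawB1, factorB));
    }

    /** Ranking of test 2 (both subqueries) with a common raw subquery-2 score. */
    public static boolean aBeforeBInTest2(double rawA1, double rawB1, double common,
                                          double factorA, double factorB) {
        return aBeforeB(combined(rawA1, common, factorA),
                        combined(rawB1, common, factorB));
    }
}
```

With raw subquery 1 scores of 3.0 and 1.0, a common subquery 2 score of 2.0, and no date factor (1.0 for both docs), doc A wins in both tests: 3.0 > 1.0 and 5.0 > 3.0. With hypothetical date factors of 1.0 for doc A and 2.0 for doc B, A1 = 3.0 > B1 = 2.0 but A2 = 5.0 < B2 = 6.0, so the order by test 1 scores and the order by test 2 scores disagree.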





Could someone please enlighten me? Am I making a mistake somewhere?



Claude Lepère
