On Sep 5, 2008, at 6:27 PM, Ravindra Sharma wrote:
Hi Folks,
I have a somewhat complex scoring/boosting requirement.
Say I have three text fields, A, B, and C, and a numeric field called D.
Say my query is "testrank".
Scoring should be based on the following.
The query matches:
1. text fields A, B and C, & Highest value of D (highest boost/rank)
2. A and B, & Highest value of D (2nd highest)
3. A and C, & Highest value of D (3rd highest)
4. B and C, & Highest value of D (4th highest)
5. B, & Highest value of D (5th highest)
6. C, & Highest value of D (6th highest)
i). If I use the standard query, it will be a query (with boosts) something like this:
query = (A:testrank AND B:testrank AND C:testrank)^10
     OR (A:testrank AND B:testrank)^9
     OR (A:testrank AND C:testrank)^8
     OR (B:testrank AND C:testrank)^7
     OR (A:testrank)^6
     OR (B:testrank)^5
     OR (C:testrank)^4
sort = by score (primary), field D (secondary)
Also, I need to override Similarity so that tf, idf, etc. don't interfere and all docs score purely on the boost values I have specified. That way the secondary sort can be effective.
This will be a poor query, so I would like to avoid it.
Why is it poor? I admit, I'm not fully following what you are trying
to do. Perhaps, taking a step back and letting us know the bigger
picture you want to solve will help. For example, how did you come up
w/ the need for the scoring algorithm above? Is this research or are
you trying to factor in PageRank or something like that?
-Grant
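
For reference, the ordering the quoted query is after can be sketched outside Lucene. This is only a model, not Lucene code: it assumes each matching OR clause contributes exactly its boost and nothing else (no tf, idf, coord, or norms), which is what the Similarity override described above is meant to achieve. The names CLAUSE_BOOSTS, flat_score, and rank are hypothetical, and the sample docs are made up for illustration.

```python
# Boost table copied from the quoted query. Hypothetical model, not a Lucene API.
CLAUSE_BOOSTS = {
    frozenset("ABC"): 10,
    frozenset("AB"): 9,
    frozenset("AC"): 8,
    frozenset("BC"): 7,
    frozenset("A"): 6,
    frozenset("B"): 5,
    frozenset("C"): 4,
}


def flat_score(matched_fields):
    """Sum the boosts of every OR clause whose fields all matched.

    Assumes a flattened Similarity: a matching clause adds its boost only.
    """
    matched = frozenset(matched_fields)
    return sum(
        boost for fields, boost in CLAUSE_BOOSTS.items() if fields <= matched
    )


def rank(docs):
    """Sort by flat score (primary), then by D (secondary), both descending."""
    return sorted(
        docs,
        key=lambda d: (flat_score(d["matched"]), d["D"]),
        reverse=True,
    )


# Made-up sample documents: which fields matched "testrank", and the value of D.
docs = [
    {"id": 1, "matched": "C", "D": 99},
    {"id": 2, "matched": "AB", "D": 5},
    {"id": 3, "matched": "ABC", "D": 1},
    {"id": 4, "matched": "AB", "D": 7},
]
ranked_ids = [d["id"] for d in rank(docs)]
```

Under this model a doc matching all three fields also satisfies every narrower clause, so its score is the sum of all seven boosts rather than 10. The tiers still come out in the intended order, and D only breaks ties within a tier.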