On Sep 5, 2008, at 6:27 PM, Ravindra Sharma wrote:

Hi Folks,

I have a somewhat complex scoring/boosting requirement.

Say I have 3 text fields A, B, C and a numeric field called D.
Say my query is "testrank".

Scoring should be based on the following:

Query matches
1. text fields A, B and C, & Highest value of D (highest boost/rank)
2. A and B, & Highest value of D (2nd highest)
3. A and C, & Highest value of D (3rd highest)
4. B and C, & Highest value of D (4th highest)
5. B, & Highest value of D (5th highest)
6. C, & Highest value of D (6th highest)

i). If I use a standard query, it would be a boosted query something
like this:

query = (A:testrank AND B:testrank AND C:testrank)^10 OR (A:testrank AND
B:testrank)^9 OR (A:testrank AND C:testrank)^8 OR (B:testrank AND
C:testrank)^7 OR (A:testrank)^6 OR (B:testrank)^5 OR (C:testrank)^4
sort = by Score (primary), Field D (Secondary)
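
For illustration only, here is a minimal sketch in plain Python (not Lucene
code) of the ordering the query and sort above are meant to produce. It
assumes each document is scored only by the boost of the best clause it
matches, with ties broken by D in descending order. The names TIERS,
matched_fields, tier_boost and rank, and the sample documents, are
hypothetical and not taken from any library.

```python
# Hypothetical boost table copied from the query in the message:
# each entry is (set of fields that must all match, boost).
TIERS = [
    ({"A", "B", "C"}, 10),
    ({"A", "B"}, 9),
    ({"A", "C"}, 8),
    ({"B", "C"}, 7),
    ({"A"}, 6),
    ({"B"}, 5),
    ({"C"}, 4),
]
TEXT_FIELDS = ("A", "B", "C")


def matched_fields(doc, term):
    """Return the text fields whose whitespace-split tokens contain the term."""
    return {f for f in TEXT_FIELDS if term in doc.get(f, "").lower().split()}


def tier_boost(doc, term):
    """Boost of the highest tier the document satisfies, or 0 if none match."""
    hits = matched_fields(doc, term)
    for required, boost in TIERS:
        if required <= hits:  # every required field matched
            return boost
    return 0


def rank(docs, term):
    """Keep matching docs; sort by tier boost, then by D, both descending."""
    scored = [(tier_boost(d, term), d) for d in docs]
    matching = [(b, d) for b, d in scored if b > 0]
    matching.sort(key=lambda pair: (pair[0], pair[1]["D"]), reverse=True)
    return [d for _, d in matching]


# Made-up sample documents, for illustration only.
docs = [
    {"id": 1, "A": "testrank", "B": "other", "C": "other", "D": 50},
    {"id": 2, "A": "testrank", "B": "testrank", "C": "testrank", "D": 1},
    {"id": 3, "A": "testrank", "B": "testrank", "C": "testrank", "D": 7},
    {"id": 4, "A": "other", "B": "other", "C": "testrank", "D": 99},
    {"id": 5, "A": "none", "B": "none", "C": "none", "D": 100},
]
print([d["id"] for d in rank(docs, "testrank")])  # prints [3, 2, 1, 4]
```

Documents 2 and 3 share the top tier and are separated only by D, while a
large D (documents 4 and 5) never lifts a document above a higher tier.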

Also, I do need to override Similarity such that tf, idf etc. don't
interfere, and all docs score purely based on the boost values I have
specified. That way the secondary sort can be effective.

This will be a poor query so I would like to avoid it.

Why is it poor? I admit, I'm not fully following what you are trying to do. Perhaps taking a step back and letting us know the bigger problem you want to solve will help. For example, how did you come up w/ the need for the scoring algorithm above? Is this research, or are you trying to factor in PageRank or something like that?


-Grant
