David Spencer wrote:

+(f1:t1^2.0 t1) +(f1:t2^2.0 t2) f1:"t1 t2"~5^3.0 "t1 t2"~2^1.5

(f1:t1^2.0 t1) (f1:t2^2.0 t2) f1:"t1 t2"~5^3.0 "t1 t2"~2^1.5

(f1:t1^2.0 t1) (f1:t2^2.0 t2) (f1:t3^2.0 t3) (f1:t4^2.0 t4) (f1:t5^2.0 t5) f1:"t1 t2 t3 t4 t5"~5^3.0 "t1 t2 t3 t4 t5"~2^1.5

This looks great to me! I'd make mand=true by default, i.e., have a method where this parameter is not specified. Similarly, we might default phraseBoosts[i] to boolBoosts[i]*phraseBoost, and slops to infinity. What we want is something that provides only the knobs that we think most folks will need. Ideally we wouldn't even need to specify fieldBoosts. Short fields like titles get a larger lengthNorm, which effectively boosts them a lot already.
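
To make those defaults concrete, here is a minimal Python sketch that builds query strings in the form quoted above. It is an illustration only, not Lucene code. The name expand_query, the use of None for the default field, the value 1.5 for phraseBoost (inferred from the 2.0 -> 3.0 and 1.0 -> 1.5 boosts in the examples), and 2**31 - 1 as a stand-in for an "infinite" slop are all assumptions.

```python
# Hypothetical helper: renders the multi-field expansion as a query string.
# Nothing here is part of any Lucene API.

INFINITE_SLOP = 2**31 - 1  # assumed stand-in for "slop = infinity"


def expand_query(terms, fields, bool_boosts, slops=None,
                 phrase_boosts=None, phrase_boost=1.5, mand=True):
    """Build one boolean clause per term plus one sloppy phrase per field.

    fields[i] is a field name, or None for the default (unprefixed) field.
    """
    # Default suggested in the message: phraseBoosts[i] = boolBoosts[i] * phraseBoost
    if phrase_boosts is None:
        phrase_boosts = [b * phrase_boost for b in bool_boosts]
    # Default suggested in the message: slops default to "infinity"
    if slops is None:
        slops = [INFINITE_SLOP] * len(fields)

    def prefix(field):
        return f"{field}:" if field else ""

    def boost(b):
        # A boost of 1.0 is the neutral value, so it is left out of the string.
        return "" if b == 1.0 else f"^{b}"

    clauses = []
    for term in terms:
        parts = [f"{prefix(f)}{term}{boost(b)}"
                 for f, b in zip(fields, bool_boosts)]
        required = "+" if mand else ""
        clauses.append(f"{required}({' '.join(parts)})")

    phrase = " ".join(terms)
    for f, slop, pb in zip(fields, slops, phrase_boosts):
        clauses.append(f'{prefix(f)}"{phrase}"~{slop}{boost(pb)}')

    return " ".join(clauses)
```

Called as expand_query(["t1", "t2"], ["f1", None], [2.0, 1.0], slops=[5, 2]), this reproduces the first quoted query. Passing mand=False drops the leading "+" signs and gives the second.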


But perhaps we should back off and first just evaluate single-field search with different idf and tf (and perhaps lengthNorm and sloppyFreq) definitions. Once we're happy with those, we can return to the different multi-field query formulations.

Let's start with the issue that's been raised so much: whether idf is better defined with log() or sqrt(log()).
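
As a rough way to see what is at stake, the two candidates can be compared side by side. The sketch below assumes the base formula log(numDocs / (docFreq + 1)) + 1, the shape used by Lucene's default similarity, and takes the sqrt variant to be simply the square root of that value. The function names are hypothetical, and the second definition is my reading of "sqrt(log())", not a confirmed formula.

```python
import math


def idf_log(num_docs, doc_freq):
    # Assumed base definition: log(numDocs / (docFreq + 1)) + 1
    return math.log(num_docs / (doc_freq + 1)) + 1.0


def idf_sqrt_log(num_docs, doc_freq):
    # Assumed alternative: square root of the definition above
    return math.sqrt(idf_log(num_docs, doc_freq))


def rare_to_common_ratio(idf, num_docs, rare_df, common_df):
    # How much more a rare term weighs than a common one under a given idf
    return idf(num_docs, rare_df) / idf(num_docs, common_df)


if __name__ == "__main__":
    n = 1000
    for name, idf in (("log", idf_log), ("sqrt(log)", idf_sqrt_log)):
        ratio = rare_to_common_ratio(idf, n, rare_df=1, common_df=499)
        print(f"{name}: rare/common weight ratio = {ratio:.2f}")
```

Under these assumptions the sqrt variant compresses the gap between rare and common terms, which is the effect an evaluation would need to measure against real relevance judgments.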

Doug
