Doug Cutting wrote:

David Spencer wrote:


+(f1:t1^2.0 t1) +(f1:t2^2.0 t2) f1:"t1 t2"~5^3.0 "t1 t2"~2^1.5

(f1:t1^2.0 t1) (f1:t2^2.0 t2) f1:"t1 t2"~5^3.0 "t1 t2"~2^1.5

(f1:t1^2.0 t1) (f1:t2^2.0 t2) (f1:t3^2.0 t3) (f1:t4^2.0 t4) (f1:t5^2.0 t5) f1:"t1 t2 t3 t4 t5"~5^3.0 "t1 t2 t3 t4 t5"~2^1.5


This looks great to me! I'd make mand=true by default, i.e., have a method where this parameter is not specified. Similarly, we might default phraseBoosts[i] to boolBoosts[i]*phraseBoost, and slops to infinity. What we want is something that provides only the knobs that we think most folks will need. Ideally we wouldn't even need to specify fieldBoosts. Short fields like titles get a larger lengthNorm, which effectively boosts them a lot already.

Yeah, I agree w/ all of the above: offer options, but have easy-to-use ways of calling it w/ intelligent defaults.
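
As a rough illustration of the defaults discussed above, here is a minimal sketch that assembles the query strings shown earlier. The function name `build_query` and its signature are hypothetical. The parameters `mand`, `bool_boosts`, `phrase_boost`, `phrase_boosts` and `slops` mirror the names used in the thread, and the default `phrase_boost=1.5` is only inferred from the sample queries (2.0 * 1.5 = 3.0). It only builds a string and does not call any Lucene API.

```python
def build_query(terms, fields, bool_boosts, slops,
                phrase_boost=1.5, mand=True, phrase_boosts=None):
    """Build a multi-field query string (illustrative sketch only).

    fields: field names; None stands for the default (unprefixed) field.
    phrase_boosts defaults to bool_boosts[i] * phrase_boost, as suggested
    in the thread.
    """
    if phrase_boosts is None:
        phrase_boosts = [b * phrase_boost for b in bool_boosts]

    def term_clause(field, term, boost):
        prefix = f"{field}:" if field else ""
        # A boost of 1.0 is the neutral value, so it is left implicit.
        suffix = "" if boost == 1.0 else f"^{boost}"
        return f"{prefix}{term}{suffix}"

    clauses = []
    # One boolean group per term, searched across all fields.
    for term in terms:
        group = " ".join(term_clause(f, term, b)
                         for f, b in zip(fields, bool_boosts))
        clauses.append(("+" if mand else "") + f"({group})")

    # One sloppy phrase clause per field, always optional.
    phrase = " ".join(terms)
    for f, slop, boost in zip(fields, slops, phrase_boosts):
        prefix = f"{f}:" if f else ""
        clauses.append(f'{prefix}"{phrase}"~{slop}^{boost}')

    return " ".join(clauses)


print(build_query(["t1", "t2"], ["f1", None], [2.0, 1.0], [5, 2]))
```

With `mand=True` (the proposed default) this reproduces the first sample query; passing `mand=False` drops the `+` prefixes and gives the second form.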

But perhaps we should back off and first just evaluate single field search with different idf, tf (and perhaps lengthNorm and sloppyFreq) definitions. Once we're happy with those, then we should return to different multi-field query formulations.


Let's start with the issue that's been raised so much: whether idf is better defined with log() or sqrt(log()).
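
To make the two candidates concrete, here is a small sketch comparing them. The log() form follows the shape of Lucene's default idf, log(numDocs / (docFreq + 1)) + 1. The sqrt variant is an assumption about what "sqrt(log())" means here (the square root of that same quantity) and may differ from the Similarity attached to the bug report. The function names are hypothetical.

```python
import math


def idf_log(doc_freq, num_docs):
    # Shape of the default definition: log(numDocs / (docFreq + 1)) + 1
    return math.log(num_docs / (doc_freq + 1)) + 1.0


def idf_sqrt_log(doc_freq, num_docs):
    # Assumed variant: square root of the log-based idf above.
    return math.sqrt(idf_log(doc_freq, num_docs))


# Compare how strongly each definition separates a rare term
# from a common one in a 1000-document index.
for doc_freq in (9, 99, 999):
    print(doc_freq,
          round(idf_log(doc_freq, 1000), 3),
          round(idf_sqrt_log(doc_freq, 1000), 3))
```

The sqrt form compresses the gap between rare and common terms, which is the practical difference to look for when comparing result lists side by side.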

I can redo my page and rebuild the indexes if necessary. I just need clarification on what we want to do, especially: does the index need to be rebuilt?


[1]

I currently have 2 variations on the index: one w/ the default settings and another with the Similarity code Chuck attached to the bug report. Do we need other variations on the index, e.g. with different weights, or are the weights used during indexing less important than the log() vs. sqrt(log()) issue?

[2]

I guess it's obvious from the above, but just to make it clear: I'll change the page to only do single field queries. But how many variations do we want to see in parallel? The current page shows 2x2 results, one for each combo of index and query, but I could, say, show several more queries in parallel w/ different weights...



Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


