: Maybe what I really need is a query parser that does not do "disjunction
: maximum" at all, but somehow still combines different 'qf' type fields with
: different boosts on each field. I personally don't _necessarily_ need the
: actual "disjunction max" calculation, but I do need combining of mu
Yeah, I see your points. It's complicated. I'm not sure either.
But the thing is:
> in order to use a feature like that you'd have to really think hard about
> the query analysis of your fields, and which ones will produce which
> tokens in which situations
You need to think really hard about
: not other) setups/intentions. It's counter-intuitive to me that adding
: a field to the 'qf' set results in _fewer_ hits than the same 'qf' set
agreed .. but that's where looking at the debug info comes in to understand
the reason for that behavior is that your old qf treated part of your
inp
Thanks, that's helpful.
It still seems like current behavior does the "wrong" thing in _many_ cases (I
know a lot of people get tripped up by it, sometimes on this list) -- but I
understand your cases where it does the right thing, and where what I'm
suggesting would be the wrong thing.
> Ul
: It seems like the problem is when different fields in the 'qf' produce a
: different number of tokens for a given query. dismax needs to know the number
: of tokens in the input in order to calculate 'mm', when 'mm' is expressed as a
: percentage, or when different mm's are given for different
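A toy sketch of the token-count mismatch described above may help. It is not Solr's code: the two analyzers, the field names `text_field` and `exact_field`, and the round-down rule for a percentage 'mm' are all assumptions made for illustration.

```python
# Toy model only: NOT Solr's implementation. Field names and analyzers
# below are hypothetical stand-ins for two differently analyzed 'qf' fields.

def analyze_text(chunk):
    """Text-like field: lowercases and drops chunks with no letters or digits."""
    token = "".join(ch for ch in chunk.lower() if ch.isalnum())
    return [token] if token else []

def analyze_exact(chunk):
    """Exact-match field: keeps every chunk verbatim, punctuation included."""
    return [chunk]

ANALYZERS = {"text_field": analyze_text, "exact_field": analyze_exact}

def count_clauses(query, qf):
    """Count whitespace-separated chunks that yield a token in at least one field."""
    return sum(
        1
        for chunk in query.split()
        if any(ANALYZERS[field](chunk) for field in qf)
    )

def required_matches(num_clauses, mm_percent):
    """Clauses that must match for a percentage 'mm' (assumed to round down)."""
    return num_clauses * mm_percent // 100

query = "churchill : roosevelt"
only_text = count_clauses(query, ["text_field"])
both = count_clauses(query, ["text_field", "exact_field"])
print(only_text, required_matches(only_text, 100))
print(both, required_matches(both, 100))
```

In this model the lone ":" yields no token in `text_field`, so with that field alone there are two clauses and mm=100% asks for two matches. Adding `exact_field` keeps ":" as a token, so there are three clauses and all three must match, and the ":" clause can only be satisfied by `exact_field`. That is one way adding a field to 'qf' could produce fewer hits.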
Thanks. I'm trying to think through if there's any hypothetical way for
dismax to be improved to not be subject to this problem. Now that it's
clear that the problem isn't just with stopwords, and that in fact it's
very hard to predict if you'll get the problem and under what input,
when creat
Jonathan:
Thanks for writing that up, you're right, it is arcane.
I've starred this one!
Erick
>
> http://lucene.472066.n3.nabble.com/Dismax-Minimum-Match-Stopwords-Bug-td493483.html
> http://bibwild.wordpress.com/2010/04/14/solr-stop-wordsdismax-gotcha/
>
> So to understand, first familiari
Okay, I figured this one out -- I'm participating in a thread with
myself here, but for benefit of posterity, or if anyone's interested,
it's kind of interesting.
It's actually a variation of the known issue with dismax, mm, and fields
with varying stopwords. Actually a pretty tricky problem w
Okay, let's try the debug trace again without a pf to be less confusing.
One field in qf, that's ordinary text tokenized, and does get hits:
q=churchill%20%3A%20roosevelt&qt=search&qf=title1_t&mm=100%&debugQuery=true&pf=
churchill : roosevelt
churchill : roosevelt
+((DisjunctionMaxQuery((title
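For anyone wanting to reproduce a request like the one above, here is a minimal sketch of building the query string with Python's standard library. The parameter values are copied from the request above. One difference to note: a literal "%" in a URL normally needs to be encoded as "%25", so this sketch emits mm=100%25 rather than the bare mm=100% shown in the trace.

```python
from urllib.parse import quote, urlencode

# Parameter values copied from the request in the trace above.
params = {
    "q": "churchill : roosevelt",
    "qt": "search",
    "qf": "title1_t",
    "mm": "100%",
    "debugQuery": "true",
    "pf": "",  # deliberately empty, as in the original request
}

# quote_via=quote encodes spaces as %20 (the default, quote_plus, would use "+").
query_string = urlencode(params, quote_via=quote)
print(query_string)
```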
I'm aware that using a field tokenized with KeywordTokenizerFactory
in a dismax 'qf' is often going to result in 0 hits on that field --
(when a whitespace-containing query is entered). But I do it anyway,
for cases where a non-whitespace-containing query is entered, then it
hits. And in t