I'm not awake enough to figure out whether the -1 trick is right or not, but if you manage to prove it somehow, patches to simplify boolean queries at rewrite time are welcome!
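
For anyone who wants to sanity-check the matching side of the -1 trick before writing a patch, here is a minimal brute-force sketch. It is a toy model, not Lucene code: the class name, the method names and the matching rule are assumptions made for illustration (a document matches when it contains every required term and at least minShouldMatch SHOULD clauses, with duplicate SHOULD clauses counted separately). It says nothing about scoring.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MinShouldMatchCheck {

    // Toy matching rule (an assumption, not Lucene's implementation): a document
    // matches when it contains every required term and at least minShouldMatch
    // of the SHOULD clauses. Duplicate SHOULD clauses are counted separately.
    public static boolean matches(Set<String> doc, List<String> required,
                                  List<String> should, int minShouldMatch) {
        if (!doc.containsAll(required)) {
            return false;
        }
        int hits = 0;
        for (String term : should) {
            if (doc.contains(term)) {
                hits++;
            }
        }
        // With no required clause, at least one SHOULD clause has to match.
        int needed = required.isEmpty() ? Math.max(1, minShouldMatch) : minShouldMatch;
        return hits >= needed;
    }

    // Compares "(should... #filterTerm)" at minShouldMatch=m against the rewrite
    // "(+filterTerm remainingShould...)" at minShouldMatch=rewrittenM, over every
    // possible document (every subset of the vocabulary).
    public static boolean rewriteIsEquivalent(List<String> should, String filterTerm,
                                              int m, int rewrittenM, List<String> vocab) {
        List<String> rest = new ArrayList<>(should);
        rest.remove(filterTerm); // drops only the first occurrence, the one promoted to MUST
        List<String> required = List.of(filterTerm);
        for (int mask = 0; mask < (1 << vocab.size()); mask++) {
            Set<String> doc = new HashSet<>();
            for (int i = 0; i < vocab.size(); i++) {
                if ((mask & (1 << i)) != 0) {
                    doc.add(vocab.get(i));
                }
            }
            boolean original = matches(doc, required, should, m);
            boolean rewritten = matches(doc, required, rest, rewrittenM);
            if (original != rewritten) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> vocab = List.of("X", "Y", "Z");
        // "(X Y Z #X)" msm=2 vs "(+X Y Z)" msm=1
        System.out.println(rewriteIsEquivalent(List.of("X", "Y", "Z"), "X", 2, 1, vocab));
        // "(X Y Z #X)" msm=2 vs "(+X Y Z)" msm=2
        System.out.println(rewriteIsEquivalent(List.of("X", "Y", "Z"), "X", 2, 2, vocab));
        // "(X X Y #X)" msm=2 vs "(+X X Y)" msm=1
        System.out.println(rewriteIsEquivalent(List.of("X", "X", "Y"), "X", 2, 1, vocab));
    }
}
```

Under that toy rule the decremented rewrites select the same documents as the original and the non-decremented one does not, which lines up with the result counts reported in the thread below. Whether scores stay identical is a separate question that this check does not address.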

On Tue, Aug 9, 2016 at 00:47, Spyros Kapnissis <[email protected]> wrote:

> Hm, I hadn't really thought about the minShouldMatch part, I thought it'd
> be covered but I see your point about it being semantically different if
> you keep it as is.
>
> However.. Running your edge case example on an actual local index I get
> the following:
>
> "(X X Y #X)" w/minshouldmatch=2 vs. (+X X Y) w/minshouldmatch=2
>   => same top score, fewer results in second case.
> "(X X Y #X)" w/minshouldmatch=2 vs. (+X X Y) w/minshouldmatch=1
>   => same top score, same number of results
> "(X X X Y #X)" w/minshouldmatch=3 vs. (+X X X Y) w/minshouldmatch=2
>   => same top score, same number of results
>
> But I'm still not really convinced myself that decrementing minshouldmatch
> by 1 will do the trick.. I'll have to verify - maybe I'll try more examples
> to see if it holds as a general case.. Nice exercise either way :)
>
>
> On Tuesday, August 9, 2016 12:40 AM, Chris Hostetter <[email protected]> wrote:
>
> Off the top of my head, i think any optimization like that would also need
> to account for minNrShouldMatch, wouldn't it?
>
> if your query is "(X Y Z #X)" w/minshouldmatch=2, and you rewrite that
> query to "(+X Y Z)" w/minshouldmatch=2 you now have a semantically diff
> query that won't match as many documents as the original.
>
> in that example, you could decrement minshouldmatch (=1) ... but i'm not
> sure if that holds as a general rule for all possible permutations/values
> ... i'd have to think about it.
>
> An interesting edge case to think about is "(X X Y #X)" w/minshouldmatch=2
> ... pretty sure that would give you very diff scores if you rewrote it to
> "(+X X Y)" (or "(+X Y)") w/minshouldmatch=1
>
>
> : Hello all, I noticed while debugging a query that BooleanQuery will
> : rewrite itself to remove FILTER clauses that are also MUST as an
> : optimization/simplification, which makes total sense. So (+f:x #f:x)
> : will become (+f:x). However, shouldn't there also be another
> : optimization to remove FILTER clauses that are also SHOULD, while
> : converting them to MUST? So, for e.g., query (f:x #f:x) will become
> : (+f:x). I did an initial simple implementation and the tests seem to
> : pass. Are there any cases where this does not hold?
>
> -Hoss
> http://www.lucidworks.com/
