I see that edismax already defines pf (bigrams) and pf3 (trigrams) -- what would folks think of just calling them pf / pf1 (aliases for each other?), pf2, and pf3? pf would then behave exactly as it does in dismax.
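To make that naming concrete, here is a rough sketch in Python of which phrases each parameter would boost under this proposal. It is not Solr code: the function names are made up for illustration, and it assumes the query is simply split on whitespace.

```python
def shingles(terms, n):
    """Return every run of n consecutive terms as a quoted phrase."""
    if n < 1 or len(terms) < n:
        return []
    return ['"' + " ".join(terms[i:i + n]) + '"'
            for i in range(len(terms) - n + 1)]


def phrase_boosts(query):
    """Hypothetical mapping from the proposed parameter names to boosted phrases."""
    terms = query.split()  # assumption: plain whitespace tokenization
    return {
        # pf (alias pf1): the whole query as a single phrase, as in dismax
        "pf": shingles(terms, len(terms)) if terms else [],
        # pf2: every pair of adjacent terms (bigrams)
        "pf2": shingles(terms, 2),
        # pf3: every run of three adjacent terms (trigrams)
        "pf3": shingles(terms, 3),
    }


if __name__ == "__main__":
    for name, phrases in phrase_boosts("the quick brown fox").items():
        print(name, phrases)
```

For "the quick brown fox", pf would boost the one full phrase, pf2 the three bigrams, and pf3 the two trigrams.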
And it sounds like the solution to my single-token fields is to just move them into the query itself. Thanks!

On Fri, Dec 4, 2009 at 11:58 AM, Yonik Seeley <yo...@lucidimagination.com> wrote:

> On Fri, Dec 4, 2009 at 11:26 AM, Bill Dueber <b...@dueber.com> wrote:
> > I've started trying edismax, and have noticed that my relevancy ranking is
> > messed up with edismax because, according to the debug output, it's using
> > bigrams instead of phrases and inexplicably ignoring a couple of the pf
> > fields. While the hit count isn't changing, this kills my ability to boost
> > exact title matches (or, I would guess, exact-anything-else matches, too).
>
> It's a feature in general - the problem with putting all the terms in
> a single phrase query is that you get no boosting at all if all of the
> terms don't appear.
>
> But since it may be useful as an option, perhaps we should add the
> single-phrase option to extended dismax as well.
>
> > edismax is also completely ignoring the title_a and title_ab fields, which
> > are defined as "exactmatcher" as follows.
>
> I believe this is because extended dismax only adds phrases for
> boosting... hence if a field type outputs a single token, it's
> considered redundant with the main query. This is an optimization to
> speed up queries (esp single-word queries).
> Perhaps one way to fix this would be to check if the pf is in the qf
> list before removing single term phrases?
>
> -Yonik
> http://www.lucidimagination.com

--
Bill Dueber
Library Systems Programmer
University of Michigan Library