Hi all,

Once again, thanks for the responses! After thinking about this a bit more, Michael's response makes sense to me now. I agree that partial matches shouldn't rank higher than conjunctive matches, so a DisjunctionMinQuery doesn't make sense for my use case (I would need an AndMinQuery or something like that). This also answers my initial question.
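To convince myself, I put together a toy sketch of Michael's "dalmatian" example. It is plain Java, not Lucene code: the scores and the tie-breaker value are made up, and `disMin` is a hypothetical mirror image of the DisMax formula (best sub-score plus tie-breaker times the sum of the others), which is only my assumption of what a DisMin would do.

```java
import java.util.Arrays;

public class DisMaxVsDisMin {
    // DisMax-style combination: highest sub-score plus tieBreaker times
    // the sum of the remaining matching sub-scores.
    static double disMax(double tieBreaker, double... subScores) {
        double max = Arrays.stream(subScores).max().orElse(0.0);
        double sum = Arrays.stream(subScores).sum();
        return max + tieBreaker * (sum - max);
    }

    // Hypothetical DisMin: lowest sub-score plus tieBreaker times the
    // sum of the remaining matching sub-scores.
    static double disMin(double tieBreaker, double... subScores) {
        double min = Arrays.stream(subScores).min().orElse(0.0);
        double sum = Arrays.stream(subScores).sum();
        return min + tieBreaker * (sum - min);
    }

    public static void main(String[] args) {
        // Made-up scores: a title match outscores a body match.
        double title = 3.0, body = 1.0, tie = 0.1;

        System.out.println("DisMax, title only:   " + disMax(tie, title));
        System.out.println("DisMax, title + body: " + disMax(tie, title, body));
        System.out.println("DisMin, title only:   " + disMin(tie, title));
        System.out.println("DisMin, title + body: " + disMin(tie, title, body));
    }
}
```

With these numbers the DisMax score rises from 3.0 to about 3.1 when the body also matches, while the hypothetical DisMin score drops from 3.0 to about 1.3, so the document matching both fields ranks below the one matching only the title.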
I did have a question about this though:

> in that case you should use something like 1/x as your scoring function
> in the sub-clauses

Doesn't using 1/x as a scoring function, even in the sub-clauses, still cause an issue where the output score is inversely correlated with the indexed term score? I think that would break BMW, right? Or maybe I am misunderstanding the suggestion.

Thanks,
Marc

On Thu, Nov 9, 2023 at 10:18 AM Uwe Schindler <u...@thetaphi.de> wrote:

> Hi,
>
> in that case you should use something like 1/x as your scoring function
> in the sub-clauses. In Lucene scores should go up for more relevancy.
> This must also apply for function scoring.
>
> Uwe
>
> On 09.11.2023 at 19:14, Marc D'Mello wrote:
> > Hi Michael,
> >
> > Thanks for the response! So to answer your first question, yes this would
> > keep the lowest score from the matching sub-scorers. Our use case is that
> > we have a custom term-level score overriding term frequency and we want to
> > take the min of that as part of our scoring function. Maybe it's a niche
> > use case?
> >
> > Thanks,
> > Marc
> >
> > On Wed, Nov 8, 2023 at 3:19 PM Michael Froh <msf...@gmail.com> wrote:
> >
> >> Hi Marc,
> >>
> >> Can you clarify what the semantics of a DisjunctionMinQuery would be? Would
> >> you keep the score for the *lowest* scoring disjunct (plus some tiebreaker
> >> applied to the other matching disjuncts)?
> >>
> >> I'm trying to imagine how that would work compared to the classic DisMax
> >> use-case. Say I'm searching for "dalmatian" using a DisMax query over term
> >> queries against title and body. A match on title is probably going to score
> >> higher than a match against the body, just because the title has a shorter
> >> length (and the doc frequency of individual terms in the title is likely to
> >> be lower, since there are fewer terms overall). With DisMax, a match on
> >> title alone will score higher than a match on body, and the tie-break will
> >> tend to score a match on title and body higher than a match on title alone.
> >>
> >> With a DisMin (assuming you keep the lowest score), then a match on title
> >> and body would probably score lower than a match on title alone. That feels
> >> weird to me, but I might be missing the use-case.
> >>
> >> How would you use a DisMinQuery?
> >>
> >> Thanks,
> >> Froh
> >>
> >> On Wed, Nov 8, 2023 at 10:50 AM Marc D'Mello <marcd2...@gmail.com> wrote:
> >>
> >>> Hi all,
> >>>
> >>> I noticed we have a DisjunctionMaxQuery
> >>> <https://github.com/apache/lucene/blob/branch_9_7/lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxQuery.java>
> >>> but not a corresponding DisjunctionMinQuery. I was just wondering if there
> >>> was a specific reason for that? Or is it just that it is not a common query
> >>> to use?
> >>>
> >>> Thanks!
> >>> Marc
> >>>
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
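
P.S. In case it helps to pin down my 1/x question, here is a small sketch of how I read the suggestion and where my worry comes from. It is plain Java with made-up numbers, not Lucene's actual BMW or impacts code; the "block" array and all method names are hypothetical. The first method shows that, for strictly positive scores, a min can be recovered as 1 / max(1/x). The rest shows that once a sub-clause scores 1/x, the largest indexed value in a block no longer gives an upper bound on the score, as far as I can tell.

```java
public class ReciprocalScoreSketch {
    // For strictly positive inputs: min(x_1..x_n) == 1 / max(1/x_1..1/x_n).
    static double minViaReciprocalMax(double... xs) {
        double maxReciprocal = 0.0;
        for (double x : xs) {
            maxReciprocal = Math.max(maxReciprocal, 1.0 / x);
        }
        return 1.0 / maxReciprocal;
    }

    // Largest raw value in a block (stand-in for a per-block maximum).
    static double blockMax(double... xs) {
        double max = Double.NEGATIVE_INFINITY;
        for (double x : xs) {
            max = Math.max(max, x);
        }
        return max;
    }

    // Largest 1/x score actually produced by any value in the block.
    static double bestReciprocalScore(double... xs) {
        double best = 0.0;
        for (double x : xs) {
            best = Math.max(best, 1.0 / x);
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up indexed term scores for the documents in one block.
        double[] block = {2.0, 4.0, 8.0};

        System.out.println("min via 1/max(1/x): " + minViaReciprocalMax(block));

        // Applying 1/x to the block maximum gives the SMALLEST score in the
        // block, so it cannot serve as an upper bound for skipping.
        double naiveBound = 1.0 / blockMax(block);
        double bestActual = bestReciprocalScore(block);
        System.out.println("1 / blockMax:       " + naiveBound);
        System.out.println("best actual 1/x:    " + bestActual);
        System.out.println("bound violated:     " + (bestActual > naiveBound));
    }
}
```

If that reading is right, a valid bound for a 1/x sub-clause would need the per-block minimum of the indexed value rather than the maximum, which is the part I am unsure the existing machinery can provide.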