Hi all,

Once again, thanks for the responses! After thinking about this a bit more,
Michael's response makes sense to me now. I agree that partial matches
shouldn't be ranked higher than conjunctive matches, so a DisjunctionMinQuery
doesn't make sense for my use case (I would need an AndMinQuery or something
like that). This also answers my initial question.
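
A minimal sketch of that ranking problem, in plain Java rather than Lucene API. The class name, the disMin formula, and the scores are assumptions made up for illustration; disMax mirrors the usual "best clause plus tie-breaker times the rest" combination, and disMin is the hypothetical counterpart that keeps the lowest matching clause instead.

```java
import java.util.Arrays;

class DisMinSketch {
    // DisMax-style combination: best clause score plus tieBreaker times the other clauses.
    static double disMax(double tieBreaker, double... scores) {
        double max = Arrays.stream(scores).max().orElse(0.0);
        double sum = Arrays.stream(scores).sum();
        return max + tieBreaker * (sum - max);
    }

    // Hypothetical DisMin: lowest matching clause score plus tieBreaker times the other clauses.
    static double disMin(double tieBreaker, double... scores) {
        double min = Arrays.stream(scores).min().orElse(0.0);
        double sum = Arrays.stream(scores).sum();
        return min + tieBreaker * (sum - min);
    }

    public static void main(String[] args) {
        // Made-up clause scores: the title match scores higher than the body match.
        double title = 4.0, body = 1.0, tie = 0.25;
        System.out.println("DisMax, title only:     " + disMax(tie, title));
        System.out.println("DisMax, title and body: " + disMax(tie, title, body));
        System.out.println("DisMin, title only:     " + disMin(tie, title));
        System.out.println("DisMin, title and body: " + disMin(tie, title, body));
    }
}
```

With these numbers the DisMax score rises when the body also matches, while the DisMin score for a title-and-body match falls below the title-only match, which is the partial-match-outranks-full-match behavior described above.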

I did have a question about this though:

> in that case you should use something like 1/x as your scoring function
> in the sub-clauses

Doesn't using 1/x as a scoring function, even in the sub-clauses, still
cause an issue where the output score is inversely correlated with the
indexed term score? I think that would break BMW (block-max WAND), right?
Or maybe I am misunderstanding the suggestion.

Thanks,
Marc

On Thu, Nov 9, 2023 at 10:18 AM Uwe Schindler <u...@thetaphi.de> wrote:

> Hi,
>
> in that case you should use something like 1/x as your scoring function
> in the sub-clauses. In Lucene, scores should go up with relevance.
> This must also apply to function scoring.
>
> Uwe
>
> On 09.11.2023 at 19:14, Marc D'Mello wrote:
> > Hi Michael,
> >
> > Thanks for the response! So to answer your first question, yes this would
> > keep the lowest score from the matching sub-scorers. Our use case is that
> > we have a custom term-level score overriding term frequency and we want to
> > take the min of that as part of our scoring function. Maybe it's a niche
> > use case?
> >
> > Thanks,
> > Marc
> >
> > On Wed, Nov 8, 2023 at 3:19 PM Michael Froh <msf...@gmail.com> wrote:
> >
> >> Hi Marc,
> >>
> >> Can you clarify what the semantics of a DisjunctionMinQuery would be? Would
> >> you keep the score for the *lowest* scoring disjunct (plus some tiebreaker
> >> applied to the other matching disjuncts)?
> >>
> >> I'm trying to imagine how that would work compared to the classic DisMax
> >> use-case. Say I'm searching for "dalmatian" using a DisMax query over term
> >> queries against title and body. A match on title is probably going to score
> >> higher than a match against the body, just because the title has a shorter
> >> length (and the doc frequency of individual terms in the title is likely to
> >> be lower, since there are fewer terms overall). With DisMax, a match on
> >> title alone will score higher than a match on body, and the tie-break will
> >> tend to score a match on title and body higher than a match on title alone.
> >>
> >> With a DisMin (assuming you keep the lowest score), then a match on title
> >> and body would probably score lower than a match on title alone. That feels
> >> weird to me, but I might be missing the use-case.
> >>
> >> How would you use a DisMinQuery?
> >>
> >> Thanks,
> >> Froh
> >>
> >>
> >>
> >> On Wed, Nov 8, 2023 at 10:50 AM Marc D'Mello <marcd2...@gmail.com> wrote:
> >>
> >>> Hi all,
> >>>
> >>> I noticed we have a DisjunctionMaxQuery
> >>> <https://github.com/apache/lucene/blob/branch_9_7/lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxQuery.java>
> >>> but not a corresponding DisjunctionMinQuery. I was just wondering if there
> >>> was a specific reason for that? Or is it just that it is not a common query
> >>> to use?
> >>>
> >>> Thanks!
> >>> Marc
> >>>
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de