You could set a very high position increment gap for multi-valued fields
(Analyzer#getPositionIncrementGap) and then run something like
Intervals.maxwidth(pos_gap - 1, Intervals.unordered(...))?
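
To make the position arithmetic behind this concrete, here is a minimal
plain-Java sketch. It does not call Lucene. It assumes that tokens inside one
value sit on consecutive positions, that each value boundary adds an extra
`gap` positions, and that an interval's width is end - start + 1. The class
and method names (PositionGapSketch, positionsOf, matchesWithin) are made up
for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of token positions in a multi-valued field (no Lucene calls). */
public class PositionGapSketch {

    /**
     * Returns the positions at which `term` occurs. Tokens within a value get
     * consecutive positions; each value boundary adds an extra `gap`, which is
     * the assumed effect of the position increment gap.
     */
    public static List<Integer> positionsOf(String term, String[] values, int gap) {
        List<Integer> hits = new ArrayList<>();
        int pos = -1;
        for (int v = 0; v < values.length; v++) {
            if (v > 0) {
                pos += gap; // jump over the boundary between two values
            }
            for (String tok : values[v].trim().split("\\s+")) {
                if (tok.isEmpty()) {
                    continue; // an empty value contributes no tokens
                }
                pos++;
                if (tok.equals(term)) {
                    hits.add(pos);
                }
            }
        }
        return hits;
    }

    /**
     * True if some occurrence of `a` and some occurrence of `b` fit into an
     * interval of at most `maxWidth` positions (width = end - start + 1).
     */
    public static boolean matchesWithin(String a, String b, String[] values,
                                        int gap, int maxWidth) {
        for (int pa : positionsOf(a, values, gap)) {
            for (int pb : positionsOf(b, values, gap)) {
                if (Math.abs(pa - pb) + 1 <= maxWidth) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int gap = 1000;
        // foo and bar live in different values: the gap pushes them too far apart.
        System.out.println(matchesWithin("foo", "bar",
                new String[] {"foo", "bar"}, gap, gap - 1)); // false
        // foo and bar live in the same value: they are adjacent, so they match.
        System.out.println(matchesWithin("foo", "bar",
                new String[] {"foo bar", "baz"}, gap, gap - 1)); // true
    }
}
```

The same model also shows the caveat raised further down the thread: if a
single value holds more tokens than the width limit allows, two terms inside
that one value can be rejected as well, so the gap has to be larger than any
value you expect to index.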


On Thu, Sep 10, 2020 at 12:32, Dawid Weiss <[email protected]> wrote:

> Yeah... I was thinking about adding synthetic boundaries but this
> seems... impure. :) Another quick reflection is that I'd have to
> somehow translate the original query (which can be arbitrarily
> complex) into an interval query. Tough.
>
> D.
>
> On Thu, Sep 10, 2020 at 12:22 PM Alan Woodward <[email protected]>
> wrote:
> >
> > I’ve solved this sort of thing in the past by indexing boundary tokens,
> > and wrapping the queries with the equivalent of
> > Intervals.notContaining(query, boundary-query); you could also put a very
> > large position increment gap and use a width filter, but that’s a bit more
> > error prone if you could conceivably have lots of text in the individual
> > field entries.
> >
> > > On 10 Sep 2020, at 10:38, Dawid Weiss <[email protected]> wrote:
> > >
> > > Hi Alan,
> > >
> > > You're the expert here so I thought I'd ask before I jump in deep. Do
> > > you think it's feasible to solve the following multivalued-field
> > > problem:
> > >
> > > doc: field=["foo", "bar"]
> > > query: field:(foo AND bar)
> > >
> > > I'd like the above to return zero hits (no single value contains both
> > > foo and bar), but since multi-valued fields are logically indexed as a
> > > single field, it returns doc. I recognize this as a well known problem
> > > but subdocuments are not fun to deal with so I'd like to avoid them at
> > > all costs.
> > >
> > > Would it be possible to solve the above with intervals? Say, something
> > > like this:
> > >
> > > Intervals.containing(valuePositionRanges(), query).
> > >
> > > I assume the containment relationship would get rid of false-positives
> > > crossing value boundary here. The problem is in how to construct those
> > > value position ranges... Store them at index-construction time
> > > somehow? Compute them on the fly for anything that has a chance to
> > > match query? Your thoughts would be very appreciated.
> > >
> > > Dawid
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [email protected]
> > > For additional commands, e-mail: [email protected]
> > >
