You could set a very high position increment gap for multi-valued fields (Analyzer#getPositionIncrementGap) and then use something like Intervals.maxwidth(pos_gap - 1, Intervals.unordered(...))?
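
To make the arithmetic behind the gap idea concrete, here is a small plain-Java sketch. It does not use Lucene at all: it only models token positions under the assumption that each token advances the position by one and that the gap is added between consecutive values of the field. The class and method names (PositionGapSketch, positions, matchesWithin) are made up for illustration; in Lucene the width check would be done by the interval width filter instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PositionGapSketch {

    /** Assigns a position to every token, skipping `gap` positions after each value. */
    static Map<String, List<Integer>> positions(List<List<String>> values, int gap) {
        Map<String, List<Integer>> result = new HashMap<>();
        int position = -1;
        for (List<String> value : values) {
            for (String token : value) {
                position++; // assumption: one position per token
                result.computeIfAbsent(token, k -> new ArrayList<>()).add(position);
            }
            position += gap; // simulated position increment gap between values
        }
        return result;
    }

    /** True if all terms occur inside some window of at most maxWidth positions. */
    static boolean matchesWithin(Map<String, List<Integer>> positions,
                                 List<String> terms, int maxWidth) {
        // Collect (position, termIndex) pairs for every occurrence of a query term.
        List<int[]> hits = new ArrayList<>();
        for (int t = 0; t < terms.size(); t++) {
            for (int p : positions.getOrDefault(terms.get(t), List.of())) {
                hits.add(new int[] {p, t});
            }
        }
        hits.sort((a, b) -> Integer.compare(a[0], b[0]));

        // Sliding window over the sorted hits, looking for a window covering all terms.
        int[] counts = new int[terms.size()];
        int covered = 0;
        int left = 0;
        for (int right = 0; right < hits.size(); right++) {
            if (counts[hits.get(right)[1]]++ == 0) {
                covered++;
            }
            while (covered == terms.size()) {
                int width = hits.get(right)[0] - hits.get(left)[0] + 1;
                if (width <= maxWidth) {
                    return true;
                }
                if (--counts[hits.get(left)[1]] == 0) {
                    covered--;
                }
                left++;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int gap = 1000;
        List<String> query = List.of("foo", "bar");

        // field=["foo", "bar"]: the terms live in different values.
        Map<String, List<Integer>> separate =
            positions(List.of(List.of("foo"), List.of("bar")), gap);
        // field=["foo bar", "baz"]: both terms live in the same value.
        Map<String, List<Integer>> together =
            positions(List.of(List.of("foo", "bar"), List.of("baz")), gap);

        System.out.println(matchesWithin(separate, query, gap - 1)); // prints "false"
        System.out.println(matchesWithin(together, query, gap - 1)); // prints "true"
    }
}
```

In this model any window that crosses a value boundary is wider than the gap, so a width limit of pos_gap - 1 can never accept it. The flip side is that a single value spanning more positions than the limit could lose legitimate matches, so the gap would need to be larger than the longest value you expect.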

On Thu, Sep 10, 2020 at 12:32, Dawid Weiss <[email protected]> wrote:

> Yeah... I was thinking about adding synthetic boundaries but this
> seems... impure. :) Another quick reflection is that I'd have to
> somehow translate the original query (which can be arbitrarily
> complex) into an interval query. Tough.
>
> D.
>
> On Thu, Sep 10, 2020 at 12:22 PM Alan Woodward <[email protected]>
> wrote:
> >
> > I’ve solved this sort of thing in the past by indexing boundary tokens,
> > and wrapping the queries with the equivalent of
> > Intervals.notContaining(query, boundary-query); you could also put a
> > very large position increment gap and use a width filter, but that’s a
> > bit more error prone if you could conceivably have lots of text in the
> > individual field entries.
> >
> > > On 10 Sep 2020, at 10:38, Dawid Weiss <[email protected]> wrote:
> > >
> > > Hi Alan,
> > >
> > > You're the expert here so I thought I'd ask before I jump in deep. Do
> > > you think it's feasible to solve the following multivalued-field
> > > problem:
> > >
> > > doc: field=["foo", "bar"]
> > > query: field:(foo AND bar)
> > >
> > > I'd like the above to return zero hits (no single value contains both
> > > foo and bar), but since multi-valued fields are logically indexed as a
> > > single field, it returns doc. I recognize this as a well known problem
> > > but subdocuments are not fun to deal with so I'd like to avoid them at
> > > all costs.
> > >
> > > Would it be possible to solve the above with intervals? Say, something
> > > like this:
> > >
> > > Intervals.containing(valuePositionRanges(), query).
> > >
> > > I assume the containment relationship would get rid of false-positives
> > > crossing value boundary here. The problem is in how to construct those
> > > value position ranges... Store them at index-construction time
> > > somehow? Compute them on the fly for anything that has a chance to
> > > match query? Your thoughts would be very appreciated.
> > >
> > > Dawid
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [email protected]
> > > For additional commands, e-mail: [email protected]
