Hi Alan,

At the moment I'm using Payloads (exposed via TermSpans) to store
positionLength in the index per-position
<https://michaelgibney.net/2018/09/lucene-graph-queries-2/#we-can-do-something-useful-with-positionlength-now>
(definitely in Uwe's category of "[because] payloads are stored together
with the positions in the postings"). I'm using the positionLength for
precise SpanNearQuery phrase matching with index-time
synonyms/token-graphs.
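
A minimal sketch of the idea, in plain Java with no Lucene dependency. It
is not the actual implementation: the class name, the method names, and
the 4-byte big-endian encoding are all assumptions made for illustration.
It shows a positionLength being round-tripped through a payload-style byte
array, and then used to check that the next phrase term starts exactly
where the previous token ends (position + positionLength).

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Lucene.
final class PositionLengthPayloadSketch {

    /** Encode a positionLength as a 4-byte big-endian payload (the encoding is an assumption). */
    public static byte[] encode(int positionLength) {
        if (positionLength < 1) {
            throw new IllegalArgumentException("positionLength must be >= 1");
        }
        return ByteBuffer.allocate(Integer.BYTES).putInt(positionLength).array();
    }

    /** Decode a payload; a missing payload is treated as the default positionLength of 1. */
    public static int decode(byte[] payload) {
        if (payload == null || payload.length == 0) {
            return 1;
        }
        return ByteBuffer.wrap(payload).getInt();
    }

    /** True if a token at nextPosition starts exactly where the previous token ends. */
    public static boolean follows(int prevPosition, byte[] prevPayload, int nextPosition) {
        return prevPosition + decode(prevPayload) == nextPosition;
    }

    public static void main(String[] args) {
        // Token graph for "fast wi fi network" with the index-time synonym "wifi":
        //   fast(pos 0), wifi(pos 1, positionLength 2), wi(pos 1), fi(pos 2), network(pos 3)
        byte[] wifi = encode(2);
        System.out.println(follows(1, wifi, 3)); // "wifi network": true
        System.out.println(follows(1, wifi, 2)); // "wifi fi": false
        System.out.println(follows(1, null, 2)); // no payload, length 1 assumed: true
    }
}
```

The last call shows what happens when positionLength is not available at
query time: "wifi" is assumed to span one position, so it would appear
adjacent to "fi" rather than to "network".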

I'm not sure how directly relevant positionLength would be to
IntervalSource. But more generally, I can say that I really appreciate
having access to Payloads as a generic framework for implementation of
experimental features that rely on per-position indexed attributes.

Michael

On Wed, Feb 13, 2019 at 3:27 AM Uwe Schindler <u...@thetaphi.de> wrote:

> Hi,
>
> I think the main reason there are Payload implementations inside Spans
> is that the payloads are stored together with the positions in
> the postings. For performance reasons, back at that time, the processing
> of payloads was put into the span query series, because then you can score
> by payload and do position-based stuff in a single pass.
>
> I agree that adding that to the IntervalSource API is hard, because
> IntervalSource does not know anything about payloads, so a combination of
> different queries won't work. And as you said, the scoring is separated.
>
> Payloads are mostly used for scoring, but I don't remember any use case I
> had in the last 5 years that made use of this - it was just too slow. And
> term-level boosts are seldom used. In most cases people stick with
> document-level boosts (docvalues). Nowadays I'd also recommend FeatureField
> for term/keyword/category-level scoring.
>
> One thing that payloads were used for is NLP features like word type
> annotations and filtering based on that, which (of course) requires support
> in spans. But in most cases the better way to do this is to add the
> annotation into the term text and do simple term queries (like terms called
> "lucene#propernoun").
>
> IMHO, adding a PayloadTermQuery-like type to change the term frequency
> based on a function of payload is fine, but can easily be modelled with
> FeatureField, too.
>
> Uwe
>
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -----Original Message-----
> > From: Alan Woodward <romseyg...@gmail.com>
> > Sent: Thursday, February 7, 2019 10:27 AM
> > To: dev@lucene.apache.org
> > Subject: Use of Payloads
> >
> > Hi all,
> >
> > The new intervals queries are now nearly at feature parity with Spans; the
> > implementations still outstanding are all to do with using payloads.
> > Currently, span queries allow you to filter out spans based on the payloads
> > of the matching terms, and also allow you to modify the score of the query
> > as a whole based on those payloads.  I’d like to get some idea of how people
> > are actually using these functions.
> >
> > In terms of filtering, adding an IntervalSource that wraps a simple term and
> > filters it out based on the payload will be simple enough.  Adding this for
> > compound intervals is more complicated, and I think trickier to reason about,
> > so I’d like to try and avoid doing this if possible - feedback on actual
> > use-cases would be helpful here.
> >
> > For scoring, intervals use a completely different scoring mechanism to Spans,
> > just returning a scaled score between 0 and [boost].  To include term
> > weighting as well, users should combine the Intervals query with a boolean
> > query consisting of all terms used in the IntervalsSource.  This doesn’t mix so
> > well with payloads, but an alternative option here could be to add a
> > PayloadTermQuery that can adjust the term frequency of a term on a
> > particular document via a payload function.
> >
> > What do people think?  Are there cases that I’ve missed, or other possible
> > uses here?
> >
> > - Alan
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org