[ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6371:
----------------------------------
    Attachment: LUCENE-6371.patch

I've been playing around with various APIs for this, and I think this one works 
reasonably well.

Spans.isPayloadAvailable() and getPayload() are replaced with a collect() 
method that takes a SpanCollector.  If you want to get payloads from a Spans, 
you do the following:

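{code:java}
PayloadSpanCollector collector = new PayloadSpanCollector();
// Assumes spans is already positioned on a document.
while (spans.nextStartPosition() != Spans.NO_MORE_POSITIONS) {
  collector.reset();         // discard payloads from the previous match
  spans.collect(collector);  // gather payloads for the current match
  doSomethingWith(collector.getPayloads());
}
{code}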

The actual job of collecting information from postings lists is devolved to the 
collector itself (via SpanCollector.collectLeaf(), called from 
TermSpans.collect()).
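
To make the delegation concrete, here is a minimal, self-contained sketch of 
the idea. It is not the code from the patch: every Toy* class is a 
hypothetical stand-in, payloads are modelled as plain strings, and only the 
names collect() and collectLeaf() come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a postings enum: one payload per position.
final class ToyPostings {
    static final int NO_MORE_POSITIONS = Integer.MAX_VALUE;
    private final int[] positions;
    private final String[] payloads;
    private int idx = -1;

    ToyPostings(int[] positions, String[] payloads) {
        this.positions = positions;
        this.payloads = payloads;
    }

    int nextPosition() {
        idx++;
        return idx < positions.length ? positions[idx] : NO_MORE_POSITIONS;
    }

    String payload() {
        return payloads[idx];
    }
}

// Hypothetical collector interface: the collector decides what to read.
interface ToySpanCollector {
    void collectLeaf(ToyPostings postings);
    void reset();
}

final class ToyPayloadCollector implements ToySpanCollector {
    private final List<String> payloads = new ArrayList<>();

    public void collectLeaf(ToyPostings postings) {
        payloads.add(postings.payload());
    }

    public void reset() {
        payloads.clear();
    }

    List<String> getPayloads() {
        return new ArrayList<>(payloads);
    }
}

// Hypothetical leaf spans: collect() only hands the postings to the collector.
final class ToyTermSpans {
    private final ToyPostings postings;

    ToyTermSpans(ToyPostings postings) {
        this.postings = postings;
    }

    int nextStartPosition() {
        return postings.nextPosition();
    }

    void collect(ToySpanCollector collector) {
        collector.collectLeaf(postings);
    }
}

final class CollectLeafSketch {
    // Runs the same loop shape as the snippet above and returns the payloads per match.
    static List<List<String>> run() {
        ToyTermSpans spans = new ToyTermSpans(
            new ToyPostings(new int[] {3, 8}, new String[] {"a", "b"}));
        ToyPayloadCollector collector = new ToyPayloadCollector();
        List<List<String>> perMatch = new ArrayList<>();
        while (spans.nextStartPosition() != ToyPostings.NO_MORE_POSITIONS) {
            collector.reset();
            spans.collect(collector);
            perMatch.add(collector.getPayloads());
        }
        return perMatch;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The point of the shape is that the spans class knows nothing about payloads; 
a collector interested in something else would only need a different 
collectLeaf().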

The API is slightly complicated by the need to buffer collected information in 
NearOrderedSpans: the algorithm there advances child spans eagerly while 
looking for the smallest possible match, so by the time collect() is called 
the children are no longer positioned on the match.  This is handled by a 
BufferedSpanCollector, with collectCandidate(Spans) and accept() methods.  The 
default collector has a no-op implementation of these, which HotSpot should 
optimize away, so we don't need separate implementations for collecting and 
non-collecting algorithms and can do away with PayloadNearOrderedSpans.
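
The buffering idea can be sketched roughly as follows. This is a simplified 
illustration under stated assumptions, not the patch itself: the Toy* types 
are hypothetical, collectCandidate() takes a string payload here rather than 
a Spans, and the "shrink to the tightest match" loop is a toy version of 
what an ordered-near algorithm does with its first child.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical buffered collector: candidates are remembered, accept() commits the last one.
interface ToyBufferedCollector {
    void collectCandidate(String payload);
    void accept();

    // No-op variant for callers that do not want any collection.
    ToyBufferedCollector NO_OP = new ToyBufferedCollector() {
        public void collectCandidate(String payload) {}
        public void accept() {}
    };
}

final class ToyBufferedPayloadCollector implements ToyBufferedCollector {
    private String candidate;
    private final List<String> accepted = new ArrayList<>();

    public void collectCandidate(String payload) {
        candidate = payload; // overwrite: only the latest candidate can become the match
    }

    public void accept() {
        if (candidate != null) {
            accepted.add(candidate);
            candidate = null;
        }
    }

    List<String> getPayloads() {
        return new ArrayList<>(accepted);
    }
}

final class BufferedCollectSketch {
    // Returns the start of the tightest match (the last first-child position
    // before secondPos), or -1 if there is none. The cursor i is advanced
    // eagerly, so once the loop ends it already points past the match.
    static int shrinkToMatch(int[] firstPositions, String[] firstPayloads,
                             int secondPos, ToyBufferedCollector collector) {
        int i = 0;
        int matchStart = -1;
        while (i < firstPositions.length && firstPositions[i] < secondPos) {
            matchStart = firstPositions[i];
            collector.collectCandidate(firstPayloads[i]); // buffer while still in position
            i++;
        }
        if (matchStart != -1) {
            collector.accept(); // commit the buffered candidate for the final match
        }
        return matchStart;
    }

    public static void main(String[] args) {
        ToyBufferedPayloadCollector collector = new ToyBufferedPayloadCollector();
        int start = shrinkToMatch(new int[] {1, 4, 7, 12},
            new String[] {"p1", "p4", "p7", "p12"}, 9, collector);
        System.out.println(start + " " + collector.getPayloads());
    }
}
```

The same shrinkToMatch() serves both cases: passing ToyBufferedCollector.NO_OP 
gives the non-collecting behaviour without a second copy of the loop.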

This patch also moves the PayloadCheck queries to the .payloads package, which 
tidies things up a bit.

All tests pass.

> Improve Spans payload collection
> --------------------------------
>
>                 Key: LUCENE-6371
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6371
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Paul Elschot
>            Priority: Minor
>         Attachments: LUCENE-6371.patch
>
>
> Spin off from LUCENE-6308, see the comments there from around 23 March 2015.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

