On Jul 12, 2007, at 6:12 PM, Chris Hostetter wrote:



Hmm... okay so the issue is that in order to get the payload data, you
have to have a TermPositions instance.

Instead of adding getPayload methods to the Spans class (which, as Paul
points out, can have nesting issues), perhaps more general solutions would
be:

a) a higher-level getPayload API that lets you get a payload
arbitrarily for a doc/position (perhaps as part of the TermDocs API?) ...
then for Spans you could use this new API with Spans.start() and
Spans.end() (and all the positions in between).

Not sure I follow this.  I don't see the fit w/ TermDocs.
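
One possible reading of (a), as a rough sketch only: PayloadLookup, MapPayloadLookup and SpanPayloads below are hypothetical names made up for illustration, not existing Lucene classes, and the in-memory map merely stands in for whatever the index would really do. The sketch also assumes the span end position is exclusive.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical API (not part of Lucene): fetch a payload by (doc, position).
interface PayloadLookup {
    byte[] getPayload(int doc, int position); // null when no payload is stored
}

// Toy in-memory implementation, for illustration only.
class MapPayloadLookup implements PayloadLookup {
    private final Map<Long, byte[]> payloads = new HashMap<>();

    void put(int doc, int position, byte[] payload) {
        payloads.put(key(doc, position), payload);
    }

    @Override
    public byte[] getPayload(int doc, int position) {
        return payloads.get(key(doc, position));
    }

    // Pack doc and position into a single map key.
    private static long key(int doc, int position) {
        return ((long) doc << 32) | (position & 0xFFFFFFFFL);
    }
}

class SpanPayloads {
    // Collect payloads for every position in [start, end).
    // Assumption: 'end' is exclusive, i.e. one past the last matched position.
    static List<byte[]> collect(PayloadLookup lookup, int doc, int start, int end) {
        List<byte[]> result = new ArrayList<>();
        for (int pos = start; pos < end; pos++) {
            byte[] payload = lookup.getPayload(doc, pos);
            if (payload != null) {
                result.add(payload);
            }
        }
        return result;
    }
}
```

With something like this, a caller would pass the current doc together with Spans.start() and Spans.end() and get back the payloads in position order, without the Spans class itself knowing anything about payloads.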

b) add a variation of the TermPositions class to allow people to iterate
through the terms of a TermDoc in position order (TermPositions first
iterates over the Terms and then over the positions) ... then you could
seek(span.start()) to get the Payload data

c) add methods to the Spans API to get the subspans (if any) ... this
would be the Spans corollary to getTerms() and would always return
TermSpans which would have TermPositions for getting payload data.


This could be a good alternative.
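
A sketch of what (c) might look like, under stated assumptions: SpanNode, TermSpanLeaf, CompositeSpan and SubSpanWalker are hypothetical stand-ins invented for this example, not the real Spans/TermSpans classes, and the payload is simply stored on the leaf rather than read through TermPositions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a Spans match; not the real Lucene class.
interface SpanNode {
    List<SpanNode> getSubSpans(); // empty for a term-level (leaf) span
}

// Leaf span for a single term occurrence, carrying its payload.
final class TermSpanLeaf implements SpanNode {
    final String term;
    final int position;
    final byte[] payload;

    TermSpanLeaf(String term, int position, byte[] payload) {
        this.term = term;
        this.position = position;
        this.payload = payload;
    }

    @Override
    public List<SpanNode> getSubSpans() {
        return Collections.emptyList();
    }
}

// Span built from other spans (e.g. a "near" or "or" style match).
final class CompositeSpan implements SpanNode {
    private final List<SpanNode> children;

    CompositeSpan(SpanNode... children) {
        this.children = Arrays.asList(children);
    }

    @Override
    public List<SpanNode> getSubSpans() {
        return children;
    }
}

class SubSpanWalker {
    // Flatten any nesting so the caller only ever sees term-level spans.
    static List<TermSpanLeaf> leaves(SpanNode node) {
        List<TermSpanLeaf> out = new ArrayList<>();
        collect(node, out);
        return out;
    }

    private static void collect(SpanNode node, List<TermSpanLeaf> out) {
        if (node instanceof TermSpanLeaf) {
            out.add((TermSpanLeaf) node);
            return;
        }
        for (SpanNode child : node.getSubSpans()) {
            collect(child, out);
        }
    }
}
```

The recursive walk is what keeps nesting from leaking to the caller: however deep the span tree is, the result is a flat list of term-level spans from which payloads can be read.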

When we first talked about payloads we wondered if we could just make all Queries into SpanQueries by passing TermPositions instead of TermDocs, but in the end decided not to do it because of performance issues (some of which are lessened by lazy loading of TermPositions).

The thing is, I think, that the Spans is already moving you along in the term positions, so it just seems like a natural fit to have it there, even if there is nesting. It doesn't seem like it would be that hard to then return back the nesting stuff b/c you are just collating the results from the underlying SpanTermQuery. Having said that, I haven't looked into the actual code, so take that w/ a grain of salt.

I will try to do some more investigation, as others are welcome to do. Perhaps we should move this to dev?

Cheers,
Grant

