Re: Flex & Docs/AndPositionsEnum

Marvin Humphrey Thu, 11 Feb 2010 09:17:49 -0800

On Thu, Feb 11, 2010 at 08:30:14AM -0500, Michael McCandless wrote:
> Oh you're saying we don't know if the underlying enum actually skipped vs
> just scanned?


Yep.

> Isn't the skip data also based on deltas?  

Yes, but that's internal to the skip reader, in both Lucene and Lucy/KS.  When
it comes time to skip, the skip reader's doc id is assigned directly, in both
libraries.  From StandardPostingsReaderImpl.java:

          doc = skipper.getDoc();

Trying to apply the skip reader's doc id information as a delta would get
quite complicated.  (A delta against...  what?)  I'm not sure that's even
possible.

> So even if real skipping happened, Lucy/KS would not "lose" the offset that
> the aggregator had previously added?  Or maybe I'm lost on what the issue is
> here...

It would indeed "lose" the offset, because the skip reader's doc id
information gets assigned directly rather than applied as a delta.

And since the aggregator layer is not aware of when this occurs, it cannot
intervene to re-apply the offset.
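
To make the failure concrete, here is a minimal sketch in Java.  It is not
code from Lucene or Lucy/KS: SegmentPostings, addOffset() and
jumpToSkipPoint() are hypothetical names, and the single hard-coded skip
entry stands in for a real skip reader.  It assumes the aggregator adds the
segment's doc base to the iterator's doc id once, up front, and relies on
delta decoding to preserve it.

```java
// Hypothetical segment-level postings iterator (illustration only).
final class SegmentPostings {
    private final int[] deltas;   // delta-encoded, segment-local doc ids
    private final int skipPos;    // posting index the one skip entry points at
    private final int skipDoc;    // absolute segment-local doc id stored there
    private int pos = -1;
    private int doc = 0;

    SegmentPostings(int[] deltas, int skipPos, int skipDoc) {
        this.deltas = deltas.clone();
        this.skipPos = skipPos;
        this.skipDoc = skipDoc;
    }

    // Assumed aggregator behavior: add the segment's doc base one time.
    void addOffset(int docBase) { doc += docBase; }

    // Scanning applies a delta on top of the current value, so a
    // previously added offset survives.
    int next() {
        pos++;
        doc += deltas[pos];
        return doc;
    }

    // Skipping assigns the skip entry's doc id directly, as in
    // "doc = skipper.getDoc();", so the offset is overwritten.
    int jumpToSkipPoint() {
        pos = skipPos;
        doc = skipDoc;
        return doc;
    }
}

final class LostOffsetDemo {
    // Returns { doc id reached by scanning, doc id reached by skipping }.
    static int[] run() {
        int[] deltas = {3, 4, 5, 6};   // segment-local docs 3, 7, 12, 18
        int docBase = 100;             // hypothetical offset for this segment

        SegmentPostings scanning = new SegmentPostings(deltas, 2, 12);
        scanning.addOffset(docBase);
        scanning.next();
        scanning.next();
        int scanned = scanning.next(); // 100 + 3 + 4 + 5

        SegmentPostings skipping = new SegmentPostings(deltas, 2, 12);
        skipping.addOffset(docBase);
        int skipped = skipping.jumpToSkipPoint(); // offset is gone

        return new int[] {scanned, skipped};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("scanned=" + r[0] + " skipped=" + r[1]);
        // prints "scanned=112 skipped=12"
    }
}
```

Both iterators land on the same posting, but only the scanning one still
carries the doc base.  Nothing in the return value tells the caller which
path was taken, which is why the aggregator can't patch things up afterwards.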

Having driven down this dead-end, turned around and come back, I've become
persuaded that requiring the segment-level postings iterator to be aware of
its consumer is not a good idea.

> > A generic aggregator wouldn't know that it needed to do that.  The postings
> > codec developer would be forced to write aggregation code in addition to
> > segment-level code.
> 
> Right, if position were not primitive but contained within an opaque
> (to the aggregator) object.  And, you were doing the flat positions
> space.
> 
> I guess... this restriction still seems academic... ie, not a real
> issue in Lucene.  

Not for the standard posting formats that Lucene offers.  But the point of
flex is to provide an extension framework, I thought.

Well, whatever.  It's just another place where Lucy and Lucene will part ways.

Marvin Humphrey

