On Thu, Feb 11, 2010 at 08:30:14AM -0500, Michael McCandless wrote:

> Oh you're saying we don't know if the underlying enum actually skipped vs
> just scanned?

Yep.

> Isn't the skip data also based on deltas?

Yes, but that's internal to the skip reader, in both Lucene and Lucy/KS. When
it comes time to skip, the skip reader's doc id is assigned directly, in both
libraries. From StandardPostingsReaderImpl.java:

    doc = skipper.getDoc();

Trying to apply the skip reader's doc id information as a delta would get
quite complicated. (A delta against... what?) I'm not sure that's even
possible.

> So even if real skipping happened, Lucy/KS would not "lose" the offset that
> the aggregator had previously added? Or maybe I'm lost on what the issue is
> here...

It would indeed "lose" the offset, because the skip reader's doc id
information gets assigned directly rather than applied as a delta. And since
the aggregator layer is not aware of when this occurs, it cannot intervene to
re-apply the offset.

Having driven down this dead-end, turned around and come back, I've become
persuaded that requiring the segment-level postings iterator to be aware of
its consumer is not a good idea.

> > A generic aggregator wouldn't know that it needed to do that. The postings
> > codec developer would be forced to write aggregation code in addition to
> > segment-level code.
>
> Right, if position were not primitive but contained within an opaque
> (to the aggregator) object. And, you were doing the flat positions
> space.
>
> I guess... this restriction still seems academic... ie, not a real
> issue in Lucene.

Not for the standard posting formats that Lucene offers. But the point of flex
is to provide an extension framework, I thought.

Well, whatever. It's just another place where Lucy and Lucene will part ways.

Marvin Humphrey
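
For illustration, here is a minimal Java sketch of the lost-offset problem described above. Everything in it is hypothetical: `SegmentPostings`, its hard-coded deltas, the single skip point, and the idea of seeding `doc` with a segment base of 100 are assumptions made for the example, not code from Lucene or Lucy/KS. It only tries to show that when the skip data's doc id is assigned directly, an offset that delta decoding had been carrying along disappears.

```java
import java.util.Arrays;

class SkipOffsetSketch {
    /** Hypothetical segment-level postings iterator (not a real Lucene/Lucy class). */
    static class SegmentPostings {
        // Delta-encoded, segment-local doc ids: 3, 7, 12, 20.
        private final int[] deltas = {3, 4, 5, 8};
        private int pos = 0;
        int doc;

        // Assumption: the aggregator seeds doc with the segment's base offset,
        // so that delta decoding produces index-wide doc ids.
        SegmentPostings(int seed) {
            this.doc = seed;
        }

        // Plain scanning: each doc id is the previous one plus a delta,
        // so the seeded offset is preserved.
        int next() {
            doc += deltas[pos++];
            return doc;
        }

        // Real skipping: the skip data records an absolute segment-local
        // doc id (7) and where to resume reading deltas (index 2).
        void skip() {
            doc = 7; // assigned directly, not applied as a delta
            pos = 2;
        }
    }

    // Returns the first and third doc ids seen by a consumer that either
    // scans past the second posting or skips past it.
    static int[] run(boolean useSkip) {
        SegmentPostings postings = new SegmentPostings(100);
        int first = postings.next();
        if (useSkip) {
            postings.skip();
        } else {
            postings.next();
        }
        int third = postings.next();
        return new int[] {first, third};
    }

    public static void main(String[] args) {
        System.out.println("scanned: " + Arrays.toString(run(false))); // [103, 112]
        System.out.println("skipped: " + Arrays.toString(run(true)));  // [103, 12]
    }
}
```

In the scanned case the third posting comes out as 112 (base 100 plus local doc 12). In the skipped case it comes out as 12, because the direct assignment in `skip()` overwrote the offset, and nothing outside the iterator can tell which of the two paths was taken.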