The basic idea is to have several tokens at the same position (i.e. setPositionIncrement(0)) which are different possible stems for the same word.
Right. Like I said, I recognize the benefits of using a position increment of 0.
I certainly see the benefit of putting tokens into zero-increment positions, but are increments of 2 or more at all useful?
Who knows ? I may be interesting to keep track of the *presence* of "empty words", e.g. "[the] sky [is] blue", "[the] sky [is] [really] blue", "[the] sky [is] [that] [really] blue". The traditionnal reduction to "sky blue" is maybe over-simplistic for some cases...
But, how would you actually *use* an index that was indexed with the holes noted by > 1 position increments?
Erik
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]