Hello -

a quick follow-up to my previous post.

In the SpanNear (or for that matter PhraseQuery), one can set a slop value where 0 (zero) means one following after the other.

How can one differentiate between Terms at the **same** position vs. one after the other?

ie:
(Token)/Position

(A)/0 (B)/1 (C)/2 ....
vs
( A B )/0 (C)/1 (D)/2 ...

How can a SpanNear (or anything) query for A,B tell these two cases apart?

---Marc



Doug Cutting wrote:

Marc Hadfield wrote:

I actually mention your option in my email:

In principle I could store the full text in two fields with the second field containing the types without incrementing the token index. Then, do a SpanQuery for "Johnson" and "name" with a distance of 0. The resulting match would have a token position which would refer back to the matching position in the first field. I don't know if this is a really good idea.


ie Field_B = full text interlaced with "types" following each full text token with positionIncrement=0


Sorry, you confused me when you spoke of this as "two fields" when only one field is required.

However, as far as I understand, the standard TermQuery won't let me check if "Johnson" and "__name__" occur at the **same** position. Perhaps, as I ask above, a SpanQuery will allow multiple terms with a distance of zero (0) , that is they were indexed with positionIncrement=0 and SpanQuery can handle 0 distance terms?


TermQuery certainly won't, since it only concerns a single term. But PhraseQuery now has an add(Term, position) method that should do the trick. And SpanNearQuery should work.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to