Marc,

SpanNearQuery isn't capable of performing the proximity to within only a single position in the manner you've described. A slop of 0 means the terms must be contiguous with no gaps, which also allows for matches in the same position as in your first example.

I think MultiPhraseQuery (from svn trunk) will do what you want though. Please try that and report back on how it works for you.

        Erik


On Jan 4, 2006, at 9:39 PM, Marc Hadfield wrote:

hello all -

i have a problem with a SpanNearQuery returning incorrect (false positive) results.

I am creating the context of a field using tokens which have position increment set to either 1 or 0. The position increment is set to 0 for special tokens, in this case part-of-speech markers.
Thus, brackets set of position increments:

[The __pos_dt] [cow __pos_noun] [jumped __pos_verb] [over __pos_prep] [the __pos_dt] [moon __pos_noun] [. __pos_.]


My Span Query looks like:
SpanQuery sq = new SpanNearQuery(new SpanQuery[]
                     {
new SpanTermQuery(new Term("content", "jumped")), new SpanTermQuery(new Term("content", "__pos_verb"))
                                        }, 0, false);

This correctly finds the span:  [jumped __pos_verb]

However, if I query:
SpanQuery sq = new SpanNearQuery(new SpanQuery[]
                     {
new SpanTermQuery(new Term("content", "jumped")), new SpanTermQuery(new Term("content", "__pos_noun"))
                                        }, 0, false);

This incorrectly finds the span:  [cow __pos_noun] [jumped __pos_verb]

This is wrong because there is a distance of 1 between the tokens, not 0.

I am using a recent version of Lucene from SVN.

I am thinking that the problem is related to the position increment being set to 0 for the first token of the incorrect "match" -- thus perhaps this is a bug in the SpanNearQuery?


Best,
Marc






---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to