On Wed, Jul 28, 2010 at 10:25 PM, Chris Hostetter
<hossman_luc...@fucit.org> wrote:
>
> : Say someone enters the query string 3-diphenylpropanoic
> :
> : The query parser I'm using transforms this into a phrase query and the
> : indexed form is missed because the positions of the terms '3'
> : and 'diphenylpropanoic' indicate they are not adjacent?
> :
> : Is this intended behavior? I expect that the catenated word
> : 'diphenylpropanoic' should have a position of 2 based on the position
> : of the first term in the concatenation, but perhaps I'm missing
>
> I believe this is correct, but I'm not certain of the reason - I think
> it's just an implementation detail.

I dove into the implementation a bit and that appears to be the case. It
would take a lot of work to restructure the code so that it emits
concatenations at the position of their starting token. Since negative
position increments aren't allowed, some really grody buffering would
have to happen. Changing the current behavior would require serious work
for little return.

> Consider the opposite scenario: if
> your indexed text was diphenyl-propanoic-3 and things worked the way
> you are suggesting they should, the term diphenylpropanoic
> would wind up at position 1 (with diphenyl) and "diphenylpropanoic-3" would not
> match because then the terms wouldn't be adjacent.
> damned if you do, damned if you don't

This is a really good point. Thanks for the example.
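
To make the position argument concrete, here is a minimal sketch in
Python. It is a toy model of token positions, not WordDelimiterFilter's
actual code. The function names (positions, adjacent) are made up for
illustration, and I am assuming the indexed text in my case was
"3-diphenyl-propanoic". It compares the current behavior (catenated
token at the position of its last subword) with the behavior I was
expecting (position of its first subword).

```python
# Toy model only: NOT WordDelimiterFilter's implementation.
# Positions are 1-based, one per hyphen-separated part.

def positions(parts, catenate_at="last"):
    """Map each term to a position.

    Runs of alphabetic parts also produce one catenated token, placed at
    the position of the run's last part ("last", the behavior discussed
    in this thread) or its first part ("first", the suggested change).
    """
    out = {}
    run = []  # (position, text) pairs of the current alphabetic run

    def flush():
        if len(run) > 1:
            pos = run[-1][0] if catenate_at == "last" else run[0][0]
            out["".join(text for _, text in run)] = pos
        run.clear()

    for pos, part in enumerate(parts, start=1):
        out[part] = pos
        if part.isalpha():
            run.append((pos, part))
        else:
            flush()
    flush()
    return out


def adjacent(index, first, second):
    """Exact two-term phrase match: second directly follows first."""
    return index[second] - index[first] == 1


case_a = ["3", "diphenyl", "propanoic"]   # assumed indexed text, my case
case_b = ["diphenyl", "propanoic", "3"]   # the opposite scenario

# Query "3-diphenylpropanoic" against case A
a_last = adjacent(positions(case_a), "3", "diphenylpropanoic")            # False
a_first = adjacent(positions(case_a, "first"), "3", "diphenylpropanoic")  # True

# Query "diphenylpropanoic-3" against case B
b_last = adjacent(positions(case_b), "diphenylpropanoic", "3")            # True
b_first = adjacent(positions(case_b, "first"), "diphenylpropanoic", "3")  # False
```

Whichever end the catenated token is pinned to, one of the two exact
phrase queries misses.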

> Typically, for fields where you are using WDF with the "concat" options,
> you would use a bit of slop on the generated phrase queries to
> allow for the looseness of the position information.

Ahh, ok I see dismax supports this with the ps= parameter. Thanks,

Drew
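
For the archive, a rough sketch of what slop buys here, again in Python
and again a toy model rather than Lucene's sloppy phrase scoring. The
position map assumes the indexed text was "3-diphenyl-propanoic", and
"chem_name" is a hypothetical field name. As far as I can tell from the
dismax documentation, ps is the slop for phrase queries built from the
pf fields (boosting) and qs is the slop for phrase queries in the main
query against qf (matching), so it is worth checking which of the two
governs the auto-generated phrase in your Solr version.

```python
from urllib.parse import urlencode

# Toy positions for the assumed indexed text "3-diphenyl-propanoic",
# with the catenated token at the position of its last subword.
index = {"3": 1, "diphenyl": 2, "propanoic": 3, "diphenylpropanoic": 3}


def matches(index, first, second, slop=0):
    """Simplified in-order, two-term phrase check.

    The phrase matches when the number of extra positions between the
    two terms is no larger than the allowed slop.
    """
    gap = index[second] - index[first] - 1
    return 0 <= gap <= slop


exact = matches(index, "3", "diphenylpropanoic")           # slop 0
sloppy = matches(index, "3", "diphenylpropanoic", slop=1)  # slop 1

# Request parameters for a dismax query with a slop of 1.
# "chem_name" is a hypothetical field; ps and qs are both set because
# they apply to different phrase queries (see the note above).
params = urlencode({
    "defType": "dismax",
    "q": "3-diphenylpropanoic",
    "qf": "chem_name",
    "pf": "chem_name",
    "ps": 1,
    "qs": 1,
})
```

In this model the exact phrase misses and a slop of 1 is enough to
match, since the catenated token sits one position further out than the
query expects.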
