I've been playing around with the PositionLengthAttribute for a few days,
and it doesn't seem to have any effect at all.

I'm aware that position length is not stored in the index, as explained in
this blog post.

http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html

However, even when used at query time it doesn't seem to do anything. Let's
take the following token stream as an example.

text: "he"
posInc: 1
posLen: 1

text: "cannot"
posInc: 1
posLen: 2

text: "can"
posInc: 0
posLen: 1

text: "not"
posInc: 1
posLen: 1

text: "help"
posInc: 1
posLen: 1

A query built from this token graph should match the phrases
"he can not help" and "he cannot help". In my testing, however, it matches
"he can not help" and "he cannot not help", as if the position length were
ignored entirely and always treated as 1.
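
To make the expectation concrete, here is a minimal standalone sketch of how I am reading the graph. It is plain Python, not Lucene code: the helper name `phrases` and its `honor_pos_len` flag are made up for illustration, and I am assuming positions start at -1 so that the first posInc of 1 lands on position 0. Each token becomes an arc from its start position to start + posLen, and every path from the first position to the last spells one phrase.

```python
# Tokens from the example above: (text, posInc, posLen).
TOKENS = [
    ("he", 1, 1),
    ("cannot", 1, 2),
    ("can", 0, 1),
    ("not", 1, 1),
    ("help", 1, 1),
]


def phrases(tokens, honor_pos_len=True):
    """Enumerate every phrase spelled by a path through the token graph."""
    arcs = {}   # start position -> list of (text, end position)
    pos = -1    # assumption: the first posInc of 1 yields position 0
    last = 0    # highest end position seen, i.e. the final node
    for text, pos_inc, pos_len in tokens:
        pos += pos_inc
        # Ignoring posLen means every token spans exactly one position.
        end = pos + (pos_len if honor_pos_len else 1)
        arcs.setdefault(pos, []).append((text, end))
        last = max(last, end)

    def walk(node):
        # Reaching the final node completes a phrase.
        if node == last:
            return [[]]
        return [[text] + rest
                for text, end in arcs.get(node, [])
                for rest in walk(end)]

    return sorted(" ".join(words) for words in walk(0))


print(phrases(TOKENS))                       # what I expect
print(phrases(TOKENS, honor_pos_len=False))  # what I actually observe
```

With posLen honored this gives "he can not help" and "he cannot help". With posLen forced to 1 it gives "he can not help" and "he cannot not help", which is exactly the behavior I am seeing.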

Am I misunderstanding how these attributes work?

- Hayden
