Interesting. First, an apology for an error in my e-book: it says that the
enablePositionIncrements parameter for the stop filter defaults to "false",
but it actually defaults to "true".
The question mark represents a "position increment". The position increment
leaves a "hole" at each position where a stop word was removed, and the
question mark is how that hole is displayed in the parsed query. In your
case you don't want position increments, so add the
enablePositionIncrements="false" parameter to the stop filter, and be sure
to reindex your data.
All bets are off as to what a phrase query does when the phrase starts with
a hole. I think the basic idea is that there must be some term in the index
at that position that can be "skipped".
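
To make the "hole" idea concrete, here is a toy Python sketch of stop word
removal with and without position increments. It is not Lucene code: the
stop word set and the function names analyze and render_phrase are made up
for illustration, and I am assuming "https" is in your stop word list, as
your parsed query suggests.

```python
# Toy model of stop word removal with and without position increments.
# This is NOT Lucene code; the stop word set and function names are
# hypothetical and exist only for illustration.

STOP_WORDS = {"http", "https", "www"}  # assumed stop word list


def analyze(tokens, enable_position_increments=True):
    """Return (term, position) pairs for the tokens that survive the filter."""
    result = []
    position = -1
    skipped = 0  # stop words removed since the last emitted term
    for token in tokens:
        if token in STOP_WORDS:
            skipped += 1
            continue
        # With increments enabled, each removed word still consumes a position.
        position += 1 + (skipped if enable_position_increments else 0)
        skipped = 0
        result.append((token, position))
    return result


def render_phrase(analyzed):
    """Render the terms as a phrase, printing '?' for each empty position."""
    slots = []
    expected = 0
    for term, position in analyzed:
        slots.extend(["?"] * (position - expected))
        slots.append(term)
        expected = position + 1
    return " ".join(slots)


query = ["https", "twitter", "com", "zer0sleep"]
print(render_phrase(analyze(query, enable_position_increments=True)))
print(render_phrase(analyze(query, enable_position_increments=False)))
```

The first line printed is "? twitter com zer0sleep" and the second is
"twitter com zer0sleep", which mirrors the two parsedquery_toString values
in your message.
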
This is actually a change in behavior, which occurred as a side effect of
LUCENE-4963 in 4.4. The default for enablePositionIncrements was false, but
that release changed it to true.
I suspect that I wrote that section of my e-book before 4.4 came out.
Unfortunately, the change is not well documented: there is nothing in the
Javadoc. This is another example of an underlying change in Lucene that
impacts Solr users but is not well highlighted for them. Sorry about that.
In any case, try adding enablePositionIncrements="false", reindex, and see
what happens.
-- Jack Krupansky
-----Original Message-----
From: heaven
Sent: Monday, August 25, 2014 3:37 AM
To: solr-user@lucene.apache.org
Subject: Re: Help with StopFilterFactory
A valid search:
http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za
An invalid search:
http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww
What I found weird is that the valid query has:
"parsedquery_toString": "+(url_words_ngram:\"twitter com zer0sleep\")"
And the invalid one has:
"parsedquery_toString": "+(url_words_ngram:\"? twitter com zer0sleep\")"
So the "https" part was replaced with a "?".