Hello,

I have posted this question to the list several times without a reply, so I 
don't know whether the question is wrong-headed or whether no one reading the 
list is able to comment on it.

My question is: what is the correct way to ignore stop word gaps in 
Lucene 4.4+?

While we are using Lucene 4.4 embedded in eXist-db (exist-db.org), I think the 
question is a proper Lucene question and really has nothing to do with the fact 
that we're using it in an embedded manner.

The problem to be solved is how to ignore stop word gaps in queries - without 
the user having to indicate where such gaps might occur at query time.

Since Lucene 4.4, FilteringTokenFilter.setEnablePositionIncrements(false) is 
no longer available. None of the resources I have found, including "Lucene in 
Action", explains how to get the desired effect now that 4.4+ has removed the 
previous approach.

Prior to Lucene 4.4 it was possible to call setEnablePositionIncrements(false) 
so that, during both indexing and querying, the number and position of stop 
word gaps would be ignored (as mentioned on pp 138-139 of "Lucene in Action").

This meant that a document with a phrase such as:

   blue is the sky

with stop words "is" and "the" would be selected by the query:

   blue sky

This is what we want to achieve. 
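
To make the difference concrete, here is a small plain-Java sketch. It is not 
Lucene code and does not use any Lucene API; the class name PositionSketch and 
the keepGaps flag are made up for illustration. It only models, under my 
understanding of position increments, how token positions come out when a 
removed stop word does or does not advance the position counter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: a plain-Java model of token positions, not Lucene code.
final class PositionSketch {

    // Returns the position of each token that survives stop-word removal.
    // keepGaps = true  : a removed stop word still advances the position counter
    // keepGaps = false : removed stop words leave no gap behind
    static List<Integer> positions(String text, Set<String> stopWords, boolean keepGaps) {
        List<Integer> result = new ArrayList<>();
        int position = 0;
        for (String token : text.trim().split("\\s+")) {
            if (stopWords.contains(token)) {
                if (keepGaps) {
                    position++; // the gap is recorded even though the token is dropped
                }
                continue;
            }
            result.add(position);
            position++;
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("is", "the"));
        System.out.println(positions("blue is the sky", stop, true));  // [0, 3]
        System.out.println(positions("blue is the sky", stop, false)); // [0, 1]
        System.out.println(positions("blue sky", stop, true));         // [0, 1]
    }
}
```

With gaps kept, "blue" and "sky" in the document sit three positions apart, 
while in the query "blue sky" they are adjacent, so an exact phrase match 
fails. With gaps ignored, both come out as [0, 1] and the phrase matches.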

Why? We are working with Tibetan and elisions are not uncommon so that, e.g.:

   rin po che

on some occasions might be shortened to

   rin che

and we would like to have a query of

   rin po che

or

   rin che

find all occurrences of

   rin po che

and

   rin che

without requiring the user to mark where elisions might occur.
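
The behavior we are after for this case can be sketched in plain Java as 
follows. Again, this is not Lucene code: the class ElisionSketch and its 
methods are hypothetical, and treating "po" as a stop word is an assumption 
made only for this example. The sketch drops stop words from both the query 
and the document, leaves no gaps, and then checks for a contiguous match.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: models the desired matching, not how Lucene implements it.
final class ElisionSketch {

    // Drops stop words and keeps the remaining tokens in order, with no gaps.
    static List<String> strip(String text, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    // True when the stripped query occurs as a contiguous run in the stripped document.
    static boolean phraseMatches(String query, String document, Set<String> stopWords) {
        List<String> q = strip(query, stopWords);
        List<String> d = strip(document, stopWords);
        return !q.isEmpty() && Collections.indexOfSubList(d, q) >= 0;
    }

    public static void main(String[] args) {
        // Assumption for this example: "po" is configured as a stop word.
        Set<String> stop = new HashSet<>(Collections.singletonList("po"));
        String[] forms = {"rin po che", "rin che"};
        for (String query : forms) {
            for (String document : forms) {
                System.out.println("\"" + query + "\" vs \"" + document + "\": "
                        + phraseMatches(query, document, stop));
            }
        }
    }
}
```

All four query/document combinations match in this model, which is the result 
we would like to get from Lucene 4.4+ as well.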

The interface 
org.apache.lucene.queryparser.flexible.standard.CommonQueryParserConfiguration 
provides a setEnablePositionIncrements method, but using it does not seem to 
restore the query behavior described above that was possible prior to 
Lucene 4.4.

What is the proper way to ignore stop word gaps?

Thank you,
Chris