Hello,
We're using the Lucene 4.4 embedded in eXist-db (exist-db.org), and as the
subject indicates we want to ignore stop word gaps in queries - without the
user having to indicate where such gaps might occur at query time.
As of Lucene 4.4, FilteringTokenFilter.setEnablePositionIncrements(false) is
no longer available.
Prior to Lucene 4.4 it was possible to call setEnablePositionIncrements(false)
so that, during indexing and querying, the number and position of stop word
gaps would be ignored.
This meant that a phrase such as:
blue is the sky
with stop words "is" and "the" would be matched by the query:
blue sky
We are working with Tibetan, where elisions are not uncommon, so that, e.g.:
rin po che
on some occasions might be shortened to
rin che
and we would like to have a query of
rin po che
or
rin che
find all occurrences of
rin po che
and
rin che
without requiring the user to mark where elisions might occur.
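To make the desired behavior concrete, here is a minimal plain-Java sketch of
what we mean. It does not use any Lucene API: the class StopGapDemo and its
methods are hypothetical names made up for illustration, and the code only
simulates how token positions are assigned when stop word gaps are kept versus
ignored. Treating "po" as a stop word in the Tibetan case is an assumption made
for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Lucene or eXist-db.
class StopGapDemo {

    /** A surviving term plus the position assigned to it during analysis. */
    static final class PositionedToken {
        final String term;
        final int position;

        PositionedToken(String term, int position) {
            this.term = term;
            this.position = position;
        }
    }

    /**
     * Splits on whitespace and drops stop words. With keepGaps == true each
     * dropped stop word leaves a hole in the position sequence; with
     * keepGaps == false the surviving terms are numbered consecutively.
     */
    static List<PositionedToken> analyze(String text, Set<String> stopWords, boolean keepGaps) {
        List<PositionedToken> tokens = new ArrayList<>();
        int position = -1;
        int pendingGap = 0;
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            if (stopWords.contains(word)) {
                pendingGap++;
                continue;
            }
            position += 1 + (keepGaps ? pendingGap : 0);
            pendingGap = 0;
            tokens.add(new PositionedToken(word, position));
        }
        return tokens;
    }

    /** Exact phrase match: every query term must sit at the same relative position in the document. */
    static boolean phraseMatches(String doc, String query, Set<String> stopWords, boolean keepGaps) {
        List<PositionedToken> docTokens = analyze(doc, stopWords, keepGaps);
        List<PositionedToken> queryTokens = analyze(query, stopWords, keepGaps);
        if (queryTokens.isEmpty()) {
            return false;
        }
        Map<Integer, String> termAt = new HashMap<>();
        for (PositionedToken t : docTokens) {
            termAt.put(t.position, t.term);
        }
        int queryBase = queryTokens.get(0).position;
        for (PositionedToken candidate : docTokens) {
            boolean allMatch = true;
            for (PositionedToken q : queryTokens) {
                int wanted = candidate.position + (q.position - queryBase);
                if (!q.term.equals(termAt.get(wanted))) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> english = new HashSet<>(Arrays.asList("is", "the"));
        Set<String> tibetan = new HashSet<>(Arrays.asList("po")); // assumption for the example

        // Gaps kept: "sky" sits at position 3 in the document, so "blue sky" misses.
        System.out.println(phraseMatches("blue is the sky", "blue sky", english, true));
        // Gaps ignored: the behavior we are trying to get back.
        System.out.println(phraseMatches("blue is the sky", "blue sky", english, false));
        System.out.println(phraseMatches("rin po che", "rin che", tibetan, false));
        System.out.println(phraseMatches("rin che", "rin po che", tibetan, false));
    }
}
```

With gaps kept, the first call reports no match; with gaps ignored, the other
three calls all match, which is the behavior we are after in both directions
for "rin po che" and "rin che".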
The
org.apache.lucene.queryparser.flexible.standard.CommonQueryParserConfiguration
provides a setEnablePositionIncrements method, but that does not seem to
restore the query behavior described above, which was possible prior to
Lucene 4.4.
What is the proper way to ignore stop word gaps?
Thank you,
Chris