[
https://issues.apache.org/jira/browse/LUCENE-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658517#comment-13658517
]
Steve Rowe commented on LUCENE-4981:
------------------------------------
Adrien,
I looked at
[http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory]
and at where PositionFilter originally came from
(http://markmail.org/message/g4habmbyeuckmix6 and LUCENE-1380), and I don't
think existing query parser functionality, including
{{QueryParser.setAutoGeneratePhraseQueries}}, will cover the use case it was
created to handle.
That use case is roughly: given an indexed non-tokenized string field (e.g. with
value "a b"), and a multi-word query against that field, create a disjunction
query of all possible word n-grams, where 0<n<=N and N is as large as the
expected longest
query. E.g. a query "a b c" would result in "'a' OR 'a b' OR 'a b c' OR 'b' OR
'b c' OR 'c'", and would match a doc with field value "a b".
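To make the target behaviour concrete, here is a minimal plain-Java sketch of the n-gram expansion described above. It does not use any Lucene API; the class and method names ({{WordNGrams}}, {{wordNGrams}}, {{matches}}) are hypothetical and exist only for illustration, and whitespace splitting is an assumption about how the query is tokenized.

```java
import java.util.ArrayList;
import java.util.List;

public class WordNGrams {

    /** Returns every contiguous word n-gram of the query, for 1 <= n <= maxN. */
    public static List<String> wordNGrams(String query, int maxN) {
        // Assumption: the query is tokenized on whitespace only.
        String[] words = query.trim().split("\\s+");
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < words.length; start++) {
            StringBuilder gram = new StringBuilder();
            for (int n = 1; n <= maxN && start + n <= words.length; n++) {
                if (n > 1) {
                    gram.append(' ');
                }
                gram.append(words[start + n - 1]);
                grams.add(gram.toString());
            }
        }
        return grams;
    }

    /** A non-tokenized field matches when its whole value equals one n-gram. */
    public static boolean matches(String fieldValue, String query, int maxN) {
        return wordNGrams(query, maxN).contains(fieldValue);
    }

    public static void main(String[] args) {
        System.out.println(wordNGrams("a b c", 3));     // [a, a b, a b c, b, b c, c]
        System.out.println(matches("a b", "a b c", 3)); // true
    }
}
```

Each n-gram corresponds to one optional clause of the disjunction, so the doc with field value "a b" matches via the 'a b' clause.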
[~michaelsembwever], the guy who started the thread and created the issue, was
able to handle this use case by stringing together:
# Quoting the query, to allow the configured analyzer to see all of the terms
together instead of one at a time
# ShingleFilter, to create the n-grams
# The new PositionFilter, to place all terms at the same position
# QueryParser's synonym handling functionality, which produces a
MultiPhraseQuery that, when all of its terms share a single position,
creates a BooleanQuery with one SHOULD TermQuery for each term.
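The steps above can be modelled roughly as follows. This is a simplified plain-Java sketch, not Lucene code: {{Token}}, {{shingles}}, {{flattenPositions}} and {{describeQuery}} are hypothetical names, and the position increments only approximate what ShingleFilter and PositionFilter emit. The point is that once every token after the first has a position increment of 0, all terms occupy one position and the result can be a disjunction rather than a phrase.

```java
import java.util.ArrayList;
import java.util.List;

public class PositionFlatteningSketch {

    /** A term plus its position increment relative to the previous token. */
    public static final class Token {
        final String term;
        final int posInc;

        Token(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    /** Models shingling: unigrams advance the position, longer n-grams stack on it. */
    public static List<Token> shingles(String text, int maxN) {
        String[] words = text.trim().split("\\s+");
        List<Token> out = new ArrayList<>();
        for (int start = 0; start < words.length; start++) {
            StringBuilder gram = new StringBuilder();
            for (int n = 1; n <= maxN && start + n <= words.length; n++) {
                if (n > 1) {
                    gram.append(' ');
                }
                gram.append(words[start + n - 1]);
                out.add(new Token(gram.toString(), n == 1 ? 1 : 0));
            }
        }
        return out;
    }

    /** Models position flattening: every token after the first gets increment 0. */
    public static List<Token> flattenPositions(List<Token> tokens) {
        List<Token> out = new ArrayList<>();
        for (Token t : tokens) {
            out.add(new Token(t.term, out.isEmpty() ? t.posInc : 0));
        }
        return out;
    }

    /** One position yields a disjunction; several positions yield a phrase. */
    public static String describeQuery(List<Token> tokens) {
        int positions = 0;
        for (Token t : tokens) {
            if (t.posInc > 0) {
                positions++;
            }
        }
        if (positions > 1) {
            return "phrase query over " + positions + " positions";
        }
        List<String> clauses = new ArrayList<>();
        for (Token t : tokens) {
            clauses.add("'" + t.term + "'");
        }
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        List<Token> shingled = shingles("a b c", 3);
        System.out.println(describeQuery(shingled));
        System.out.println(describeQuery(flattenPositions(shingled)));
    }
}
```

In this model the unflattened stream spans three positions and is treated as a phrase, while the flattened stream produces the six-clause disjunction from the example.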
Without PositionFilter, is there some way to achieve the same goal?
I don't think we should get rid of PositionFilter unless we have an alternate
way to handle the (IMHO legitimate) use case it was originally designed to
cover.
> Deprecate PositionFilter
> ------------------------
>
> Key: LUCENE-4981
> URL: https://issues.apache.org/jira/browse/LUCENE-4981
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-4981.patch
>
>
> According to the documentation
> (http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory),
> PositionFilter is mainly useful to make query parsers generate boolean
> queries instead of phrase queries although this problem can be solved at
> query parsing level instead of analysis level (eg. using
> QueryParser.setAutoGeneratePhraseQueries).
> So given that PositionFilter corrupts token graphs (see TestRandomChains), I
> propose to deprecate it.