[jira] [Commented] (LUCENE-4981) Deprecate PositionFilter

Steve Rowe (JIRA) Wed, 15 May 2013 09:33:19 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658517#comment-13658517
 ]


Steve Rowe commented on LUCENE-4981:
------------------------------------

Adrien,

I looked at 
[http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory],
 and at where it originally came from 
(http://markmail.org/message/g4habmbyeuckmix6 and LUCENE-1380), and I don't 
think existing query parser functionality, including 
{{QueryParser.setAutoGeneratePhraseQueries}}, will cover the use case it was 
created to handle.

That use case is roughly: given an indexed non-tokenized string field (e.g. "a 
b"), and a multi-word query against that field, create a disjunction query of 
all possible word n-grams, where 0<n<=N and N is as large as the expected 
longest query.  E.g. a query "a b c" would result in "'a' OR 'a b' OR 'a b c' 
OR 'b' OR 'b c' OR 'c'", and would match a doc with field value "a b".
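
To make the target query concrete, here is a minimal plain-Java sketch of the 
n-gram enumeration described above. It does not use Lucene at all; the class 
{{WordNGrams}} and its methods are hypothetical names made up for this 
illustration, and whitespace splitting stands in for whatever tokenizer is 
actually configured.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Lucene: enumerates word n-grams of a query. */
final class WordNGrams {

    /** Returns every run of 1..maxSize consecutive words, ordered by start word. */
    static List<String> shingles(String query, int maxSize) {
        List<String> out = new ArrayList<>();
        String trimmed = query.trim();
        if (trimmed.isEmpty()) {
            return out;
        }
        // Assumption: whitespace splitting approximates the real tokenizer.
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start++) {
            StringBuilder gram = new StringBuilder();
            for (int n = 1; n <= maxSize && start + n <= words.length; n++) {
                if (n > 1) {
                    gram.append(' ');
                }
                gram.append(words[start + n - 1]);
                out.add(gram.toString());
            }
        }
        return out;
    }

    /** True if any n-gram equals the untokenized field value (an OR of exact matches). */
    static boolean matches(String query, int maxSize, String fieldValue) {
        return shingles(query, maxSize).contains(fieldValue);
    }
}
```

With this sketch, {{WordNGrams.shingles("a b c", 3)}} yields the six n-grams 
listed above, and {{WordNGrams.matches("a b c", 3, "a b")}} is true because one 
of them equals the stored field value.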

[~michaelsembwever], the guy who started the thread and created the issue, was 
able to handle this use case by stringing together:

# Quoting the query, to allow the configured analyzer to see all of the terms 
at once instead of one at a time
# ShingleFilter, to create the n-grams
# The new PositionFilter, to place all terms at the same position
# QueryParser's synonym handling functionality, which produces a 
MultiPhraseQuery, which when given multiple terms at the same single position, 
creates a BooleanQuery with one SHOULD TermQuery for each term.
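
The effect of steps 2 to 4 can be modelled without Lucene. The following is a 
toy sketch, not Lucene code: {{PositionModel}}, {{Token}}, {{flatten}}, 
{{positionCount}} and {{describe}} are hypothetical names. It assumes that 
ShingleFilter emits each unigram with position increment 1 and the shingles 
starting at that word with increment 0, and that PositionFilter keeps the first 
token's increment and sets every later one to 0.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene code) of how flattening positions changes the query shape. */
final class PositionModel {

    /** A term plus its position increment, as in Lucene's PositionIncrementAttribute. */
    static final class Token {
        final String term;
        final int posInc;

        Token(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    /** Models PositionFilter: the first token keeps its increment, the rest get 0. */
    static List<Token> flatten(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (int i = 0; i < in.size(); i++) {
            Token t = in.get(i);
            out.add(new Token(t.term, i == 0 ? t.posInc : 0));
        }
        return out;
    }

    /** Distinct positions in the stream, assuming the first increment is positive. */
    static int positionCount(List<Token> tokens) {
        int count = 0;
        for (Token t : tokens) {
            if (t.posInc > 0) {
                count++;
            }
        }
        return count;
    }

    /** One position: OR of all terms. Several positions: a positional (phrase) query. */
    static String describe(List<Token> tokens) {
        if (positionCount(tokens) > 1) {
            return "phrase over " + positionCount(tokens) + " positions";
        }
        StringBuilder sb = new StringBuilder();
        for (Token t : tokens) {
            if (sb.length() > 0) {
                sb.append(" OR ");
            }
            sb.append('\'').append(t.term).append('\'');
        }
        return sb.toString();
    }
}
```

In this model the shingled stream for "a b c" occupies three positions and so 
stays a positional query, while the flattened stream occupies a single position 
and comes out as the disjunction from the example above.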

Without PositionFilter, is there some way to achieve the same goal?

I don't think we should get rid of PositionFilter unless we have an alternate 
way to handle the (IMHO legitimate) use case it was originally designed to 
cover.
                
> Deprecate PositionFilter
> ------------------------
>
>                 Key: LUCENE-4981
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4981
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-4981.patch
>
>
> According to the documentation 
> (http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory),
>  PositionFilter is mainly useful to make query parsers generate boolean 
> queries instead of phrase queries although this problem can be solved at 
> query parsing level instead of analysis level (eg. using 
> QueryParser.setAutoGeneratePhraseQueries).
> So given that PositionFilter corrupts token graphs (see TestRandomChains), I 
> propose to deprecate it.
