[ 
https://issues.apache.org/jira/browse/LUCENE-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated LUCENE-2605:
-------------------------------
    Attachment: LUCENE-2605.patch

Attached a new patch; I think it's ready.

I've pulled MockSynonymFilter/Analyzer out into their own files in 
lucene-test-framework, added tests for them, and added a fixed multi-term 
source synonym with one single-term target.

I added tests to {{TestQueryParser}} using the modified MockSynonymAnalyzer 
to ensure that operators block multi-term analysis when they should and don't 
when they shouldn't.
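
To illustrate what "operators block multi-term analysis" means, here is a minimal standalone sketch. It is not code from the patch and does not use the Lucene API: the class {{OperatorRunsDemo}}, the method {{termRuns}}, and the reduced operator set (AND, OR, NOT only) are all hypothetical. It groups consecutive terms into runs that could be handed to the analyzer together, and starts a new run at each operator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy, not part of Lucene or of the attached patch.
class OperatorRunsDemo {
    // Assumption: only these three keyword operators are recognized here.
    static final Set<String> OPERATORS =
        new HashSet<>(Arrays.asList("AND", "OR", "NOT"));

    // Groups whitespace-separated terms into runs; an operator ends the
    // current run, so terms on either side of it are never analyzed together.
    static List<String> termRuns(String query) {
        List<String> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String piece : query.trim().split("\\s+")) {
            if (OPERATORS.contains(piece)) {
                if (current.length() > 0) {
                    runs.add(current.toString());
                    current.setLength(0);
                }
            } else if (!piece.isEmpty()) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(piece);
            }
        }
        if (current.length() > 0) {
            runs.add(current.toString());
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(termRuns("guinea pig"));      // one run: both terms together
        System.out.println(termRuns("guinea AND pig"));  // two runs: operator splits them
    }
}
```

With this grouping, a multi-term synonym could match inside the run "guinea pig" but not across the operator in "guinea AND pig".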

I'll now open issues to add the same capabilities to Solr's clone of this 
QueryParser and to the standard flexible query parser.

> queryparser parses on whitespace
> --------------------------------
>
>                 Key: LUCENE-2605
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2605
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>            Reporter: Robert Muir
>            Assignee: Steve Rowe
>             Fix For: 4.9, 6.0
>
>         Attachments: LUCENE-2605.patch, LUCENE-2605.patch, LUCENE-2605.patch
>
>
> The queryparser parses input on whitespace, and sends each 
> whitespace-separated term to its own independent token stream.
> This breaks the following at query-time, because they can't see across 
> whitespace boundaries:
> * n-gram analysis
> * shingles 
> * synonyms (especially multi-word for whitespace-separated languages)
> * languages where a 'word' can contain whitespace (e.g. Vietnamese)
> It's also rather unexpected, as users think their 
> charfilters/tokenizers/tokenfilters will do the same thing at index and 
> query time, but in many cases they can't. Instead, preferably the 
> queryparser would parse around only real 'operators'.
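
The effect described in the issue can be reproduced with a small standalone sketch. This is a toy under stated assumptions, not Lucene code: the class {{WhitespaceSplitDemo}} and its methods are hypothetical, the "analyzer" simply lowercases and replaces the two-token sequence "guinea pig" with "cavy" (an illustrative mapping), and it replaces tokens rather than stacking synonyms at the same position as a real synonym filter would.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical toy, not part of Lucene.
class WhitespaceSplitDemo {
    // Toy analysis chain: lowercase, tokenize on whitespace, then map the
    // adjacent tokens "guinea" "pig" to the single token "cavy".
    static List<String> analyze(String text) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return out;
        }
        String[] tokens = trimmed.split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            boolean multiTermMatch = i + 1 < tokens.length
                && tokens[i].equals("guinea")
                && tokens[i + 1].equals("pig");
            if (multiTermMatch) {
                out.add("cavy");
                i++; // skip the second token of the matched pair
            } else {
                out.add(tokens[i]);
            }
        }
        return out;
    }

    // Behavior described in the issue: the parser splits on whitespace first,
    // so each chunk reaches the analyzer alone and the pair is never seen.
    static List<String> splitThenAnalyze(String query) {
        List<String> out = new ArrayList<>();
        for (String chunk : query.trim().split("\\s+")) {
            out.addAll(analyze(chunk));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(splitThenAnalyze("guinea pig")); // [guinea, pig]
        System.out.println(analyze("guinea pig"));          // [cavy]
    }
}
```

The same analysis chain gives different tokens depending on whether the text is pre-split, which is why index-time and query-time results can disagree.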



