Nguyen Manh Tien created SOLR-5379:
--------------------------------------

             Summary: Multi-word synonym filter
                 Key: SOLR-5379
                 URL: https://issues.apache.org/jira/browse/SOLR-5379
             Project: Solr
          Issue Type: Improvement
          Components: query parsers
            Reporter: Nguyen Manh Tien
            Priority: Minor
             Fix For: 4.5.1


When synonyms are applied at query time, Solr fails to handle multi-word 
synonyms, for two reasons:
- First, the Lucene query parser tokenizes the user query on whitespace, so it 
splits a multi-word term into separate terms before they reach the synonym 
filter. The synonym filter therefore cannot recognize the multi-word term and 
expand it.
- Second, when the synonym filter expands a term into several alternatives that 
include a multi-word synonym, SolrQueryParserBase currently uses 
MultiPhraseQuery to handle the synonyms. But MultiPhraseQuery does not work 
when the alternatives have different numbers of words.

For the first problem, we can quote every multi-word synonym in the user query 
so that the Lucene query parser does not split it. A related JIRA issue is 
https://issues.apache.org/jira/browse/LUCENE-2605.
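
A minimal sketch of the quoting idea, assuming the multi-word synonyms are 
available as a plain list of strings. The class and method names are 
hypothetical and are not part of Solr or Lucene; this is a plain-Java 
preprocessing step, not a patch to the query parser.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class MultiWordSynonymQuoter {

    // Hypothetical helper: wraps each known multi-word synonym found in the
    // raw user query in double quotes, so a whitespace-splitting parser would
    // keep it together as one phrase.
    public static String quoteMultiWordSynonyms(String query, List<String> multiWordSynonyms) {
        // Longest entries first, so "new york city" wins over "new york".
        String alternation = multiWordSynonyms.stream()
                .sorted((a, b) -> Integer.compare(b.length(), a.length()))
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        if (alternation.isEmpty()) {
            return query;
        }
        // Either an already-quoted segment (left untouched) or an unquoted synonym.
        Pattern pattern = Pattern.compile(
                "\"[^\"]*\"|\\b(" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(query);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String replacement = matcher.group(1) != null
                    ? "\"" + matcher.group(1) + "\""
                    : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> synonyms = Arrays.asList("sea biscuit", "new york");
        // prints: hotels in "new york"
        System.out.println(quoteMultiWordSynonyms("hotels in new york", synonyms));
    }
}
```

Text that the user already quoted is passed through unchanged, so the step 
does not nest quotes.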

For the second, we can replace MultiPhraseQuery with a BooleanQuery whose 
SHOULD clauses are PhraseQuery instances, one per alternative, whenever the 
token stream contains a multi-word synonym.
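
A rough illustration of that expansion, written as plain Java that emits query 
syntax instead of building Lucene query objects. The class and method names 
are hypothetical. Each position in the phrase holds its alternatives, and an 
alternative may span several words. In Lucene itself, each generated phrase 
would presumably become a PhraseQuery added to the BooleanQuery with 
BooleanClause.Occur.SHOULD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PhraseExpansionSketch {

    // Builds every full phrase by taking one alternative per position
    // (a cartesian product). Alternatives may have different word counts.
    public static List<String> expandPhrases(List<List<String>> positions) {
        List<String> phrases = new ArrayList<>();
        if (positions.isEmpty()) {
            return phrases;
        }
        phrases.add("");
        for (List<String> alternatives : positions) {
            List<String> next = new ArrayList<>();
            for (String prefix : phrases) {
                for (String alternative : alternatives) {
                    next.add(prefix.isEmpty() ? alternative : prefix + " " + alternative);
                }
            }
            phrases = next;
        }
        return phrases;
    }

    // Renders the phrases as OR-ed (SHOULD) quoted phrase clauses on one field.
    public static String toShouldQuery(String field, List<List<String>> positions) {
        List<String> clauses = new ArrayList<>();
        for (String phrase : expandPhrases(positions)) {
            clauses.add(field + ":\"" + phrase + "\"");
        }
        return "(" + String.join(" OR ", clauses) + ")";
    }

    public static void main(String[] args) {
        List<List<String>> positions = Arrays.asList(
                Arrays.asList("usa", "united states"), // synonyms of differing length
                Arrays.asList("travel"));
        // prints: (text:"usa travel" OR text:"united states travel")
        System.out.println(toShouldQuery("text", positions));
    }
}
```

The number of phrases grows multiplicatively with the number of alternatives 
per position, so a real implementation would likely need a cap.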




--
This message was sent by Atlassian JIRA
(v6.1#6144)