[ 
https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803652#comment-13803652
 ] 

Nguyen Manh Tien commented on SOLR-5379:
----------------------------------------

[~otis] The differences are:
- SOLR-4381 is an extension of EDismax, so it only works with that query
parser. My patch changes SolrQueryParserBase, so it works with any query
parser.
- SOLR-4381 rewrites the query into a lattice (all synonym combinations), so
it needs to parse N modified queries. My patch is applied when we read the
token stream to build the Lucene Query, so the query is still parsed only once.
- We can still optimize my patch to make the resulting Lucene Query more
compact by combining MultiPhraseQuery and PhraseQuery, so the Lucene Query
from my patch is smaller than the one from SOLR-4381.
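
To make the second point concrete, here is a minimal Python sketch of the two
strategies as this comment describes them. It is not the code of either patch:
the synonym table, the function names, and the sample query are all
hypothetical, and real parsing is reduced to string handling.

```python
from itertools import product

# Hypothetical synonym table (illustrative only): token -> alternatives.
SYNONYMS = {
    "ny": ["ny", "new york"],
    "pc": ["pc", "personal computer"],
}

def lattice_rewrites(tokens):
    """Lattice style: one rewritten query string per synonym combination.

    Each returned string would have to be parsed separately.
    """
    choices = [SYNONYMS.get(t, [t]) for t in tokens]
    return [" ".join(combo) for combo in product(*choices)]

def single_pass_alternatives(tokens):
    """Single-pass style: keep the alternatives per position in one structure.

    The query is read once; alternatives are attached while building it.
    """
    return [SYNONYMS.get(t, [t]) for t in tokens]

tokens = ["ny", "pc", "shop"]
print(len(lattice_rewrites(tokens)))       # number of queries to parse
print(single_pass_alternatives(tokens))    # one structure, built in one pass
```

With two synonym-bearing tokens of two alternatives each, the lattice style
yields four query strings, and the count multiplies with every additional
expandable token, while the single-pass structure grows only by the number of
alternatives.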


> Query-time multi-word synonym expansion
> ---------------------------------------
>
>                 Key: SOLR-5379
>                 URL: https://issues.apache.org/jira/browse/SOLR-5379
>             Project: Solr
>          Issue Type: Improvement
>          Components: query parsers
>            Reporter: Nguyen Manh Tien
>              Labels: multi-word, queryparser, synonym
>             Fix For: 4.5.1, 4.6
>
>         Attachments: quoted.patch, synonym-expander.patch
>
>
> When handling synonyms at query time, Solr fails to work with multi-word 
> synonyms for two reasons:
> - First, the Lucene query parser tokenizes the user query on whitespace, so 
> it splits a multi-word term into separate terms before feeding them to the 
> synonym filter, and the synonym filter can't recognize the multi-word term 
> to expand it.
> - Second, if the synonym filter expands into multiple terms that include a 
> multi-word synonym, SolrQueryParserBase currently uses MultiPhraseQuery to 
> handle the synonyms. But MultiPhraseQuery doesn't work with terms that have 
> different numbers of words.
> For the first one, we can quote all multi-word synonyms in the user query 
> so that the Lucene query parser doesn't split them. There is a related JIRA 
> issue: https://issues.apache.org/jira/browse/LUCENE-2605.
> For the second, we can replace MultiPhraseQuery with an appropriate 
> BooleanQuery of SHOULD clauses, each a PhraseQuery, when the token stream 
> contains multi-word synonyms.
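
The two fixes proposed in the description can be sketched in plain Python as
follows. This is only an illustration under stated assumptions, not the
attached patches: the synonym groups and both function names are hypothetical,
the quoting is naive (it does not handle phrases nested inside other quoted
text), and lists of words stand in for Lucene's PhraseQuery objects. The
second function reflects why MultiPhraseQuery is a poor fit: it holds
alternative terms per position, so a one-word synonym and a two-word synonym
cannot be aligned, whereas a list of independent phrases can mix lengths.

```python
import re

# Hypothetical synonym groups (illustrative only); each group lists
# phrases that should be treated as equivalent.
SYNONYM_GROUPS = [
    ["ny", "new york", "big apple"],
]

def quote_multiword_synonyms(query, groups=SYNONYM_GROUPS):
    """Fix 1: wrap known multi-word synonyms in double quotes so that a
    whitespace-splitting parser keeps each one as a single unit."""
    phrases = sorted(
        {p for g in groups for p in g if " " in p}, key=len, reverse=True
    )
    for phrase in phrases:
        # Skip occurrences that already sit directly inside quotes.
        pattern = r'(?<!")\b' + re.escape(phrase) + r'\b(?!")'
        query = re.sub(pattern, '"' + phrase + '"', query)
    return query

def expand_to_should_phrases(phrase, groups=SYNONYM_GROUPS):
    """Fix 2: model a BooleanQuery of SHOULD clauses, one phrase per
    synonym. Each phrase is a list of words, so lengths may differ."""
    for group in groups:
        if phrase in group:
            return [alt.split() for alt in group]
    return [phrase.split()]

print(quote_multiword_synonyms("hotels in new york"))
print(expand_to_should_phrases("new york"))
```

A phrase with no known synonyms comes back as a single-element list, which
corresponds to an ordinary PhraseQuery with no alternatives.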



--
This message was sent by Atlassian JIRA
(v6.1#6144)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
