[ https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803199#comment-13803199 ]
Otis Gospodnetic commented on SOLR-5379:
----------------------------------------

My understanding of how this synonym expander (the synonym-expander.patch) works is:

Assume synonyms are:
{code}
Seabiscuit, Sea biscit, Biscit
{code}

For the query "Seabiscuit article", the regular edismax will construct a MultiPhraseQuery like (Seabiscuit|Sea|biscit, biscit, article).

Instead of that, this patch rewrites the query differently:
PhraseQuery(Seabiscuit article) OR PhraseQuery(Sea biscit article) OR PhraseQuery(biscit article)

> Query-time multi-word synonym expansion
> ---------------------------------------
>
>                 Key: SOLR-5379
>                 URL: https://issues.apache.org/jira/browse/SOLR-5379
>             Project: Solr
>          Issue Type: Improvement
>          Components: query parsers
>            Reporter: Nguyen Manh Tien
>              Labels: multi-word, queryparser, synonym
>             Fix For: 4.5.1, 4.6
>
>         Attachments: quoted.patch, synonym-expander.patch
>
>
> While dealing with synonyms at query time, Solr fails to work with multi-word
> synonyms for two reasons:
> - First, the Lucene query parser tokenizes the user query on whitespace, so it
> splits a multi-word term into two terms before feeding them to the synonym
> filter, and the synonym filter can't recognize the multi-word term to expand it.
> - Second, if the synonym filter expands into multiple terms that include a
> multi-word synonym, SolrQueryParserBase currently uses MultiPhraseQuery to
> handle synonyms. But MultiPhraseQuery doesn't work with terms that have
> different numbers of words.
> For the first one, we can quote all multi-word synonyms in the user query
> so that the Lucene query parser doesn't split them. There is a related JIRA
> issue: https://issues.apache.org/jira/browse/LUCENE-2605.
> For the second, we can replace MultiPhraseQuery with an appropriate
> BooleanQuery of SHOULD clauses, each containing a PhraseQuery, when the token
> stream has a multi-word synonym.
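
The rewrite described in the comment (one phrase per synonym variant, OR-ed together) can be sketched without any Lucene dependency. This is only an illustration of the idea, not the code in synonym-expander.patch: the class name SynonymPhraseExpansion and the methods expand and toQueryString are hypothetical, the output is a plain query string rather than real PhraseQuery/BooleanQuery objects, and only the leftmost match of a single synonym group is expanded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Solr or of the attached patch.
final class SynonymPhraseExpansion {

    // Returns one quoted phrase per synonym variant. Only the leftmost
    // occurrence of a synonym from the group is replaced; if nothing
    // matches, the query is returned as a single quoted phrase.
    static List<String> expand(String query, List<String> synonymGroup) {
        List<String> queryTokens = Arrays.asList(query.trim().split("\\s+"));
        for (int start = 0; start < queryTokens.size(); start++) {
            for (String synonym : synonymGroup) {
                List<String> synTokens = Arrays.asList(synonym.trim().split("\\s+"));
                int end = start + synTokens.size();
                if (end > queryTokens.size()
                        || !sameTokens(queryTokens.subList(start, end), synTokens)) {
                    continue;
                }
                // Found a synonym at [start, end): emit one phrase per variant.
                List<String> phrases = new ArrayList<>();
                for (String replacement : synonymGroup) {
                    List<String> tokens = new ArrayList<>(queryTokens.subList(0, start));
                    tokens.add(replacement);
                    tokens.addAll(queryTokens.subList(end, queryTokens.size()));
                    phrases.add("\"" + String.join(" ", tokens) + "\"");
                }
                return phrases;
            }
        }
        return new ArrayList<>(Arrays.asList("\"" + query.trim() + "\""));
    }

    // Case-insensitive, token-by-token comparison of two equal-length lists.
    private static boolean sameTokens(List<String> a, List<String> b) {
        for (int i = 0; i < a.size(); i++) {
            if (!a.get(i).equalsIgnoreCase(b.get(i))) {
                return false;
            }
        }
        return true;
    }

    // Joins the phrases with OR, mirroring a BooleanQuery of SHOULD clauses.
    static String toQueryString(List<String> phrases) {
        return String.join(" OR ", phrases);
    }

    public static void main(String[] args) {
        List<String> group = Arrays.asList("Seabiscuit", "Sea biscit", "Biscit");
        System.out.println(toQueryString(expand("Seabiscuit article", group)));
    }
}
```

Because each variant becomes its own phrase, variants with different word counts ("Seabiscuit" vs. "Sea biscit") no longer have to share token positions, which is the limitation of MultiPhraseQuery that the issue description points out.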