Hi Kuro Kurosaka,

Thanks for your attention.


Yes, the problem can only be reproduced with a Chinese query, because English 
words are separated by spaces.


If I insert a space into the query and search for "北京市  动物园", the query is 
parsed to "+(北京市 北京) +动物园", which is the expected result. So a Chinese query 
also works, but only if I insert spaces to separate the words.
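
A rough illustration of the expected grouping, as a sketch in plain Java rather than Solr's actual code. The class PositionGrouping, the Tok type, and the group method are hypothetical names made up for this example, and it assumes the analyzer emits the synonym 北京 at the same position as 北京市. Terms that share a position are combined as alternatives inside one required clause:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PositionGrouping {
    /** Hypothetical stand-in for an analyzed token: its text and its position. */
    public static final class Tok {
        final String term;
        final int position;

        public Tok(String term, int position) {
            this.term = term;
            this.position = position;
        }
    }

    /**
     * Builds a query string in which every position is one required (+) clause,
     * and terms sharing a position are optional alternatives inside that clause.
     */
    public static String group(List<Tok> tokens) {
        // TreeMap keeps the clauses ordered by position.
        Map<Integer, List<String>> byPosition = new TreeMap<>();
        for (Tok t : tokens) {
            byPosition.computeIfAbsent(t.position, k -> new ArrayList<>()).add(t.term);
        }
        StringBuilder sb = new StringBuilder();
        for (List<String> terms : byPosition.values()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('+');
            if (terms.size() == 1) {
                sb.append(terms.get(0));
            } else {
                sb.append('(').append(String.join(" ", terms)).append(')');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed analyzer output for "北京市动物园" with the synonym applied.
        List<Tok> tokens = List.of(
                new Tok("北京市", 0),
                new Tok("北京", 0),
                new Tok("动物园", 1));
        System.out.println(group(tokens)); // prints +(北京市 北京) +动物园
    }
}
```

The unexpected result "+北京市 +北京 +动物园" corresponds to treating every token as its own required clause, regardless of position.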


The query parser I used is ik-analyzer: http://code.google.com/p/ik-analyzer/




Thanks,
Wei Li


------------------ Original ------------------
From:  "Kuro Kurosaka"<kuro...@sonic.net>;
Date:  Thu, Apr 4, 2013 02:53 AM
To:  "solr-user"<solr-user@lucene.apache.org>; 
Cc:  "李威"<li...@antvision.cn>; "罗佳"<luo...@antvision.cn>; 
"李景泽"<lijin...@antvision.cn>; 
Subject:  Re: It seems a issue of deal with chinese synonym for solr

 
On 3/11/13 6:15 PM, 李威 wrote:
> in org.apache.solr.parser.SolrQueryParserBase, there is a function: 
> "protected Query newFieldQuery(Analyzer analyzer, String field, String 
> queryText, boolean quoted)  throws SyntaxError"
>
> The code below doesn't handle Chinese correctly.
>
> "          BooleanClause.Occur occur = positionCount > 1 && operator == 
> AND_OPERATOR ?
>              BooleanClause.Occur.MUST : BooleanClause.Occur.SHOULD;
>
> "
>
> For example, "北京市" and "北京" are synonyms. If I search for "北京市动物园", the 
> expected parse result is "+(北京市 北京) +动物园", but it is actually parsed to 
> "+北京市 +北京 +动物园".
>
> The code handles English correctly, because English words are separated by 
> spaces, so each word has only one position.

An interesting feature of this example is that the difference between the two 
synonyms is the omission of one token, "市" (city). Doesn't the same problem 
happen if we define "London City" and "London" as synonyms, and execute a query 
like "London City Zoo"?
Must a Chinese analyzer be used to reproduce this problem?

I tried to test this but I couldn't. The result of query string expansion using 
Solr 4.2's query interface with debug output shows:

<str name="parsedquery">MultiPhraseQuery(text:"(london london) city zoo")</str>

I see no plus (+). What query parser did you use?

-- 
Kuro Kurosaka
