[jira] [Commented] (SOLR-9894) Tokenizer work randomly

Alexandre Rafalovitch (JIRA) Tue, 27 Dec 2016 00:47:50 -0800

    [ https://issues.apache.org/jira/browse/SOLR-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15779952#comment-15779952 ]


Alexandre Rafalovitch commented on SOLR-9894:
---------------------------------------------

The tokenizers used are not part of the Lucene/Solr code base. They seem to 
come from https://github.com/EugenePig/ik-analyzer-solr5 . A bug report should 
be opened against that repository with a specific example.

I would recommend being very clear about which example showcases the issue, and
perhaps even annotating and recompiling the code to confirm it. The behavior is
unlikely to be truly random, but a strange combination of factors might be
triggering whatever you are observing.
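
One way to pin down such an example is to run the same text through the query
analyzer many times and record whether the token output ever changes. The
Python below is a minimal sketch, not a tested tool. It assumes that the field
analysis handler is reachable at /analysis/field on the collection, that it
accepts the analysis.fieldtype and analysis.query parameters, and that the JSON
response carries the token stream under a top-level "analysis" key. The base
URL, collection name and sample text are placeholders; the function names are
made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter


def analysis_url(base_url, collection, field_type, query_text):
    """Build a field-analysis URL (handler path and parameter names are assumptions)."""
    params = urllib.parse.urlencode({
        "analysis.fieldtype": field_type,
        "analysis.query": query_text,
        "wt": "json",
    })
    return f"{base_url}/{collection}/analysis/field?{params}"


def analysis_fingerprint(body):
    """Reduce a JSON response body to a stable string of its 'analysis' part.

    The response header (timings etc.) is dropped so that only differences
    in the analysis output itself make two fingerprints differ.
    """
    doc = json.loads(body)
    return json.dumps(doc.get("analysis"), sort_keys=True, ensure_ascii=False)


def probe(base_url, collection, field_type, query_text, attempts=50):
    """Send the same analysis request repeatedly and count distinct outputs."""
    url = analysis_url(base_url, collection, field_type, query_text)
    seen = Counter()
    for _ in range(attempts):
        with urllib.request.urlopen(url) as resp:
            seen[analysis_fingerprint(resp.read().decode("utf-8"))] += 1
    return seen
```

Calling probe("http://localhost:8983/solr", "mycollection", "my_ik", "<sample
text>") against each node in turn and getting back more than one distinct
fingerprint would give a concrete, repeatable case to attach to the bug report.
A single fingerprint on every node would suggest the variation comes from
somewhere other than the query analyzer.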

> Tokenizer work randomly
> -----------------------
>
>                 Key: SOLR-9894
>                 URL: https://issues.apache.org/jira/browse/SOLR-9894
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: 6.2.1
>         Environment: solrcloud 6.2.1(3 solr nodes)
> OS:linux
> RAM:8G
>            Reporter: 王海涛
>            Priority: Critical
>              Labels: patch
>
> My schema.xml has a fieldType as follows:
> <fieldType name="my_ik" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false"/>
>     <filter class="org.wltea.pinyin.solr5.PinyinTokenFilterFactory" pinyinAll="true" minTermLength="2"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> Note:
>   the index tokenizer has useSmart set to false
>   the query tokenizer has useSmart set to true
> But when I send a query request with the q parameter,
> the query tokenizer sometimes behaves as if useSmart is true
> and sometimes as if useSmart is false.
> That is terrible!
> I guess the problem may be caused by a tokenizer cache:
> when I query, the tokenizer should use true as the value of useSmart,
> but a wrong tokenizer result has been cached, one created by the indexWriter,
> which uses false as the value of useSmart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
