[jira] [Commented] (SOLR-9894) Tokenizer work randomly

Erick Erickson (JIRA) Thu, 29 Dec 2016 19:23:15 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786758#comment-15786758
 ]


Erick Erickson commented on SOLR-9894:
--------------------------------------

We've mentioned several times that this involves an analysis class that is 
_not_ part of Apache Solr, specifically: 
org.wltea.pinyin.solr5.PinyinTokenFilterFactory. You have yet to show that the 
problem isn't in this custom class.

Plus, the class mentions Solr 5, yet you're logging this against Solr 6.

Unless and until you can show that this issue is a problem with Solr and not 
with this non-Solr code, there is little that we can do. If you would like help 
debugging this custom code, please contact one of the many Solr consulting 
services.
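
One way to gather that evidence, sketched here under stated assumptions rather 
than as a definitive procedure: send the same text to Solr's field analysis 
handler (/analysis/field) many times and check whether the query-side output 
ever changes. Repeating the run with a stock Solr tokenizer in place of the 
custom one would then show whether the variation follows the custom class. The 
base URL and collection name are hypothetical, and the JSON path read in 
fetch_query_analysis is an assumption about the response layout that should be 
checked against a real response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_analysis_url(base_url, collection, field_type, text):
    """Build a request URL for Solr's field analysis handler."""
    params = urlencode({
        "analysis.fieldtype": field_type,
        "analysis.fieldvalue": text,  # analyzed with the index-time chain
        "analysis.query": text,       # analyzed with the query-time chain
        "wt": "json",
    })
    return f"{base_url.rstrip('/')}/{collection}/analysis/field?{params}"


def fetch_query_analysis(url, field_type="my_ik"):
    """Return the query-analyzer part of one analysis response.

    Assumption: the JSON nests it under
    analysis -> field_types -> <field_type> -> query.
    """
    with urlopen(url) as resp:
        body = json.load(resp)
    return body["analysis"]["field_types"][field_type]["query"]


def distinct_outputs(fetch, url, runs=20):
    """Call fetch(url) `runs` times and return the set of distinct results.

    A set with more than one element means the analysis output varied
    between otherwise identical requests.
    """
    return {json.dumps(fetch(url), sort_keys=True) for _ in range(runs)}


# Example use against a running node (hypothetical host and collection):
#   url = build_analysis_url("http://localhost:8983/solr", "mycollection",
#                            "my_ik", "some sample text")
#   print(len(distinct_outputs(fetch_query_analysis, url)))
```

If the count printed is greater than 1 with the custom classes but stays at 1 
with a stock tokenizer, that points at the custom code rather than at Solr.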

> Tokenizer work randomly
> -----------------------
>
>                 Key: SOLR-9894
>                 URL: https://issues.apache.org/jira/browse/SOLR-9894
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: 6.2.1
>         Environment: solrcloud 6.2.1(3 solr nodes)
> OS:linux
> RAM:8G
>            Reporter: 王海涛
>            Priority: Critical
>              Labels: patch
>         Attachments: step1.png, step2.png, step3.png, step4.png
>
>
> My schema.xml has a fieldType as follows:
> <fieldType name="my_ik" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory"
>                useSmart="false"/>
>     <filter class="org.wltea.pinyin.solr5.PinyinTokenFilterFactory"
>             pinyinAll="true" minTermLength="2"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory"
>                useSmart="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> Note:
>   the index tokenizer has useSmart set to false
>   the query tokenizer has useSmart set to true
> But when I send a query request with the q parameter,
> the query tokenizer sometimes behaves as if useSmart is true
> and sometimes as if useSmart is false.
> That is terrible!
> I guess the problem may be caused by a tokenizer cache:
> when I query, the tokenizer should use true as the value of useSmart,
> but it has cached the wrong tokenizer result, the one created by the 
> IndexWriter, which uses false as the value of useSmart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
