[ https://issues.apache.org/jira/browse/SOLR-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756592#action_12756592 ]
Robert Muir commented on SOLR-1336:
-----------------------------------

Thanks. So do we want a contrib (which would mostly just be the jar file plus the two factories), or should it go in example/solr/lib? If we do the latter, where should I put the factories?

The factories could be useful if someone wants the Chinese analysis to work a little differently. For example, SmartChineseAnalyzer applies Porter stemming to English, but someone might not want that.

> Add support for lucene's SmartChineseAnalyzer
> ---------------------------------------------
>
>                 Key: SOLR-1336
>                 URL: https://issues.apache.org/jira/browse/SOLR-1336
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Robert Muir
>         Attachments: SOLR-1336.patch, SOLR-1336.patch, SOLR-1336.patch
>
>
> SmartChineseAnalyzer was contributed to Lucene; it indexes Simplified Chinese
> text as words.
> If the factories for the tokenizer and word token filter are added to Solr, it
> can be used, although there should be a sample config or wiki entry showing
> how to apply the built-in stopwords list.
> This is because the list doesn't contain actual stopwords, but it must be used
> to prevent indexing punctuation.
> Note: we did some refactoring/cleanup on this analyzer recently, so it would
> be much easier to do this after the next Lucene update.
> It has also been moved out of -analyzers.jar due to size, and now builds in
> its own smartcn jar file, so that would need to be added if this feature is
> desired.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.