[ https://issues.apache.org/jira/browse/SOLR-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927069#action_12927069 ]
Robert Muir commented on SOLR-2211:
-----------------------------------

Sounds great. This one has no external dependencies, so it can just be with the rest of the factories.

I'll look at starting on the ant/build-system stuff for SOLR-2210.

> Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support
> ---------------------------------------------------------------------------
>
>                 Key: SOLR-2211
>                 URL: https://issues.apache.org/jira/browse/SOLR-2211
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 3.1
>            Reporter: Tom Burton-West
>            Priority: Minor
>
> The Lucene 3.x StandardTokenizer with UAX#29 support provides benefits for
> non-English tokenizing. Presently it can be invoked by using the
> StandardTokenizerFactory and setting the Version to 3.1. However, it would
> be useful to be able to use the improved Unicode processing without
> necessarily including the IP address and email address processing of
> StandardAnalyzer. A FilterFactory that allowed the use of the
> StandardTokenizer with UAX#29 support on its own would be useful.