[ 
https://issues.apache.org/jira/browse/SOLR-211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley updated SOLR-211:
-------------------------------

    Attachment: SOLR-211-RegexSplitTokenizer.patch

Thanks for the quick feedback!

Here is an updated version that:

1. uses a compiled Pattern
2. uses matcher.find() to set the proper start and end offsets
3. is called PatternSplitTokenizerFactory
4. has tests that make sure the output is the same as you would get with
string.split( pattern )
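
For readers without the patch at hand, the approach in the list above can be sketched roughly as follows. This is not the code from the attached patch: the class PatternSplitSketch, the Tok holder, and the split() helper are hypothetical names made up for illustration, and it uses plain java.util.regex rather than Solr's Tokenizer API. It only shows how a precompiled Pattern plus matcher.find() can yield each token together with its start and end offsets, while aiming to return the same strings as string.split( pattern ) on Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; names here are assumptions, not the patch's API.
final class PatternSplitSketch {

    /** Hypothetical token holder: the text plus its offsets in the input. */
    static final class Tok {
        final String text;
        final int start; // inclusive offset into the input
        final int end;   // exclusive offset into the input

        Tok(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Splits input around matches of an already compiled pattern. */
    static List<Tok> split(Pattern pattern, String input) {
        List<Tok> tokens = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        int pos = 0; // where the next token begins

        while (matcher.find()) {
            // A zero-width match at the very start yields no leading
            // empty token (String.split behavior since Java 8).
            if (matcher.end() == 0) {
                continue;
            }
            // The token is the text between the previous delimiter
            // and the start of this one.
            tokens.add(new Tok(input.substring(pos, matcher.start()),
                               pos, matcher.start()));
            pos = matcher.end();
        }

        // No usable delimiter found: the whole input is one token.
        if (tokens.isEmpty()) {
            tokens.add(new Tok(input, 0, input.length()));
            return tokens;
        }

        // Text after the last delimiter.
        tokens.add(new Tok(input.substring(pos), pos, input.length()));

        // Like String.split with limit 0, drop trailing empty tokens.
        while (!tokens.isEmpty()
                && tokens.get(tokens.size() - 1).text.isEmpty()) {
            tokens.remove(tokens.size() - 1);
        }
        return tokens;
    }
}
```

Compiling the Pattern once and reusing it avoids recompiling the regex for every field value, and taking offsets from matcher.start() and matcher.end() keeps them tied to positions in the original input. A real tokenizer would read from a Reader and emit Lucene tokens instead of returning a List.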



> regex split() Tokenizer
> -----------------------
>
>                 Key: SOLR-211
>                 URL: https://issues.apache.org/jira/browse/SOLR-211
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Ryan McKinley
>         Attachments: SOLR-211-RegexSplitTokenizer.patch, 
> SOLR-211-RegexSplitTokenizer.patch
>
>
> A TokenizerFactory that makes tokens from:
>   string.split( regex );

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
