: After 1/2 hour of regex hacking... I think I'll stick with a two step
: process: split then trim ;)

But regex hacking is FUN!!

I'm 99% certain this does what you want...

        <tokenizer class="solr.PatternTokenizerFactory"
                   pattern="((\A\s*)|\s*?(\s+-\s+|--|,|\(|\))|\s+)\s*\z?" />

...if it doesn't, send me an example string that it fails on and tell me
what the desired output is.

Incidentally, PatternTokenizerFactory seems to have the annoying limitation
of assuming there is a token prior to each match -- even if the match
explicitly matches at the start of the string (so it creates a zero-width
token) ... that seems like a bug, right?
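
For anyone who wants to poke at the pattern outside of Solr, here is a
rough sketch using plain java.util.regex. The class and method names are
made up for illustration, and Pattern.split is only a stand-in for the
tokenizer's split mode, so it may not behave exactly like
PatternTokenizerFactory does:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class; it uses plain java.util.regex, not Solr itself.
class PatternSplitSketch {
    // The pattern from the tokenizer config, escaped for a Java string literal.
    static final Pattern DELIMS = Pattern.compile(
        "((\\A\\s*)|\\s*?(\\s+-\\s+|--|,|\\(|\\))|\\s+)\\s*\\z?");

    // Split the input on every match of the pattern.
    static String[] split(String input) {
        return DELIMS.split(input);
    }

    // Same split, but with empty (zero-width) tokens dropped afterwards.
    static List<String> nonEmptyTokens(String input) {
        List<String> tokens = new ArrayList<>();
        for (String t : split(input)) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("a - b--c, d (e) f")));
        System.out.println(Arrays.toString(split("  a, b  ")));
        System.out.println(nonEmptyTokens("  a, b  "));
    }
}
```

On Java 8 or later, the second call should come back with an empty first
element, because the pattern consumes the leading whitespace at the start
of the string. That is the same kind of zero-width token described above,
and nonEmptyTokens simply filters it out after the fact.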




-Hoss
