"C2C3" "C3C4"
* It also needs to filter out zero-length tokens ("").
* Digits: a digit, '+', or '#' is tokenized as a letter.
* For more information on Asian-language (Chinese, Japanese, Korean) text
* segmentation, please search http://www.google.com/search?q=word+chine
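
To make the overlapping two-character matching in the comment above concrete, here is a minimal sketch in plain Java. It is not the tokenizer's actual source: the class name CjkBigramSketch and both helper methods are hypothetical, "double-byte character" is approximated as any non-ASCII letter, and only BMP characters are handled. It follows the rules the comment states: double-byte runs are emitted as overlapping pairs, digits, '+' and '#' count as letters, and no zero-length token is ever produced. Emitting a lone double-byte character as a one-character token is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class CjkBigramSketch {

    // Hypothetical helper: ASCII letters, digits, '+' and '#' all count as "letters".
    static boolean isSingleByteTokenChar(char c) {
        return c < 128 && (Character.isLetterOrDigit(c) || c == '+' || c == '#');
    }

    // Hypothetical helper: any non-ASCII letter stands in for a double-byte character.
    static boolean isDoubleByteChar(char c) {
        return c >= 128 && Character.isLetter(c);
    }

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int start = i;
            char c = text.charAt(i);
            if (isSingleByteTokenChar(c)) {
                // A run of single-byte characters becomes one token.
                while (i < text.length() && isSingleByteTokenChar(text.charAt(i))) {
                    i++;
                }
                tokens.add(text.substring(start, i));
            } else if (isDoubleByteChar(c)) {
                // A run of double-byte characters becomes overlapping pairs.
                while (i < text.length() && isDoubleByteChar(text.charAt(i))) {
                    i++;
                }
                if (i - start == 1) {
                    // Assumption: a lone character is kept as its own token.
                    tokens.add(text.substring(start, i));
                } else {
                    for (int j = start; j + 2 <= i; j++) {
                        tokens.add(text.substring(j, j + 2));
                    }
                }
            } else {
                // Separators are skipped, so no zero-length token is emitted.
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("java 中文分词"));
    }
}
```

With the four-character run above playing the role of C1C2C3C4, the sketch yields "java" followed by the three overlapping pairs, matching the "C2C3" "C3C4" pattern quoted in the comment.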
ng the solr.war and repacking it, but, since I know dinkus
about Java, that really didn't mean a whole lot to me, even though I'm
guessing it's probably a grand total of four commands at the unix prompt.
:)
.Paul
Paul Clegg, Principal Software Engineer
My Digital Life, In