subject:"Some questions about StandardTokenizer and UNICODE Regular Expressions"

Re:Re: Some questions about StandardTokenizer and UNICODE Regular Expressions

2016-06-16 Thread dr

Thank you so much, Steve. Your reply is very helpful.

At 2016-06-16 23:01:18, "Steve Rowe" wrote:
>Hi dr,
>
>Unicode’s character property model is described here:
>.
>
>Wikipedia has a description of Unicode character properties:
>

Re: Some questions about StandardTokenizer and UNICODE Regular Expressions

2016-06-16 Thread Steve Rowe

Hi dr,

Unicode’s character property model is described here: .

Wikipedia has a description of Unicode character properties:

JFlex allows you to refer to the set of characters that have a given Unicode
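To make the idea of a "set of characters that have a given Unicode property" concrete, here is a minimal sketch using `java.util.regex` rather than JFlex itself. JFlex's own syntax is not reproduced here, and the class name `UnicodePropertyDemo` and its `matches` helper are made up for illustration. It assumes only the standard `\p{...}` classes that Java's `Pattern` supports: a general category (`\p{L}`), a script (`\p{IsHan}`), and a binary property (`\p{IsAlphabetic}`).

```java
import java.util.regex.Pattern;

public class UnicodePropertyDemo {
    // General category L: any letter.
    static final Pattern LETTER = Pattern.compile("\\p{L}");
    // Script property: characters in the Han script.
    static final Pattern HAN = Pattern.compile("\\p{IsHan}");
    // Binary property Alphabetic: letters plus some other characters,
    // such as letter numbers (general category Nl).
    static final Pattern ALPHABETIC = Pattern.compile("\\p{IsAlphabetic}");

    // Hypothetical helper: true if the single code point belongs to the set.
    static boolean matches(Pattern p, int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        int[] samples = { 'a', '1', 0x4E2D, 0x2167 }; // a, 1, a CJK ideograph, a Roman numeral
        for (int cp : samples) {
            System.out.printf("U+%04X letter=%b han=%b alphabetic=%b%n",
                    cp,
                    matches(LETTER, cp),
                    matches(HAN, cp),
                    matches(ALPHABETIC, cp));
        }
    }
}
```

The Roman numeral U+2167 is included because it is Alphabetic without being a letter, which shows that these property sets overlap but are not the same thing.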

Some questions about StandardTokenizer and UNICODE Regular Expressions

2016-06-16 Thread dr

Hi guys,

Currently, I'm looking into the rules of StandardTokenizer, but I have run into some problems. As the docs say, StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. Also, it is generated by JFlex, a lexer/sc