On Sat, Jul 21, 2012 at 3:19 AM, Charles Hixson <charleshi...@earthlink.net> wrote:
> On 07/20/2012 04:05 AM, Дмитрий wrote:
>>
>> Hello.
>>
>> Does IrRegex support Unicode character classes? E.g., will IrRegex consider
>> accented letters (á) or Cyrillic letters (я) as "alpha"? Will IrRegex
>> consider the Chinese wide space ( ) as "space"? Will IrRegex consider Chinese
>> brackets (「」【】) as "punct"? If it doesn't, the regexp is going to be
>> EXTREMELY messy [in fact, I believe it may be better to build such a regexp
>> automatically then].
>>
>> I'm on Windows, so I can't check it (when I use a UTF-8 console via chcp
>> 65001, Chicken for some reason seems to fail on every operation on a
>> non-ASCII string, even a simple (display "Привет")).
>>
>>
>> --
>> Yours sincerely,
>> Dmitry Kushnariov
>>
>>
>
> As I said, I'm a neophyte. My "character classes" were based around
> [a-zA-Z] etc. So you can readily see why the pattern would have quickly
> become unreasonably complex. I didn't find any definition of other
> character classes (well, not one that meant anything) and given the
> discussion, I think that they wouldn't have worked if I'd gotten to the
> point of testing them.
>
> I was planning on using Chicken to learn Scheme, since R7RS is supposed to
> be based more on R5RS than on R6RS, but maybe it's better to learn using
> Racket. I *trust* the conversion won't be too difficult. (I *do* need to
> use UTF-8 in lots of places, and an incomplete implementation while I was
> learning would be ... unpleasant. Particularly if the user documentation
> presumed that it *was* complete.)
The utf8 implementation is not incomplete. It's just not the default.

--
Alex