>It would be helpful if you explained what you are trying to do.
There is a lexical part of a grammar I'm trying to get parsed. The relevant part says:

identifier-or-keyword:
    identifier-start-character identifier-part-characters(opt)

identifier-start-character:
    letter-character
    _ (the underscore character U+005F)

letter-character:
    A Unicode character of classes Lu, Ll, Lt, Lm, Lo, or Nl
    A unicode-escape-sequence representing a character of classes Lu, Ll, Lt, Lm, Lo, or Nl

combining-character:
    A Unicode character of classes Mn or Mc
    A unicode-escape-sequence representing a character of classes Mn or Mc

etc.

I'm trying to build a lexer for the grammar.

Sorry, I didn't get what UTF-8 verbatim means. I just got a bunch of question marks.

Alexander

-----
Date: Sat, 14 Mar 2009 21:10:11 -0700 (PDT)
From: Oleg Kobchenko <[email protected]>
Subject: Re: [Jprogramming] regex matching Unicode classes?
To: Programming forum <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=utf-8

It would be helpful if you explained what you are trying to do.

> What do you mean by using UTF-8 verbatim?

   load 'regex'
   T=: '? ??????? ??? ???????? ??????'  NB. test
   V=: '?????'  NB. some vowels
   runs=: ;:^:_1@,@(rxmatches rxfrom])  NB. contiguous runs
   ('[^ ',V,']+') runs T
?? ??? ? ? ??? ? ? ? ?? ?

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
