Your example string is "\n; BB#0:\n"

So, I'd expect the lexer to match:
- whitespace
- line-comment

Yes, `block-comment` matches, but `line-comment` matches too, and its match is one character longer because it also consumes the trailing newline. The longest match wins, so you get COMMENT.

On Thu, Jul 24, 2014 at 12:46 AM, Mangpo Phitchaya Phothilimthana wrote:
> Hi,
>
> I try to write a lexer and parser, but I cannot figure out how to set
> priority to lexer's tokens. My simplified lexer (shown below) has only 2
> tokens BLOCK, and COMMENT. BLOCK is in fact a subset of COMMENT. BLOCK
> appears first in the lexer, but when I parse something that matches BLOCK,
> it always matches to COMMENT instead. Below is my program. In this
> particular example, I expect to get a BLOCK token, but I get COMMENT token
> instead. If I comment out (line-comment (token-COMMENT lexeme)) in the
> lexer, I then get the BLOCK token.
>
> Can anyone tell me how to work around this issue? I can only find this in
> the documentation
> "When multiple patterns match, a lexer will choose the longest match,
> breaking ties in favor of the rule appearing first."
>
> #lang racket
>
> (require parser-tools/lex
>          (prefix-in re- parser-tools/lex-sre)
>          parser-tools/yacc)
>
> (define-tokens a (BLOCK COMMENT))
> (define-empty-tokens b (EOF))
>
> (define-lex-trans number
>   (syntax-rules ()
>     ((_ digit)
>      (re-: (uinteger digit)
>            (re-? (re-: "." (re-? (uinteger digit))))))))
>
> (define-lex-trans uinteger
>   (syntax-rules ()
>     ((_ digit) (re-+ digit))))
>
> (define-lex-abbrevs
>   (block-comment (re-: "; BB#" number10 ":"))
>   (line-comment (re-: ";" (re-* (char-complement #\newline)) #\newline))
>   (digit10 (char-range "0" "9"))
>   (number10 (number digit10)))
>
> (define my-lexer
>   (lexer-src-pos
>    (block-comment (token-BLOCK lexeme))
>    (line-comment (token-COMMENT lexeme))
>    (whitespace (position-token-token (my-lexer input-port)))
>    ((eof) (token-EOF))))
>
> (define my-parser
>   (parser
>    (start code)
>    (end EOF)
>    (error
>     (lambda (tok-ok? tok-name tok-value start-pos end-pos)
>       (raise-syntax-error 'parser
>                           (format "syntax error at '~a' in src l:~a c:~a"
>                                   tok-name
>                                   (position-line start-pos)
>                                   (position-col start-pos)))))
>    (tokens a b)
>    (src-pos)
>    (grammar
>     (unit ((BLOCK) $1)
>           ((COMMENT) $1))
>     (code ((unit) (list $1))
>           ((unit code) (cons $1 $2))))))
>
> (define (lex-this lexer input)
>   (lambda ()
>     (let ([token (lexer input)])
>       (pretty-display token)
>       token)))
>
> (define (ast-from-string s)
>   (let ((input (open-input-string s)))
>     (ast input)))
>
> (define (ast input)
>   (my-parser (lex-this my-lexer input)))
>
> (ast-from-string "
> ; BB#0:
> ")
>
> ____________________
>   Racket Users list:
>   http://lists.racket-lang.org/users
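
One possible workaround, given the longest-match rule quoted from the documentation, is to stop `line-comment` from consuming the trailing newline. Both patterns then match the same seven characters of "; BB#0:", and the tie goes to the rule listed first. The following is only a minimal sketch under some assumptions: `sketch-lexer` and `token-names` are hypothetical names introduced for illustration, the number pattern is simplified to plain digits, and `lexer` is used instead of `lexer-src-pos` to keep the example short.

```racket
#lang racket

(require parser-tools/lex
         (prefix-in re- parser-tools/lex-sre))

(define-tokens a (BLOCK COMMENT))
(define-empty-tokens b (EOF))

(define-lex-abbrevs
  (digit10 (char-range "0" "9"))
  ;; Simplified stand-in for the original number10: digits only.
  (block-comment (re-: "; BB#" (re-+ digit10) ":"))
  ;; Unlike the original, this pattern does NOT consume the newline,
  ;; so it can no longer out-match block-comment by one character.
  (line-comment (re-: ";" (re-* (char-complement #\newline)))))

;; Hypothetical lexer for illustration; the newline left behind by
;; line-comment is skipped by the whitespace rule.
(define sketch-lexer
  (lexer
   (block-comment (token-BLOCK lexeme))
   (line-comment (token-COMMENT lexeme))
   (whitespace (sketch-lexer input-port))
   ((eof) (token-EOF))))

;; Hypothetical helper: lex a string and collect the token names up to EOF.
(define (token-names s)
  (define in (open-input-string s))
  (let loop ()
    (define name (token-name (sketch-lexer in)))
    (if (eq? name 'EOF)
        '()
        (cons name (loop)))))

(token-names "\n; BB#0:\n")
(token-names "; just a comment\n")
```

With these definitions the first call should yield a single BLOCK token and the second a single COMMENT token. Note that a line such as "; BB#0: trailing text" would still lex as COMMENT, since `line-comment` matches more characters there; whether that is acceptable depends on the input format.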

