Hello,

On 01-07-2009, Andreas Rossberg <rossb...@mpi-sws.org> wrote:
> Mike Lin wrote:
>> OK, now I'm curious :) how does your lexer match balanced parentheses,
>> or in this case comments?
>>
>
> Easily, with a bit of side effects (I think that's roughly how all ML
> compilers do it):
>
> ------------------------------------------------
> let error l s = (* ... *)
> let commentDepth = ref 0
> let start = ref 0
> let loc length = let pos = !start in (pos, pos+length)
>
> rule lex =
>    parse eof   { EOF }
>    (* | ... *)
>    | "{-"      { start := pos lexbuf;
>                  lexNestComment lexbuf }
>
> and lexNestComment =
>    parse eof   { error (loc 2) "unterminated comment" }
>    | "(*"      { incr commentDepth;
>                  lexNestComment lexbuf }
>    | "*)"      { decr commentDepth;
>                  if !commentDepth > 0
>                  then lexNestComment lexbuf
>                  else lex lexbuf }
>    | _         { lexNestComment lexbuf }
> ------------------------------------------------
>
> If you also want to treat strings in comments specially (like OCaml),
> then you need to do a bit more work, but it's basically the same idea.
>
May I suggest writing this more simply:

-------------------------------------------------------------------------
rule lex =
   parse eof   { () }
   | "(*"      { start := pos lexbuf;
                 lexNestComment lexbuf;
                 lex lexbuf }

and lexNestComment =
   parse eof   { error (loc 2) "unterminated comment" }
   | "(*"      { lexNestComment lexbuf;    (* skip the inner comment *)
                 lexNestComment lexbuf }   (* then resume the outer one *)
   | "*)"      { () }
   | _         { lexNestComment lexbuf }
-------------------------------------------------------------------------

I think it works the same way, except that it uses fewer global
variables: the nesting depth is carried by the recursion itself rather
than by the commentDepth counter.

Regards,
Sylvain Le Gall
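
For anyone who wants to try the recursive idea without running ocamllex, here is a minimal sketch in plain OCaml that works on a string instead of a lexbuf. The names skip_comment, strip_comments and Unterminated_comment are made up for this illustration and do not come from the code above. The point it tries to show is that each nested "(*" is handled by one recursive call that returns at its matching "*)", after which scanning of the enclosing comment has to continue.

```ocaml
(* Illustrative sketch only: hypothetical helpers, not part of the lexer
   discussed in this thread. *)
exception Unterminated_comment

(* [skip_comment s i] assumes [i] is the index just after an opening "(*".
   It returns the index just after the matching "*)". *)
let rec skip_comment (s : string) (i : int) : int =
  let n = String.length s in
  if i + 1 >= n then raise Unterminated_comment
  else if s.[i] = '(' && s.[i + 1] = '*' then
    (* Nested opener: skip the inner comment first, then keep scanning
       the enclosing comment from where the inner one ended. *)
    skip_comment s (skip_comment s (i + 2))
  else if s.[i] = '*' && s.[i + 1] = ')' then i + 2
  else skip_comment s (i + 1)

(* [strip_comments s] copies [s] while dropping every (possibly nested)
   comment. String literals get no special treatment in this sketch. *)
let strip_comments (s : string) : string =
  let n = String.length s in
  let buf = Buffer.create n in
  let rec go i =
    if i >= n then ()
    else if i + 1 < n && s.[i] = '(' && s.[i + 1] = '*' then
      go (skip_comment s (i + 2))
    else begin
      Buffer.add_char buf s.[i];
      go (i + 1)
    end
  in
  go 0;
  Buffer.contents buf
```

No depth counter is needed because the call stack plays that role. The price is that the stack grows with the nesting depth of the comments, which should rarely matter in practice.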