Hello,

I have modified the Alex lexer generator to support Unicode.
The general idea is that the state machine works on the UTF-8 representation of the text.

I am submitting my work here for review, to take as much load as possible off the maintainer (Simon Marlow). The prototype is available on GitHub:

    git://github.com/jyp/Alex.git

Be sure to:
 * check out the "utf8" branch (so that "git diff master" shows the changes);
 * do a two-stage bootstrap before testing.

Caveats:
 * The generated code depends on some UTF-8 packages.
 * There is no attempt to fix the bytestring-based wrappers.
 * Left-context recognition is no longer table-based.
 * Debug code is still present.

Bug reports, comments, and especially patches are welcome :)

Thanks,
-- JP
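
P.S. For readers wondering what "working on the UTF-8 representation" amounts to, here is a minimal sketch. It is not code from the utf8 branch: encodeUtf8Char is a hypothetical helper written only for illustration, and it assumes the input is a valid Unicode scalar value (surrogates are not rejected). It shows how a single character becomes the one to four bytes that a byte-level state machine would then consume.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- | Encode one character as its UTF-8 byte sequence (1 to 4 bytes).
-- Hypothetical helper for illustration; not taken from the utf8 branch.
-- Assumes a valid Unicode scalar value (surrogates are not rejected).
encodeUtf8Char :: Char -> [Word8]
encodeUtf8Char c
  | n < 0x80    = [fromIntegral n]                           -- 0xxxxxxx
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]                    -- 110xxxxx 10xxxxxx
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]           -- 1110xxxx + 2 continuation bytes
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]  -- 11110xxx + 3 continuation bytes
  where
    n = ord c
    -- Payload of the leading byte: the code point bits above position s.
    hi s = fromIntegral (n `shiftR` s)
    -- Continuation byte 10xxxxxx carrying the six bits starting at position s.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)
```

For example, encodeUtf8Char '\x3BB' (Greek small lambda) yields the two bytes 0xCE 0xBB, so a rule matching that character corresponds to a two-step path through the byte-level automaton.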