Re: Where regexs listed for Python language's tokenizer/lexer?

Robert Kern Sat, 12 Sep 2009 16:08:55 -0700

Dennis Lee Bieber wrote:

> On Fri, 11 Sep 2009 23:10:39 -0700 (PDT), Chris Seberino
> <[email protected]> declaimed the following in
> gmane.comp.python.general:
>
>> Where regexs listed for Python language's tokenizer/lexer?
>>
>> If I'm not mistaken, the grammar is not sufficient to specify the
>> language....
>> you also need to specify the regexs that define the tokens
>> right?..where is that?
>
>         Pardon... I've been out of the "market", but I don't recall EVER
> seeing a "regex" used in a textbook for compiler/interpreter design.
>
>         BNF (or Pascal's bubble diagram equivalent) has always been used to
> define the syntactical components in those books in my possession, and
> parsers (tokenizers) were written using those implied algorithms (if the
> first character is numeric or "." it starts a number, otherwise treat it
> as an identifier, etc.),

In actual implementations of lexers and the lexical analysis components of parsers, regexes are fairly common. For example, from ply:


  http://www.dabeaz.com/ply/ply.html#ply_nn6
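
To give a rough idea of what a regex-driven lexer looks like, here is a minimal sketch that uses only the standard library's re module rather than ply itself. The token names, the patterns, and the tokenize() helper are all made up for illustration; they are not Python's actual lexical rules and not ply's API. The NUMBER pattern mirrors the hand-written rule quoted above (a digit or "." starts a number).

```python
import re

# Hypothetical token set for a tiny expression language.
# Order matters: earlier alternatives win when several could match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d*)?|\.\d+"),  # 42, 3.14, 3., .5
    ("NAME",   r"[A-Za-z_]\w*"),         # identifiers
    ("OP",     r"[+\-*/=()]"),           # single-character operators
    ("SKIP",   r"\s+"),                  # whitespace, discarded
    ("ERROR",  r"."),                    # any other single character
]

# Combine everything into one pattern with a named group per token type.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def tokenize(text):
    """Return a list of (token_type, lexeme) pairs for text."""
    tokens = []
    for match in MASTER.finditer(text):
        kind = match.lastgroup  # name of the alternative that matched
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        tokens.append((kind, match.group()))
    return tokens
```

Calling tokenize("x = 3.14 * (y + .5)") yields NAME, OP and NUMBER pairs in source order, and a stray character such as "$" raises SyntaxError. A generator like ply adds more on top of this (line tracking, per-token actions, and so on), but the core idea of one regex per token type is the same.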

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco

--
http://mail.python.org/mailman/listinfo/python-list
