I found some clues on lexing using the re module in Python in an
article by Martin Löwis.


He writes:
  A scanner based on regular expressions is usually implemented
  as an alternative of all token definitions. For XPath, a
  fragment of this expressions looks like this:

      (?P<VariableReference>\\$""" + QName + """)|

  Here, each alternative in the regular expression defines a
  named group. Scanning proceeds in the following steps:

     1. Given the complete input, match the regular expression
     with the beginning of the input.
     2. Find out which alternative matched.

Item 2 is where I get stuck. There doesn't seem to be an obvious
way to do it, which I understand is a bad thing in Python.
Whatever source code went with the article originally is not
linked from the above page, so I don't know what Martin did.

Here's what I came up with (with a trivial example regex):

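  import re
  r = re.compile(r'(?P<x>x+)|(?P<a>a+)')
  m = r.match('aaxaxx')
  if m:
    for k in r.groupindex:
      # Unmatched groups are None; test for that explicitly so an
      # alternative that matched the empty string still counts.
      if m.group(k) is not None:
        # Found the token type.
        token = (k, m.group(k))
        break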

I wish I could do something obvious instead, like m.name().
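
For what it's worth, match objects in the re module have a lastgroup
attribute that looks close to what I'm wishing for: it holds the name of
the last matched capturing group, or None if that group has no name.
When every alternative is a top-level named group, as in my example,
that should be the alternative that matched. Below is a rough sketch of
a scanner loop built on it. The names TOKEN_RE and tokenize are my own
inventions for illustration, not anything from Martin's article, and I
don't know whether this is what he did.

```python
import re

# Same trivial pattern as above: each alternative is a named group.
TOKEN_RE = re.compile(r'(?P<x>x+)|(?P<a>a+)')

def tokenize(text):
    """Yield (token_type, lexeme) pairs, scanning left to right."""
    pos = 0
    while pos < len(text):
        # Step 1: match the pattern at the current position.
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError('no token matches at position %d' % pos)
        # Step 2: lastgroup names the alternative that matched.
        yield m.lastgroup, m.group()
        pos = m.end()

tokens = list(tokenize('aaxaxx'))
```

This assumes no alternative can match the empty string; if one could,
pos would never advance and the loop would need a guard for that.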

Neil Cerutti
After finding no qualified candidates for the position of principal, the
school board is pleased to announce the appointment of David Steele to the
post. --Philip Streifer
