On 2007-01-10, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> Neil Cerutti wrote:
>> I found some clues on lexing using the re module in Python in
>> an article by Martin Löwis.
>
>> Here, each alternative in the regular expression defines a
>> named group. Scanning proceeds in the following steps:
>>
>> 1. Given the complete input, match the regular expression
>>    with the beginning of the input.
>> 2. Find out which alternative matched.
>
> you can use lastgroup, or lastindex:
>
> http://effbot.org/zone/xml-scanner.htm
>
> there's also a "hidden" ready-made scanner class inside the SRE
> module that works pretty well for simple cases; see:
>
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/457664
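For the archive, here is a minimal sketch of the scanning steps quoted above, using lastgroup to find out which alternative matched. The token names (NUMBER, NAME, OP, SKIP) and the tokenize function are my own illustration, not taken from the article or from either linked page.

```python
import re

# Hypothetical token set, for illustration only. Each alternative is
# wrapped in exactly one named group, so lastgroup identifies it.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+)
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<OP>[-+*/=()])
  | (?P<SKIP>\s+)
""", re.VERBOSE)

def tokenize(text):
    """Yield (kind, value) pairs, matching at successive positions."""
    pos = 0
    while pos < len(text):
        # Step 1: match the expression at the current start of the input.
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character %r at position %d"
                             % (text[pos], pos))
        # Step 2: lastgroup names the alternative that matched.
        kind = m.lastgroup
        if kind != "SKIP":
            yield kind, m.group()
        pos = m.end()

tokens = list(tokenize("x = 42 + y"))
```

This only works because no alternative contains a further capturing group; a nested group inside an alternative could change what lastgroup reports.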
Thanks for the excellent pointers.

I got tripped up:

>>> m = re.match('(a+(b*)a+)', 'abbbbaa')
>>> dir(m)
['__copy__', '__deepcopy__', 'end', 'expand', 'group', 'groupdict',
 'groups', 'span', 'start']

There are some notable omissions there. That's not much of an excuse for my not understanding the handy docs, but I guess it can function as a warning against relying on the interactive help.

I'd seen the lastgroup definition in the documentation, but I didn't realize it was exactly what I needed. I didn't think carefully enough about what "last matched capturing group" actually meant, given my regex. I don't think I saw "name" there either. ;-)

  lastgroup

    The name of the last matched capturing group, or None if the
    group didn't have a name, or if no group was matched at all.

-- 
Neil Cerutti
We dispense with accuracy --sign at New York drug store
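P.S. A small sketch of what lastindex and lastgroup report for the regex in this message. The named variant, with the group names outer and inner, is my own addition for comparison.

```python
import re

# The regex from the message: two unnamed, nested capturing groups.
m = re.match('(a+(b*)a+)', 'abbbbaa')

# The attributes exist even if dir(m) does not list them.
last_index = m.lastindex   # index of the last matched capturing group
last_group = m.lastgroup   # None here, because the groups have no names

# Same pattern with hypothetical group names added.
named = re.match('(?P<outer>a+(?P<inner>b*)a+)', 'abbbbaa')
named_last = named.lastgroup

print(last_index, last_group, named_last)
```

With nested groups the outer group is the last one to finish matching, so it is the one reported, which is why a lexer wants one flat named group per alternative.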