inhahe wrote:
Can someone tell me why this doesn't work?

colorre = re.compile ('('
                        '^'
                       '|'
                        '(?:'
                           '\x0b(?:10|11|12|13|14|15|0\\d|\\d)'
                           '(?:'
                              ',(?:10|11|12|13|14|15|0\\d|\\d)'
                           ')?'
                        ')'
                      ')(.*?)')

I'm trying to extract mirc color codes.

this works:

colorre = re.compile ('\x0b(?:10|11|12|13|14|15|0\\d|\\d)'
                      '(?:'
                         ',(?:10|11|12|13|14|15|0\\d|\\d)'
                      ')?'
                      )

but I wanted to modify it so that it returns groups of (color code, text after the code), except for the text at the beginning of the string before any color code, for which it should return ('', text). that's what the first paste above is trying to do, but it doesn't work. here are some results:

 >>> colorre.findall('a\x0b1,1')
[('', ''), ('\x0b1,1', '')]
 >>> colorre.findall('a\x0b1,1b')
[('', ''), ('\x0b1,1', '')]
 >>> colorre.findall('ab')
[('', '')]
 >>> colorre.findall('\x0b1,1')
[('', '')]
 >>> colorre.findall('\x0b1,1a')
[('', '')]
 >>>

i can easily work with the string that does work and just use group starting and ending positions, but i'm curious as to why i can't get it working the way i want :/

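The problem with the regex is that .*? is a lazy repeat: it tries to
match as few characters as possible, and since nothing follows it in
the pattern, that is always zero characters, which is why the second
group is always ''. Try a greedy repeat instead, but one that matches
only characters other than \x0b, so it stops at the next color code: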

colorre = re.compile('('
                       '^'
                      '|'
                       '(?:'
                          '\x0b(?:10|11|12|13|14|15|0\\d|\\d)'
                          '(?:'
                             ',(?:10|11|12|13|14|15|0\\d|\\d)'
                          ')?'
                       ')'
                     ')([^\x0b]*)')
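
One caveat with the pattern above: when the string starts with a color
code, the '^' alternative still produces an empty match at position 0
first, and how findall continues after an empty match has, as far as I
know, differed between Python versions. If that gets in the way, a
possible alternative is to keep the simpler pattern that already works
and let re.split do the pairing. The sketch below is only an
illustration: the names NUM, CODE and split_colors are made up for this
example, and NUM is a shortened but equivalent spelling of the
10|11|12|13|14|15|0\d|\d alternation. It also shows the lazy vs. greedy
difference on a tiny pattern.

```python
import re

# One color number as in the patterns above: 10-15, 0 plus a digit, or one digit.
NUM = r'(?:1[0-5]|0\d|\d)'
# A color code: \x0b, a number, and an optional ",number" part.
CODE = '\x0b' + NUM + '(?:,' + NUM + ')?'

def split_colors(text):
    """Return [(code, text_after_code), ...], starting with ('', leading text)."""
    # With a capturing group, re.split keeps the separators, giving
    # [text0, code1, text1, code2, text2, ...]
    parts = re.split('(' + CODE + ')', text)
    return [('', parts[0])] + list(zip(parts[1::2], parts[2::2]))

# Lazy vs. greedy when the repeat is the last thing in the pattern:
print(repr(re.match('a(.*?)', 'abc').group(1)))  # lazy: ''
print(repr(re.match('a(.*)', 'abc').group(1)))   # greedy: 'bc'

print(split_colors('a\x0b1,1b'))  # [('', 'a'), ('\x0b1,1', 'b')]
print(split_colors('\x0b1,1a'))   # [('', ''), ('\x0b1,1', 'a')]
print(split_colors('ab'))         # [('', 'ab')]
```

Because re.split never has to match an empty string here, the result
always begins with the ('', leading text) pair, even when the leading
text is empty.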
--
http://mail.python.org/mailman/listinfo/python-list
