On Jun 10, 10:13 am, Peter Otten <__pete...@web.de> wrote:
> 504cr...@gmail.com wrote:
> > I wonder if you (or anyone else) might attempt a different explanation
> > for the use of the special sequence '\1' in the RegEx syntax.
> >
> > The Python documentation explains:
> >
> > \number
> > Matches the contents of the group of the same number. Groups are
> > numbered starting from 1. For example, (.+) \1 matches 'the the' or
> > '55 55', but not 'the end' (note the space after the group). This
> > special sequence can only be used to match one of the first 99 groups.
> > If the first digit of number is 0, or number is 3 octal digits long,
> > it will not be interpreted as a group match, but as the character with
> > octal value number. Inside the '[' and ']' of a character class, all
> > numeric escapes are treated as characters.
> >
> > In practice, this appears to be the key to the key device to your
> > clever solution:
> >
> > >>> re.compile(r"(\d+)").sub(r"INSERT \1", string)
> > 'abc INSERT 123 def INSERT 456 ghi INSERT 789'
> >
> > >>> re.compile(r"(\d+)").sub(r"INSERT ", string)
> > 'abc INSERT def INSERT ghi INSERT '
> >
> > I don't, however, precisely understand what is meant by "the group of
> > the same number" -- or maybe I do, but it isn't explicit. Is this just
> > a shorthand reference to match.group(1) -- if that were valid --
> > implying that the group match result is printed in the compile
> > execution?
>
> If I understand you correctly you are right. Another example:
>
> >>> re.compile(r"([a-z]+)(\d+)").sub(r"number=\2 word=\1", "a1 zzz42")
> 'number=1 word=a number=42 word=zzz'
>
> For every match of "[a-z]+\d+" in the original string "\1" in
> "number=\2 word=\1" is replaced with the actual match for "[a-z]+" and
> "\2" is replaced with the actual match for "\d+".
>
> The result, e. g. "number=1 word=a", is then used to replace the actual
> match for group 0, i. e. "a1" in the example.
>
> Peter
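
To check my understanding of the above, here is a rough sketch of the two places '\1' can appear: inside the pattern itself (the case the documentation describes) and inside the replacement template (the case in Peter's example). The names doubled, build_replacement, with_template and with_function are my own, chosen only for illustration, and the function version is just one way to spell out what the template appears to do, not a claim about how the re module works internally.

```python
import re

# 1. Inside a pattern, \1 must match the same text that group 1 captured.
#    fullmatch() is used so the whole string has to fit the pattern.
doubled = re.compile(r"(.+) \1")
print(bool(doubled.fullmatch("the the")))  # True
print(bool(doubled.fullmatch("55 55")))    # True
print(bool(doubled.fullmatch("the end")))  # False

# 2. Inside a replacement template, \1 and \2 stand for the text of
#    group 1 and group 2 of each match.
pattern = re.compile(r"([a-z]+)(\d+)")
text = "a1 zzz42"

with_template = pattern.sub(r"number=\2 word=\1", text)


def build_replacement(match):
    # Spell out the template by hand with match.group().
    return "number={} word={}".format(match.group(2), match.group(1))


# sub() also accepts a function; it is called once per match.
with_function = pattern.sub(build_replacement, text)

print(with_template)  # number=1 word=a number=42 word=zzz
print(with_function)  # number=1 word=a number=42 word=zzz
```

If that is right, then in a replacement string '\1' really does behave like a shorthand for match.group(1), evaluated separately for each match.
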

Wow! That is so cool. I had to process it for a little while to get it.

>>> import re
>>> s = '111bbb333'
>>> re.compile(r'(\d+)([b]+)(\d+)').sub(
...     r'First string: \1 Second string: \2 Third string: \3', s)
'First string: 111 Second string: bbb Third string: 333'

MRI scans would no doubt reveal that people who attain a mastery of
regular expressions have highly developed areas of the brain. I wonder
where the RegEx part of the brain might be located.

That was a really clever teaching device. I really appreciate you
taking the time to post it, Peter.

I'm definitely getting a schooling on this list. Thanks!

-- 
http://mail.python.org/mailman/listinfo/python-list