Re: [Tutor] Regex question

Andrés Chandía Wed, 30 Mar 2011 09:51:49 -0700


Thanks Steve, you are, from now on, my guru...


This is the final version, the good one!

contents = re.sub(r'(<u>|<span style="text-decoration: underline;">)(l|L|n|N|t|T)(</span>|</u>)', r"\2'", contents)
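
As a quick check, here is a minimal sketch that runs the pattern above on a made-up sample string (the real input is not shown in this thread, so the value of `contents` is an assumption for illustration only):

```python
import re

# Hypothetical sample input; the actual data is not shown in the thread.
contents = 'de <u>l</u>a, <span style="text-decoration: underline;">N</span>o, <u>x</u>'

# Group 1: opening tag, group 2: one of the letters l/L/n/N/t/T, group 3: closing tag.
pattern = r'(<u>|<span style="text-decoration: underline;">)(l|L|n|N|t|T)(</span>|</u>)'

# Keep only the letter (group 2) and append an apostrophe.
result = re.sub(pattern, r"\2'", contents)
print(result)
```

Only the underlined letters listed in the second group are rewritten; the `<u>x</u>` at the end is left alone because `x` is not in that group.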


On Wed, March 30, 2011 17:27, Steve Willoughby wrote:
On 30-Mar-11 08:21, "Andrés Chandía" wrote:
>
> Thanks Kushal and Steve.
> I think it works. I say "I think" because in the results I got a
> strange character instead of the letter that should appear.
>
> This is my regexp:
>
> contents = re.sub(r'(<u>|<span style="text-decoration: underline;">)(l|L|n|N|t|T)(</span>|</u>)', '\2\'' ,contents)

Remember that \2 in an ordinary (non-raw) string literal is an octal
escape: it means the ASCII character with the code 002.  You need to
escape this with an extra backslash:
        '\\2\''
Although it would be more convenient to switch to double quotes to make
the inclusion of the literal single quote easier:
        "\\2'"

How does that work?  As the string is being "built", the \\ is
interpreted as a literal backslash, so the actual characters in the
string's value end up being:
        \2'
THAT is what is then passed into the sub() function, where \2 means to
substitute the text matched by the second group.
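
To make the difference visible, here is a small sketch that inspects the characters in each replacement string and then uses both with sub(). The short pattern and the sample text are made up for illustration; they are not the ones from the original question.

```python
import re

broken = '\2\''   # Python consumes \2 itself: chr(2) followed by a quote
fixed = '\\2\''   # \\ becomes one backslash: the characters \, 2 and '

# Show the code of every character that actually ends up in each string.
print([hex(ord(c)) for c in broken])
print([hex(ord(c)) for c in fixed])

# Hypothetical, simplified pattern and input.
text = '<u>L</u>'
pat = r'(<u>)(L)(</u>)'

broken_result = re.sub(pat, broken, text)  # inserts the control character
fixed_result = re.sub(pat, fixed, text)    # inserts group 2 plus a quote
print(repr(broken_result))
print(repr(fixed_result))
```

The control character in the first result is the "strange character" that showed up in place of the letter.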

This can be yet simpler by using raw strings:
        r"\2'"

In raw strings, backslashes do almost nothing special at all, so
you don't need to double them.
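
A short sketch of what that means in practice, including one case behind the word "almost" (the variable names here are just for illustration):

```python
plain = "\\2'"   # doubled backslash in an ordinary string
raw = r"\2'"     # same three characters written as a raw string

print(plain == raw)
print(len(raw))

# The "almost": in a raw string a backslash still keeps the following
# quote from ending the literal, but the backslash itself stays in the value.
tricky = r'\''
print(len(tricky))
print(tricky == chr(92) + "'")
```

So the raw form and the doubled form produce the same string; the raw one is simply easier to read and to type correctly.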

I should have thought of that when sending my original answer to your
question.  Sorry I overlooked it.

--steve





_______________________
            andrés chandía

Do not print unnecessarily. Take care of the environment!


_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
