Thanks Steve, you are, from now on, my guru....
This is the final version, the good one!

    contents = re.sub(r'(<u>|<span style="text-decoration: underline;">)(l|L|n|N|t|T)(</span>|</u>)', r"\2'", contents)

On Wed, March 30, 2011 17:27, Steve Willoughby wrote:

On 30-Mar-11 08:21, "Andrés Chandía" wrote:
> Thanks Kushal and Steve.
> I think it works. I say "I think" because in the results I got a
> strange character instead of the letter that should appear.
>
> This is my regexp:
>
> contents = re.sub(r'(<u>|<span style="text-decoration: underline;">)(l|L|n|N|t|T)(</span>|</u>)', '\2\'' ,contents)

Remember that \2 in an ordinary string means the ASCII character with the code 002. You need to escape this with an extra backslash:

    '\\2\''

Although it would be more convenient to switch to double quotes to make the inclusion of the literal single quote easier:

    "\\2'"

How does that work? As the string is being "built", the \\ is interpreted as a literal backslash, so the actual characters in the string's value end up being:

    \2'

THAT is what is then passed into the sub() function, where \2 means "insert the text matched by the second group".

This can be made simpler still by using raw strings:

    r"\2'"

In raw strings, backslashes do almost nothing special at all, so you don't need to double them. I should have thought of that when sending my original answer to your question. Sorry I overlooked it.

--steve

_______________________
andrés chandía

_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
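
For anyone reading this thread later, here is a minimal sketch comparing the three spellings of the replacement string discussed above. The regexp is the one from the thread; the sample value of `contents` is made up for illustration and is not from the original messages.

```python
import re

# Hypothetical sample input (an assumption, not from the thread).
contents = 'a<u>l</u>b <span style="text-decoration: underline;">N</span>c'

# The pattern from the thread: group 2 captures the underlined letter.
pattern = r'(<u>|<span style="text-decoration: underline;">)(l|L|n|N|t|T)(</span>|</u>)'

# Ordinary string: '\2' is an octal escape, i.e. the single character
# with code 2, so re.sub() never sees a group reference.
broken = re.sub(pattern, '\2\'', contents)

# Doubled backslash: the string value is \2' and re.sub() reads \2
# as a reference to group 2.
escaped = re.sub(pattern, "\\2'", contents)

# Raw string: same value as the doubled-backslash version.
raw = re.sub(pattern, r"\2'", contents)

print(repr(broken))
print(repr(escaped))
print(repr(raw))
```

The first call leaves a control character where the letter should be, which would explain the "strange character" in the results; the other two keep the letter and append the apostrophe.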