On 02/03/2011 02:15 PM, Peter Otten wrote:
Karim wrote:
I am trying to substitute a '""' pattern with '\"\"', namely to escape 2
consecutive double quotes:
* *In Python interpreter:*
$ python
Python 2.7.1rc1 (r271rc1:86455, Nov 16 2010, 21:53:40)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> expression = ' "" '
>>> re.subn(r'([^\\])?"', r'\1\\"', expression)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/karim/build/python/install/lib/python2.7/re.py", line 162, in subn
    return _compile(pattern, flags).subn(repl, string, count)
  File "/home/karim/build/python/install/lib/python2.7/re.py", line 278, in filter
    return sre_parse.expand_template(template, match)
  File "/home/karim/build/python/install/lib/python2.7/sre_parse.py", line 787, in expand_template
    raise error, "unmatched group"
sre_constants.error: unmatched group
But if I remove '?' I get the following:
>>> re.subn(r'([^\\])"', r'\1\\"', expression)
(' \\"" ', 1)
Only one substitution... _But this is not the same regex._ And
count=2 does nothing. By default all occurrences should be substituted.
* *On linux using my good old sed command, it is working with my '?'
(0-1 match):*
$ echo ' "" ' | sed 's/\([^\\]\)\?"/\1\\"/g'
\"\"
*Indeed what's the matter with RE module!?*
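A note on the "unmatched group" error above, offered as a minimal sketch
rather than a definitive fix. When the optional group ([^\\])? matches
nothing, group 1 takes no part in the match, so the \1 in the replacement
template has nothing to expand; Python 2.7 raises the error in that case
(to my knowledge, Python 3.5 and later substitute an empty string
instead). Without the '?', every quote needs a character in front of it,
and because matches cannot overlap, the first quote is already consumed
when the second one is reached, hence the single substitution. A
replacement function sidesteps both problems by treating the missing
group as empty. The names PATTERN and escape_quote are made up for this
illustration.

```python
import re

# Same pattern as in the question: an optional non-backslash character,
# followed by a double quote.
PATTERN = re.compile(r'([^\\])?"')


def escape_quote(match):
    # group(1) is None when the optional group did not participate.
    prefix = match.group(1) or ''
    return prefix + '\\"'


result = PATTERN.subn(escape_quote, ' "" ')
print(result)  # (' \\"\\" ', 2)
```

Both quotes are replaced here, which is the same result the sed command
gives for this input.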
You should really fix the problem with your email program first;
It is a Thunderbird issue with bold type (it appears as stars), but I
don't know how to fix it yet.
afterwards
it's probably a good idea to try and explain your goal clearly, in plain
English.
I already did (cf. the earlier mails in this thread). But to summarize:
I pass the expression string to a Tcl command, which delimits strings
with double quotes only.
As a result I get an error with nested double quotes => that's the key
problem.
Yes. What Steven said ;)
Now to your question as stated: if you want to escape two consecutive double
quotes that can be done with
s = s.replace('""', r'\"\"')
I have already done that as a workaround, but I have to add another
replacement beforehand to cover all the other cases.
I want to make the original command work so that I can drop the workaround.
but that's probably *not* what you want. Assuming you want to escape two
consecutive double quotes and make sure that the first one isn't already
escaped,
You hit it !:-)
this is my attempt:
>>> def sub(m):
...     s = m.group()
...     return r'\"\"' if s == '""' else s
...
>>> print re.compile(r'[\\].|""').sub(sub, r'\\\"" \\"" \"" "" \\\" \\" \"')
That is not what I want. I want to escape any " that is not
already escaped.
The sed regex '/\([^\\]\)\?"/\1\\"/g' is exactly what I need (I have
been writing regexes on Unix for 15 years).
For me the equivalent Python regex is buggy: r'([^\\])?"', r'\1\\"'
Why is '?' not accepted? It means a character that is not a backslash,
with 0 or 1 occurrence. This is quite simple.
I am a poor tradesman but I don't deny evidence.
Regards
Karim
\\\"" \\\"\" \"" \"\" \\\" \\" \"
Compare that with
$ echo '\\\"" \\"" \"" "" \\\" \\" \"' | sed 's/\([^\\]\)\?"/\1\\"/g'
\\\"\" \\"\" \"\" \"\" \\\\" \\\" \\"
Concerning the exception and the discrepancy between sed and python's re, I
suggest that you ask it again on comp.lang.python aka the python-list
mailing list where at least one regex guru will read it.
Peter
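For the goal stated in the thread, escaping every double quote that is
not already escaped, one possible approach is sketched below. It is not
something either poster proposed in this form; it extends the tokenizing
idea of the sub() attempt by scanning for either a backslash together
with the character it escapes, or a bare quote, and rewriting only the
bare quotes. The function name escape_bare_quotes is hypothetical, and
the sketch assumes that a backslash always escapes exactly the one
character that follows it.

```python
import re

# Either an existing escape (backslash plus any one character)
# or a bare double quote. The escape alternative is tried first.
_TOKEN = re.compile(r'\\.|"', re.DOTALL)


def escape_bare_quotes(text):
    def _repl(match):
        token = match.group()
        # Only a lone quote is rewritten; existing escapes pass through.
        return r'\"' if token == '"' else token
    return _TOKEN.sub(_repl, text)


print(escape_bare_quotes(' "" '))
print(escape_bare_quotes(r'\\\"" \\"" \"" "" \\\" \\" \"'))
```

Note that this differs from the sed output quoted above: an already
escaped quote such as \\\" is left unchanged here, whereas the sed
expression turns it into \\\\".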
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor