On Fri, Sep 4, 2009 at 1:34 PM, GoodPotatoes<goodpotat...@yahoo.com> wrote:
> I simply want to remark out all non-word characters read from a line.
>
> Line:
> Q*bert says "#...@!$%  "
>
> in Perl
> #match each non-word character, add "\" before it, globally.
>
> s/(\W)/\\$1/g;
>
> output:
> Q\*bert\ says\ \"\...@\!\$\%\ \ \"  #perfect!
>
> Is there something simple like this in python?
>
> I would imagine:
> foo='Q*bert says "#...@!$%  "'
> pNw=re.compile('(\W)')
> re.sub(pNw,'\\'+(match of each non-word character),foo)
>
> How do I get the match into this function?  Is there a different way to do
> this?

Like this:

>>> import re
>>> line = 'Q*bert says "#...@!$%  "'
>>> pattern = re.compile(r"(\W)")

>>> re.sub(pattern, r"\\\1", line)
'Q\\*bert\\ says\\ \\"\\...@\\!\\$\\%\\ \\ \\"'

Note that the first result shows each backslash doubled, because the
interactive prompt displays the repr of the string.  The string itself
contains single backslashes.  If you print it instead, you'll see what
you expect:

>>> print(re.sub(pattern, r"\\\1", line))
Q\*bert\ says\ \"\...@\!\$\%\ \ \"

>>>
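
To make the repr-versus-print difference concrete, here is a small
sketch.  The sample string is a made-up stand-in for the original line,
not the poster's exact input; the point is only that the doubled
backslashes exist in the repr, not in the string itself.

```python
import re

# Hypothetical sample line, chosen for illustration only.
line = 'Q*bert says "hi!"'
escaped = re.sub(r"(\W)", r"\\\1", line)

# repr() is what the interactive prompt shows: backslashes appear doubled.
print(repr(escaped))

# print() shows the actual characters: one backslash per non-word character.
print(escaped)

# The length grows by one per non-word character, not by two.
non_word_count = len(re.findall(r"\W", line))
print(len(escaped) - len(line) == non_word_count)
```

If the backslashes were really doubled in the data, the length
difference would be twice the number of non-word characters.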

When you're using re.sub, \1 in the replacement string stands for the
text matched by the first parenthesized group in the pattern.  It's
also helpful to use raw strings when you're using the Python re
module, since both Python string literals and the regular expression
syntax use '\' as an escape character.  (See the top of
http://docs.python.org/library/re.html for more details.)
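
If you'd rather "get the match into a function", as the question puts
it, re.sub also accepts a callable as the replacement; it is called
with the match object for each match.  The sketch below is one way to
do it.  The helper name add_backslash and the sample string are my own
choices for illustration.

```python
import re

def add_backslash(match):
    # match.group(0) is the whole matched text, here one non-word character.
    return "\\" + match.group(0)

# Hypothetical sample line, chosen for illustration only.
line = 'Q*bert says "hi!"'

# Callable replacement: no capturing group is needed in the pattern.
result = re.sub(r"\W", add_backslash, line)
print(result)

# Equivalent template form: \g<0> refers to the entire match.
same = re.sub(r"\W", r"\\\g<0>", line)
print(same == result)
```

The callable form is handy when the replacement depends on what was
matched; for a fixed prefix like this one, the template string is
shorter.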

-- 
Jerry
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
