list of regex special characters

2010-11-28 Thread goldtech
I am looking for a list of special characters in Python regular
expressions that need to be escaped if you want their literal meaning.

I searched and cannot find the list. Any help appreciated.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list of regex special characters

2010-11-28 Thread Ben Finney
goldtech goldt...@worldpost.com writes:

> I am looking for a list of special characters in Python regular
> expressions that need to be escaped if you want their literal meaning.

You can avoid caring about that by using ‘re.escape’, which escapes any
characters in its input string that are not alphanumeric.
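
A minimal sketch of that approach, assuming the goal is to match some
arbitrary text verbatim. The sample string and the variable names are
made up for illustration; only re.escape, re.compile and re.search come
from the re module.

```python
import re

# Hypothetical text that should be matched literally.
literal = "price: $5.00 (approx.)"

# re.escape backslash-escapes the characters that would otherwise be read
# as regex syntax, so the resulting pattern matches the text verbatim.
pattern = re.compile(re.escape(literal))
matches_exact = pattern.search("Total price: $5.00 (approx.) today") is not None

# Used unescaped, the same text is not a reliable pattern for itself:
# "$" is an anchor, "." a wildcard, and the parentheses form a group.
raw_matches_self = re.search(literal, literal) is not None

print(matches_exact, raw_matches_self)
```

The escaped pattern can also be embedded in a larger expression, since
the result of re.escape is an ordinary string.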

> I searched and cannot find the list. Any help appreciated.

 >>> import re
 >>> help(re)
…
DESCRIPTION
…
The special characters are: …

-- 
 \ “I got some new underwear the other day. Well, new to me.” —Emo |
  `\   Philips |
_o__)  |
Ben Finney


Re: list of regex special characters

2010-11-28 Thread Tim Chase

On 11/28/2010 05:58 PM, goldtech wrote:

> I am looking for a list of special characters in Python regular
> expressions that need to be escaped if you want their literal meaning.
>
> I searched and cannot find the list. Any help appreciated.


Trust the re module to tell you:

  >>> import re
  >>> chars = [chr(i) for i in range(0,256)]
  >>> escaped = [c for c in chars if re.escape(c) != c]
  >>> print len(escaped)
  194
  >>> print escaped
  [...]
  >>> can_use_unescaped = [c for c in chars if re.escape(c) == c]

(adjust chars accordingly if you want to check Unicode
characters too).
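
For anyone rerunning this on Python 3, a rough equivalent is sketched
below (print is a function there, and chr() already returns Unicode
strings). The counts are deliberately not shown: as far as I know,
re.escape in Python 3.7 and later escapes only characters that can have
special meaning, so the result will differ from the 194 reported above.

```python
import re

# Code points 0-255, as in the session above; widen the range to probe
# further Unicode characters.
chars = [chr(i) for i in range(256)]

# Characters that re.escape changes, and those it leaves alone.
escaped = [c for c in chars if re.escape(c) != c]
can_use_unescaped = [c for c in chars if re.escape(c) == c]

print(len(escaped), len(can_use_unescaped))

# Show only the visible ASCII subset, which is easier to read than the
# full list with control characters.
print("".join(c for c in escaped if " " < c < "\x7f"))
```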


-tkc





Re: list of regex special characters

2010-11-28 Thread Ben Finney
Tim Chase python.l...@tim.thechases.com writes:

> On 11/28/2010 05:58 PM, goldtech wrote:
> > I am looking for a list of special characters in Python regular
> > expressions that need to be escaped if you want their literal
> > meaning.

> Trust the re module to tell you:
>
>   >>> import re
>   >>> chars = [chr(i) for i in range(0,256)]
>   >>> escaped = [c for c in chars if re.escape(c) != c]

Note that, according to its docstring, ‘re.escape’ doesn't distinguish
characters that *need to be* escaped for their literal meaning; it
simply escapes any non-alphanumeric character.

>   >>> can_use_unescaped = [c for c in chars if re.escape(c) == c]

Right. There are three classes of character for this purpose:

* those that have a literal meaning *only if* escaped
* those that have literal meaning whether or not they are escaped
* those that have a literal meaning *only if not* escaped

The ‘re.escape’ function, according to its docstring, simply treats
every non-alphanumeric character as belonging to one of the first two
classes. Both are safe to escape, so it does not bother to distinguish
between them.
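
The three classes can be probed empirically. The following is only a
rough sketch for Python 3: the helper names (matches_only_itself,
classify) are invented for illustration, and each character is tested
on its own, outside any character class or larger pattern. Characters
whose meaning depends on context, such as '{', ']' or '-', will
therefore land in the second class here even though they can be special
elsewhere.

```python
import re
import string

def matches_only_itself(pattern, char):
    """True if `pattern` compiles, matches `char`, and rejects another character."""
    other = "b" if char == "a" else "a"
    try:
        compiled = re.compile(pattern)
    except re.error:
        return False
    return (compiled.fullmatch(char) is not None
            and compiled.fullmatch(other) is None)

def classify(char):
    """Return 1, 2 or 3 for the three classes listed above (0 if none fits)."""
    bare = matches_only_itself(char, char)
    escaped = matches_only_itself("\\" + char, char)
    if escaped and not bare:
        return 1  # literal only if escaped
    if escaped and bare:
        return 2  # literal whether or not escaped
    if bare:
        return 3  # literal only if not escaped
    return 0

# Printable ASCII, with the space as the only whitespace character.
candidates = string.digits + string.ascii_letters + string.punctuation + " "
classes = {c: classify(c) for c in candidates}

# Characters that, taken alone, are literal only when escaped.
print("".join(c for c in candidates if classes[c] == 1))
```

Letters and digits mostly fall into the third class because a backslash
in front of them either means something else (for example \d) or is
rejected as a bad escape.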

The OP was asking for the first class specifically, but I question
whether that's actually needed for the purpose.

-- 
 \   “The cost of education is trivial compared to the cost of |
  `\ ignorance.” —Thomas Jefferson |
_o__)  |
Ben Finney