Re: regex problem ..

Tino Wildenhain Mon, 15 Dec 2008 04:36:39 -0800

Analog Kid wrote:

Hi All:
I am new to regular expressions in general, not just re in Python, so apologies if my question seems stupid :) I need some help forming a regex. Here is my scenario: I have strings coming in from a list, and I want to check each one against a regular expression to see whether or not it "qualifies". By that I mean there is a set of permissible characters, and if a string contains any character outside that set, I need to flag it. Here is a snippet:
import re

flagged = list()
strs = ['HELLO', 'Hi%20There', '123...@#@']
# Matches any single character that is not an ASCII letter or digit.
p = re.compile(r"""[^a-zA-Z0-9]""", re.UNICODE)
for s in strs:
    if len(p.findall(s)) > 0:
        flagged.append(s)

print(flagged)
My question is: if I wanted to allow '%20' but not '%', how would my current regex (r"""[^a-zA-Z0-9]""") be modified?


You might want to normalize (URL-decode) the strings before checking, e.g.

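import re
from urllib import unquote  # Python 3: from urllib.parse import unquote

# A space is allowed here because unquote() turns '%20' into ' '.
p = re.compile("[^a-zA-Z0-9 ]")
flagged = []

# strs is the list from the original post.
for s in strs:
    if p.search(unquote(s)):
        flagged.append(s)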
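
If normalizing is not an option, the check can probably also be done on the raw string. The following is only a sketch, not a tested drop-in: it matches the whole string against a pattern that accepts either an allowed character or the literal sequence '%20'. The names allowed and is_flagged, and the extra sample '50%off', are made up for illustration.

```python
import re

# The whole string must consist of ASCII letters, digits, or the
# literal three-character sequence '%20'. A lone '%' cannot match.
allowed = re.compile(r"(?:[a-zA-Z0-9]|%20)*\Z")

def is_flagged(s):
    """Return True if s contains anything other than alphanumerics or '%20'."""
    return allowed.match(s) is None

strs = ['HELLO', 'Hi%20There', '123...@#@', '50%off']
flagged = [s for s in strs if is_flagged(s)]
print(flagged)
```

This variant accepts only '%20'. The unquote approach also lets through any other escape that decodes to an allowed character, such as '%41' for 'A'.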

Be careful, however, if you want to show the
flagged ones back to the user. It is always best to
quote/unquote at the boundaries as appropriate.
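
As a rough sketch of what "at the boundaries" could look like: unquote once when the strings come in, work only with the decoded form inside the program, and quote again when handing a string back out. This assumes Python 3's urllib.parse (on Python 2 the same functions live in urllib); the helper names check and for_display are hypothetical.

```python
import re
from urllib.parse import quote, unquote  # Python 2: from urllib import quote, unquote

p = re.compile("[^a-zA-Z0-9 ]")

def check(raw_strings):
    """Decode each incoming string once and return the decoded ones that fail."""
    flagged = []
    for raw in raw_strings:
        decoded = unquote(raw)  # inbound boundary: decode exactly once
        if p.search(decoded):
            flagged.append(decoded)
    return flagged

def for_display(decoded):
    """Re-encode a decoded string before it leaves the program."""
    return quote(decoded)  # outbound boundary: encode exactly once

flagged = check(['HELLO', 'Hi%20There', '50%25off'])
print([for_display(s) for s in flagged])
```

Keeping the decoded form internally avoids decoding the same string twice, which would turn something like '%2520' into a space by accident.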

Regards
Tino

--
http://mail.python.org/mailman/listinfo/python-list
