Analog Kid wrote:
> Hi All:
> I am new to regular expressions in general, and not just re in python.
> So, apologies if you find my question stupid :) I need some help with
> forming a regex. Here is my scenario ...
> I have strings coming in from a list, each of which I want to check
> against a regular expression and see whether or not it "qualifies". By
> that I mean I have a certain set of characters that are permissible and
> if the string has characters which are not permissible, I need to flag
> that string ... here is a snip ...
>
> flagged = list()
> strs = ['HELLO', 'Hi%20There', '123...@#@']
> p = re.compile(r"""[^a-zA-Z0-9]""", re.UNICODE)
> for s in strs:
>     if len(p.findall(s)) > 0:
>         flagged.append(s)
>
> print flagged
>
> my question is ... if I wanted to allow '%20' but not '%', how would my
> current regex (r"""[^a-zA-Z0-9]""") be modified?
>
The essence of the approach is to observe that each element is a
sequence of one or more "characters", where a character is either a
letter/digit or an escape. So you would use a pattern like
"([a-zA-Z0-9]|%[0-9a-f][0-9a-f])+" regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ -- http://mail.python.org/mailman/listinfo/python-list