On Fri, 1 Oct 2010 12:45:38 pm Alex Hall wrote:
> Hi, once again...
> I have a regexp that I am trying to use to make sure a line matches
> the format: [c*]n [c*]n n
> where c* is (optionally) 0 or more non-numeric characters and n is
> any numeric character. The spacing should not matter. These should
> pass:
> v1 v2 5
> 2 someword7 3
>
> while these should not:
> word 2 3
> 1 2
>
> Here is my test:
> s=re.search(r"[\d+\s+\d+\s+\d]", l)
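A quick sketch of why that test matches so much: the square brackets turn the whole pattern into a character class, so it matches any single character that is a digit, whitespace, or a literal '+'. The sample strings and the names `broken`, `samples` and `hits` are made up for illustration.

```python
import re

# Everything inside [...] is a character class: this matches ONE character
# that is a digit, a whitespace character, or a literal '+'.
broken = re.compile(r"[\d+\s+\d+\s+\d]")

# Hypothetical sample inputs, including lines that should have been rejected.
samples = ["v1 v2 5", "word 2 3", "1 2", "7", " ", "+", "abc"]

# Record whether the broken pattern finds a match anywhere in each sample.
hits = {s: bool(broken.search(s)) for s in samples}

for s, matched in hits.items():
    print(repr(s), matched)
```

Only 'abc' is rejected here, because it contains no digit, whitespace, or '+'.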
Try this instead:

    re.search(r'\d+\s+\D*\d+\s+\d', l)

This searches for:

    one or more digits
    at least one whitespace char (space, tab, etc)
    zero or more non-digits
    at least one digit
    at least one whitespace
    exactly one digit

> However:
> 1. this seems to pass with *any* string, even when l is a single
> character. This causes many problems
[...]

I'm sure it does. You don't have to convince us that if the regular
expression is broken, the rest of your code has a problem. That's a
given. It's enough to know that the regex doesn't do what you need it
to do.

> 3. Once I get the above working, I will need a way of pulling the
> characters out of the string and sticking them somewhere. For
> example, if the string were
> v9 v10 15
> I would want an array:
> n=[9, 10, 15]

Modify the regex to be this:

    r'(\d+)\s+\D*(\d+)\s+(\d)'

and then query the groups of the match object that is returned:

>>> mo = re.search(r'(\d+)\s+\D*(\d+)\s+(\d)', 'spam42 eggs23 9')
>>> mo.groups()
('42', '23', '9')

Don't forget that mo will be None if the regex doesn't match, and don't
forget that the items returned are strings.

--
Steven D'Aprano
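Putting the two reminders together (check for None, convert the strings), a minimal sketch might look like the following. The helper name `parse_line` is hypothetical. One assumed change from the pattern above: the last group is `(\d+)` rather than `(\d)`, because with a single `\d` the string 'v9 v10 15' would yield 1, not 15, for the final number.

```python
import re

# Pattern from the reply, with one assumed change: the final group is (\d+)
# instead of (\d), so a multi-digit last number such as 15 is captured whole.
PATTERN = re.compile(r'(\d+)\s+\D*(\d+)\s+(\d+)')


def parse_line(line):
    """Return the three numbers as a list of ints, or None if no match."""
    mo = PATTERN.search(line)
    if mo is None:
        # The regex did not match, so there are no groups to query.
        return None
    # groups() returns strings; convert each one to int.
    return [int(g) for g in mo.groups()]


print(parse_line('v9 v10 15'))      # [9, 10, 15]
print(parse_line('2 someword7 3'))  # [2, 7, 3]
print(parse_line('word 2 3'))       # None
print(parse_line('1 2'))            # None
```

If the last number really must be a single digit, keep `(\d)` as in the original pattern.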