"Magnus Lycka" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> I want an re that matches strings like "21MAR06 31APR06 1236",
> where the last part is day numbers (1-7), i.e it can contain
> the numbers 1-7, in order, only one of each, and at least one
> digit. I want it as three groups. I was thinking of
>
> r"(\d\d[A-Z]\d\d) (\d\d[A-Z]\d\d) (1?2?3?4?5?6?7?)"
>
> but that will match even if the third group is empty,
> right? Does anyone have good and not overly complex RE for
> this?
>
> P.S. I know the "now you have two problems reply..."
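
On the regex itself: yes, as written the third group can match the empty string, since every digit in it is optional. One possible fix, offered as a rough sketch rather than a tested production pattern, is to put a lookahead in front of the third group so that at least one digit from 1-7 must be present. The names ENTRY_RE and split_entry are made up for this illustration, and the month is written as [A-Z]{3} on the assumption that a three-letter abbreviation was intended.

```python
import re

# Assumption: the month is a three-letter upper-case abbreviation.
# (?=[1-7]) is a lookahead: it requires one digit 1-7 to follow without
# consuming it, so the all-optional third group can no longer be empty.
ENTRY_RE = re.compile(
    r"(\d\d[A-Z]{3}\d\d) (\d\d[A-Z]{3}\d\d) (?=[1-7])(1?2?3?4?5?6?7?)"
)

def split_entry(text):
    """Return (start, end, days) if text is a valid entry, else None."""
    m = ENTRY_RE.fullmatch(text)
    return m.groups() if m else None

print(split_entry("21MAR06 31APR06 1236"))  # valid: ascending, no repeats
print(split_entry("21MAR06 31APR06 1362"))  # invalid: digits out of order
print(split_entry("21MAR06 31APR06 "))      # invalid: no day digits
```

The 1?2?...7? part already enforces "in order, only one of each"; the lookahead only adds the "at least one digit" rule. Using fullmatch keeps trailing junk from slipping through.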
For the pyparsing-inclined, here are two versions, along with several
examples of how to extract the fields from the returned ParseResults
object. The second version is more rigorous in enforcing the
days-of-week rules on the third field.

Note that the month field is already limited to valid month
abbreviations, and the same technique used to validate the
days-of-week field could be used to ensure that the date fields are
valid dates (no 31st of FEB, etc.), that the second date is after the
first, etc.

-- Paul
Download pyparsing at http://pyparsing.sourceforge.net.


data = "21MAR06 31APR06 1236"
data2 = "21MAR06 31APR06 1362"

from pyparsing import *

# define format of an entry
month = oneOf("JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC")
date = Combine( Word(nums,exact=2) + month + Word(nums,exact=2) )
daysOfWeek = Word("1234567")
entry = date.setResultsName("startDate") + \
        date.setResultsName("endDate") + \
        daysOfWeek.setResultsName("weekDays") + \
        lineEnd

# extract entry data
e = entry.parseString(data)

# various ways to access the results
print(e.startDate, e.endDate, e.weekDays)
print("%(startDate)s : %(endDate)s : %(weekDays)s" % e)
print(e.asList())
print(e)
print()

# get more rigorous in testing for valid days of week field
def rigorousDayOfWeekTest(s,l,toks):
    # remove duplicates from toks[0], sort, then compare to original
    tmp = "".join(sorted(set(toks[0])))
    if tmp != toks[0]:
        raise ParseException(s,l,"Invalid days of week field")

daysOfWeek.setParseAction(rigorousDayOfWeekTest)

# rebuild entry so that it picks up the parse action just attached
entry = date.setResultsName("startDate") + \
        date.setResultsName("endDate") + \
        daysOfWeek.setResultsName("weekDays") + \
        lineEnd

print(entry.parseString(data))
print(entry.parseString(data2)) # <-- raises ParseException
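
The date checks mentioned above (no 31st of FEB, second date after the first) could be sketched with the standard library alone, along these lines. This is only an illustration: to_date and check_dates are hypothetical helper names, and mapping two-digit years onto 2000-2099 is an assumption, not something the original format specifies. A function like check_dates could be called from a parse action in the same way as rigorousDayOfWeekTest, converting ValueError into ParseException.

```python
from datetime import date

MONTHS = "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split()

def to_date(text):
    """Convert a 'DDMMMYY' string such as '21MAR06' to a datetime.date.

    Raises ValueError for an unknown month or a day that does not exist.
    Assumption: two-digit years fall in 2000-2099.
    """
    day, mon, year = int(text[:2]), text[2:5], int(text[5:7])
    month = MONTHS.index(mon) + 1         # ValueError if not a known month
    return date(2000 + year, month, day)  # ValueError if the day is impossible

def check_dates(start_text, end_text):
    """Return (start, end) as dates, or raise ValueError if they are invalid."""
    start, end = to_date(start_text), to_date(end_text)
    if end <= start:
        raise ValueError("end date must be after start date")
    return start, end

print(check_dates("21MAR06", "30APR06"))
```

With this check in place, the sample entry "21MAR06 31APR06 1236" would be rejected, since April has only 30 days.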