In article <mailman.364.1298517901.1189.python-l...@python.org>, Chris Rebert <c...@rebertia.com> wrote:
> regex = compile("(\d\d)/(\d\d)/(\d{4})")

I would probably write that as either

    r"(\d{2})/(\d{2})/(\d{4})"

or (somewhat less likely)

    r"(\d\d)/(\d\d)/(\d\d\d\d)"

Keeping to one consistent style makes it a little easier to read. Also, don't forget the leading `r` to get raw strings. I've long since given up trying to remember the exact rules of what needs to get escaped and what doesn't. If it's a regex, I just automatically make it a raw string.

Also, don't overlook the re.VERBOSE flag. With it, you can write positively outrageous expressions which are still quite readable. With it, you could write this regex as:

    r" (\d{2}) / (\d{2}) / (\d{4}) "

which takes up only slightly more space, but makes it a whole lot easier to scan by eye.

I'm still going to stand by my previous statement, however. If you're trying to parse HTML, use an HTML parser. Using a regex like this is perfectly fine for parsing the CDATA text inside the HTML <td> element, but pattern matching the HTML markup itself is madness.
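For anyone who wants to try the variants above, here is a rough sketch. The sample string, the variable names, and the comments inside the multi-line pattern are made up for illustration and are not from the original post. It also shows one case where forgetting the `r` actually bites: in a plain string, "\b" is a backspace character, not a word boundary.

```python
import re

# The same date pattern written three ways, all as raw strings.
compact = re.compile(r"(\d{2})/(\d{2})/(\d{4})")
repeated = re.compile(r"(\d\d)/(\d\d)/(\d\d\d\d)")
spaced = re.compile(r" (\d{2}) / (\d{2}) / (\d{4}) ", re.VERBOSE)

# With re.VERBOSE, whitespace and "#" comments in the pattern are ignored,
# so the pattern can also be spread over several lines.
commented = re.compile(r"""
    (\d{2}) /   # first two-digit field
    (\d{2}) /   # second two-digit field
    (\d{4})     # four-digit field
""", re.VERBOSE)

sample = "Due 02/23/2011, no later."  # hypothetical input text

results = [p.search(sample).groups()
           for p in (compact, repeated, spaced, commented)]
print(results)

# Why the leading r matters: "\b" in a normal string is a backspace (0x08),
# so the non-raw pattern never matches ordinary text.
without_raw = re.search("\bcat\b", "a cat")
with_raw = re.search(r"\bcat\b", "a cat")
print(without_raw, with_raw)
```

All four compiled patterns should pull the same three groups out of the sample. Note that under re.VERBOSE a literal space has to be escaped or put in a character class if you actually want to match one.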