Yowza! My eyes glaze over when I see re's like "r'(?m)^(?P<data>.*? (".*?".*?)*)(?:#.*?)?$"!
Here's a simple recognizer that reads source code and suppresses comments. A comment is a '#' character followed by the rest of the line. The recognizer also needs to detect quoted strings, so that any would-be '#' comment introducers inside a quoted string *won't* incur the stripping wrath of the recognizer. That means a quoted string must be recognized before a '#' comment introducer is.

With our input tests given as:

tests = '''this is a test 1
this is a test 2 #with a comment
this is a '#gnarlier' test #with a comment
this is a "#gnarlier" test #with a comment
'''.splitlines()

here is such a recognizer implemented using pyparsing:

from pyparsing import quotedString, Suppress, restOfLine

# a comment is '#' plus the rest of the line; Suppress drops it from the output
comment = Suppress('#' + restOfLine)
# quotedString is listed first, so a '#' inside quotes is consumed as string text
recognizer = quotedString | comment

for t in tests:
    print(t)
    print(recognizer.transformString(t))
    print()

Prints:

this is a test 1
this is a test 1

this is a test 2 #with a comment
this is a test 2

this is a '#gnarlier' test #with a comment
this is a '#gnarlier' test

this is a "#gnarlier" test #with a comment
this is a "#gnarlier" test

For some added fun, add a parse action to quoted strings, so we know when we've really done something interesting:

def detectGnarliness(tokens):
    # tokens[0] is the matched quoted string, quotes included
    if '#' in tokens[0]:
        print("Ooooh, how gnarly! ->", tokens[0])

quotedString.setParseAction(detectGnarliness)

Now our output becomes:

this is a test 1
this is a test 1

this is a test 2 #with a comment
this is a test 2

this is a '#gnarlier' test #with a comment
Ooooh, how gnarly! -> '#gnarlier'
this is a '#gnarlier' test

this is a "#gnarlier" test #with a comment
Ooooh, how gnarly! -> "#gnarlier"
this is a "#gnarlier" test

-- Paul

--
http://mail.python.org/mailman/listinfo/python-list
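
P.S. To make the "quoted strings first, comments second" ordering concrete, here is a rough library-free sketch of the same rule in plain Python 3. This is not how pyparsing does it internally. strip_comment is a hypothetical helper name of my own, it ignores backslash-escaped quotes, it treats an unterminated quote as an ordinary character, and it trims the whitespace left in front of a removed comment. All of those are simplifying assumptions.

```python
def strip_comment(line):
    """Drop a trailing '#' comment, ignoring any '#' inside a quoted string."""
    i = 0
    while i < len(line):
        ch = line[i]
        if ch in ('"', "'"):
            # Quoted string gets first look: jump past the matching close
            # quote so a '#' inside it is never seen as a comment introducer.
            close = line.find(ch, i + 1)
            if close == -1:
                # Assumption: an unterminated quote is just an ordinary character.
                i += 1
            else:
                i = close + 1
        elif ch == '#':
            # Comment introducer outside any string: cut here and trim
            # the whitespace that preceded the comment (assumption).
            return line[:i].rstrip()
        else:
            i += 1
    # No comment found: return the line unchanged.
    return line


tests = '''this is a test 1
this is a test 2 #with a comment
this is a '#gnarlier' test #with a comment
this is a "#gnarlier" test #with a comment
'''.splitlines()

for t in tests:
    print(t)
    print(strip_comment(t))
    print()
```

The loop checks for a quote character before it checks for '#', which is the same precedence that listing the quoted-string alternative first gives the recognizer.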