Re: Regular expression that skips single line comments?

Tim Chase Mon, 19 Jan 2009 08:41:12 -0800

I am trying to parse a set of files that have a simple syntax using
RE. I'm interested in counting '$' expansions in the files, with one
minor consideration. A line becomes a comment if the first non-white
space character is a semicolon.


e.g.  tests 1 and 2 should be ignored

sInput = """
; $1 test1
    ; test2 $2
    test3 ; $3 $3 $3
test4
$5 test5
   $6
  test7 $7 test7
"""

Required output:    ['$3', '$3', '$3', '$5', '$6', '$7']


We're interested in two things:  comments and "dollar-something"s

>>> import re
>>> r_comment = re.compile(r'\s*;')
>>> r_dollar = re.compile(r'\$\d+')
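The comment pattern relies on match() only succeeding at the start of the string, so a semicolon that follows other text does not mark the line as a comment. A quick check against three of the sample lines (the loop is only an illustration, not part of the solution itself):

```python
import re

r_comment = re.compile(r'\s*;')

# match() is anchored at the start of the line: optional whitespace,
# then ';'. A ';' later in the line does not count.
samples = ['; $1 test1', '    ; test2 $2', '    test3 ; $3 $3 $3']
results = [bool(r_comment.match(line)) for line in samples]

for line, is_comment in zip(samples, results):
    print(repr(line), is_comment)
```

The first two lines report True and the third reports False, which is why test3 survives the filter.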

Then remove comment lines and find the matching '$' expansions:

>>> [r_dollar.findall(line) for line in sInput.splitlines() if not r_comment.match(line)]

[[], ['$3', '$3', '$3'], [], ['$5'], ['$6'], ['$7']]

Finally, roll each line's results into a single list by slightly abusing sum():

>>> sum((r_dollar.findall(line) for line in sInput.splitlines() if not r_comment.match(line)), [])

['$3', '$3', '$3', '$5', '$6', '$7']
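If the sum() trick feels too clever, itertools.chain.from_iterable flattens the per-line lists the same way without building a new intermediate list for every line. A minimal sketch as a standalone script, reusing the sample input; the function name find_expansions is my own and not from the original question:

```python
import re
from itertools import chain

r_comment = re.compile(r'\s*;')
r_dollar = re.compile(r'\$\d+')

def find_expansions(text):
    """Return every '$<digits>' found on non-comment lines of text."""
    return list(chain.from_iterable(
        r_dollar.findall(line)
        for line in text.splitlines()
        if not r_comment.match(line)   # skip lines whose first non-blank char is ';'
    ))

sInput = """
; $1 test1
    ; test2 $2
    test3 ; $3 $3 $3
test4
$5 test5
   $6
  test7 $7 test7
"""

print(find_expansions(sInput))
```

This prints the same ['$3', '$3', '$3', '$5', '$6', '$7'] as the sum() version.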

Adjust the r_dollar if your variable pattern differs (such as reverting to your previous r'\$.' pattern if you prefer, or using r'\$\w+' for multi-character variables).
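To see how those three patterns differ, here they are run against one made-up line (the test string is purely illustrative and does not come from your files):

```python
import re

line = "copy $src to $10 and $x"

digits_only = re.findall(r'\$\d+', line)  # '$' followed by one or more digits
single_char = re.findall(r'\$.', line)    # '$' followed by exactly one character
word_chars = re.findall(r'\$\w+', line)   # '$' followed by a run of word characters

print(digits_only)
print(single_char)
print(word_chars)
```

The digits-only pattern finds just ['$10'], the single-character pattern truncates to ['$s', '$1', '$x'], and the \w+ pattern picks up the full names ['$src', '$10', '$x'].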


-tkc





--
http://mail.python.org/mailman/listinfo/python-list
