hofer wrote:

> Something I have to do very often is filtering / transforming line
> based file contents and storing the result in an array or a
> dictionary.
>
> Very often the functionality already exists in the form of a shell
> script with sed / awk / grep, ...
> and I would like to have the same implementation in my script.
>
> What's a compact, efficient (no intermediate arrays generated /
> regexps compiled only once) way in Python
> for this kind of 'pipeline'?
>
> Example 1 (in bash; annotated with comments, thus not working if
> copied / pasted):
>
> cat file \                          ### read from file
> | sed 's/\/\/.*//' \                ### remove '//' comments
> | sed 's/#.*//' \                   ### remove '#' comments
> | grep -v '^\s*$' \                 ### get rid of empty lines
> | awk '{ print $1 + $2 " " $2 }' \  ### knowing that all remaining
>                                     ### lines always contain at least
>                                     ### two integers, calculate the
>                                     ### sum and 'keep' second number
> | grep '^42 '                       ### keep lines for which sum is 42
> | awk '{ print $2 }'                ### print number
>
> Thanks in advance for any suggestions of how to code this (keeping the
> comments).

for line in open("file"):  # read from file
    try:
        a, b = map(int, line.split(None, 2)[:2])  # remove extra columns,
                                                  # convert to integer
    except ValueError:
        pass  # remove comments, get rid of empty lines,
              # skip lines with fewer than two integers
    else:
        # line did start with two integers
        if a + b == 42:  # keep lines for which the sum is 42
            print(b)  # print number

The hard part was keeping the comments ;)
Without them it looks better:

import sys

for line in sys.stdin:
    try:
        a, b = map(int, line.split(None, 2)[:2])
    except ValueError:
        pass
    else:
        if a + b == 42:
            print(b)

Peter

--
http://mail.python.org/mailman/listinfo/python-list
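
For the more general question (one stage per shell command, no intermediate lists, regexps compiled only once), a chain of generators is one possible shape. The following is only a sketch under assumptions: the function names (strip_comments, non_blank, first_two_ints, second_where_sum_is, pipeline) are made up for illustration, and unlike awk, which treats a non-numeric field as 0, it simply skips lines whose first two columns are not integers.

```python
import re

# Compiled once at import time; these mirror the two sed expressions.
SLASH_COMMENT = re.compile(r"//.*")
HASH_COMMENT = re.compile(r"#.*")


def strip_comments(lines):
    # sed stages: cut everything from '//' or '#' to the end of the line
    for line in lines:
        line = SLASH_COMMENT.sub("", line)
        yield HASH_COMMENT.sub("", line)


def non_blank(lines):
    # grep -v '^\s*$': drop empty and whitespace-only lines
    return (line for line in lines if line.strip())


def first_two_ints(lines):
    # awk field split: yield the first two columns as integers
    for line in lines:
        fields = line.split()
        try:
            yield int(fields[0]), int(fields[1])
        except (IndexError, ValueError):
            continue  # assumption: skip lines without two leading integers


def second_where_sum_is(pairs, target=42):
    # grep '^42 ' plus the final awk: keep b when a + b == target
    return (b for a, b in pairs if a + b == target)


def pipeline(lines, target=42):
    # Each stage pulls one line at a time from the previous one,
    # so no intermediate list is ever built.
    stages = first_two_ints(non_blank(strip_comments(lines)))
    return second_where_sum_is(stages, target)

# Possible use: numbers = list(pipeline(open("file")))
```

Because every stage is lazy, the result can be fed straight into list() or dict() to get the array or dictionary asked for, or consumed in a for loop without storing anything.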