On Sep 2, 12:36 pm, hofer <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Something I have to do very often is filtering / transforming line
> based file contents and storing the result in an array or a
> dictionary.
>
> Very often the functionality exists already in form of a shell script
> with sed / awk / grep, ...
> and I would like to have the same implementation in my script
>

With all that sed'ing, grep'ing and awk'ing, you might want to take a
look at pyparsing.  Here is a pyparsing take on your posted problem:

from pyparsing import LineEnd, Word, nums, LineStart, OneOrMore, restOfLine

test = """
1 2 3
47 23
// this will never match
# blank lines are not of any interest
91 26
23 19
41 1 97 26 // extra numbers don't matter
"""

# define pyparsing expressions to match a line of integers
EOL = LineEnd()
integer = Word(nums)

# by default, pyparsing will implicitly skip over whitespace and
# newlines, so EOL is skipped over by default - this would mix together
# integers on consecutive lines - we only want OneOrMore integers as long
# as they are on the same line, that is, integers with no intervening
# EOL's
line_of_integers = (LineStart() + integer + OneOrMore(~EOL + integer))

# use a parse action to identify the target lines
def select_significant_values(t):
    v1, v2 = map(int, t[:2])
    if v1+v2 == 42:
        print(v2)
line_of_integers.setParseAction(select_significant_values)

# skip over comments, wherever they are
line_of_integers.ignore( '//' + restOfLine )
line_of_integers.ignore( '#' + restOfLine )

# use the line_of_integers expression to search through the test text
# the parse action will print the matching values
line_of_integers.searchString(test)

-- Paul
--
http://mail.python.org/mailman/listinfo/python-list
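
For comparison, here is a minimal stdlib-only sketch of the same filter that returns the matches in a list instead of printing them, since the original question was about storing results in an array or dictionary. It assumes the goal is "keep the second integer of every line whose first two integers sum to 42". The function name `select_values`, the `target` parameter and the sample text are illustrative and not part of the post above. It is also stricter than the pyparsing version about what counts as an integer field, so treat it as an approximation rather than an exact equivalent.

```python
import re

def select_values(text, target=42):
    """Return the second integer of each line whose first two integers sum to target."""
    results = []
    for line in text.splitlines():
        # Drop '//' and '#' comments, roughly mirroring the two ignore() calls.
        line = re.split(r"//|#", line, maxsplit=1)[0]
        fields = line.split()
        # Skip blank lines and lines without two leading integer fields.
        if len(fields) < 2 or not all(f.isdigit() for f in fields[:2]):
            continue
        v1, v2 = int(fields[0]), int(fields[1])
        if v1 + v2 == target:
            results.append(v2)
    return results

# Hypothetical sample input, made up for this sketch.
sample = """\
1 2 3
47 23 // comment
# comment only

23 19
41 1 97 26 // extra numbers
"""

print(select_values(sample))
```

Returning a list keeps the selection logic separate from what is done with the values, so the caller can just as easily build a dictionary from them.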