Re: Multiple regex match idiom
On 9 May, 11:00, Hrvoje Niksic [EMAIL PROTECTED] wrote:
> I often have the need to match multiple regexes against a single
> string, typically a line of input, like this:
>
>     if (matchobj = re1.match(line)):
>         ... re1 matched; do something with matchobj ...
>     elif (matchobj = re2.match(line)):
>         ... re2 matched; do something with matchobj ...
>     elif (matchobj = re3.match(line)):
>
> Of course, that doesn't work as written because Python's assignments
> are statements rather than expressions. The obvious rewrite results
> in deeply nested if's:
>
>     matchobj = re1.match(line)
>     if matchobj:
>         ... re1 matched; do something with matchobj ...
>     else:
>         matchobj = re2.match(line)
>         if matchobj:
>             ... re2 matched; do something with matchobj ...
>         else:
>             matchobj = re3.match(line)
>             if matchobj:
>                 ...
>
> Normally I have nothing against nested ifs, but in this case the deep
> nesting unnecessarily complicates the code without providing
> additional value -- the logic is still exactly equivalent to the
> if/elif/elif/... shown above.
>
> There are ways to work around the problem, for example by writing a
> utility predicate that passes the match object as a side effect, but
> that feels somewhat non-standard. I'd like to know if there is a
> Python idiom that I'm missing. What would be the Pythonic way to
> write the above code?

Instead of scanning the same input over and over again with different,
possibly complex, regexes and ugly-looking nested ifs, I would suggest
defining a grammar and parsing the input once, with hooks registered
for your matching expressions.

SimpleParse (http://simpleparse.sourceforge.net) with a
DispatchProcessor, or pyparsing (http://pyparsing.wikispaces.com/) in
combination with setParseAction or something similar, are your friends
for such a task.

Steffen

--
http://mail.python.org/mailman/listinfo/python-list
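The single-pass idea can also be approximated with the standard library alone. This is a different technique from the SimpleParse and pyparsing approach named above, not an example of either library: the alternatives are joined into one regex with named groups, and a hook is looked up through the match object's `lastgroup` attribute. The patterns, hook functions, and the `classify` helper are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical alternatives; one top-level named group per alternative.
COMBINED = re.compile(
    r"(?P<number>\d+)"
    r"|(?P<word>[A-Za-z]+)"
    r"|(?P<blank>\s*$)"
)

def on_number(m):
    return ("number", int(m.group("number")))

def on_word(m):
    return ("word", m.group("word").lower())

def on_blank(m):
    return ("blank", None)

# Hooks registered per alternative, keyed by group name.
HOOKS = {"number": on_number, "word": on_word, "blank": on_blank}

def classify(line):
    """Scan the line once and call the hook of the alternative that matched."""
    m = COMBINED.match(line)
    if m is None:
        return None
    # lastgroup is the name of the last matched capturing group; with only
    # top-level named groups, that names the alternative that matched.
    return HOOKS[m.lastgroup](m)
```

This only works while the alternatives contain no nested capturing groups of their own; a grammar-based parser avoids that limitation.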
Multiple regex match idiom
I often have the need to match multiple regexes against a single
string, typically a line of input, like this:

    if (matchobj = re1.match(line)):
        ... re1 matched; do something with matchobj ...
    elif (matchobj = re2.match(line)):
        ... re2 matched; do something with matchobj ...
    elif (matchobj = re3.match(line)):

Of course, that doesn't work as written because Python's assignments
are statements rather than expressions. The obvious rewrite results
in deeply nested if's:

    matchobj = re1.match(line)
    if matchobj:
        ... re1 matched; do something with matchobj ...
    else:
        matchobj = re2.match(line)
        if matchobj:
            ... re2 matched; do something with matchobj ...
        else:
            matchobj = re3.match(line)
            if matchobj:
                ...

Normally I have nothing against nested ifs, but in this case the deep
nesting unnecessarily complicates the code without providing
additional value -- the logic is still exactly equivalent to the
if/elif/elif/... shown above.

There are ways to work around the problem, for example by writing a
utility predicate that passes the match object as a side effect, but
that feels somewhat non-standard. I'd like to know if there is a
Python idiom that I'm missing. What would be the Pythonic way to
write the above code?
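One possible reading of the "utility predicate that passes the match object as a side effect" is sketched below. The `Matcher` class, the `handle` function, and the three patterns are assumptions made up for this illustration; only the names `re1`, `re2`, `re3` come from the post.

```python
import re

class Matcher:
    """Hypothetical helper that remembers the most recent match object."""

    def __init__(self, line):
        self.line = line
        self.match = None

    def __call__(self, regex):
        # Side effect: keep the result so the caller can read it after the test.
        self.match = regex.match(self.line)
        return self.match is not None

# Illustrative patterns only; the post does not say what re1..re3 match.
re1 = re.compile(r"(\d+)")
re2 = re.compile(r"([a-z]+)")
re3 = re.compile(r"\s*#(.*)")

def handle(line):
    m = Matcher(line)
    if m(re1):
        return "number " + m.match.group(1)
    elif m(re2):
        return "word " + m.match.group(1)
    elif m(re3):
        return "comment " + m.match.group(1)
    return "no match"
```

The chain stays flat, at the cost of hiding the assignment inside the call, which is the non-standard feel the post mentions.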
Re: Multiple regex match idiom
Hrvoje Niksic wrote:
> I often have the need to match multiple regexes against a single
> string, typically a line of input, like this:
>
>     if (matchobj = re1.match(line)):
>         ... re1 matched; do something with matchobj ...
>     elif (matchobj = re2.match(line)):
>         ... re2 matched; do something with matchobj ...
>     elif (matchobj = re3.match(line)):
>
> [snip]
>
> There are ways to work around the problem, for example by writing a
> utility predicate that passes the match object as a side effect, but
> that feels somewhat non-standard. I'd like to know if there is a
> Python idiom that I'm missing. What would be the Pythonic way to
> write the above code?

Only just learning Python, but to me this seems better. Completely
untested.

    re_list = [ re1, re2, re3, ... ]
    for regex in re_list:    # not "re", which would shadow the re module
        matchob = regex.match(line)
        if matchob:
            break

Of course this only works if the "do something" is the same for all
matches. If not, maybe a function for each case, something like

    re1 = re.compile(...)
    def fn1(s, m):
        ...
    re2 = ...
    def fn2(s, m):
        ...

    re_list = [ (re1, fn1), (re2, fn2), ... ]

    for (r, f) in re_list:
        matchob = r.match(line)
        if matchob:
            f(line, matchob)
            break

Probably better ways than this exist.

Charles
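A runnable version of the (regex, function) list might look like the sketch below. The two patterns, the handler bodies, and the wrapping `dispatch` function are invented for illustration; the post leaves all of them elided.

```python
import re

# Illustrative patterns and handlers; none of these come from the thread.
re1 = re.compile(r"(\w+)\s*=\s*(.*)")
re2 = re.compile(r"#\s*(.*)")

def fn1(line, m):
    return ("assign", m.group(1), m.group(2))

def fn2(line, m):
    return ("comment", m.group(1))

re_list = [(re1, fn1), (re2, fn2)]

def dispatch(line):
    """Try each regex in order and call the handler paired with the first hit."""
    for regex, func in re_list:
        m = regex.match(line)
        if m:
            # Returning here plays the role of "break": first match wins.
            return func(line, m)
    return None
```

Order in `re_list` matters in the same way the order of branches matters in an if/elif chain.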
Re: Multiple regex match idiom
On May 9, 5:00 am, Hrvoje Niksic [EMAIL PROTECTED] wrote:
> I often have the need to match multiple regexes against a single
> string, typically a line of input, like this:
>
>     if (matchobj = re1.match(line)):
>         ... re1 matched; do something with matchobj ...
>     elif (matchobj = re2.match(line)):
>         ... re2 matched; do something with matchobj ...
>     elif (matchobj = re3.match(line)):
>
> Of course, that doesn't work as written because Python's assignments
> are statements rather than expressions. The obvious rewrite results
> in deeply nested if's:
>
>     matchobj = re1.match(line)
>     if matchobj:
>         ... re1 matched; do something with matchobj ...
>     else:
>         matchobj = re2.match(line)
>         if matchobj:
>             ... re2 matched; do something with matchobj ...
>         else:
>             matchobj = re3.match(line)
>             if matchobj:
>                 ...
>
> Normally I have nothing against nested ifs, but in this case the deep
> nesting unnecessarily complicates the code without providing
> additional value -- the logic is still exactly equivalent to the
> if/elif/elif/... shown above.
>
> There are ways to work around the problem, for example by writing a
> utility predicate that passes the match object as a side effect, but
> that feels somewhat non-standard. I'd like to know if there is a
> Python idiom that I'm missing. What would be the Pythonic way to
> write the above code?

Hrvoje,

To make it more elegant I would do this:

1. Put all the "do something" blocks in functions like
   re1_do_something(), re2_do_something(), ...

2. Create a list of (re, func) pairs, in other words:

       dispatch = [
           (re1, re1_do_something),
           (re2, re2_do_something),
           ...
       ]

3. Then do:

       for regex, func in dispatch:
           if regex.match(line):
               func(...)
               break  # stop at the first match, as the if/elif chain does

Hope this helps,
-Nick Vatamaniuc
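For comparison with the dispatch list above: on Python 3.8 or later, assignment expressions (PEP 572) make the if/elif chain from the original question valid almost as written, with `:=` in place of `=`. The sketch below assumes that Python version; the patterns and the `describe` function are illustrative and not taken from the thread.

```python
import re

# Illustrative patterns only; the thread does not define re1..re3.
re1 = re.compile(r"(\d+)-(\d+)")
re2 = re.compile(r"(\d+)")
re3 = re.compile(r"(\w+)")

def describe(line):
    # Each branch binds matchobj and tests it in the same expression.
    if (matchobj := re1.match(line)):
        return "range %s..%s" % (matchobj.group(1), matchobj.group(2))
    elif (matchobj := re2.match(line)):
        return "number " + matchobj.group(1)
    elif (matchobj := re3.match(line)):
        return "word " + matchobj.group(1)
    return "no match"
```

A dispatch list is still the better fit when the set of regexes is long or built at run time; the chain reads well for a handful of fixed cases.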