Re: Multiple regex match idiom

2007-05-10 Thread Steffen Oschatz
On 9 May, 11:00, Hrvoje Niksic [EMAIL PROTECTED] wrote:
 I often have the need to match multiple regexes against a single
 string, typically a line of input, like this:

 if (matchobj = re1.match(line)):
   ... re1 matched; do something with matchobj ...
 elif (matchobj = re2.match(line)):
   ... re2 matched; do something with matchobj ...
 elif (matchobj = re3.match(line)):
 

 Of course, that doesn't work as written because Python's assignments
 are statements rather than expressions.  The obvious rewrite results
 in deeply nested if's:

 matchobj = re1.match(line)
 if matchobj:
   ... re1 matched; do something with matchobj ...
 else:
   matchobj = re2.match(line)
   if matchobj:
     ... re2 matched; do something with matchobj ...
   else:
     matchobj = re3.match(line)
     if matchobj:
       ...

 Normally I have nothing against nested ifs, but in this case the deep
 nesting unnecessarily complicates the code without providing
 additional value -- the logic is still exactly equivalent to the
 if/elif/elif/... shown above.

 There are ways to work around the problem, for example by writing a
 utility predicate that passes the match object as a side effect, but
 that feels somewhat non-standard.  I'd like to know if there is a
 Python idiom that I'm missing.  What would be the Pythonic way to
 write the above code?

Instead of scanning the same input over and over again with different,
possibly complex, regexes and ugly-looking nested ifs, I would suggest
defining a grammar and parsing the input once, with registered hooks
for your matching expressions.

SimpleParse (http://simpleparse.sourceforge.net) with a
DispatchProcessor, or pyparsing (http://pyparsing.wikispaces.com/) in
combination with setParseAction, or something similar are your friends
for such a task.
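
For readers without either library at hand, the "scan once, dispatch to
hooks" idea can be approximated with the standard re module alone. The
sketch below is not SimpleParse or pyparsing; it swaps in a single combined
regex with named groups and looks up a hook by the name of the group that
matched. The patterns, hook names and the dispatch() function are all
hypothetical and chosen only for illustration.

```python
import re


# Hypothetical hooks: each receives the text matched by its alternative.
def on_number(text):
    return ("number", int(text))


def on_assignment(text):
    name, value = text.split("=", 1)
    return ("assignment", name, value)


def on_comment(text):
    return ("comment", text[1:].strip())


# Illustrative (name, pattern, hook) table; none of this is from the thread.
RULES = [
    ("number", r"\d+", on_number),
    ("assignment", r"\w+=\w+", on_assignment),
    ("comment", r"#.*", on_comment),
]

# One alternation with a named group per rule, compiled once.
COMBINED = re.compile("|".join("(?P<%s>%s)" % (name, pat) for name, pat, _ in RULES))
HOOKS = {name: hook for name, _, hook in RULES}


def dispatch(line):
    # fullmatch requires the whole line to match one alternative.
    m = COMBINED.fullmatch(line)
    if m is None:
        return None
    # lastgroup is the name of the alternative that matched.
    return HOOKS[m.lastgroup](m.group(m.lastgroup))
```

This keeps the rules in one table, at the cost of hooks seeing only the
matched text rather than per-rule subgroups.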

Steffen

-- 
http://mail.python.org/mailman/listinfo/python-list


Multiple regex match idiom

2007-05-09 Thread Hrvoje Niksic
I often have the need to match multiple regexes against a single
string, typically a line of input, like this:

if (matchobj = re1.match(line)):
  ... re1 matched; do something with matchobj ...
elif (matchobj = re2.match(line)):
  ... re2 matched; do something with matchobj ...
elif (matchobj = re3.match(line)):


Of course, that doesn't work as written because Python's assignments
are statements rather than expressions.  The obvious rewrite results
in deeply nested if's:

matchobj = re1.match(line)
if matchobj:
  ... re1 matched; do something with matchobj ...
else:
  matchobj = re2.match(line)
  if matchobj:
    ... re2 matched; do something with matchobj ...
  else:
    matchobj = re3.match(line)
    if matchobj:
      ...

Normally I have nothing against nested ifs, but in this case the deep
nesting unnecessarily complicates the code without providing
additional value -- the logic is still exactly equivalent to the
if/elif/elif/... shown above.

There are ways to work around the problem, for example by writing a
utility predicate that passes the match object as a side effect, but
that feels somewhat non-standard.  I'd like to know if there is a
Python idiom that I'm missing.  What would be the Pythonic way to
write the above code?
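
The side-effect workaround mentioned above could take roughly the following
shape. This is only a minimal sketch: the Matcher class, the classify()
function and the three patterns are hypothetical names invented for
illustration, since the post does not say what re1, re2 and re3 contain.

```python
import re


class Matcher:
    """Hypothetical helper that remembers the most recent match object."""

    def __init__(self):
        self.matchobj = None

    def match(self, regex, line):
        # Store the result as a side effect and also return it, so the
        # call can be used directly as an if/elif condition.
        self.matchobj = regex.match(line)
        return self.matchobj


# Illustrative stand-ins for re1, re2 and re3.
re1 = re.compile(r"(\d+)$")
re2 = re.compile(r"(\w+)=(\w+)$")
re3 = re.compile(r"#(.*)$")


def classify(line):
    m = Matcher()
    if m.match(re1, line):
        return ("number", m.matchobj.group(1))
    elif m.match(re2, line):
        return ("assignment", m.matchobj.group(1), m.matchobj.group(2))
    elif m.match(re3, line):
        return ("comment", m.matchobj.group(1))
    return None
```

The chain stays flat, but every branch has to reach through the helper to
get at the match object.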


Re: Multiple regex match idiom

2007-05-09 Thread Charles Sanders
Hrvoje Niksic wrote:
 I often have the need to match multiple regexes against a single
 string, typically a line of input, like this:
 
 if (matchobj = re1.match(line)):
   ... re1 matched; do something with matchobj ...
 elif (matchobj = re2.match(line)):
   ... re2 matched; do something with matchobj ...
 elif (matchobj = re3.match(line)):
 
[snip]
 
 There are ways to work around the problem, for example by writing a
 utility predicate that passes the match object as a side effect, but
 that feels somewhat non-standard.  I'd like to know if there is a
 Python idiom that I'm missing.  What would be the Pythonic way to
 write the above code?

Only just learning Python, but to me this seems better.
Completely untested.

re_list = [ re1, re2, re3, ... ]
for r in re_list:
   matchob = r.match(line)
   if matchob:
      ...
      break

Of course this only works if the do something is the same
for all matches. If not, maybe a function for each case,
something like

re1 = re.compile()
def fn1( s, m ):
   
re2 = 
def fn2( s, m ):
   

re_list = [ (re1, fn1), (re2, fn2), ... ]

for (r,f) in re_list:
   matchob = r.match(line)
   if matchob:
      f( line, matchob )
      break

Probably better ways than this exist.


Charles


Re: Multiple regex match idiom

2007-05-09 Thread Nick Vatamaniuc
On May 9, 5:00 am, Hrvoje Niksic [EMAIL PROTECTED] wrote:
 I often have the need to match multiple regexes against a single
 string, typically a line of input, like this:

 if (matchobj = re1.match(line)):
   ... re1 matched; do something with matchobj ...
 elif (matchobj = re2.match(line)):
   ... re2 matched; do something with matchobj ...
 elif (matchobj = re3.match(line)):
 

 Of course, that doesn't work as written because Python's assignments
 are statements rather than expressions.  The obvious rewrite results
 in deeply nested if's:

 matchobj = re1.match(line)
 if matchobj:
   ... re1 matched; do something with matchobj ...
 else:
   matchobj = re2.match(line)
   if matchobj:
     ... re2 matched; do something with matchobj ...
   else:
     matchobj = re3.match(line)
     if matchobj:
       ...

 Normally I have nothing against nested ifs, but in this case the deep
 nesting unnecessarily complicates the code without providing
 additional value -- the logic is still exactly equivalent to the
 if/elif/elif/... shown above.

 There are ways to work around the problem, for example by writing a
 utility predicate that passes the match object as a side effect, but
 that feels somewhat non-standard.  I'd like to know if there is a
 Python idiom that I'm missing.  What would be the Pythonic way to
 write the above code?

Hrvoje,

To make it more elegant I would do this:

1. Put all the ...do somethings... in functions like
re1_do_something(), re2_do_something(),...

2. Create a list of pairs of (re,func) in other words:
dispatch=[ (re1, re1_do_something), (re2, re2_do_something), ... ]

3. Then do:
for regex,func in dispatch:
    if regex.match(line):
        func(...)
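
The three steps above can be sketched as runnable code. The patterns,
handler bodies and the handle() wrapper are assumptions made for
illustration. Unlike the loop as written, this version passes the match
object to the handler and stops at the first match, which is what the
original if/elif chain would do.

```python
import re


# Step 1: hypothetical handlers, one per regex.
def re1_do_something(line, matchobj):
    return "number: " + matchobj.group(0)


def re2_do_something(line, matchobj):
    return "word: " + matchobj.group(0)


# Step 2: list of (regex, func) pairs; order decides which match wins.
dispatch = [
    (re.compile(r"\d+"), re1_do_something),
    (re.compile(r"\w+"), re2_do_something),
]


# Step 3: try each regex in turn and call the handler of the first match.
def handle(line):
    for regex, func in dispatch:
        matchobj = regex.match(line)
        if matchobj:
            # Returning here mirrors if/elif: later regexes are not tried.
            return func(line, matchobj)
    return None
```

Without the early return, a line matched by several regexes would trigger
several handlers, which may or may not be what is wanted.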


Hope this helps,
-Nick Vatamaniuc


