Python's regular expression help
Hi,

Trying to start out with simple things but apparently there's some basics I need help with. This works OK:

    >>> import re
    >>> p = re.compile('(ab*)(sss)')
    >>> m = p.match( 'absss' )
    >>> m.group(0)
    'absss'
    >>> m.group(1)
    'ab'
    >>> m.group(2)
    'sss'
    ...

But two questions:

How can I operate a regex on a string variable? I'm doing something wrong here:

    >>> f=r'abss'
    >>> f
    'abss'
    >>> m = p.match( f )
    >>> m.group(0)
    Traceback (most recent call last):
      File "<pyshell#15>", line 1, in <module>
        m.group(0)
    AttributeError: 'NoneType' object has no attribute 'group'

How do I implement a regex on a multiline string? I thought this might work but there's a problem:

    >>> p = re.compile('(ab*)(sss)', re.S)
    >>> m = p.match( 'ab\nsss' )
    >>> m.group(0)
    Traceback (most recent call last):
      File "<pyshell#26>", line 1, in <module>
        m.group(0)
    AttributeError: 'NoneType' object has no attribute 'group'

Thanks for the newbie regex help,
Lee
--
http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression help
On 29/04/2010 20:00, goldtech wrote:
> How do I implement a regex on a multiline string? I thought this might work but there's a problem:
>
>     >>> p = re.compile('(ab*)(sss)', re.S)
>     >>> m = p.match( 'ab\nsss' )
>     >>> m.group(0)
>     Traceback (most recent call last):
>       File "<pyshell#26>", line 1, in <module>
>         m.group(0)
>     AttributeError: 'NoneType' object has no attribute 'group'

For multiline, I use re.DOTALL.

I do not know match(), but findall() is pretty efficient:

    >>> my = "<a href=\"hello world.com\">LINK</a>"
    >>> res = re.findall(">(.*?)<", my)
    >>> res
    ['LINK']

Dorian
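For anyone comparing the two calls mentioned in this reply, here is a minimal sketch. It assumes a made-up sample string (`text`) that is not from the thread. It shows that re.S and re.DOTALL are the same flag, and that match() only tries the start of the string while search() and findall() scan the whole string.

```python
import re

# re.S is the short spelling of re.DOTALL; they are the same flag.
same_flag = (re.S == re.DOTALL)

pattern = re.compile(r'(ab*)(sss)')

# Hypothetical sample string (not from the thread): the match is not at position 0.
text = 'xx absss yy'

at_start = pattern.match(text)       # match() only tries the start of the string
anywhere = pattern.search(text)      # search() scans forward for the first match
all_groups = pattern.findall(text)   # findall() returns the groups of every match

print(same_flag)
print(at_start)
print(anywhere.group(0))
print(all_groups)
```

Since the original post already passed re.S, switching to re.DOTALL would not by itself change the result.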
Re: Python's regular expression help
goldtech wrote:
> This works OK:
>
>     >>> import re
>     >>> p = re.compile('(ab*)(sss)')
>     >>> m = p.match( 'absss' )
>     >>> m.group(0)
>     'absss'
>     >>> m.group(1)
>     'ab'
>     >>> m.group(2)
>     'sss'
>     ...
>
> But two questions:
>
> How can I operate a regex on a string variable? I'm doing something wrong here:
>
>     >>> f=r'abss'
>     >>> f
>     'abss'
>     >>> m = p.match( f )
>     >>> m.group(0)
>     Traceback (most recent call last):
>       File "<pyshell#15>", line 1, in <module>
>         m.group(0)
>     AttributeError: 'NoneType' object has no attribute 'group'

Look closely: the regex contains three letters 's', but the string referred to by f has only two.

> How do I implement a regex on a multiline string? I thought this might work but there's a problem:
>
>     >>> p = re.compile('(ab*)(sss)', re.S)
>     >>> m = p.match( 'ab\nsss' )
>     >>> m.group(0)
>     Traceback (most recent call last):
>       File "<pyshell#26>", line 1, in <module>
>         m.group(0)
>     AttributeError: 'NoneType' object has no attribute 'group'

The string contains a newline between the 'b' and the 's', but the regex isn't expecting a newline (or any other character) between the 'b' and the 's', hence no match.
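Both tracebacks come from calling .group() on None, which is what match() returns when the pattern does not match. A small sketch of checking for that first, using the three strings from the thread (the helper name `describe` is only for illustration):

```python
import re

pattern = re.compile(r'(ab*)(sss)')

def describe(text):
    """Return the captured groups, or None when the pattern does not match."""
    m = pattern.match(text)
    if m is None:
        # No match: calling m.group() here would raise AttributeError.
        return None
    return m.groups()

# The three strings tried in the thread.
results = {s: describe(s) for s in ('absss', 'abss', 'ab\nsss')}

for text, groups in results.items():
    print(repr(text), '->', groups)
```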
Re: Python's regular expression help
On 04/29/2010 01:00 PM, goldtech wrote:
> Trying to start out with simple things but apparently there's some basics I need help with. This works OK:
>
>     >>> import re
>     >>> p = re.compile('(ab*)(sss)')
>     >>> m = p.match( 'absss' )
>
>     >>> f=r'abss'
>     >>> f
>     'abss'
>     >>> m = p.match( f )
>     >>> m.group(0)
>     Traceback (most recent call last):
>       File "<pyshell#15>", line 1, in <module>
>         m.group(0)
>     AttributeError: 'NoneType' object has no attribute 'group'

    'absss' != 'abss'

Your regexp looks for 3 "s", your f contains only 2. So the regexp object doesn't, well, match. Try

    f = 'absss'

and it will work. As an aside, using raw strings for this text doesn't change anything, but if you want, you _can_ write it as

    f = r'absss'

if it will make you feel better :)

> How do I implement a regex on a multiline string? I thought this might work but there's a problem:
>
>     >>> p = re.compile('(ab*)(sss)', re.S)
>     >>> m = p.match( 'ab\nsss' )
>     >>> m.group(0)
>     Traceback (most recent call last):
>       File "<pyshell#26>", line 1, in <module>
>         m.group(0)
>     AttributeError: 'NoneType' object has no attribute 'group'

Well, it depends on what you want to do -- regexps are fairly precise, so if you want to allow whitespace between the two, you can use

    r = re.compile(r'(ab*)\s*(sss)')

If you want to allow whitespace anywhere, it gets uglier, and your capture/group results will contain that whitespace:

    r'(a\s*b*)\s*(s\s*s\s*s)'

Alternatively, if you don't want to allow arbitrary whitespace but only newlines, you can use "\n*" instead of "\s*".

-tkc
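The three patterns suggested in this reply can be tried side by side. This is only a sketch: the variable names and the extra test strings (such as 'ab sss' and 'a b\ns s\ns') are made up for illustration.

```python
import re

# Whitespace (spaces, tabs, newlines) allowed between the two groups.
between = re.compile(r'(ab*)\s*(sss)')

# Only newlines allowed between the two groups.
newlines_only = re.compile(r'(ab*)\n*(sss)')

# Whitespace allowed in more places; it ends up inside the captured groups.
anywhere = re.compile(r'(a\s*b*)\s*(s\s*s\s*s)')

between_groups = between.match('ab\nsss').groups()
newline_groups = newlines_only.match('ab\nsss').groups()
space_rejected = newlines_only.match('ab sss')          # a space is not a newline
anywhere_groups = anywhere.match('a b\ns s\ns').groups()

print(between_groups)
print(newline_groups)
print(space_rejected)
print(anywhere_groups)
```

Note that the captured text from the last pattern keeps the whitespace, so it would need cleaning up afterwards if only the letters are wanted.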
Re: Python's regular expression help
On Apr 29, 11:49 am, Tim Chase <python.l...@tim.thechases.com> wrote:
> Well, it depends on what you want to do -- regexps are fairly precise, so if you want to allow whitespace between the two, you can use
>
>     r = re.compile(r'(ab*)\s*(sss)')
>
> If you want to allow whitespace anywhere, it gets uglier, and your capture/group results will contain that whitespace:
>
>     r'(a\s*b*)\s*(s\s*s\s*s)'
>
> Alternatively, if you don't want to allow arbitrary whitespace but only newlines, you can use "\n*" instead of "\s*".
>
> -tkc

Yes, most of my problem is with my patterns, not with any Python re syntax. I thought re.S would take a multiline string with any spaces or newlines and make it appear as one line to the regex, so that \n would be ignored in a way. Still playing with it.

Thanks for the help!
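On what re.S actually does: it only changes what '.' matches (with the flag, '.' also matches a newline), so it has no effect on a pattern that contains no '.'. A short sketch follows. The last part, stripping newlines before matching, is just one possible way to get the "one line" behaviour described above and is an assumption about what is wanted, not something suggested in the thread.

```python
import re

text = 'ab\nsss'

# Without re.S, '.' does not match a newline; with re.S it does.
dot_plain = re.match(r'(ab*).(sss)', text)
dot_all = re.match(r'(ab*).(sss)', text, re.S)

# The original pattern has no '.', so re.S changes nothing for it.
no_dot = re.match(r'(ab*)(sss)', text, re.S)

# One possible way to "ignore" newlines: remove them before matching.
flattened = text.replace('\n', '')
flat_match = re.match(r'(ab*)(sss)', flattened)

print(dot_plain)
print(dot_all.groups())
print(no_dot)
print(flat_match.groups())
```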