Python's regular expression help

2010-04-29 Thread goldtech
Hi,
Trying to start out with simple things but apparently there's some
basics I need help with. This works OK:
>>> import re
>>> p = re.compile('(ab*)(sss)')
>>> m = p.match( 'absss' )
>>> m.group(0)
'absss'
>>> m.group(1)
'ab'
>>> m.group(2)
'sss'
...
But two questions:

How can I operate a regex on a string variable?
I'm doing something wrong here:

>>> f=r'abss'
>>> f
'abss'
>>> m = p.match( f )
>>> m.group(0)
Traceback (most recent call last):
  File "<pyshell#15>", line 1, in <module>
    m.group(0)
AttributeError: 'NoneType' object has no attribute 'group'

How do I implement a regex on a multiline string?  I thought this
might work but there's a problem:

>>> p = re.compile('(ab*)(sss)', re.S)
>>> m = p.match( 'ab\nsss' )
>>> m.group(0)
Traceback (most recent call last):
  File "<pyshell#26>", line 1, in <module>
    m.group(0)
AttributeError: 'NoneType' object has no attribute 'group'


Thanks for the newbie regex help, Lee
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's regular expression help

2010-04-29 Thread Dodo

On 29/04/2010 20:00, goldtech wrote:
> Hi,
> Trying to start out with simple things but apparently there's some
> basics I need help with. This works OK:
> >>> import re
> >>> p = re.compile('(ab*)(sss)')
> >>> m = p.match( 'absss' )
> >>> m.group(0)
> 'absss'
> >>> m.group(1)
> 'ab'
> >>> m.group(2)
> 'sss'
> ...
> But two questions:
>
> How can I operate a regex on a string variable?
> I'm doing something wrong here:
>
> >>> f=r'abss'
> >>> f
> 'abss'
> >>> m = p.match( f )
> >>> m.group(0)
> Traceback (most recent call last):
>   File "<pyshell#15>", line 1, in <module>
>     m.group(0)
> AttributeError: 'NoneType' object has no attribute 'group'
>
> How do I implement a regex on a multiline string?  I thought this
> might work but there's a problem:
>
> >>> p = re.compile('(ab*)(sss)', re.S)
> >>> m = p.match( 'ab\nsss' )
> >>> m.group(0)
> Traceback (most recent call last):
>   File "<pyshell#26>", line 1, in <module>
>     m.group(0)
> AttributeError: 'NoneType' object has no attribute 'group'
>
> Thanks for the newbie regex help, Lee


For multiline, I use re.DOTALL.

I don't know match(); findall is pretty efficient:
>>> my = "<a href=\"hello world.com\">LINK</a>"
>>> res = re.findall(">(.*?)<", my)
>>> res
['LINK']
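
For comparison, here is a small sketch of how match(), search() and findall() behave on the same sample string. The variable name `first` is only for illustration and is not from the original message.

```python
import re

# Sample string from the message above.
my = '<a href="hello world.com">LINK</a>'

# match() only succeeds if the pattern matches at the very start of the string.
print(re.match(r'>(.*?)<', my))        # None: the string starts with '<', not '>'

# search() scans forward and returns the first match found anywhere.
first = re.search(r'>(.*?)<', my)
print(first.group(1))                  # LINK

# findall() returns every non-overlapping match; with a single group it
# returns a list holding that group's text for each match.
print(re.findall(r'>(.*?)<', my))      # ['LINK']
```

So match() is not a slower findall(); it just anchors the pattern at position 0, which is why it suits the original 'absss' test.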

Dorian


Re: Python's regular expression help

2010-04-29 Thread MRAB

goldtech wrote:
> Hi,
> Trying to start out with simple things but apparently there's some
> basics I need help with. This works OK:
> >>> import re
> >>> p = re.compile('(ab*)(sss)')
> >>> m = p.match( 'absss' )
> >>> m.group(0)
> 'absss'
> >>> m.group(1)
> 'ab'
> >>> m.group(2)
> 'sss'
> ...
> But two questions:
>
> How can I operate a regex on a string variable?
> I'm doing something wrong here:
>
> >>> f=r'abss'
> >>> f
> 'abss'
> >>> m = p.match( f )
> >>> m.group(0)
> Traceback (most recent call last):
>   File "<pyshell#15>", line 1, in <module>
>     m.group(0)
> AttributeError: 'NoneType' object has no attribute 'group'


Look closely: the regex contains 3 letter 's', but the string referred
to by f has only 2.
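
A minimal sketch of the same point, with a guard so a failed match does not raise AttributeError. The helper name `describe` is made up for this example.

```python
import re

p = re.compile('(ab*)(sss)')

def describe(text):
    """Return the captured groups, or None when the pattern does not match."""
    m = p.match(text)
    if m is None:          # match() returns None on failure; it does not raise
        return None
    return m.groups()

print(describe('absss'))   # ('ab', 'sss')
print(describe('abss'))    # None: only two 's' follow 'ab'
```

Passing a variable or a literal makes no difference; what matters is the text the string contains.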


> How do I implement a regex on a multiline string?  I thought this
> might work but there's a problem:
>
> >>> p = re.compile('(ab*)(sss)', re.S)
> >>> m = p.match( 'ab\nsss' )
> >>> m.group(0)
> Traceback (most recent call last):
>   File "<pyshell#26>", line 1, in <module>
>     m.group(0)
> AttributeError: 'NoneType' object has no attribute 'group'
>
> Thanks for the newbie regex help, Lee


The string contains a newline between the 'b' and the 's', but the regex
isn't expecting any newline (or any other character) between the 'b' and
the 's', hence no match.
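
A short sketch of this, assuming the same pattern and string as in the question: the flag changes nothing because the pattern contains no '.', while naming the newline in the pattern does.

```python
import re

text = 'ab\nsss'

# re.S (re.DOTALL) only changes what '.' matches; this pattern has no '.',
# so the result is the same with or without the flag.
print(re.compile('(ab*)(sss)').match(text))         # None
print(re.compile('(ab*)(sss)', re.S).match(text))   # None

# Putting the newline into the pattern explicitly lets it match.
m = re.compile('(ab*)\n(sss)').match(text)
print(m.groups())                                   # ('ab', 'sss')
```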


Re: Python's regular expression help

2010-04-29 Thread Tim Chase

On 04/29/2010 01:00 PM, goldtech wrote:
> Trying to start out with simple things but apparently there's some
> basics I need help with. This works OK:
> >>> import re
> >>> p = re.compile('(ab*)(sss)')
> >>> m = p.match( 'absss' )
>
> >>> f=r'abss'
> >>> f
> 'abss'
> >>> m = p.match( f )
> >>> m.group(0)
> Traceback (most recent call last):
>   File "<pyshell#15>", line 1, in <module>
>     m.group(0)
> AttributeError: 'NoneType' object has no attribute 'group'


'absss' != 'abss'

Your regexp looks for 3 s, your f contains only 2.  So the 
regexp object doesn't, well, match.  Try


  f = 'absss'

and it will work.  As an aside, using raw-strings for this text 
doesn't change anything, but if you want, you _can_ write it as


  f = r'absss'

if it will make you feel better :)
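
A quick sketch of what the r prefix does and does not change. The `ab\s*sss` pattern in the last line is just an illustration.

```python
import re

# Without backslashes, a raw string and a normal string are identical.
print(r'absss' == 'absss')       # True

# The prefix only matters once backslashes appear.
print(len('\n'), len(r'\n'))     # 1 2

# That is why patterns are usually written raw: r'\s' reaches the re module
# as backslash + 's' without relying on how Python treats unknown escapes.
print(re.match(r'ab\s*sss', 'ab \nsss') is not None)   # True
```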


> How do I implement a regex on a multiline string?  I thought this
> might work but there's a problem:
>
> >>> p = re.compile('(ab*)(sss)', re.S)
> >>> m = p.match( 'ab\nsss' )
> >>> m.group(0)
> Traceback (most recent call last):
>   File "<pyshell#26>", line 1, in <module>
>     m.group(0)
> AttributeError: 'NoneType' object has no attribute 'group'


Well, it depends on what you want to do -- regexps are fairly 
precise, so if you want to allow whitespace between the two, you 
can use


  r = re.compile(r'(ab*)\s*(sss)')

If you want to allow whitespace anywhere, it gets uglier, and 
your capture/group results will contain that whitespace:


  r'(a\s*b*)\s*(s\s*s\s*s)'

Alternatively, if you don't want to allow arbitrary whitespace 
but only newlines, you can use \n* instead of \s*
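
The three variants can be tried side by side as below. This is only a sketch; the names `between`, `anywhere` and `newlines` and the test strings are made up for the example.

```python
import re

text = 'ab\nsss'

# Whitespace allowed only between the two groups.
between = re.compile(r'(ab*)\s*(sss)')
print(between.match(text).groups())             # ('ab', 'sss')

# Whitespace allowed anywhere; the groups keep whatever whitespace they matched.
anywhere = re.compile(r'(a\s*b*)\s*(s\s*s\s*s)')
print(anywhere.match('a b\ns s\ts').groups())   # ('a b', 's s\ts')

# Newlines only: a space between the groups no longer matches.
newlines = re.compile(r'(ab*)\n*(sss)')
print(newlines.match('ab\n\nsss').groups())     # ('ab', 'sss')
print(newlines.match('ab sss'))                 # None
```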


-tkc





Re: Python's regular expression help

2010-04-29 Thread goldtech
On Apr 29, 11:49 am, Tim Chase <python.l...@tim.thechases.com> wrote:
> On 04/29/2010 01:00 PM, goldtech wrote:
>
> > Trying to start out with simple things but apparently there's some
> > basics I need help with. This works OK:
> > >>> import re
> > >>> p = re.compile('(ab*)(sss)')
> > >>> m = p.match( 'absss' )
>
> > >>> f=r'abss'
> > >>> f
> > 'abss'
> > >>> m = p.match( f )
> > >>> m.group(0)
> > Traceback (most recent call last):
> >   File "<pyshell#15>", line 1, in <module>
> >     m.group(0)
> > AttributeError: 'NoneType' object has no attribute 'group'
>
> 'absss' != 'abss'
>
> Your regexp looks for 3 s, your f contains only 2.  So the
> regexp object doesn't, well, match.  Try
>
>    f = 'absss'
>
> and it will work.  As an aside, using raw-strings for this text
> doesn't change anything, but if you want, you _can_ write it as
>
>    f = r'absss'
>
> if it will make you feel better :)
>
> > How do I implement a regex on a multiline string?  I thought this
> > might work but there's a problem:
>
> > >>> p = re.compile('(ab*)(sss)', re.S)
> > >>> m = p.match( 'ab\nsss' )
> > >>> m.group(0)
> > Traceback (most recent call last):
> >   File "<pyshell#26>", line 1, in <module>
> >     m.group(0)
> > AttributeError: 'NoneType' object has no attribute 'group'
>
> Well, it depends on what you want to do -- regexps are fairly
> precise, so if you want to allow whitespace between the two, you
> can use
>
>    r = re.compile(r'(ab*)\s*(sss)')
>
> If you want to allow whitespace anywhere, it gets uglier, and
> your capture/group results will contain that whitespace:
>
>    r'(a\s*b*)\s*(s\s*s\s*s)'
>
> Alternatively, if you don't want to allow arbitrary whitespace
> but only newlines, you can use \n* instead of \s*
>
> -tkc

Yes, most of my problem is with my patterns, not with any Python re syntax.

I thought re.S would take a multiline string with any spaces or
newlines and make it appear as one line to the regex, so that \n would
be ignored in a way... still playing with it. Thanks for the help!
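
For the record, a small sketch of what re.S actually does, plus one possible way to get the "ignore the newlines" behaviour by removing them before matching. The name `flattened` is made up for this example.

```python
import re

# re.S (an alias of re.DOTALL) changes one thing: '.' also matches '\n'.
print(re.match('a.b', 'a\nb'))                     # None
print(re.match('a.b', 'a\nb', re.S) is not None)   # True

# To have newlines ignored, one option is to strip them out first.
text = 'ab\nsss'
flattened = text.replace('\n', '')
m = re.match('(ab*)(sss)', flattened)
print(m.groups())                                  # ('ab', 'sss')
```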