Re: [Tutor] RE Silliness

2009-01-05 Thread bob gailer

Omer wrote:

Bob, I tried your way.

>>> import re
>>> urlMask = r"http://[\w\Q./\?=\R]+()?"
>>> text=u"Not working examplehttp://this.is.a/url?header=nullAnd another linehttp://and.another.url";

>>> re.findall(urlMask,text)
[u'', u'']


Oops, I failed to notice you were using findall. Kent explained it.

Another way to fix it is to make the () a non-capturing group: (?:)

--
Bob Gailer
Chapel Hill NC 
919-636-4239


___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] RE Silliness

2009-01-05 Thread Kent Johnson
On Mon, Jan 5, 2009 at 11:16 AM, Omer  wrote:
> Bob, I tried your way.
>
> >>> import re
> >>> urlMask = r"http://[\w\Q./\?=\R]+()?"
> >>> text=u"Not working examplehttp://this.is.a/url?header=nullAnd another linehttp://and.another.url";
> >>> re.findall(urlMask,text)
> [u'', u'']
>
> spir, I did understand it. What I don't understand is why this isn't
> working.

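There is a bit of a gotcha in re.findall(): its behaviour changes
depending on whether there are groups in the re. If the re contains
one group, re.findall() returns only the text matched by that group
(an empty string when the group did not take part in the match); with
several groups it returns one tuple per match.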
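
A minimal sketch of the three behaviours, written for Python 3. The sample text, the simplified character class and the "-end" suffix are assumptions made up for illustration; the actual optional ending is missing from the archived messages.

```python
import re

# Hypothetical sample text; "-end" stands in for the lost optional ending.
text = "see http://this.is.a/url?header=null-end and http://and.another.url here"

# Simplified character class: recent Python 3 releases treat unknown
# letter escapes such as \Q and \R as errors, so they are left out here.
url = r"http://[\w./?=]+"

# No groups: findall returns each whole match.
no_groups = re.findall(url, text)

# One group: findall returns only what that group matched
# ('' when the group did not take part in the match).
one_group = re.findall(url + r"(-end)?", text)

# Two groups: findall returns one tuple per match, one item per group.
two_groups = re.findall("(" + url + r"(-end)?)", text)

print(no_groups)
print(one_group)
print(two_groups)
```

The one-group case is the shape of the original problem: the list holds the group's text, not the URLs.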

If you enclose the entire re in parentheses (making it a group) you
get a better result:
In [2]: urlMask = r"(http://[\w\Q./\?=\R]+()?)"

In [3]: text=u"Not working examplehttp://this.is.a/url?header=nullAnd another linehttp://and.another.url";

In [4]: re.findall(urlMask,text)
Out[4]:
[(u'http://this.is.a/url?header=null', u''),
 (u'http://and.another.url', u'')]

You can also use non-capturing parentheses around the optional ending:
In [5]: urlMask = r"http://[\w\Q./\?=\R]+(?:)?"

In [6]: re.findall(urlMask,text)
Out[6]: [u'http://this.is.a/url?header=null', u'http://and.another.url']
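
A further option, not raised in the thread, is re.finditer(), whose match objects always expose the whole match as group(0) regardless of any groups in the pattern. This is only a sketch for Python 3; the sample text and the "-end" suffix are hypothetical stand-ins, since the real optional ending is missing from the archive.

```python
import re

# Hypothetical sample text; "-end" stands in for the lost optional ending.
text = "see http://this.is.a/url?header=null-end and http://and.another.url here"

capturing = r"http://[\w./?=]+(-end)?"
non_capturing = r"http://[\w./?=]+(?:-end)?"

# group(0) is the entire match, whatever groups the pattern contains.
via_finditer = [m.group(0) for m in re.finditer(capturing, text)]

# With a non-capturing group, findall falls back to returning whole matches.
via_findall = re.findall(non_capturing, text)

print(via_finditer)
print(via_findall)
```

Both lists should hold the full URLs including the suffix where present, so the choice is mostly a matter of whether the group's text is needed separately.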

Kent


Re: [Tutor] RE Silliness

2009-01-05 Thread Omer
Bob, I tried your way.

>>> import re
>>> urlMask = r"http://[\w\Q./\?=\R]+()?"
>>> text=u"Not working examplehttp://this.is.a/url?header=nullAnd another linehttp://and.another.url";
>>> re.findall(urlMask,text)
[u'', u'']

spir, I did understand it. What I don't understand is why this isn't
working.

(Whereas,

>>> OldurlMask = r"http://[\w\Q./\?=\R]+";   #Not f-ing working.
>>> re.findall(OldurlMask,text)
['http://this.is.a/url?header=null', 'http://and.another.url']

does work, which is what had me frowning.
Also, this ugly url mask is working:

>>> UglyUrlMask = r"(http://[\w\Q./\?=\R]+|http://[\w\Q./\?=\R]+)"
>>> re.findall(UglyUrlMask,text)
['http://this.is.a/url?header=null', 'http://and.another.url']

Anyone?)

On Mon, Jan 5, 2009 at 12:08 AM, spir  wrote:

> On Sun, 04 Jan 2009 14:09:53 -0500
> bob gailer  wrote:
>
> > Omer wrote:
> > > I'm sorry, burrowed into the reference until my eyes bled.
> > >
> > > What I want is to have a regular expression with an optional ending of
> > > ""
> > >
> > > (For those interested,
> > > urlMask = r"http://[\w\Q./\?=\R]+";
> > > is the version w/o the optional ending.)
> > >
> > > I can't seem to make a string optional, only a single character via
> > > []s. For some reason I thought it would be ()s, but no help there: it
> > > just returns only the . Anybody?
> > >
> > urlMask = r"http://[\w\Q./\?=\R]+()?"
> >
> >  From the docs: ? Causes the resulting RE to match 0 or 1 repetitions of
> > the preceding RE. ab? will match either 'a' or 'ab'.
> >
> >
>
> Maybe Omer had not noticed that a sub-expression can be grouped in () so that
> an operator (?, + or *) applies to the whole group.
> Denis
>
> --
> la vida e estranya
>


Re: [Tutor] RE Silliness

2009-01-04 Thread spir
On Sun, 04 Jan 2009 14:09:53 -0500
bob gailer  wrote:

> Omer wrote:
> > I'm sorry, burrowed into the reference until my eyes bled.
> >
> > What I want is to have a regular expression with an optional ending of 
> > ""
> >
> > (For those interested,
> > urlMask = r"http://[\w\Q./\?=\R]+";
> > is the version w/o the optional ending.)
> >
> > I can't seem to make a string optional, only a single character via
> > []s. For some reason I thought it would be ()s, but no help there: it
> > just returns only the . Anybody?
> >
> urlMask = r"http://[\w\Q./\?=\R]+()?"
> 
>  From the docs: ? Causes the resulting RE to match 0 or 1 repetitions of 
> the preceding RE. ab? will match either 'a' or 'ab'.
> 
> 

Maybe Omer had not noticed that a sub-expression can be grouped in () so that an
operator (?, + or *) applies to the whole group.
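
A small sketch of that difference in Python 3, using the "ab" example from the docs quote. The helper name matches() is made up for illustration, and a non-capturing group is used here so that only the scope of the ? changes.

```python
import re


def matches(pattern, s):
    """Return True if the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None


samples = ("a", "ab", "")

# "ab?": the ? covers only the "b", so "a" and "ab" match.
single = [matches(r"ab?", s) for s in samples]

# "(?:ab)?": the ? covers the whole group, so "ab" and "" match.
grouped = [matches(r"(?:ab)?", s) for s in samples]

print(single)
print(grouped)
```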
Denis

--
la vida e estranya


Re: [Tutor] RE Silliness

2009-01-04 Thread bob gailer

Omer wrote:

I'm sorry, burrowed into the reference until my eyes bled.

What I want is to have a regular expression with an optional ending of 
""


(For those interested,
urlMask = r"http://[\w\Q./\?=\R]+";
is the version w/o the optional ending.)

I can't seem to make a string optional, only a single character via
[]s. For some reason I thought it would be ()s, but no help there: it
just returns only the . Anybody?



urlMask = r"http://[\w\Q./\?=\R]+()?"

From the docs: ? Causes the resulting RE to match 0 or 1 repetitions of 
the preceding RE. ab? will match either 'a' or 'ab'.



--
Bob Gailer
Chapel Hill NC 
919-636-4239




[Tutor] RE Silliness

2009-01-04 Thread Omer
I'm sorry, burrowed into the reference until my eyes bled.

What I want is to have a regular expression with an optional ending of
""

(For those interested,
urlMask = r"http://[\w\Q./\?=\R]+";
is the version w/o the optional ending.)

I can't seem to make a string optional, only a single character via []s. For
some reason I thought it would be ()s, but no help there: it just returns
only the . Anybody?

Thx,
Omer.