On Apr 27, 8:26 am, Michael Hoffman <[EMAIL PROTECTED]> wrote:
> proctor wrote:
> > On Apr 27, 1:33 am, Paul McGuire <[EMAIL PROTECTED]> wrote:
> >> On Apr 27, 1:33 am, proctor <[EMAIL PROTECTED]> wrote:
> >>> rx_test = re.compile('/x([^x])*x/')
> >>> s = '/xabcx/'
> >>> if rx_test.findall(s):
> >>>         print rx_test.findall(s)
> >>> ============
> >>> i expect the output to be ['abc'] however it gives me only the last
> >>> single character in the group: ['c']
>
> >> As Josiah already pointed out, the * needs to be inside the grouping
> >> parens.
> > so my question remains, why doesn't the star quantifier seem to grab
> > all the data.
>
> Because you didn't use it *inside* the group, as has been said twice.
> Let's take a simpler example:
>
>  >>> import re
>  >>> text = "xabc"
>  >>> re_test1 = re.compile("x([^x])*")
>  >>> re_test2 = re.compile("x([^x]*)")
>  >>> re_test1.match(text).groups()
> ('c',)
>  >>> re_test2.match(text).groups()
> ('abc',)
>
> There are three places in text that match ([^x]). But each time the group
> matches, it overwrites the previous capture.
>
> > isn't findall() intended to return all matches?
>
> It returns all matches of the WHOLE pattern, /x([^x])*x/. Since you used
> grouping parentheses in there, it returns only that one group from each
> match.
>
> Back to my example:
>
>  >>> re_test1.findall("xabcxaaaxabc")
> ['c', 'a', 'c']
>
> Here it finds multiple matches, but only because the x occurs multiple
> times as well. In your example there is only one match.
>
> > i would expect either 'abc' or 'a', 'b', 'c' or at least just
> > 'a' (because that would be the first match).
>
> You are essentially doing this:
>
> group1 = "a"
> group1 = "b"
> group1 = "c"
>
> After those three statements, you wouldn't expect group1 to be "abc" or
> "a". You'd expect it to be "c".
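The overwrite analogy above can be sketched roughly as follows, assuming Python 3 syntax (the thread itself uses the Python 2 print statement). The names last_only, whole_run and each_char are made up for illustration. Since a quantified group keeps only its final iteration, one way to recover every iteration is to capture the whole run first and split it afterwards:

```python
import re

text = "xabc"

# Star outside the group: the group is re-entered for 'a', 'b', 'c',
# and only the final iteration is retained.
last_only = re.match(r"x([^x])*", text).group(1)

# Star inside the group: the group matches once and spans the whole run.
whole_run = re.match(r"x([^x]*)", text).group(1)

# To get each iteration separately, split the captured run afterwards.
each_char = list(whole_run)

print(last_only)   # c
print(whole_run)   # abc
print(each_char)   # ['a', 'b', 'c']
```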
> --
> Michael Hoffman

thank you all again for helping to clarify this for me.  of course you
were exactly right, and the problem lay not with python or the text,
but with me.  i mistakenly understood the text to be attempting to
capture the C style comment, when in fact it was merely matching it.
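for anyone reading this later, here is a small sketch of the matching versus capturing difference, assuming Python 3 and the same test string as in my original post. the variable names are only illustrative. with no capturing group, findall() returns the whole match; with the star inside the group it returns the captured body; with the star outside it returns only the last character:

```python
import re

s = "/xabcx/"

# Non-capturing group (?:...): the pattern only matches, so findall()
# returns the full text of each match.
matching_only = re.findall(r"/x(?:[^x])*x/", s)

# Star inside the capturing group: findall() returns the captured body.
capturing = re.findall(r"/x([^x]*)x/", s)

# Star outside the capturing group (the pattern from the original post):
# only the last repetition of the group survives.
original = re.findall(r"/x([^x])*x/", s)

print(matching_only)  # ['/xabcx/']
print(capturing)      # ['abc']
print(original)       # ['c']
```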

apologies.

sincerely,
proctor

-- 
http://mail.python.org/mailman/listinfo/python-list
