On Sep 6, 10:06 pm, "Mark Tolonen" <metolone+gm...@gmail.com> wrote:
> <gburde...@gmail.com> wrote in message
> news:f98a6057-c35f-4843-9efb-7f36b05b6...@g19g2000yqo.googlegroups.com...
> > If I do this:
> > import re
> > a=re.search(r'hello.*?money',  'hello how are you hello funny money')
> > I would expect a.group(0) to be "hello funny money", since .*? is a
> > non-greedy match. But instead, I get the whole sentence, "hello how
> > are you hello funny money".
> > Is this expected behavior? How can I specify the correct regexp so
> > that I get "hello funny money" ?
> A non-greedy match matches the fewest characters before matching the text
> *after* the non-greedy match.  For example:
> >>> import re
> >>> a=re.search(r'hello.*?money','hello how are you hello funny money and more money')
> >>> a.group(0)  # non-greedy stops at the first money
> 'hello how are you hello funny money'
> >>> a=re.search(r'hello.*money','hello how are you hello funny money and more money')
> >>> a.group(0)  # greedy keeps going to the last money
> 'hello how are you hello funny money and more money'
> This is why it is difficult to use regular expressions to match nested
> objects like parentheses or XML tags.  In your case you'll need something
> extra to not match the first hello.
> >>> a=re.search(r'(?<!^)hello.*?money','hello how are you hello funny money')
> >>> a.group(0)
> 'hello funny money'
> -Mark

I see now. I also understand r's response. But what if there are many
occurrences of "hello" before "money", and I don't know how many there
are? In other words, I want to find every occurrence of "money", and
for each occurrence, I want to scan in the reverse (left) direction to
the closest occurrence of "hello". How can this be done?
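
To make the question concrete, here is a rough sketch of the behavior I am
after, not a definitive solution. The function name
closest_hello_to_each_money and the TEMPERED pattern are my own
placeholders. The first version does the leftward scan literally with
str.rfind; the second is a regex-only variant that refuses to let the gap
between "hello" and "money" contain another "hello". The two can differ
when one "hello" is followed by several "money"s, since re.findall
returns only non-overlapping matches.

```python
import re

def closest_hello_to_each_money(text, start="hello", end="money"):
    """For each `end` in text, return the substring that begins at the
    nearest `start` to its left and runs through that `end`."""
    results = []
    for m in re.finditer(re.escape(end), text):
        # rfind returns the last `start` that lies entirely before this `end`
        i = text.rfind(start, 0, m.start())
        if i != -1:  # skip an `end` that has no `start` before it
            results.append(text[i:m.end()])
    return results

# Regex-only variant: each character in the gap is consumed only if
# "hello" does not begin at that position (negative lookahead).
TEMPERED = re.compile(r"hello(?:(?!hello).)*?money")

text = "hello how are you hello funny money"
print(closest_hello_to_each_money(text))  # ['hello funny money']
print(TEMPERED.findall(text))             # ['hello funny money']
```

Note that "." does not match a newline unless re.DOTALL is passed, so the
regex variant as written only works within a single line.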
