Re: Regular expressions, help?

azrazer Thu, 19 Apr 2012 06:23:18 -0700

On 19/04/2012 14:02, Sania wrote:

On Apr 19, 2:48 am, Jussi Piitulainen<jpiit...@ling.helsinki.fi>

[...]

       text="accounts put the death toll at 637 and those missing at 653 , but the total number is likely to be much bigger"
       dead=re.match(r".*death toll.*(\d[,\d\.]*)", text)
       deadnum=dead.group(1)
       deaths.append(deadnum)
       print deaths


It's the regexp. The .* after "death toll" eats the input as far as it
can without making the whole match fail. The group matches only the
last digit in the text.
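
To see this concretely, here is a minimal sketch. It assumes Python 3 (so print() is a function, unlike the print statement used in the thread) and joins the sample sentence onto a single line:

```python
import re

# Sample sentence from the thread, joined onto one line.
text = ("accounts put the death toll at 637 and those missing at "
        "653 , but the total number is likely to be much bigger")

# The greedy .* after "death toll" runs to the end of the text, then
# backs off one character at a time until \d can match. The first digit
# it finds that way is the LAST digit in the text.
m = re.match(r".*death toll.*(\d[,\d\.]*)", text)
print(m.group(1))  # 3
```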

You could allow only non-digits before the number. Or you could look
up the variant of * that only matches as much as it must.
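
Both suggestions can be sketched as follows (again assuming Python 3; the variable names are only illustrative). The "variant of *" is the non-greedy quantifier *?:

```python
import re

text = ("accounts put the death toll at 637 and those missing at "
        "653 , but the total number is likely to be much bigger")

# Option 1: allow only non-digits between "death toll" and the number.
non_digits = re.match(r".*death toll\D*(\d[,\d\.]*)", text)

# Option 2: non-greedy .*? matches as little as it must.
lazy = re.match(r".*death toll.*?(\d[,\d\.]*)", text)

print(non_digits.group(1))  # 637
print(lazy.group(1))        # 637
```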


Hey Thanks,
So now my regex is

     dead=re.match(r".*death toll.{0,20}(\d[,\d\.]*)", text)

Hi,
But here, your regex matches:

<something>death toll<anything of length <= 20> followed by what you capture (which consists of at least one digit)

There are at least two issues here:

 - the number of characters between "death toll" and the figure may be > 20

 - your {0,20} is greedy => .{0,20} matches as many characters as it can, AND a single digit is enough for (\d[,\d\.]*), since your group captures one digit followed (or not) by digits, commas, or dots
   ====> so " at 63" is swallowed by .{0,20} and (\d[,\d\.]*) matches the remaining digit "7"
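
Both issues can be checked with a short sketch (Python 3 assumed). The extra capturing group around .{0,20} and the second sentence, long_gap, are additions made here purely for illustration:

```python
import re

text = ("accounts put the death toll at 637 and those missing at "
        "653 , but the total number is likely to be much bigger")

# Extra group around .{0,20} to show what the greedy quantifier takes.
greedy = re.match(r".*death toll(.{0,20})(\d[,\d\.]*)", text)
print(repr(greedy.group(1)))  # ' at 63'
print(greedy.group(2))        # 7

# The lazy form {0,20}? stops at the first digit instead.
lazy = re.match(r".*death toll.{0,20}?(\d[,\d\.]*)", text)
print(lazy.group(1))          # 637

# Hypothetical sentence with more than 20 characters before the figure:
# the {0,20} limit makes the whole match fail.
long_gap = "the death toll, according to officials, has reached 637"
print(re.match(r".*death toll.{0,20}(\d[,\d\.]*)", long_gap))  # None
```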


a solution would be to follow what Jussi suggested...
=> dead=re.match(r".*death toll\D*(\d*)", text)
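
A quick check of that pattern, with two caveats worth knowing (Python 3 assumed; the sentences with_comma and no_number are made up for this sketch). (\d*) stops at the first comma or dot, and it can also match an empty string when no number follows:

```python
import re

digits_only = r".*death toll\D*(\d*)"           # pattern suggested above
with_punct = r".*death toll\D*(\d[,\d\.]*)"     # keeps the original group

text = ("accounts put the death toll at 637 and those missing at "
        "653 , but the total number is likely to be much bigger")
print(re.match(digits_only, text).group(1))        # 637

# Hypothetical input with a thousands separator.
with_comma = "officials put the death toll at 1,234 people"
print(re.match(digits_only, with_comma).group(1))  # 1
print(re.match(with_punct, with_comma).group(1))   # 1,234

# Hypothetical input without any number: \d* happily matches nothing,
# while the \d in the other pattern makes the match fail.
no_number = "the death toll is unknown"
print(repr(re.match(digits_only, no_number).group(1)))  # ''
print(re.match(with_punct, no_number))                  # None
```

Which group to use depends on whether the figures in your data contain separators; either way, checking the result for None before calling .group() avoids an AttributeError.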


But I only find 7, not 637. How is it that the group is only matching
the last digit?

=> the greediness of .{0,20}

The whole thing is in parentheses, not just the last part?

Yeah, but only one digit is left by the time your group gets to match...

Good luck understanding regexes, it's a powerful tool ! :)

best,
azra.

--
http://mail.python.org/mailman/listinfo/python-list
