On Thursday, 19 April 2012 07:11:54 UTC+1, Sania wrote:
> Hi,
> So I am trying to get the number of casualties in a text. After 'death
> toll' in the text the number I need is presented as you can see from
> the variable called text. Here is my code
> I'm pretty sure my regex is correct, I think it's the group part
> that's the problem.
> I am using nltk by python. Group grabs the string in parenthesis and
> stores it in deadnum and I make deadnum into a list.
>
> text="accounts put the death toll at 637 and those missing at
> 653 , but the total number is likely to be much bigger"
> dead=re.match(r".*death toll.*(\d[,\d\.]*)", text)
> deadnum=dead.group(1)
> deaths.append(deadnum)
> print deaths
>
> Any help would be appreciated,
> Thank you,
> Sania

Or just don't fully rely on a regex. For the sake of time, and the little sanity I believe I have left, I would just do something like:

    death_toll = re.search(r'death toll.*?\d+', text).group().rsplit(' ', 1)[1]

(The non-greedy .*? matters here: a greedy .* would run on to the last number in the text, 653.)

hth,
Jon.
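
For what it's worth, here is a minimal sketch of why the pattern in the question captures only a single digit, plus one possible way around it. It assumes Python 3, that the number wanted is the first one after 'death toll', and that thousands are grouped with commas. The names DEATH_TOLL and extract_death_toll are made up for illustration and are not from the original post.

```python
import re

text = ("accounts put the death toll at 637 and those missing at 653 , "
        "but the total number is likely to be much bigger")

# Pattern from the question: the greedy ".*" after "death toll" runs to the
# end of the string, then backtracks only until "\d" can match one digit.
greedy = re.match(r".*death toll.*(\d[,\d\.]*)", text)
print(greedy.group(1))  # prints 3 (the last digit of 653)

# Hypothetical alternative: a non-greedy ".*?" stops at the first digit after
# "death toll"; the group takes digits with optional ",ddd" thousands groups.
DEATH_TOLL = re.compile(r"death toll.*?(\d+(?:,\d{3})*)")


def extract_death_toll(text):
    """Return the first number after 'death toll' as an int, or None."""
    match = DEATH_TOLL.search(text)
    if match is None:
        return None
    # Drop thousands separators before converting.
    return int(match.group(1).replace(",", ""))


print(extract_death_toll(text))  # prints 637
```

Note that "." does not match a newline by default, so if the number can sit on a different line from 'death toll', the pattern would need re.DOTALL or the text would need its line breaks collapsed first.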