On 06/12/12 00:13, Ed Owens wrote:
 >>> str(string)
'[<div class="wx-timestamp">\n<div class="wx-subtitle wx-timestamp">Updated: Dec 5, 2012, 5:08pm EST</div>\n</div>]'
 >>> m = re.search('":\b(\w+\s+\d+,\s+\d+,\s+\d+:\d+.m\s+\w+)<', str(string))
 >>> print m
None
 >>>

I'm sort of embarrassed to ask this, but I've been staring at this
regular expression for hours and can't see why it doesn't work.

When using regex I always try the simplest things first.
Now, I'm not sure how much of the time element you want,
but if it's just the 'Dec 5, 2012, 5:08pm EST' bit,
you can do it thusly:

>>> s = '[<div class="wx-timestamp">\n<div class="wx-subtitle wx-timestamp">Updated: Dec 5, 2012, 5:08pm EST</div>\n</div>]'

>>> m = re.search('Updated:(.+)<', s)
>>> m
<_sre.SRE_Match object at 0x7f48d412d5d0>
>>> m.groups()
(' Dec 5, 2012, 5:08pm EST',)
>>> m.groups()[0].strip()
'Dec 5, 2012, 5:08pm EST'
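
The same idea can be wrapped up as a small standalone script. This is only a sketch: the function name extract_updated is made up for illustration, and it assumes the timestamp always sits between 'Updated:' and the next '<'. It uses [^<]+ instead of .+ so the match stops at the first '<' even if more tags follow on the same line.

```python
import re

# Sample text copied from the interpreter session above.
s = ('[<div class="wx-timestamp">\n<div class="wx-subtitle wx-timestamp">'
     'Updated: Dec 5, 2012, 5:08pm EST</div>\n</div>]')


def extract_updated(text):
    """Return the text between 'Updated:' and the next '<', or None.

    Hypothetical helper for illustration only.
    """
    # Raw string so the backslash in \s reaches the regex engine unchanged.
    m = re.search(r'Updated:\s*([^<]+)<', text)
    return m.group(1).strip() if m else None


print(extract_updated(s))
```

Returning None when nothing matches lets the caller check for failure before calling any match methods.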

Now, that might be too simplistic for some of your other scenarios, but I'd be inclined to start small and work out rather than trying to be too generic too soon.
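
Working out from the simple pattern, a stricter version might spell out the date and time fields. Treat this as a sketch under assumptions: the field layout is guessed from the single sample in this thread, and the name STRICT is just for illustration. Two things are worth noticing along the way. In an ordinary (non-raw) string literal '\b' is a backspace character, not a regex word boundary, so the pattern is written as a raw string. And the sample text contains no '":' sequence at all, so the pattern is anchored on 'Updated:' instead.

```python
import re

# Sample text copied from the interpreter session above.
s = ('[<div class="wx-timestamp">\n<div class="wx-subtitle wx-timestamp">'
     'Updated: Dec 5, 2012, 5:08pm EST</div>\n</div>]')

# In a normal string literal, '\b' is one character (backspace, 0x08);
# in a raw string it stays as backslash + 'b' for the regex engine.
print(repr('\b'), len('\b'), len(r'\b'))

# Assumed shape: month name, day, year, H:MM followed by am/pm, then a zone.
STRICT = re.compile(r'Updated:\s*(\w+\s+\d+,\s+\d+,\s+\d+:\d+[ap]m\s+\w+)<')

m = STRICT.search(s)
timestamp = m.group(1) if m else None
print(timestamp)
```

If the stricter pattern stops matching on other pages, falling back to the simple 'Updated:(.+)<' version is a quick way to see which field broke.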

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/

