Re: why does this call to re.findall() loop forever?

Terry Reedy Sun, 09 Nov 2008 15:43:00 -0800

[EMAIL PROTECTED] wrote:

Hi everyone,


I am using Python's re module to extract some data from html. The
following code never returns, and I was wondering if someone can
explain to me why. Is this a problem with my regexp? (I tried really
hard to find it.)

[snip] html/xml string

regexp = re.compile(
    r'<li class="post".*?<h4 class="desc"><a href="(.*?)" rel="nofollow">(.*?)</a>'
    r'.*?</div>\s*(?:<p class="notes">(.*?)</p>)?'
    r'.*?<div class="meta">(?:to ((?:<a class="tag".*?> )+))*'
    r'.*?<span class="date" title="(.*?)">.*?</span>\s*</div>.*?</li>',
    re.DOTALL)

re.findall(regexp, s)

Python has several modules for parsing and working with XML. Do you not know of them, or is there some reason they won't work?
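
As a rough illustration of the parser route, here is a minimal sketch using the standard library's html.parser module (named HTMLParser in the Python 2 releases current when this thread was written). It pulls only the link URL and link text out of each <h4 class="desc"> block. The class name PostLinkParser, the helper extract_links, and the sample markup are all hypothetical: the real page was snipped from the post, so the structure is assumed from the tags named in the regexp.

```python
from html.parser import HTMLParser


class PostLinkParser(HTMLParser):
    """Collect (href, link text) pairs from <h4 class="desc"><a href=...> elements."""

    def __init__(self):
        super().__init__()
        self.links = []        # finished (href, text) pairs
        self._in_desc = False  # True while inside an <h4 class="desc">
        self._href = None      # href of the <a> currently being read, if any
        self._text = []        # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h4" and attrs.get("class") == "desc":
            self._in_desc = True
        elif tag == "a" and self._in_desc and "href" in attrs:
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link we are tracking.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None
        elif tag == "h4":
            self._in_desc = False


def extract_links(html):
    """Hypothetical helper: parse one HTML string and return the collected links."""
    parser = PostLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample input; the markup is assumed, not taken from the original page.
sample = (
    '<li class="post"><h4 class="desc">'
    '<a href="http://example.com/" rel="nofollow">Example</a></h4>'
    '<div class="meta"><span class="date" title="2008-11-09">Nov 08</span></div></li>'
)
links = extract_links(sample)
```

The parser makes a single pass over the input and never backtracks, so its running time does not depend on how the optional parts of each post are nested. The notes, tags, and date could be collected the same way by tracking the corresponding start tags.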


--
http://mail.python.org/mailman/listinfo/python-list
