[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>  My apologies, given that Google Groups  messes up the formatting, the
>  regexp should read
> 
>  regexp = re.compile("""<li class=\"post\".*?<h4 class=\"desc\"><a
>  href=
>  \"(.*?)\" rel=\"nofollow\">(.*?)</a>.*?</div>\s*(?:<p class=\"notes
>  \">(.*?)</p>)?.*?<div class=\"meta\">(?:to ((?:<a class=\"tag\".*?> )
>  +))*.*?<span class=\"date\" title=\"(.*?)\">.*?</span>\s*</div>.*?</
>  li>""", re.DOTALL)

Some regular expressions take exponential time on certain inputs, so
a search may not finish in a reasonable length of time.  I'm not sure
whether that is your problem, but it might be!  Search for
"exponential time regular expression" if you want some examples.
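
As a rough illustration of what "exponential time" means here, the
sketch below times a deliberately pathological pattern, not the one
quoted above. The pattern (a+)+$ and the helper name time_failed_match
are my own choices for the demonstration. The nested quantifiers let a
backtracking engine such as Python's re try a very large number of ways
to split the run of "a" characters before it gives up.

```python
import re
import time

# Nested quantifiers: a run of n "a" characters can be split between the
# inner "+" and the outer "+" in exponentially many ways.
PATTERN = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Return the seconds spent failing to match n 'a's followed by 'b'."""
    text = "a" * n + "b"  # the trailing "b" guarantees the match fails
    start = time.perf_counter()
    result = PATTERN.match(text)
    elapsed = time.perf_counter() - start
    assert result is None
    return elapsed


if __name__ == "__main__":
    # Keep n small: each extra character roughly doubles the work.
    for n in (8, 12, 16, 20):
        print(n, time_failed_match(n))
```

The timings themselves depend on the machine, but they should grow
steeply as n increases, which is the symptom to look for.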

E.g. http://bugs.python.org/issue1515829

I'd probably attack this problem using BeautifulSoup rather than
regexps!
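
To show what a parser-based approach might look like, here is a minimal
sketch. It uses the standard library's html.parser instead of
BeautifulSoup so that it runs without extra installs. The same idea
carries over to BeautifulSoup. The markup shape is only inferred from
the quoted regexp, and PostParser, parse_posts and SAMPLE are
hypothetical names and data made up for the example.

```python
from html.parser import HTMLParser


class PostParser(HTMLParser):
    """Collect href, title, notes and date from <li class="post"> items."""

    def __init__(self):
        super().__init__()
        self.posts = []
        self._post = None   # dict for the post currently being read
        self._field = None  # text field currently being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        cls = attrs.get("class", "")
        if tag == "li" and cls == "post":
            self._post = {"href": None, "title": "", "notes": "", "date": None}
        elif self._post is None:
            return  # ignore everything outside a post item
        elif tag == "a" and attrs.get("rel") == "nofollow":
            self._post["href"] = attrs.get("href")
            self._field = "title"
        elif tag == "p" and cls == "notes":
            self._field = "notes"
        elif tag == "span" and cls == "date":
            self._post["date"] = attrs.get("title")

    def handle_data(self, data):
        if self._post is not None and self._field:
            self._post[self._field] += data

    def handle_endtag(self, tag):
        if tag in ("a", "p"):
            self._field = None
        elif tag == "li" and self._post is not None:
            self.posts.append(self._post)
            self._post = None


def parse_posts(html):
    """Return a list of dicts, one per post found in the HTML text."""
    parser = PostParser()
    parser.feed(html)
    parser.close()
    return parser.posts


# Made-up sample input, shaped like the markup the regexp targets.
SAMPLE = """
<li class="post">
  <h4 class="desc"><a href="http://example.com/" rel="nofollow">Example</a></h4>
  <p class="notes">A note</p>
  <div class="meta">to <a class="tag" href="/tag/x">x</a>
    <span class="date" title="2008-01-01">Jan 08</span></div>
</li>
"""

if __name__ == "__main__":
    for post in parse_posts(SAMPLE):
        print(post)
```

A parser like this reads the input once from start to finish, so it
avoids the backtracking problem entirely, and an optional element such
as the notes paragraph simply leaves its field empty.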

-- 
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
