scrapy: how to catch the unexpected case of return a response with Partial html body and status 200

bing Wed, 09 Jul 2014 04:26:36 -0700


While crawling, some pages come back with status 200 but only a partial HTML 
body. When I compare the response body with what my browser shows for the 
same page, the Scrapy response is missing content. How can I detect this 
unexpected partial-body case in a spider or in a downloader middleware?


Below is an example log line:

2014-01-23 16:31:53+0100 [filmweb_multi] DEBUG: Crawled (408) 
http://www.filmweb.pl/film/Labirynt-2013-507169/photos> (referer: 
http://www.filmweb.pl/film/Labirynt-2013-507169) ['*partial*']
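One possible approach, sketched here rather than confirmed: the `partial` entry at the end of the log line comes from `response.flags`, so a downloader middleware could check for that flag and reschedule the request. The class name, the `partial_retries` meta key and the retry limit are illustrative assumptions, not Scrapy built-ins. The flag only signals that the body may be incomplete, so this will not catch every truncated page.

```python
import logging

logger = logging.getLogger(__name__)


class PartialResponseMiddleware:
    """Sketch of a downloader middleware that retries 'partial' responses."""

    # Hypothetical limit chosen for illustration; not a Scrapy setting.
    MAX_PARTIAL_RETRIES = 2

    def process_response(self, request, response, spider):
        # Responses whose body may have been cut short carry "partial"
        # in response.flags (the same flag printed in the crawl log).
        if "partial" not in response.flags:
            return response

        # "partial_retries" is an assumed meta key used only by this sketch.
        retries = request.meta.get("partial_retries", 0)
        if retries >= self.MAX_PARTIAL_RETRIES:
            logger.warning("Giving up on partial response for %s", request)
            return response

        # Returning a Request from process_response reschedules the download;
        # dont_filter=True keeps the duplicate filter from dropping it.
        retry_request = request.replace(dont_filter=True)
        retry_request.meta["partial_retries"] = retries + 1
        return retry_request
```

To try it, the class would be listed in the `DOWNLOADER_MIDDLEWARES` setting under whatever module path the project uses. The same `"partial" in response.flags` test can also be done directly in a spider callback if a middleware is more than is needed.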

