I resolved it by passing an *errback*= callback that handles *HTTP errors* to the Request() constructor. Apparently, in a Spider callback other than *parse*, the response isn't defined when the HTTP status is *404*.
On Wednesday, April 16, 2014 at 22:32:01 UTC+1, Paul Tremberth wrote:
>
> Hi Hakim,
>
> I'm not sure how you get this "instance" with attributes related to
> errors. Are you catching these through an errback?
>
> You can get non-200 responses via the HttpError middleware (enabled by
> default) and by defining a handle_httpstatus_list attribute on your spider.
>
> Example:
>
>     from scrapy.spider import Spider
>
>     class ErrorSpider(Spider):
>         name = "testerror"
>         allowed_domains = ["dmoz.org"]
>         start_urls = [
>             "http://www.dmoz.org/",
>             "http://www.dmoz.org/rererere/",
>         ]
>         handle_httpstatus_list = [404]
>
>         def parse(self, response):
>             self.log("type: %s; status %d" % (type(response), response.status))
>
>
> On Tuesday, April 15, 2014 4:51:23 PM UTC+2, Hakim Benoudjit wrote:
>>
>> Hi guys,
>>
>> I have a little issue with the response object inside a request callback
>> when the page returns a 404:
>> - If the page exists (HTTP code: *200*), response is of type
>>   *HtmlResponse*.
>> - If the page returns 404, response is of type *instance*, which
>>   contains some attributes related to error messages, and in this latter
>>   case *status* isn't an attribute of the *response* object.
>>
>> So I can know whether the response *status* is *404* only by checking the
>> class of the *response* object (*HtmlResponse* or *instance*).
>>
>> How do we know that a page returns *404* if *response.status* isn't
>> available as an attribute of the *response* object?
>>
