"but this error doesn't give me any proper server response", because there is no response due to an exception. "it generally happens continuously until it changes its proxy", this is normal because there are several requests on the downloader. "and only could be caught in def process_exception() in the middleware", you can also use errback on your request and catch it on your spider.

On Thursday, November 27, 2014 23:54:25 UTC-2, Sungmin Lee wrote:
>
> It mostly gives me sane responses such as 200 (mostly), 302, 404, 502, etc.
> but this error doesn't give me any proper server response, and only could
> be caught in def process_exception() in the middleware.
> and when this error occurs, it generally happens continuously until it
> changes its proxy.
>
> On Thursday, November 27, 2014 12:30:38 AM UTC+9, Nicolás Alejandro
> Ramírez Quiros wrote:
>>
>> Are your spiders getting stalled as well?
>>
>> On Tuesday, November 25, 2014 04:56:43 UTC-2, Sungmin Lee wrote:
>>>
>>> Hi all,
>>>
>>> I'm using proxy to crawl a site, and it randomly gives me a bunch of
>>> error messages like:
>>>
>>> 2014-10-20 05:26:10-0800 [foo.bar] DEBUG: Retrying <GET
>>> http://foo.bar/foobar> (failed 1 times):
>>> [<twisted.python.failure.Failure <class
>>> 'twisted.internet.error.ConnectionDone'>>, <twisted.python.failure.Failure
>>> <class 'twisted.web.http._DataLoss'>>]
>>>
>>> I think this mostly happens with a bad proxy but it sometimes occurs
>>> with a healthy proxy as well.
>>> The thing is that this not only skips the url entry to crawl.
>>>
>>> I implemented a middleware (especially for proxy and retry middlewares),
>>> but it's really hard to catch this exception on scrapy level.
>>>
>>> Has anyone had the same issue?
>>>
>>> Thanks!
>>>
>>
