Most likely they are blocking your User-Agent (or possibly your IP). This is a basic anti-scraping measure, and a User-Agent block is usually easy to get around by changing the User-Agent that Scrapy sends (the USER_AGENT setting).
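A minimal sketch of that change, assuming a standard Scrapy project with a settings.py. The User-Agent string below is only an illustrative browser-like value, not one known to be accepted by kat.cr, and it will not help if the block is on your IP rather than on the User-Agent:

```python
# settings.py (fragment)
# Illustrative browser-like value (an assumption); substitute the string
# your own browser sends if you want to match it exactly.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/45.0.2454.93 Safari/537.36"
)

# For a one-off test in the shell, the same setting can be passed
# on the command line instead:
#   scrapy shell -s USER_AGENT="Mozilla/5.0 ..." "https://kat.cr/..."
```

If fetch() still times out with a browser-like User-Agent, the IP explanation becomes more likely, and trying from a different network or through a proxy is the next thing to check.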
On Fri, Sep 18, 2015 at 11:44 AM, Ricky Huang <[email protected]> wrote:
> Hello all,
>
> I am building a scraper for Kickass Torrents (kat.cr) for scrapping
> torrent information and etc. I tested it via the shell interface and
> Scrapy keeps erring out:
>
>> >>> fetch("https://kat.cr/south-park-s19e01-720p-hdtv-x264-killers-rartv-t11271450.html")
>> Traceback (most recent call last):
>>   File "<console>", line 1, in <module>
>>   File "/usr/local/lib/python2.7/site-packages/scrapy/shell.py", line 90, in fetch
>>     reactor, self._schedule, request, spider)
>>   File "/usr/local/lib/python2.7/site-packages/twisted/internet/threads.py", line 122, in blockingCallFromThread
>>     result.raiseException()
>>   File "<string>", line 2, in raiseException
>> TCPTimedOutError: TCP connection timed out: 60: Operation timed out.
>
> However, I am able to browse the site via a web browser, so it's
> definitely not the site's fault.
>
> Can anyone shed a light on this issue for me?
>
> Thanks in advance.
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
