Hi Sean Keane, I believe you need to tell us a bit more about the type of crawl you are doing. Is it a broad crawl with lots of domains? Is it a CrawlSpider with rules that can pick up a lot of pages?
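
To check whether the crawl is simply unbounded, one option is to cap it temporarily and let Scrapy close the spider by itself. Below is a minimal sketch of a settings.py fragment. The setting names are standard Scrapy settings, but the numeric values are arbitrary placeholders chosen for illustration, not recommendations.

```python
# Temporary debugging limits for settings.py.
# All values below are illustrative assumptions; tune them for your crawl.

# Do not follow links more than 3 hops away from the start URLs.
DEPTH_LIMIT = 3

# Close the spider once roughly this many responses have been crawled.
CLOSESPIDER_PAGECOUNT = 1000

# Close the spider after this many seconds (here: one hour).
CLOSESPIDER_TIMEOUT = 3600

# Log every filtered duplicate request, not only the first one.
DUPEFILTER_DEBUG = True
```

If the spider finishes cleanly with limits like these, the rules are probably matching far more pages than expected. The finish_reason entry in the stats dumped at the end of the run should show which limit closed the spider.
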
What about the download rate: is it stable, or does the crawl slow down over time?
While the crawl is running, if you're on Python 2, you could also use the telnet console to check what's going on:
https://doc.scrapy.org/en/latest/topics/telnetconsole.html

Hope this helps,
/Paul

On Thursday, December 8, 2016 at 4:30:22 AM UTC+1, Sean Keane wrote:
>
> I have a spider that I created that uses Splash, and it seems to never
> complete, i.e. it runs for two days and then I finally stop it.
>
> I have the following settings for my spider:
>
> SPIDER_MIDDLEWARES = {
>     'scrapy_splash.SplashDeduplicateArgsMiddleware': 100
> }
>
> DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
>
> Can someone provide some advice on how I should debug the issue?
>
> Thanks
>
> Sean
