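Hey, Tim. You have to change your code and find the next-page selector for the site you want to crawl. The tutorial spider only follows the 'div.prev-post > a' link, which exists on the Scrapinghub blog but not on other sites, so after you changed start_urls it stopped after one page. You can use the Scrapy shell [1] to search for the right selector.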
I hope this helps. Good luck.

[1] https://doc.scrapy.org/en/latest/topics/shell.html

On Sun, Oct 9, 2016 at 8:19 AM, Tim Fitzhardinge <[email protected]> wrote:
> Hi
>
> I'm new to web crawling. I successfully ran the main tutorial as
> myspider.py. Now, how do I crawl every page of a website? I tried
> changing start_urls to the home page of a website, but it
> only crawled 1 page.
>
> For example, say I want to crawl every page of the http://www.asx.com.au website. I
> believe there will be 10,000+ pages. Thank you
>
> import scrapy
>
> class BlogSpider(scrapy.Spider):
>     name = 'blogspider'
>     start_urls = ['https://blog.scrapinghub.com']
>
>     def parse(self, response):
>         for title in response.css('h2.entry-title'):
>             yield {'title': title.css('a ::text').extract_first()}
>
>         next_page = response.css('div.prev-post > a ::attr(href)').extract_first()
>         if next_page:
>             yield scrapy.Request(response.urljoin(next_page), callback=self.parse)
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
