Hi there,

Glad you're emailing the Scrapy list!

First question -- if you remove the `allow=` parameter, do things start to work? That is, if the LinkExtractor follows every link and all of those responses go to parse2, do things seem OK?

With this question, I hope to make sure that the page is at least being parsed OK.

I'm not sure I'll be able to answer/ask any follow-up questions, but I wanted to at least ask this one. Hopefully I will, but if I'm not around, please don't take it personally; it just means that the situation that allowed me the time to write this particular reply didn't recur.

Thanks and good luck,

Asheesh.

On Sat, May 16, 2015 at 10:09 PM, <[email protected]> wrote:
> Hello
>
> I have a very simple Scrapy CrawlSpider and I have given it a simple rule
> "Crawl/Follow any link that contains '/search/listings'". But the spider is
> not crawling/following any of these links?
>
> I have confirmed that the start url contains many links with the href
> '/search/listings' so the links are there.
>
> Any idea whats going wrong?
>
> class MySpider(CrawlSpider):
>
>     name = "MySpider"
>     allowed_domains = ["mywebsite.com"]
>     start_urls = ["http://www.mywebsite.com/results"]
>     rules = [Rule(LinkExtractor(allow=['/search/listings(.*)']), callback="parse2")]
>
>     def parse2(self, response):
>
>         # This function is never called
>         log.start("log.txt")
>         log.msg("Page crawled: " + response.url)
>
> The start url "http://www.mywebsite.com/results" contains these links
> that I want the rule to apply to:
>
> <a href='/search/listings?clue=healthcare&eventType=sort&p=2' class='button button-pagination' data-page='2' >2</a>
> <a href='/search/listings?clue=healthcare&eventType=sort&p=3' class='button button-pagination' data-page='3' >3</a>
> <a href='/search/listings?clue=healthcare&eventType=sort&p=4' class='button button-pagination' data-page='4' >4</a>
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
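
P.S. In case it helps narrow things down, here is a rough offline check of the `allow` pattern itself, with no crawling involved. As far as I know, LinkExtractor treats each `allow` entry as a regular expression and searches for it in the absolute URL of each link, so the sketch below approximates that with plain `re` and `urljoin`. The helper name `matching_links` is made up for this example, and the URLs are the ones from the original message; this is only an approximation of what Scrapy does, not its actual code path.

```python
import re
from urllib.parse import urljoin

# Pattern copied from the spider's rule in the original message.
ALLOW_PATTERN = re.compile(r"/search/listings(.*)")

# Start URL and hrefs copied from the original message.
START_URL = "http://www.mywebsite.com/results"
HREFS = [
    "/search/listings?clue=healthcare&eventType=sort&p=2",
    "/search/listings?clue=healthcare&eventType=sort&p=3",
    "/search/listings?clue=healthcare&eventType=sort&p=4",
]


def matching_links(base_url, hrefs, pattern):
    """Hypothetical helper: resolve each href against base_url and keep
    the absolute URLs in which the pattern is found (re.search)."""
    absolute_urls = [urljoin(base_url, href) for href in hrefs]
    return [url for url in absolute_urls if pattern.search(url)]


if __name__ == "__main__":
    for url in matching_links(START_URL, HREFS, ALLOW_PATTERN):
        print("pattern matches:", url)
```

If all three links match here, the pattern is probably not the culprit, and it may be worth checking (for example in `scrapy shell`) whether those links are actually present in the HTML that Scrapy downloads.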
