Hi
I want to crawl through the URL links extracted from a webpage. I was trying to
create the spider template within the code itself rather than with the
genspider command at the prompt.
The code, with dummy URLs, is below.
What am I missing in my code?
Thank you for your help.
import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule
from bid.items import BidItem

class testSpider(CrawlSpider):
    name = 'test'
    allowed_domains = ['example.com.au']
    start_urls = ['http://www.example.com.au']

    # allow_domains must be a keyword argument taking a list of strings,
    # and follow/callback are arguments of Rule, not of LinkExtractor
    rules = [
        Rule(LinkExtractor(allow_domains=['example.com.au']),
             callback='parse_product', follow=True),
    ]

    def parse_product(self, response):
        title = response.xpath('//title/text()').extract()[0]
        print(title)
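
To run it from a script instead of the scrapy command line, I'm planning
something like the minimal sketch below using scrapy.crawler.CrawlerProcess
(the USER_AGENT setting is just a placeholder, not something my project
requires):

from scrapy.crawler import CrawlerProcess

# Runs the spider in-process instead of via "scrapy crawl test".
# The settings dict is a placeholder; inside a Scrapy project you could
# instead load settings.py with scrapy.utils.project.get_project_settings().
process = CrawlerProcess({'USER_AGENT': 'Mozilla/5.0'})
process.crawl(testSpider)
process.start()  # blocks until the crawl finishes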