Using XMLFeedSpider to parse XML and request a page for each node

Michael Puglisi Thu, 11 Aug 2016 14:35:06 -0700

Can someone provide me with a basic example using XMLFeedSpider to parse 
XML and request a page for each node?


The following is simple boilerplate based on the official XMLFeedSpider
example:

from scrapy.spiders import XMLFeedSpider
from myproject.items import TestItem

class MySpider(XMLFeedSpider):
    name = 'feedforall'
    allowed_domains = ['feedforall.com']
    start_urls = ['http://www.feedforall.com/sample.xml']
    iterator = 'iternodes'
    itertag = 'item'

    def parse_node(self, response, node):
        item = TestItem()
        # I would like to make a Request with this value and extract data
        # from the HTML page. Example value:
        # http://www.feedforall.com/industry-solutions.htm
        item['link'] = node.xpath('link').extract()
        return item


Related:

http://doc.scrapy.org/en/latest/topics/spiders.html#xmlfeedspider

I've searched for a working example, but everything I found seems to be old 
and not functional.

Thanks.

