This is my seed URL:
http://www.amazon.com/Kindle-eBooks/b?ie=UTF8&node=154606011
I'm trying to fetch the pricing details by following the links from the seed
URL.
This is my code:
import re

from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.selector import Selector

# AmazonScraper is the Item class defined in my project's items module.


class MySpider(CrawlSpider):
    name = "scraper"
    allowed_domains = ["amazon.com"]
    start_urls = ["http://www.amazon.com/Kindle-eBooks/b?ie=UTF8&node=154606011"]
    # Follow links whose URL contains /gp/product and pass each page to parse_items.
    rules = [
        Rule(SgmlLinkExtractor(allow=(r'/gp/product',)), callback='parse_items'),
    ]

    def parse_items(self, response):
        sel = Selector(response)
        items = []
        item = AmazonScraper()
        print 'inside'
        print sel.css('#btAsinTitle::text').extract()
        # Join the extracted text fragments into a single string per field.
        item["title"] = ''.join(sel.css('#btAsinTitle::text').extract())
        print '-----', item["title"]
        item["digitalprice"] = ''.join(sel.css('.digitalListPrice>.listprice::text').extract())
        # Collapse runs of whitespace in each price into a single space.
        item["digitalprice"] = re.sub(r'\s+', ' ', item["digitalprice"])
        item["listprice"] = ''.join(sel.css('.listPrice::text').extract())
        item["listprice"] = re.sub(r'\s+', ' ', item["listprice"])
        item["kindleprice"] = ''.join(sel.css('.priceLarge::text').extract())
        item["kindleprice"] = re.sub(r'\s+', ' ', item["kindleprice"])
        items.append(item)
        print items
        return items
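
The two parts of the spider that do not depend on Scrapy, the link pattern and the whitespace cleanup, can be checked on their own. This is only a rough sketch: the helper names is_product_link and clean_price, the sample URLs, and the sample price text are made up for illustration, and the pattern is the rule's regex written without the backslashes before "gp" and "product". The .strip() call is also an addition that the spider does not have.

```python
import re

# Assumed pattern: the rule's regex without the stray backslashes.
PRODUCT_LINK = re.compile(r'/gp/product')


def is_product_link(url):
    """Return True if the URL contains the /gp/product path segment."""
    return PRODUCT_LINK.search(url) is not None


def clean_price(fragments):
    """Join extracted text fragments and collapse whitespace runs.

    The trailing .strip() is an addition that is not in the spider.
    """
    return re.sub(r'\s+', ' ', ''.join(fragments)).strip()


# Made-up sample data, not taken from the real pages.
sample_urls = [
    "http://www.amazon.com/gp/product/B00EXAMPLE/ref=sr_1_1",
    "http://www.amazon.com/Kindle-eBooks/b?ie=UTF8&node=154606011",
]
for url in sample_urls:
    print("%s -> %s" % (url, is_product_link(url)))

print(repr(clean_price(["\n  $9.99", "  \n"])))
```

If both checks behave as expected, the remaining suspects would be the CSS selectors and whether the links on the seed page actually contain /gp/product.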
My code is not returning the expected results. Can anyone suggest a solution?