Hi,
I want to extract data from each final page, such as the title and description.
I wrote a spider for the craigslist website. Every listing page has 100 links,
and I am getting the next-page URL along with all the links, but I also want to
go to every link and scrape the data from the final page.
Can anyone help me sort out my problem?
My spider code is:


from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.selector import HtmlXPathSelector
from scrapy.http import Request
from pagination.items import PaginationItem


class InfojobsSpider(CrawlSpider):
    name = "check"
    allowed_domains = ["craigslist.org"]
    start_urls = [
        "http://sfbay.craigslist.org/npo/",
    ]

    # Follow the "next" pagination links and parse every listing page.
    rules = (
        Rule(SgmlLinkExtractor(allow=(r'index.*?html',),
                               restrict_xpaths=('//a[@class="button next"]',)),
             callback='parse_item', follow=True),
    )

    fname = 1

    def parse_start_url(self, response):
        # The first page does not go through the rules, so parse it here as well.
        return self.parse_item(response)

    def parse_item(self, response):
        hxs = HtmlXPathSelector(response)
        titles = hxs.select('//span[@class="pl"]')
        for title in titles:
            # This only gives me the link URL; I still need to follow it.
            title = title.select('a/@href').extract()

NOTE:
title now contains the link URL. I want to visit every URL one by one and
extract the data from the final page.
Please help me.
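
For reference, this is roughly what I think parse_item needs to do, based on the
Scrapy docs: yield a Request for each extracted URL with a callback that parses
the detail page. The parse_final_page method name, the item fields ('title',
'description') and the detail-page XPaths below are only my guesses, not tested
code, and it would also need "from urlparse import urljoin" added to the imports:

    def parse_item(self, response):
        hxs = HtmlXPathSelector(response)
        for title in hxs.select('//span[@class="pl"]'):
            url = title.select('a/@href').extract()[0]
            # The listing links may be relative, so join them with the page URL
            # (needs: from urlparse import urljoin).
            yield Request(urljoin(response.url, url),
                          callback=self.parse_final_page)

    def parse_final_page(self, response):
        # Scrape the final (detail) page and fill the item.
        # These XPaths and item fields are only guesses for the craigslist markup.
        hxs = HtmlXPathSelector(response)
        item = PaginationItem()
        item['title'] = hxs.select('//h2[@class="postingtitle"]/text()').extract()
        item['description'] = hxs.select('//section[@id="postingbody"]/text()').extract()
        yield item

Is that the right way to chain the requests from a CrawlSpider callback, or
should I be doing it differently?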





