Can anyone help me figure out how to solve this?

On Wednesday, April 9, 2014 5:17:27 PM UTC+5:30, masroor javed wrote:
>
> Hi,
> I want to extract the final-page data, such as the title and description.
> I wrote a spider for the craigslist website.
> Every listing page has 100 links. I can get the next-page URL together with all the
> links, but I also want to visit each link and scrape the data from the final
> (detail) page.
> Can anyone help me sort out my problem?
> My spider code is:
>
> from scrapy.contrib.spiders import CrawlSpider, Rule
> from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
> from scrapy.selector import HtmlXPathSelector
> from scrapy.http import Request
> from pagination.items import PaginationItem
>
> class InfojobsSpider(CrawlSpider):
>     name = "check"
>     allowed_domains = ["craigslist.org"]
>     start_urls = [
>         "http://sfbay.craigslist.org/npo/"
>     ]
>     rules = (
>         Rule(SgmlLinkExtractor(allow=(r'index.*?html'),
>                                restrict_xpaths=('//a[@class="button next"]')),
>              callback='parse_item', follow=True),
>     )
>     fname = 1
>
>     def parse_start_url(self, response):
>         return self.parse_item(response)
>
>     def parse_item(self, response):
>         hxs = HtmlXPathSelector(response)
>         titles = hxs.select('//span[@class="pl"]')
>         for title in titles:
>             title = title.select('a/@href').extract()
>
> NOTE:
> The link URLs end up in title. Now I want to visit every URL one by one and
> extract the data from the final page.
> Please help me.
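Here is a minimal sketch of one way to do this, keeping the old Scrapy API shown in the
question (scrapy.contrib, HtmlXPathSelector): in parse_item, instead of only extracting
the href, yield a Request for each listing URL with a callback that parses the detail
page. The parse_detail method, the 'title' and 'description' fields on PaginationItem,
and the XPaths for the craigslist detail page are assumptions; adjust them to your own
Item class and the actual page markup.

    from urlparse import urljoin

    from scrapy.contrib.spiders import CrawlSpider, Rule
    from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
    from scrapy.selector import HtmlXPathSelector
    from scrapy.http import Request
    from pagination.items import PaginationItem

    class InfojobsSpider(CrawlSpider):
        name = "check"
        allowed_domains = ["craigslist.org"]
        start_urls = ["http://sfbay.craigslist.org/npo/"]
        rules = (
            Rule(SgmlLinkExtractor(allow=(r'index.*?html'),
                                   restrict_xpaths=('//a[@class="button next"]')),
                 callback='parse_item', follow=True),
        )

        def parse_start_url(self, response):
            return self.parse_item(response)

        def parse_item(self, response):
            hxs = HtmlXPathSelector(response)
            # Yield one Request per listing link instead of only extracting the href.
            for href in hxs.select('//span[@class="pl"]/a/@href').extract():
                url = urljoin(response.url, href)  # hrefs may be relative
                yield Request(url, callback=self.parse_detail)

        def parse_detail(self, response):
            # Runs once for each final (detail) page.
            hxs = HtmlXPathSelector(response)
            item = PaginationItem()
            # Field names and XPaths below are assumptions; change them to match
            # your PaginationItem definition and the detail-page HTML.
            item['title'] = hxs.select('//h2[@class="postingtitle"]/text()').extract()
            item['description'] = hxs.select('//section[@id="postingbody"]//text()').extract()
            yield item

The key point is that the CrawlSpider rule only handles pagination; following the 100
listing links is done by yielding Requests from parse_item, and the scraping of each
final page happens in the callback you pass to those Requests.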
