I have the following code to crawl some data, but when I run the spider it never enters the parse function.
The code is as below:
      from scrapy.item import Item, Field
      from scrapy.selector import Selector
      from scrapy.spider import BaseSpider
      from scrapy.selector import HtmlXPathSelector


      class MyItem(Item):
          reviewer_ranking = Field()
          print "asdadsa"


      class MySpider(BaseSpider):
          name = 'myspider'
          domain_name = ["amazon.com/gp/pdp/profile"]
          start_urls = ["http://www.amazon.com/gp/pdp/profile/A28XDLTGHPIWE1"]
          print "*****"

          def parse(self, response):
              print "fggfggftgtr"
              sel = Selector(response)
              hxs = HtmlXPathSelector(response)
              item = MyItem()
              item["reviewer_ranking"] = hxs.select(
                  '//span[@class="a-size-small a-color-secondary"]/text()'
              ).extract()
              return item

The output screen looks like this.

asdadsa
*****
/home/raj/Documents/IIM A/Daily sales rank/Daily reviews/Reviews_scripts/Scripts_review/Reviews/Reviewer/crawler_reviewers_data.py:18: ScrapyDeprecationWarning: crawler_reviewers_data.MySpider inherits from deprecated class scrapy.spider.BaseSpider, please inherit from scrapy.spider.Spider. (warning only on first subclass, there may be others)
  class MySpider(BaseSpider):
2014-06-24 19:41:38+0530 [scrapy] INFO: Scrapy 0.22.2 started (bot: scrapybot)
2014-06-24 19:41:38+0530 [scrapy] INFO: Optional features available: ssl, http11
2014-06-24 19:41:38+0530 [scrapy] INFO: Overridden settings: {}
2014-06-24 19:41:38+0530 [scrapy] INFO: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2014-06-24 19:41:38+0530 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, HttpProxyMiddleware, ChunkedTransferMiddleware, DownloaderStats
2014-06-24 19:41:38+0530 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2014-06-24 19:41:38+0530 [scrapy] INFO: Enabled item pipelines:
2014-06-24 19:41:38+0530 [myspider] INFO: Spider opened
2014-06-24 19:41:38+0530 [myspider] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2014-06-24 19:41:38+0530 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6027
2014-06-24 19:41:38+0530 [scrapy] DEBUG: Web service listening on 0.0.0.0:6084
2014-06-24 19:41:38+0530 [myspider] DEBUG: Crawled (403) <GET http://www.amazon.com/gp/pdp/profile/A28XDLTGHPIWE1> (referer: None) ['partial']
2014-06-24 19:41:38+0530 [myspider] INFO: Closing spider (finished)
2014-06-24 19:41:38+0530 [myspider] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 242,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 28486,
 'downloader/response_count': 1,
 'downloader/response_status_count/403': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2014, 6, 24, 14, 11, 38, 696574),
 'log_count/DEBUG': 3,
 'log_count/INFO': 7,
 'response_received_count': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2014, 6, 24, 14, 11, 38, 513615)}
2014-06-24 19:41:38+0530 [myspider] INFO: Spider closed (finished)


Please help me out, I am stuck.
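
(A minimal sketch of one possible fix, based on the 403 in the log above: HttpErrorMiddleware filters out non-200 responses, so parse() is never called, and Amazon appears to reject Scrapy's default user agent. The user-agent string, the switch to scrapy.spider.Spider, and the handle_httpstatus_list usage below are my own assumptions for illustration, not part of the original code.)

      from scrapy.http import Request
      from scrapy.item import Item, Field
      from scrapy.selector import Selector
      from scrapy.spider import Spider  # name suggested by the deprecation warning


      class MyItem(Item):
          reviewer_ranking = Field()


      class MySpider(Spider):
          name = 'myspider'
          allowed_domains = ["amazon.com"]
          start_urls = ["http://www.amazon.com/gp/pdp/profile/A28XDLTGHPIWE1"]
          # Without this, HttpErrorMiddleware drops the 403 response before
          # parse() is ever reached, which matches the log above.
          handle_httpstatus_list = [403]

          def start_requests(self):
              # Browser-like User-Agent; the exact string is only an example
              # (assumption), since Amazon seems to block the default one.
              ua = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                    "(KHTML, like Gecko) Chrome/35.0 Safari/537.36")
              for url in self.start_urls:
                  yield Request(url, headers={"User-Agent": ua})

          def parse(self, response):
              item = MyItem()
              item["reviewer_ranking"] = Selector(response).xpath(
                  '//span[@class="a-size-small a-color-secondary"]/text()'
              ).extract()
              return item

Setting USER_AGENT in settings.py, or on the command line with scrapy crawl myspider -s USER_AGENT="...", should have the same effect as the start_requests override.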
