I found it: the spider is missing the import

from scrapy.selector import Selector

(that line is commented out near the top of google.py, so the name Selector is undefined inside parse).
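For reference, a quick sketch of the fix together with the URL clean-up step the spider runs right after it. The href passed in at the bottom is made up for illustration; it just has the `q=...&sa` shape the spider's regex expects:

```python
import re

# Fix for the NameError: uncomment this line in google.py
#     from scrapy.selector import Selector
#
# Alternatively, in Scrapy 1.1 the response object has .xpath() built in,
# so parse() can skip Selector entirely:
#     google_search_links_list = response.xpath('//h3/a/@href').extract()

def clean_google_href(href):
    """Strip Google's redirect wrapper from an href, the same way the
    spider's list comprehension does with re.search('q=(.*)&sa', n)."""
    m = re.search('q=(.*)&sa', href)
    return m.group(1) if m else href

# hypothetical example of a Google search-result href
print(clean_google_href('/url?q=http://example.com/robot&sa=U'))
# -> http://example.com/robot
```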
On Thursday, June 2, 2016 at 6:01:42 PM UTC+8, [email protected] wrote:
>
> The error message mentions "global name 'Selector' is not defined", so maybe
> there is a problem with the line of code after "def parse(self, response)".
>
> On Thursday, 2 June 2016 10:15:32 UTC+1, meInvent bbird wrote:
>>
>>     runCallbacks
>>     current.result = callback(current.result, *args, **kw)
>>   File "C:\Users\martlee2\Downloads\google.py", line 24, in parse
>>     sel = Selector(response)
>> NameError: global name 'Selector' is not defined
>> 2016-06-02 14:46:23 [scrapy] INFO: Closing spider (finished)
>> 2016-06-02 14:46:23 [scrapy] INFO: Dumping Scrapy stats:
>> {'downloader/request_bytes': 491,
>>  'downloader/request_count': 2,
>>  'downloader/request_method_count/GET': 2,
>>  'downloader/response_bytes': 13623,
>>  'downloader/response_count': 2,
>>  'downloader/response_status_count/200': 1,
>>  'downloader/response_status_count/302': 1,
>>  'finish_reason': 'finished',
>>  'finish_time': datetime.datetime(2016, 6, 2, 6, 46, 23, 25000),
>>  'log_count/DEBUG': 3,
>>  'log_count/ERROR': 1,
>>  'log_count/INFO': 7,
>>  'response_received_count': 1,
>>  'scheduler/dequeued': 2,
>>  'scheduler/dequeued/memory': 2,
>>  'scheduler/enqueued': 2,
>>  'scheduler/enqueued/memory': 2,
>>  'spider_exceptions/NameError': 1,
>>  'start_time': datetime.datetime(2016, 6, 2, 6, 46, 22, 3000)}
>> 2016-06-02 14:46:23 [scrapy] INFO: Spider closed (finished)
>>
>> C:\Users\martlee2\Downloads>scrapy runspider google.py > error.txt
>> 2016-06-02 14:49:25 [scrapy] INFO: Scrapy 1.1.0 started (bot: scrapybot)
>> 2016-06-02 14:49:25 [scrapy] INFO: Overridden settings: {}
>> 2016-06-02 14:49:25 [scrapy] INFO: Enabled extensions:
>> ['scrapy.extensions.logstats.LogStats',
>>  'scrapy.extensions.telnet.TelnetConsole',
>>  'scrapy.extensions.corestats.CoreStats']
>> 2016-06-02 14:49:25 [scrapy] INFO: Enabled downloader middlewares:
>> ['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
>>  'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
>>  'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
>>  'scrapy.downloadermiddlewares.retry.RetryMiddleware',
>>  'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
>>  'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
>>  'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
>>  'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
>>  'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
>>  'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
>>  'scrapy.downloadermiddlewares.stats.DownloaderStats']
>> 2016-06-02 14:49:25 [scrapy] INFO: Enabled spider middlewares:
>> ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
>>  'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
>>  'scrapy.spidermiddlewares.referer.RefererMiddleware',
>>  'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
>>  'scrapy.spidermiddlewares.depth.DepthMiddleware']
>> 2016-06-02 14:49:25 [scrapy] INFO: Enabled item pipelines:
>> []
>> 2016-06-02 14:49:25 [scrapy] INFO: Spider opened
>> 2016-06-02 14:49:25 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
>> 2016-06-02 14:49:25 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
>> 2016-06-02 14:49:26 [scrapy] DEBUG: Redirecting (302) to <GET https://www.google.com.hk/search?q=robot&gws_rd=cr&ei=oNZPV_voLoWX0gSohahw> from <GET https://www.google.com/search?q=robot>
>> 2016-06-02 14:49:26 [scrapy] DEBUG: Crawled (200) <GET https://www.google.com.hk/search?q=robot&gws_rd=cr&ei=oNZPV_voLoWX0gSohahw> (referer: None)
>> 2016-06-02 14:49:26 [scrapy] ERROR: Spider error processing <GET https://www.google.com.hk/search?q=robot&gws_rd=cr&ei=oNZPV_voLoWX0gSohahw> (referer: None)
>> Traceback (most recent call last):
>>   File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 588, in runCallbacks
>>     current.result = callback(current.result, *args, **kw)
>>   File "C:\Users\martlee2\Downloads\google.py", line 24, in parse
>>     sel = Selector(response)
>> NameError: global name 'Selector' is not defined
>> 2016-06-02 14:49:27 [scrapy] INFO: Closing spider (finished)
>> 2016-06-02 14:49:27 [scrapy] INFO: Dumping Scrapy stats:
>> {'downloader/request_bytes': 489,
>>  'downloader/request_count': 2,
>>  'downloader/request_method_count/GET': 2,
>>  'downloader/response_bytes': 13446,
>>  'downloader/response_count': 2,
>>  'downloader/response_status_count/200': 1,
>>  'downloader/response_status_count/302': 1,
>>  'finish_reason': 'finished',
>>  'finish_time': datetime.datetime(2016, 6, 2, 6, 49, 27, 11000),
>>  'log_count/DEBUG': 3,
>>  'log_count/ERROR': 1,
>>  'log_count/INFO': 7,
>>  'response_received_count': 1,
>>  'scheduler/dequeued': 2,
>>  'scheduler/dequeued/memory': 2,
>>  'scheduler/enqueued': 2,
>>  'scheduler/enqueued/memory': 2,
>>  'spider_exceptions/NameError': 1,
>>  'start_time': datetime.datetime(2016, 6, 2, 6, 49, 25, 877000)}
>> 2016-06-02 14:49:27 [scrapy] INFO: Spider closed (finished)
>>
>> C:\Users\martlee2\Downloads>scrapy runspider google.py >> error.txt
>> 2016-06-02 14:49:44 [scrapy] INFO: Scrapy 1.1.0 started (bot: scrapybot)
>> 2016-06-02 14:49:44 [scrapy] INFO: Overridden settings: {}
>> 2016-06-02 14:49:44 [scrapy] INFO: Enabled extensions:
>> ['scrapy.extensions.logstats.LogStats',
>>  'scrapy.extensions.telnet.TelnetConsole',
>>  'scrapy.extensions.corestats.CoreStats']
>> 2016-06-02 14:49:44 [scrapy] INFO: Enabled downloader middlewares:
>> ['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
>>  'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
>>  'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
>>  'scrapy.downloadermiddlewares.retry.RetryMiddleware',
>>  'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
>>  'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
>>  'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
>>  'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
>>  'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
>>  'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
>>  'scrapy.downloadermiddlewares.stats.DownloaderStats']
>> 2016-06-02 14:49:44 [scrapy] INFO: Enabled spider middlewares:
>> ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
>>  'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
>>  'scrapy.spidermiddlewares.referer.RefererMiddleware',
>>  'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
>>  'scrapy.spidermiddlewares.depth.DepthMiddleware']
>> 2016-06-02 14:49:44 [scrapy] INFO: Enabled item pipelines:
>> []
>> 2016-06-02 14:49:44 [scrapy] INFO: Spider opened
>> 2016-06-02 14:49:44 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
>> 2016-06-02 14:49:44 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
>> 2016-06-02 14:49:45 [scrapy] DEBUG: Redirecting (302) to <GET https://www.google.com.hk/search?q=robot&gws_rd=cr&ei=s9ZPV8OhMYjV0ATGkYngDw> from <GET https://www.google.com/search?q=robot>
>> 2016-06-02 14:49:45 [scrapy] DEBUG: Crawled (200) <GET https://www.google.com.hk/search?q=robot&gws_rd=cr&ei=s9ZPV8OhMYjV0ATGkYngDw> (referer: None)
>> 2016-06-02 14:49:45 [scrapy] ERROR: Spider error processing <GET https://www.google.com.hk/search?q=robot&gws_rd=cr&ei=s9ZPV8OhMYjV0ATGkYngDw> (referer: None)
>> Traceback (most recent call last):
>>   File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 588, in runCallbacks
>>     current.result = callback(current.result, *args, **kw)
>>   File "C:\Users\martlee2\Downloads\google.py", line 24, in parse
>>     sel = Selector(response)
>> NameError: global name 'Selector' is not defined
>> 2016-06-02 14:49:45 [scrapy] INFO: Closing spider (finished)
>> 2016-06-02 14:49:45 [scrapy] INFO: Dumping Scrapy stats:
>> {'downloader/request_bytes': 491,
>>  'downloader/request_count': 2,
>>  'downloader/request_method_count/GET': 2,
>>  'downloader/response_bytes': 13444,
>>  'downloader/response_count': 2,
>>  'downloader/response_status_count/200': 1,
>>  'downloader/response_status_count/302': 1,
>>  'finish_reason': 'finished',
>>  'finish_time': datetime.datetime(2016, 6, 2, 6, 49, 45, 784000),
>>  'log_count/DEBUG': 3,
>>  'log_count/ERROR': 1,
>>  'log_count/INFO': 7,
>>  'response_received_count': 1,
>>  'scheduler/dequeued': 2,
>>  'scheduler/dequeued/memory': 2,
>>  'scheduler/enqueued': 2,
>>  'scheduler/enqueued/memory': 2,
>>  'spider_exceptions/NameError': 1,
>>  'start_time': datetime.datetime(2016, 6, 2, 6, 49, 44, 925000)}
>> 2016-06-02 14:49:45 [scrapy] INFO: Spider closed (finished)
>>
>> # -*- coding: utf-8 -*-
>> import scrapy
>> import re
>> import os
>> import sys
>> import json
>>
>> #from scrapy.spider import Spider
>> #from scrapy.selector import Selector
>>
>> #scrapy startproject google
>> #scrapy genspider google google.com
>> #scrapy runspider google.py
>>
>> class GoogleSpider(scrapy.Spider):
>>     name = 'google'
>>     custom_settings = {
>>         'NUM_SEARCH_RESULTS' : '30',
>>         'SENTENCE_LIMIT' : '50',
>>         'MIN_WORD_IN_SENTENCE' : '7',
>>         'ENABLE_DATE_SORT' : '0',
>>     }
>>     #set the search result here
>>     name = 'Google search'
>>     allowed_domains = ['www.google.com']
>>     start_urls = ['https://www.google.com/search?q=robot']
>>
>>     def parse(self, response):
>>         sel = Selector(response)
>>         google_search_links_list = sel.xpath('//h3/a/@href').extract()
>>         google_search_links_list = [re.search('q=(.*)&sa',n).group(1) for n in google_search_links_list]
>>
>>         ## Dump the output to json file
>>         with open(r'C:\Users\hello\Downloads\googleresult.txt', "a") as outfile:
>>             json.dump({'output_url':google_search_links_list}, outfile, indent=4)
>
-- You received this message because you are subscribed to the Google Groups "scrapy-users" group.
