The answer that keeps on giving - thank you 3 years later!
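
For anyone else who lands here via search: the trailing comma in Steven's answer below is the whole fix. Parentheses alone don't create a tuple in Python; a one-element tuple needs the comma. CrawlSpider iterates over `rules`, so a bare `Rule(...)` wrapped in parentheses produces exactly the `TypeError: 'Rule' object is not iterable` in the quoted traceback. A quick plain-Python illustration (no Scrapy needed; `rule` here is just a stand-in object):

----------------------------------------------------------
rule = object()            # stand-in for a Rule instance

not_a_tuple = (rule)       # parentheses alone: this is just `rule` itself
one_tuple = (rule,)        # the trailing comma makes a one-element tuple

print(type(not_a_tuple))   # the object's own type -- not iterable
print(type(one_tuple))     # <type 'tuple'> -- something CrawlSpider can iterate
----------------------------------------------------------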

On Saturday, June 30, 2012 at 11:23:56 PM UTC-4, Steven Almeroth wrote:
>
> Try this:
>
> rules = (Rule(SgmlLinkExtractor(allow=('//*[@id="Form"]'))),)
>
>
> Notice the extra comma near the end.
>
> On Friday, June 29, 2012 5:47:35 PM UTC-5, Scrapy_lover wrote:
>>
>> When trying to crawl a website, I got the following error. Any help, 
>> please?
>>
>> *Script code:*
>>
>> ----------------------------------------------------------------------------------------------------------
>>
>>> from scrapy.contrib.spiders import CrawlSpider, Rule
>>> from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
>>> from scrapy.selector import HtmlXPathSelector
>>> from scrapy.item import Item
>>>
>>> class MySpider(CrawlSpider):
>>>     name = 'example.com'
>>>     allowed_domains = ['http://testaspnet.vulnweb.com/default.aspx']
>>>     start_urls = ['http://testaspnet.vulnweb.com/default.aspx']
>>>
>>>     rules = (
>>>         Rule(SgmlLinkExtractor(allow=('//*[@id="Form"]'))))
>>>
>>>     def parse_item(self, response):
>>>         self.log('%s' % response.url)
>>>         hxs = HtmlXPathSelector(response)
>>>         item = Item()
>>>         item['text'] = hxs.select("//input[(@id or @name) and (@type = 'text' or @type = 'password' or @type = 'file')]").extract()
>>>         return item
>> --------------------------------------------------------------------------------------------------------------------------------------
>> *But it gave me the following error:*
>>
>>> home@home-pc:~/isa$ scrapy crawl example.com
>>> 2012-06-30 00:32:11+0200 [scrapy] INFO: Scrapy 0.14.4 started (bot: isa)
>>> 2012-06-30 00:32:11+0200 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, MemoryUsage, SpiderState
>>> 2012-06-30 00:32:11+0200 [scrapy] DEBUG: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, RedirectMiddleware, CookiesMiddleware, HttpCompressionMiddleware, ChunkedTransferMiddleware, DownloaderStats
>>> 2012-06-30 00:32:11+0200 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
>>> 2012-06-30 00:32:11+0200 [scrapy] DEBUG: Enabled item pipelines: 
>>> Traceback (most recent call last):
>>>   File "/usr/local/bin/scrapy", line 4, in <module>
>>>     execute()
>>>   File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 132, in execute
>>>     _run_print_help(parser, _run_command, cmd, args, opts)
>>>   File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 97, in _run_print_help
>>>     func(*a, **kw)
>>>   File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 139, in _run_command
>>>     cmd.run(args, opts)
>>>   File "/usr/local/lib/python2.7/dist-packages/scrapy/commands/crawl.py", line 43, in run
>>>     spider = self.crawler.spiders.create(spname, **opts.spargs)
>>>   File "/usr/local/lib/python2.7/dist-packages/scrapy/spidermanager.py", line 44, in create
>>>     return spcls(**spider_kwargs)
>>>   File "/usr/local/lib/python2.7/dist-packages/scrapy/contrib/spiders/crawl.py", line 37, in __init__
>>>     self._compile_rules()
>>>   File "/usr/local/lib/python2.7/dist-packages/scrapy/contrib/spiders/crawl.py", line 83, in _compile_rules
>>>     self._rules = [copy.copy(r) for r in self.rules]
>>> TypeError: 'Rule' object is not iterable

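For future readers, here is a minimal sketch of the whole spider with the comma fix applied, plus two other things that would have bitten next: `allowed_domains` wants bare domain names rather than full URLs, and a plain `Item()` has no declared fields, so `item['text']` would raise a KeyError. The `FormItem` name is my own invention, and I've swapped `allow` (which expects regular expressions) for `restrict_xpaths`, the SgmlLinkExtractor argument meant for XPath restrictions. This is untested against the old 0.14 contrib API shown above, so treat it as a sketch rather than gospel:

----------------------------------------------------------
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.selector import HtmlXPathSelector
from scrapy.item import Item, Field


class FormItem(Item):
    # hypothetical item class (name invented for this sketch);
    # a bare Item() has no fields, so declare the one we assign to
    text = Field()


class MySpider(CrawlSpider):
    name = 'example.com'
    # bare domain here, not the full start URL
    allowed_domains = ['testaspnet.vulnweb.com']
    start_urls = ['http://testaspnet.vulnweb.com/default.aspx']

    rules = (
        # trailing comma: rules must be an iterable of Rule objects;
        # restrict_xpaths (not allow, which takes regexes) limits link
        # extraction to the Form element
        Rule(SgmlLinkExtractor(restrict_xpaths=('//*[@id="Form"]',)),
             callback='parse_item', follow=True),
    )

    def parse_item(self, response):
        self.log('%s' % response.url)
        hxs = HtmlXPathSelector(response)
        item = FormItem()
        item['text'] = hxs.select(
            "//input[(@id or @name) and "
            "(@type = 'text' or @type = 'password' or @type = 'file')]"
        ).extract()
        return item
----------------------------------------------------------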