So, moving my scraper into the fresh project and running it directly with 
scrapy, rather than through django's manage.py, works.

I realize this is the scrapy user group, but would anyone have an idea why 
inspect_response doesn't work under django management commands? I have also 
noticed problems getting pdb.set_trace() to work in the same situation.
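
A cheap check worth running from inside the management command, assuming (this is not confirmed anywhere in this thread) that the debugger trouble comes from the standard streams being redirected or replaced. pdb reads its commands from stdin, so it is worth knowing whether stdin and stdout are still attached to a terminal. The helper name describe_streams is made up for illustration:

```python
import sys


def describe_streams():
    """Report whether the standard streams still look interactive."""
    def is_tty(stream):
        # Replacement stream objects may lack isatty(); treat that as "not a TTY".
        check = getattr(stream, "isatty", None)
        try:
            return bool(check()) if check is not None else False
        except (ValueError, OSError):
            # A closed stream raises ValueError on isatty().
            return False

    return {
        "stdin_is_tty": is_tty(sys.stdin),
        "stdout_is_tty": is_tty(sys.stdout),
        # False here means something swapped the stream after start-up.
        "stdin_is_original": sys.stdin is sys.__stdin__,
        "stdout_is_original": sys.stdout is sys.__stdout__,
    }


if __name__ == "__main__":
    for name, value in sorted(describe_streams().items()):
        print("%s: %s" % (name, value))
```

Calling describe_streams() once at the top of the command and once inside the parse callback would show whether anything changes between the two points.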

Here is how I'm setting up and calling the scraper from django (maybe 
something has changed in 0.22? This worked on an older version):
        settings_module = importlib.import_module(
            'scrapers_2014.scrapers_2014.settings')

        settings = CrawlerSettings(settings_module)
        settings.overrides['ITEM_PIPELINES'] = self.select_pipeline(options)
        crawler = Crawler(settings)
        crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
        crawler.configure()
        crawler.crawl(self._spider)
        crawler.start()
        log.start()
        reactor.run()
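
Since running the spider directly with scrapy works, one possible workaround is to have the management command start "scrapy crawl" as a child process and read the JSON dump afterwards. The following is a minimal sketch under that assumption, not a fix for the in-process setup above. The names build_crawl_command and run_crawl, and the spider name and paths in the usage note, are hypothetical:

```python
import subprocess


def build_crawl_command(spider_name, output_path):
    """Build the argument list for `scrapy crawl` with a JSON feed export."""
    return ["scrapy", "crawl", spider_name, "-o", output_path, "-t", "json"]


def run_crawl(spider_name, output_path, project_dir):
    """Run the crawl in a child process and return its exit code.

    project_dir is assumed to be the directory containing scrapy.cfg.
    """
    command = build_crawl_command(spider_name, output_path)
    # The child inherits the parent's stdin/stdout by default, so a shell
    # or debugger started in the spider talks to the same terminal.
    return subprocess.call(command, cwd=project_dir)
```

From the command's handle() method this could be called as run_crawl('my_spider', 'items.json', '/path/to/scrapy/project'), after which the command loads items.json into the django db.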


On Saturday, 22 February 2014 08:20:46 UTC+10, John wrote:
>
> Hmm, it does not fail in a fresh project... That is interesting. I am 
> calling it from within my parse callback. The only difference I see is that 
> I am using 'BaseSpider' as the parent class and not 'Spider'. I've tried 
> changing this but that hasn't made a difference.
>
> Scrapy version info:
> Scrapy  : 0.22.2
> lxml    : 3.3.1.0
> libxml2 : 2.7.8
> Twisted : 13.2.0
> Python  : 2.7.3 (default, Apr 10 2013, 06:20:15) - [GCC 4.6.3]
> Platform: Linux-3.2.0-57-generic-x86_64-with-Ubuntu-12.04-precise
>
> The line causing the import error is "from scrapy.project import crawler".
>
> The other major difference between my fresh project and the project I'm 
> working on is that my spider is called from inside a django command...  I 
> think that is an avenue that needs further investigation. Initially I had 
> wanted my scraper to dump straight to the django db but now I'm using an 
> intermediary JSON dump so that may no longer be necessary...
>
> On Saturday, 22 February 2014 00:24:06 UTC+10, Rolando Espinoza La fuente 
> wrote:
>>
>> Does it fail in a fresh project? How/Where are you calling the function? 
>> What's the output of "scrapy version -v"?
>>
>> Sometimes the crawler import error is due to a "from scrapy.project 
>> import crawler".
>>
>> Alternatively, I like to use
>>
>>     from IPython import embed; embed()
>>
>> instead of inspect_response because it gives me access to the current 
>> variables. Although inspect_response gives you the handy shell shortcuts 
>> like view(response).
>>
>> Rolando
>>
>>
>> On Fri, Feb 21, 2014 at 7:20 AM, John <[email protected]> wrote:
>>
>>> Hi Everyone,
>>>
>>> I'm trying to debug my scraper and have discovered the 
>>> inspect_response() function which looks quite useful.
>>>
>>> However, when importing it I get the following exception:
>>>
>>>     exceptions.ImportError: cannot import name crawler
>>>
>>>  I have also attempted using inspect_response from the python shell and 
>>> get the same error.
>>>
>>> I'm using scrapy 0.22.2. Has anyone else encountered this error? What 
>>> more information can I provide to investigate this?
>>>
>>> Cheers,
>>> John
>>>
>>>  -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "scrapy-users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/scrapy-users.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>
>>
