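You can use this test instead:

    if item.get('modelnumber'):

just as you would for a regular dict. Indexing with item['modelnumber']
raises KeyError when the field was never populated, whereas get() returns
None in that case.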
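
As a rough sketch of how that check could sit in your pipeline (I am using
the field name model_number as it appears in your traceback, and the
fallback DropItem class is only a stand-in so the snippet runs where Scrapy
is not installed):

```python
try:
    from scrapy.exceptions import DropItem
except ImportError:
    # Assumption: stand-in exception so this sketch runs without Scrapy.
    class DropItem(Exception):
        pass


class ModelNumberPipeline(object):
    def process_item(self, item, spider):
        # get() returns None for a field that was never set,
        # where item['model_number'] would raise KeyError.
        if item.get('model_number'):
            return item
        raise DropItem("Missing model number in {}".format(item))
```

If you would rather keep the record than drop it, the setdefault() call
from your second pipeline is the other option.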
On 21 March 2014 at 01:11, "BrendanB" <[email protected]> wrote:

> Hi,
>
> Just have a quick question about item pipelines and nulls / missing values.
>
> I have a field in my spider to extract a model number. Now in some cases
> this is null.
>
> I had a pipeline set up like the following; however, I always got an error.
> I essentially wanted to either add a default value or skip the
> record.
>
> This example pipeline below would always fail!
>
> class ModelNumberPipeline(object):
>     def process_item(self, item, spider):
>         if item['modelnumber']:
>             return item
>         else:
>             raise DropItem("Missing Model Number in %s" % item)
>
>
> The error was the following:
>
> 'price': u'$29.95',
>          'product_id': u'3231002',
>          'site_id': u'1',
>          'site_type': u'1'}
>         Traceback (most recent call last):
>           File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 62, in _process_chain
>             return process_chain(self.methods[methodname], obj, *args)
>           File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/defer.py", line 65, in process_chain
>             d.callback(input)
>           File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 382, in callback
>             self._startRunCallbacks(result)
>           File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 490, in _startRunCallbacks
>             self._runCallbacks()
>         --- <exception caught here> ---
>           File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 577, in _runCallbacks
>             current.result = callback(current.result, *args, **kw)
>           File "/home/brendanb/scrapy/cbs-crawler/cbs/pipelines.py", line 15, in process_item
>             if item['model_number']:
>           File "/usr/local/lib/python2.7/dist-packages/scrapy/item.py", line 50, in __getitem__
>             return self._values[key]
>         exceptions.KeyError: 'model_number'
>
>
> If I change this pipeline to just set a default value, I can work
> around it.
>
> class ModelNumberPipeline(object):
>     def process_item(self, item, spider):
>         item.setdefault('model_number', '')
>         return item
>
>
> My question is: why would the first pipeline fail when checking for a null value?
>
> thanks
> Brendan
>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
>
