Hi everyone,
I am using an ItemLoader to parse a JSON document (a dict object) and
I ran into this strange behaviour:
although my JSON path in the SelectJmes processor is correct (tested, OK), the
ItemLoader does not extract the values.
It turned out that the function arg_to_iter (in scrapy.utils.misc)
wraps a dictionary in a list
(it seems to be a feature, or at least a known behaviour, since the
docstring says 'Exception: if arg is a dict, [arg] will be returned').
Basically, to work around this behaviour, I can:
- change my JSON path to something like "[0].key1.key2" instead of
"key1.key2" (which is weird)
- change my processor list to TakeFirst(), SelectJmes("key1.key2")
- use a string representation of my JSON object and change my processor
list to json.loads, SelectJmes("key1.key2"), as in the docs:
http://doc.scrapy.org/en/latest/topics/loaders.html#scrapy.loader.processors.SelectJmes
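
To make the effect concrete, here is a small pure-Python sketch of what I think is going on. It does not import Scrapy or jmespath: arg_to_iter_sketch, select_path and take_first are hypothetical, simplified stand-ins I wrote for arg_to_iter, SelectJmes and TakeFirst, and the sample dict is made up. It assumes the processor receives the wrapped list rather than the dict itself.

```python
def arg_to_iter_sketch(arg):
    """Simplified stand-in for scrapy.utils.misc.arg_to_iter (not the real source)."""
    if arg is None:
        return []
    # A dict (like str and bytes) is treated as a single value and wrapped in a list.
    if isinstance(arg, (dict, str, bytes)) or not hasattr(arg, "__iter__"):
        return [arg]
    return arg


def select_path(path):
    """Hypothetical, much reduced stand-in for SelectJmes: dotted keys only."""
    def processor(value):
        for key in path.split("."):
            if not isinstance(value, dict):
                # A list (or any non-dict) has no "key1", so nothing is selected.
                return None
            value = value.get(key)
        return value
    return processor


def take_first(values):
    """Hypothetical stand-in for TakeFirst: first value that is not None or ''."""
    for value in values:
        if value is not None and value != "":
            return value
    return None


data = {"key1": {"key2": "expected"}}  # made-up sample input
wrapped = arg_to_iter_sketch(data)      # the dict is now inside a list

# The path is applied to the list, so it finds nothing.
direct = select_path("key1.key2")(wrapped)

# Unwrapping first (second workaround above) restores the lookup.
unwrapped = select_path("key1.key2")(take_first(wrapped))

print(direct)     # None
print(unwrapped)  # expected
```

With the real SelectJmes, the "[0].key1.key2" path does the same unwrapping inside the JMESPath expression instead of in a separate processor.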
I did not find anything about this behaviour (whether it is intended or not) and
I spent quite some time figuring out why, so it is probably worth sharing.
Have a nice day
Julien