Hello,
A quick workaround, if it suits you, is to return a plain dict instead of
an Item. Scrapy 1.0 and later support it.
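A rough sketch of the dict approach, assuming the unknown fields arrive as
name/value pairs. The helper name build_record and the sample values are
made up for illustration; in a spider you would return or yield the
resulting dict from parse().

```python
def build_record(title, unknown_fields):
    """Merge the known 'title' with dynamically named fields into one dict."""
    record = {"title": title.strip()}
    for name, value in unknown_fields.items():
        # A plain dict accepts any key, so no field has to be declared up front.
        record[name.strip()] = value.strip()
    return record


# Hypothetical scraped values, only to show the shape of the result.
record = build_record(" Some title ", {"colour ": " red", "size": "XL "})
```

The trade-off is that nothing validates the keys any more, so a typo in a
field name passes through silently.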
A tidier solution is to add an "extras" field to your item and store each
unknown_field in that extras dict.
If neither of those works for you, tell me and I'll have a look, but frankly
this is a small Scrapy "problem". The larger problem is how you'll consume
schema-less data after you scrape it. The answer to that question will
answer your Scrapy question as well.
On Tuesday, May 24, 2016 at 10:57:45 PM UTC+1, JEBI93 wrote:
>
> Hey there,
>
> I’m trying to figure out whether it is possible to assign some item field
> names dynamically while the rest are defined in items.py. Here’s sample code:
>
> from scrapy import Spider
> from scrapy.loader import ItemLoader
> from scrapy.loader.processors import Compose, MapCompose
> from w3lib.html import replace_escape_chars
>
> from some_spider.items import someItems
>
>
> class someSpider(Spider):
>     name = 'domain'
>     allowed_domains = ['domain.com']
>     # rest of code...
>
>     def parse(self, response):
>
>         # code...
>
>         l = ItemLoader(item=someItems(), response=response)
>         l.default_output_processor = MapCompose(lambda v: v.strip(),
>                                                 replace_escape_chars)
>
>         # title is defined in items.py page
>         l.add_xpath('title', '//h1/text()')
>
>         # unknown_field is unknown item name
>         unknown_fields = response.xpath('//*[@class="foo"]/text()').extract()
>         for unknown_field in unknown_fields:
>             l.add_value('unknown_field', unknown_field)
>
>         return l.load_item()
>
> I searched for solutions for this kind of setup, but none of the solutions
> online work. Any help is appreciated, thanks.
>
>