On Wednesday, August 6, 2014 10:31:51 UTC+2, Gabriel Birke wrote:
>
> I have a spider that returns different item types (books, authors and
> publishers) and would like to export each item type to its own file, being
> flexible about the format (CSV or JSON). My first approach would be to use
> a pipeline class, but then I lose the easy functionality of specifying the
> feed exporters on the command line. Now here is a bunch of questions:
>
> 1) Is this the right approach or should I implement this differently?
> 2) How can I reuse the exporters in a pipeline?
>
I've made a second approach: an extension that subclasses the FeedExporter class, reusing the existing functionality and code without having to change my command line. Basically, it uses a separate SpiderSlot (exporter and storage) for each item type (though, depending on how the output URL/filename is built, multiple item types may share the same slot). Here is the code: https://gist.github.com/gbirke/abc10c81aca8242b880a

It works and creates the files, but there is something wrong with the way I "collect" the Deferred objects <https://gist.github.com/gbirke/abc10c81aca8242b880a#file-multifeedexporter-py-L54> from the calls to defer.maybeDeferred(slot.storage.store, slot.file): the crawler just sits there waiting instead of terminating. Can anyone point out to me the correct way to collect the Deferred objects into one?
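I suspect I need something like defer.DeferredList here. Below is a simplified sketch of what I mean, not the exact code from the gist; the self.slots dict (keyed by item type) and the _store_finished callback are placeholders for my actual names:

from twisted.internet import defer
from scrapy.contrib.feedexport import FeedExporter


class MultiFeedExporter(FeedExporter):

    def close_spider(self, spider):
        deferreds = []
        for itemtype, slot in self.slots.items():
            if not slot.itemcount and not self.store_empty:
                continue
            slot.exporter.finish_exporting()
            # store() may return a Deferred (e.g. for FTP/S3 storages) or a
            # plain value; maybeDeferred normalises both cases.
            d = defer.maybeDeferred(slot.storage.store, slot.file)
            d.addCallback(self._store_finished, itemtype, spider)
            deferreds.append(d)
        # Returning a single Deferred from the spider_closed handler should
        # make Scrapy wait for it before shutting down; DeferredList fires
        # once every per-slot store has finished.
        return defer.DeferredList(deferreds) if deferreds else None

    def _store_finished(self, result, itemtype, spider):
        # Placeholder: log which feed finished storing.
        spider.log("Stored feed for item type %s" % itemtype)
        return result

Does returning a DeferredList from close_spider look like the right direction, or do I need to chain the Deferreds differently?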
