On Wednesday, August 6, 2014 10:31:51 UTC+2, Gabriel Birke wrote:
>
> I have a spider that returns different item types (books, authors and
> publishers) and would like to export each item type to its own file, being
> flexible about the format (CSV or JSON). My first approach would be to use
> a pipeline class, but then I lose the easy functionality of specifying the
> feed exporters on the command line. Now here is a bunch of questions:
>
> 1) Is this the right approach or should I implement this differently?
> 2) How can I reuse the exporters in a pipeline?
>
I've made a second approach: an extension that subclasses the FeedExporter class, reusing the existing functionality and code without having to change my command line. Basically, it uses a separate SpiderSlot (exporter and storage) for each item type (though, depending on how the output URL/filename is built, multiple item types may share the same slot). Here is the code: https://gist.github.com/gbirke/abc10c81aca8242b880a

It works and creates the files, but there is something wrong with the way I "collect" the Deferred objects <https://gist.github.com/gbirke/abc10c81aca8242b880a#file-multifeedexporter-py-L54> from the calls to defer.maybeDeferred(slot.storage.store, slot.file): the crawler just sits there waiting instead of terminating. Can anyone point out to me the correct way to collect the Deferred objects into one?
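I suspect I need something like defer.DeferredList here. Below is a simplified sketch of what I mean, not the exact code from the gist; the self.slots dict (keyed by item type) and the _store_finished callback are placeholders for my actual names:

from twisted.internet import defer
from scrapy.contrib.feedexport import FeedExporter


class MultiFeedExporter(FeedExporter):

    def close_spider(self, spider):
        deferreds = []
        for itemtype, slot in self.slots.items():
            if not slot.itemcount and not self.store_empty:
                continue
            slot.exporter.finish_exporting()
            # store() may return a Deferred (e.g. for FTP/S3 storages) or a
            # plain value; maybeDeferred normalises both cases.
            d = defer.maybeDeferred(slot.storage.store, slot.file)
            d.addCallback(self._store_finished, itemtype, spider)
            deferreds.append(d)
        # Returning a single Deferred from the spider_closed handler should
        # make Scrapy wait for it before shutting down; DeferredList fires
        # once every per-slot store has finished.
        return defer.DeferredList(deferreds) if deferreds else None

    def _store_finished(self, result, itemtype, spider):
        # Placeholder: log which feed finished storing.
        spider.log("Stored feed for item type %s" % itemtype)
        return result

Does returning a DeferredList from close_spider look like the right direction, or do I need to chain the Deferreds differently?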
