Hi Lhassan,

Okay, thank you for your reply.
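For anyone who finds this thread later, here is a minimal sketch of the ZIP store idea, for illustration only. It uses nothing but the Python standard library. The class name ZipFilesStore comes from the proposal quoted below; the method names persist_file and stat_file are chosen to resemble the interface of Scrapy's built-in file stores, which is an assumption and has not been checked against a particular Scrapy version. Wiring it into Scrapy would presumably mean subclassing FilesPipeline and registering the store under a custom URI scheme; that part is also an assumption and is not shown.

```python
import hashlib
import io
import os
import tempfile
import time
import zipfile


class ZipFilesStore:
    """Hypothetical store that appends every file to a single ZIP archive."""

    def __init__(self, archive_path):
        self.archive_path = archive_path

    def persist_file(self, path, buf, info=None, meta=None, headers=None):
        # buf is a file-like object holding the downloaded bytes.
        # info, meta and headers are accepted only to mirror the assumed
        # store interface; this sketch ignores them.
        buf.seek(0)
        data = buf.read()
        # Mode "a" creates the archive if it is missing, otherwise appends.
        with zipfile.ZipFile(self.archive_path, "a", zipfile.ZIP_DEFLATED) as archive:
            archive.writestr(path, data)

    def stat_file(self, path, info=None):
        # An empty dict means "not stored yet".
        if not os.path.exists(self.archive_path):
            return {}
        with zipfile.ZipFile(self.archive_path, "r") as archive:
            try:
                entry = archive.getinfo(path)
            except KeyError:
                return {}
            checksum = hashlib.md5(archive.read(path)).hexdigest()
        # ZIP entries carry a (Y, M, D, h, m, s) local timestamp.
        last_modified = time.mktime(entry.date_time + (0, 0, -1))
        return {"last_modified": last_modified, "checksum": checksum}


# Small demonstration with a throwaway archive.
demo_archive = os.path.join(tempfile.mkdtemp(), "files.zip")
store = ZipFilesStore(demo_archive)
store.persist_file("full/example.html", io.BytesIO(b"<html>hello</html>" * 100))
print(store.stat_file("full/example.html"))
print(store.stat_file("full/missing.html"))
```

Two caveats on the design: writing the same path twice adds a second entry to the archive instead of replacing the first, so a caller should check stat_file before persisting, and opening the archive once per file will likely become slow with millions of entries, so a real store would probably keep the archive open or shard it.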
Kasper

On Thursday, August 18, 2016 at 2:50:18 PM UTC+2, Lhassan Baazzi wrote:
>
> Hi,
>
> If you are going with the ZIP option, just create your own pipeline that
> extends the base file pipeline and publish it as a package on GitHub, in
> case someone else needs it and wants to use it.
>
> Best Regards.
> Lhassan
>
> On 18 August 2016 at 13:16, "Kasper Marstal" <[email protected]> wrote:
>
>> Hi all,
>>
>> I am scraping a couple of million documents and need to save space on my
>> disk to store the data. An attractive option is to save the files directly
>> to a ZIP file, since the compression ratio is really good with this kind of
>> data (~18). However, the FilesPipeline does not allow me to provide my own
>> files store unless I hack away on the Scrapy code itself, which I would
>> like to avoid. So, a couple of questions for the Scrapy developers:
>>
>> - Are you interested in a patch that allows the FilesPipeline to accept
>> custom store schemes? OR
>> - Are you interested in a patch with a ZipFilesStore? In addition,
>> - Is this ZIP-file approach a common way of dealing with large amounts of
>> data, or do you have best practices on this subject that I am not aware of?
>>
>> Kind Regards,
>> Kasper Marstal
>>
>

-- 
You received this message because you are subscribed to the Google Groups "scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.
