I would like to know whether it is a good (or maybe bad) idea to give Scrapy a feature which allows writing items in compressed JSON, XML, CSV and other supported plaintext output formats to the local disk.
I know that this could be solved by using "-" to write to stdout and compressing it on the fly on Unix-like systems, but I don't know whether that is feasible when running scrapyd on a remote system - or is it? That's why I'd suggest this might make a good extension, configured in the settings.py file of the Scrapy project, probably as an optional middleware(?).

The background: we use Scrapy to write to disk without any further processing and batch-process the output afterwards with an ETL toolchain, which cleans up the raw crawling results once it is done. We'd like to reduce the disk space used between these two steps.

--
Best regards,
Stefan Antoni
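
P.S. To make the idea more concrete, here is a rough sketch of how this could probably be done today with a small item pipeline that gzips a JSON-lines feed. The class name, output path and ITEM_PIPELINES entry are just placeholders, and on older Scrapy releases the exporter import lives in scrapy.contrib.exporter rather than scrapy.exporters:

    import gzip

    from scrapy.exporters import JsonLinesItemExporter

    class GzipJsonWriterPipeline(object):
        """Write each scraped item to a gzip-compressed JSON-lines file."""

        def open_spider(self, spider):
            # Placeholder output path; adjust to the project layout.
            self.file = gzip.open('%s-items.jl.gz' % spider.name, 'wb')
            self.exporter = JsonLinesItemExporter(self.file)
            self.exporter.start_exporting()

        def process_item(self, item, spider):
            self.exporter.export_item(item)
            return item

        def close_spider(self, spider):
            self.exporter.finish_exporting()
            self.file.close()

    # settings.py (placeholder project/module names):
    # ITEM_PIPELINES = {'myproject.pipelines.GzipJsonWriterPipeline': 300}

A built-in option would still be nicer, of course, since a sketch like this bypasses the normal feed export settings and has to be copied into every project.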
