On Wednesday, 10 May 2017 at 22:20:52 UTC, Nordlöw wrote:
What's the fastest way to on-the-fly decompress and process a
gzipped csv-file line by line?
Is it possible to combine
http://dlang.org/phobos/std_zlib.html
with some stream variant of
File(path).byLineFast
?
I was curious what byLineFast was. I'm guessing it's from here:
https://github.com/biod/BioD/blob/master/bio/core/utils/bylinefast.d
I didn't test it, but it appears to pre-date the speed
improvements made to std.stdio.byLine perhaps a year and a half
ago. If so, it might be worth comparing it to the current Phobos
version, and of course to iopipe.
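For a baseline to compare against, here is a rough, untested-for-speed
sketch that uses only Phobos: it reads the file with File.byChunk, feeds
each chunk to std.zlib.UnCompress, and splits the decompressed bytes on
'\n'. The name eachGzipLine and the 64 KiB chunk size are my own choices
for illustration, not anything from BioD or iopipe.

```d
import std.stdio : File;
import std.zlib : UnCompress, HeaderFormat;

/// Hypothetical helper: calls `sink` once per line of the gzipped file
/// at `path`. The '\n' terminator is stripped; a trailing '\r' is not.
/// The slice passed to `sink` is only valid during the call.
void eachGzipLine(string path, scope void delegate(const(char)[] line) sink)
{
    auto decomp = new UnCompress(HeaderFormat.gzip);
    char[] pending; // tail of the data seen so far that has no '\n' yet

    void feed(const(void)[] data)
    {
        pending ~= cast(const(char)[]) data;
        size_t start = 0;
        foreach (i; 0 .. pending.length)
        {
            if (pending[i] == '\n')
            {
                sink(pending[start .. i]);
                start = i + 1;
            }
        }
        // Keep only the incomplete last line for the next chunk.
        if (start > 0)
            pending = pending[start .. $].dup;
    }

    // Chunk size is an arbitrary assumption; tune it when benchmarking.
    foreach (chunk; File(path, "rb").byChunk(64 * 1024))
        feed(decomp.uncompress(chunk));
    feed(decomp.flush());

    // Final line when the file does not end with a newline.
    if (pending.length)
        sink(pending);
}
```

Something like eachGzipLine("data.csv.gz", (line) { /* process line */ })
would drive it. I'd expect a tuned implementation to avoid the copying
done here, so treat it as a starting point for measurements only.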
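As mentioned in one of the other replies, byLine and its variants
aren't appropriate for CSV with escapes: a quoted field can contain
the delimiter or a newline, so splitting on line boundaries breaks
records. For that, a real CSV parser is needed. As an alternative,
run a converter that converts from CSV to another format that is
easier to process line by line.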
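To illustrate the difference, a minimal sketch using std.csv.csvReader
from Phobos, which understands quoted fields. The helper name parseCsv is
made up for this example, and it takes the whole text in memory rather
than a stream, so it only shows the parsing side.

```d
import std.array : array;
import std.csv : csvReader;

/// Hypothetical helper: parses CSV text into rows of string fields,
/// honoring quotes so that embedded commas and newlines stay in a field.
string[][] parseCsv(string text)
{
    string[][] rows;
    foreach (record; csvReader!string(text))
        rows ~= record.array; // copy the fields out of the lazy record
    return rows;
}
```

With input such as `a,"b,c"` followed by a record whose quoted field
spans two lines, byLine would split in the middle of the field, while
the parser above keeps each record together.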
--Jon