Thank you all very much for your detailed feedback!

I wound up pulling the "TREE_GRM_ESTN.csv" file referred to by Jon and used it in subsequent tests. I wrote two D programs: one opens the file directly as a File and reads it with byLine(), the other reads byLine() from stdin (which is itself a File) with the data piped in by the shell.
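For anyone who wants to repeat the comparison, here is a minimal sketch of the two read paths folded into one program. It is not my exact test code; the countLines helper and the "path argument vs. no argument" switch are just illustrative assumptions.

```d
import std.stdio;

/// Hypothetical helper: counts lines by iterating byLine() over any open File.
size_t countLines(File input)
{
    size_t n = 0;
    foreach (line; input.byLine())
        ++n;
    return n;
}

void main(string[] args)
{
    // With a path argument, open the file directly;
    // without one, read whatever the shell pipes into stdin.
    auto input = args.length > 1 ? File(args[1], "r") : stdin;
    writeln(countLines(input));
}
```

Assuming the binary is called readtest, the direct case would be timed as `time ./readtest TREE_GRM_ESTN.csv` and the piped case as `time cat TREE_GRM_ESTN.csv | ./readtest`.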

After copying the large CSV file to /dev/shm/ (i.e. a ramdisk), I re-ran the two programs repeatedly and got close to the 20-30% overhead I would expect from going through a shell pipe and its buffer; my results now match Jon's above.

Lesson learned: be wary of networked I/O systems (e.g. Isilon storage arrays); all kinds of weirdness can happen there ...
