On Tue, Aug 13, 2019 at 10:23 PM Rodney W. Grimes <free...@gndrsh.dnsmgr.net> wrote:
> If you had followed the thread you would know that byte 1
> of the file is a 0xA, aka LF, and the dd was to rip that
> byte off the file, but the command got morphed cause I
> used a BSD iseek=1 syntax, and gnu dd does not understand
> that.
>

Yes, dd is a fine tool: pretty cool for seeking into files given an
index of offsets, imaging devices, creating sparse files, etc.

But personally, I'd use tail or grep or sed to skip over that first
character, which in this case is the same as a blank line.

    $ tail -n +2
    $ tail -c +2
    $ grep .
    $ sed -ne '/./p'

That sed could be extended to remove the space after commas and the
trailing comma on each line. So in the end, you'd have just two
commands: tr and sed.

    $ time -p < hatchery_returns-2019-08-12.csv \
    > tr '\r' '\n' |
    > sed -ne 's/, /,/g;s/,$//;/./p' |
    > md5
    a02aa3be8cbe68e1b76debbd0b1586e7
    real 1.31
    user 1.96
    sys 0.03

Of course, dd can be added to the mix:

    $ time -p < hatchery_returns-2019-08-12.csv \
    > dd bs=1 iseek=1 |
    > tr '\r' '\n' |
    > sed -ne 's/, /,/g;s/,$//;/./p' |
    > md5
    12746088+0 records in
    12746088+0 records out
    12746088 bytes transferred in 38.344862 secs (332407 bytes/sec)
    a02aa3be8cbe68e1b76debbd0b1586e7
    real 38.35
    user 22.58
    sys 40.10

As the md5 shows, you get the same results. It's just a matter of
personal preference.

Regards,
- Robert
_______________________________________________
PLUG mailing list
PLUG@pdxlinux.org
http://lists.pdxlinux.org/mailman/listinfo/plug
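
The equivalence of the approaches above can be checked on a small file. What follows is a minimal sketch, not the original hatchery data: the sample contents and the `clean` helper name are made up for illustration. It also assumes `skip=1` in place of `iseek=1`, since `skip=` is the POSIX spelling that both BSD and GNU dd accept.

```shell
#!/bin/sh
# Hypothetical sample: a leading LF, CR line endings, ", " separators,
# and a trailing comma on each record.
tmp=$(mktemp)
printf '\na, b, c,\rd, e, f,\r' > "$tmp"

# The tr + sed stage from the message, wrapped in a helper (name is illustrative).
clean() {
    tr '\r' '\n' | sed -ne 's/, /,/g;s/,$//;/./p'
}

# Variant 1: let sed's /./p drop the blank first line.
out_sed=$(clean < "$tmp")

# Variant 2: strip the first byte with tail.
out_tail=$(tail -c +2 "$tmp" | clean)

# Variant 3: strip the first byte with dd, one byte per block.
out_dd=$(dd bs=1 skip=1 2>/dev/null < "$tmp" | clean)

# Prints two lines: a,b,c and d,e,f
printf '%s\n' "$out_sed"

rm -f "$tmp"
```

All three variables should hold the same text. With `bs=1`, dd moves the data one byte at a time, which is consistent with the much longer run time shown in the timings above.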