On Tue, Aug 13, 2019 at 10:23 PM Rodney W. Grimes <free...@gndrsh.dnsmgr.net>
wrote:

> If you had followed the thread you would know that byte 1
> of the file is a 0xA, aka LF, and the dd was to rip that
> byte off the file, but the command got morphed cause I
> used a BSD iseek=1 syntax, and gnu dd does not understand
> that.
>

Yes, dd is a fine tool: pretty cool for seeking into files given an index
of offsets, imaging devices, creating sparse files, etc.

But personally, I'd use tail, grep, or sed to skip over that first
character, which in this case amounts to a blank first line.

$ tail -n +2
$ tail -c +2
$ grep .
$ sed -ne '/./p'
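
To see how the four behave, here is a minimal sketch on a made-up sample
(not the real CSV) whose first byte is a newline. It assumes the input
already uses LF line endings; the sample function is only for illustration.

```shell
# Hypothetical sample data: a leading LF, then two data lines.
sample() { printf '\na,b\nc,d\n'; }

sample | tail -n +2      # start output at line 2, dropping the empty first line
sample | tail -c +2      # start output at byte 2, dropping the leading LF
sample | grep .          # keep only lines containing at least one character
sample | sed -ne '/./p'  # the same filter, written in sed
```

The two tail forms only touch the start of the input, while grep and sed
drop every empty line, wherever it appears.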

That sed could be extended to remove the space after commas and the
trailing comma on each line.  So in the end, you'd have just two commands:
tr and sed.

$ time -p < hatchery_returns-2019-08-12.csv \
> tr '\r' '\n' |
> sed -ne 's/, /,/g;s/,$//;/./p' |
> md5
a02aa3be8cbe68e1b76debbd0b1586e7
real 1.31
user 1.96
sys 0.03
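
The same two commands can be tried on a small made-up input. This is only a
sketch: the sample rows and the clean function name are invented for
illustration, and the sample assumes CR line endings with a leading LF.

```shell
# Hypothetical helper wrapping the tr and sed stages shown above.
clean() {
    tr '\r' '\n' |                       # turn CR line endings into LF
    sed -ne 's/, /,/g;s/,$//;/./p'       # drop the space after commas, drop the
                                         # trailing comma, print non-empty lines
}

# Made-up input: leading LF, CR-terminated rows, ", " separators, trailing commas.
printf '\nid, name, count,\r1, coho, 42,\r' | clean
```

The empty first line never matches /./, so it is not printed.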

Of course, dd can be added to the mix:

$ time -p < hatchery_returns-2019-08-12.csv \
> dd bs=1 iseek=1 |
> tr '\r' '\n' |
> sed -ne 's/, /,/g;s/,$//;/./p' |
> md5
12746088+0 records in
12746088+0 records out
12746088 bytes transferred in 38.344862 secs (332407 bytes/sec)
a02aa3be8cbe68e1b76debbd0b1586e7
real 38.35
user 22.58
sys 40.10
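
For anyone on GNU dd, where iseek= is not recognized (as the quoted message
notes), the standard skip= operand should do the same job. A minimal sketch
on a made-up sample, not the real CSV:

```shell
# Hypothetical sample data: a leading LF, then two data lines.
sample() { printf '\na,b\nc,d\n'; }

# skip=N skips N input blocks; with bs=1 a block is a single byte.
# dd's record counts go to stderr, so they are discarded here.
sample | dd bs=1 skip=1 2>/dev/null

# tail produces the same bytes without the one-byte block size.
sample | tail -c +2
```

With bs=1, dd issues one read and one write per byte, which fits the much
larger real and sys times in the run above.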

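As the matching md5 sums show, both pipelines produce the same output,
although the dd version took far longer here (38.35 s real versus 1.31 s).
Beyond that, it's a matter of personal preference.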

Regards,
-  Robert
_______________________________________________
PLUG mailing list
PLUG@pdxlinux.org
http://lists.pdxlinux.org/mailman/listinfo/plug
