Hi all:

After giving up on reaching the developers on GitHub, I cloned the xsv repository and created a new module, "lengths.rs", by adapting the fixlengths.rs module and carefully following the conventions in the files needed to enable it as a command.

The build compiled successfully, and I now have a new xsv command, "lengths", which lists the record number and the number of fields for each record in the CSV file. For example, pipe its output through sort and uniq:

% xsv lengths my_file.csv | sort | uniq -c

One can quickly get a count of each field length that actually occurs in the CSV file. If some records omit a field or contain an extraneous one, rerunning xsv lengths and filtering with grep -v {good length} will show which records need fixing.
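
For anyone without the patched build, the same check can be roughly sketched in Python with the standard csv module. This is not the Rust module described above, only a stand-in under stated assumptions: the function names (field_counts, tally, odd_records) are made up for illustration, the header line is counted as record 1, and "length" is taken to mean the number of fields in a record.

```python
import csv
import io
from collections import Counter


def field_counts(lines):
    """Yield (record_number, field_count) for each CSV record, numbered from 1."""
    for record_number, row in enumerate(csv.reader(lines), start=1):
        yield record_number, len(row)


def tally(lines):
    """Count how many records have each field count (like sort | uniq -c)."""
    return Counter(count for _, count in field_counts(lines))


def odd_records(lines, good_length):
    """Return record numbers whose field count differs from good_length
    (the role grep -v {good length} plays in the shell pipeline)."""
    return [n for n, count in field_counts(lines) if count != good_length]


if __name__ == "__main__":
    # Hypothetical sample: record 3 is missing a field, record 5 has an extra one.
    # Record 4 contains a quoted comma, which the csv module treats as one field.
    sample = 'a,b,c\n1,2,3\n4,5\n"6,7",8,9\n10,11,12,13\n'
    print(tally(io.StringIO(sample)))
    print(odd_records(io.StringIO(sample), good_length=3))
```

On a real file, one would pass open("my_file.csv", newline="") in place of the StringIO sample. Expect this to be much slower than xsv on gigabyte-sized inputs; it is only meant to show the idea.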

The command is fast: a 1.19 GB file with 2,100,878 records of 79 fields each took 21 seconds to run.

- Randall

