Another way in Python is to use the csv module and read the data line by line, checking the data quality at each step. The csv module handles different delimiters, quoted fields (including commas embedded in them), and rows with a varying number of fields.
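To show what the quoted-field handling means in practice, here is a minimal sketch. The sample data and the helper name field_counts are made up for illustration and are not taken from the files discussed in this thread.

```python
import csv
import io

# Made-up sample data: the name field in the first data row holds a comma.
sample = 'id,name,city\n1,"Smith, John",Portland\n2,Jane Doe,Salem\n'


def field_counts(text, delimiter=','):
    """Return the number of fields in each record of a CSV string."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [len(row) for row in reader]


rows = list(csv.reader(io.StringIO(sample)))

# The quoted comma stays inside a single field.
print(rows[1])                                  # ['1', 'Smith, John', 'Portland']
print(field_counts(sample))                     # [3, 3, 3]

# A naive split on every comma miscounts the same line.
print(len(sample.splitlines()[1].split(',')))   # 4
```

Applied to a file on disk, the same reader looks like this: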

#!/usr/bin/env python3
import csv

# newline='' lets the csv module handle line endings itself,
# as the csv documentation recommends
with open('file.csv', 'r', newline='') as infile:
    # reader yields each row as a list of strings
    lines = csv.reader(infile, delimiter=',')
    for line in lines:
        # check for proper length
        print(len(line))

On Sun, May 21, 2023 at 5:42 AM Rich Shepard <rshep...@appl-ecosys.com> wrote:
> On Sat, 20 May 2023, American Citizen wrote:
>
> > 1. using awk -F, fails when a cell contains a quoted cell with an embedded
> > comma
>
> I download .csv files from agency databases where strings are double quoted
> and contain commas within them, as well as using commas to separated fields.
>
> I start my gawk script with
> BEGIN { FS="," }
> and it separates (or counts, or selects) fields ignoring commas within
> quoted strings.
>
> Rich