On 2017-12-06 00:06, Steve D'Aprano wrote:
On Wed, 6 Dec 2017 04:20 am, Jason wrote:
I ran into this:
https://stackoverflow.com/questions/27707581/why-does-csv-dictreader-skip-empty-lines
# unlike the basic reader, we prefer not to return blanks,
# because we will typically wind up with a dict full of None
# values
while iterating over two files that correspond line by line. The
DictReader skipped ahead many lines, breaking the line-by-line
correspondence.
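For reference, a minimal sketch of the difference being described. The sample text is made up for illustration; it is not taken from the files in question.

```python
import csv
import io

# Made-up sample: a header, one record, a blank line, another record.
text = "a,b\n1,2\n\n3,4\n"

# The basic reader returns an empty list for the blank line.
plain_rows = list(csv.reader(io.StringIO(text)))

# DictReader consumes the header and silently skips the blank line.
dict_rows = [dict(row) for row in csv.DictReader(io.StringIO(text))]

print(plain_rows)
print(dict_rows)
```

So the basic reader produces one item per physical line here, while DictReader produces one item per non-blank record.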
Um... this doesn't follow. If they are line-by-line corresponding, then they
should skip the same number of blank lines and read the same number of
non-blank lines.
Even if one file has blanks and the other does not, if you iterate over
the records themselves, they should keep their correspondence.
I'm afraid that if you want to convince me this is a buggy design, you need to
demonstrate a simple pair of CSV files where the non-blank lines are
corresponding (possibly with differing numbers of blanks in between) but the
CSV readers get out of alignment somehow.
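As a sketch of the kind of test meant here, assuming two hypothetical files whose records correspond but whose blank lines do not, pairing the records with zip() keeps them aligned:

```python
import csv
import io

# Hypothetical file contents: same records, different numbers of blank lines.
left_text = "id,name\n1,ann\n\n\n2,bob\n3,cy\n"
right_text = "id,score\n1,10\n2,20\n\n3,30\n"

left = csv.DictReader(io.StringIO(left_text))
right = csv.DictReader(io.StringIO(right_text))

# Pair the records, not the physical lines.
pairs = [(lrow["id"], rrow["id"]) for lrow, rrow in zip(left, right)]
aligned = all(a == b for a, b in pairs)

print(pairs, aligned)
```

A counter-example would need to be a pair of inputs like these for which `aligned` comes out false.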
And I want to argue that the difference of behavior should be considered a
bug. It should be considered as such because: 1. I need to know what's in
the file to know what class to use.
Sure. But blank lines don't tell you what class to use.
The file content should not break at-least-1-record-per-line.
Blank lines DO break that requirement. A blank line is not a record.
There may be multiple lines per record in the
case of embedded newlines, but there should never be no record for a line.
I disagree. A blank line is not a record. If I have (say) five fields, then:
,,,,\n
is a blank record with five empty fields. \n alone is just a blank. The
DictReader correctly returns records with blank fields.
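A small sketch of that distinction, using five hypothetical field names passed in explicitly so that no header line is needed:

```python
import csv
import io

fields = ["a", "b", "c", "d", "e"]  # hypothetical field names

# A line of four commas is a record with five empty fields.
empty_record = [
    dict(row) for row in csv.DictReader(io.StringIO(",,,,\n"), fieldnames=fields)
]

# A bare newline is not a record; DictReader yields nothing for it.
blank_line = [
    dict(row) for row in csv.DictReader(io.StringIO("\n"), fieldnames=fields)
]

print(empty_record)
print(blank_line)
```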
A blank line could be a record if there's only one field and it's empty.
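For what it's worth, a sketch of how the csv module treats the one-field case with the default dialect. A bare newline still reads back as an empty row, whereas a quoted empty string reads back as one empty field, and the writer quotes a lone empty field so the two stay distinguishable:

```python
import csv
import io

# Reading: a bare newline versus a quoted empty field.
rows = list(csv.reader(io.StringIO('\n""\n')))

# Writing: a record consisting of a single empty field.
buf = io.StringIO()
csv.writer(buf).writerow([""])
written = buf.getvalue()

print(rows)
print(repr(written))
```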
[snip]