[issue36172] csv module internal consistency

Josh Rosenberg Mon, 04 Mar 2019 10:39:03 -0800


Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:


Unless someone disagrees soon, I'm going to close this as documented 
behavior/not a bug. AFAICT, the only "fixes" available for this are:

1. Change the default dialect from 'excel' to something else. Problem: Breaks 
correct code dependent on the excel dialect, but code could explicitly opt back 
in.

2. Change the 'excel' dialect. Problem: Breaks correct code dependent on the 
excel dialect, with no obvious way to opt back in.

3. Per #10954, check the file object to ensure it's not translating newlines 
and raise an exception otherwise. Problem: AFAICT, there is no documented API 
to check this. The result of calling open, with or without passing newline='', 
looks identical initially, never changes in write mode, and even in read mode 
only exposes the newlines observed through the .newlines attribute, not whether 
or not they were translated. Adding such an API to open's result wouldn't 
change all other file-like objects, so the change would need to propagate to 
all other built-in and third-party file APIs. And for some file-like objects, 
it wouldn't make sense to have this API at all (io.StringIO, being purely in 
memory, doesn't need to do translation of any kind).

4. (Extreme solution) Add io APIs (or add arguments to APIs) for 
reading/writing without newline translation (that is, whether or not newline is 
passed to open, you can read/write without translation), e.g. read(size) 
becomes read(size, translate_newlines=None) where None indicates default 
behavior, or we add read_untranslated(size) as an independent API. Problem: 
Like #3, this requires us to create new, mandatory APIs in the io module that 
would then need to propagate to all other built-in and third-party file APIs.
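
The problem described under option 3 can be checked directly. The sketch below 
is only an illustration: the helper name visible_state and the temp file names 
are made up here, and the attribute list is just the public TextIOWrapper 
attributes I'm aware of. It compares two files opened for writing, one with 
newline='' and one without.

```python
import os
import tempfile


def visible_state(f):
    """Collect the documented public attributes of a text file object."""
    return (f.mode, f.encoding, f.errors, f.newlines, f.line_buffering)


with tempfile.TemporaryDirectory() as tmp:
    default_path = os.path.join(tmp, "default.csv")
    raw_path = os.path.join(tmp, "raw.csv")

    with open(default_path, "w", encoding="utf-8") as translating, \
         open(raw_path, "w", encoding="utf-8", newline="") as untranslated:
        # Snapshot right after opening, before anything is written.
        before = (visible_state(translating), visible_state(untranslated))
        translating.write("a,b\r\n")
        untranslated.write("a,b\r\n")
        # Snapshot again after writing a line that ends in \r\n.
        after = (visible_state(translating), visible_state(untranslated))

same_before = before[0] == before[1]
same_after = after[0] == after[1]
print(same_before, same_after)
```

Nothing in these attributes tells the two files apart, which is why a check 
inside the csv module has nothing documented to inspect.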

Point is, the simple solutions (1/2) break correct code, and the complex 
solutions (3/4) involve major changes to the io module (and all other file-like 
object producers) and/or the csv module.
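
For reference, the "opt back in" mentioned under option 1 is just the existing 
dialect/format parameters. A minimal sketch (write_rows is a helper invented 
for this example, and io.StringIO is used so no file translation is involved):

```python
import csv
import io


def write_rows(rows, **fmtparams):
    """Write rows with csv.writer and return the exact text produced."""
    buf = io.StringIO()
    csv.writer(buf, **fmtparams).writerows(rows)
    return buf.getvalue()


rows = [["a", "b"], ["1", "2"]]

# Today's default: the 'excel' dialect, which terminates lines with \r\n.
default_out = write_rows(rows)

# What dependent code would write if the default ever changed (option 1).
explicit_excel = write_rows(rows, dialect="excel")

# Code that wants plain \n can already override the terminator itself.
lf_only = write_rows(rows, lineterminator="\n")

print(repr(default_out))
print(repr(lf_only))
```

With option 2 there would be no equivalent of dialect="excel" that restores 
the old behavior, short of spelling out every format parameter by hand.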

Even then, nothing short of #4 would make broken code just work; #3 would only 
make it fail loudly. Both #3 and #4 would require cascading changes to every file-like 
object (both built-in and third-party) to make them work; for the file-like 
objects that aren't updated, we're stuck choosing between issuing a warning 
that most folks won't see, then ignoring the problem, or making those file-like 
objects without the necessary API cause true exceptions (making them unusable 
until the third party package is updated).
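
To make that trade-off concrete, here is a rough sketch of what a check along 
the lines of #3 might look like. Everything in it is hypothetical: no file 
object has a translates_newlines attribute today, and check_no_translation 
and its strict flag are names invented for this example, not a proposal for 
the actual API.

```python
import io
import warnings


def check_no_translation(fileobj, strict=False):
    """Reject file objects that translate newlines (hypothetical check)."""
    # HYPOTHETICAL attribute: nothing in the io module provides this today.
    flag = getattr(fileobj, "translates_newlines", None)
    if flag is None:
        # The object was never updated to report its translation state.
        msg = f"{type(fileobj).__name__} cannot report newline translation"
        if strict:
            # Choice A: fail hard, making un-updated file-likes unusable.
            raise TypeError(msg)
        # Choice B: warn (easy to miss), then carry on regardless.
        warnings.warn(msg, RuntimeWarning, stacklevel=2)
        return
    if flag:
        raise ValueError("file object translates newlines; use newline=''")


# io.StringIO stands in for any file-like object lacking the new attribute.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_no_translation(io.StringIO())

print(len(caught), caught[0].category.__name__)
```

Either branch for the missing attribute is unsatisfying, which is the point 
made above.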

If a fix is needed, I think my suggestion would be to do one or both of:

1. Emphasize the newline='' warning in the 
csv.reader/writer/DictReader/DictWriter docs (right now it's just one more 
unemphasized line in a fairly long wall of text for each)

2. Put a large, top-of-module warning about this at the top of the csv module 
docs, so people reading the basic module description are exposed to the warning 
before they even reach the API.

Might help a few folks who are skimming without reading for detail.
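
For anyone who wants to see what the newline='' warning guards against, 
here is a small reproduction. It passes newline='\r\n' explicitly as a 
stand-in for the default translation on Windows (that substitution is my 
assumption, made so the example behaves the same on any platform); 
written_bytes is a helper invented for this example.

```python
import csv
import os
import tempfile


def written_bytes(newline):
    """Write one CSV row with the given newline mode; return the raw bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out.csv")
        with open(path, "w", encoding="utf-8", newline=newline) as f:
            csv.writer(f).writerow(["a", "b"])
        with open(path, "rb") as f:
            return f.read()


# Translating file: the \n inside the writer's \r\n gets expanded again.
translated = written_bytes("\r\n")

# The documented usage: newline='' leaves the writer's \r\n untouched.
untranslated = written_bytes("")

print(translated)
print(untranslated)
```

The first result ends in \r\r\n, the second in \r\n.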

----------
nosy: +josh.r

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36172>
_______________________________________