[issue32110] Make codecs.StreamReader.read() more compatible with read() of other files

2017-11-28 Thread Serhiy Storchaka

Change by Serhiy Storchaka :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32110] Make codecs.StreamReader.read() more compatible with read() of other files

2017-11-28 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:


New changeset fc73c54dae46e6c47dcd4a535f7bc68a46b8e398 by Serhiy Storchaka 
(Miss Islington (bot)) in branch '2.7':
bpo-32110: codecs.StreamReader.read(n) now returns not more than n (GH-4499) 
(#4623)
https://github.com/python/cpython/commit/fc73c54dae46e6c47dcd4a535f7bc68a46b8e398





[issue32110] Make codecs.StreamReader.read() more compatible with read() of other files

2017-11-28 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:


New changeset 230ffeae0a3961b1769806bd722c26227c84e8da by Serhiy Storchaka 
(Miss Islington (bot)) in branch '3.6':
bpo-32110: codecs.StreamReader.read(n) now returns not more than n (GH-4499) 
(#4622)
https://github.com/python/cpython/commit/230ffeae0a3961b1769806bd722c26227c84e8da





[issue32110] Make codecs.StreamReader.read() more compatible with read() of other files

2017-11-28 Thread Roundup Robot

Change by Roundup Robot :


--
pull_requests: +4538




[issue32110] Make codecs.StreamReader.read() more compatible with read() of other files

2017-11-28 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:


New changeset 219c2de5ad0fdac825298bed1bb251f16956c04a by Serhiy Storchaka in 
branch 'master':
bpo-32110: codecs.StreamReader.read(n) now returns not more than n (#4499)
https://github.com/python/cpython/commit/219c2de5ad0fdac825298bed1bb251f16956c04a





[issue32110] Make codecs.StreamReader.read() more compatible with read() of other files

2017-11-28 Thread Roundup Robot

Change by Roundup Robot :


--
pull_requests: +4537




[issue32110] Make codecs.StreamReader.read() more compatible with read() of other files

2017-11-22 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

> That's not true. .read(1) will at most read 1 byte from the stream
> and decode it. There's no way it will return 70 characters.

See the added tests. They fail without the change to the read() method.

.read(1) currently returns all characters from the character buffer, and this
buffer can be non-empty after .readline().
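
A minimal sketch of the situation described above, assuming an in-memory io.BytesIO stream and made-up sample data (neither is taken from the added tests). After readline(), decoded characters are left in the reader's buffer. On an interpreter that includes the fix for this issue, the following read(1) returns at most one character; the behavior reported here was that it returned the whole buffered remainder.

```python
import codecs
import io

# Hypothetical sample data: a short first line followed by a long second line.
data = ("first line\n" + "x" * 100 + "\n").encode("utf-8")
reader = codecs.getreader("utf-8")(io.BytesIO(data))

# readline() decodes more bytes than it needs; the surplus characters stay buffered.
first = reader.readline()

# With the fix applied, a single-argument read(1) is capped at one character.
chunk = reader.read(1)
print(repr(first), len(chunk))
```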

I understand the reason for having two limiting parameters in
StreamReader.read(). But currently its behavior does not completely match the
expected behavior of a read() method with one argument.

Actually, size is already used in place of chars when chars < 0 for reading
in a loop, so the code can be simplified.




Re: [issue32110] Make codecs.StreamReader.read() more compatible with read() of other files

2017-11-22 Thread M.-A. Lemburg
On 22.11.2017 08:40, Serhiy Storchaka wrote:
> Usually the read() method of a file-like object takes one optional argument 
> which limits the amount of data (the number of bytes or characters) returned 
> if specified.
> 
> codecs.StreamReader.read() also has such a parameter, but it is the second
> parameter. The first parameter limits the number of bytes read for decoding.
> read(1) can return 70 characters, which will confuse most callers, since they
> expect either a single character or an empty string (at the end of the stream).

That's not true. .read(1) will at most read 1 byte from the stream
and decode it. There's no way it will return 70 characters. It will
usually return fewer chars than the number of bytes read.

The reasoning here is the same as for .read() on regular byte
streams in Python 2.x: the first argument size tells the reader how
many bytes to read for decoding, since this is needed to properly
work together with .seek().

The optional second parameter chars was added as a convenience,
since the user may not know how many bytes need to be read in
order to decode a certain number of characters.
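
As a rough illustration of the two parameters described above (the sample text and the numbers are arbitrary assumptions): size bounds how many bytes each underlying stream read requests, while chars bounds how many characters are returned.

```python
import codecs
import io

# Hypothetical sample text containing two-byte UTF-8 characters.
text = "héllo wörld"
reader = codecs.getreader("utf-8")(io.BytesIO(text.encode("utf-8")))

# Request up to 4 bytes per underlying read, but return at most 3 characters.
piece = reader.read(4, 3)

# With no arguments, read() returns everything that is left.
rest = reader.read()
print(repr(piece), repr(rest))
```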

That said, I see in your patch that you want to bind chars
to size. That will work and also protect the user from the
unlikely case where the codec returns more chars than bytes
read.

-- 
Marc-Andre Lemburg
eGenix.com




[issue32110] Make codecs.StreamReader.read() more compatible with read() of other files

2017-11-21 Thread Serhiy Storchaka

Change by Serhiy Storchaka :


--
keywords: +patch
pull_requests: +4437
stage:  -> patch review




[issue32110] Make codecs.StreamReader.read() more compatible with read() of other files

2017-11-21 Thread Serhiy Storchaka

New submission from Serhiy Storchaka :

Usually the read() method of a file-like object takes one optional argument 
which limits the amount of data (the number of bytes or characters) returned if 
specified.

codecs.StreamReader.read() also has such a parameter, but it is the second
parameter. The first parameter limits the number of bytes read for decoding.
read(1) can return 70 characters, which will confuse most callers, since they
expect either a single character or an empty string (at the end of the stream).
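
For comparison, here is a small sketch of the single-argument convention that callers usually rely on, using io.TextIOWrapper over an in-memory buffer (the sample string is an arbitrary assumption): read(1) yields exactly one character, or an empty string at the end of the stream.

```python
import io

# Hypothetical sample string; "é" occupies two bytes in UTF-8.
raw = io.BytesIO("héllo".encode("utf-8"))
text_file = io.TextIOWrapper(raw, encoding="utf-8")

first = text_file.read(1)    # one character
second = text_file.read(1)   # still one character, although it spans two bytes
rest = text_file.read()      # everything that remains
at_eof = text_file.read(1)   # empty string once the stream is exhausted
print(repr(first), repr(second), repr(rest), repr(at_eof))
```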

Some time ago codecs.open() was recommended as a replacement for the builtin
open() in programs that should work in both 2.x and 3.x (this was before
io.open() was added), and it is still used in many programs. But this
peculiarity makes it a bad replacement for the builtin open().

I wanted to fix this issue a long time ago but forgot, and a question on Stack
Overflow has reminded me of it.
https://stackoverflow.com/questions/46437761/codecs-openutf-8-fails-to-read-plain-ascii-file

--
assignee: serhiy.storchaka
components: IO, Library (Lib)
messages: 306701
nosy: serhiy.storchaka
priority: normal
severity: normal
status: open
title: Make codecs.StreamReader.read() more compatible with read() of other 
files
type: behavior
versions: Python 2.7, Python 3.6, Python 3.7
