[issue39745] BlockingIOError.characters_written represents number of bytes not characters

2020-02-28 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

Antoine, although the text may predate your OSError reorganization, you 
were the last to touch this entry.  Is it correct, or does it need to change?

Revision: f55011f8b63d3b046c1ec580312bc52ca47d721b
Author: Antoine Pitrou 
Date: 10/12/2011 12:57:23 PM
Message:
Update doc for BlockingIOError and its alias in the io module

--
nosy: +pitrou, terry.reedy

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39745] BlockingIOError.characters_written represents number of bytes not characters

2020-02-24 Thread Masahiro Sakai


Change by Masahiro Sakai :


--
type:  -> behavior




[issue39745] BlockingIOError.characters_written represents number of bytes not characters

2020-02-24 Thread Masahiro Sakai


New submission from Masahiro Sakai :

The documentation at
https://docs.python.org/3/library/exceptions.html#BlockingIOError
describes 'characters_written' as "An integer containing the number of
characters written to the stream before it blocked". However, in the
following program I observed that it holds the number of *bytes*, not
*characters*.

Program:

import os
import threading
import time

# Non-blocking write end wrapped in a UTF-8 text stream; binary read end.
r, w = os.pipe()
os.set_blocking(w, False)
f_r = os.fdopen(r, mode="rb")
f_w = os.fdopen(w, mode="w", encoding="utf-8")

# 8 characters per repetition, each 2 or 3 bytes long in UTF-8.
msg = "\u03b1\u03b2\u03b3\u3042\u3044\u3046\u3048\u304a" * (1024 * 16)
try:
    print(msg, file=f_w, flush=True)
except BlockingIOError as e:
    print(f"BlockingIOError.characters_written == {e.characters_written}")
    written = e.characters_written

# Close the writer from another thread so the read below can reach EOF.
def close():
    os.set_blocking(w, True)
    f_w.close()
threading.Thread(target=close).start()

b = f_r.read()
f_r.close()

print(f"{written} characters correspond to {len(msg[:written].encode('utf-8'))} bytes in UTF-8")
print(f"{len(b)} bytes read")


Output:

BlockingIOError.characters_written == 81920
81920 characters correspond to 215040 bytes in UTF-8
81920 bytes read
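
To put the numbers above in perspective, here is a small sketch that converts a byte count back into a count of whole characters for the same message. The helper name chars_in_byte_prefix is hypothetical (it is not part of the io module), and the sketch assumes the reported value really is a count of UTF-8 bytes taken from the start of the message.

```python
import codecs

def chars_in_byte_prefix(text, nbytes, encoding="utf-8"):
    """Hypothetical helper: number of complete characters of `text`
    contained in the first `nbytes` bytes of its encoded form."""
    encoded = text.encode(encoding)
    decoder = codecs.getincrementaldecoder(encoding)()
    # final=False keeps a trailing partial multibyte sequence buffered
    # instead of raising, so only complete characters are counted.
    return len(decoder.decode(encoded[:nbytes], final=False))

unit = "\u03b1\u03b2\u03b3\u3042\u3044\u3046\u3048\u304a"
msg = unit * (1024 * 16)

bytes_per_unit = len(unit.encode("utf-8"))           # 3 * 2 + 5 * 3 bytes
bytes_for_81920_chars = len(msg[:81920].encode("utf-8"))
chars_for_81920_bytes = chars_in_byte_prefix(msg, 81920)

print(f"{len(unit)} characters per repetition, {bytes_per_unit} bytes in UTF-8")
print(f"81920 characters -> {bytes_for_81920_chars} bytes")
print(f"81920 bytes -> {chars_for_81920_bytes} complete characters")
```

Under that assumption, 81920 bytes cover far fewer than 81920 characters of this message, so slicing msg[:written] with the reported value would skip text that was never written.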


I think this behavior is confusing.
If it is intended, then it should be documented as such, and I think 
'bytes_written' would be a more appropriate name.

--
components: IO
messages: 362611
nosy: msakai
priority: normal
severity: normal
status: open
title: BlockingIOError.characters_written represents number of bytes not characters
