New submission from Joel Barry:
The openhook for fileinput is currently not called when the input
comes from sys.stdin. As a result, if the input contains invalid UTF-8
sequences, a program whose hook specifies errors='replace' will not
behave as expected:
$ cat x.py
import fileinput
import sys

def hook(filename, mode):
    print('hook called')
    return open(filename, mode, errors='replace')

for line in fileinput.input(openhook=hook):
    sys.stdout.write(line)
$ echo -e "foo\x80bar" >in.txt
$ python3 x.py in.txt
hook called
foo�bar
Good. The hook is called, and the replacement character is observed.
$ python3 x.py <in.txt
Traceback (most recent call last):
  File "x.py", line 8, in <module>
    for line in fileinput.input(openhook=hook):
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/fileinput.py", line 263, in __next__
    line = self.readline()
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/fileinput.py", line 363, in readline
    self._buffer = self._file.readlines(self._bufsize)
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 3: invalid start byte
The hook was not called, so we get the UnicodeDecodeError.
Should fileinput attempt to apply the hook code to stdin?
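
In the meantime, one possible workaround is for the script to re-wrap
sys.stdin.buffer with errors='replace' before fileinput reads from it.
The following is only a minimal sketch, not a change to fileinput
itself: the helper names replacing_text_stream and echo_input are
hypothetical, and the UTF-8 default encoding is an assumption.

```python
import fileinput
import io
import sys


def replacing_text_stream(binary_stream, encoding="utf-8"):
    """Wrap a binary stream so undecodable bytes decode to U+FFFD."""
    # The encoding default is an assumption; pass another one if needed.
    return io.TextIOWrapper(binary_stream, encoding=encoding, errors="replace")


def hook(filename, mode):
    # Same idea as the hook in the report; it only runs for named files.
    return open(filename, mode, errors="replace")


def echo_input(files=None):
    """Hypothetical helper: copy the input to stdout, tolerating bad bytes."""
    # fileinput falls back to sys.stdin when no files are given, so
    # swapping in a lenient wrapper covers the case the hook misses.
    sys.stdin = replacing_text_stream(sys.stdin.buffer)
    for line in fileinput.input(files=files, openhook=hook):
        sys.stdout.write(line)
```

Calling echo_input() from x.py instead of the bare loop should then
handle both "python3 x.py in.txt" and "python3 x.py <in.txt" in the same
way, at the cost of replacing sys.stdin for the rest of the program.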
----------
messages: 263409
nosy: jmb236
priority: normal
severity: normal
status: open
title: fileinput handling of unicode errors from standard input
type: behavior
versions: Python 3.4
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue26756>
_______________________________________