[issue26756] fileinput handling of unicode errors from standard input

2016-05-06 Thread Serhiy Storchaka

Changes by Serhiy Storchaka:


--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26756] fileinput handling of unicode errors from standard input

2016-04-17 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Calling the openhook for stdin will break existing code. Third-party
openhooks don't special-case the '<stdin>' name, which is a legitimate file name.

Instead, I recommend patching sys.stdin explicitly in your program:

sys.stdin = io.TextIOWrapper(sys.stdin.buffer, errors='replace')
for line in fileinput.input(openhook=hook):
    ...
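
A minimal sketch of this approach, with an in-memory byte stream standing in for sys.stdin.buffer so that it runs anywhere. The helper name read_stdin_leniently and its explicit encoding argument are assumptions made for illustration; they are not part of fileinput.

```python
import fileinput
import io
import sys


def read_stdin_leniently(binary_stream, encoding="utf-8"):
    """Hypothetical helper: read lines through fileinput from a replaced sys.stdin."""
    original_stdin = sys.stdin
    # Rewrap the raw bytes so that undecodable bytes become U+FFFD.
    wrapper = io.TextIOWrapper(binary_stream, encoding=encoding, errors="replace")
    sys.stdin = wrapper
    try:
        # '-' tells fileinput to read from whatever sys.stdin currently is.
        with fileinput.FileInput(files=["-"]) as stream:
            return list(stream)
    finally:
        sys.stdin = original_stdin
        # Detach so the wrapper does not close binary_stream when discarded.
        wrapper.detach()


# In a real program, binary_stream would be sys.stdin.buffer.
lines = read_stdin_leniently(io.BytesIO(b"foo\x80bar\n"))
```

Restoring the original sys.stdin is optional in a short script, but it keeps the replacement from leaking into unrelated code.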

--




[issue26756] fileinput handling of unicode errors from standard input

2016-04-14 Thread Joel Barry

Joel Barry added the comment:

I was suggesting that the openhook could somehow be applied to a
*reopening* of sys.stdin.  Something like this:

326c326,329
<                 self._file = sys.stdin
---
>                 if self._openhook:
>                     self._file = self._openhook(self._filename, self._mode)
>                 else:
>                     self._file = sys.stdin

But this won't work on its own because self._filename here is '<stdin>',
which isn't a real filename.  In conjunction with a change to my hook:

   def hook(filename, mode):
       if filename == '<stdin>':
           return io.TextIOWrapper(sys.stdin.buffer, errors='replace')
       return open(filename, mode, errors='replace')

things would work, but this is a bit awkward.

This works for me without changing my hook:

326c326,329
<                 self._file = sys.stdin
---
>                 if self._openhook:
>                     self._file = self._openhook('/dev/stdin', self._mode)
>                 else:
>                     self._file = sys.stdin

but I realize that using /dev/stdin is not portable.

The desired outcome is really just to control Unicode behavior from
stdin, not necessarily the ability to provide a generic hook.  Adding an
'errors' keyword to apply to stdin would solve my case, but if you
open up 'errors', someone may also want 'encoding', and the others,
which is why it would be nicer if this could somehow be solved with
the existing openhook interface.
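
For regular files, the existing openhook interface can already carry both options. The sketch below is only an illustration: make_text_hook is a hypothetical factory, not a fileinput function, and it does nothing for stdin, since the hook is not called for stdin.

```python
import fileinput
import os
import tempfile


def make_text_hook(encoding="utf-8", errors="strict"):
    """Hypothetical factory: build an openhook with fixed decoding options."""
    def hook(filename, mode):
        # fileinput calls the hook with the file name and the open mode.
        return open(filename, mode, encoding=encoding, errors=errors)
    return hook


# Write a file containing an invalid UTF-8 byte, then read it back.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"foo\x80bar\n")

try:
    lenient_hook = make_text_hook(errors="replace")
    with fileinput.FileInput(files=[tmp.name], openhook=lenient_hook) as stream:
        hooked_lines = list(stream)
finally:
    os.unlink(tmp.name)
```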

--




[issue26756] fileinput handling of unicode errors from standard input

2016-04-14 Thread SilentGhost

SilentGhost added the comment:

While the documentation is not entirely clear, the openhook only applies to
files.

I'm not sure what the logic behind the suggested change is. What would the
openhook do in your situation?

--
components: +Library (Lib)
nosy: +SilentGhost, serhiy.storchaka
versions: +Python 3.5, Python 3.6 -Python 3.4




[issue26756] fileinput handling of unicode errors from standard input

2016-04-14 Thread Joel Barry

New submission from Joel Barry:

The openhook for fileinput currently will not be called when the input
is from sys.stdin.  However, if the input contains invalid UTF-8
sequences, a program with a hook that specifies errors='replace' will
not behave as expected:

  $ cat x.py
  import fileinput
  import sys

  def hook(filename, mode):
      print('hook called')
      return open(filename, mode, errors='replace')

  for line in fileinput.input(openhook=hook):
      sys.stdout.write(line)


  $ echo -e "foo\x80bar" >in.txt

  $ python3 x.py in.txt
  hook called
  foo�bar

Good.  Hook is called, and replacement character is observed.

  $ python3 x.py 
      for line in fileinput.input(openhook=hook):
    File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/fileinput.py", line 263, in __next__
      line = self.readline()
    File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/fileinput.py", line 363, in readline
      self._buffer = self._file.readlines(self._bufsize)
    File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/codecs.py", line 319, in decode
      (result, consumed) = self._buffer_decode(data, self.errors, final)
  UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 3: invalid start byte

Hook was not called, and so we get the UnicodeDecodeError.
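
The difference between the two runs comes down to the error handler used when decoding. A small sketch that decodes the same bytes directly, without fileinput; the variable names are illustrative only.

```python
data = b"foo\x80bar"

# errors='replace' substitutes U+FFFD for the undecodable byte.
replaced = data.decode("utf-8", errors="replace")

# The default handler ('strict') raises instead.
try:
    data.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc

print(repr(replaced))
print(strict_error)
```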

Should fileinput attempt to apply the hook code to stdin?

--
messages: 263409
nosy: jmb236
priority: normal
severity: normal
status: open
title: fileinput handling of unicode errors from standard input
type: behavior
versions: Python 3.4
