Pascal wrote:

> I have a small python (3.7.4) script that should open a log file and
> display its content but as you can see, an encoding error occurs :
>
> -----------------------
>
> import fileinput
> import sys
> try:
>     source = sys.argv[1:]
> except IndexError:
>     source = None
> for line in fileinput.input(source):
>     print(line.strip())
>
> -----------------------
>
> python3.7.4 myscript.py myfile.log
> Traceback (most recent call last):
> ...
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
> invalid continuation byte
>
> python3.7.4 myscript.py < myfile.log
> Traceback (most recent call last):
> ...
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
> invalid continuation byte
>
> -----------------------
>
> I add the encoding hook to overcome the error but this time, the script
> reacts differently depending on the input used :
>
> -----------------------
>
> import fileinput
> import sys
> try:
>     source = sys.argv[1:]
> except IndexError:
>     source = None
> for line in fileinput.input(source,
>         openhook=fileinput.hook_encoded("utf-8", "ignore")):
>     print(line.strip())
>
> -----------------------
>
> python3.7.4 myscript.py myfile.log
> first line of myfile.log
> ...
> last line of myfile.log
>
> python3.7.4 myscript.py < myfile.log
> Traceback (most recent call last):
> ...
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
> invalid continuation byte
>
> python3.7.4 myscript.py /dev/stdin < myfile.log
> first line of myfile.log
> ...
> last line of myfile.log
>
> python3.7.4 myscript.py - < myfile.log
> Traceback (most recent call last):
> ...
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
> invalid continuation byte
>
> -----------------------
>
> does anyone have an explanation and/or solution ?

'-' or no argument tells fileinput to read from sys.stdin. That stream is
already open in text mode and decoded with Python's default I/O encoding,
so the open hook is never called for it. This is also why /dev/stdin
works: it is a file name like any other, so fileinput opens it through
the hook.

You can override the default encoding and error handler by setting the
environment variable

    PYTHONIOENCODING=UTF8:ignore
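
If you would rather not depend on the environment, one alternative is to bypass the text layer and decode the bytes yourself. The following is only a minimal sketch, not a drop-in replacement for fileinput: the names iter_decoded_lines and main are made up for illustration, and it assumes the log is mostly UTF-8 with a few stray bytes that can safely be dropped.

```python
import sys


def iter_decoded_lines(binary_stream, encoding="utf-8", errors="ignore"):
    """Yield stripped text lines from an iterable of byte lines.

    With errors="ignore", bytes that cannot be decoded are silently dropped.
    (Hypothetical helper, not part of fileinput.)
    """
    for raw_line in binary_stream:
        yield raw_line.decode(encoding, errors).strip()


def main(paths):
    """Print every line of the given files, or of stdin if none are given."""
    if not paths or paths == ["-"]:
        # sys.stdin.buffer is the raw byte stream underneath sys.stdin,
        # so no decoding has happened yet at this point.
        for line in iter_decoded_lines(sys.stdin.buffer):
            print(line)
        return
    for path in paths:
        # Open in binary mode so that all decoding goes through the helper.
        with open(path, "rb") as handle:
            for line in iter_decoded_lines(handle):
                print(line)
```

Calling main(sys.argv[1:]) at the bottom of the script should then treat a file argument, '-' and redirected stdin the same way. On Python 3.7 and later, calling sys.stdin.reconfigure(errors="ignore") before the fileinput loop is another option that keeps the original script almost unchanged.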