On 2019-10-25 22:12:23 +0200, Pascal wrote:
> for line in fileinput.input(source):
>     print(line.strip())
>
> -----------------------
>
> python3.7.4 myscript.py myfile.log
> Traceback (most recent call last):
> ...
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
> invalid continuation byte
[...]
> for line in fileinput.input(source,
>         openhook=fileinput.hook_encoded("utf-8", "ignore")):
>     print(line.strip())

The file you were trying to read was obviously not encoded in UTF-8,
since you got a decode error. So the first question you should ask is:

Is it supposed to be encoded in UTF-8 (and just corrupted), or is it
supposed to be encoded in something else (e.g. iso-8859-1 or win-1252)?

If it is supposed to be in UTF-8 but may contain errors, ignoring errors
may be reasonable.

If it is supposed to be something else, determine what that "something
else" actually is, and use that.

        hp

-- 
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | h...@hjp.at        | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
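
To illustrate "determine what it actually is, and use that", here is a minimal sketch, not a definitive recipe. The helper names (first_strict_decoding, read_lines), the candidate list and the temporary demo file are all assumptions made for illustration; none of them come from the original script. It tries each candidate encoding strictly and then passes the first one that works to fileinput.hook_encoded(). Note that a clean decode does not prove the guess is right: iso-8859-1 accepts every byte sequence, so it only makes sense as the last candidate, and the result should still be checked by eye. For what it is worth, byte 0xe8 is "è" in both iso-8859-1 and win-1252.

```python
import fileinput
import os
import tempfile


def first_strict_decoding(path, candidates=("utf-8", "cp1252", "iso-8859-1")):
    """Return the first candidate that decodes the whole file without errors.

    Hypothetical helper: the candidate list is an assumption, adjust it to
    whatever encodings are plausible for your data.
    """
    with open(path, "rb") as f:
        raw = f.read()
    for encoding in candidates:
        try:
            raw.decode(encoding)  # strict decoding: raises on invalid bytes
        except UnicodeDecodeError:
            continue
        return encoding
    return None


def read_lines(path, encoding):
    """Read stripped lines through fileinput using an explicit encoding."""
    hook = fileinput.hook_encoded(encoding)
    with fileinput.input(path, openhook=hook) as lines:
        return [line.strip() for line in lines]


# Demo on a throwaway file (assumed content) that contains byte 0xe8.
fd, demo_path = tempfile.mkstemp(suffix=".log")
with os.fdopen(fd, "wb") as f:
    f.write("première ligne\n".encode("iso-8859-1"))

detected = first_strict_decoding(demo_path)
demo_lines = read_lines(demo_path, detected)
os.remove(demo_path)

print(detected, demo_lines)
```

If the file really is meant to be UTF-8 and is merely damaged, hook_encoded("utf-8", "replace") may be preferable to "ignore", since it leaves a visible U+FFFD marker where bytes were dropped instead of silently losing them.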
-- https://mail.python.org/mailman/listinfo/python-list