Nick Coghlan added the comment:

Since we don't need to worry about ASCII-incompatible encodings (difflib will already have issues with such files due to its assumptions about newlines), it should be possible to use the same approach as urllib.parse, but based on latin-1 rather than ascii.
It's the least bad option for this kind of use case (surrogateescape can be good too, but it doesn't work properly in this case, where the two encodings may be different and we want to compare the raw bytes directly).

(changed scope of issue to reflect the subsequent discussion)

----------
title: Return the type you accept -> Handle bytes comparisons in difflib.Differ

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17445>
_______________________________________
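
The latin-1 round trip suggested in the comment could be sketched roughly as follows. This is only an illustration under stated assumptions, not the change proposed on the issue: the helper name diff_bytes_lines and the sample inputs are hypothetical. The point it shows is that latin-1 maps each byte to the code point of the same value, so decoding never fails and re-encoding the diff output restores the raw bytes, even when the two inputs use different encodings.

```python
import difflib


def diff_bytes_lines(a, b):
    """Hypothetical helper: run difflib.Differ over two lists of bytes lines."""
    # latin-1 maps every byte 0x00-0xFF to the code point with the same value,
    # so this decode cannot fail, whatever the real encoding of the input is.
    a_text = [line.decode("latin-1") for line in a]
    b_text = [line.decode("latin-1") for line in b]

    differ = difflib.Differ()
    # Differ only adds ASCII markers to the lines it emits, so encoding back
    # to latin-1 reproduces the original bytes of each input line.
    return [line.encode("latin-1") for line in differ.compare(a_text, b_text)]


# Assumed sample data: the same word, once in latin-1 and once in UTF-8.
old = [b"caf\xe9\n", b"same\n"]
new = [b"caf\xc3\xa9\n", b"same\n"]

result = diff_bytes_lines(old, new)
for line in result:
    print(line)
```

Because the comparison runs on a byte-for-code-point mapping, the two differently encoded lines are reported as a removal and an addition, while the identical line is reported as unchanged.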