Re: UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 10442: character maps to <undefined>

Chris Angelico Tue, 29 May 2018 02:54:02 -0700

On Tue, May 29, 2018 at 6:34 PM, Peter J. Holzer <[email protected]> wrote:
> On 2018-05-23 06:03:38 +0000, Steven D'Aprano wrote:
>> Mojibake is especially difficult to deal with when you are dealing with
>> short text snippets like file names or user names which can contain
>> arbitrary characters, where there is rarely any way to recognise the
>> "correct" string.
>
> For single file names or user names, sure. But if you have a list of
> them, there is still a high probability that many of them will contain
> recognizable words which can be used to deduce the (or a) correct
> encoding. (Unless it's from the Ministry of Silly Names).


Ohh... are you assuming that, in a list of file names, all of them use
the same encoding? Ah, yes, well, that WOULD make it easier, wouldn't
it. Sadly, not the case.
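
A rough sketch of what guessing has to look like when every name may
have its own encoding: each name is tried on its own against a few
candidate encodings, and a decoding is accepted only if a recognizable
word turns up in it. The candidate list, the tiny vocabulary, and the
helper name guess_name are all hypothetical, picked purely for
illustration; a real word list would need to be much larger.

```python
# Assumed candidate encodings, tried in order. latin-1 goes last because
# it never raises, so it would otherwise mask the other candidates.
CANDIDATES = ["utf-8", "cp1252", "latin-1"]

# Assumed vocabulary of "recognizable words" (illustrative only).
KNOWN_WORDS = {"café", "résumé", "naïve"}


def guess_name(raw, candidates=CANDIDATES, words=KNOWN_WORDS):
    """Decode one name independently of all the others.

    Returns (text, encoding) for the first candidate whose decoding
    contains a known word, or (None, None) if no candidate qualifies.
    """
    for enc in candidates:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding cannot represent these bytes
        if any(word in text.lower() for word in words):
            return text, enc
    return None, None


# Three names from the "same" list, each with a different history.
names = [
    "café.txt".encode("utf-8"),
    "résumé.doc".encode("cp1252"),
    b"\x9d\xfe\x81",  # no recognizable word under any candidate
]
results = [guess_name(n) for n in names]
```

The first two names come back with different encodings, and the third
comes back as (None, None), which is the short-snippet problem again:
with no recognizable word there is nothing to anchor the guess on.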

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
