On Thu, Mar 30, 2017 at 4:43 PM, Marko Rauhamaa <ma...@pacujo.net> wrote:
> The input is not in my control, and bailing out may not be an option:
>
>    $ echo $'aa\n\xdd\naa' | grep aa
>    aa
>    aa
>    $ echo $'\xdd' | python2 -c 'import sys; sys.stdin.read(1)'
>    $ echo $'\xdd' | python3 -c 'import sys; sys.stdin.read(1)'
>    Traceback (most recent call last):
>      File "<string>", line 1, in <module>
>      File "/usr/lib64/python3.5/codecs.py", line 321, in decode
>        (result, consumed) = self._buffer_decode(data, self.errors, final)
>    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdd in position 0:
>     invalid continuation byte
>
> Note that "grep" is also locale-aware.

So what exactly does byte value 0xDD mean in your stream?

And if you say "it doesn't matter", then why are you assigning meaning
to byte value 0x0A in your first example? Truly binary data doesn't
give any meaning to 0x0A.
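
To make the question concrete, here is a minimal sketch (not taken from
the original exchange) of the choices Python 3 offers for those same
bytes. The variable names are illustrative only. Splitting on 0x0A
already treats that byte as "end of line", and any decode step has to
decide what 0xDD stands for.

```python
# The bytes from the quoted grep example, plus echo's trailing newline.
data = b"aa\n\xdd\naa\n"

# Option 1: stay in bytes. Nothing is decoded, but splitting on b"\n"
# still assigns a meaning ("line separator") to byte 0x0A.
byte_lines = [line for line in data.split(b"\n") if b"aa" in line]

# Option 2: decode as Latin-1. Every byte maps to a code point, so this
# never fails, but it asserts that 0xDD means U+00DD.
as_latin1 = data.decode("latin-1")

# Option 3: decode as UTF-8 with surrogateescape. The undecodable byte
# 0xDD becomes the lone surrogate U+DCDD and can be restored on encode.
as_escaped = data.decode("utf-8", errors="surrogateescape")
round_trip = as_escaped.encode("utf-8", errors="surrogateescape")
```

Only the third option leaves the question "what does 0xDD mean?"
unanswered while still producing a str, and it does so by carrying the
byte through as a placeholder rather than by interpreting it.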

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
