On 6/20/22 00:36, William Michels wrote:


On Sun, Jun 19, 2022 at 11:00 PM ToddAndMargo via perl6-users <perl6-us...@perl.org> wrote:

    On 6/19/22 21:49, William Michels via perl6-users wrote:
     > Hi Todd, I'm trying to follow what you're doing (below in Terminal app
     > on MacOS):
     >
     > ~$ raku
     > Welcome to 𝐑𝐚𝐤𝐮𝐝𝐨™ v2021.06.
     > Implementing the 𝐑𝐚𝐤𝐮™ programming language v6.d.
     > Built on MoarVM version 2021.06.
     >
     > To exit type 'exit' or '^D'
     >  > print Buf.new(0x84, 0x73, 0x77, 0x84, 0x79).decode("utf8-c8") ~ "\n"
     > 􏿽x84sw􏿽x84y
     >  >
     >
     > Not clear if this is what you expect. I've also run code from
     > https://docs.raku.org/language/unicode#UTF8-C8 and see something
     > different from what's posted there:
     >
     >     my $test-file = "/tmp/test";
     >     given open($test-file, :w, :bin) {
     >        .write: Buf.new(ord('A'), 0xFA, ord('B'), 0xFB, 0xFC, ord('C'), 0xFD);
     >        .close;
     >     }
     >
     >     say slurp($test-file, enc => 'utf8-c8');
     >     # OUTPUT: «(65 250 66 251 252 67 253)»
     >
     >
     > The output I actually see is:
     >
     > A􏿽xFAB􏿽xFB􏿽xFCC􏿽xFD
     >
     > If I go into /tmp and look at the file created, it contains the
     > following single line:
     >
     > AúBûüCý
     >
     >
     > HTH, Bill.
     >
     >
     > On Sun, Jun 19, 2022 at 8:42 PM ToddAndMargo via perl6-users
     > <perl6-us...@perl.org> wrote:
     >
     >       >print Buf.new(0x84, 0x73, 0x77, 0x84, 0x79).decode("utf8-c8") ~ "\n"
     >
     >     􏿽x84sw􏿽x84y
     >



    Hi Bill,

    I was only after getting something to print.  I expected
    it to look like utter nonsense.

    It was part of my Keeper on buffers.  And I
    had to use "utf8-c8" to keep the line
    from crashing.

    -T



    Buffer to String:
         > say Buf.new(97,98,99).decode
         abc

         >print Buf.new(0x84, 0x73, 0x77, 0x84, 0x79).decode("utf8-c8") ~ "\n"
         􏿽x84sw􏿽x84y

         Decoding values, see:
         https://docs.raku.org/routine/encoding#class_IO::Handle
    utf8
    utf16
    utf16le
    utf16be
    utf8-c8
    iso-8859-1
    windows-1251
    windows-1252
    windows-932
    ascii


That "0x84" issue seems pretty well-known:

"Python3 Fix→ UnicodeDecodeError: ‘utf-8’ codec can’t decode byte in position" https://medium.com/code-kings/python3-fix-unicodedecodeerror-utf-8-codec-can-t-decode-byte-in-position-be6c2e2235ee

HTH, Bill.

Hi Bill,

The "utf8-c8" gets around that.
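
For anyone following along, here is a minimal sketch of the difference, using the same five bytes as above. It assumes a Rakudo close to the v2021.06 shown in Bill's session; the variable names are only for illustration. Strict "utf8" is expected to throw on the stray 0x84 bytes, "utf8-c8" keeps them as synthetic characters that should encode back to the original bytes, and "iso-8859-1" maps every byte to a code point.

```raku
my $bytes = Buf.new(0x84, 0x73, 0x77, 0x84, 0x79);

# Strict UTF-8: 0x84 is a continuation byte with no lead byte,
# so this decode is expected to throw; try turns that into Nil.
my $strict = try $bytes.decode('utf8');
say $strict.defined ?? 'utf8: decoded' !! 'utf8: failed to decode';

# utf8-c8: undecodable bytes are kept as synthetic characters,
# so encoding again should give back the original bytes.
my $c8 = $bytes.decode('utf8-c8');
say $c8.encode('utf8-c8').list;

# iso-8859-1: every byte maps to a code point, so this never fails.
my $latin1 = $bytes.decode('iso-8859-1');
say $latin1.ords;
```

If all that is needed is "something printable", either of the last two will do; "utf8-c8" is the one to pick if the bytes have to survive a round trip.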

And as I stated before, I am working with a string
populated with non-printable characters, so I
just wanted to see something.  It did not matter
much what.

Thank you for the tips!

:-)

-T
