On 6/20/22 00:36, William Michels wrote:
On Sun, Jun 19, 2022 at 11:00 PM ToddAndMargo via perl6-users
<perl6-us...@perl.org> wrote:
On 6/19/22 21:49, William Michels via perl6-users wrote:
> Hi Todd, I'm trying to follow what you're doing (below in Terminal app
> on macOS):
>
> ~$ raku
> Welcome to 𝐑𝐚𝐤𝐮𝐝𝐨™ v2021.06.
> Implementing the 𝐑𝐚𝐤𝐮™ programming language v6.d.
> Built on MoarVM version 2021.06.
>
> To exit type 'exit' or '^D'
> > print Buf.new(0x84, 0x73, 0x77, 0x84, 0x79).decode("utf8-c8") ~ "\n"
> x84swx84y
> >
>
> Not clear if this is what you expect. I've also run code from
> https://docs.raku.org/language/unicode#UTF8-C8 and see something
> different from what's posted there:
>
> my $test-file = "/tmp/test";
> given open($test-file, :w, :bin) {
> .write: Buf.new(ord('A'), 0xFA, ord('B'), 0xFB, 0xFC, ord('C'), 0xFD);
> .close;
> }
>
> say slurp($test-file, enc => 'utf8-c8');
> # OUTPUT: «(65 250 66 251 252 67 253)»
>
>
> The output I actually see is:
>
> AxFABxFBxFCCxFD
>
> If I go into /tmp and look at the file created, it contains the
> following single line:
>
> AúBûüCý
>
>
> HTH, Bill.
>
>
> On Sun, Jun 19, 2022 at 8:42 PM ToddAndMargo via perl6-users
> <perl6-us...@perl.org> wrote:
>
> >print Buf.new(0x84, 0x73, 0x77, 0x84, 0x79).decode("utf8-c8") ~ "\n"
>
> x84swx84y
>
Hi Bill,
All I was after was getting anything at all to print.
I expected it to look like utter nonsense.
It was part of my Keeper on buffers, and I
had to use "utf8-c8" to keep the line
from crashing.
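For anyone following along, here is a minimal sketch of the difference, assuming a reasonably recent Rakudo. The variable names are only for illustration, and the exact error text may differ between versions:

```raku
my $buf = Buf.new(0x84, 0x73, 0x77, 0x84, 0x79);

# Strict UTF-8 decoding fails on the lone 0x84 byte.
# `try` catches the failure, leaves $strict undefined, and sets $!.
my $strict = try $buf.decode('utf8');
say $strict.defined ?? "utf8 decoded fine" !! "utf8 failed: {$!.message}";

# utf8-c8 keeps the undecodable bytes as synthetic characters
# instead of dying, so this line produces a Str.
my $lenient = $buf.decode('utf8-c8');
print $lenient ~ "\n";
```

The second decode is the same call as in my Keeper entry; only the `try` around the plain "utf8" decode is new.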
-T
Buffer to String:
> say Buf.new(97,98,99).decode
abc
>print Buf.new(0x84, 0x73, 0x77, 0x84, 0x79).decode("utf8-c8") ~ "\n"
x84swx84y
Decoding values, see:
https://docs.raku.org/routine/encoding#class_IO::Handle
utf8
utf16
utf16le
utf16be
utf8-c8
iso-8859-1
windows-1251
windows-1252
windows-932
ascii
That "0x84" issue seems pretty well-known:
"Python3 Fix→ UnicodeDecodeError: ‘utf-8’ codec can’t decode byte in position"
https://medium.com/code-kings/python3-fix-unicodedecodeerror-utf-8-codec-can-t-decode-byte-in-position-be6c2e2235ee
HTH, Bill.
Hi Bill,
The "utf8-c8" gets around that.
And as I stated before, I am working with strings
populated with non-printable characters, so I
just wanted to see something. It did not matter
much what.
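A sketch of two other ways to "see something" for this kind of data. This is not in my Keeper, just an illustration; as far as I know neither one can hit the malformed UTF-8 error, because neither treats the bytes as UTF-8:

```raku
my $buf = Buf.new(0x84, 0x73, 0x77, 0x84, 0x79);

# Hex dump: format every byte as two uppercase hex digits.
say $buf.list.map(*.fmt('%02X')).join(' ');

# iso-8859-1 maps each byte to exactly one code point, so decoding
# cannot fail, although bytes such as 0x84 turn into non-printable
# control characters.
say $buf.decode('iso-8859-1').chars;
```

The hex dump is probably the more readable of the two when most of the bytes are non-printable.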
Thank you for the tips!
:-)
-T