On Mon, 24 Jun 2024 at 08:20, Rayner Lucas via Python-list
<python-list@python.org> wrote:
>
> In article <mailman.159.1718991773.2909.python-l...@python.org>,
> ros...@gmail.com says...
> >
> > If you switch to a Linux system, it should work correctly, and you'll
> > be able to migrate the rest of the way onto Python 3. Once you achieve
> > that, you'll be able to operate on Windows or Linux equivalently,
> > since Python 3 solved this problem. At least, I *think* it will; my
> > current system has a Python 2 installed, but doesn't have tkinter
> > (because I never bothered to install it), and it's no longer available
> > from the upstream Debian repos, so I only tested it in the console.
> > But the decoding certainly worked.
>
> Thank you for the idea of trying it on a Linux system. I did so, and my
> example code generated the error:
>
> _tkinter.TclError: character U+1f40d is above the range (U+0000-U+FFFF)
> allowed by Tcl
>
> So it looks like the problem is ultimately due to a limitation of
> Tcl/Tk.

Yep, that seems to be the case. Not sure if that's still true on a
more recent Python, but it does look like you won't get astral
characters in tkinter on the one you're using.

> I'm still not sure why it doesn't give an error on Windows and

Because of the aforementioned weirdness of old (that is, pre-3.3)
Python versions on Windows. They were built to use a messy, buggy
hybrid of UCS-2 and UTF-16. Sometimes this got you around problems, or
at least masked them, but it wasn't reliable. That's why all of that
was fixed in Python 3.3 :)
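
To make the UCS-2/UTF-16 hybrid a bit more concrete, here is a small
sketch (meant for Python 3.3 or later, standard library only) of what
the snake character from the error message looks like as UTF-16 code
units. A character above U+FFFF needs two 16-bit units, which is why
old Windows builds could end up treating it as two "characters". The
variable names are only for illustration.

```python
import struct

snake = "\U0001f40d"  # the character named in the TclError above

# On Python 3.3+, an astral character is a single code point.
print(len(snake))  # 1

# Encoded as UTF-16, it takes two 16-bit code units (a surrogate pair).
units = snake.encode("utf-16-le")
high, low = struct.unpack("<2H", units)
print(hex(high), hex(low))  # 0xd83d 0xdc0d
```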

> instead either works (when UTF-8 encoding is specified) or converts the
> out-of-range characters to ones it can display (when the encoding isn't
> specified). But now I know what the root of the problem is, I can deal
> with it appropriately (and my curiosity is at least partly satisfied).

Converting out-of-range characters is fairly straightforward, at least
as long as your Python interpreter is correctly built (so, Python 3,
or a Linux build of Python 2).

"".join(c if ord(c) < 65536 else "?" for c in text)
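
As a rough sketch, the same one-liner could be wrapped in a small
helper so the replacement character is easy to change. The name
replace_astral and its replacement parameter are made up for this
example, and it assumes a correctly built interpreter as described
above.

```python
def replace_astral(text, replacement="?"):
    """Replace every character above U+FFFF with `replacement`."""
    return "".join(c if ord(c) <= 0xFFFF else replacement for c in text)


# The snake (U+1F40D) is outside the range Tcl accepted, so it is swapped out.
print(replace_astral("py\U0001f40dthon"))  # py?thon

# Characters inside the Basic Multilingual Plane pass through unchanged.
print(replace_astral("caf\u00e9") == "caf\u00e9")  # True
```

Using "\ufffd" (the Unicode replacement character) instead of "?" is
another option if the font in use can display it.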

> This has given me a much better understanding of what I need to do in
> order to migrate to Python 3 and add proper support for non-ASCII
> characters, so I'm very grateful for your help!
>

Excellent. Hopefully all this mess is just a transitional state and
you'll get to something that REALLY works, soon!

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list