On Wed, Jun 4, 2014 at 2:40 PM, Rustom Mody <rustompm...@gmail.com> wrote:
> On Wednesday, June 4, 2014 9:22:54 AM UTC+5:30, Chris Angelico wrote:
>> On Wed, Jun 4, 2014 at 1:37 PM, Rustom Mody wrote:
>> > And so a pure BMP-supporting implementation may be a reasonable
>> > compromise. [As long as no surrogate-pairs are there]
>
>> Not if you're working on the internet. There are several critical
>> groups of characters that aren't in the BMP, such as:
>
> Of course. But what has the internet to do with micropython?

Earlier you said:

> IOW from pov of a universally acceptable character set this is mostly
> rubbish

"Universally acceptable character set" and microcontrollers may well
not meet, but if you're talking about universality, you need Unicode.
It's that simple. Maybe there's a use-case for a microcontroller that
works in ISO-8859-5 natively, thus using only eight bits per
character, but even if there is, I would expect a Python
implementation on it to expose Unicode codepoints in its strings.
(Most of the time you won't even be aware of the exact codepoint
values. It's only when you put \xNN or \uNNNN or \U000NNNNN escapes
into your strings, or explicitly use ord/chr or equivalent, that it'd
make a difference.)

The point is not that you might be able to get away with sticking your
head in the sand and wishing Unicode would just go away. Even if you
can, it's not something Python 3 can ever do. And I don't think
anybody can, anyway. If your device is big enough to hold Python, it
should be big enough to handle Unicode; and then you don't have to say
"Oh, sorry rest-of-the-world, this only works in English... and only a
subset of English... and stuff".

ChrisA
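
To make the escape/ord/chr point concrete, here is a minimal sketch,
assuming CPython 3.3 or later (I am not claiming anything about what
MicroPython does). The helper name codepoints() is made up for
illustration. It shows that the escapes and ord/chr all deal in
codepoints, and that a byte from an ISO-8859-5 device still surfaces
as a Unicode codepoint once decoded.

```python
def codepoints(s):
    """Return the Unicode codepoints of a str (hypothetical helper)."""
    return [ord(ch) for ch in s]

# The three escape forms all name codepoints, not bytes.
latin = "\xe9"           # U+00E9
cyrillic = "\u0416"      # U+0416 CYRILLIC CAPITAL LETTER ZHE
astral = "\U0001F600"    # U+1F600, outside the BMP

print(codepoints(latin + cyrillic + astral))

# chr() is the inverse of ord().
assert chr(0x0416) == cyrillic

# On CPython 3.3+, a non-BMP character is one item, not a surrogate pair.
assert len(astral) == 1

# An 8-bit native encoding is a storage detail: one ISO-8859-5 byte
# decodes to the same codepoint the \u escape names.
raw = b"\xb6"
assert raw.decode("iso-8859-5") == cyrillic
assert cyrillic.encode("iso-8859-5") == raw
```

Code that never uses escapes or ord/chr would not notice which
codepoint values are behind its strings, which is the situation most
programs are in.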