On Thursday, March 30, 2017 at 2:58:53 AM UTC-5, Steven D'Aprano wrote:
> On Wed, 29 Mar 2017 20:53:35 -0700, Rick Johnson wrote:
>
> > On Sunday, March 26, 2017 at 1:21:18 PM UTC-5, Chris Angelico wrote:
> >> On Mon, Mar 27, 2017 at 4:43 AM, Steve D'Aprano
> >> <steve+pyt...@pearwood.info> wrote:
[...]
> > You place a lot of faith in the supposed "immutability of
> > Unicode code points", but a lot of people have placed a lot
> > of faith in many past and current encoding systems, only
> > to be bamboozled by the gatekeepers some time later.
>
> So far, Unicode has kept that promise not to move or rename
> code points.

Promises are not worth the paper they are written on --
unless you have them officially notarized, then they might
be worth "something".

> They haven't done so even when they've made embarrassing
> mistakes, like giving characters the completely wrong name.

So they're screw-ups, but at least they're "honest
screw-ups"? MmmKay. Got it.

> That makes Unicode more stable than ASCII, which has gone
> through a number of incompatible changes since the very
> first version back in 1963 (I think it was 63?).

Why does every comparison of Unicode end with some cheap jab
at ASCII? It brings back horrible memories of Obama blaming
Bush for everything. I mean, sure, Bush was a total idiot
and he trampled the Constitution with his Tony Lamas on more
than one occasion, but after the hundred-thousandth time of
hearing Obama decry "Hey, but Bush did it -- WAH!", it
starts to get really old, really fast. At some point, both
Obama and Unicode have to put their big boy pants on and
take responsibility for their own mistakes and shortcomings.
And with the Trump election and the Brexit referendum, we
can see that Obama took it on the shorts.

> To put it another way:
>
> Unicode character Д (U+0414 CYRILLIC CAPITAL LETTER DE) is
> no more likely to change to another character than ASCII
> character 0x41 is likely to change from "A" to "Z".
> Trusting the stability of Unicode is a pretty safe bet.

I don't understand how that argument supports your beloved
Unicode over ASCII?
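
For reference, the two characters in the quoted comparison can
be looked up from Python. This is only a minimal sketch using
the standard `unicodedata` module; the variable names are
illustrative, and it shows the names as recorded in the Unicode
data bundled with the running interpreter, nothing more.

```python
import unicodedata

# The two characters named in the quoted comparison.
de = "\u0414"   # Cyrillic letter at U+0414
a = chr(0x41)   # the character at 0x41, shared by ASCII and Unicode

# Look up the formal character names in the interpreter's Unicode database.
de_name = unicodedata.name(de)
a_name = unicodedata.name(a)

print(f"U+{ord(de):04X} {de_name}")  # U+0414 CYRILLIC CAPITAL LETTER DE
print(f"U+{ord(a):04X} {a_name}")    # U+0041 LATIN CAPITAL LETTER A
```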
> > And so, although Unicode was created to solve the endless
> > compatibility problems between multiple archaic encoding
> > systems, the designers thought it necessary to add a
> > "custom space" that will keep the incompatibilities on
> > life support.
>
> No, that's not how it works. The PUAs really are for
> *PRIVATE* use. If your in-house application wants some
> dedicated, custom characters (say, for your company logo),
> there are three ways you can do it, starting from the
> dumbest:
>
> - pick some existing code point that you think nobody will
>   ever use, like "Q" say, and turn that into your logo;
>
> - pick some *currently* unused code point, and hope that it
>   will not become used;
>
> - pick a code point from the PUA which is permanently
>   reserved for private use by private agreement.
>
> You can't expect other people's applications to treat that
> code point as your logo (or whatever special purpose you
> give it), but that's okay, you couldn't expect that in any
> case.

The problem, as has always been the case with encodings, is
that some random dev will create a so-called "private" code
point, and others, liking what they see, will adopt the same
code point into their own projects. And after enough of
these "emulations" occur, you've created a new unofficial
de facto standard. And we're all back to square one again.

Another possibility, and one that this community is all too
familiar with, is that the Unicode gatekeepers could
intentionally isolate themselves from the user base and
withdraw into their ivory towers, thereby creating a large
swath of disillusioned folks who look to the PUA for their
collective salvation. And again, we are back to square one.

It could be that the PUA is to Unicode what type-hints are
to Python. Something to think about...

> One of the excellent ways the PUAs have been used is by
> medieval researchers. They have been collecting the various
> special characters used by medieval scribes, and by private
> agreement putting them into a PUA area where special
> purpose software can use it. That way they can determine
> which of the thousands of special characters used by
> medieval monks are actually significant enough to graduate
> to genuine Unicode characters, and which are best handled
> some other way.

And what if the gatekeepers refuse to graduate those special
chars? And what if, in response, the natives become
restless? A revolt. That's what! And their liberty will be
found not in the new lands, but in the PUA.
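
To make the private-use discussion above concrete, here is a
minimal sketch of how PUA code points appear from Python. The
helper `is_private_use` and the `LOGO` constant are hypothetical
names invented for this illustration; the only library calls are
`unicodedata.category` and `unicodedata.name` from the standard
library. Private-use code points carry the general category
"Co" and have no character name.

```python
import unicodedata

def is_private_use(ch: str) -> bool:
    """Return True if ch has general category 'Co' (private use)."""
    return unicodedata.category(ch) == "Co"

# Hypothetical in-house "company logo" code point, taken from the
# Basic Multilingual Plane private-use area (U+E000..U+F8FF).
LOGO = "\uE000"

logo_is_private = is_private_use(LOGO)
q_is_private = is_private_use("Q")

# Private-use code points have no formal name, so supply a default
# instead of letting unicodedata.name() raise ValueError.
logo_name = unicodedata.name(LOGO, "<no name>")

print(logo_is_private)  # True
print(q_is_private)     # False
print(logo_name)        # <no name>
```

Nothing in the code point itself says what it means: two
applications can both pass this check for U+E000 and still
disagree about the glyph, which is the "private agreement" part
of the quoted description.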