On Friday, April 8, 2016 at 11:14:21 PM UTC+5:30, Marko Rauhamaa wrote:
> Peter Pearson:
>
> > On Fri, 08 Apr 2016 16:00:10 +1000, Steven D'Aprano wrote:
> >> They are not, and never have been, in the typesetting business.
> >> Perhaps characters are not the only things easily confused *wink*
> >
> > Defining codepoints that deal with appearance but not with meaning is
> > going into the typesetting business. Examples: ligatures, and spaces
> > of varying widths with specific typesetting properties like being
> > non-breaking.
> >
> > Typesetting done in MS Word using such Unicode codepoints will never
> > be more than a goofy approximation to real typesetting (e.g., TeX),
> > but it will cost a huge amount of everybody's time, with the current
> > discussion of ligatures in variable names being just a straw in the
> > wind. Getting all the world's writing systems into a single, coherent
> > standard was an extraordinarily ambitious, monumental undertaking, and
> > I'm baffled that the urge to broaden its scope in this irrelevant
> > direction was entertained at all.
>
> I agree completely but at the same time have a lot of understanding for
> the reasons why Unicode had to become such a mess. Part of it is
> historical, part of it is political, yet part of it is in the
> unavoidable messiness of trying to define what a character is.
There are standards and standards. Just because something is a standard does not make it useful, well-designed, reasonable, etc. It's reasonably likely that all our keyboards start QWERT... That doesn't make it a sane design. Likewise, using NFKC to define the equivalence relation on identifiers is analogous to saying: since QWERTY has been in use for over a hundred years, it's a perfectly good design. Just because NFKC has the stamp of the Unicode Consortium does not straightaway make it useful for all purposes.
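
For what it's worth, Python 3 really does run identifiers through NFKC (PEP 3131), so two visually distinct spellings can quietly collapse into one name. A quick sketch of what that equivalence relation means in practice (the variable names are just for illustration):

    import unicodedata

    lig = "ﬁle"          # spelled with the U+FB01 'fi' ligature
    ascii_name = "file"  # plain ASCII f + i

    print(lig == ascii_name)                                 # False: the raw strings differ
    print(unicodedata.normalize("NFKC", lig) == ascii_name)  # True: NFKC folds the ligature away

    # The parser applies the same folding to identifiers, so both
    # spellings name one variable (run this at module scope, where
    # exec writes into globals()):
    exec("ﬁle = 42")
    print(file)  # prints 42

So whether or not NFKC is the *right* equivalence for identifiers, it is the one the language has committed to.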