Executive summary: I guess the bottom line is that I'm sympathetic to both the NFC and NFKC positions.

I think that wetware is such that people will rarely go to the trouble
of picking out a letter-like symbol from a palette. In my environment
that's not going to happen at all, because I use Japanese phonetic
input to get most symbols ("sekibun" = integral, "siguma" = sigma),
and I don't use calligraphic R for the real line, I use
\newcommand{\R}{{\cal R}}, except on a physical whiteboard, where I
use blackboard bold (go figure that one out!). So to my mind the
letter-like block in Unicode is a failed experiment.

Jim J. Jewett writes:

 > When I was a math student, these were clearly different symbols,
 > with much less relation to each other than a mere case difference.

Arguable. The letter-like symbols block has script (cursive),
blackboard bold, and Fraktur versions of R. I've seen all of them, as
well as plain Roman, bold, italic and bold italic faces, used to
denote the real line, and I've personally used most of them for that
purpose depending on availability of fonts and input methods and on
the medium (i.e., computer text vs. hand-written). I've also seen
several of them used for reaction functions or spaces thereof in game
theory (although blackboard bold and Fraktur seem to be used uniquely
for the real line). Clearly the common denominator is the uppercase
Latin letter "R", and the glyph being recognizably "R" is necessary
and sufficient for each of those purposes.

The story for uppercase sigma as sum is somewhat similar: sum is by
far not the only use of that letter, although I don't know of any
other operator symbol for sum over a set or series (outside of
programming languages, which I think we can discount).

I agree that we should consider math to be a separate language, but it
doesn't have a consistent script independent of the origins of the
symbols.
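
For concreteness, here is a minimal sketch of how the two normalization
forms treat the characters discussed above. It assumes only the
standard library's unicodedata module; the names R_VARIANTS, SIGMA,
SUMMATION and bound_name are made up for illustration. As far as I
know, Python folds identifiers with NFKC (per PEP 3131), which is what
the last check relies on.

```python
import unicodedata

# Variants of "R" from the discussion; labels are illustrative only.
R_VARIANTS = {
    "script": "\u211b",           # SCRIPT CAPITAL R
    "fraktur": "\u211c",          # BLACK-LETTER CAPITAL R
    "blackboard bold": "\u211d",  # DOUBLE-STRUCK CAPITAL R
    "fullwidth": "\uff32",        # FULLWIDTH LATIN CAPITAL LETTER R
}

SIGMA = "\u03a3"      # GREEK CAPITAL LETTER SIGMA (Greek block)
SUMMATION = "\u2211"  # N-ARY SUMMATION (math operators block)


def bound_name(identifier):
    """Return the names Python actually binds when `identifier` is assigned."""
    namespace = {}
    exec(f"{identifier} = 1", namespace)
    # exec() adds __builtins__ to the namespace; leave it out of the result.
    return [key for key in namespace if key != "__builtins__"]


for label, ch in R_VARIANTS.items():
    nfc = unicodedata.normalize("NFC", ch)
    nfkc = unicodedata.normalize("NFKC", ch)
    # NFC keeps the variant as a distinct character; NFKC folds it to "R".
    print(f"{label:16} U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"NFC keeps it: {nfc == ch}, NFKC gives plain R: {nfkc == 'R'}")

# The summation operator has no compatibility mapping to Greek sigma,
# so even NFKC keeps the two apart.
print("NFKC folds summation into sigma:",
      unicodedata.normalize("NFKC", SUMMATION) == SIGMA)

# An identifier spelled with blackboard bold R ends up bound as plain "R".
print("blackboard bold R binds as:", bound_name(R_VARIANTS["blackboard bold"]))
```

In other words, under NFKC the font-like variants of "R" all collapse to
the one letter, while sigma-the-letter and sigma-the-operator stay
distinct characters either way.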

Even today none of my engineering and economics students can type any
symbols except those in the JIS repertoire, which they type by
original name ("siguma", "ramuda", "arefu", "yajirushi" == arrow,
etc.). "sekibun" == integration does bring up the integral sign in at
least some modern input methods, but it doesn't have a script name,
while "kasann" == addition does not bring up sigma, although "siguma"
does, and "essu" brings up sigma -- but only in "ASCII emoji" strings,
go figure.

I have seen students use fullwidth R for the real line, though.
But distinguishing that is a deprecated compatibility feature of
Unicode (and of Japanese practice -- even in very formal university
documents such as grade reports for a final doctoral examination I've
seen numbers and names containing mixed half-width and full-width
ASCII).

So I think "letter-like" was a reasonable idea (I'm pretty sure this
block goes back to the '90s but I'm too lazy to check), but it hasn't
turned out well, and I doubt it ever will.

 > So by the Unicode consortium's goals, they are independent
 > characters that should each be defined. I admit that isn't ideal
 > for most use cases outside of math,

I don't think it even makes sense *inside* of math for the letter-like
symbols. The nature of math means that any "R" will be grabbed for
something whose name starts with "r" as soon as that's convenient.
Something like the integral sign (which is a stretched "S" for "sum"),
OK -- although category theory uses that for "ends", which still don't
look anything like integrals even if you turn them inside out, rotate
90 degrees, and paint them blue.

 > > It's also a UX problem. At slightly higher layer in the stack, I'm
 > > used to using Japanese input methods to input sigma and pi which
 > > produce characters in the Greek block, and at least the upper case
 > > forms that denote sum and product have separate characters in the math
 > > operators block.
 >
 > I think that is mostly a backwards compatibility problem; XeTeX
 > itself had to worry about compatibility with TeX (which preceded
 > Unicode) and with the fonts actually available and then with
 > earlier versions of XeTeX.

IMO, the analogy fails because the backward compatibility issue for
Unicode is in the wetware, not in the software.

Steve

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/YTIIFIF75RMWP5J3GCSXWVXSUP5SX7AA/