On Sun, Nov 19, 2017 at 5:16 AM, Nick Coghlan <[email protected]> wrote:
> On 19 November 2017 at 13:22, Mikhail V <[email protected]> wrote:
>> For me, one "cheap" solution against underscores is to use
>> syntax highlighting which grays them out, but if those become like
>> spaces, then it becomes a bit confusing, e.g. in function with many
>> arguments.
>> Also, unfortunately, not many editors allow easy (if any) highlighting
>> customisation on that level.
>
> Changing the way editors display underscore-using variable names still
> seems like a more productive direction to explore than changing the
> text encoding read by the compiler.

Indeed, that would be a solution. *Would* be. But I don't know of any editor that does that (and they should not in this case, see below). My view of the pros and cons of this solution:

Pros: other languages have the same issue, so if editor maintainers agreed to compromise and introduce a dynamic substitution feature, it would give users the possibility to face-lift other syntaxes as well.

Cons: this feature only makes sense if the substitution happens only where it should; namely, it must not touch anything in string literals or comment blocks. So the lexer has to 'know' where to substitute, which is not the same as just passing the internal memory representation through a translation table.

My opinion about this, however, is based on other principles. Imagine that you are the language designer and I am responsible for the typesetting component of some editor, and we have this dialogue:

you: "Hey Mikhail, we use the hyphen for the minus operator. Can you please patch the renderer so that our users see a minus instead of a hyphen? Please also make sure users can toggle it in real time to see which character is actually there, and make the substitution only in the places where the hyphen is used as the operator."

me: "Well, I understand your complaint, but my renderer already supports Unicode, and I do my best to support typographic practice, namely to render the hyphen as a *hyphen*, which has been well established in typography for centuries and is defined as a dash of 50% of the width of the letter "o", aligned to the lowercase. Likewise the minus glyph, which is defined as ca. 110% of the "o" width and is aligned to the digits and caps. So you, as the language designer, should be interested in delivering best practices to the users, and the hyphen is far more important for the lexical structure of written language than the minus operator. Why wouldn't you just try to solve the issue in a "fair" way?"

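To make the "Cons" point above concrete, here is a minimal sketch of a display-only substitution that is token-aware. It assumes Python's standard `tokenize` module plays the role of the editor's lexer; the function name `display_form` is made up for this illustration, and no existing editor is claimed to work this way.

```python
import io
import tokenize

MINUS = "\u2212"  # Unicode MINUS SIGN


def display_form(source: str) -> str:
    """Return a display-only copy of `source` with '-' operators shown as U+2212.

    Hypothetical helper: only operator tokens are touched, so hyphens
    inside string literals and comments stay exactly as typed.
    """
    lines = source.split("\n")
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # '->' is also an OP token, but it is not a minus, so it is skipped.
        if tok.type == tokenize.OP and tok.string in ("-", "-="):
            row, col = tok.start  # row is 1-based, col is 0-based
            line = lines[row - 1]
            # One character replaces one character, so the columns of
            # later tokens on the same line remain valid.
            lines[row - 1] = line[:col] + MINUS + line[col + 1:]
    return "\n".join(lines)


if __name__ == "__main__":
    print(display_form('x = a - b  # a-b stays\ns = "a-b"\n'))
```

A plain `str.translate` table over the whole buffer would also have rewritten the hyphens in the comment and in the string literal, which is exactly the difference described above.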
By the fair way I mean a way that tends to bring back the correct usage of characters, instead of trying to hide the problem with some patch. Now, I can't say what the least problematic way is for Python, but if I were responsible for that, I would base the solution on these principles:

1. Future versions of the syntax, ideally, must allow ONLY the minus sign U+2212 for the minus operator, and allow hyphens U+002D in identifiers. Since that is impossible at the moment, I must think out the least painful transition.

2. I want users to be able to use the underscore as well. The underscore comes from mechanical typewriters: to underline text, one pushed the carriage back and typed underscores to draw the line under the text. In digital text today it does not make much sense, and as a separator it looks ugly, but it is not so hopeless. Currently the underscore lies below the font baseline, but if one moves it closer to the baseline, it can serve as a fairly adequate additional separator, so users would get more ways to write identifiers.

3. I don't want to break backward compatibility, but I am still oriented towards compliance with typographic practice and character code standards. I also want users who are interested in a better UX to get the benefits out of the box, without forcing them to tweak their text editors or write their own translators.

What to do? One option, IMO, would be to introduce a header in the sources, e.g.:

    # opt-in: hyphen-minus

which would tell the parser to switch on the "new" rules, namely that U+2212 is parsed as the minus operator and hyphens as part of identifiers. Then users who are aware of the benefits, and who remember monospaced fonts only as an unpleasant incident from their youth, can enjoy the beauty of source code without any tweaks; the only thing they need to do is bind a key to input the U+2212 sign. Users who do not want it just leave the header out.

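A rough sketch of how a tool might map a file that carries the opt-in header back to today's syntax. Everything here is an assumption for illustration: the header is expected on the first line, hyphenated identifiers are mapped to underscores (the proposal does not say how they should be mapped), and a small regular expression stands in for a real lexer to skip comments and string literals. The names `to_old_syntax`, `OPT_IN` and `_TOKEN` are hypothetical.

```python
import re

OPT_IN = "# opt-in: hyphen-minus"  # header text taken from the proposal
MINUS = "\u2212"                   # Unicode MINUS SIGN

# Comments and string literals, which must be copied through unchanged.
_SKIP = "|".join([
    r"#[^\n]*",
    r'"""(?:\\.|[^\\])*?"""',
    r"'''(?:\\.|[^\\])*?'''",
    r'"(?:\\.|[^"\\\n])*"',
    r"'(?:\\.|[^'\\\n])*'",
])

_TOKEN = re.compile(
    "(?P<skip>" + _SKIP + ")"
    + r"|(?P<ident>(?<!\w)[^\W\d]\w*(?:-[^\W\d]\w*)+)"  # name containing hyphens
    + "|(?P<minus>" + MINUS + ")",
    re.DOTALL,
)


def _replace(match: "re.Match[str]") -> str:
    if match.group("skip") is not None:
        return match.group(0)                    # leave comments/strings alone
    if match.group("ident") is not None:
        return match.group(0).replace("-", "_")  # assumed mapping: '-' -> '_'
    return "-"                                   # U+2212 -> ASCII hyphen-minus


def to_old_syntax(source: str) -> str:
    """Translate opt-in source to current syntax; other files pass through."""
    first, _, body = source.partition("\n")
    if first.strip() != OPT_IN:
        return source
    # The header line is dropped so the result no longer claims to opt in.
    return _TOKEN.sub(_replace, body)


if __name__ == "__main__":
    demo = OPT_IN + "\ntotal-sum = first-value \u2212 2\n"
    print(to_old_syntax(demo))
```

This is only a text-level approximation: string prefixes, f-string expressions, and name clashes between `a-b` and an existing `a_b` would need a real lexer and an agreed mapping.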
Further, I'd add a command-line utility that can translate directly to the "old" syntax, in case one wants to export a project in the old syntax. That way one could avoid the backward-compatibility issue. That is just one option that comes to my mind.

Another thing which might be important in this regard: say you want to publish a book about Python. With such a syntax you could import the code directly into DTP software without needing to make any corrections, so it looks almost like normal English text, with no worries about strange-looking minus operators.

Mikhail
_______________________________________________
Python-ideas mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
