Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
MRAB writes:

> That's a strange thing to do. It's more usual to use a _subscript_ to
> indicate an index: a₃ vs a³

Oh, we economic theorists do that too. It's typically a double-indexed array of parameters, where both rows and columns can meaningfully be treated as vectors. So a₃ is the vector of quantities of good 3 produced by all firms, while a³ is the vector of quantities of all goods produced by firm 3. Or in analysis of international or interregional trade, there's an index indicating which country exports which good to which importing country. Some people put the good index in the superscript and the two countries in the subscript, others the opposite.

IIRC, mathematical physicists use both subscript and superscript in tensor notation, and nuclear physicists use one for atomic number and the other for atomic weight (and thus would expect both subscript and superscript to be treated lexically as identifier components, not as expression components).

The point is not that polynomials are not the most common use of superscript notation -- I don't care one way or the other. It's that there are many uses, important to those fields, that aren't polynomials.

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
Steven D'Aprano writes:

> In other words, ² behaves as a unary postfix operator that squares
> its argument. Likewise for ³, etc. You can even combine them: x³³
> would be the same as x**33. There's more here:

I hope that's configurable. I use superscripts to indicate an index as often as I use them to indicate an exponent.
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
On Mon, Oct 31, 2016 at 10:19:58AM +0900, Stephen J. Turnbull wrote:

> Steven D'Aprano writes:
>
> > I see that Perl is leading the way here, supporting a large number of
> > Unicode symbols:
> >
> > https://docs.perl6.org/language/unicode_entry.html
>
> In what sense is that "support"?

In the sense that Perl 6 not only allows Unicode identifiers (as Python has for many years) but also Unicode operators and symbols. For example, you can use either the Unicode character ⊂ \N{SUBSET OF} or the ASCII trigraph (<) for doing subset tests.

> > I must say that it is kinda cute that Perl6 does the right thing for x².
>
> Uh, as far as I can tell from that page, Perl has absolutely nothing
> to do with that. You enter the Unicode code point as hex, and if the
> font supports, you get the character.

You missed the bit that Perl 6 interprets "x²" in code as the equivalent of x**2 (x squared). In other words, ² behaves as a unary postfix operator that squares its argument. Likewise for ³, etc. You can even combine them: x³³ would be the same as x**33.

There's more here: https://docs.perl6.org/language/unicode_texas

--
Steve
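For readers who want to see the superscript rule spelled out, here is a minimal Python sketch of the rewrite described above (x² to x**2, x³³ to x**33). It is only an illustration: `desugar_superscripts` is a hypothetical helper, not anything Python or Perl 6 provides, and it works on raw text, so it would also rewrite superscripts inside string literals and comments.

```python
import re

# Superscript digits 0-9 and their ASCII equivalents.
_SUPERSCRIPTS = str.maketrans("⁰¹²³⁴⁵⁶⁷⁸⁹", "0123456789")
_SUPERSCRIPT_RUN = re.compile("[⁰¹²³⁴⁵⁶⁷⁸⁹]+")


def desugar_superscripts(source: str) -> str:
    """Rewrite each run of superscript digits as '**<digits>'.

    Hypothetical helper for illustration only: it is a plain text
    substitution and knows nothing about Python's grammar.
    """
    return _SUPERSCRIPT_RUN.sub(
        lambda match: "**" + match.group().translate(_SUPERSCRIPTS),
        source,
    )


if __name__ == "__main__":
    for expr in ("x²", "x³³", "3² + 2³"):
        print(expr, "->", desugar_superscripts(expr))
```

A real implementation would have to live in the tokenizer, and would have to decide what a superscript means when it is used as an index rather than an exponent.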
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
On Sun, Oct 30, 2016 at 10:16:19AM -0700, David Mertz wrote:

> My vim configuration for a year or two has looked something like this (the
> screenshot doesn't show the empty set symbol, but that's part of my conceal
> configuration: http://gnosis.cx/bin/.vim/after/syntax/python.vim).

Oh nice!

By the way, anyone looking at this in a browser may find that the browser defaults to treating it as Latin-1, which gives you mojibake. Just tell your browser to treat it as Unicode.

--
Steve
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
My vim configuration for a year or two has looked something like this (the screenshot doesn't show the empty set symbol, but that's part of my conceal configuration: http://gnosis.cx/bin/.vim/after/syntax/python.vim).

On Sun, Oct 30, 2016 at 7:13 AM, Chris Angelico wrote:

> On Mon, Oct 31, 2016 at 12:39 AM, Paul Moore wrote:
> > It's certainly not difficult, in principle. I have (had, I lost it in
> > an upgrade recently...) a little AutoHotkey program that interpreted
> > Vim-style digraphs in any application that needed them. But my point
> > was that we don't want to require people to write such custom
> > utilities, just to be able to write Python code. Or is the feeling
> > that it's acceptable to require that?
>
> There's a chicken-and-egg problem. So long as most people don't have
> tools like that, a language that requires them is going to be very
> annoying - but so long as no major language uses such characters,
> there's no reason for developers to set up those kinds of tools.
>
> Possibly the best way is a gentle introduction of alternative
> syntaxes. Since Python currently has no "empty set display" syntax,
> that seems like a perfect starting point. You can always type "set()",
> but that involves an actual function call; using ∅ gives a small
> performance boost, eliminates the risk of shadowing, etc, etc. All
> minor points, but could be convenient enough. Also, if repr(set())
> returns "∅", it'll be easy for anyone to get hold of the character for
> copy/paste.
>
> As of 2016, I think it's not acceptable to *require* this, but it may
> be time to start making use of it, retaining ASCII-only digraphs and
> trigraphs, the way C has alternative spelling for braces and so on.
> Then time passes, most people will be comfortable using the characters
> themselves, and the digraphs/trigraphs can be deprecated, with new
> syntax not being given any.
>
> Pipe dream?
>
> ChrisA

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
On Mon, Oct 31, 2016 at 12:19 PM, Stephen J. Turnbull wrote:

> Uh, as far as I can tell from that page, Perl has absolutely nothing
> to do with that. You enter the Unicode code point as hex, and if the
> font supports, you get the character. What Paul is arguing is that
> entering any character, non-ASCII or ASCII, as a hex code point or as
> an Alt+digits sequence, is a non-starter for our audience. Much as
> I'd like to disagree, I can't.

Back when I used a single codepage (IBM OEM, now called 437) and 256 characters, it wasn't unreasonable to memorize the alt-codes for most of those characters. I could do all the single-line and double-line characters from memory (might take me a couple of tries to get the right corner), and if I needed to mix line types, I could just look those up. But with all of Unicode? Totally impractical. You can't expect people to use the hex codes.

ChrisA
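To make the codepage 437 point concrete, the sketch below uses Python's standard `cp437` codec to turn a decimal Alt code into the character it produced, and draws a small double-line box from six of them. The helper names `alt_code` and `double_box` are made up for this example.

```python
def alt_code(number: int) -> str:
    """Return the code page 437 character for a decimal Alt+number code."""
    return bytes([number]).decode("cp437")


def double_box(width: int, height: int) -> str:
    """Draw a double-line box using the classic CP437 Alt codes."""
    top_left, horizontal, top_right = alt_code(201), alt_code(205), alt_code(187)
    vertical, bottom_left, bottom_right = alt_code(186), alt_code(200), alt_code(188)
    top = top_left + horizontal * width + top_right
    middle = vertical + " " * width + vertical
    bottom = bottom_left + horizontal * width + bottom_right
    return "\n".join([top] + [middle] * height + [bottom])


if __name__ == "__main__":
    print(double_box(10, 2))
```

Six codes cover one box style; the same approach plainly does not scale to the whole of Unicode, which is the point being made.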
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
Steven D'Aprano writes:

> I see that Perl is leading the way here, supporting a large number of
> Unicode symbols:
>
> https://docs.perl6.org/language/unicode_entry.html

In what sense is that "support"? What I see on that page is a lot of advice for the kind of people who are already using non-ASCII in Python, as I have been doing since 2001 or so.

> I must say that it is kinda cute that Perl6 does the right thing for x².

Uh, as far as I can tell from that page, Perl has absolutely nothing to do with that. You enter the Unicode code point as hex, and if the font supports, you get the character. What Paul is arguing is that entering any character, non-ASCII or ASCII, as a hex code point or as an Alt+digits sequence, is a non-starter for our audience. Much as I'd like to disagree, I can't.
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
On Mon, Oct 31, 2016 at 12:02:54AM +1000, Nick Coghlan wrote:

> What this means is that there aren't likely to be many practical gains
> in using the "right" symbol for something, even when it's already
> defined in Unicode, as we expect the number of people learning that
> symbology *before* learning Python to be dramatically smaller than the
> proportion learning Python first and the formal mathematical symbols
> later (if they learn them at all).

Depends on the symbol. Most people do study maths in secondary school where they will be introduced to symbols beyond the ASCII + - * / etc, for instance set union and intersection ∪ ∩, although your point certainly applies to some of the more exotic (even for mathematicians) symbols in Unicode.

> This means that instead of placing more stringent requirements on
> editing environments for Python source code in order to use non-ASCII
> input symbols, we're still far more likely to look to define a
> suitable keyword, or assign a relatively arbitrary meaning to an ASCII
> punctuation symbol

Indeed. But there's only so many ASCII punctuation marks, and digraphs and trigraphs can become tiresome. And once people have solved the keyboard entry issue, it is no harder to memorise the "correct" symbol than some arbitrary ASCII sequence.

> (and that's assuming we accept that a proposal will
> see sufficient use to be worthy of new syntax in the first place,
> which is far from being a given).

I see that Perl is leading the way here, supporting a large number of Unicode symbols:

https://docs.perl6.org/language/unicode_entry.html

I must say that it is kinda cute that Perl6 does the right thing for x².

--
Steve
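For comparison, this is how today's Python spells the set operations mentioned above using ASCII punctuation. The variable names are only for this example; the operators themselves are standard Python.

```python
evens = {0, 2, 4, 6}
small = {0, 1, 2, 3}

union = evens | small               # mathematical ∪
intersection = evens & small        # mathematical ∩
is_proper_subset = {0, 2} < evens   # mathematical ⊂

if __name__ == "__main__":
    print(sorted(union), sorted(intersection), is_proper_subset)
```

Whether `|` and `&` are any easier to remember than ∪ and ∩ is exactly the question under discussion.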
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
Paul Moore writes:

> My point wasn't so much about dealing with the character set of
> Unicode, as it was about physical entry of non-native text. For
> example, on my (UK) keyboard, all of the printed keycaps are basically
> used.

How do you type the pound sign and the Euro sign? Are they on the UK keyboard? Or are you not in the UK and don't need them?

> And yet, I can't even enter accented letters from latin-1 with a
> standard keypress, much less extended Unicode.

I'm pretty sure you can, but since I've been Windows-free for 20 years (except for a short period when I was treasurer for an NGO, and only used it to access the accounting system), I can't tell you what it is.

On the Mac, you press alt/option plus a graphic key. Most result in what somebody decided are common non-ASCII characters (German sharp S, Greek lowercase mu, Greek upper- and lowercase sigma), but several are dead keys, producing accented characters when combined with a base character: tilde, accents acute and grave, and so on. Surely Windows has a similar system (I don't mean Alt+digits). (But maybe not, I didn't notice one in my brief Googling.)

> My interest in East Asian experience is at least in part because
> the "normal" character sets, as I understand it, are big enough
> that it's impractical for a keyboard to include a plausible basic
> range of characters, so I'm curious as to what the physical process
> is for typing from a vocabulary of thousands of characters on a
> sanely-sized keyboard.

You're right about the size. Korean is special, because the 11,000-odd Hangul are phonetic and generated algorithmically from a set of about 70 phonetic partial glyphs, divided into three groups. The same keys do multiple duty when typed in phonetic order. Other systems use the shift key.

For the 100,000 Han ideographs[1], there are a wide variety of methods for entry by key sequence, ranging from code point entry to context-dependent phonetic entry of entire sentences as they would be spoken.
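The algorithmic generation of Hangul can be sketched in a few lines of Python. The arithmetic below follows the Unicode Standard's Hangul syllable composition: 19 leading consonants, 21 vowels and 28 trailing slots (27 consonants plus "no trailing consonant"), giving 11,172 precomposed syllables starting at U+AC00. The function name `compose_hangul` is made up for this example.

```python
import unicodedata

SYLLABLE_BASE = 0xAC00           # first precomposed syllable, 가
LEAD_COUNT, VOWEL_COUNT, TAIL_COUNT = 19, 21, 28   # tail index 0 means "no tail"


def compose_hangul(lead: int, vowel: int, tail: int = 0) -> str:
    """Build a precomposed Hangul syllable from three jamo indices."""
    if not (0 <= lead < LEAD_COUNT and 0 <= vowel < VOWEL_COUNT
            and 0 <= tail < TAIL_COUNT):
        raise ValueError("jamo index out of range")
    return chr(SYLLABLE_BASE + (lead * VOWEL_COUNT + vowel) * TAIL_COUNT + tail)


if __name__ == "__main__":
    print(LEAD_COUNT * VOWEL_COUNT * TAIL_COUNT)   # total number of syllables
    han = compose_hangul(18, 0, 4)                 # jamo indices for ㅎ, ㅏ, ㄴ
    print(han, unicodedata.name(han))
```

NFC normalization performs the same composition on a sequence of conjoining jamo, which is roughly what a Korean input method does as the keys are typed in phonetic order.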
Then, of course, there's voice recognition, and handwriting recognition (both static from the image, and dynamic, taking account of the order of pen strokes). The more advanced input methods not only take account of grammar, but also learn the users' habits, remember recent conversions, and predict coming keystrokes based on current context, offering several conversions based on plausible continuations.

> In mentioning emoji, my main point was that "average computer
> users" are more and more likely to want to use emoji in general
> applications (emails, web applications, even documents) - and if a
> sufficiently general solution for that problem is found, it may
> provide a solution for the general character-entry case.

Not for the Asian languages. For them, "character entry" in the sense of character-by-character has long since been obsoleted by predictive sentence-level phonetic methods.

But emoji are a perfect example for the present purpose, since they don't have standard pronunciations (although probably many will get them based on the Unicode standard names). On systems with high-enough resolution displays, a palette showing the glyphs is the obvious solution. But that's not pleasant if you type quickly and need those characters frequently. I don't think there's an alternative for emoji though, except for personalized shortcut maps. Math symbols are similar, I think.

> Coming back to a more mundane example, if I need to type a character
> like é in an email, I currently need to reach for Character Map and
> cut and paste it. The same is true if I have to type it into the
> console.

You probably have Control, Windows, Menu, Alt, and maybe a "function" key. If you're lucky, one labelled AltGr for "Alternate Graphic" is the obvious suspect. Some combination of the above probably allows entry of accented Latin-1 characters, miscellaneous Latin-1 (e.g., sharp S), and a few oddballs (Greek letters, ligatures like oe, the lemniscate usually read infinity).
> That's a sufficiently annoying stumbling block

It very well could be, although my Windows Google-fu isn't great. But this is what I found.

For WHITE SQUARE, the Mac doesn't have a keyboard equivalent, but there's a standard way to set up a set of shortcut keys[2]:

http://stackoverflow.com/questions/3685146/how-do-you-do-the-therefore-%E2%88%B4-symbol-on-a-mac-or-in-textmate

And I think you can also use the "Input Preferences" screen in System Preferences to set up a few of them.

For Windows, it seems that Alt+decimal character codes, or hex Unicode followed by Alt+x, are the built-in ways to enter characters not on your keyboard. It's also possible to set up "Math Autocorrect" to automatically convert key sequences according to

https://blogs.msdn.microsoft.com/murrays/2011/08/29/sans-serif-mathematical-symbols/

but that's hardly obvious (although maybe it is if you're Dutch?)

I have to wonder why so many people stick with a system that
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
Steven D'Aprano wrote:

> I cannot wait for the day that we can use non-ASCII operators. But I
> don't think that day has come: it is still too hard for many people
> (including me) to generate non-ASCII characters at the keyboard, and
> font support for some of the more useful ones are still inconsistent or
> lacking.
> For example, we don't have a good literal for empty sets. How about ∅?
> Sadly, in my mail client and in the Python REPR, it displays as a
> "missing glyph" open rectangle. And how would you type it?

I will just share my view on the whole problem, trying to concentrate more on how the code actually looks. I see it all as a chain of big steps, roughly:

1. One defines *the real code* or syntax. That is, one takes pen and paper (or Photoshop and the paint bucket) and *defines* the syntax exactly as it should appear, including pixel-precise spacing between operators, punctuation and so on.

2. One develops an application (IDE) which can load a code file automatically and (at least) display it *exactly* as you have defined it.

3. Only after that does one start to think about ASCII/Unicode/Hangul (forgive me Lord) or whatever someone has defined as something useful/standard.

> Java, I believe, allows you to enter escape sequences in source code,
> not just in strings. So we could hypothetically allow one of:
>
>     myobject\N{WHITE SQUARE}attribute
>     myobject\u25a1attribute
>
> as a pure-ASCII way of getting
>
>     myobject□attribute

So this would actually be a possible kind of "bridge" from the real code to what shows up in an arbitrary text editor or mail client. In other words, you believe that you'll find something useful for code definition in the Unicode table, but I personally would not even start relying on that, also because it is merely bottom-up problem solving.

Mikhail
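As a sketch of the "pure-ASCII bridge" idea, the Python below expands `\N{...}` and `\uXXXX` escapes found anywhere in a line of text. Python itself accepts these escapes only inside string literals, so `expand_escapes` is a hypothetical preprocessing helper, written here just to show that the mapping is mechanical; `unicodedata.lookup` is the standard-library call that resolves a character name.

```python
import re
import unicodedata

_NAMED_ESCAPE = re.compile(r"\\N\{([^}]+)\}")
_HEX_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def expand_escapes(text: str) -> str:
    """Replace \\N{NAME} and \\uXXXX escapes with the characters they name.

    Hypothetical helper: a plain text substitution, not a real tokenizer.
    An unknown character name makes unicodedata.lookup raise KeyError.
    """
    text = _NAMED_ESCAPE.sub(lambda m: unicodedata.lookup(m.group(1)), text)
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


if __name__ == "__main__":
    print(expand_escapes(r"myobject\N{WHITE SQUARE}attribute"))
    print(expand_escapes(r"myobject\u25a1attribute"))
```

Both spellings produce the same text, which is the property an ASCII fallback syntax would need.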
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
Hi all,

For those of you not aware, the Julia Programming Language [1] does make extensive use of (mathematical) Unicode symbols in its standard library, and even documents an input method [2] (hint: tab completion). They go even further by recognizing some characters (like \oplus) that parse as operators and have predefined precedences, but no implementations, leaving them available to the user.

Regardless of my personal feeling about that, I have observed that this does not seem to hinder Julia development. Many developers seem to like it a lot, though my sampling is heavily biased toward developers with a strong math background.

So it might be a case study to actually see how this affects an existing language, both technically and community-wide.

Cheers,
--
M

[1] : julialang.org
[2] : http://docs.julialang.org/en/release-0.5/manual/unicode-input/
[3] : http://docs.julialang.org/en/release-0.5/manual/variables/#allowed-variable-names

On Sun, Oct 30, 2016 at 7:02 AM, Nick Coghlan wrote:

> On 30 October 2016 at 23:39, Paul Moore wrote:
>> It's certainly not difficult, in principle. I have (had, I lost it in
>> an upgrade recently...) a little AutoHotkey program that interpreted
>> Vim-style digraphs in any application that needed them. But my point
>> was that we don't want to require people to write such custom
>> utilities, just to be able to write Python code. Or is the feeling
>> that it's acceptable to require that?
>
> Getting folks used to the idea that they need to use the correct kinds
> of quotes is already challenging :)
>
> However, the main issue is the one I mentioned in PEP 531 regarding
> the "THERE EXISTS" symbol: Python and other programming languages
> re-use "+", "-", "=" etc because a lot of folks are already familiar
> with them from learning basic arithmetic. Other symbols are used in
> Python because they were inherited from C, or were relatively
> straightforward puns on such previously inherited symbols.
>
> What this means is that there aren't likely to be many practical gains
> in using the "right" symbol for something, even when it's already
> defined in Unicode, as we expect the number of people learning that
> symbology *before* learning Python to be dramatically smaller than the
> proportion learning Python first and the formal mathematical symbols
> later (if they learn them at all).
>
> This means that instead of placing more stringent requirements on
> editing environments for Python source code in order to use non-ASCII
> input symbols, we're still far more likely to look to define a
> suitable keyword, or assign a relatively arbitrary meaning to an ASCII
> punctuation symbol (and that's assuming we accept that a proposal will
> see sufficient use to be worthy of new syntax in the first place,
> which is far from being a given).
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
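A rough Python sketch of what LaTeX-style tab completion amounts to: a lookup table from an ASCII token to a Unicode character. The table below is a tiny hand-picked assumption for illustration, not Julia's actual completion list, and `complete` is a hypothetical helper.

```python
import unicodedata

# Hypothetical completion table: ASCII token -> Unicode character name.
COMPLETIONS = {
    r"\oplus": "CIRCLED PLUS",
    r"\emptyset": "EMPTY SET",
    r"\cup": "UNION",
    r"\cap": "INTERSECTION",
    r"\subset": "SUBSET OF",
}


def complete(token: str) -> str:
    """Return the symbol for a known token, or the token unchanged."""
    name = COMPLETIONS.get(token)
    return unicodedata.lookup(name) if name is not None else token


if __name__ == "__main__":
    for token in COMPLETIONS:
        print(token, "->", complete(token))
```

The table is the easy part; what a language has to supply is an editor or REPL that applies it on every platform where people type code.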
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
On 2016-10-30 10:47 AM, Paul Moore wrote:

> On 30 October 2016 at 14:43, wrote:
> > Just picking a nit, here, windows will happily let you do silly things like
> > hook 14 keyboards up and let you map all of emoji to them. Sadly, this
> > requires lua.
>
> Off topic, I know, but how? I have a laptop with an external and an
> internal keyboard. Can I map the internal keyboard to different
> characters somehow?

Look up "The Art of the Bodge: How I Made the Emoji Keyboard" by Tom Scott on YouTube. As the name implies, it's a huge hack with no practicality whatsoever.

Alex
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
On 30 October 2016 at 14:43, wrote:

> Just picking a nit, here, windows will happily let you do silly things like
> hook 14 keyboards up and let you map all of emoji to them. Sadly, this
> requires lua.

Off topic, I know, but how? I have a laptop with an external and an internal keyboard. Can I map the internal keyboard to different characters somehow?

Paul
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
> -Original Message-
> From: Python-ideas [mailto:python-ideas-bounces+tritium-
> list=sdamon@python.org] On Behalf Of Paul Moore
> Sent: Sunday, October 30, 2016 8:22 AM
> To: Stephen J. Turnbull <turnbull.stephen...@u.tsukuba.ac.jp>
> Cc: Python-Ideas <python-ideas@python.org>
> Subject: Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null
> coalescing operator]
>
>
> My point wasn't so much about dealing with the character set of
> Unicode, as it was about physical entry of non-native text. For
> example, on my (UK) keyboard, all of the printed keycaps are basically
> used. And yet, I can't even enter accented letters from latin-1 with a
> standard keypress, much less extended Unicode. Of course it's possible
> to get those characters (either by specialised mappings in an editor,
> or by using an application like Character Map) but there's nothing
> guaranteed to work across all applications. That's a hardware and OS
> limitation - the hardware only has so many keys to use, and the OS
> (Windows, in my case) doesn't support global key mapping (at least not
> to my knowledge, in a user-friendly manner - I'm excluding writing my
> own keyboard driver :-)) My interest in East Asian experience is at
> least in part because the "normal" character sets, as I understand it,
> are big enough that it's impractical for a keyboard to include a
> plausible basic range of characters, so I'm curious as to what the
> physical process is for typing from a vocabulary of thousands of
> characters on a sanely-sized keyboard.
>

Just picking a nit, here, windows will happily let you do silly things like hook 14 keyboards up and let you map all of emoji to them. Sadly, this requires lua.

> In mentioning emoji, my main point was that "average computer users"
> are more and more likely to want to use emoji in general applications
> (emails, web applications, even documents) - and if a sufficiently
> general solution for that problem is found, it may provide a solution
> for the general character-entry case. (Also, I couldn't resist the
> irony of using a :-) smiley while referring to emoji...) But it may be
> that app-specific solutions (e.g., the smiley menu in Skype) are
> sufficient for that use case. Or the typical emoji user is likely to
> be using a tablet/phone rather than a keyboard, and mobile OSes have
> included an emoji menu in their on-screen keyboards.
>
> Coming back to a more mundane example, if I need to type a character
> like é in an email, I currently need to reach for Character Map and
> cut and paste it. The same is true if I have to type it into the
> console. That's a sufficiently annoying stumbling block that I'm
> inclined to avoid it - using clumsy workarounds like referring to "the
> OP" rather than using their name. I'd be fairly concerned about
> introducing non-ASCII syntax into Python while such stumbling blocks
> remain - the amount of code typed outside of an editor (interactive
> prompt, emails, web applications like Jupyter) mean that editor-based
> workarounds like custom mappings are only a partial solution.
>
> But maybe you are right, and it's just my age showing. The fate of APL
> probably isn't that relevant these days :-) (or ☺ if you prefer...)
>
> Paul
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
On Mon, Oct 31, 2016 at 12:39 AM, Paul Moore wrote:

> It's certainly not difficult, in principle. I have (had, I lost it in
> an upgrade recently...) a little AutoHotkey program that interpreted
> Vim-style digraphs in any application that needed them. But my point
> was that we don't want to require people to write such custom
> utilities, just to be able to write Python code. Or is the feeling
> that it's acceptable to require that?

There's a chicken-and-egg problem. So long as most people don't have tools like that, a language that requires them is going to be very annoying - but so long as no major language uses such characters, there's no reason for developers to set up those kinds of tools.

Possibly the best way is a gentle introduction of alternative syntaxes. Since Python currently has no "empty set display" syntax, that seems like a perfect starting point. You can always type "set()", but that involves an actual function call; using ∅ gives a small performance boost, eliminates the risk of shadowing, etc, etc. All minor points, but could be convenient enough. Also, if repr(set()) returns "∅", it'll be easy for anyone to get hold of the character for copy/paste.

As of 2016, I think it's not acceptable to *require* this, but it may be time to start making use of it, retaining ASCII-only digraphs and trigraphs, the way C has alternative spelling for braces and so on. Then time passes, most people will be comfortable using the characters themselves, and the digraphs/trigraphs can be deprecated, with new syntax not being given any.

Pipe dream?

ChrisA
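The function-call and shadowing points about `set()` can be checked in current Python. The sketch below is an illustration only, with made-up function names: it shows that `set()` is compiled as a lookup of the name `set` (which can be rebound), while a non-empty set display such as `{1}` refers to no name at all.

```python
def empty_by_call():
    return set()          # looks up the name "set" at run time, then calls it


def one_by_display():
    return {1}            # a set display: built directly, no name lookup


def empty_when_shadowed():
    set = list            # a local name shadows the builtin
    return set()          # now returns an empty list, not a set


if __name__ == "__main__":
    print(empty_by_call.__code__.co_names)    # names the function looks up
    print(one_by_display.__code__.co_names)
    print(repr(empty_when_shadowed()), repr(set()))
```

A dedicated empty-set display would behave like `{1}` here, which is where the "eliminates the risk of shadowing" argument comes from.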
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
On 30 October 2016 at 23:39, Paul Moore wrote:
> It's certainly not difficult, in principle. I have (had, I lost it in an upgrade recently...) a little AutoHotkey program that interpreted Vim-style digraphs in any application that needed them. But my point was that we don't want to require people to write such custom utilities, just to be able to write Python code. Or is the feeling that it's acceptable to require that?

Getting folks used to the idea that they need to use the correct kinds of quotes is already challenging :)

However, the main issue is the one I mentioned in PEP 531 regarding the "THERE EXISTS" symbol: Python and other programming languages re-use "+", "-", "=" etc because a lot of folks are already familiar with them from learning basic arithmetic. Other symbols are used in Python because they were inherited from C, or were relatively straightforward puns on such previously inherited symbols.

What this means is that there aren't likely to be many practical gains in using the "right" symbol for something, even when it's already defined in Unicode, as we expect the number of people learning that symbology *before* learning Python to be dramatically smaller than the number learning Python first and the formal mathematical symbols later (if they learn them at all).

This means that instead of placing more stringent requirements on editing environments for Python source code in order to use non-ASCII input symbols, we're still far more likely to look to define a suitable keyword, or assign a relatively arbitrary meaning to an ASCII punctuation symbol (and that's assuming we accept that a proposal will see sufficient use to be worthy of new syntax in the first place, which is far from being a given).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
On 30 October 2016 at 12:31, Chris Angelico wrote:
> On Sun, Oct 30, 2016 at 11:22 PM, Paul Moore wrote:
>> In mentioning emoji, my main point was that "average computer users" are more and more likely to want to use emoji in general applications (emails, web applications, even documents) - and if a sufficiently general solution for that problem is found, it may provide a solution for the general character-entry case.
>
> Before Unicode emoji were prevalent, ASCII emoticons dominated, and it's not uncommon for multi-character sequences to be automatically transformed into their corresponding emoji. It isn't hard to set something up that does these kinds of transformations for other Unicode characters - use trigraphs for clarity, and type "/:0" to produce "∅". Or whatever's comfortable for you. Maybe rig it on Ctrl-Alt-0, if you prefer shift-key sequences.

It's certainly not difficult, in principle. I have (had, I lost it in an upgrade recently...) a little AutoHotkey program that interpreted Vim-style digraphs in any application that needed them. But my point was that we don't want to require people to write such custom utilities, just to be able to write Python code. Or is the feeling that it's acceptable to require that?

Paul
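For readers unfamiliar with the kind of utility being discussed, the trigraph idea quoted above amounts to a plain text substitution. This is a rough sketch only: the table and function name are hypothetical, only the "/:0" mapping comes from the thread, and a real input tool would hook keyboard events rather than rewrite finished strings.

```python
# Hypothetical trigraph table; only "/:0" -> EMPTY SET is taken from the thread.
TRIGRAPHS = {
    "/:0": "\N{EMPTY SET}",
    "/:e": "\N{LATIN SMALL LETTER E WITH ACUTE}",
}


def expand_trigraphs(text, table=TRIGRAPHS):
    """Replace each ASCII trigraph in text with its Unicode character."""
    for ascii_form, char in table.items():
        text = text.replace(ascii_form, char)
    return text


print(expand_trigraphs("empty = /:0"))  # empty = ∅
```

The substitution itself is trivial; the open question in the thread is whether every editor, console, and web form a Python programmer types into can be expected to offer something like it.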
Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
On 30 October 2016 at 07:00, Stephen J. Turnbull wrote:
>> as I imagine Unicode characters would be for me. I really hope it isn't...
>
> I think your imagination is running away with you. While I understand how costly it is for those over the age of 12 to develop new habits (I'm 58, and painfully aware of how frequently I balk at learning anything new no matter how productivity-enhancing it is likely to be, and how much more slowly it becomes part of my repertoire), the number of new things you would need to learn would be few, and frequently enough used, at least in Python. It's hard enough to get Guido (and the other Masters of Pythonic Language Design) to sign on to new ASCII syntax; even if in principle non-ASCII were to be admitted, I suspect the barrier there would be even higher.
>
> Most of Unicode is irrelevant to everybody. Mathematicians use only a small fraction of the math notation available to them -- it's just that it's a different small fraction for each field. The East Asians need a big chunk (I would guess that educated Chinese and Japanese encounter about 10,000 characters in "daily life" over a lifetime, while those encountered at least once a week number about 3000), but those that need to be memorized are a small minority (less than 5%) of the already defined Unicode repertoire.
>
> For Western programmers, the mechanics are almost certainly there. Every personal computer should have at least one font containing all characters defined in the Basic Multilingual Plane, and most will have chunks of the astral planes (emoji, rare math symbols, country flags, ...). Even the Happy Hacker keyboard has enough mode keys (shift, control, ...) to allow defining "3-finger salutes" for commonly-used characters not on the keycaps -- in daily life if you don't need a input method now, you won't need one if Python decides to use WHITE SQUARE to represent an operation you frequently use -- just an extra "control key combo" like the editing control keys (eg, for copy, cut, paste, undo) that aren't marked on any keyboard I have.

My point wasn't so much about dealing with the character set of Unicode, as it was about physical entry of non-native text. For example, on my (UK) keyboard, all of the printed keycaps are basically used. And yet, I can't even enter accented letters from Latin-1 with a standard keypress, much less extended Unicode. Of course it's possible to get those characters (either by specialised mappings in an editor, or by using an application like Character Map) but there's nothing guaranteed to work across all applications. That's a hardware and OS limitation - the hardware only has so many keys to use, and the OS (Windows, in my case) doesn't support global key mapping (at least not to my knowledge, in a user-friendly manner - I'm excluding writing my own keyboard driver :-))

My interest in East Asian experience is at least in part because the "normal" character sets, as I understand it, are big enough that it's impractical for a keyboard to include a plausible basic range of characters, so I'm curious as to what the physical process is for typing from a vocabulary of thousands of characters on a sanely-sized keyboard.

In mentioning emoji, my main point was that "average computer users" are more and more likely to want to use emoji in general applications (emails, web applications, even documents) - and if a sufficiently general solution for that problem is found, it may provide a solution for the general character-entry case. (Also, I couldn't resist the irony of using a :-) smiley while referring to emoji...) But it may be that app-specific solutions (e.g., the smiley menu in Skype) are sufficient for that use case. Or the typical emoji user is likely to be using a tablet/phone rather than a keyboard, and mobile OSes have included an emoji menu in their on-screen keyboards.

Coming back to a more mundane example, if I need to type a character like é in an email, I currently need to reach for Character Map and cut and paste it. The same is true if I have to type it into the console. That's a sufficiently annoying stumbling block that I'm inclined to avoid it - using clumsy workarounds like referring to "the OP" rather than using their name. I'd be fairly concerned about introducing non-ASCII syntax into Python while such stumbling blocks remain - the amount of code typed outside of an editor (interactive prompt, emails, web applications like Jupyter) means that editor-based workarounds like custom mappings are only a partial solution.

But maybe you are right, and it's just my age showing. The fate of APL probably isn't that relevant these days :-) (or ☺ if you prefer...)

Paul
[Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]
Paul Moore writes:
> On 29 October 2016 at 18:19, Stephen J. Turnbull wrote:
>>> For better or worse, it may be emoji that drive that change ;-)
>>
>> I suspect that the 100 million or so Chinese, Japanese, Korean, and Indian programmers who have had systems that have no trouble whatsoever handling non-ASCII for as long they've used computers will drive that change.
>
> My apologies. You are of course absolutely right.

tl;dr: A quick apology for the snark, and an attempt at FUD reduction. Using non-ASCII characters will involve some cost, but there are real benefits, and the fear and loathing often evoked by the prospect is unnecessary. I'm not ready to advocate introduction *right* now, but "never" isn't acceptable either. :-)

On with the show:

"Absolutely" is more than I deserve, as I was being a bit snarky. That said, Ed Yourdon wrote a book in 1990 or so with the self-promoting title of "Decline and Fall of the American Programmer"[1] in which he argued that for many kinds of software, outsourcing to China, India, or Ireland got you faster, better, cheaper, and internationalized, with no tradeoffs. (The "and internationalized" is my hobby horse, it wasn't part of Yourdon's thesis.) He later recanted the extremist doomsaying, but a quick review of the fraction of H1B visas granted to Asian-origin programmers should convince you that USA/EUR/ANZ doesn't have a monopoly on good-to-great programming (probably never did, but that's a topic for a different thread). Also note that in Japan, without controlling for other factors, just the programming language used most frequently, Python programmers are the highest paid among developers in all languages with more than 1% of the sample (and yes, that includes COBOL!).

To the extent that internationalization matters to a particular kind of programming, these programmers are better placed for those jobs, I think. And while in many cases "on site" has a big advantage (so you can't telecommute from Bangalore, you need that H1B which is available in rather restrictive numbers), more and more outsourcing does cross oceans so potential competition is immense.

There is a benefit to increasing our internationalization in backward-incompatible ways. And that benefit is increasing both in magnitude and in the number of Python developers who will receive it.

> I'm curious to know how easy it is for Chinese, Japanese, Korean and Indian programmers to use *ASCII* characters. I have no idea in practice whether the current basically entirely-ASCII nature of programming languages is as much a problem for them

Characters are zero problem for them. The East Asian national standards all include the ASCII repertoire, and some device (usually based on ISO 2022 coding extensions rather than UTF-8) for allowing ASCII to be one-byte, even if the "local" characters require two or more bytes. I forget if India's original national standard also included an ASCII subset, but they switched over to Unicode quite early[2], so UTF-8 does the trick for them.

English (the language) is a much bigger issue. Most Indians, of course, have little trouble with the derived-from-English nature of much programming syntax and library identifiers, and the Asians all get enough training in both (very) basic English and rote memorization that handling English-derived syntax and library nomenclature is not a problem. However, reading and especially creating documentation can be expensive and inaccurate. At least in Japanese, "straightforward" translations are often poor, as nuances are lost. E.g., a literal Japanese translation from English requires many words to indicate the differences a simple "a" vs. "the" vs. "some" indicates in English. Mostly such nuances can be expressed economically by restructuring a whole paragraph, but translators rarely bother and often seem unaware of the issues. Many Japanese programmers' use of articles is literally chaotic: it's deterministic but appears random to all but the most careful analysis.[3]

> as I imagine Unicode characters would be for me. I really hope it isn't...

I think your imagination is running away with you. While I understand how costly it is for those over the age of 12 to develop new habits (I'm 58, and painfully aware of how frequently I balk at learning anything new no matter how productivity-enhancing it is likely to be, and how much more slowly it becomes part of my repertoire), the number of new things you would need to learn would be few, and frequently enough used, at least in Python. It's hard enough to get Guido (and the other Masters of Pythonic Language Design) to sign on to new ASCII syntax; even if in principle non-ASCII were to be admitted, I suspect the barrier there would be even higher.

Most of Unicode is irrelevant to everybody. Mathematicians use only a small fraction of the math notation available to them -- it's just that it's