Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-11-01 Thread Stephen J. Turnbull
MRAB writes:

 > That's a strange thing to do. It's more usual to use a _subscript_ to 
 > indicate an index: a₃ vs a³

Oh, we economic theorists do that too.  It's typically a
double-indexed array of parameters, where both rows and columns can
meaningfully be treated as vectors.  So a₃ is the vector of
quantities of good 3 produced by all firms, while a³ is the vector of
quantities of all goods produced by firm 3.  Or in analysis of
international or interregional trade, there's an index indicating
which country exports which good to which importing country.  Some
people put the good index in the superscript and the two countries in
the subscript, others the opposite.  IIRC, mathematical physicists use
both subscript and superscript in tensor notation, nuclear physicists
use one for atomic number and the other for atomic weight (and thus
would expect both subscript and superscript to be treated lexically as
identifier components, not as expression components).
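
To make the two readings concrete, here is a minimal Python sketch.  The
data and the helper names by_good and by_firm are invented for
illustration, and the 1-based labels follow the prose above rather than
Python's 0-based indexing.

```python
# Hypothetical data: quantities[firm][good]; rows are firms, columns are goods.
quantities = [
    [10, 11, 12],  # firm 1
    [20, 21, 22],  # firm 2
    [30, 31, 32],  # firm 3
]


def by_good(a, good):
    """Subscript reading (a₃): quantity of one good across all firms."""
    return [row[good - 1] for row in a]


def by_firm(a, firm):
    """Superscript reading (a³): quantities of all goods for one firm."""
    return list(a[firm - 1])


good_3 = by_good(quantities, 3)  # column 3
firm_3 = by_firm(quantities, 3)  # row 3
```

Both are "index 3", but they pick out different vectors, which is why a
single hard-wired meaning for a superscript would not suit this notation.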

The point is not that polynomials are not the most common use of
superscript notation -- I don't care one way or the other.  It's that
there are many uses, important to those fields, that aren't
polynomials.

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-11-01 Thread Stephen J. Turnbull
Steven D'Aprano writes:

 > In other words, ² behaves as a unary postfix operator that squares
 > its argument. Likewise for ³, etc. You can even combine them: x³³
 > would be the same as x**33. There's more here:

I hope that's configurable.  I use superscripts to indicate an index
as often as I use them to indicate an exponent.



Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-31 Thread Steven D'Aprano
On Mon, Oct 31, 2016 at 10:19:58AM +0900, Stephen J. Turnbull wrote:
> Steven D'Aprano writes:
> 
>  > I see that Perl is leading the way here, supporting a large number of 
>  > Unicode symbols:
>  > 
>  > https://docs.perl6.org/language/unicode_entry.html
> 
> In what sense is that "support"?  

In the sense that Perl 6 not only allows Unicode identifiers (as Python 
has for many years) but also Unicode operators and symbols. For example, 
you can use either the Unicode character ⊂ \N{SUBSET OF} or the ASCII 
trigraph (<) for doing subset tests.
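
For comparison, a small Python sketch of where things stand today: the
character can be named inside a string literal via \N{SUBSET OF}, but the
subset test itself only has an ASCII spelling.  I am assuming here that ⊂
is read as a strict (proper) subset test, which Python spells with the <
operator on sets.

```python
import unicodedata

# The named escape works inside string literals only, not as an operator.
subset_sign = "\N{SUBSET OF}"
sign_name = unicodedata.name(subset_sign)

small = {1, 2}
large = {1, 2, 3}

# Python's ASCII spellings of the subset tests.
is_proper_subset = small < large   # strict subset
is_subset = large <= large         # subset or equal
is_strict_self = large < large     # a set is not a proper subset of itself
```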


>  > I must say that it is kinda cute that Perl6 does the right thing for x².
> 
> Uh, as far as I can tell from that page, Perl has absolutely nothing
> to do with that.  You enter the Unicode code point as hex, and if the
> font supports it, you get the character.

You missed the bit that Perl 6 interprets "x²" in code as the equivalent 
of x**2 (x squared).

In other words, ² behaves as a unary postfix operator that squares its 
argument. Likewise for ³, etc. You can even combine them: x³³ would be 
the same as x**33. There's more here:

https://docs.perl6.org/language/unicode_texas
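
Python has nothing like this, but the behaviour described above can be
approximated with a source-to-source rewrite.  The following is only a
sketch: the function name desugar_superscripts is made up, and it naively
rewrites every run of superscript digits, including any that happen to
sit inside string literals or comments.

```python
import re

# Map each superscript digit to its ASCII digit.
_TO_ASCII = str.maketrans("⁰¹²³⁴⁵⁶⁷⁸⁹", "0123456789")
_SUPERSCRIPT_RUN = re.compile(r"[⁰¹²³⁴⁵⁶⁷⁸⁹]+")


def desugar_superscripts(source):
    """Rewrite a run of superscript digits as an ASCII ** exponent."""
    return _SUPERSCRIPT_RUN.sub(
        lambda match: "**" + match.group().translate(_TO_ASCII),
        source,
    )


rewritten = desugar_superscripts("x³³ + y²")
squared = eval(desugar_superscripts("x²"), {"x": 5})
```

A real implementation would have to live in the tokenizer so that it could
tell code from string contents.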




-- 
Steve

Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-31 Thread Steven D'Aprano
On Sun, Oct 30, 2016 at 10:16:19AM -0700, David Mertz wrote:

> My vim configuration for a year or two has looked something like this (the
> screenshot doesn't show the empty set symbol, but that's part of my conceal
> configuration: http://gnosis.cx/bin/.vim/after/syntax/python.vim).

Oh nice!

By the way, anyone looking at this in a browser may find that the 
browser defaults to treating it as Latin-1, which gives you mojibake. 
Just tell your browser to treat it as Unicode.
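
The effect is easy to reproduce in Python.  This is just an illustration
of the mechanism, assuming the file is served as UTF-8 and the browser
guesses Latin-1; the sample text is mine, not taken from the vim file.

```python
text = "∅ ∪ ∩"

raw = text.encode("utf-8")        # bytes as stored on the server
wrong = raw.decode("latin-1")     # what a Latin-1 guess displays: mojibake
right = raw.decode("utf-8")       # what you see after fixing the encoding

# Latin-1 maps every byte to one character, so the damage is reversible.
repaired = wrong.encode("latin-1").decode("utf-8")
```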



-- 
Steve


Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-31 Thread David Mertz
My vim configuration for a year or two has looked something like this (the
screenshot doesn't show the empty set symbol, but that's part of my conceal
configuration: http://gnosis.cx/bin/.vim/after/syntax/python.vim).

On Sun, Oct 30, 2016 at 7:13 AM, Chris Angelico  wrote:

> On Mon, Oct 31, 2016 at 12:39 AM, Paul Moore  wrote:
> > It's certainly not difficult, in principle. I have (had, I lost it in
> > an upgrade recently...) a little AutoHotkey program that interpreted
> > Vim-style digraphs in any application that needed them. But my point
> > was that we don't want to require people to write such custom
> > utilities, just to be able to write Python code. Or is the feeling
> > that it's acceptable to require that?
>
> There's a chicken-and-egg problem. So long as most people don't have
> tools like that, a language that requires them is going to be very
> annoying - but so long as no major language uses such characters,
> there's no reason for developers to set up those kinds of tools.
>
> Possibly the best way is a gentle introduction of alternative
> syntaxes. Since Python currently has no "empty set display" syntax,
> that seems like a perfect starting point. You can always type "set()",
> but that involves an actual function call; using ∅ gives a small
> performance boost, eliminates the risk of shadowing, etc, etc. All
> minor points, but could be convenient enough. Also, if repr(set())
> returns "∅", it'll be easy for anyone to get hold of the character for
> copy/paste.
>
> As of 2016, I think it's not acceptable to *require* this, but it may
> be time to start making use of it, retaining ASCII-only digraphs and
> trigraphs, the way C has alternative spelling for braces and so on.
> Then time passes, most people will be comfortable using the characters
> themselves, and the digraphs/trigraphs can be deprecated, with new
> syntax not being given any.
>
> Pipe dream?
>
> ChrisA



-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Chris Angelico
On Mon, Oct 31, 2016 at 12:19 PM, Stephen J. Turnbull
 wrote:
> Uh, as far as I can tell from that page, Perl has absolutely nothing
> to do with that.  You enter the Unicode code point as hex, and if the
> font supports it, you get the character.  What Paul is arguing is that
> entering any character, non-ASCII or ASCII, as a hex code point or as
> an Alt+digits sequence, is a non-starter for our audience.  Much as
> I'd like to disagree, I can't.

Back when I used a single codepage (IBM OEM, now called 437) and 256
characters, it wasn't unreasonable to memorize the alt-codes for most
of those characters. I could do all the single-line and double-line
characters from memory (might take me a couple of tries to get the
right corner), and if I needed to mix line types, I could just look
those up. But with all of Unicode? Totally impractical. You can't
expect people to use the hex codes.
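
To put rough numbers on that, here is a small Python sketch.  It assumes
that an alt-code is simply a byte value in code page 437, which is how I
remember it working; the helper name alt is made up.

```python
import sys


def alt(code):
    """Return the CP437 character for a decimal alt-code (0-255)."""
    return bytes([code]).decode("cp437")


# Three of the double-line box characters: corner, horizontal, corner.
box_top = alt(201) + alt(205) + alt(187)

# One code page has 256 slots; Unicode has this many code points.
code_page_size = 256
unicode_code_points = sys.maxunicode + 1
```

256 codes can be memorised by someone who uses them daily; 0x110000 code
points cannot.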

ChrisA


Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Stephen J. Turnbull
Steven D'Aprano writes:

 > I see that Perl is leading the way here, supporting a large number of 
 > Unicode symbols:
 > 
 > https://docs.perl6.org/language/unicode_entry.html

In what sense is that "support"?  What I see on that page is a lot of
advice for the kind of people who are already using non-ASCII in
Python, as I have been doing since 2001 or so.

 > I must say that it is kinda cute that Perl6 does the right thing for x².

Uh, as far as I can tell from that page, Perl has absolutely nothing
to do with that.  You enter the Unicode code point as hex, and if the
font supports it, you get the character.  What Paul is arguing is that
entering any character, non-ASCII or ASCII, as a hex code point or as
an Alt+digits sequence, is a non-starter for our audience.  Much as
I'd like to disagree, I can't.



Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Steven D'Aprano
On Mon, Oct 31, 2016 at 12:02:54AM +1000, Nick Coghlan wrote:

> What this means is that there aren't likely to be many practical gains
> in using the "right" symbol for something, even when it's already
> defined in Unicode, as we expect the number of people learning that
> symbology *before* learning Python to be dramatically smaller than the
> proportion learning Python first and the formal mathematical symbols
> later (if they learn them at all).

Depends on the symbol. Most people do study maths in secondary school 
where they will be introduced to symbols beyond the ASCII + - * / etc, 
for instance set union and intersection ∪ ∩, although your point 
certainly applies to some of the more exotic (even for mathematicians) 
symbols in Unicode.
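
For reference, this is how those two operations are spelled in Python
today, next to the names Unicode gives the school-maths symbols.  The
sample sets are arbitrary.

```python
import unicodedata

evens = {0, 2, 4}
small = {0, 1, 2}

union = evens | small          # written ∪ in maths
intersection = evens & small   # written ∩ in maths

# Official Unicode names of the two symbols.
symbol_names = [unicodedata.name(ch) for ch in "∪∩"]
```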


> This means that instead of placing more stringent requirements on
> editing environments for Python source code in order to use non-ASCII
> input symbols, we're still far more likely to look to define a
> suitable keyword, or assign a relatively arbitrary meaning to an ASCII
> punctuation symbol

Indeed. But there's only so many ASCII punctuation marks, and digraphs 
and trigraphs can become tiresome. And once people have solved the 
keyboard entry issue, it is no harder to memorise the "correct" symbol 
than some arbitrary ASCII sequence.


> (and that's assuming we accept that a proposal will
> see sufficient use to be worthy of new syntax in the first place,
> which is far from being a given).

I see that Perl is leading the way here, supporting a large number of 
Unicode symbols:

https://docs.perl6.org/language/unicode_entry.html

I must say that it is kinda cute that Perl6 does the right thing for x².



-- 
Steve

Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Stephen J. Turnbull
Paul Moore writes:

 > My point wasn't so much about dealing with the character set of
 > Unicode, as it was about physical entry of non-native text. For
 > example, on my (UK) keyboard, all of the printed keycaps are basically
 > used.

How do you type the pound sign and the Euro sign?  Are they on the UK
keyboard?  Or are you not in the UK and don't need them?

 > And yet, I can't even enter accented letters from latin-1 with a
 > standard keypress, much less extended Unicode.

I'm pretty sure you can, but since I've been Windows-free for 20 years
(except for a short period when I was treasurer for an NGO, and only
used it to access the accounting system), I can't tell you what it is.
On the Mac, you press alt/option plus a graphic key.  Most result in
what somebody decided are common non-ASCII characters (German sharp S,
Greek lowercase mu, Greek upper- and lowercase sigma), but several are
dead keys, producing accented characters when combined with a base
character: tilde, accents acute and grave, and so on.  Surely Windows
has a similar system (I don't mean Alt+digits).  (But maybe not, I
didn't notice one in my brief Googling.)

 > My interest in East Asian experience is at least in part because
 > the "normal" character sets, as I understand it, are big enough
 > that it's impractical for a keyboard to include a plausible basic
 > range of characters, so I'm curious as to what the physical process
 > is for typing from a vocabulary of thousands of characters on a
 > sanely-sized keyboard.

You're right about the size.  Korean is special, because the 11,000-
odd Hangul are phonetic and generated algorithmically from a set of
about 70 phonetic partial glyphs, divided into three groups.  The same
keys do multiple duty when typed in phonetic order.  Other systems use
the shift key.
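
The "generated algorithmically" part can be sketched in a few lines of
Python.  The arithmetic is the one given in the Unicode Standard for
precomposed Hangul syllables: 19 leading consonants, 21 vowels, and 27
trailing consonants plus "no trailing consonant", i.e. 67 partial glyphs
and 19 * 21 * 28 = 11,172 syllables.  The function name and the index
convention (0-based positions within each group) are my own.

```python
import unicodedata

HANGUL_BASE = 0xAC00
LEADS, VOWELS, TAILS = 19, 21, 28  # TAILS includes "no final consonant"


def compose_hangul(lead, vowel, tail=0):
    """Combine 0-based jamo indices into one precomposed syllable."""
    if not (0 <= lead < LEADS and 0 <= vowel < VOWELS and 0 <= tail < TAILS):
        raise ValueError("jamo index out of range")
    return chr(HANGUL_BASE + (lead * VOWELS + vowel) * TAILS + tail)


syllable_count = LEADS * VOWELS * TAILS

# Lead index 18, vowel index 0, tail index 4 gives the syllable "han".
han = compose_hangul(18, 0, 4)

# The same syllable typed as three conjoining jamo, then normalised.
han_from_jamo = unicodedata.normalize("NFC", "\u1112\u1161\u11ab")
```

An input method only has to track which of the three groups the next
keystroke belongs to, which is why the same keys can do multiple duty.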

For the 100,000 Han ideographs[1], there are a wide variety of methods
for entry by key sequence, ranging from code point entry to
context-dependent phonetic entry of entire sentences as they would be
spoken.  Then, of course, there's voice recognition, and handwriting
recognition (both static from the image, and dynamic, taking account
of the order of pen strokes).

The more advanced input methods not only take account of grammar, but
also learn the users' habits, remember recent conversions, and predict
coming keystrokes based on current context, offering several
conversions based on plausible continuations.

 > In mentioning emoji, my main point was that "average computer
 > users" are more and more likely to want to use emoji in general
 > applications (emails, web applications, even documents) - and if a
 > sufficiently general solution for that problem is found, it may
 > provide a solution for the general character-entry case.

Not for the Asian languages.  For them, "character entry" in the sense
of character-by-character has long since been obsoleted by predictive
sentence-level phonetic methods.

But emoji are a perfect example for the present purpose, since they
don't have standard pronunciations (although probably many will get
them based on the Unicode standard names).  On systems with high-
enough resolution displays, a palette showing the glyphs is the
obvious solution.  But that's not pleasant if you type quickly and
need those characters frequently.  I don't think there's an
alternative for emoji though, except for personalized shortcut maps.
Math symbols are similar, I think.

 > Coming back to a more mundane example, if I need to type a character
 > like é in an email, I currently need to reach for Character Map and
 > cut and paste it. The same is true if I have to type it into the
 > console.

You probably have Control, Windows, Menu, Alt, and maybe a "function"
key.  If you're lucky, one labelled AltGr for "Alternate Graphic" is
the obvious suspect.  Some combination of the above probably allows
entry of accented Latin-1 characters, miscellaneous Latin-1 (eg, sharp
S), and a few oddballs (Greek letters, ligatures like oe, the
lemniscate, usually read as infinity).

 > That's a sufficiently annoying stumbling block

It very well could be, although my Windows Google-fu isn't great.
But this is what I found.

For WHITE SQUARE, the Mac doesn't have a keyboard equivalent, but
there's a standard way to set up a set of shortcut keys[2]:
http://stackoverflow.com/questions/3685146/how-do-you-do-the-therefore-%E2%88%B4-symbol-on-a-mac-or-in-textmate
And I think you can also use the "Input Preferences" screen in System
Preferences to set up a few of them.

For Windows, it seems that Alt+decimal character codes, or hex Unicode
followed by Alt+x are the built-in ways to enter characters not on
your keyboard.  It's also possible to set up "Math Autocorrect" to
automatically convert key sequences according to
https://blogs.msdn.microsoft.com/murrays/2011/08/29/sans-serif-mathematical-symbols/
but that's hardly obvious (although maybe it is if you're Dutch?)
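
If you go the hex-plus-Alt+x route, you need the code point first.  A
throwaway Python helper can print it; the function names are made up, and
whether Alt+x actually works depends on the application, which I haven't
been able to check.

```python
import unicodedata


def alt_x_code(ch):
    """Hex code point to type before pressing Alt+x."""
    return format(ord(ch), "04X")


def describe(ch):
    """Code point plus the official Unicode name, for a cheat sheet."""
    return f"{alt_x_code(ch)} {unicodedata.name(ch)}"


cheat_sheet = [describe(ch) for ch in "□∴∅"]
```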

I have to wonder why so many people stick with a system that 

Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Mikhail V
Steven D'Aprano wrote:

> I cannot wait for the day that we can use non-ASCII operators. But I
> don't think that day has come: it is still too hard for many people
> (including me) to generate non-ASCII characters at the keyboard, and
> font support for some of the more useful ones is still inconsistent or
> lacking.

> For example, we don't have a good literal for empty sets. How about ∅?
> Sadly, in my mail client and in the Python REPL, it displays as a
> "missing glyph" open rectangle. And how would you type it?

I will just share my view on the whole problem, trying to
concentrate more on how the code actually looks.
So I see it all as a chain of big steps, roughly:

1. One defines *the real code* or syntax, this means:
  One takes a pen and a paper (photoshop/paint bucket)
  and *defines* the syntax, this means one defines everything
  as it is, including pixel precise spaces between operators,
  punctuation and so on.

2. One develops an application (IDE) which enables you to
  automatically load code file and (at least) view it
  *exactly* as you have defined it.

3. Only after that one starts to think about ASCII/unicode/Hangul
  (forgive me Lord) or whatever someone has defined as something
  useful/standard.

> Java, I believe, allows you to enter escape sequences in source code,
> not just in strings. So we could hypothetically allow one of:
>
>     myobject\N{WHITE SQUARE}attribute
>     myobject\u25a1attribute
>
> as a pure-ASCII way of getting
>
>     myobject□attribute


So this would actually be a possible kind of "bridge" from the real code
to what shows up in an arbitrary text editor or mail client.
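
Such a bridge is easy to sketch in Python as a pair of text filters.  This
is only an illustration of the idea quoted above, not a proposal for how
the tokenizer would do it: the function names are invented, and the
reverse direction handles only the \N{...} form.

```python
import re
import unicodedata


def to_ascii_escapes(source):
    """Replace each non-ASCII character with a \\N{NAME} escape."""
    out = []
    for ch in source:
        if ord(ch) < 128:
            out.append(ch)
            continue
        name = unicodedata.name(ch, None)
        # Unnamed characters fall back to a numeric escape.
        out.append("\\N{%s}" % name if name else "\\U%08x" % ord(ch))
    return "".join(out)


_NAMED_ESCAPE = re.compile(r"\\N\{([^}]+)\}")


def from_ascii_escapes(text):
    """Turn \\N{NAME} escapes back into the characters they name."""
    return _NAMED_ESCAPE.sub(
        lambda match: unicodedata.lookup(match.group(1)), text
    )


ascii_form = to_ascii_escapes("myobject□attribute")
round_trip = from_ascii_escapes(ascii_form)
```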

In other words, you believe that you'll find something in the Unicode
table that is useful for defining code, but I personally would not even
start relying on that, not least because it is merely bottom-up problem
solving.


Mikhail

Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Matthias Bussonnier
Hi all,

For those of you not aware, the Julia Programming Language [1] does
make extensive use of (mathematical) Unicode symbols in its standard
library, and even documents an input method [2] (hint: tab completion).
They go even further by recognizing some characters (like \oplus) that
parse as operators and have predefined precedences, but no
implementations, leaving them available to the user [3].
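
Roughly what that tab-completion step amounts to, sketched in Python
rather than Julia.  The table below is a tiny hand-picked subset invented
for illustration, not Julia's actual list, and complete_symbol is a
made-up name.

```python
import unicodedata

# Hypothetical completion table: LaTeX-style name -> Unicode character.
COMPLETIONS = {
    "\\oplus": unicodedata.lookup("CIRCLED PLUS"),
    "\\alpha": unicodedata.lookup("GREEK SMALL LETTER ALPHA"),
    "\\emptyset": unicodedata.lookup("EMPTY SET"),
}


def complete_symbol(typed):
    """Return the symbol for a known name, or the input unchanged."""
    return COMPLETIONS.get(typed, typed)


oplus = complete_symbol("\\oplus")
unknown = complete_symbol("\\nosuchname")
```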

Regardless of my personal feelings about that, I have observed that
this does not seem to hinder Julia development. Many developers seem
to like it a lot, though my sampling is heavily biased toward
developers with a strong math background.

So it might be a case study to actually see how this affects an
existing language both technically and community-wide.

Cheers,
-- 
M


[1] : julialang.org
[2] : http://docs.julialang.org/en/release-0.5/manual/unicode-input/
[3] : http://docs.julialang.org/en/release-0.5/manual/variables/#allowed-variable-names

On Sun, Oct 30, 2016 at 7:02 AM, Nick Coghlan  wrote:
> On 30 October 2016 at 23:39, Paul Moore  wrote:
>> It's certainly not difficult, in principle. I have (had, I lost it in
>> an upgrade recently...) a little AutoHotkey program that interpreted
>> Vim-style digraphs in any application that needed them. But my point
>> was that we don't want to require people to write such custom
>> utilities, just to be able to write Python code. Or is the feeling
>> that it's acceptable to require that?
>
> Getting folks used to the idea that they need to use the correct kinds
> of quotes is already challenging :)
>
> However, the main issue is the one I mentioned in PEP 531 regarding
> the "THERE EXISTS" symbol: Python and other programming languages
> re-use "+", "-", "=" etc because a lot of folks are already familiar
> with them from learning basic arithmetic. Other symbols are used in
> Python because they were inherited from C, or were relatively
> straightforward puns on such previously inherited symbols.
>
> What this means is that there aren't likely to be many practical gains
> in using the "right" symbol for something, even when it's already
> defined in Unicode, as we expect the number of people learning that
> symbology *before* learning Python to be dramatically smaller than the
> proportion learning Python first and the formal mathematical symbols
> later (if they learn them at all).
>
> This means that instead of placing more stringent requirements on
> editing environments for Python source code in order to use non-ASCII
> input symbols, we're still far more likely to look to define a
> suitable keyword, or assign a relatively arbitrary meaning to an ASCII
> punctuation symbol (and that's assuming we accept that a proposal will
> see sufficient use to be worthy of new syntax in the first place,
> which is far from being a given).
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Alexandre Brault

On 2016-10-30 10:47 AM, Paul Moore wrote:
> On 30 October 2016 at 14:43,   wrote:
>> Just picking a nit, here, windows will happily let you do silly things like
>> hook 14 keyboards up and let you map all of emoji to them.  Sadly, this
>> requires lua.
>
> Off topic, I know, but how? I have a laptop with an external and an
> internal keyboard. Can I map the internal keyboard to different
> characters somehow?

Look up "The Art of the Bodge: How I Made the Emoji Keyboard" by Tom 
Scott on Youtube. As the name implies, it's a huge hack with no 
practicality whatsoever

Alex


Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Paul Moore
On 30 October 2016 at 14:43,   wrote:
> Just picking a nit, here, windows will happily let you do silly things like 
> hook 14 keyboards up and let you map all of emoji to them.  Sadly, this 
> requires lua.

Off topic, I know, but how? I have a laptop with an external and an
internal keyboard. Can I map the internal keyboard to different
characters somehow?
Paul


Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread tritium-list


> -Original Message-
> From: Python-ideas [mailto:python-ideas-bounces+tritium-
> list=sdamon@python.org] On Behalf Of Paul Moore
> Sent: Sunday, October 30, 2016 8:22 AM
> To: Stephen J. Turnbull <turnbull.stephen...@u.tsukuba.ac.jp>
> Cc: Python-Ideas <python-ideas@python.org>
> Subject: Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null
> coalescing operator]
> 
> 
> My point wasn't so much about dealing with the character set of
> Unicode, as it was about physical entry of non-native text. For
> example, on my (UK) keyboard, all of the printed keycaps are basically
> used. And yet, I can't even enter accented letters from latin-1 with a
> standard keypress, much less extended Unicode. Of course it's possible
> to get those characters (either by specialised mappings in an editor,
> or by using an application like Character Map) but there's nothing
> guaranteed to work across all applications. That's a hardware and OS
> limitation - the hardware only has so many keys to use, and the OS
> (Windows, in my case) doesn't support global key mapping (at least not
> to my knowledge, in a user-friendly manner - I'm excluding writing my
> own keyboard driver :-)) My interest in East Asian experience is at
> least in part because the "normal" character sets, as I understand it,
> are big enough that it's impractical for a keyboard to include a
> plausible basic range of characters, so I'm curious as to what the
> physical process is for typing from a vocabulary of thousands of
> characters on a sanely-sized keyboard.
> 

Just picking a nit here: Windows will happily let you do silly things like 
hook 14 keyboards up and let you map all of emoji to them.  Sadly, this 
requires Lua.

> In mentioning emoji, my main point was that "average computer users"
> are more and more likely to want to use emoji in general applications
> (emails, web applications, even documents) - and if a sufficiently
> general solution for that problem is found, it may provide a solution
> for the general character-entry case. (Also, I couldn't resist the
> irony of using a :-) smiley while referring to emoji...) But it may be
> that app-specific solutions (e.g., the smiley menu in Skype) are
> sufficient for that use case. Or the typical emoji user is likely to
> be using a tablet/phone rather than a keyboard, and mobile OSes have
> included an emoji menu in their on-screen keyboards.
> 
> Coming back to a more mundane example, if I need to type a character
> like é in an email, I currently need to reach for Character Map and
> cut and paste it. The same is true if I have to type it into the
> console. That's a sufficiently annoying stumbling block that I'm
> inclined to avoid it - using clumsy workarounds like referring to "the
> OP" rather than using their name. I'd be fairly concerned about
> introducing non-ASCII syntax into Python while such stumbling blocks
> remain - the amount of code typed outside of an editor (interactive
> prompt, emails, web applications like Jupyter) mean that editor-based
> workarounds like custom mappings are only a partial solution.
> 
> But maybe you are right, and it's just my age showing. The fate of APL
> probably isn't that relevant these days :-) (or ☺ if you prefer...)
> 
> Paul

Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Chris Angelico
On Mon, Oct 31, 2016 at 12:39 AM, Paul Moore  wrote:
> It's certainly not difficult, in principle. I have (had, I lost it in
> an upgrade recently...) a little AutoHotkey program that interpreted
> Vim-style digraphs in any application that needed them. But my point
> was that we don't want to require people to write such custom
> utilities, just to be able to write Python code. Or is the feeling
> that it's acceptable to require that?

There's a chicken-and-egg problem. So long as most people don't have
tools like that, a language that requires them is going to be very
annoying - but so long as no major language uses such characters,
there's no reason for developers to set up those kinds of tools.

Possibly the best way is a gentle introduction of alternative
syntaxes. Since Python currently has no "empty set display" syntax,
that seems like a perfect starting point. You can always type "set()",
but that involves an actual function call; using ∅ gives a small
performance boost, eliminates the risk of shadowing, etc, etc. All
minor points, but could be convenient enough. Also, if repr(set())
returns "∅", it'll be easy for anyone to get hold of the character for
copy/paste.
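
A minimal sketch of the points above, in present-day Python where no ∅ syntax exists. The function names `uses_builtin` and `shadowed` are made up for illustration. It shows that `{}` is a dict rather than a set, that `set()` is an ordinary name lookup that can be shadowed, and what `repr(set())` returns today.

```python
# {} is an empty dict, so there is currently no display syntax for an empty set.
assert type({}) is dict

# Today the repr of an empty set is the constructor call, not a symbol.
EMPTY_REPR = repr(set())


def uses_builtin():
    # "set" is resolved by name when the function runs.
    return set()


def shadowed():
    # A local binding called "set" hides the builtin inside this function.
    set = list
    return set()


print(EMPTY_REPR)                      # the text "set()"
print(type(uses_builtin()).__name__)   # set
print(type(shadowed()).__name__)       # list
```

A dedicated display syntax would not go through a name lookup at all, which is where both the shadowing argument and the small speed argument come from.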

As of 2016, I think it's not acceptable to *require* this, but it may
be time to start making use of it, retaining ASCII-only digraphs and
trigraphs, the way C has alternative spellings for braces and so on.
Then, as time passes, most people will become comfortable using the
characters themselves, and the digraphs/trigraphs can be deprecated,
with new syntax not being given any.

Pipe dream?

ChrisA

Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Nick Coghlan
On 30 October 2016 at 23:39, Paul Moore  wrote:
> It's certainly not difficult, in principle. I have (had, I lost it in
> an upgrade recently...) a little AutoHotkey program that interpreted
> Vim-style digraphs in any application that needed them. But my point
> was that we don't want to require people to write such custom
> utilities, just to be able to write Python code. Or is the feeling
> that it's acceptable to require that?

Getting folks used to the idea that they need to use the correct kinds
of quotes is already challenging :)
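
As a small illustration of the quotes problem (a sketch only; the helper name `compiles` is made up here), Python rejects typographic quotes where straight ASCII quotes are required:

```python
def compiles(source):
    """Return True if Python accepts the expression, False on SyntaxError."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


straight = 'len("abc")'
# The same text with LEFT/RIGHT DOUBLE QUOTATION MARK (U+201C, U+201D),
# as a word processor or chat client might produce.
curly = "len(\u201cabc\u201d)"

print(compiles(straight))  # True
print(compiles(curly))     # False
```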

However, the main issue is the one I mentioned in PEP 531 regarding
the "THERE EXISTS" symbol: Python and other programming languages
re-use "+", "-", "=" etc because a lot of folks are already familiar
with them from learning basic arithmetic. Other symbols are used in
Python because they were inherited from C, or were relatively
straightforward puns on such previously inherited symbols.

What this means is that there aren't likely to be many practical gains
in using the "right" symbol for something, even when it's already
defined in Unicode, as we expect the number of people learning that
symbology *before* learning Python to be dramatically smaller than the
number learning Python first and the formal mathematical symbols
later (if they learn them at all).

This means that instead of placing more stringent requirements on
editing environments for Python source code in order to use non-ASCII
input symbols, we're still far more likely to look to define a
suitable keyword, or assign a relatively arbitrary meaning to an ASCII
punctuation symbol (and that's assuming we accept that a proposal will
see sufficient use to be worthy of new syntax in the first place,
which is far from being a given).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Paul Moore
On 30 October 2016 at 12:31, Chris Angelico  wrote:
> On Sun, Oct 30, 2016 at 11:22 PM, Paul Moore  wrote:
>> In mentioning emoji, my main point was that "average computer users"
>> are more and more likely to want to use emoji in general applications
>> (emails, web applications, even documents) - and if a sufficiently
>> general solution for that problem is found, it may provide a solution
>> for the general character-entry case.
>
> Before Unicode emoji were prevalent, ASCII emoticons dominated, and
> it's not uncommon for multi-character sequences to be automatically
> transformed into their corresponding emoji. It isn't hard to set
> something up that does these kinds of transformations for other
> Unicode characters - use trigraphs for clarity, and type "/:0" to
> produce "∅". Or whatever's comfortable for you. Maybe rig it on
> Ctrl-Alt-0, if you prefer shift-key sequences.

It's certainly not difficult, in principle. I have (had, I lost it in
an upgrade recently...) a little AutoHotkey program that interpreted
Vim-style digraphs in any application that needed them. But my point
was that we don't want to require people to write such custom
utilities, just to be able to write Python code. Or is the feeling
that it's acceptable to require that?
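
For concreteness, the kind of transformation being discussed can be sketched in a few lines of Python. This is only an illustration, not the AutoHotkey program mentioned above: it rewrites text after the fact rather than hooking keyboard input, and every sequence in the table except "/:0" (taken from the quoted message) is an assumption made up for the example.

```python
# Hypothetical table of ASCII sequences and their replacements.
SEQUENCES = {
    "/:0": "\u2205",  # EMPTY SET, the example from the quoted message
    "/:e": "\u00e9",  # LATIN SMALL LETTER E WITH ACUTE (made-up sequence)
}


def expand(text, table=SEQUENCES):
    """Replace each known ASCII sequence in text with its character."""
    # Longer sequences are handled first so that a short sequence cannot
    # break up a longer one that contains it.
    for seq in sorted(table, key=len, reverse=True):
        text = text.replace(seq, table[seq])
    return text


print(expand("empty = /:0"))
```

The hard part is not this logic but making it work in every application, which is the point at issue.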

Paul

Re: [Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Paul Moore
On 30 October 2016 at 07:00, Stephen J. Turnbull
 wrote:
>> as I imagine Unicode characters would be for me. I really hope it
>> isn't...
>
> I think your imagination is running away with you.  While I understand
> how costly it is for those over the age of 12 to develop new habits
> (I'm 58, and painfully aware of how frequently I balk at learning
> anything new no matter how productivity-enhancing it is likely to be,
> and how much more slowly it becomes part of my repertoire), the number
> of new things you would need to learn would be few, and frequently
> enough used, at least in Python.  It's hard enough to get Guido (and
> the other Masters of Pythonic Language Design) to sign on to new ASCII
> syntax; even if in principle non-ASCII were to be admitted, I suspect
> the barrier there would be even higher.
>
> Most of Unicode is irrelevant to everybody.  Mathematicians use only a
> small fraction of the math notation available to them -- it's just
> that it's a different small fraction for each field.  The East Asians
> need a big chunk (I would guess that educated Chinese and Japanese
> encounter about 10,000 characters in "daily life" over a lifetime,
> while those encountered at least once a week number about 3000), but
> those that need to be memorized are a small minority (less than 5%) of
> the already defined Unicode repertoire.
>
> For Western programmers, the mechanics are almost certainly there.
> Every personal computer should have at least one font containing all
> characters defined in the Basic Multilingual Plane, and most will have
> chunks of the astral planes (emoji, rare math symbols, country flags,
> ...).  Even the Happy Hacker keyboard has enough mode keys (shift,
> control, ...) to allow defining "3-finger salutes" for commonly-used
> characters not on the keycaps -- in daily life if you don't need an
> input method now, you won't need one if Python decides to use WHITE
> SQUARE to represent an operation you frequently use -- just an extra
> "control key combo" like the editing control keys (eg, for copy, cut,
> paste, undo) that aren't marked on any keyboard I have.

My point wasn't so much about dealing with the character set of
Unicode, as it was about physical entry of non-native text. For
example, on my (UK) keyboard, all of the printed keycaps are basically
used. And yet, I can't even enter accented letters from Latin-1 with a
standard keypress, much less extended Unicode. Of course it's possible
to get those characters (either by specialised mappings in an editor,
or by using an application like Character Map) but there's nothing
guaranteed to work across all applications. That's a hardware and OS
limitation - the hardware only has so many keys to use, and the OS
(Windows, in my case) doesn't support global key mapping (at least not
to my knowledge, in a user-friendly manner - I'm excluding writing my
own keyboard driver :-)). My interest in East Asian experience is at
least in part because the "normal" character sets, as I understand it,
are big enough that it's impractical for a keyboard to include a
plausible basic range of characters, so I'm curious as to what the
physical process is for typing from a vocabulary of thousands of
characters on a sanely-sized keyboard.

In mentioning emoji, my main point was that "average computer users"
are more and more likely to want to use emoji in general applications
(emails, web applications, even documents) - and if a sufficiently
general solution for that problem is found, it may provide a solution
for the general character-entry case. (Also, I couldn't resist the
irony of using a :-) smiley while referring to emoji...) But it may be
that app-specific solutions (e.g., the smiley menu in Skype) are
sufficient for that use case. Or the typical emoji user is likely to
be using a tablet/phone rather than a keyboard, and mobile OSes have
included an emoji menu in their on-screen keyboards.

Coming back to a more mundane example, if I need to type a character
like é in an email, I currently need to reach for Character Map and
cut and paste it. The same is true if I have to type it into the
console. That's a sufficiently annoying stumbling block that I'm
inclined to avoid it - using clumsy workarounds like referring to "the
OP" rather than using their name. I'd be fairly concerned about
introducing non-ASCII syntax into Python while such stumbling blocks
remain - the amount of code typed outside of an editor (interactive
prompt, emails, web applications like Jupyter) means that editor-based
workarounds like custom mappings are only a partial solution.
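
For reference, Python string literals already offer ASCII-only spellings for a character like é, which work at the console as well as in a file. A minimal sketch (the variable names are arbitrary):

```python
import unicodedata

# Three ASCII-only ways to write the same character inside a string.
by_hex = "\u00e9"
by_name = "\N{LATIN SMALL LETTER E WITH ACUTE}"
by_code = chr(0xE9)

assert by_hex == by_name == by_code

print(by_hex, unicodedata.name(by_hex))
```

These escapes only apply inside string literals, so they would not help with typing a non-ASCII operator or other syntax.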

But maybe you are right, and it's just my age showing. The fate of APL
probably isn't that relevant these days :-) (or ☺ if you prefer...)

Paul

[Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

2016-10-30 Thread Stephen J. Turnbull
Paul Moore writes:
 > On 29 October 2016 at 18:19, Stephen J. Turnbull
 >  wrote:
 > >> For better or worse, it may be emoji that drive that change ;-)
 > >
 > > I suspect that the 100 million or so Chinese, Japanese, Korean, and
 > > Indian programmers who have had systems that have no trouble
 > > whatsoever handling non-ASCII for as long as they've used computers will
 > > drive that change.
 > 
 > My apologies. You are of course absolutely right.

tl;dr: A quick apology for the snark, and an attempt at FUD reduction.
Using non-ASCII characters will involve some cost, but there are real
benefits, and the fear and loathing often evoked by the prospect is
unnecessary.  I'm not ready to advocate introduction *right* now, but
"never" isn't acceptable either. :-)

On with the show:

"Absolutely" is more than I deserve, as I was being a bit snarky.

That said, Ed Yourdon wrote a book in 1990 or so with the
self-promoting title of "Decline and Fall of the American
Programmer"[1] in which he argued that for many kinds of software
outsourcing to China, India, or Ireland got you faster, better,
cheaper, and internationalized, with no tradeoffs.  (The "and
internationalized" is my hobby horse, it wasn't part of Yourdon's
thesis.)  He later recanted the extremist doomsaying, but a quick
review of the fraction of H1B visas granted to Asian-origin
programmers should convince you that USA/EUR/ANZ doesn't have a
monopoly on good-to-great programming (probably never did, but that's
a topic for a different thread).  Also note that in Japan, grouping
developers only by the programming language they use most frequently
and not controlling for other factors, Python programmers are the
highest paid among developers in all languages with more than 1% of
the sample (and yes, that includes COBOL!).  To the extent that
internationalization matters to a particular kind of programming,
these programmers are better placed for those jobs, I think.  And
while in many cases "on site" has a big advantage (so you can't
telecommute from Bangalore, you need that H1B, which is available only
in rather restricted numbers), more and more outsourcing does cross
oceans, so potential competition is immense.

There is a benefit to increasing our internationalization in backward-
incompatible ways.  And that benefit is increasing both in magnitude
and in the number of Python developers who will receive it.

 > I'm curious to know how easy it is for Chinese, Japanese, Korean and
 > Indian programmers to use *ASCII* characters. I have no idea in
 > practice whether the current basically entirely-ASCII nature of
 > programming languages is as much a problem for them

Characters are zero problem for them.  The East Asian national
standards all include the ASCII repertoire, and some device (usually
based on ISO 2022 coding extensions rather than UTF-8) for allowing
ASCII to be one-byte, even if the "local" characters require two or
more bytes.  I forget if India's original national standard also
included an ASCII subset, but they switched over to Unicode quite
early[2], so UTF-8 does the trick for them.  English (the language) is a
much bigger issue.
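
The one-byte-ASCII property is easy to check with codecs that ship with CPython. The sketch below is only an illustration: the choice of codecs and sample characters is an assumption for the example, and the EUC and Big5 codecs used here are 8-bit encodings rather than the escape-sequence form of ISO 2022.

```python
# One sample "local" character per codec (hypothetical choices).
SAMPLES = {
    "euc_jp": "\u65e5",  # Japanese kanji
    "gb2312": "\u4e2d",  # simplified Chinese hanzi
    "big5": "\u4e2d",    # traditional Chinese hanzi
    "euc_kr": "\ud55c",  # Korean hangul syllable
    "utf-8": "\u65e5",   # same kanji, for comparison
}


def byte_lengths(codec, local_char):
    """Return (bytes for ASCII 'A', bytes for the local character)."""
    return len("A".encode(codec)), len(local_char.encode(codec))


for codec, char in SAMPLES.items():
    print(codec, byte_lengths(codec, char))
```

In each of the national encodings ASCII stays at one byte and the local character takes two; in UTF-8 the same kanji takes three.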

Most Indians, of course, have little trouble with the derived-from-
English nature of much programming syntax and library identifiers, and
the Asians all get enough training in both (very) basic English and
rote memorization that handling English-derived syntax and library
nomenclature is not a problem.

However, reading and especially creating documentation can be
expensive and inaccurate.  At least in Japanese, "straightforward"
translations are often poor, as nuances are lost.  E.g., a literal
Japanese translation from English requires many words to indicate the
differences a simple "a" vs. "the" vs. "some" indicates in English.
Mostly such nuances can be expressed economically by restructuring a
whole paragraph, but translators rarely bother and often seem unaware
of the issues.  Many Japanese programmers' use of articles is literally
chaotic: it's deterministic but appears random to all but the most
careful analysis.[3]

 > as I imagine Unicode characters would be for me. I really hope it
 > isn't...

I think your imagination is running away with you.  While I understand
how costly it is for those over the age of 12 to develop new habits
(I'm 58, and painfully aware of how frequently I balk at learning
anything new no matter how productivity-enhancing it is likely to be,
and how much more slowly it becomes part of my repertoire), the number
of new things you would need to learn would be few, and frequently
enough used, at least in Python.  It's hard enough to get Guido (and
the other Masters of Pythonic Language Design) to sign on to new ASCII
syntax; even if in principle non-ASCII were to be admitted, I suspect
the barrier there would be even higher.

Most of Unicode is irrelevant to everybody.  Mathematicians use only a
small fraction of the math notation available to them -- it's just
that it's