Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-12-09 Thread Thomas Jollans
On 10/12/17 02:42, Steve D'Aprano wrote:
> On Sun, 10 Dec 2017 09:20 am, Terry Reedy wrote:
> 
>> On 12/9/2017 5:57 AM, Gilmeh Serda wrote:
>>
>>> And next demands to allow Unicode as keywords in a translated version of
>>> Python
>>
>> Python's liberal open source license allows people to revise and
>> distribute their own python or python-like interpreters.  I believe
>> there are already a couple of non-english versions aimed at schoolkids.
>> The cpython core developer group has nothing to do with such.
> 
> I don't know who their target audiences are, but there's a German and Chinese
> version of Python.
> 
> http://www.fiber-space.de/EasyExtend/doc/teuton/teuton.htm
> 
> http://www.chinesepython.org/english/english.html
> 
> My *guess* is that Teuton is intended as a proof-of-concept just to show it
> can be done,

That's certainly what it looks like, though the documentation page reads
like a bad joke (or a once-mediocre one that really hasn't aged well).

Joke or not, it does appear to be real, however, and a part of this long
dead and forgotten package: https://pypi.python.org/pypi/EasyExtend/3.0.1

-- Thomas
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-12-09 Thread Steve D'Aprano
On Sun, 10 Dec 2017 09:20 am, Terry Reedy wrote:

> On 12/9/2017 5:57 AM, Gilmeh Serda wrote:
> 
>> And next demands to allow Unicode as keywords in a translated version of
>> Python
> 
> Python's liberal open source license allows people to revise and
> distribute their own python or python-like interpreters.  I believe
> there are already a couple of non-english versions aimed at schoolkids.
> The cpython core developer group has nothing to do with such.

I don't know who their target audiences are, but there's a German and Chinese
version of Python.

http://www.fiber-space.de/EasyExtend/doc/teuton/teuton.htm

http://www.chinesepython.org/english/english.html

My *guess* is that Teuton is intended as a proof-of-concept just to show it
can be done, and ChinesePython is intended for students.




-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-12-09 Thread Terry Reedy

On 12/9/2017 5:57 AM, Gilmeh Serda wrote:

> And next demands to allow Unicode as keywords in a translated version of
> Python

Python's liberal open source license allows people to revise and 
distribute their own python or python-like interpreters.  I believe 
there are already a couple of non-english versions aimed at schoolkids. 
The cpython core developer group has nothing to do with such.


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator

2017-11-27 Thread breamoreboy
On Monday, November 27, 2017 at 10:08:06 PM UTC, wxjmfauth wrote:
> On Monday, 27 November 2017 at 14:52:19 UTC+1, Rustom Mody wrote:
> > On Monday, November 27, 2017 at 6:48:56 PM UTC+5:30, Rustom Mody wrote:
> > > Having said that I should be honest to mention that I saw your post first on
> > > my phone where the θ showed but the 횫 showed as a rectangle something like ⌧
> > >
> > > I suspect that Δ OTOH would have worked… dunno
> >
> > Yeah Δ shows whereas 횫 doesn't (on my phone)
> > And ⌧ does show but much squatter than the replacement char the phone shows
> > when it can't display a char
> 
> It is at least unicode.
> 
> Much better than what this idiotic and buggy Flexible String Representation is
> presenting to the eyes of users.

Why is this drivel now getting through onto the main mailing list/gmane?

--
Kindest regards.

Mark Lawrence.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-27 Thread Peter J. Holzer
On 2017-11-24 04:52:57 +0100, Mikhail V wrote:
> On Fri, Nov 24, 2017 at 4:13 AM, Chris Angelico  wrote:
> > On Fri, Nov 24, 2017 at 1:44 PM, Mikhail V  wrote:
> >> From my above example, you could probably see that I prefer somewhat
> >> middle-sized identifiers, one-two syllables. And naturally, they tend to
> >> reflect some process/meaning, it is not always achievable,
> >> but yes there is such a natural tendency, although by me personally
> >> not so strong, and quite often I use totally meaningless names,
> >> mainly to avoid visual similarity to already created names.
> >> So for very expanded names, it ends up with a lot of underscores :(
> >
> > Okay. So if it makes sense for you to use English words instead of
> > individual letters, since you are fluent in English, does it stand to
> > reason that it would make sense for other programmers to use Russian,
> > Norwegian, Hebrew, Korean, or Japanese words the same way?
> 
> I don't know. Probably, especially if those *programmers* don't know latin
> letters, then they would want to write code with their letters and their
> language. This target group, as I said, will have really hard time
> with programming,

I don't think that's the target group. As you say, if you don't know
latin letters you'll have a hard time with Python (or almost any
programming language). You can't read the keywords or the standard
library function names.

I think the target group is people who can read the latin alphabet and
probably also at least a bit of English, but who are working on in-house
projects.

As a very simple example, many years ago, when I was still at the
university, we decided that we needed a program to manage our students.
So we got some students to write one ;-). As a general rule, identifiers
and comments for all projects had to be in English, which generally made
a lot of sense since we collaborated with institutes in other countries.
But for that project that rule wasn't really appropriate, as we noticed
when one of the students asked us what "Matrikelnummer" is in English.
Nobody knew, so we consulted a dictionary and apparently it's
"enrollment number". Simple enough, but is that intelligible to all
English speakers or is it specific to British universities? And even
worse - whoever is going to maintain that code would be either a staff
member or a student of our institute - they would certainly know what a
"Matrikelnummer" is, but would they understand that enrollment_number is
supposed to contain that? So we decided that domain specific jargon
should not be translated. A bit of bilingual mishmash (first_name and
course_title, but matrikelnummer and kennummer) was better than using
words that nobody understood.

Now that particular word doesn't contain any non-ASCII characters and
German has only 4 letters not in ASCII, and for all of them there are
official ASCII substitutes, so writing German words in ASCII isn't a
problem.

But for languages with non-latin alphabets (or just a higher density of
accented letters) that's different. If my native language was Russian
and I was writing some in-house application for a Russian company which
contained a lot of Russian company jargon which can't be easily
translated to English (and back), I'm quite sure that I would prefer to
write that jargon in cyrillic and not in some transliteration.

> and in Python in particular, because they will be not only forced to learn
> some english, but also will have all 'pleasures' of  multi-script editing.
> But wait, probably one can write python code in, say Arabic script *only*?
> How about such feature proposal?

There is a source filter which lets you write Perl in traditional Chinese.
This even changes the syntax to be closer to Chinese syntax. There is
also one which lets you write Perl in Latin (obviously that uses the
Latin alphabet, but it changes the syntax even more). Don't know whether
something like this is possible in Python, but arguably the result
wouldn't be Python any more (just like Lingua::Romana::Perligata isn't
really Perl - it just happens to be implemented using the Perl
interpreter).


> Ok, so we return back to my original question: apart from
> ability to do so, how beneficial is it on a pragmatical basis?

When I use German identifiers (which I generally don't) I do use
umlauts. When I need to do some physical computations, I might use greek
letters (or maybe not - as a vim user I can type Δt easily enough, but
can the colleague using PyCharm on Windows? I have no idea). So for me
the benefit is rather small. But as I said, German is almost
ASCII-compatible.

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson 



Re: Benefits of unicode identifiers (was: Allow additional separator

2017-11-27 Thread wxjmfauth
On Monday, 27 November 2017 at 14:52:19 UTC+1, Rustom Mody wrote:
> On Monday, November 27, 2017 at 6:48:56 PM UTC+5:30, Rustom Mody wrote:
> > Having said that I should be honest to mention that I saw your post first on
> > my phone where the θ showed but the 횫 showed as a rectangle something like ⌧
> >
> > I suspect that Δ OTOH would have worked… dunno
>
> Yeah Δ shows whereas 횫 doesn't (on my phone)
> And ⌧ does show but much squatter than the replacement char the phone shows
> when it can't display a char

It is at least unicode.

Much better than what this idiotic and buggy Flexible String Representation is
presenting to the eyes of users.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-27 Thread Rustom Mody
On Monday, November 27, 2017 at 6:48:56 PM UTC+5:30, Rustom Mody wrote:
> Having said that I should be honest to mention that I saw your post first on
> my phone where the θ showed but the 횫 showed as a rectangle something like ⌧
> 
> I suspect that Δ OTOH would have worked… dunno

Yeah Δ shows whereas 횫 doesn't (on my phone)
And ⌧ does show but much squatter than the replacement char the phone shows
when it can't display a char
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-27 Thread Rustom Mody
On Monday, November 27, 2017 at 3:43:20 PM UTC+5:30, Antoon Pardon wrote:
> On 23-11-17 at 19:42, Mikhail V wrote:
> > Chris A wrote:
> >
> >>> On Fri, Nov 24, 2017 at 1:10 AM, Mikhail V wrote:
> >>>
> >>>> Chris A wrote:
> >>>>
> >>>> Fortunately for the world, you're not the one who decided which
> >>>> characters were permitted in Python identifiers. The ability to use
> >>>> non-English words for function/variable names is of huge value; the
> >>>> ability to use a hyphen is of some value, but not nearly as much.
> >>> Fortunately for the world we have Chris A. Who knows what is
> >>> fortunate and of huge values.
> >>> So is there any real world projects example of usage of non-latin scripts
> >>> in identifiers? Or is it still only a plan for the new world?
> >
> >> Yes, I've used them personally. And I know other people who have.
> >
> > Oh, I though it would be more impressive showcase for 'huge value'.
> > If we drop the benefit of the bare fact that you can do it, or you just
> > don't know English, how would describe the practical benefit?
> > If you don't know english, then programming at all will be just too hard.
> > (or one must define a new whole language specially for some local script)
> 
> Well maybe the value is not huge, but I really appreciate the possibility.
> Being able to write something like below, makes things a lot more clear
> for me.
> 
> Po = Pc + R * Vec(cos(θo), sin(θo))
> Pe = Pc + R * Vec(cos(θe), sin(θe))
> 횫θ = θe - θo
> 횫P = Pe - Po

Yeah… This is important
And I've tried to elaborate such suggestions here
http://blog.languager.org/2014/04/unicoded-python.html
[includes some of your suggestions!]
I should emphasize that the details there range between straightforward and
facetious.  The general sense of going beyond ASCII is not facetious at all.
In fact it's ridiculous in the reverse direction: just as FORTRAN and COBOL
believed that programming IN ALL UPPERCASE was somehow kosher, likewise
a 2017 language believing that sticking to ASCII is sound is faintly ridiculous.

But that brings me to the opposite point:
I feel it's important to distinguish ‘parochial/sectarian unicode’ from
‘universal unicode’.
More on the distinction http://blog.languager.org/2015/03/whimsical-unicode.html
More on the universal aspect: 
http://blog.languager.org/2015/02/universal-unicode.html

Having said that I should be honest to mention that I saw your post first on
my phone where the θ showed but the 횫 showed as a rectangle something like ⌧

I suspect that Δ OTOH would have worked… dunno

So yes, there can be non-trivial logistic problems going beyond ASCII,
just as there are problems with errant mail-clients transmitting
indentation-sensitive languages and so on!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-27 Thread Antoon Pardon
On 23-11-17 at 19:42, Mikhail V wrote:
> Chris A wrote:
>
>>> On Fri, Nov 24, 2017 at 1:10 AM, Mikhail V wrote:
>>>
>>>> Chris A wrote:
>>>>
>>>> Fortunately for the world, you're not the one who decided which
>>>> characters were permitted in Python identifiers. The ability to use
>>>> non-English words for function/variable names is of huge value; the
>>>> ability to use a hyphen is of some value, but not nearly as much.
>>> Fortunately for the world we have Chris A. Who knows what is
>>> fortunate and of huge values.
>>> So is there any real world projects example of usage of non-latin scripts
>>> in identifiers? Or is it still only a plan for the new world?
>
>> Yes, I've used them personally. And I know other people who have.
>
> Oh, I though it would be more impressive showcase for 'huge value'.
> If we drop the benefit of the bare fact that you can do it, or you just
> don't know English, how would describe the practical benefit?
> If you don't know english, then programming at all will be just too hard.
> (or one must define a new whole language specially for some local script)

Well maybe the value is not huge, but I really appreciate the possibility.
Being able to write something like below, makes things a lot more clear
for me.

Po = Pc + R * Vec(cos(θo), sin(θo))
Pe = Pc + R * Vec(cos(θe), sin(θe))
횫θ = θe - θo
횫P = Pe - Po
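
For anyone who wants to try the snippet above, here is a minimal runnable
sketch. It rests on several assumptions: Vec is a hypothetical 2-D vector
class (the post does not define one), the values of Pc, R and the angles are
made up, and the delta used here is the Greek capital letter U+0394.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec:
    """Hypothetical 2-D vector; the original post does not define Vec."""
    x: float
    y: float

    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)

    def __sub__(self, other):
        return Vec(self.x - other.x, self.y - other.y)

    def __rmul__(self, scalar):
        # Supports `R * Vec(...)` with the scalar on the left.
        return Vec(scalar * self.x, scalar * self.y)


Pc = Vec(0.0, 0.0)          # centre of the circle (assumed value)
R = 2.0                     # radius (assumed value)
θo, θe = 0.0, math.pi / 2   # start and end angles (assumed values)

Po = Pc + R * Vec(math.cos(θo), math.sin(θo))
Pe = Pc + R * Vec(math.cos(θe), math.sin(θe))
Δθ = θe - θo
ΔP = Pe - Po
print(Δθ, ΔP)
```

The non-ASCII names behave like any other identifiers; whether they are easy
to type or display depends on the editor and font in use.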

-- 
Antoon Pardon.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-25 Thread Chris Angelico
On Sat, Nov 25, 2017 at 11:33 PM, Rustom Mody  wrote:
> Personally I feel that there should be a law against languages that disallow
> the creation of magic tricks!¡!

I agree. The programming language should also ensure that your program
will terminate eventually, that it is bug-free (this can actually be
done in Python - all you have to do is type-annotate all your
functions to return the correct values), and that it is optimally
implemented.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-25 Thread Rustom Mody
On Saturday, November 25, 2017 at 6:03:52 PM UTC+5:30, Rustom Mody wrote:
> On Friday, November 24, 2017 at 12:20:29 AM UTC+5:30, Mikhail V wrote:
> > Ok, I personally could find some practical usage for that, but
> > merely for fun. I doubt though that someone with less
> > typographical experience and overall computer literacy could
> > really make benefits even for personal usage.
> > 
> > So - fun is one benefit. And fun is important. But is that the
> > idea behind it?
> 
> Are you under-estimating the fun-value? 
> 
> Python 3.5.3 (default, Sep 14 2017, 22:58:41) 
> [GCC 6.3.0 20170406] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> python.el: native completion setup loaded
> >>> A = 1
> >>> Α = 2
> >>> А = 3
> >>> (A, Α, А)
> (1, 2, 3)
> >>> # And there are 5 other variations on this magic trick
> >>> # Or if you prefer…
> >>> A == Α
> False
> >>> 
> 
> Now compare with the boring spoilsport called python 2:
> 
> Python 2.7.13 (default, Jan 19 2017, 14:48:08) 
> [GCC 6.3.0 20170118] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> python.el: native completion setup loaded
> >>> A = 1
> >>> Α = 2
>   File "<stdin>", line 1
>     Α = 2
>     ^
> SyntaxError: invalid syntax
> >>> 
> 
> Personally I feel that there should be a law against languages that disallow 
> the creation of magic tricks!¡!

I should mention also that some languages are even more advanced in their
jovialness regarding unicode tricks

Haskell:
GHCi, version 8.0.2: http://www.haskell.org/ghc/  :? for help
Prelude> let flag = 1
Prelude> let flag = 2
Prelude> flag == flag
False
Prelude> (flag, flag)
(2,1)
Prelude> 

Python3 is quite boring by contrast:

Python 3.5.3 (default, Sep 14 2017, 22:58:41) 
[GCC 6.3.0 20170406] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> python.el: native completion setup loaded
>>> flag = 1
>>> flag = 2
>>> flag == flag
True
>>> (flag, flag)
(2, 2)
>>>
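
A likely explanation for the contrast above, offered as an assumption because
the two spellings of "flag" look identical as shown here: they differed by a
compatibility character such as the "fl" ligature (U+FB02). Python converts
identifiers to NFKC while parsing (PEP 3131), so both spellings name the same
variable. A small sketch of that behaviour:

```python
import unicodedata

plain = "flag"
ligature = "\ufb02ag"  # U+FB02 LATIN SMALL LIGATURE FL followed by "ag"

# As plain strings the two spellings are different...
strings_differ = plain != ligature
# ...but NFKC folds the ligature into the two letters "f" and "l".
normalized = unicodedata.normalize("NFKC", ligature)

# Identifiers are NFKC-normalized by the parser, so both assignments
# below bind the same name and the second one wins.
namespace = {}
exec("flag = 1\n\ufb02ag = 2", namespace)
print(strings_differ, normalized, namespace["flag"])
```

Note that this normalization applies to identifiers only; string literals
keep whatever characters were typed.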
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-25 Thread Rustom Mody
On Friday, November 24, 2017 at 12:20:29 AM UTC+5:30, Mikhail V wrote:
> Ok, I personally could find some practical usage for that, but
> merely for fun. I doubt though that someone with less
> typographical experience and overall computer literacy could
> really make benefits even for personal usage.
> 
> So - fun is one benefit. And fun is important. But is that the
> idea behind it?

Are you under-estimating the fun-value? 

Python 3.5.3 (default, Sep 14 2017, 22:58:41) 
[GCC 6.3.0 20170406] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> python.el: native completion setup loaded
>>> A = 1
>>> Α = 2
>>> А = 3
>>> (A, Α, А)
(1, 2, 3)
>>> # And there are 5 other variations on this magic trick
>>> # Or if you prefer…
>>> A == Α
False
>>> 

Now compare with the boring spoilsport called python 2:

Python 2.7.13 (default, Jan 19 2017, 14:48:08) 
[GCC 6.3.0 20170118] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> python.el: native completion setup loaded
>>> A = 1
>>> Α = 2
  File "<stdin>", line 1
    Α = 2
    ^
SyntaxError: invalid syntax
>>> 

Personally I feel that there should be a law against languages that disallow 
the creation of magic tricks!¡!
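
To see why the trick works, one can ask the unicodedata module what the
look-alike letters are. A minimal sketch, assuming the three names in the
session were Latin A, Greek Alpha and Cyrillic A (the code points below are
that assumption, written as escapes so they cannot be confused):

```python
import unicodedata

# Three visually similar capital letters from different scripts.
lookalikes = ["\u0041", "\u0391", "\u0410"]

names = []
for ch in lookalikes:
    names.append(unicodedata.name(ch))
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# NFKC, the normalization Python applies to identifiers, keeps all
# three apart, which is why they act as three separate variables.
distinct_after_nfkc = len({unicodedata.normalize("NFKC", ch) for ch in lookalikes})
print(distinct_after_nfkc)
```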
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-24 Thread Richard Damon

On 11/24/17 5:46 PM, Ned Batchelder wrote:
> On 11/24/17 5:26 PM, Richard Damon wrote:
>> Have you tried using U+2010 (HYPHEN) ‐. It is in the class
>> XID_CONTINUE (in fact it is in XID_START) so should be available.
>
> U+2010 isn't allowed in Python 3 identifiers.
>
> The rules for identifiers are here:
> https://docs.python.org/3/reference/lexical_analysis.html#identifiers
> .   U+2010 is in category Pd
> (http://www.fileformat.info/info/unicode/char/2010), which isn't one
> of the categories allowed in identifiers.  Category Pc
> (http://www.fileformat.info/info/unicode/category/Pc/list.htm) is
> allowed, but it doesn't include anything that would look like a hyphen.
>
> --Ned.


Looks like the site that I looked up characters in XID_CONTINUE/START 
was incorrect. Looks like not only is U+2010 not in any of the character 
classes that are put into ID_START or ID_CONTINUE but is in 
Pattern_Syntax which is explicitly removed from those categories.


--
Richard Damon

--
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-24 Thread Ned Batchelder

On 11/24/17 5:26 PM, Richard Damon wrote:
> Have you tried using U+2010 (HYPHEN) ‐. It is in the class
> XID_CONTINUE (in fact it is in XID_START) so should be available.


U+2010 isn't allowed in Python 3 identifiers.

The rules for identifiers are here: 
https://docs.python.org/3/reference/lexical_analysis.html#identifiers 
.   U+2010 is in category Pd 
(http://www.fileformat.info/info/unicode/char/2010), which isn't one of 
the categories allowed in identifiers.  Category Pc 
(http://www.fileformat.info/info/unicode/category/Pc/list.htm) is 
allowed, but it doesn't include anything that would look like a hyphen.
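
The category check is easy to reproduce from Python itself. A minimal sketch
using unicodedata.category and str.isidentifier; the dictionary labels and the
choice of UNDERTIE as a sample Pc character are just for this example:

```python
import unicodedata

# Candidate "joiner" characters; the labels are for this example only.
candidates = {
    "HYPHEN-MINUS U+002D": "\u002d",
    "HYPHEN U+2010": "\u2010",
    "LOW LINE U+005F": "\u005f",
    "UNDERTIE U+203F": "\u203f",
}

results = {}
for label, ch in candidates.items():
    category = unicodedata.category(ch)        # general category, e.g. "Pd"
    usable = ("a" + ch + "b").isidentifier()   # is a<ch>b a single identifier?
    results[label] = (category, usable)
    print(f"{label:22} category={category} identifier={usable}")
```

Both hyphens report Pd and are rejected, while the two Pc characters are
accepted inside an identifier.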


--Ned.


--
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-24 Thread Mikhail V
On Fri, Nov 24, 2017 at 11:26 PM, Richard Damon
 wrote:
>
> Have you tried using U+2010 (HYPHEN) ‐. It is in the class XID_CONTINUE (in
> fact it is in XID_START) so should be available.
>

Hi Richard.

U+2010 is SyntaxError.
5 days ago I made a proposal on python-ideas, and we have already discussed
many aspects including straw-man arguments about fonts,etc


Mikhail
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-24 Thread Richard Damon

On 11/24/17 4:04 PM, Mikhail V wrote:
> On Fri, Nov 24, 2017 at 9:08 PM, Chris Angelico  wrote:
>> On Sat, Nov 25, 2017 at 7:00 AM, Mikhail V  wrote:
>>> I agree that one should have more choices, but
>>> people still can't really choose many things.
>>> I can't choose hyphen, I can't choose minus sign,
>>> and many tech people would probably want more operators.
>>> It counts probably not so *big* amount of people, compared to *all*
>>> people that potentially would say "oh how wonderful is it to be able
>>> to write in various scripts", still it is just a "use it at your own risk"
>>> thing at a minimum, and merely based on emotions rather than
>>> common sense.
>>>
>>> Regardless of what Unicode decides for classifications, there simply must
>>> be careful analysis how the major *Python* code actually looks in the end
>>> of all experiments. Especially true for characters in regard
>>> identifiers versus operators.
>>
>> And it's the "identifiers versus operators" question that is why you
>> can't use hyphen in an identifier. Underscore is available as an ASCII
>> joiner, and there are various non-ASCII joiners available too. Why is
>> hyphen so important?
>
> Yes I understand this, so it is how Unicode defines joiners.
> Yeah, debates about the classifications can be
> hold forever, but one should not forget about the hyphen during
> these debates. Hyphen is used I think more then six hundreds
> years as a joiner (or probably some other classification term one prefer).
> And just comes so it works very well.
> Among Unicode joiners, middledot reminds of hyphen,
> but it is not used in typography for this task. So it is not good option
> and has issues in most fonts (too small or not aligned with lowercase).
> Often it is used to show up whitespace in editors,
> so it is kind of 'reserved'.
> Other joiners in unicode classification - well probably ok for a 1st April
> proposal.
>
> About importance, it was already covered in the proposal.
> Why it is SO important? It is rhetorical question.
> Important compared to what? Compared to the question, what
> one will eat and where sleep tomorrow? Then it is not so important.
>
> Mikhail


Have you tried using U+2010 (HYPHEN) ‐. It is in the class XID_CONTINUE 
(in fact it is in XID_START) so should be available.


What isn't available is U+002D (HYPHEN-MINUS) - because that is 
otherwise defined as the subtraction/negation operator.


It may make your code harder to read, if your font doesn't make enough 
of a distinction between those characters
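
The point about U+002D can be checked with the ast module. A minimal sketch;
the name "first-name" is just an illustrative spelling, not something from
this thread:

```python
import ast

# "first-name" is parsed as the expression `first - name`, not as one name.
tree = ast.parse("first-name", mode="eval")
node = tree.body

is_subtraction = isinstance(node, ast.BinOp) and isinstance(node.op, ast.Sub)
operands = (node.left.id, node.right.id)
print(is_subtraction, operands)
```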


--
Richard Damon

--
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-24 Thread breamoreboy
On Thursday, November 23, 2017 at 6:50:29 PM UTC, Mikhail V wrote:
> Chris A wrote:
> 
> >> On Fri, Nov 24, 2017 at 1:10 AM, Mikhail V wrote:
> >>
> >>> Chris A wrote:
> >>>
> >>> Fortunately for the world, you're not the one who decided which
> >>> characters were permitted in Python identifiers. The ability to use
> >>> non-English words for function/variable names is of huge value; the
> >>> ability to use a hyphen is of some value, but not nearly as much.
> >>
> >> Fortunately for the world we have Chris A. Who knows what is
> >> fortunate and of huge values.
> >> So is there any real world projects example of usage of non-latin scripts
> >> in identifiers? Or is it still only a plan for the new world?
> 
> 
> > Yes, I've used them personally. And I know other people who have.
> 
> 
> Oh, I though it would be more impressive showcase for 'huge value'.
> If we drop the benefit of the bare fact that you can do it, or you just
> don't know English, how would describe the practical benefit?
> If you don't know english, then programming at all will be just too hard.
> (or one must define a new whole language specially for some local script)
> 
> I mean for a real practical situation - for example for an average
> Python programmer or someone who seeks a programmer job.
> And who does not have a 500-key keyboard, and who has
> a not enough high threshold of vision sensitivity to bear the look
> of various scripts in one small text piece?
> 
> Ok, I personally could find some practical usage for that, but
> merely for fun. I doubt though that someone with less
> typographical experience and overall computer literacy could
> really make benefits even for personal usage.
> 
> So - fun is one benefit. And fun is important. But is that the
> idea behind it?
> 
> 
> Mikhail

Your normal rubbish.  Do you ever give up with wasting our time?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-24 Thread Rick Johnson
On Thursday, November 23, 2017 at 3:06:00 PM UTC-6, Chris Angelico wrote:
> Seriously? Do I need to wrench this part out of you? This
> was supposed to be the EASY question that everyone can
> agree on, from which I can then draw my line of argument.

Translation:

"Dag-nab-it! You're supposed to answer my false dichotomy in
a way that makes you look ~really~ bad, so that i can tear
your argument apart ~really~ easy. Got it? Now stop clowning
around and do it right this time!"
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-24 Thread Mikhail V
On Fri, Nov 24, 2017 at 9:08 PM, Chris Angelico  wrote:
> On Sat, Nov 25, 2017 at 7:00 AM, Mikhail V  wrote:
>> I agree that one should have more choices, but
>> people still can't really choose many things.
>> I can't choose hyphen, I can't choose minus sign,
>> and many tech people would probably want more operators.
>> It counts probably not so *big* amount of people, compared to *all*
>> people that potentially would say "oh how wonderful is it to be able
>> to write in various scripts", still it is just a "use it at your own risk"
>> thing at a minimum, and merely based on emotions rather than
>> common sense.
>>
>> Regardless of what Unicode decides for classifications, there simply must
>> be careful analysis how the major *Python* code actually looks in the end
>> of all experiments. Especially true for characters in regard
>> identifiers versus operators.
>
> And it's the "identifiers versus operators" question that is why you
> can't use hyphen in an identifier. Underscore is available as an ASCII
> joiner, and there are various non-ASCII joiners available too. Why is
> hyphen so important?

Yes I understand this, so it is how Unicode defines joiners.
Yeah, debates about the classifications can be
held forever, but one should not forget about the hyphen during
these debates. The hyphen has been used, I think, for more than six hundred
years as a joiner (or whatever other classification term one prefers).
And it so happens that it works very well.
Among Unicode joiners, middledot reminds of hyphen,
but it is not used in typography for this task. So it is not good option
and has issues in most fonts (too small or not aligned with lowercase).
Often it is used to show up whitespace in editors,
so it is kind of 'reserved'.
Other joiners in unicode classification - well probably ok for a 1st April
proposal.

About importance, it was already covered in the proposal.
Why it is SO important? It is rhetorical question.
Important compared to what? Compared to the question, what
one will eat and where sleep tomorrow? Then it is not so important.


Mikhail


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-24 Thread Mikhail V
On Fri, Nov 24, 2017 at 5:37 PM, Chris Angelico  wrote:
> On Sat, Nov 25, 2017 at 3:33 AM, Mikhail V  wrote:
>> On Fri, Nov 24, 2017 at 8:03 AM, Chris Angelico  wrote:
>>
>>>> and in Python in particular, because they will be not only forced to learn
>>>> some english, but also will have all 'pleasures' of multi-script editing.
>>>> But wait, probably one can write python code in, say Arabic script *only*?
>>>> How about such feature proposal?
>>>
>>> If Python supports ASCII identifiers only, people have no choice but
>>> to transliterate. As it is, people get to choose which is better for
>>> them - to transliterate or not to transliterate, that is the
>>> readability question.
>>
>> Sure, let them choose.
>> Transliteration though is way more reasonable solution.
>
> That right there has settled it: you agree that identifiers have to
> use the broader Unicode set, not limited to ASCII. Otherwise they
> can't choose. Everything else is down to style guides; the language
> MUST support all alphabets so that people have this choice.

That's a valid and somewhat obvious point.
I agree that one should have more choices, but
people still can't really choose many things.
I can't choose hyphen, I can't choose minus sign,
and many tech people would probably want more operators.
It counts probably not so *big* amount of people, compared to *all*
people that potentially would say "oh how wonderful is it to be able
to write in various scripts", still it is just a "use it at your own risk"
thing at a minimum, and merely based on emotions rather than
common sense.

Regardless of what Unicode decides for classifications, there simply must
be careful analysis how the major *Python* code actually looks in the end
of all experiments. Especially true for characters in regard
identifiers versus operators.


Mikhail


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-24 Thread Chris Angelico
On Sat, Nov 25, 2017 at 7:00 AM, Mikhail V  wrote:
> I agree that one should have more choices, but
> people still can't really choose many things.
> I can't choose hyphen, I can't choose minus sign,
> and many tech people would probably want more operators.
> It counts probably not so *big* amount of people, compared to *all*
> people that potentially would say "oh how wonderful is it to be able
> to write in various scripts", still it is just a "use it at your own risk"
> thing at a minimum, and merely based on emotions rather than
> common sense.
>
> Regardless of what Unicode decides for classifications, there simply must
> be careful analysis how the major *Python* code actually looks in the end
> of all experiments. Especially true for characters in regard
> identifiers versus operators.

And it's the "identifiers versus operators" question that is why you
can't use hyphen in an identifier. Underscore is available as an ASCII
joiner, and there are various non-ASCII joiners available too. Why is
hyphen so important?
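
A minimal sketch of the "identifiers versus operators" point, assuming a
current CPython 3: the parser reads `foo-bar` as a subtraction, while the
underscore and (per the XID_Continue property) U+00B7 MIDDLE DOT are accepted
inside a name. The variable names are illustrative only.

```python
import ast

# "foo-bar" is not one name: it parses as NAME, minus operator, NAME.
tree = ast.parse("foo-bar", mode="eval")
is_subtraction = (
    isinstance(tree.body, ast.BinOp) and isinstance(tree.body.op, ast.Sub)
)

hyphen_ok = "foo-bar".isidentifier()           # ASCII hyphen-minus
underscore_ok = "foo_bar".isidentifier()       # ASCII joiner
middle_dot_ok = "foo\u00b7bar".isidentifier()  # U+00B7 MIDDLE DOT as joiner

print(is_subtraction, hyphen_ok, underscore_ok, middle_dot_ok)
```

Allowing the hyphen in names would make `foo-bar` ambiguous between one
identifier and a subtraction, which is the conflict described above.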

ChrisA


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-24 Thread Chris Angelico
On Sat, Nov 25, 2017 at 3:33 AM, Mikhail V  wrote:
> On Fri, Nov 24, 2017 at 8:03 AM, Chris Angelico  wrote:
>
>>> and in Python in particular, because they will be not only forced to learn
>>> some english, but also will have all 'pleasures' of  multi-script editing.
>>> But wait, probably one can write python code in, say Arabic script *only*?
>>> How about such feature proposal?
>>
>> If Python supports ASCII identifiers only, people have no choice but
>> to transliterate. As it is, people get to choose which is better for
>> them - to transliterate or not to transliterate, that is the
>> readability question.
>
> Sure, let them choose.
> Transliteration though is way more reasonable solution.

That right there has settled it: you agree that identifiers have to
use the broader Unicode set, not limited to ASCII. Otherwise they
can't choose. Everything else is down to style guides; the language
MUST support all alphabets so that people have this choice.

ChrisA


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-24 Thread Mikhail V
On Fri, Nov 24, 2017 at 8:03 AM, Chris Angelico  wrote:

>> and in Python in particular, because they will be not only forced to learn
>> some english, but also will have all 'pleasures' of  multi-script editing.
>> But wait, probably one can write python code in, say Arabic script *only*?
>> How about such feature proposal?
>
> If Python supports ASCII identifiers only, people have no choice but
> to transliterate. As it is, people get to choose which is better for
> them - to transliterate or not to transliterate, that is the
> readability question.

Sure, let them choose.
Transliteration though is way more reasonable solution.

>
>> As for non-english speaker who know some English already,
>> could of course want to include identifiers in those scripts.
>> But how about libraries?
>
> If you want to use numpy, you have to understand the language of
> numpy. That's a lot of technical jargon, so even if you understand
> English, you have to learn that. So there's ultimately no difference.

That's what I'm saying. There will be anyway major parts of code in
English and pretty much every already existing modules that can
further  help the developer will be in English, like it or not.

>> Ok, so we return back to my original question: apart from
>> ability to do so, how beneficial is it on a pragmatical basis?
>> I mean, e.g. Cyrillic will introduce homoglyph issues.
>> CJK and Arabic scripts are metrically and optically incompatible with
>> latin, so such mixing will end up with messy look. So just for
>> the experiment, yes, it's fun.
>
> Does it really introduce homoglyph issues in real-world situations,
> though? Are there really cases where people can't figure out from
> context what's going on? I haven't seen that happening. Usually there
> are *entire words* (and more) in a single language, making it pretty
> easy to figure out.

The issues can be discussed long, but I have no doubt that even placing words
in two different scripts on one text line is a bad idea, not only for source
code. For mixing Cyrillic+Latin, yes, this also causes extra issues due to
homoglyphs in many cases, I know it practically from everyday work with
Cyrillic filenames, and from past experience with English-Russian textbooks.
In textbooks at least I can help it by proper layout - separating them
in tables, or putting in quotes or bold for inline usage.


Mikhail


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-24 Thread Karsten Hilbert
On Thu, Nov 23, 2017 at 05:47:04PM -0700, Ian Kelly wrote:

> > Understanding, let alone being able to read, code written in Arabic ?
> 
> People are going to write code in Arabic whether you like it or not,
> because not everybody speaks English, and not everybody who does
> *wants* to use it. Now, would you prefer to read code where the
> variable names are written in Arabic script, or where the variable
> names are still in Arabic but transliterated to Latin characters?
> Either way, you're not going to be able to understand it, so I'm not
> sure why it makes a difference to you.

I can visually pattern match "words" based on Latin
characters. I can't with Arabic letters. So that answers the
"would you prefer" part.

However, the main point has been answered - Python already
does what is talked about. End of story.

Karsten
-- 
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Chris Angelico
On Fri, Nov 24, 2017 at 2:52 PM, Mikhail V  wrote:
> On Fri, Nov 24, 2017 at 4:13 AM, Chris Angelico  wrote:
>> On Fri, Nov 24, 2017 at 1:44 PM, Mikhail V  wrote:
>>> From my above example, you could probably see that I prefer somewhat
>>> middle-sized identifiers, one-two syllables. And naturally, they tend to
>>> reflect some process/meaining, it is not always achievable,
>>> but yes there is such a natural tendency, although by me personally
>>> not so strong, and quite often I use totally meaningless names,
>>> mainly to avoid visual similarity to already created names.
>>> So for very expanded names, it ends up with a lot of underscores :(
>>
>> Okay. So if it makes sense for you to use English words instead of
>> individual letters, since you are fluent in English, does it stand to
>> reason that it would make sense for other programmers to use Russian,
>> Norwegian, Hebrew, Korean, or Japanese words the same way?
>
> I don't know. Probably, especially if those *programmers* don't know latin
> letters, then they would want to write code with their letters and their
> language. This target group, as I said, will have really hard time
> with programming,
> and in Python in particular, because they will be not only forced to learn
> some english, but also will have all 'pleasures' of  multi-script editing.
> But wait, probably one can write python code in, say Arabic script *only*?
> How about such feature proposal?

If Python supports ASCII identifiers only, people have no choice but
to transliterate. As it is, people get to choose which is better for
them - to transliterate or not to transliterate, that is the
readability question.
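
As a rough illustration of that choice, here is the same small function
written twice: once with native-script (Russian) identifiers and once
transliterated to ASCII. Both function names and the transliteration are made
up for this sketch; the keywords (`def`, `return`) stay ASCII either way.

```python
import math

# Native-script identifiers; only the names change, not the language syntax.
def площадь_круга(радиус):
    return math.pi * радиус ** 2

# The same function with transliterated, ASCII-only names.
def ploshchad_kruga(radius):
    return math.pi * radius ** 2

same_result = площадь_круга(2.0) == ploshchad_kruga(2.0)
print(same_result)
```

Which spelling reads better depends on the team; the interpreter treats both
the same way.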

> As for non-english speaker who know some English already,
> could of course want to include identifiers in those scripts.
> But how about libraries?

If you want to use numpy, you have to understand the language of
numpy. That's a lot of technical jargon, so even if you understand
English, you have to learn that. So there's ultimately no difference.

The most popular libraries, just like the standard library, are
usually going to choose to go ASCII-only as the
lowest-common-denominator. But that is, again, their own choice.

> Ok, so we return back to my original question: apart from
> ability to do so, how beneficial is it on a pragmatical basis?
> I mean, e.g. Cyrillic will introduce homoglyph issues.
> CJK and Arabic scripts are metrically and optically incompatible with
> latin, so such mixing will end up with messy look. So just for
> the experiment, yes, it's fun.

Does it really introduce homoglyph issues in real-world situations,
though? Are there really cases where people can't figure out from
context what's going on? I haven't seen that happening. Usually there
are *entire words* (and more) in a single language, making it pretty
easy to figure out.
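
For the cases where a single look-alike letter does slip into a name, a
small check can flag it. This is only a heuristic sketch: the standard
library has no script-property lookup, so the hypothetical helper
`scripts_used` guesses the script from the first word of each letter's
Unicode name.

```python
import unicodedata

def scripts_used(identifier):
    """Guess the scripts in a name from each letter's Unicode character name."""
    scripts = set()
    for ch in identifier:
        if ch.isalpha():
            # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
            scripts.add(unicodedata.name(ch).split()[0])
    return scripts

latin_name = "data"
mixed_name = "d\u0430ta"  # second letter is U+0430 CYRILLIC SMALL LETTER A

print(latin_name == mixed_name)
print(sorted(scripts_used(latin_name)))
print(sorted(scripts_used(mixed_name)))
```

A name that reports more than one script is a candidate for review; a name
written entirely in one script, as in the "entire words" case above, passes.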

ChrisA


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Mikhail V
On Fri, Nov 24, 2017 at 4:13 AM, Chris Angelico  wrote:
> On Fri, Nov 24, 2017 at 1:44 PM, Mikhail V  wrote:
>> From my above example, you could probably see that I prefer somewhat
>> middle-sized identifiers, one-two syllables. And naturally, they tend to
>> reflect some process/meaining, it is not always achievable,
>> but yes there is such a natural tendency, although by me personally
>> not so strong, and quite often I use totally meaningless names,
>> mainly to avoid visual similarity to already created names.
>> So for very expanded names, it ends up with a lot of underscores :(
>
> Okay. So if it makes sense for you to use English words instead of
> individual letters, since you are fluent in English, does it stand to
> reason that it would make sense for other programmers to use Russian,
> Norwegian, Hebrew, Korean, or Japanese words the same way?

I don't know. Probably, especially if those *programmers* don't know latin
letters, then they would want to write code with their letters and their
language. This target group, as I said, will have really hard time
with programming,
and in Python in particular, because they will be not only forced to learn
some english, but also will have all 'pleasures' of  multi-script editing.
But wait, probably one can write python code in, say Arabic script *only*?
How about such feature proposal?

As for non-english speaker who know some English already,
could of course want to include identifiers in those scripts.
But how about libraries?
Ok, so we return back to my original question: apart from
ability to do so, how beneficial is it on a pragmatical basis?
I mean, e.g. Cyrillic will introduce homoglyph issues.
CJK and Arabic scripts are metrically and optically incompatible with
latin, so such mixing will end up with messy look. So just for
the experiment, yes, it's fun.


Mikhail


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Chris Angelico
On Fri, Nov 24, 2017 at 1:44 PM, Mikhail V  wrote:
> From my above example, you could probably see that I prefer somewhat
> middle-sized identifiers, one-two syllables. And naturally, they tend to
> reflect some process/meaining, it is not always achievable,
> but yes there is such a natural tendency, although by me personally
> not so strong, and quite often I use totally meaningless names,
> mainly to avoid visual similarity to already created names.
> So for very expanded names, it ends up with a lot of underscores :(

Okay. So if it makes sense for you to use English words instead of
individual letters, since you are fluent in English, does it stand to
reason that it would make sense for other programmers to use Russian,
Norwegian, Hebrew, Korean, or Japanese words the same way?

ChrisA


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Mikhail V
On Thu, Nov 23, 2017 at 10:05 PM, Chris Angelico  wrote:
> On Fri, Nov 24, 2017 at 8:02 AM, Mikhail V  wrote:
>> On Thu, Nov 23, 2017 at 9:39 PM, Chris Angelico  wrote:
>>> On Fri, Nov 24, 2017 at 7:38 AM, Mikhail V  wrote:
>>>> I see you manually 'optimise' the look?
>>>> I personally would end with something like this:
>>>>
>>>> def zip_longest(*A, **K):
>>>>     value = K.get ('fillvalue')
>>>>     count = len(A) - 1
>>>>     def sentinel():
>>>>         nonlocal count
>>>>         if not count:
>>>>             raise ZipExhausted
>>>>         count -= 1
>>>>         yield  value
>>>>     fillers = repeat (value)
>>>>     iterators = [chain (it, sentinel(), fillers) for it in A]
>>>>     try:
>>>>         while iterators:
>>>>             yield tuple (map (next, iterators))
>>>>     except ZipExhausted:
>>>>         pass
>>>>
>>>>
>>>> So I would say, my option would be something inbetween.
>>>> Note that I tweaked it for proportional font, namely Times New Roman.
>>
>>> I don't see how the font applies here, but whatever.
>>
>> For a different font, say CourierNew (monospaced) the tweaking strategy might
>> be different.
>
> If you have ANY font-specific "tweaking", you're doing it wrong.
> Thanks for making it look worse on everyone else's screen.

Trolling attempt counted :)
No I don't have any particular font-specific strategy,
it is just my wording reflecting the fact that things look different
in different fonts, even among proportional fonts.


>
>>> Which is better? The one-letter names or the longer ones that tie in
>>> with what they're doing?
>>
>> I think I have answered more or less in previous post, that you cutted off.
>> So you were not satisfied?
>> But now I am probably not get your 'better' meaning.
>> Better for understanding, or purely visually, i.e. less eye-straining?
>
> Which one would you prefer to maintain? Which would you prefer in a
> code review?
>
> Do you want to have one- and two-letter variable names, or longer and
> more descriptive ones?
>
> Seriously? Do I need to wrench this part out of you? This was supposed
> to be the EASY question that everyone can agree on, from which I can
> then draw my line of argument.


From my above example, you could probably see that I prefer somewhat
middle-sized identifiers, one-two syllables. And naturally, they tend to
reflect some process/meaining, it is not always achievable,
but yes there is such a natural tendency, although by me personally
not so strong, and quite often I use totally meaningless names,
mainly to avoid visual similarity to already created names.
So for very expanded names, it ends up with a lot of underscores :(


Mikhail


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Ian Kelly
On Thu, Nov 23, 2017 at 2:19 PM, Richard Damon  wrote:
> The Unicode Standard provides a fairly good classification of the
> characters, and it would make sense to define that a character that is
> defined as a 'Letter' or a 'Number', and some classes of Punctuation
> (connector and dash) be allowed in identifiers.
>
> Fully implementing may be more complicated than it is worth. An interim
> simple solution would be just allow ALL (or maybe most, excluding a limited
> number of obvious exceptions) of the characters above the ASCII set, with a
> warning that only those classified as above are promised to remain valid,
> and that other characters, while currently not generating a syntax error,
> may do so in the future. It should also be stated that while currently no
> character normalization is being done, it may be added in the future, so
> identifiers that differ only by code point sequences that are defined as
> being equivalent, might in the future not be distinct.

It's already implemented; nothing needs to be done. Unicode Standard
Annex #31 defines a recommended syntax of identifiers, which Python
basically follows, except that for backward compatibility Python also
allows identifiers to begin with an underscore. Compare the
recommended syntax at
http://unicode.org/reports/tr31/#Default_Identifier_Syntax with the
Python syntax at
https://docs.python.org/3/reference/lexical_analysis.html#identifiers.
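
A short sketch of two details from that section of the Python reference,
assuming CPython 3: identifiers may begin with an underscore, and identifiers
are converted to NFKC while parsing, so the normalization question raised in
the quoted text is already settled. The names below are illustrative only.

```python
import unicodedata

ligature_name = "\ufb01le"  # starts with U+FB01 LATIN SMALL LIGATURE FI
namespace = {}
exec(ligature_name + " = 1", namespace)  # assign using the ligature spelling

normalized = unicodedata.normalize("NFKC", ligature_name)
stored_as_normalized = "file" in namespace
stored_as_typed = ligature_name in namespace
underscore_start_ok = "_private".isidentifier()

print(normalized, stored_as_normalized, stored_as_typed, underscore_start_ok)
```

The name is stored under its NFKC form, so two spellings that normalize to the
same string refer to the same variable.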


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Ian Kelly
On Thu, Nov 23, 2017 at 1:04 PM, Karsten Hilbert wrote:
> On Thu, Nov 23, 2017 at 08:46:01PM +0100, Thomas Jollans wrote:
>
>> > I mean for a real practical situation - for example for an average
>> > Python programmer or someone who seeks a programmer job.
>> > And who does not have a 500-key keyboard,
>>
>> I don't think it's too much to ask for a programmer to have the
>> technology and expertise necessary to type their own language in its
>> proper alphabet.
>
> Surely, but it can make reusing code a nightmare.
>
> Using function arguments written in Thai script ?
>
> Understanding, let alone being able to read, code written in Arabic ?

People are going to write code in Arabic whether you like it or not,
because not everybody speaks English, and not everybody who does
*wants* to use it. Now, would you prefer to read code where the
variable names are written in Arabic script, or where the variable
names are still in Arabic but transliterated to Latin characters?
Either way, you're not going to be able to understand it, so I'm not
sure why it makes a difference to you.

If Arabic characters are allowed however, then it might be of use to
the people who are going to code in Arabic anyway. And if it isn't,
then they have the option not to use it either.


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Thomas Jollans
On 24/11/17 00:18, Richard Damon wrote:
> On 11/23/17 5:45 PM, Thomas Jollans wrote:
>> On 23/11/17 23:15, Richard Damon wrote:
>>> My thought is you define as legal only those Unicode characters that via
>>> the defined classification would be normally legal, but perhaps the
>>> first implementation doesn't diagnose many of the illegal combinations.
>>> If that isn't Pythonic, then yes, implementing a fuller classification
>>> would be needed. That might also say normalization questions would need
>>> to be decided too.
>>>
>> You do realise that Python has a perfectly good definition of what's
>> allowed in an identifier that is thoroughly grounded in the Unicode
>> standard and works very well, right?
>>
>>
>> -- Thomas
> 
> No, I wasn't aware that Python was already Unicode enabled in the source
> code set. Still fairly new with it, but the fact that people seemed to
> argue about doing it made me think it wasn't allowed yet.
> 

It's an old favourite for some people to shout about on slow news days...

Python allows identifiers to start with any letter or an underscore, and
continue with any letter or number, or an underscore. The details follow
the Unicode XID_* properties.

In comments and strings, anything goes.
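
A quick way to try these rules out is `str.isidentifier()`, which applies the
same start/continue check (it does not reject keywords). The sample names
below are made up for illustration.

```python
candidates = ["naïve", "größe", "变量", "_tmp", "x2", "2x", "price€", "a b"]

# Map each candidate to whether Python would accept it as an identifier.
valid = {name: name.isidentifier() for name in candidates}
for name, ok in valid.items():
    print(f"{name!a}: {ok}")

# Comments and string literals are not restricted in this way: ☃
snowman = "☃"
```

Letters from any script can start a name, digits can only continue one, and
symbols such as the euro sign or a space are rejected.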


-- Thomas


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Richard Damon

On 11/23/17 5:45 PM, Thomas Jollans wrote:
> On 23/11/17 23:15, Richard Damon wrote:
>> My thought is you define as legal only those Unicode characters that via
>> the defined classification would be normally legal, but perhaps the
>> first implementation doesn't diagnose many of the illegal combinations.
>> If that isn't Pythonic, then yes, implementing a fuller classification
>> would be needed. That might also say normalization questions would need
>> to be decided too.
>
> You do realise that Python has a perfectly good definition of what's
> allowed in an identifier that is thoroughly grounded in the Unicode
> standard and works very well, right?
>
>
> -- Thomas

No, I wasn't aware that Python was already Unicode enabled in the source
code set. Still fairly new with it, but the fact that people seemed to
argue about doing it made me think it wasn't allowed yet.


--
Richard Damon



Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Thomas Jollans
On 23/11/17 23:15, Richard Damon wrote:
> 
> My thought is you define as legal only those Unicode characters that via
> the defined classification would be normally legal, but perhaps the
> first implementation doesn't diagnose many of the illegal combinations.
> If that isn't Pythonic, then yes, implementing a fuller classification
> would be needed. That might also say normalization questions would need
> to be decided too.
> 

You do realise that Python has a perfectly good definition of what's
allowed in an identifier that is thoroughly grounded in the Unicode
standard and works very well, right?


-- Thomas


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Richard Damon

On 11/23/17 4:31 PM, Chris Angelico wrote:
> On Fri, Nov 24, 2017 at 8:19 AM, Richard Damon wrote:
>> On 11/23/17 2:46 PM, Thomas Jollans wrote:
>>> On 23/11/17 19:42, Mikhail V wrote:
>>>> I mean for a real practical situation - for example for an average
>>>> Python programmer or someone who seeks a programmer job.
>>>> And who does not have a 500-key keyboard,
>>>
>>> I don't think it's too much to ask for a programmer to have the
>>> technology and expertise necessary to type their own language in its
>>> proper alphabet.
>>
>> My personal feeling is that the language needs to be fully usable with just
>> ASCII, so the - character (HYPHEN/MINUS) is the subtraction/negation
>> operator, not an in-name hyphen. This also means the main library should use
>> just the ASCII character set.
>>
>> I do also realize that it could be very useful for programmers who are
>> programming with other languages as their native, to be able to use words in
>> their native language for their own symbols, and thus useful to use their
>> own character sets. Yes, doing so may add difficulty to the programmers, as
>> they may need to be switching keyboard layouts (especially if not using a
>> LATIN based language), but that is THEIR decision to do so. It also may make
>> it harder for outside programmers to help, but again, that is the team's
>> decision to make.
>>
>> The Unicode Standard provides a fairly good classification of the
>> characters, and it would make sense to define that a character that is
>> defined as a 'Letter' or a 'Number', and some classes of Punctuation
>> (connector and dash) be allowed in identifiers.
>
> That's exactly how Python's identifiers are defined (modulo special
> handling of some of the ASCII set, for reasons of backward
> compatibility).
>
>> Fully implementing may be more complicated than it is worth. An interim
>> simple solution would be just allow ALL (or maybe most, excluding a limited
>> number of obvious exceptions) of the characters above the ASCII set, with a
>> warning that only those classified as above are promised to remain valid,
>> and that other characters, while currently not generating a syntax error,
>> may do so in the future. It should also be stated that while currently no
>> character normalization is being done, it may be added in the future, so
>> identifiers that differ only by code point sequences that are defined as
>> being equivalent, might in the future not be distinct.
>
> No, that would be a bad idea; some of those characters are more
> logically operators or brackets, and some are most definitely
> whitespace. Also, it's easier to *expand* the valid character set than
> to *restrict* it, so it's better to start with only those characters
> that you know for sure make sense, and then add more later. If the
> xid_start and xid_continue classes didn't exist, it might be
> reasonable to use "Letter, any" and "Number, any" as substitutes; but
> those classes DO exist, so Python uses them.
>
> But broadly speaking, yes; it's not hard to allow a bunch of
> characters as part of Python source code. Actual language syntax (eg
> keywords) is restricted to ASCII and to those symbols that can easily
> be typed on most keyboards, but your identifiers are your business.
>
> ChrisA

My thought is you define as legal only those Unicode characters that via
the defined classification would be normally legal, but perhaps the
first implementation doesn't diagnose many of the illegal combinations.
If that isn't Pythonic, then yes, implementing a fuller classification
would be needed. That might also say normalization questions would need
to be decided too.


--
Richard Damon



Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Chris Angelico
On Fri, Nov 24, 2017 at 8:19 AM, Richard Damon  wrote:
> On 11/23/17 2:46 PM, Thomas Jollans wrote:
>>
>> On 23/11/17 19:42, Mikhail V wrote:
>>>
>>> I mean for a real practical situation - for example for an average
>>> Python programmer or someone who seeks a programmer job.
>>> And who does not have a 500-key keyboard,
>>
>> I don't think it's too much to ask for a programmer to have the
>> technology and expertise necessary to type their own language in its
>> proper alphabet.
>
>
> My personal feeling is that the language needs to be fully usable with just
> ASCII, so the - character (HYPHEN/MINUS) is the subtraction/negation
> operator, not an in-name hyphen. This also means the main library should use
> just the ASCII character set.
>
> I do also realize that it could be very useful for programmers who are
> programming with other languages as their native, to be able to use words in
> their native language for their own symbols, and thus useful to use their
> own character sets. Yes, doing so may add difficulty to the programmers, as
> they may need to be switching keyboard layouts (especially if not using a
> LATIN based language), but that is THEIR decision to do so. It also may make
> it harder for outside programmers to help, but again, that is the team's
> decision to make.
>
> The Unicode Standard provides a fairly good classification of the
> characters, and it would make sense to define that a character that is
> defined as a 'Letter' or a 'Number', and some classes of Punctuation
> (connector and dash) be allowed in identifiers.

That's exactly how Python's identifiers are defined (modulo special
handling of some of the ASCII set, for reasons of backward
compatibility).

> Fully implementing may be more complicated than it is worth. An interim
> simple solution would be just allow ALL (or maybe most, excluding a limited
> number of obvious exceptions) of the characters above the ASCII set, with a
> warning that only those classified as above are promised to remain valid,
> and that other characters, while currently not generating a syntax error,
> may do so in the future. It should also be stated that while currently no
> character normalization is being done, it may be added in the future, so
> identifiers that differ only by code point sequences that are defined as
> being equivalent, might in the future not be distinct.

No, that would be a bad idea; some of those characters are more
logically operators or brackets, and some are most definitely
whitespace. Also, it's easier to *expand* the valid character set than
to *restrict* it, so it's better to start with only those characters
that you know for sure make sense, and then add more later. If the
xid_start and xid_continue classes didn't exist, it might be
reasonable to use "Letter, any" and "Number, any" as substitutes; but
those classes DO exist, so Python uses them.

But broadly speaking, yes; it's not hard to allow a bunch of
characters as part of Python source code. Actual language syntax (eg
keywords) is restricted to ASCII and to those symbols that can easily
be typed on most keyboards, but your identifiers are your business.
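
A quick way to see both halves of that, as a sketch only (the variable names are made up for the example, and str.isascii() needs Python 3.7 or later):

```python
import keyword

# Every keyword known to the running interpreter is plain ASCII.
non_ascii_keywords = [kw for kw in keyword.kwlist if not kw.isascii()]
print(non_ascii_keywords)  # []

# Identifiers are a different matter; these names are invented for the demo.
größe = 3
длина = 4
print(größe * длина)  # 12
```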

ChrisA


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Richard Damon

On 11/23/17 2:46 PM, Thomas Jollans wrote:
> On 23/11/17 19:42, Mikhail V wrote:
>> I mean for a real practical situation - for example for an average
>> Python programmer or someone who seeks a programmer job.
>> And who does not have a 500-key keyboard,
>
> I don't think it's too much to ask for a programmer to have the
> technology and expertise necessary to type their own language in its
> proper alphabet.


My personal feeling is that the language needs to be fully usable with 
just ASCII, so the - character (HYPHEN/MINUS) is the 
subtraction/negation operator, not an in-name hyphen. This also means 
the main library should use just the ASCII character set.


I do also realize that it could be very useful for programmers whose 
native language is not English to be able to use words in their native 
language for their own symbols, and thus useful to use their own 
character sets. Yes, doing so may add difficulty for the programmers, 
as they may need to switch keyboard layouts (especially if not using 
a LATIN based language), but that is THEIR decision to make. It also may 
make it harder for outside programmers to help, but again, that is the 
team's decision to make.


The Unicode Standard provides a fairly good classification of the 
characters, and it would make sense to define that any character that is 
defined as a 'Letter' or a 'Number', and some classes of Punctuation 
(connector and dash) be allowed in identifiers.


Fully implementing may be more complicated than it is worth. An interim 
simple solution would be just allow ALL (or maybe most, excluding a 
limited number of obvious exceptions) of the characters above the ASCII 
set, with a warning that only those classified as above are promised to 
remain valid, and that other characters, while currently not generating 
a syntax error, may do so in the future. It should also be stated that 
while currently no character normalization is being done, it may be 
added in the future, so identifiers that differ only by code point 
sequences that are defined as being equivalent, might in the future not 
be distinct.


Since my native language is English, this isn't that important to me, 
but I do see it as being useful to others with different native tongues. 
The simple implementation shouldn't be that hard: you can just allow 
character codes 0x80 and above as being acceptable in identifiers, with 
the documented warning that the current implementation allows some forms 
that may generate errors in the future. If enough interest is shown, 
adding better classification shouldn't be that hard.
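
A rough sketch of that simple rule, assuming a hypothetical helper named is_simple_identifier (it is not part of CPython or any library), set next to str.isidentifier(), the check Python 3 actually provides:

```python
def is_simple_identifier(name):
    """Hypothetical interim rule: ASCII identifier rules, plus any character >= 0x80."""
    if not name:
        return False

    def ok(ch, first):
        if ord(ch) >= 0x80:
            return True                  # accept everything above ASCII
        if ch == "_" or ch.isalpha():    # ch is ASCII here, so A-Z / a-z
            return True
        return ch.isdigit() and not first

    return all(ok(ch, i == 0) for i, ch in enumerate(name))

# Compare the permissive rule with the classification-based one.
for name in ["gr\u00f6\u00dfe", "a\u00a0b", "a\u00d7b", "2fast"]:
    print(repr(name), is_simple_identifier(name), name.isidentifier())
```

The permissive rule lets a no-break space (U+00A0) and a multiplication sign (U+00D7) through, which are the kind of "obvious exceptions" that would need excluding by hand.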


--
Richard Damon



Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Chris Angelico
On Fri, Nov 24, 2017 at 8:02 AM, Mikhail V  wrote:
> On Thu, Nov 23, 2017 at 9:39 PM, Chris Angelico  wrote:
>> On Fri, Nov 24, 2017 at 7:38 AM, Mikhail V  wrote:
>>> I see you manually 'optimise' the look?
>>> I personally would end with something like this:
>>>
>>> def zip_longest(*A, **K):
>>>     value = K.get ('fillvalue')
>>>     count = len(A) - 1
>>>     def sentinel():
>>>         nonlocal count
>>>         if not count:
>>>             raise ZipExhausted
>>>         count -= 1
>>>         yield  value
>>>     fillers = repeat (value)
>>>     iterators = [chain (it, sentinel(), fillers) for it in A]
>>>     try:
>>>         while iterators:
>>>             yield tuple (map (next, iterators))
>>>     except ZipExhausted:
>>>         pass
>>>
>>>
>>> So I would say, my option would be something inbetween.
>>> Note that I tweaked it for proportional font, namely Times New Roman.
>
>> I don't see how the font applies here, but whatever.
>
> For a different font, say Courier New (monospaced), the tweaking strategy
> might be different.

If you have ANY font-specific "tweaking", you're doing it wrong.
Thanks for making it look worse on everyone else's screen.

>> Which is better? The one-letter names or the longer ones that tie in
>> with what they're doing?
>
> I think I have more or less answered that in my previous post, which you
> cut off. So you were not satisfied?
> But now I probably don't get what you mean by 'better'.
> Better for understanding, or purely visually, i.e. less eye-straining?

Which one would you prefer to maintain? Which would you prefer in a code review?

Do you want to have one- and two-letter variable names, or longer and
more descriptive ones?

Seriously? Do I need to wrench this part out of you? This was supposed
to be the EASY question that everyone can agree on, from which I can
then draw my line of argument.

ChrisA


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Mikhail V
On Thu, Nov 23, 2017 at 9:39 PM, Chris Angelico  wrote:
> On Fri, Nov 24, 2017 at 7:38 AM, Mikhail V  wrote:
>> I see you manually 'optimise' the look?
>> I personally would end with something like this:
>>
>> def zip_longest(*A, **K):
>>     value = K.get ('fillvalue')
>>     count = len(A) - 1
>>     def sentinel():
>>         nonlocal count
>>         if not count:
>>             raise ZipExhausted
>>         count -= 1
>>         yield  value
>>     fillers = repeat (value)
>>     iterators = [chain (it, sentinel(), fillers) for it in A]
>>     try:
>>         while iterators:
>>             yield tuple (map (next, iterators))
>>     except ZipExhausted:
>>         pass
>>
>>
>> So I would say, my option would be something inbetween.
>> Note that I tweaked it for proportional font, namely Times New Roman.

> I don't see how the font applies here, but whatever.

For a different font, say Courier New (monospaced), the tweaking strategy
might be different.

> Which is better? The one-letter names or the longer ones that tie in
> with what they're doing?

I think I have more or less answered that in my previous post, which you
cut off. So you were not satisfied?
But now I probably don't get what you mean by 'better'.
Better for understanding, or purely visually, i.e. less eye-straining?

> Also, why do you have those loose spaces stuck in random places, eg
> before some of the open parentheses but not others?

Is it not allowed? I like how it looks with Times font.


Mikhail


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Chris Angelico
On Fri, Nov 24, 2017 at 7:38 AM, Mikhail V  wrote:
> I see you manually 'optimise' the look?
> I personally would end with something like this:
>
> def zip_longest(*A, **K):
>     value = K.get ('fillvalue')
>     count = len(A) - 1
>     def sentinel():
>         nonlocal count
>         if not count:
>             raise ZipExhausted
>         count -= 1
>         yield  value
>     fillers = repeat (value)
>     iterators = [chain (it, sentinel(), fillers) for it in A]
>     try:
>         while iterators:
>             yield tuple (map (next, iterators))
>     except ZipExhausted:
>         pass
>
>
> So I would say, my option would be something inbetween.
> Note that I tweaked it for proportional font, namely Times New Roman.

I don't see how the font applies here, but whatever. Which is better?
The one-letter names or the longer ones that tie in with what they're
doing?

Also, why do you have those loose spaces stuck in random places, eg
before some of the open parentheses but not others?

ChrisA


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Mikhail V
On Thu, Nov 23, 2017 at 8:15 PM, Chris Angelico  wrote:

>
> Let's start with a simpler question. Which of these is better code?
>
> # ===== Option 1 =====
> from itertools import chain, repeat
>
> class ZipExhausted(Exception):
>     pass
>
> def zip_longest(*args, **kwds):
>     # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
>     fillvalue = kwds.get('fillvalue')
>     counter = len(args) - 1
>     def sentinel():
>         nonlocal counter
>         if not counter:
>             raise ZipExhausted
>         counter -= 1
>         yield fillvalue
>     fillers = repeat(fillvalue)
>     iterators = [chain(it, sentinel(), fillers) for it in args]
>     try:
>         while iterators:
>             yield tuple(map(next, iterators))
>     except ZipExhausted:
>         pass
>
> # ===== Option 2 =====
>
> class e(Exception):
>     pass
>
> def zl(*a, **k):
>     f = k.get('fillvalue')
>     c = len(a) - 1
>     def s():
>         nonlocal c
>         if not c:
>             raise e
>         c -= 1
>         yield f
>     ff = repeat(f)
>     i = [chain(i, s(), ff) for i in a]
>     try:
>         while i:
>             yield tuple(map(next, i))
>     except e:
>         pass
>
> # =====
>
> One of them is cribbed straight from the itertools docs. The other is
> the same functionality with shorter variable names. What makes one of
> them better than the other? Answer me that, and I'll continue.


I see you manually 'optimise' the look?
I personally would end up with something like this:

def zip_longest(*A, **K):
    value = K.get ('fillvalue')
    count = len(A) - 1
    def sentinel():
        nonlocal count
        if not count:
            raise ZipExhausted
        count -= 1
        yield  value
    fillers = repeat (value)
    iterators = [chain (it, sentinel(), fillers) for it in A]
    try:
        while iterators:
            yield tuple (map (next, iterators))
    except ZipExhausted:
        pass


So I would say my option would be something in between.
Note that I tweaked it for a proportional font, namely Times New Roman.

In particular, I find lines and words that are too narrow a bit
eye-straining at times. Self-explanatory names are also important in many
cases, but that depends on the context the example was cut from.

But if you only ask which of the two looks better to me, then probably
the second, though it has some issues for me: "c" and "e" are almost
homoglyphs, and the short lines make it too loose, 'sieve'-like.


Mikhail


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Mikhail V
On Thu, Nov 23, 2017 at 8:46 PM, Thomas Jollans  wrote:
> On 23/11/17 19:42, Mikhail V wrote:
>> I mean for a real practical situation - for example for an average
>> Python programmer or someone who seeks a programmer job.
>> And who does not have a 500-key keyboard,
>
> I don't think it's too much to ask for a programmer to have the
> technology and expertise necessary to type their own language in its
> proper alphabet.

And I don't think the benefit of using two scripts in one source is
enough to compensate for the need to constantly switch layouts. Do you
have a method to input e.g. Cyrillic and Latin without switching the
layout? If I just use a few extra chars, then I'll bind a keyboard shortcut,
but even two-language input is an annoyance.
And I need to use Cyrillic and Latin constantly, so I know how it feels.


Mikhail


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Karsten Hilbert
On Thu, Nov 23, 2017 at 08:46:01PM +0100, Thomas Jollans wrote:

> > I mean for a real practical situation - for example for an average
> > Python programmer or someone who seeks a programmer job.
> > And who does not have a 500-key keyboard, 
> 
> I don't think it's too much to ask for a programmer to have the
> technology and expertise necessary to type their own language in its
> proper alphabet.

Surely, but it can make reusing code a nightmare.

Using function arguments written in Thai script ?

Understanding, let alone being able to read, code written in Arabic ?

No, thanks.

Karsten
-- 
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Thomas Jollans
On 23/11/17 19:42, Mikhail V wrote:
> I mean for a real practical situation - for example for an average
> Python programmer or someone who seeks a programmer job.
> And who does not have a 500-key keyboard, 

I don't think it's too much to ask for a programmer to have the
technology and expertise necessary to type their own language in its
proper alphabet.


Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

2017-11-23 Thread Chris Angelico
On Fri, Nov 24, 2017 at 5:42 AM, Mikhail V  wrote:
> Chris A wrote:
>
>>> On Fri, Nov 24, 2017 at 1:10 AM, Mikhail V wrote:
>>>
>>>> Chris A wrote:
>>>>
>>>> Fortunately for the world, you're not the one who decided which
>>>> characters were permitted in Python identifiers. The ability to use
>>>> non-English words for function/variable names is of huge value; the
>>>> ability to use a hyphen is of some value, but not nearly as much.
>>>
>>> Fortunately for the world we have Chris A., who knows what is
>>> fortunate and of huge value.
>>> So are there any real-world project examples of non-Latin scripts used
>>> in identifiers? Or is it still only a plan for the new world?
>
>
>> Yes, I've used them personally. And I know other people who have.
>
>
> Oh, I thought it would be a more impressive showcase for 'huge value'.
> If we set aside the bare fact that you can do it, or that you just
> don't know English, how would you describe the practical benefit?
> If you don't know English, then programming at all will be just too hard
> (or one must define a whole new language specially for some local script).

Let's start with a simpler question. Which of these is better code?

# ===== Option 1 =====
from itertools import chain, repeat

class ZipExhausted(Exception):
    pass

def zip_longest(*args, **kwds):
    # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    counter = len(args) - 1
    def sentinel():
        nonlocal counter
        if not counter:
            raise ZipExhausted
        counter -= 1
        yield fillvalue
    fillers = repeat(fillvalue)
    iterators = [chain(it, sentinel(), fillers) for it in args]
    try:
        while iterators:
            yield tuple(map(next, iterators))
    except ZipExhausted:
        pass

# ===== Option 2 =====

class e(Exception):
    pass

def zl(*a, **k):
    f = k.get('fillvalue')
    c = len(a) - 1
    def s():
        nonlocal c
        if not c:
            raise e
        c -= 1
        yield f
    ff = repeat(f)
    i = [chain(i, s(), ff) for i in a]
    try:
        while i:
            yield tuple(map(next, i))
    except e:
        pass

# =====

One of them is cribbed straight from the itertools docs. The other is
the same functionality with shorter variable names. What makes one of
them better than the other? Answer me that, and I'll continue.
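
For anyone who wants to check what both options are meant to compute, here is a minimal sketch using the standard-library itertools.zip_longest (the variable name pairs is just for this example):

```python
from itertools import zip_longest

# Pair up the two strings, padding the shorter one with '-'.
pairs = ["".join(t) for t in zip_longest("ABCD", "xy", fillvalue="-")]
print(" ".join(pairs))  # Ax By C- D-
```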

ChrisA