Re: [beginner] What's wrong?

2016-04-01 Thread Michael Okuntsov

Nevermind. for j in range(1,8) should be for j in range(8).
--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-01 Thread Mark Lawrence via Python-list

On 01/04/2016 23:10, Michael Okuntsov wrote:

Nevermind. for j in range(1,8) should be for j in range(8).


Thank you for your correction, we in Python land greatly appreciate such 
things :)


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-01 Thread sohcahtoa82
On Friday, April 1, 2016 at 3:10:51 PM UTC-7, Michael Okuntsov wrote:
> Nevermind. for j in range(1,8) should be for j in range(8).

I can't tell you how many times I've gotten bit in the ass with that off-by-one 
mistake whenever I use a range that doesn't start at zero.

I know that if I want to loop 10 times and I either want to start at zero or 
just don't care about the actual number, I use `for i in range(10)`.  But if I 
want to loop from 10 to 20, my first instinct is to write `for i in range(10, 
20)`, and then I'm left figuring out why my loop isn't executing the last step.
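
For anyone following along: range is half-open -- start is included, stop is 
excluded, so len(range(start, stop)) is always stop - start. A minimal 
interpreter sketch, using only the builtin:

>>> list(range(10, 20))
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> list(range(10, 21))   # 10 through 20 inclusive needs stop=21
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
>>> len(range(10, 20))
10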
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-01 Thread Mark Lawrence via Python-list

On 01/04/2016 23:44, sohcahto...@gmail.com wrote:

On Friday, April 1, 2016 at 3:10:51 PM UTC-7, Michael Okuntsov wrote:

Nevermind. for j in range(1,8) should be for j in range(8).


I can't tell you how many times I've gotten bit in the ass with that off-by-one 
mistake whenever I use a range that doesn't start at zero.

I know that if I want to loop 10 times and I either want to start at zero or 
just don't care about the actual number, I use `for i in range(10)`.  But if I 
want to loop from 10 to 20, my first instinct is to write `for i in range(10, 
20)`, and then I'm left figuring out why my loop isn't executing the last step.



"First instinct"?  "I expected"?  The Python docs might not be perfect, 
but they were certainly adequate enough to get me going 15 years ago, 
and since then they've improved.  So where is the problem, other than 
failure to RTFM?


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-01 Thread Michael Selik
Humans have always had trouble with this, in many contexts. I remember
being annoyed at folks saying the year 2000 was the first year of the new
millennium, rather than 2001. They'd forgotten the Gregorian calendar
starts from AD 1.

On Fri, Apr 1, 2016, 6:58 PM Mark Lawrence via Python-list <
python-list@python.org> wrote:

> On 01/04/2016 23:44, sohcahto...@gmail.com wrote:
> > On Friday, April 1, 2016 at 3:10:51 PM UTC-7, Michael Okuntsov wrote:
> >> Nevermind. for j in range(1,8) should be for j in range(8).
> >
> > I can't tell you how many times I've gotten bit in the ass with that
> off-by-one mistake whenever I use a range that doesn't start at zero.
> >
> > I know that if I want to loop 10 times and I either want to start at
> zero or just don't care about the actual number, I use `for i in
> range(10)`.  But if I want to loop from 10 to 20, my first instinct is to
> write `for i in range(10, 20)`, and then I'm left figuring out why my loop
> isn't executing the last step.
> >
>
> "First instinct"?  "I expected"?  The Python docs might not be perfect,
> but they were certainly adequate enough to get me going 15 years ago,
> and since then they've improved.  So where is the problem, other than
> failure to RTFM?
>
> --
> My fellow Pythonistas, ask not what our language can do for you, ask
> what you can do for our language.
>
> Mark Lawrence
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-01 Thread Random832
On Fri, Apr 1, 2016, at 19:29, Michael Selik wrote:
> Humans have always had trouble with this, in many contexts. I remember
> being annoyed at folks saying the year 2000 was the first year of the new
> millennium, rather than 2001. They'd forgotten the Gregorian calendar
> starts from AD 1.

Naturally, this means the first millennium was only 999 years long, and
all subsequent millennia were 1000 years long. (Whereas "millennium" is
defined as the set of all years of a given era for a given integer k
where y // 1000 == k. How else would you define it?)
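
A quick Python sketch of that floor-division definition, illustrative only:

>>> millennium = lambda y: y // 1000
>>> [(y, millennium(y)) for y in (1, 999, 1000, 1999, 2000, 2001)]
[(1, 0), (999, 0), (1000, 1), (1999, 1), (2000, 2), (2001, 2)]

So millennium 0 covers AD 1-999 -- the 999-year one -- and the year 2000
does open a new millennium under this convention.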

And if you want to get technical, the Gregorian calendar starts from
some year no earlier than 1582, depending on the country. The year
numbering system has little to do with the calendar type - your
assertion in fact regards the BC/AD year numbering system, which was
invented by Bede.

The astronomical year-numbering system, which does contain a year zero
(and uses negative numbers rather than a reverse-numbered "BC" era), and
is incidentally used by ISO 8601, was invented by Jacques Cassini in the
17th century.
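
Python's own datetime, for what it's worth, cannot represent a year zero
(or any BC year) at all:

>>> import datetime
>>> datetime.MINYEAR
1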



Rule #1 of being pedantic: There's always someone more pedantic than
you, whose pedantry supports the opposite conclusion.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-01 Thread Michael Selik
On Sat, Apr 2, 2016, 12:28 AM Random832  wrote:

> On Fri, Apr 1, 2016, at 19:29, Michael Selik wrote:
> > Humans have always had trouble with this, in many contexts. I remember
> > being annoyed at folks saying the year 2000 was the first year of the new
> > millennium, rather than 2001. They'd forgotten the Gregorian calendar
> > starts from AD 1.
>
> Naturally, this means the first millennium was only 999 years long, and
> all subsequent millennia were 1000 years long. (Whereas "millennium" is
> defined as the set of all years of a given era for a given integer k
> where y // 1000 == k. How else would you define it?)
>
> And if you want to get technical, the gregorian calendar starts from
> some year no earlier than 1582, depending on the country. The year
> numbering system has little to do with the calendar type - your
> assertion in fact regards the BC/AD year numbering system, which was
> invented by Bede.
>
> The astronomical year-numbering system, which does contain a year zero
> (and uses negative numbers rather than a reverse-numbered "BC" era), and
> is incidentally used by ISO 8601, was invented by Jacques Cassini in the
> 17th century.
>
>
>
> Rule #1 of being pedantic: There's always someone more pedantic than
> you, whose pedantry supports the opposite conclusion.
>

I'll have to remember that one. And thanks for the facts.

>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-01 Thread William Ray Wing

> On Apr 1, 2016, at 6:57 PM, Mark Lawrence via Python-list 
>  wrote:
> 
>> On 01/04/2016 23:44, sohcahto...@gmail.com wrote:
>>> On Friday, April 1, 2016 at 3:10:51 PM UTC-7, Michael Okuntsov wrote:
>>> Nevermind. for j in range(1,8) should be for j in range(8).
>> 
>> I can't tell you how many times I've gotten bit in the ass with that 
>> off-by-one mistake whenever I use a range that doesn't start at zero.
>> 
>> I know that if I want to loop 10 times and I either want to start at zero or 
>> just don't care about the actual number, I use `for i in range(10)`.  But if 
>> I want to loop from 10 to 20, my first instinct is to write `for i in 
>> range(10, 20)`, and then I'm left figuring out why my loop isn't executing 
>> the last step.
> 
> "First instinct"?  "I expected"?  The Python docs might not be perfect, but 
> they were certainly adequate enough to get me going 15 years ago, and since 
> then they've improved.  So where is the problem, other than failure to RTFM?
> 
I've always found it vaguely amusing that the server(s) for just about all the 
technical info at MIT reside behind http://rtfm.mit.edu

Bill

> -- 
> My fellow Pythonistas, ask not what our language can do for you, ask
> what you can do for our language.
> 
> Mark Lawrence
> 
> -- 
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Chris Angelico
On Sat, Apr 2, 2016 at 3:27 PM, Random832  wrote:
> On Fri, Apr 1, 2016, at 19:29, Michael Selik wrote:
>> Humans have always had trouble with this, in many contexts. I remember
>> being annoyed at folks saying the year 2000 was the first year of the new
>> millennium, rather than 2001. They'd forgotten the Gregorian calendar
>> starts from AD 1.
>
> Naturally, this means the first millennium was only 999 years long, and
> all subsequent millennia were 1000 years long. (Whereas "millennium" is
> defined as the set of all years of a given era for a given integer k
> where y // 1000 == k. How else would you define it?)
>
> And if you want to get technical, the gregorian calendar starts from
> some year no earlier than 1582, depending on the country. The year
> numbering system has little to do with the calendar type - your
> assertion in fact regards the BC/AD year numbering system, which was
> invented by Bede.
>
> The astronomical year-numbering system, which does contain a year zero
> (and uses negative numbers rather than a reverse-numbered "BC" era), and
> is incidentally used by ISO 8601, was invented by Jacques Cassini in the
> 17th century.
>

Are you sure? Because I'm pretty sure these folks were already talking about BC.

http://xenohistorian.faithweb.com/holybook/quotes/YK.html

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Michael Selik
On Sat, Apr 2, 2016 at 4:16 AM Chris Angelico  wrote:

> On Sat, Apr 2, 2016 at 3:27 PM, Random832  wrote:
> > On Fri, Apr 1, 2016, at 19:29, Michael Selik wrote:
> >> Humans have always had trouble with this, in many contexts. I remember
> >> being annoyed at folks saying the year 2000 was the first year of the
> new
> >> millennium, rather than 2001. They'd forgotten the Gregorian calendar
> >> starts from AD 1.
> >
> > Naturally, this means the first millennium was only 999 years long, and
> > all subsequent millennia were 1000 years long. (Whereas "millennium" is
> > defined as the set of all years of a given era for a given integer k
> > where y // 1000 == k. How else would you define it?)
> >
> > And if you want to get technical, the gregorian calendar starts from
> > some year no earlier than 1582, depending on the country. The year
> > numbering system has little to do with the calendar type - your
> > assertion in fact regards the BC/AD year numbering system, which was
> > invented by Bede.
> >
> > The astronomical year-numbering system, which does contain a year zero
> > (and uses negative numbers rather than a reverse-numbered "BC" era), and
> > is incidentally used by ISO 8601, was invented by Jacques Cassini in the
> > 17th century.
> >
>
> Are you sure? Because I'm pretty sure these folks were already talking
> about BC.
>
> http://xenohistorian.faithweb.com/holybook/quotes/YK.html
>
>
If they'd only used Unicode, they could have said "þou" in prayer and "ðousand"
for the year.

BTW, I finally know why there are all those "Ye Olde ...".
https://en.wikipedia.org/wiki/Thorn_(letter)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Chris Angelico
On Sun, Apr 3, 2016 at 1:48 AM, Michael Selik  wrote:
> If they'd only used Unicode, they could have said "þou" in prayer and
> "ðousand" for the year.
>
> BTW, I finally know why there are all those "Ye Olde ...".
> https://en.wikipedia.org/wiki/Thorn_(letter)

Yep! And the letters (thorn and eth) survive in a very few languages
(Icelandic, notably). Fortunately, Python 3 lets you use them in
identifiers.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Marko Rauhamaa
Chris Angelico :

> Yep! And the letters (thorn and eth) survive in a very few languages
> (Icelandic, notably). Fortunately, Python 3 lets you use it in
> identifiers.

While it is fine for Python to support Unicode to its fullest, I don't
think it's a good idea for a programmer to use non-English identifiers.

The (few) keywords are in English anyway. Imagine reading code like
this:

for oppilas in luokka:
if oppilas.hylätty():
oppilas.ilmoita(oppilas.koetulokset)

which looks nauseating whether you are an English-speaker or
Finnish-speaker.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Chris Angelico
On Sun, Apr 3, 2016 at 2:07 AM, Marko Rauhamaa  wrote:
> Chris Angelico :
>
>> Yep! And the letters (thorn and eth) survive in a very few languages
>> (Icelandic, notably). Fortunately, Python 3 lets you use it in
>> identifiers.
>
> While it is fine for Python to support Unicode to its fullest, I don't
> think it's a good idea for a programmer to use non-English identifiers.
>
> The (few) keywords are in English anyway. Imagine reading code like
> this:
>
> for oppilas in luokka:
> if oppilas.hylätty():
> oppilas.ilmoita(oppilas.koetulokset)
>
> which looks nauseating whether you are an English-speaker or
> Finnish-speaker.

I disagree. I've spoken with people who've used that kind of bilingual
hybrid in regular conversation. There's a channel I hang out on that
mainly speaks Turkish, but some sentences are a Turkish-English
hybrid; usually they use Turkish grammar (subject-object-verb), as
that's the native language of most of the people there.

A lot of Python's keywords are derived from English, yes, but once
they've been abbreviated some, and have slid in meaning from their
original words, they become jargon that can plausibly be imported into
other languages. Words like "lambda" aren't English, so other Roman
alphabet languages are at no disadvantage there; words like "def"
might easily acquire back-formation justifications/mnemonics in other
languages. It's only the words that truly are English terms ("while")
that are problematic, and there's only a handful of those to learn.

Of course, there's the whole standard library, which is written in
English. You could translate that without breaking everything, but
it'd be a big job.

The main reason for permitting non-English identifiers is to let
people synchronize on external naming conventions. Suppose you create
a form (web or GUI or something) and ask a human to key in half a
dozen pieces of information, and then do some arithmetic on them. In
English, we can do this kind of thing:

name = input("Object name: ")
length = int(input("Length: "))
width = int(input("Width: "))
height = int(input("Height: "))
volume = length * width * height
print("Volume of %s is: %d" % (name, volume))

Note how every piece of input or output is directly associated with a
keyword, which is used as the identifier in the code. This is
important; when you come to debug code like this (let's assume there's
a lot more of it than this), you can glance at the form, glance at the
code, and not have to maintain a mental translation table. This is why
we use identifiers in the first place - to identify things! Okay. So
far, so good. Let's translate all that into Russian. (I don't speak
Russian, so the actual translation has been done with Google
Translate. Apologies in advance if the Russian text here says
something horribly wrong.)

название = input("Название объекта: ")
длина = int(input("Длина: "))
ширина = int(input("Ширина: "))
высота = int(input("Высота: "))
объем = длина * ширина * высота
print("Объем %s равно %d" % (название, объем))

It's a hybrid of English function names and Russian text strings and
identifiers. But if you force everyone to write their identifiers in
English, all you get is a hybrid of English function names and
identifiers and Russian text strings - no improvement at all! Or, more
likely, you'll get this:

nazvanie = input("Название объекта: ")
dlina = int(input("Длина: "))
shirina = int(input("Ширина: "))
vysota = int(input("Высота: "))
obyem = dlina * shirina * vysota
print("Объем %s равно %d" % (nazvanie, obyem))

Is that an improvement? I don't think so. Far better to let people
write their names in any way that makes sense for their code.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Steven D'Aprano
On Sun, 3 Apr 2016 02:07 am, Marko Rauhamaa wrote:

> I don't think it's a good idea for a programmer to use non-English
> identifiers. 


So you're saying that learning to be a fluent speaker of English is a
prerequisite of being a programmer?


I'd rather read:

for oppilas in luokka:
if oppilas.hylätty():
oppilas.ilmoita(oppilas.koetulokset)


than either of these:

for cornea in excellence:
if cornea.marooned():
cornea.amuse(cornea.eventualities)


for pupel in clas:
if pupel.abandened():
pupel.plaese(pupel.resolts)



Google translate suggests Marko's code means:

for pupil in class:
if pupil.abandoned():
pupil.please(pupil.results)



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Marko Rauhamaa
Steven D'Aprano :

> I'd rather read:
>
> for oppilas in luokka:
> if oppilas.hylätty():
> oppilas.ilmoita(oppilas.koetulokset)
>
> [...]
>
> Google translate suggests Marko's code means:
>
> for pupil in class:
> if pupil.abandoned():
> pupil.please(pupil.results)

"Hylätty" in this context means "rejected."

"Ilmoita" means "inform, announce, let know."

> So you're saying that learning to be a fluent speaker of English is a
> pre-requisite of being a programmer?

No more than learning Latin is a prerequisite of being a doctor.

Nowadays software companies and communities are international. You never
know who needs to maintain your code. At work, I need to maintain code
that was created in Japan, with coworkers from all over the world. The
Japanese author had had a hard time with English, and made some
awkward naming choices, but had the common sense to use English-only
names in his code.

I also think log file timestamps should be expressed in UTC.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Thomas 'PointedEars' Lahn
Marko Rauhamaa wrote:

> Steven D'Aprano :
>> So you're saying that learning to be a fluent speaker of English is a
>> pre-requisite of being a programmer?
> 
> No more than learning Latin is a prerequisite of being a doctor.

Full ACK.  Probably starting with the Industrial Revolution enabled by the 
improvements of the steam engine in England, English has become the /lingua 
franca/ of technology (even though the French often still disagree, 
preferring words like « ordinateur » and « octet » over “computer” and 
“byte”, respectively¹).  (With the Internet at the latest, then, it has also 
become the /lingua franca/ of science, although Latin terms are common in 
medicine.)
 
> Nowadays software companies and communities are international. You never
> know who needs to maintain your code. At work, I need to maintain code
> that was created in Japan, with coworkers from all over the world. The
> Japanese author had had a hard time with English, and made some
> awkward naming choices, but had the common sense to use English-only
> names in his code.

One will have a hard time finding a company or community, international or 
not, that does not have at least a basic knowledge of English included in 
what they require of a software developer.


¹  Years ago, I was helping a French colleague with her computer, and
   was surprised to read “Mo” instead of “MB” in the status bar of
   Windows Explorer.
-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Rustom Mody
On Saturday, April 2, 2016 at 10:42:27 PM UTC+5:30, Thomas 'PointedEars' Lahn 
wrote:
> Marko Rauhamaa wrote:
> 
> > Steven D'Aprano :
> >> So you're saying that learning to be a fluent speaker of English is a
> >> pre-requisite of being a programmer?
> > 
> > No more than learning Latin is a prerequisite of being a doctor.
> 
> Full ACK.  Probably starting with the Industrial Revolution enabled by the 
> improvements of the steam machine in England, English has become the /lingua 
> franca/ of technology (even though the French often still disagree, 
> preferring words like « ordinateur » and « octet » over “computer” and 
> “byte”, respectively¹).  (With the Internet at the latest, then, it has also 
> become the /lingua franca/ of science, although Latin terms are common in 
> medicine.)

IMHO the cavalier usage of random alphabet-soup for identifiers
can lead to worse than just aesthetic unpleasantness:
https://en.wikipedia.org/wiki/IDN_homograph_attack

When python went to full unicode identifers it should have also added
pragmas for which blocks the programmer intended to use -- something like
a charset declaration of html.

This way if the programmer says "I want Latin and Greek"
and then A and Α get mixed up, well, he asked for it.
If he didn't ask, then springing it on him seems unnecessary and uncalled for.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Marko Rauhamaa
Rustom Mody :
> When python went to full unicode identifers it should have also added
> pragmas for which blocks the programmer intended to use -- something
> like a charset declaration of html.

You are being silly.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Terry Reedy

On 4/2/2016 11:07 AM, Marko Rauhamaa wrote:


While it is fine for Python to support Unicode to its fullest, I don't
think it's a good idea for a programmer to use non-English identifiers.


Non-English identifiers can be written, at least in romanized versions, in 
ASCII.



The (few) keywords are in English anyway. Imagine reading code like
this:

 for oppilas in luokka:
 if oppilas.hylätty():
 oppilas.ilmoita(oppilas.koetulokset)

which looks nauseating whether you are an English-speaker or
Finnish-speaker.


Your sense of nausea is different from others', so speak for yourself (as 
you did in the first sentence -- "I don't ...").  People were happily 
and routinely writing things like the above (minus the accented char) in 
2.x.  I guess they must have kept such code out of your sight.


--
Terry Jan Reedy


--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Terry Reedy

On 4/2/2016 12:44 PM, Marko Rauhamaa wrote:


> Nowadays software companies and communities are international.

Grade school classrooms, especially pre-high school, are not.

> You never know who needs to maintain your code.

For one-off school assignments, nobody other than the author.

> At work, I need to maintain code
> that was created in Japan, with coworkers from all over the world. The
> Japanese author had had a hard time with English, and made some
> awkward naming choices, but had the common sense to use English-only
> names in his code.

Could not have been worse than semi-random ASCII like x3tu9.

> I also think log file timestamps should be expressed in UTC.


I agree.  Any translation to local time should be in the viewer.  I 
presume this is already true for email and news message timestamps.


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Marko Rauhamaa
Terry Reedy :

> On 4/2/2016 12:44 PM, Marko Rauhamaa wrote:
>
>> Nowadays software companies and communities are international.
>
> Grade school classrooms, especially pre-high school, are not.

Parenthetically, English teachers in Finland have been happy with how
teenage boys' English grades have gone up with the advent of online
gaming.

   High school boys get more top scores in English than girls. According
   to a recent master's thesis, the most important causal factor for the
   boys' success is spending a lot of time playing computer games. An
   English language professor wants to raise awareness about the role of
   games for language skills.

   <http://yle.fi/uutiset/pojat_kiilaavat_tyttojen_ohi_englannin_kielessa_tietokonepelien_ansiosta/5450679>


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Michael Selik
On Sat, Apr 2, 2016, 3:40 PM Marko Rauhamaa  wrote:

> Terry Reedy :
>
> > On 4/2/2016 12:44 PM, Marko Rauhamaa wrote:
> >
> >> Nowadays software companies and communities are international.
> >
> > Grade school classrooms, especially pre-high school, are not.
>
> Parenthetically, English teachers in Finland have been happy with how
> teenage boys' English grades have gone up with the advent of online
> gaming.
>
>High school boys get more top scores in English than girls. According
>to a recent master's thesis, the most important causal factor for the
>boys' success is spending a lot of time playing computer games. An
>English language professor wants to raise awareness about the role of
>games for language skills.
>

Gaming also helps your reaction time. Normally 0.3 s, but 0.1 s for top
gamers. And fighter pilots.

>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Steven D'Aprano
On Sun, 3 Apr 2016 03:12 am, Thomas 'PointedEars' Lahn wrote:

> Marko Rauhamaa wrote:
> 
>> Steven D'Aprano :
>>> So you're saying that learning to be a fluent speaker of English is a
>>> pre-requisite of being a programmer?
>> 
>> No more than learning Latin is a prerequisite of being a doctor.
> 
> Full ACK.  Probably starting with the Industrial Revolution enabled by the
> improvements of the steam machine in England, English has become the
> /lingua franca/ of technology (even though the French often still
> disagree, preferring words like « ordinateur » and « octet » over
> “computer” and “byte”, respectively¹).  (With the Internet at the latest,
> then, it has also become the /lingua franca/ of science, although Latin
> terms are common in medicine.)

And this is a BAD THING. Monoculture is harmful, and science and technology
is becoming a monoculture: Anglo-American language expressing
Anglo-American ideas, regardless of the nationality of the scientist or
engineer.

During the heyday of the scientific revolution, the sciences and mathematics
were much diverse. Depending on your field, the professional scientist
needed at least a working knowledge of German, French, English and Latin,
possibly some Greek and Russian. Likewise for engineering.

I don't think that it is a coincidence that the great scientific theories
like relativity (both of them), quantum mechanics, evolution by natural
selection and continental drift had time to mature in smaller, national
communities before diffusing out to broader international communities.

Fortunately at least some people are aware of the problem and doing
something about it:

https://blog.stackoverflow.com/2014/02/cant-we-all-be-reasonable-and-speak-english/

Unlike the rest of us, Stackoverflow have actually run the numbers: 

10% of the world's programmers are in China
1.4% of their visits come from China

so either Chinese developers are so brilliant and knowledgeable that they
have no need of Stackoverflow, or they're unable to make use of it because
they cannot participate in English-only forums.


>> Nowadays software companies and communities are international. You never
>> know who needs to maintain your code. At work, I need to maintain code
>> that was created in Japan, with coworkers from all over the world. The
>> Japanese author had had a hard time with English, and made some
>> awkward naming choices, but had the common sense to use English-only
>> names in his code.
> 
> One will have a hard time finding a company or community, international or
> not, that does not have at least a basic knowledge of English included in
> what they require of a software developer.

Particularly if one keeps a Euro-centric perspective and doesn't look to
Asia or Brazil.



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Mark Lawrence via Python-list

On 02/04/2016 17:31, Dennis Lee Bieber wrote:

On Sat, 2 Apr 2016 19:15:36 +1100, Chris Angelico 
declaimed the following:


On Sat, Apr 2, 2016 at 3:27 PM, Random832  wrote:

On Fri, Apr 1, 2016, at 19:29, Michael Selik wrote:

Humans have always had trouble with this, in many contexts. I remember
being annoyed at folks saying the year 2000 was the first year of the new
millennium, rather than 2001. They'd forgotten the Gregorian calendar
starts from AD 1.


Naturally, this means the first millennium was only 999 years long, and
all subsequent millennia were 1000 years long. (Whereas "millennium" is
defined as the set of all years of a given era for a given integer k
where y // 1000 == k. How else would you define it?)

And if you want to get technical, the gregorian calendar starts from
some year no earlier than 1582, depending on the country. The year
numbering system has little to do with the calendar type - your
assertion in fact regards the BC/AD year numbering system, which was
invented by Bede.

The astronomical year-numbering system, which does contain a year zero
(and uses negative numbers rather than a reverse-numbered "BC" era), and
is incidentally used by ISO 8601, was invented by Jacques Cassini in the
17th century.



Are you sure? Because I'm pretty sure these folks were already talking about BC.


Bede's BC/AD goes back to circa 700AD. It is the use of negative years
for astronomical counting that is circa 1650AD


http://xenohistorian.faithweb.com/holybook/quotes/YK.html


And that I'll take as something suited for the first of April... It's
almost on par with an old story (in Asimov's, I think) on why the pyramids
were behind schedule -- among other things, the pile of government-mandated
documentation, on clay tablets of course, was becoming larger than the
pyramid being built; the older records (on the bottom of the stack) were
decomposing from the pressure, etc. If I recall, they discover cuneiform is
more condensed than hieroglyphics, and then learn of papyrus/ink (but then
have to support an entire industry of workers to transcribe the old clay
tablets...)




Here we go again, yet another completely useless thread that is 
irrelevant to the Python programming language.  Hardly surprising that 
the bots don't bother any more.  Are any of the bots still alive?


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Mark Lawrence via Python-list

On 03/04/2016 00:49, Steven D'Aprano wrote:

On Sun, 3 Apr 2016 03:12 am, Thomas 'PointedEars' Lahn wrote:


Marko Rauhamaa wrote:


Steven D'Aprano :

So you're saying that learning to be a fluent speaker of English is a
pre-requisite of being a programmer?


No more than learning Latin is a prerequisite of being a doctor.


Full ACK.  Probably starting with the Industrial Revolution enabled by the
improvements of the steam machine in England, English has become the
/lingua franca/ of technology (even though the French often still
disagree, preferring words like « ordinateur » and « octet » over
“computer” and “byte”, respectively¹).  (With the Internet at the latest,
then, it has also become the /lingua franca/ of science, although Latin
terms are common in medicine.)


And this is a BAD THING. Monoculture is harmful, and science and technology
is becoming a monoculture: Anglo-American language expressing
Anglo-American ideas, regardless of the nationality of the scientist or
engineer.

During the heyday of the scientific revolution, the sciences and mathematics
were much diverse. Depending on your field, the professional scientist
needed at least a working knowledge of German, French, English and Latin,
possibly some Greek and Russian. Likewise for engineering.

I don't think that it is a coincidence that the great scientific theories
like relativity (both of them), quantum mechanics, evolution by natural
selection and continental drift had time to mature in smaller, national
communities before diffusing out to broader international communities.

Fortunately at least some people are aware of the problem and doing
something about it:

https://blog.stackoverflow.com/2014/02/cant-we-all-be-reasonable-and-speak-english/

Unlike the rest of us, Stackoverflow have actually run the numbers:

10% of the world's programmers are in China
1.4% of their visits come from China

so either Chinese developers are so brilliant and knowledgeable that they
have no need of Stackoverflow, or they're unable to make use of it because
they cannot participate in English-only forums.



Nowadays software companies and communities are international. You never
know who needs to maintain your code. At work, I need to maintain code
that was created in Japan, with coworkers from all over the world. The
Japanese author had had a hard time with English, and made some
awkward naming choices, but had the common sense to use English-only
names in his code.


One will have a hard time finding a company or community, international or
not, that does not have at least a basic knowledge of English included in
what they require of a software developer.


Particularly if one keeps a Euro-centric perspective and doesn't look to
Asia or Brazil.



My mum was from Tredegar.  She was very upset because English newspaper 
correspondents were biased against her "boys", and because the selectors 
never even put her into the squad, let alone the starting lineup.


Of course this is completely irrelevant on the Python programming main 
mailing list, but it appears that any old crap is acceptable in the year 
2016.


A Bot, a Bot, any kingdom for a Bot.

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Steven D'Aprano
On Sun, 3 Apr 2016 07:42 am, Michael Selik wrote:

> Gaming also helps your reaction time. Normally 0.3 s, but 0.1 s for top
> gamers. And fighter pilots.


Does gaming help reaction time, or do only people with fast reaction times
become top gamers?

Personally, in my experience gaming hurts reaction time. I ask people a
question, and they don't reply for a week or at all, because they're too
busy playing games all day.


-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Mark Lawrence via Python-list

On 03/04/2016 01:48, Steven D'Aprano wrote:

On Sun, 3 Apr 2016 07:42 am, Michael Selik wrote:


Gaming also helps your reaction time. Normally 0.3 s, but 0.1 s for top
gamers. And fighter pilots.


Does gaming help reaction time, or do only people with fast reaction times
become top gamers?

Personally, in my experience gaming hurts reaction time. I ask people a
question, and they don't reply for a week or at all, because they're too
busy playing games all day.



I must agree.  When you're trying to get the ball away, and 23 stone of 
bone and muscle smashes into you, that slows your reaction time.  I am 
of course referring to the sport of rugby, not that silly "World 
Series", which takes place in only one country, where for some reason 
unknown to me they wear huge quantities of armour and need oxygen masks 
after they've run a few yards.  What would happen to the poor little 
darlings if they had to spend the entire match on the pitch?


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Gregory Ewing

Chris Angelico wrote:

Yep! And the letters (thorn and eth) survive in a very few languages
(Icelandic, notably). Fortunately, Python 3 lets you use it in
identifiers.


This suggests an elegant solution to the problem of whether
"python" should refer to Python 2 or Python 3. The Python 3
link should be "pyþon"!

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Gregory Ewing

Thomas 'PointedEars' Lahn wrote:
even though the French often still disagree, 
preferring words like « ordinateur » and « octet » over “computer” and 
“byte”, respectively


To be fair, "octet" is a slightly more precise term than
"byte", meaning exactly 8 bits (whereas "byte" could
theoretically mean something else depending on the
context).

There's no excuse for "ordinateur", though. :-)

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Steven D'Aprano
On Sun, 3 Apr 2016 02:57 pm, Gregory Ewing wrote:

> Thomas 'PointedEars' Lahn wrote:
>> even though the French often still disagree,
>> preferring words like « ordinateur » and « octet » over “computer” and
>> “byte”, respectively
> 
> To be fair, "octet" is a slightly more precise term than
> "byte", meaning exactly 8 bits (whereas "byte" could
> theoretically mean something else depending on the
> context).

"Theoretically"?

http://stackoverflow.com/questions/2098149/what-platforms-have-something-other-than-8-bit-char






-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-02 Thread Michael Torrie
Mark, your messages are showing up to the list as being from "python,"
at least on my email.  Any reason for this?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Thomas 'PointedEars' Lahn
Michael Torrie wrote:

> Mark, your messages are showing up to the list as being from "python,"
> at least on my email.  Any reason for this?

Depends on which Mark you are addressing and how you are reading e-mail.

The messages of Mark Lawrence, for example, appear to me as technically 
correct as can be expected from a botched Mail-to-News interface; in 
particular, their “From” header fields are correct.

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Thomas 'PointedEars' Lahn
Rustom Mody wrote:

> On Saturday, April 2, 2016 at 10:42:27 PM UTC+5:30, Thomas 'PointedEars'
> Lahn wrote:
>> Marko Rauhamaa wrote:
>> > Steven D'Aprano :
>> >> So you're saying that learning to be a fluent speaker of English is a
>> >> pre-requisite of being a programmer?
>> > 
>> > No more than learning Latin is a prerequisite of being a doctor.
>> 
>> Full ACK.  Probably starting with the Industrial Revolution enabled by
>> the improvements of the steam machine in England, English has become the
>> /lingua franca/ of technology (even though the French often still
>> disagree, preferring words like « ordinateur » and « octet » over
>> “computer” and
>> “byte”, respectively¹).  (With the Internet at the latest, then, it has
>> also become the /lingua franca/ of science, although Latin terms are
>> common in medicine.)
> 
> IMHO the cavalier usage of random alphabet-soup for identifiers

Straw man.  Nobody has suggested that.  Suggested were words in natural 
languages other than English as (parts of) names in Python programs.

The suggestion was rejected by some (including me) on the grounds that 
source code is not written only for the person writing it, but also for 
other developers to read, and that English is the /lingua franca/ of 
software development at least.  So it is reasonable to expect a software 
developer to understand English, and more software developers are going to 
understand the source code if it is written in English.

Another argument that was made in favor of English-language names (albeit on 
the grounds of “nausea” instead of the logical reason of practicality) is 
that the (Python) programming language’s keywords (e.g., False, None, True, 
and, as, assert [1]) and built-in identifiers (e.g., NotImplemented, 
Ellipsis, abs, all, int, float, complex, iterator [2]) are (abbreviations or 
concatenations of) *English* words; therefore, mixing keywords with names
in a natural language other than English causes source code to be more 
difficult to read than an all-English source code (string values 
notwithstanding).  This is particularly true with Python because a lot of 
(well-written) Python code can easily be read as if it were pseudocode.  (I 
would not be surprised at all to learn that this was Guido van Rossum’s 
intention.)

As for the “Chinese” argument, I did some research recently, indicating that 
it is a statistical fallacy:




From personal experience, I can say that I had no great difficulty 
communicating in English with my Chinese flatmates and classmates at a 
German technical university when all of us were studying computer science 
there 16 years ago.  It was natural.  At least the boys even preferred self-
chosen English first names for themselves (e.g., in instant messaging) 
since, as they explained to me, their original names were difficult to 
pronounce correctly for Europeans (or Europeans might mistakenly call them 
by their family name since it would come first), and to type on European 
keyboards (although I observed them to be proficient in using IMEs when 
chatting with their folks back home).


[1] 
[2] 

> can lead to worse than just aesthetic unpleasantness:
> https://en.wikipedia.org/wiki/IDN_homograph_attack

Relevance?

> When python went to full unicode identifers it should have also added
> pragmas for which blocks the programmer intended to use -- something like
> a charset declaration of html.
> 
> This way if the programmer says "I want latin and greek"
> and then A and Α get mixed up well he asked for it.
> If he didn't ask then springing it on him seems unnecessary and uncalled
> for

Nonsense.

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread alister
On Sun, 03 Apr 2016 02:04:05 +0100, Mark Lawrence wrote:

> On 03/04/2016 01:48, Steven D'Aprano wrote:
>> On Sun, 3 Apr 2016 07:42 am, Michael Selik wrote:
>>
>>> Gaming also helps your reaction time. Normally 0.3 s, but 0.1 s for
>>> top gamers. And fighter pilots.
>>
>> Does gaming help reaction time, or do only people with fast reaction
>> times become top gamers?
>>
>> Personally, in my experience gaming hurts reaction time. I ask people a
>> question, and they don't reply for a week or at all, because they're
>> too busy playing games all day.
>>
>>
> I must agree.  When you're trying to get the ball away, and 23 stone of
> bone and muscle smashes into you, that slows your reaction time.  I am
> of course referring to the sport of rugby, not that silly "World
> Series", which takes part in only one country, where for some reason
> unknown to me they wear huge quantities of armour and need oxygen masks
> after they've run a few yards.  What would happen to the poor little
> darlings if they had to spend the entire match on the pitch?

While I agree with your sentiments, you have a few minor inaccuracies:

the "World Series" has nothing to do with Poofs In Pads; it is actually 
Rounders. 



-- 
Real programmers don't write in BASIC.  Actually, no programmers write in
BASIC after reaching puberty.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Rustom Mody
On Sunday, April 3, 2016 at 5:17:36 PM UTC+5:30, Thomas 'PointedEars' Lahn 
wrote:
> Rustom Mody wrote:
> 
> > On Saturday, April 2, 2016 at 10:42:27 PM UTC+5:30, Thomas 'PointedEars'
> > Lahn wrote:
> >> Marko Rauhamaa wrote:
> >> > Steven D'Aprano :
> >> >> So you're saying that learning to be a fluent speaker of English is a
> >> >> pre-requisite of being a programmer?
> >> > 
> >> > No more than learning Latin is a prerequisite of being a doctor.
> >> 
> >> Full ACK.  Probably starting with the Industrial Revolution enabled by
> >> the improvements of the steam machine in England, English has become the
> >> /lingua franca/ of technology (even though the French often still
> >> disagree, preferring words like « ordinateur » and « octet » over
> >> “computer” and
> >> “byte”, respectively¹).  (With the Internet at the latest, then, it has
> >> also become the /lingua franca/ of science, although Latin terms are
> >> common in medicine.)
> > 
> > IMHO the cavalier usage of random alphabet-soup for identifiers
> 
> Straw man.  Nobody has suggested that.  Suggested were words in natural 
> languages other than English as (parts of) names in Python programs.
> 
> The suggestion was rejected by some (including me) on the grounds that 
> source code is not written only for the person writing it, but also for 
> other developers to read, and that English is the /lingua franca/ of 
> software development at least.  So it is reasonable to expect a software 
> developer to understand English, and more software developers are going to 
> understand the source code if it is written in English.
> 
> Another argument that was made in favor of English-language names (albeit on 
> the grounds of “nausea” instead of the logical reason of practicality) is 
> that the (Python) programming language’s keywords (e.g., False, None, True, 
> and, as, assert [1]) and built-in identifiers (e.g., NotImplemented, 
> Ellipsis, abs, all, int, float, complex, iterator [2]) are (abbreviations or 
> concatenations of) *English* words; therefore, mixing keywords with names
> in a natural language other than English causes source code to be more 
> difficult to read than an all-English source code (string values 
> notwithstanding).  This is particularly true with Python because a lot of 
> (well-written) Python code can easily be read as if it were pseudocode.  (I 
> would not be surprised at all to learn that this was Guido van Rossum’s 
> intention.)
> 
> As for the “Chinese” argument, I did some research recently, indicating that 
> it is a statistical fallacy:
> 
> 
> 
> 
> From personal experience, I can say that I had no great difficulty 
> communicating in English with my Chinese flatmates and classmates at a 
> German technical university when all of us were studying computer science 
> there 16 years ago.  It was natural.  At least the boys even preferred self-
> chosen English first names for themselves (e.g., in instant messaging) 
> since, as they explained to me, their original names were difficult to 
> pronounce correctly for Europeans (or Europeans might mistakenly call them 
> by their family name since it would come first), and to type on European 
> keyboards (although I observed them to be proficient in using IMEs when 
> chatting with their folks back home).
> 
> 
> [1] 
> [2] 
> 
> > can lead to worse than just aesthetic unpleasantness:
> > https://en.wikipedia.org/wiki/IDN_homograph_attack
> 
> Relevance?
> 
> > When python went to full unicode identifers it should have also added
> > pragmas for which blocks the programmer intended to use -- something like
> > a charset declaration of html.
> > 
> > This way if the programmer says "I want latin and greek"
> > and then A and Α get mixed up well he asked for it.
> > If he didn't ask then springing it on him seems unnecessary and uncalled
> > for
> 
> Nonsense.

It looks like there is some misunderstanding of what I said
[guessing also from Marko's "...silly..."].

So here are some examples to illustrate what I am saying:

Example 1 -- Ligatures (ﬂag below is spelled with the U+FB02 fl ligature,
flag with plain ASCII):

Python3 gets it right
>>> ﬂag = 1
>>> flag
1

Whereas haskell gets it wrong:
Prelude> let ﬂag = 1
Prelude> flag

<interactive>:3:1: Not in scope: ‘flag’
Prelude> ﬂag
1
Prelude> 

Example 2 Case Sensitivity
Scheme¹ gets it right

> (define a 1)
> A
1
> a
1

Python gets it wrong
>>> a=1
>>> A
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'A' is not defined
>>> a
1

[Likewise filenames: Windows gets it right; Unix wrong]

Unicode identifiers in the spirit of the IDN homograph attack:
every language that 'supports' unicode gets it wrong

Python3 (A below is Latin; Α is GREEK CAPITAL LETTER ALPHA)
>>> A=1
>>> Α
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'Α' is not defined
>>> A
1
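
What Python is doing in these cases is PEP 3131's rule that identifiers
are NFKC-normalized at parse time: compatibility characters fold together,
while cross-script homographs stay distinct. A quick check with
unicodedata, illustrative only:

>>> import unicodedata
>>> unicodedata.normalize('NFKC', '\ufb02ag')   # U+FB02 fl ligature + 'ag'
'flag'
>>> unicodedata.normalize('NFKC', '\u2126')     # U+2126 OHM SIGN
'Ω'
>>> unicodedata.normalize('NFKC', '\u0391') == 'A'   # GREEK CAPITAL ALPHA
False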


Re: [beginner] What's wrong?

2016-04-03 Thread Rustom Mody
On Sunday, April 3, 2016 at 5:19:33 AM UTC+5:30, Steven D'Aprano wrote:
> On Sun, 3 Apr 2016 03:12 am, Thomas 'PointedEars' Lahn wrote:
> 
> > Marko Rauhamaa wrote:
> > 
> >> Steven D'Aprano :
> >>> So you're saying that learning to be a fluent speaker of English is a
> >>> pre-requisite of being a programmer?
> >> 
> >> No more than learning Latin is a prerequisite of being a doctor.
> > 
> > Full ACK.  Probably starting with the Industrial Revolution enabled by the
> > improvements of the steam machine in England, English has become the
> > /lingua franca/ of technology (even though the French often still
> > disagree, preferring words like << ordinateur >> and << octet >> over
> > "computer" and "byte", respectively¹).  (With the Internet at the latest,
> > then, it has also become the /lingua franca/ of science, although Latin
> > terms are common in medicine.)
> 
> And this is a BAD THING. Monoculture is harmful, and science and technology
> is becoming a monoculture: Anglo-American language expressing
> Anglo-American ideas, regardless of the nationality of the scientist or
> engineer.

I think you are ending up making the opposite point of the one you want to 
make.
Yeah... ok, monoculture is a bad thing.
Is python(3) helping towards a 'polyculture'?

To see this, consider some app like Word or Gimp that has significant 
functionality and a history of over 20 years.

So let us say some 10 years ago it was internationalized.
This consists of 
1. Rewriting the 'strings' into gettext (or whatever) form, along with other
program reorgs (see the sketch after this list)
2. Translators actually translating the 'strings'
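
Step 1 looks roughly like this -- a minimal gettext sketch, where 'myapp'
and the 'locale' directory are hypothetical names:

import gettext

# Install _() into builtins for the 'myapp' catalog; with no catalog
# present, _() falls back to returning the English string unchanged.
gettext.install('myapp', 'locale')

print(_("Length: "))   # translators supply per-language entries for this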

Or take a modern OS like Windows or Ubuntu -- from the first install screen
we can pick a language and then it will be localized to that

To really localize python one would have to
1. Localize the keywords
2. Localize all module names
3. Localize all the help strings
4. Localize the entire stuff up at https://docs.python.org/3/
5. ...

That is probably one or two orders of magnitude more work than
localizing gimp or Word

So if this is the full goal how far does
"You can now spell (or misspell) your python identifiers in any language of 
your choice"
go towards that goal?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Dan Sommers
On Sun, 03 Apr 2016 07:30:47 -0700, Rustom Mody wrote:

> So here are some examples to illustrate what I am saying:

[A vs a, A vs Α, flag vs ﬂag, etc.]

Are identifiers text or bytes? or something else entirely that takes
natural language rules and the appearance of the glyphs into account?

I, for one, am very happy that identifiers are more like bytes than they
are like text.  The rules for equality for sequences of bytes are
well-defined and unambiguous.  The rules for equality for text are not.
Do I have to know the details of every human (and some non-human)
language, not to mention their typographical conventions (e.g.,
ligatures) just to determine whether two identifiers are the same?

Yes, it's marginally annoying, and a security hole waiting to happen,
that A and Α often look very much alike.  It's also troubling that I, a
native English speaker with some knowledge of a random selection of
other languages, should know whether e and é are the same, or whether ĳ
and ij are the same, and that it might depend on the fonts that happen to
have been used to render them.

And where does it end?  If ﬂag and flag are the same, then are Omega and
Ω the same?

In English (and many other languages), it is wrong to spell my first
name with an initial lower case letter.  Therefore, Dan and dan are not,
and should not be, the same identifier.

ObPython:  if my identifiers are case-insensitive, then what about the
language's keywords?  Can I spell class and for as Class and For?

I understand that in some use cases, ﬂag and flag represent the same
English word, but please don't extend that to identifiers in my
software.

Dan
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Rustom Mody
On Sunday, April 3, 2016 at 8:58:59 PM UTC+5:30, Dan Sommers wrote:
> On Sun, 03 Apr 2016 07:30:47 -0700, Rustom Mody wrote:
> 
> > So here are some examples to illustrate what I am saying:
> 
> [A vs a, A vs Α, flag vs ﬂag, etc.]

> I understand that in some use cases, ﬂag and flag represent the same
> English word, but please don't extend that to identifiers in my
> software.

I wonder once again if you are getting my point opposite to the one I am making.
With ASCII there were problems like O vs 0 -- niggling but small.

With Unicode it's a gigantic Pandora's box.
Python, by allowing Unicode identifiers without restraint, has made grief for
unsuspecting programmers.

That is why my original suggestion was that there should have been, alongside
this 'brave new world', a pragma wherein a programmer can EXPLICITLY declare
#language Greek
Then he is knowingly opting into possible clashes between A and Α,
but not between A and А.

[And if you think the above is a philosophical disquisition on Aristotle's
law of identity, "A is A", you just proved my point that unconstrained Unicode
identifiers are a mess]
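
A minimal sketch of the kind of opt-in check I mean -- entirely
hypothetical, using the first word of each character's Unicode name as a
rough stand-in for its script:

import unicodedata

ALLOWED = {"LATIN", "GREEK"}   # hypothetical per-file '#language' declaration

def check_identifier(name):
    # Reject identifiers with non-ASCII letters outside the declared scripts.
    for ch in name:
        if ch.isalpha() and ord(ch) > 127:
            script = unicodedata.name(ch).split()[0]   # 'GREEK', 'CYRILLIC', ...
            if script not in ALLOWED:
                raise ValueError('%r (%s) not declared in this file'
                                 % (ch, unicodedata.name(ch)))

check_identifier('Αλφα')        # Greek throughout: passes under this declaration
try:
    check_identifier('Аlpha')   # leading CYRILLIC CAPITAL LETTER A
except ValueError as e:
    print(e)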
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Rustom Mody
On Sunday, April 3, 2016 at 8:58:59 PM UTC+5:30, Dan Sommers wrote:
> Yes, it's marginally annoying, and a security hole waiting to happen,
> that A and Α often look very much alike.


"A security hole waiting to happen" = "Marginally annoying"

Frankly I find this juxtaposition alarming

Personal note: I once was idiot enough to have root with password root123
while transferring some files to a friend ... over ssh...
I lost my entire installation in a matter of minutes.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Larry Martell
On Sun, Apr 3, 2016 at 11:46 AM, Rustom Mody  wrote:
> Personal note: I once was idiot enough to have root with password root123

I changed my password to "incorrect," so whenever I forget it the
computer will say, "Your password is incorrect."
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Chris Angelico
On Mon, Apr 4, 2016 at 1:46 AM, Rustom Mody  wrote:
> On Sunday, April 3, 2016 at 8:58:59 PM UTC+5:30, Dan Sommers wrote:
>> Yes, it's marginally annoying, and a security hole waiting to happen,
>> that A and Α often look very much alike.
>
>
> "A security hole waiting to happen" = "Marginally annoying"
>
> Frankly I find this juxtaposition alarming
>
> Personal note: I once was idiot enough to have root with password root123
> while transferring some files to a friend ... over ssh...
> I lost my entire installation in a matter of minutes.

Exactly why did you have root ssh access with a password?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Dan Sommers
On Sun, 03 Apr 2016 08:46:59 -0700, Rustom Mody wrote:

> On Sunday, April 3, 2016 at 8:58:59 PM UTC+5:30, Dan Sommers wrote:
>> Yes, it's marginally annoying, and a security hole waiting to happen,
>> that A and Α often look very much alike.
> 
> "A security hole waiting to happen" = "Marginally annoying"
> 
> Frankly I find this juxtaposition alarming

Sorry about that.

I didn't mean to equate the two.  I meant to point out that the fact
that A and Α look alike can be one, or both, of those things.  Perhaps I
should have used "or" instead of "and."
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Michael Okuntsov

On 03.04.2016 20:52, Rustom Mody wrote:


To really localize python one would have to
1. Localize the keywords
2. Localize all module names
3. Localize all the help strings
4. Localize the entire stuff up at https://docs.python.org/3/
5. ...

That is probably one or two orders of magnitude more work than
localizing gimp or Word

So if this is the full goal how far does
"You can now spell (or misspell) your python identifiers in any language of your 
choice"
go towards that goal?



As the OP, can I participate in the discussion? Here in Russia we have a 
monstrous bookkeeping system called 1C-Predpriyatiye that is used by 
almost all firms and organizations, from kiosks to huge factories. This 
system has a Basic-like language with keywords, module names etc. 
localized in Russian language, but it has also English, German, and 
Ukrainian localizations. I don't want to say that common programming 
languages should be like this, but here we have an example that it can 
be done.

--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Chris Angelico
On Mon, Apr 4, 2016 at 2:24 AM, Michael Okuntsov
 wrote:
> As an OP, can I participate in the discussion? Here in Russia we have a
> monstrous bookkeeping system called 1C-Predpriyatiye that is used by almost
> all firms and organizations, from kiosks to huge factories. This system has
> a Basic-like language with keywords, module names etc. localized in Russian
> language, but it has also English, German, and Ukrainian localizations. I
> don't want to say that common programming languages should be like this, but
> here we have an example that it can be done.

Absolutely you can participate! And thank you. That's exactly the sort
of thing I'm talking about; you should be able to script that in
Russian if your business is primarily Russian.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Dan Sommers
On Sun, 03 Apr 2016 08:39:02 -0700, Rustom Mody wrote:

> On Sunday, April 3, 2016 at 8:58:59 PM UTC+5:30, Dan Sommers wrote:
>> On Sun, 03 Apr 2016 07:30:47 -0700, Rustom Mody wrote:
>> 
>> > So here are some examples to illustrate what I am saying:
>> 
>> [A vs a, A vs A, flag vs flag, etc.]
> 
>> I understand that in some use cases, flag and flag represent the same
>> English word, but please don't extend that to identifiers in my
>> software.

> I wonder once again if you are getting my point opposite to the one I
> am making.  With ASCII there were problems like O vs 0 -- niggling but
> small.
> 
> With Unicode it's a gigantic Pandora's box.  Python by allowing Unicode
> identifiers without restraint has made grief for unsuspecting
> programmers.

What about the A vs a case, which comes up even with ASCII-only
characters?  If those are the same, then I, as a reader of Python code,
have to understand all the rules about ß (which I think have changed
over time), and potentially þ and others.
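
(For a taste of why those rules are tricky, Python 3's own string methods --
any recent version -- hold a few surprises:)

>>> 'ß'.upper()        # uppercasing grows the string
'SS'
>>> 'straße'.casefold() == 'STRASSE'.casefold()
True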

> That is why my original suggestion was that there should have been, alongside this
> 'brave new world', a pragma wherein a programmer can EXPLICITLY declare
> #language Greek
> Then he is knowingly opting into possible clashes between A and Α
> But not between A and А.

If I declared #language Greek, then I'd expect an identifier like A to
be rejected by the compiler.  That said, I don't know if that sort of
distinction is as clear cut in every language supported by Unicode.

And just to cause trouble (because that's the way I feel today), can I
declare

#γλώσσα Ελληνική

;-)

> [And if you think the above is a philosophical disquisition on
> Aristotle's law of identity: "A is A" you just proved my point that
> unconstrained Unicode identifiers is a mess]

Can we take a "we're all adults here" approach?  For the same reason
that adults don't use identifiers like xl0, x10, xlO, and xl0 anywhere
near each other, shouldn't we also not use A and A anywhere near each
other?  I certainly don't want the language itself to [try to] reject
x10 and xIO because they look too much alike in many fonts.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Chris Angelico
On Mon, Apr 4, 2016 at 2:22 AM, Dan Sommers  wrote:
> What about the A vs a case, which comes up even with ASCII-only
> characters?  If those are the same, then I, as a reader of Python code,
> have to understand all the rules about ß (which I think have changed
> over time), and potentially þ and others.

And Iİıi, and Σσς, and (if you want completeness) ſ too. And various
other case conversion rules. It's not possible to case-fold perfectly
without knowing what language something is.
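
(A couple of quick examples from a Python 3 session:)

>>> 'İ'.lower()        # Turkish dotted capital I
'i̇'
>>> len('İ'.lower())   # becomes 'i' plus a combining dot above
2
>>> 'Σσς'.casefold()   # capital, medial and final sigma all fold together
'σσσ'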

This, coupled with the extremely useful case distinction between
"Classes" and "instances", means I'm very much glad Python is case
sensitive. "base = Base()" is perfectly legal and meaningful, no
matter what language you translate those words into (well, as long as
it's bicameral - otherwise you need to adorn one of them somehow, but
you'd have to anyway).

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Rustom Mody
On Sunday, April 3, 2016 at 9:30:40 PM UTC+5:30, Chris Angelico wrote:
> Exactly why did you have root ssh access with a password?

Umm... Don't exactly remember.
Probably it was not strictly necessary.
Combination of carelessness, stupidity, hurry

Brings me to...

On Sunday, April 3, 2016 at 9:41:11 PM UTC+5:30, Dan Sommers wrote:
> On Sun, 03 Apr 2016 08:46:59 -0700, Rustom Mody wrote:
> 
> > On Sunday, April 3, 2016 at 8:58:59 PM UTC+5:30, Dan Sommers wrote:
> >> Yes, it's marginally annoying, and a security hole waiting to happen,
> >> that A and A often look very much alike.
> > 
> > "A security hole waiting to happen" = "Marginally annoying"
> > 
> > Frankly I find this juxtaposition alarming
> 
> Sorry about that.
> 
> I didn't mean to equate the two.  I meant to point out that the fact
> that A and A look alike can be one, or both, of those things.  Perhaps I
> should have used "or" instead of "and."

Chill! No offence.

Just that when you have the above ingredients (carelessness, stupidity, 
hurry) multiplied by a GHz clock, it makes for spicy security
incidents(!).

I just meant to say that "Just a lil security incident" is not a helpful 
attitude to foster
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Rustom Mody
On Sunday, April 3, 2016 at 9:56:24 PM UTC+5:30, Dan Sommers wrote:
> On Sun, 03 Apr 2016 08:39:02 -0700, Rustom Mody wrote:
> 
> > On Sunday, April 3, 2016 at 8:58:59 PM UTC+5:30, Dan Sommers wrote:
> >> On Sun, 03 Apr 2016 07:30:47 -0700, Rustom Mody wrote:
> >> 
> >> > So here are some examples to illustrate what I am saying:
> >> 
> >> [A vs a, A vs A, flag vs flag, etc.]
> > 
> >> I understand that in some use cases, flag and flag represent the same
> >> English word, but please don't extend that to identifiers in my
> >> software.
> 
> > I wonder once again if you are getting my point opposite to the one I
> > am making.  With ASCII there were problems like O vs 0 -- niggling but
> > small.
> > 
> > With Unicode it's a gigantic Pandora's box.  Python by allowing Unicode
> > identifiers without restraint has made grief for unsuspecting
> > programmers.
> 
> What about the A vs a case, which comes up even with ASCII-only
> characters?  If those are the same, then I, as a reader of Python code,
> have to understand all the rules about ß (which I think have changed
> over time), and potentially þ and others.

Don't get your point.
If you know German then these rules should be clear enough to you.
If not you've probably got bigger problems reading that code anyway.

As illustration, here is Marko's code from a few posts back:

for oppilas in luokka:
    if oppilas.hylätty():
        oppilas.ilmoita(oppilas.koetulokset)

Does it make sense to you?

> 
> > That is why my original suggestion was that there should have been,
> > alongside this
> > 'brave new world', a pragma wherein a programmer can EXPLICITLY declare
> > #language Greek
> > Then he is knowingly opting into possible clashes between A and Α
> > But not between A and А.
> 
> If I declared #language Greek, then I'd expect an identifier like A to
> be rejected by the compiler.  That said, I don't know if that sort of
> distinction is as clear cut in every language supported by Unicode.
> 
> And just to cause trouble (because that's the way I feel today), can I
> declare
> 
> #γλώσσα Ελληνική
> 
> ;-)
> 
> > [And if you think the above is a philosophical disquisition on
> > Aristotle's law of identity: "A is A" you just proved my point that
> > unconstrained Unicode identifiers is a mess]
> 
> Can we take a "we're all adults here" approach?

Who's the 'we' we are talking about?

> For the same reason
> that adults don't use identifiers like xl0, x10, xlO, and xl0 anywhere
> near each other, shouldn't we also not use A and A anywhere near each
> other?  I certainly don't want the language itself to [try to] reject
> x10 and xIO because they look too much alike in many fonts.

When Kernighan and Ritchie wrote C there was no problem with gets.
Then suddenly, decades later the problem exploded.

What happened?

Here's an analysis:
Security means two almost completely unrelated concepts
- protection against shooting oneself in the foot (remember the 'protected' 
   keyword of C++ ?)
- protection against intelligent, capable, motivated criminals
Let's call them security-s (against stupidity) and security-c (against criminals).

Security-c didn't figure because computers were in any case physically secured and
there was not much internet to speak of.
gets was provided exactly on your principle of 'consenting-adults' -- if you
use it you know what you are using.

Then suddenly computers became net-facing and their servers could be written by
'consenting' (to whom?) adults using gets.

Voila -- Security has just become a lucrative profession!

I believe Python's situation of laissez-faire Unicode is similarly
trouble-inviting.

While I personally don't know enough about security to be able to demonstrate a
full sequence of events, here's a little fun I had with Chris:

https://mail.python.org/pipermail/python-list/2014-May/672413.html

Do you not think this could be tailored into something more sinister and
dangerous?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Chris Angelico
On Mon, Apr 4, 2016 at 3:18 AM, Rustom Mody  wrote:
> While I personally don't know enough about security to be able to demonstrate a
> full sequence of events, here's a little fun I had with Chris:
>
> https://mail.python.org/pipermail/python-list/2014-May/672413.html
>
> Do you not think this could be tailored into something more sinister and
> dangerous?

I honestly don't know what you're proving there. You didn't import a
file called "1.py"; you just created a file with a non-ASCII name and
used a non-ASCII identifier to import it. In other words, you did
exactly what Unicode should allow: names in any language.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-03 Thread Dan Sommers
On Sun, 03 Apr 2016 10:18:45 -0700, Rustom Mody wrote:

> On Sunday, April 3, 2016 at 9:56:24 PM UTC+5:30, Dan Sommers wrote:
>> On Sun, 03 Apr 2016 08:39:02 -0700, Rustom Mody wrote:
>> 
>> > On Sunday, April 3, 2016 at 8:58:59 PM UTC+5:30, Dan Sommers wrote:
>> >> On Sun, 03 Apr 2016 07:30:47 -0700, Rustom Mody wrote:
>> >> 
>> >> > So here are some examples to illustrate what I am saying:
>> >> 
>> >> [A vs a, A vs A, flag vs flag, etc.]
>> > 
>> >> I understand that in some use cases, flag and flag represent the same
>> >> English word, but please don't extend that to identifiers in my
>> >> software.
>> 
>> > I wonder once again if you are getting my point opposite to the one I
>> > am making.  With ASCII there were problems like O vs 0 -- niggling but
>> > small.
>> > 
>> > With Unicode it's a gigantic Pandora's box.  Python by allowing Unicode
>> > identifiers without restraint has made grief for unsuspecting
>> > programmers.
>> 
>> What about the A vs a case, which comes up even with ASCII-only
>> characters?  If those are the same, then I, as a reader of Python code,
>> have to understand all the rules about ß (which I think have changed
>> over time), and potentially þ and others.
> 
> Don't get your point.
> If you know German then these rules should be clear enough to you.
> If not you've probably got bigger problems reading that code anyway.

My point is that case sensitivity is good.  I was disagreeing with your
point about scheme getting A vs a "right" and Python and C and Unix
getting it "wrong."

My larger point, and my experience, is that case sensitivity is easier
to handle than case insensitivity.  Most of the time, the same
letter's capital and small renditions look different from each other (A
vs a, Q vs q, and even Þ and þ is no worse than O and o), and there are
no context sensitive conversion rules to worry about.

> As illustration, here is Marko's code from a few posts back:
> 
> for oppilas in luokka:
>     if oppilas.hylätty():
>         oppilas.ilmoita(oppilas.koetulokset)
> 
> Does it make sense to you?

It makes enough sense to recognize the idiom:  for each item in a
collection that satisfies a predicate, call a method on the item.

My point here is that while the identifiers themselves can be enormously
helpful to someone seeing a block of code for the first time or
maintaining it five years later, it's just as important to recognize
quickly that one identifier is not the same as another one, or that a
particular identifier only appears once or only in certain syntactical
constructs.

If the above code were written a little differently, we'd be having a
completely different discussion:

for list in object:
    if list.clear():
        list.pop(list.append)

>> Can we take a "we're all adults here" approach?
> 
> Who's the 'we' we are talking about?

The community, which has accepted Python as a case-sensitive language and
knows better than to use identifiers that look too much alike or are
otherwise deliberately misleading.

>> For the same reason
>> that adults don't use identifiers like xl0, x10, xlO, and xl0 anywhere
>> near each other, shouldn't we also not use A and A anywhere near each
>> other?  I certainly don't want the language itself to [try to] reject
>> x10 and xIO because they look too much alike in many fonts.
> 
> When Kernighan and Ritchie wrote C there was no problem with gets.
> Then suddenly, decades later the problem exploded.

When Kernighan and Ritchie wrote C there *was* a problem with gets.

> What happened?

The problem was no longer isolated to taking down one Unix process or a
single machine, or discovering passwords on that one machine.

> Here's an analysis:
> Security means two almost completely unrelated concepts
> - protection against shooting oneself in the foot (remember the 'protected' 
>keyword of C++ ?)
> - protection against intelligent, capable, motivated criminals
> Let's call them security-s (against stupidity) and security-c (against
> criminals).
> 
> Security-c didn't figure because computers were in any case physically secured and
> there was not much internet to speak of.
> gets was provided exactly on your principle of 'consenting-adults' -- if you
> use it you know what you are using.
> 
> Then suddenly computers became net-facing and their servers could be
> written by 'consenting' (to whom?) adults using gets.
> 
> Voila -- Security has just become a lucrative profession!

I can't prevent insecure web servers, or unknowing users.  Allowing or
disallowing A and A and А to coexist in the source code doesn't matter.

> I believe python's situation of laissez-faire unicode is similarly
> trouble-inviting.

I'm not sure I agree, but I didn't see timing attacks on cryptographic
algorithms or devices reading passwords from air-gapped computers
coming, either.

I do know that complexity is also a source of bugs and security risks.
Allowing or disallowing certain unicode code points in identifiers, and
declaring that identifiers c

Re: [beginner] What's wrong?

2016-04-03 Thread Dan Sommers
On Sun, 03 Apr 2016 09:49:03 -0700, Rustom Mody wrote:

> On Sunday, April 3, 2016 at 9:41:11 PM UTC+5:30, Dan Sommers wrote:
>> On Sun, 03 Apr 2016 08:46:59 -0700, Rustom Mody wrote:
>> 
>> > On Sunday, April 3, 2016 at 8:58:59 PM UTC+5:30, Dan Sommers wrote:
>> >> Yes, it's marginally annoying, and a security hole waiting to happen,
>> >> that A and A often look very much alike.
>> > 
>> > "A security hole waiting to happen" = "Marginally annoying"
>> > 
>> > Frankly I find this juxtaposition alarming
>> 
>> Sorry about that.
>> 
>> I didn't mean to equate the two.  I meant to point out that the fact
>> that A and A look alike can be one, or both, of those things.  Perhaps I
>> should have used "or" instead of "and."
> 
> Chill! No offence.

I'm chilled.  :-)

No offense taken.  I am arguably overly sensitive to putting forth an
argument that isn't clear and concise (because I've also been known to
derail the proceedings until I can get my head around someone else's
argument).

> Just that when you have the above ingredients (carelessness,
> stupidity, hurry) multiplied by a GHz clock, it makes for spicy
> security incidents(!).  I just meant to say that "Just a lil security
> incident" is not a helpful attitude to foster

On this we agree.  :-)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-04 Thread Mark Lawrence via Python-list

On 02/04/2016 23:49, Michael Torrie wrote:

Mark, your messages are showing up to the list as being from "python,"
at least on my email.  Any reason for this?



Assuming that you're referring to me, frankly I haven't a clue.  I read 
this list with Thunderbird on Windows, I hit "reply" to something, I 
type, I hit "send", job done.  Thereafter, as far as I'm concerned, a 
miracle occurs and hundreds if not thousands of subscribers get to see 
my reply.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-04 Thread BartC

On 04/04/2016 15:04, Mark Lawrence wrote:

On 02/04/2016 23:49, Michael Torrie wrote:

Mark, your messages are showing up to the list as being from "python,"
at least on my email.  Any reason for this?



Assuming that you're referring to me, frankly I haven't a clue.  I read
this list with Thunderbird on Windows, I hit "reply" to something, I
type, I hit "send", job done.  Thereafter, as far as I'm concerned, a
miracle occurs and hundreds if not thousands of subscribers get to see
my reply.


On the same setup, I click 'Followup' to reply to a post, which sends to 
the group.


Other options on the dropdown box below the Followup button are Reply 
and Reply All.


Clicking Reply (on your post) starts an email to you, while Reply All 
starts an email to you and to the group.


--
Bartc



--
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-06 Thread Thomas 'PointedEars' Lahn
Rustom Mody wrote:

> On Sunday, April 3, 2016 at 5:17:36 PM UTC+5:30, Thomas 'PointedEars' Lahn
> wrote:
>> Rustom Mody wrote:
>> > When python went to full unicode identifiers it should have also added
>> > pragmas for which blocks the programmer intended to use -- something
>> > like a charset declaration of html.
>> > 
>> > This way if the programmer says "I want latin and greek"
>> > and then A and Α get mixed up well he asked for it.
>> > If he didn't ask then springing it on him seems unnecessary and
>> > uncalled for
>> 
>> Nonsense.
> 
> Some misunderstanding of what I said, it looks like.
> [Guessing also from Marko's "...silly..."]

First of all, while bandwidth might not be precious anymore to some, free 
time still is.  So please trim your quotations to the relevant minimum, to 
the parts you are actually referring to, and summarize properly if 
necessary.  For if you continue this mindbogglingly stupid full-quoting, 
this is going to be my last reply to you for a long time.  You have been 
warned.



> So here are some examples to illustrate what I am saying:
> 
> Example 1 -- Ligatures:
> 
> Python3 gets it right
> >>> ﬂag = 1
> >>> flag
> 1

Fascinating; confirmed with

| $ python3 
| Python 3.4.4 (default, Jan  5 2016, 15:35:18) 
| [GCC 5.3.1 20160101] on linux
| […]

I do not think this is correct, though.  Different Unicode code sequences, 
after normalization, should result in different symbols.

> Whereas haskell gets it wrong:
> Prelude> let flag = 1
> Prelude> flag
> 
> <interactive>:3:1: Not in scope: ‘flag’
> Prelude> flag
> 1
> Prelude>

I think Haskell gets it right here, while Py3k does not.  The “fl” is not to 
be decomposed to “fl”.
 
> Example 2 Case Sensitivity
> Scheme¹ gets it right
> 
>> (define a 1)
>> A
> 1
>> a
> 1

So Scheme is case-insensitive there.  So is (Visual) Basic.  That does not 
make it (any) better.
 
> Python gets it wrong
> >>> a=1
> >>> A
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'A' is not defined

This is not wrong; it is just different.  And given that identifiers 
starting with uppercase ought to be class names in Python (and other OOPLs 
that are case-sensitive there), and that a class name serves in constructor 
calls (in Python, instantiating a class is otherwise indistinguishable from 
a function call), it makes sense that the (maybe local) variable “a” should 
be different from the (probably global) class “A”.

> [Likewise filenames windows gets right; Unix wrong]

Utter nonsense.  Apparently you are blissfully unaware of how much grief it 
has caused WinDOS lusers and users alike over the years that Micro$~1 
decided in their infinite wisdom that letter case was not important.

Example: By contrast to previous versions, FAT32 supports long filenames 
(VFAT).  Go try changing a long filename from uppercase (“Really Long 
Filename.txt”) to partial lowercase (“Really long filename.txt”).  It does 
not work, you get an error, because the underlying “short filename” is the 
same, as it has to be case-insensitive for backwards compatibility 
(“REALLY~1.TXT”).  First you have to rename the file so that its name results 
in a different “short filename” (“REALLY~2.TXT”).  Then you have to rename 
it again to get the proper letter case (by which the “short filename” might 
either become “REALLY~1.TXT” again or “REALLY~3.TXT”).

> Unicode Identifiers in the spirit of IDN homograph attack.
> Every language that 'supports' unicode gets it wrong

NAK, see above.
 
> Python3
> >>> A=1
> >>> Α
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'Α' is not defined
> >>> A
> 1
> 
> Can you make out why A both is and is not defined?

Fallacy.  “A” is _not_ both defined and not defined.  There is only one “A”.

However, given the proper font, I might see at a glance what is wrong there.  
In fact, in my Konsole[tm] where the default font is “Courier 10 Pitch” I 
clearly see what is wrong there.  “A” (U+0041 LATIN CAPITAL LETTER A) is 
displayed using that serif font where the letter has a serif to the left at 
cap height and serifs left and right on the baseline, while “Α” (U+0391 
GREEK CAPITAL LETTER ALPHA) is displayed using a sans-serif font, where also 
the cap height is considerably higher.
 
> When the language does not support it eg python2 the behavior is better

NAK.  Being able to use Unicode strings verbatim in a program without having 
to declare them is infinitely useful.  Unicode identifiers appear to be 
merely a (happy?) side effect of that.

> The notion of 'variable' in programming language is inherently based on
> that of 'identifier'.

ACK.

> With ASCII the problems are minor: Case-distinct identifiers are distinct
> -- they dont IDENTIFY.

I do not think this is a problem.

> This contradicts standard English usage and practice 

No, it does not.  English distinguishes between proper *nouns* and proper 
*names* (the latter can be the former).  For example,

Re: [beginner] What's wrong?

2016-04-06 Thread Chris Angelico
On Thu, Apr 7, 2016 at 5:56 AM, Thomas 'PointedEars' Lahn
 wrote:
>> Example 1 -- Ligatures:
>>
>> Python3 gets it right
>> >>> ﬂag = 1
>> >>> flag
>> 1
>
> Fascinating; confirmed with
>
> | $ python3
> | Python 3.4.4 (default, Jan  5 2016, 15:35:18)
> | [GCC 5.3.1 20160101] on linux
> | […]
>
> I do not think this is correct, though.  Different Unicode code sequences,
> after normalization, should result in different symbols.
>
>> Whereas haskell gets it wrong:
>> Prelude> let flag = 1
>> Prelude> flag
>>
>> <interactive>:3:1: Not in scope: ‘flag’
>> Prelude> flag
>> 1
>> Prelude>
>
> I think Haskell gets it right here, while Py3k does not.  The “fl” is not to
> be decomposed to “fl”.

Unicode disagrees with you. It is decomposed exactly that way.

>>> unicodedata.normalize("NFKD","fl")
'fl'

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-08 Thread sohcahtoa82
On Friday, April 1, 2016 at 3:57:40 PM UTC-7, Mark Lawrence wrote:
> On 01/04/2016 23:44, sohcahto...@gmail.com wrote:
> > On Friday, April 1, 2016 at 3:10:51 PM UTC-7, Michael Okuntsov wrote:
> >> Nevermind. for j in range(1,8) should be for j in range(8).
> >
> > I can't tell you how many times I've gotten bit in the ass with that 
> > off-by-one mistake whenever I use a range that doesn't start at zero.
> >
> > I know that if I want to loop 10 times and I either want to start at zero 
> > or just don't care about the actual number, I use `for i in range(10)`.  
> > But if I want to loop from 10 to 20, my first instinct is to write `for i 
> > in range(10, 20)`, and then I'm left figuring out why my loop isn't 
> > executing the last step.
> >
> 
> "First instinct"?  "I expected"?  The Python docs might not be perfect, 
> but they were certainly adequate enough to get me going 15 years ago, 
> and since then they've improved.  So where is the problem, other than 
> failure to RTFM?
> 
> -- 
> My fellow Pythonistas, ask not what our language can do for you, ask
> what you can do for our language.
> 
> Mark Lawrence

Holy hell, why such an aggressive tone?

I understand how range(x, y) works.  It's just a simple mistake that I 
frequently make and have to correct after the first time I run it.  
It's not like I'm saying that the implementation needs to change.  I'm just 
saying that if I want to loop from 10 to 20, my first thought is to use 
range(10, 20).  It is slightly unintuitive.

*YES*, I know it is wrong.  *YES*, I understand why the correct usage would be 
range(10, 21) to get that list from 10 to 20.
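
(The quick check at the prompt that makes the fencepost obvious:)

>>> list(range(10, 20))
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> list(range(10, 21))
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]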

Get off your high horse.  Not everybody is like you and has been using Python 
for 15 years and apparently never makes mistakes.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [beginner] What's wrong?

2016-04-08 Thread Mark Lawrence via Python-list

On 08/04/2016 23:59, sohcahto...@gmail.com wrote:

On Friday, April 1, 2016 at 3:57:40 PM UTC-7, Mark Lawrence wrote:

On 01/04/2016 23:44, sohcahto...@gmail.com wrote:

On Friday, April 1, 2016 at 3:10:51 PM UTC-7, Michael Okuntsov wrote:

Nevermind. for j in range(1,8) should be for j in range(8).


I can't tell you how many times I've gotten bit in the ass with that off-by-one 
mistake whenever I use a range that doesn't start at zero.

I know that if I want to loop 10 times and I either want to start at zero or 
just don't care about the actual number, I use `for i in range(10)`.  But if I 
want to loop from 10 to 20, my first instinct is to write `for i in range(10, 
20)`, and then I'm left figuring out why my loop isn't executing the last step.



"First instinct"?  "I expected"?  The Python docs might not be perfect,
but they were certainly adequate enough to get me going 15 years ago,
and since then they've improved.  So where is the problem, other than
failure to RTFM?

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence


Holy hell, why such an aggressive tone?

I understand how range(x, y) works.  It's just a simple mistake that I 
frequently make and have to correct after the first time I run it.  
It's not like I'm saying that the implementation needs to change.  I'm just 
saying that if I want to loop from 10 to 20, my first thought is to use 
range(10, 20).  It is slightly unintuitive.

*YES*, I know it is wrong.  *YES*, I understand why the correct usage would be 
range(10, 21) to get that list from 10 to 20.

Get off your high horse.  Not everybody is like you and has been using Python 
for 15 years and apparently never makes mistakes.



*plonk*

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-06 Thread Steven D'Aprano
On Thu, 7 Apr 2016 05:56 am, Thomas 'PointedEars' Lahn wrote:

> Rustom Mody wrote:

>> So here are some examples to illustrate what I am saying:
>> 
>> Example 1 -- Ligatures:
>> 
>> Python3 gets it right
>> >>> ﬂag = 1
>> >>> flag
>> 1

Python identifiers are intentionally normalised to reduce security issues,
or at least confusion and annoyance, due to visually-identical identifiers
being treated as different.

Unicode has technical standards dealing with identifiers:

http://www.unicode.org/reports/tr31/

and visual spoofing and confusables:

http://www.unicode.org/reports/tr39/

I don't believe that CPython goes to the full extreme of checking for mixed
script confusables, but it does partially mitigate the problem by
normalising identifiers.

Unfortunately PEP 3131 leaves a number of questions open. Presumably they
were answered in the implementation, but they aren't documented in the PEP.

https://www.python.org/dev/peps/pep-3131/




> Fascinating; confirmed with
> 
> | $ python3
> | Python 3.4.4 (default, Jan  5 2016, 15:35:18)
> | [GCC 5.3.1 20160101] on linux
> | […]
> 
> I do not think this is correct, though.  Different Unicode code sequences,
> after normalization, should result in different symbols.

I think you are confused about normalisation. By definition, normalising
different Unicode code sequences may result in the same symbols, since that
is what normalisation means.


Consider two distinct strings which nevertheless look identical:

py> a = "\N{LATIN SMALL LETTER U}\N{COMBINING DIAERESIS}"
py> b = "\N{LATIN SMALL LETTER U WITH DIAERESIS}"
py> a == b
False
py> print(a, b)
ü ü


The purpose of normalisation is to turn one into the other:

py> unicodedata.normalize('NFKC', a) == b  # compose 2 code points --> 1
True
py> unicodedata.normalize('NFKD', b) == a  # decompose 1 code point --> 2
True


In the case of the fl ligature, normalisation splits the ligature into
individual 'f' and 'l' code points regardless of whether you compose or
decompose:

py> unicodedata.normalize('NFKC', "flag") == "flag"
True
py> unicodedata.normalize('NFKD', "flag") == "flag"
True


That's using the compatibility composition form. Using the default
composition form leaves the ligature unchanged.

Note that UTS #39 (security mechanisms) suggests that identifiers should be
normalised using NFKC.

[...]
> I think Haskell gets it right here, while Py3k does not.  The “fl” is not
> to be decomposed to “fl”.

The Unicode consortium seems to disagree with you. Table 1 of UTS #39 (see
link above) includes "Characters that cannot occur in strings normalized to
NFKC" in the Restricted category, that is, characters which should not be
used in identifiers. fl cannot occur in such normalised strings, and so it
is classified as Restricted and should not be used in identifiers.


I'm not entirely sure just how closely Python's identifiers follow the
standard, but I think that the intention is to follow something close to
"UAX31-R4. Equivalent Normalized Identifiers":

http://www.unicode.org/reports/tr31/#R4
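
A minimal sketch of that rule as a checking function (the helper name is
mine, not anything from CPython's source):

import unicodedata

def equivalent_identifiers(a, b):
    # UAX31-R4-style check: two identifiers are equivalent if their
    # NFKC normalisations compare equal.
    return unicodedata.normalize('NFKC', a) == unicodedata.normalize('NFKC', b)

print(equivalent_identifiers('ﬂag', 'flag'))  # True: the ligature decomposes
print(equivalent_identifiers('A', 'Α'))       # False: GREEK ALPHA stays distinct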


[Rustom] 
>> Python gets it wrong
>> >>> a=1
>> >>> A
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> NameError: name 'A' is not defined
> 
> This is not wrong; it is just different.

I agree with Thomas here. Case-insensitivity is a choice, and I don't think
it is a good choice for programming identifiers. Being able to make case
distinctions between (let's say):

SPAM  # a constant, or at least constant-by-convention
Spam  # a class or type
spam  # an instance


is useful.


[Rustom]
>> With ASCII the problems are minor: Case-distinct identifiers are distinct
>> -- they dont IDENTIFY.
> 
> I do not think this is a problem.
> 
>> This contradicts standard English usage and practice
> 
> No, it does not.


I agree with Thomas here too. Although it is rare for case to make a
distinction in English, it does happen. As the old joke goes:

Capitalisation is the difference between helping my Uncle Jack off a horse,
and helping my uncle jack off a horse.


So even in English, capitalisation can make a semantic difference.




-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-06 Thread Marko Rauhamaa
Steven D'Aprano :

> So even in English, capitalisation can make a semantic difference.

It can even make a pronunciation difference: polish vs Polish.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-07 Thread Peter Pearson
On Thu, 07 Apr 2016 11:37:50 +1000, Steven D'Aprano wrote:
> On Thu, 7 Apr 2016 05:56 am, Thomas 'PointedEars' Lahn wrote:
>> Rustom Mody wrote:
>
>>> So here are some examples to illustrate what I am saying:
>>> 
>>> Example 1 -- Ligatures:
>>> 
>>> Python3 gets it right
>>> >>> ﬂag = 1
>>> >>> flag
>>> 1
[snip]
>> 
>> I do not think this is correct, though.  Different Unicode code sequences,
>> after normalization, should result in different symbols.
>
> I think you are confused about normalisation. By definition, normalising
> different Unicode code sequences may result in the same symbols, since that
> is what normalisation means.
>
> Consider two distinct strings which nevertheless look identical:
>
> py> a = "\N{LATIN SMALL LETTER U}\N{COMBINING DIAERESIS}"
> py> b = "\N{LATIN SMALL LETTER U WITH DIAERESIS}"
> py> a == b
> False
> py> print(a, b)
> ü ü
>
>
> The purpose of normalisation is to turn one into the other:
>
> py> unicodedata.normalize('NFKC', a) == b  # compose 2 code points --> 1
> True
> py> unicodedata.normalize('NFKD', b) == a  # decompose 1 code point --> 2
> True

It's all great fun until someone loses an eye.

Seriously, it's cute how neatly normalisation works when you're
watching closely and using it in the circumstances for which it was
intended, but that hardly proves that these practices won't cause much
trouble when they're used more casually and nobody's watching closely.
Considering how much energy good software engineers spend eschewing
unnecessary complexity, do we really want to embrace the prospect of
having different things look identical?  (A relevant reference point:
mixtures of spaces and tabs in Python indentation.)

[snip]
> The Unicode consortium seems to disagree with you.



The Unicode consortium was certifiably insane when it went into the
typesetting business.  The pile-of-poo character was just frosting on
the cake.



(Sorry to leave you with that image.)

-- 
To email me, substitute nowhere->runbox, invalid->com.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-07 Thread Chris Angelico
On Fri, Apr 8, 2016 at 2:51 AM, Peter Pearson  wrote:
> The pile-of-poo character was just frosting on
> the cake.
>
> (Sorry to leave you with that image.)

No. You're not even a little bit sorry.

You're an evil, evil man. And funny.

ChrisA
who knows that its codepoint is 1F4A9 without looking it up
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-07 Thread Rustom Mody
On Thursday, April 7, 2016 at 10:22:18 PM UTC+5:30, Peter Pearson wrote:
> On Thu, 07 Apr 2016 11:37:50 +1000, Steven D'Aprano wrote:
> > On Thu, 7 Apr 2016 05:56 am, Thomas 'PointedEars' Lahn wrote:
> >> Rustom Mody wrote:
> >
> >>> So here are some examples to illustrate what I am saying:
> >>> 
> >>> Example 1 -- Ligatures:
> >>> 
> >>> Python3 gets it right
> >>> >>> ﬂag = 1
> >>> >>> flag
> >>> 1
> [snip]
> >> 
> >> I do not think this is correct, though.  Different Unicode code sequences,
> >> after normalization, should result in different symbols.
> >
> > I think you are confused about normalisation. By definition, normalising
> > different Unicode code sequences may result in the same symbols, since that
> > is what normalisation means.
> >
> > Consider two distinct strings which nevertheless look identical:
> >
> > py> a = "\N{LATIN SMALL LETTER U}\N{COMBINING DIAERESIS}"
> > py> b = "\N{LATIN SMALL LETTER U WITH DIAERESIS}"
> > py> a == b
> > False
> > py> print(a, b)
> > ü ü
> >
> >
> > The purpose of normalisation is to turn one into the other:
> >
> > py> unicodedata.normalize('NFKC', a) == b  # compose 2 code points --> 1
> > True
> > py> unicodedata.normalize('NFKD', b) == a  # decompose 1 code point --> 2
> > True
> 
> It's all great fun until someone loses an eye.
> 
> Seriously, it's cute how neatly normalisation works when you're
> watching closely and using it in the circumstances for which it was
> intended, but that hardly proves that these practices won't cause much
> trouble when they're used more casually and nobody's watching closely.
> Considering how much energy good software engineers spend eschewing
> unnecessary complexity, do we really want to embrace the prospect of
> having different things look identical?  (A relevant reference point:
> mixtures of spaces and tabs in Python indentation.)

That kind of sums up my position.
To be a casual user of Unicode is one thing.
To support it is another -- Unicode strings in Python 3 -- OK so far.
To mix up the two without enough thought or consideration is a third --
Unicode identifiers are likely a security hole waiting to happen...

No I am not clever/criminal enough to know how to write a text that is visually
close to 
print "Hello World"
but is internally closer to
rm -rf /

For me this:

>>> Α = 1
>>> A = 2
>>> Α + 1 == A
True


is cure enough that I am not amused

[The only reason I brought up case distinction is that this is in the same 
direction and way worse than that]

If python had been more serious about embracing the brave new world of
unicode it should have looked in this direction:
http://blog.languager.org/2014/04/unicoded-python.html

Also, here I suggest a classification of Unicode that, while not
official or even formalizable, is (I believe) helpful:
http://blog.languager.org/2015/03/whimsical-unicode.html

Specifically, as far as I am concerned, if python were to throw back, say,
a ligature in an identifier as a syntax error -- exactly what python2 does --
I think it would be perfectly fine and a more sane choice.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-07 Thread Rustom Mody
On Friday, April 8, 2016 at 10:13:16 AM UTC+5:30, Rustom Mody wrote:
> No I am not clever/criminal enough to know how to write a text that is 
> visually
> close to 
> print "Hello World"
> but is internally closer to
> rm -rf /
> 
> For me this:
> 
> >>> Α = 1
> >>> A = 2
> >>> Α + 1 == A
> True
> 
> 
> is cure enough that I am not amused

Um... "cute" was the intention
[Or is it cuʇe ?]
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-07 Thread Chris Angelico
On Fri, Apr 8, 2016 at 2:43 PM, Rustom Mody  wrote:
> No I am not clever/criminal enough to know how to write a text that is 
> visually
> close to
> print "Hello World"
> but is internally closer to
> rm -rf /
>
> For me this:
> >>> Α = 1
> >>> A = 2
> >>> Α + 1 == A
> True

>
>
> is cure enough that I am not amused

To me, the above is a contrived example. And you can contrive examples
that are just as confusing while still being ASCII-only, like
swimmer/swirnmer in many fonts, or I and l, or any number of other
visually-confusing glyphs. I propose that we ban the letters 'r' and
'l' from identifiers, to ensure that people can't mess with
themselves.

> Specifically, as far as I am concerned, if python were to throw back, say,
> a ligature in an identifier as a syntax error -- exactly what python2 does --
> I think it would be perfectly fine and a more sane choice.
The ligature is handled straight-forwardly: it gets decomposed into
its component letters. I'm not seeing a problem here.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-07 Thread Steven D'Aprano
On Fri, 8 Apr 2016 02:51 am, Peter Pearson wrote:

> Seriously, it's cute how neatly normalisation works when you're
> watching closely and using it in the circumstances for which it was
> intended, but that hardly proves that these practices won't cause much
> trouble when they're used more casually and nobody's watching closely.
> Considering how much energy good software engineers spend eschewing
> unnecessary complexity, 

Maybe so, but it's not good software engineers we have to worry about, but
the other 99.9% :-)


> do we really want to embrace the prospect of 
> having different things look identical?

You mean like ASCII identifiers? I'm afraid it's about fifty years too late
to ban identifiers using O and 0, or l, I and 1, or rn and m.

Or for that matter:

a = akjhvciwfdwkejfc2qweoduycwldvqspjcwuhoqwe9fhlcjbqvcbhsiauy37wkg() + 100
b = 100 + akjhvciwfdwkejfc2qweoduycwldvqspjcwuhoqew9fhlcjbqvcbhsiauy37wkg()

How easily can you tell them apart at a glance?

The reality is that we trust our coders not to deliberately mess us about.
As the Obfuscated C and the Underhanded C contest prove, you don't need
Unicode to hide hostile code. In fact, the use of Unicode confusables in an
otherwise all-ASCII file is a dead giveaway that something fishy is going
on.

I think that, beyond normalisation, the compiler need not be too concerned
by confusables. I wouldn't *object* to the compiler raising a warning if it
detected confusable identifiers, or mixed script identifiers, but I think
that's more the job for a linter or human code review.
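
Such a linter check can be a few lines; here is a rough sketch (it keys off
the first word of each code point's Unicode name, which only approximates
the real Script property):

import unicodedata

def scripts_in(identifier):
    # Collect a rough "script" tag for each letter in the identifier.
    return {unicodedata.name(ch).split()[0] for ch in identifier if ch.isalpha()}

def looks_mixed(identifier):
    # Flag identifiers that draw letters from more than one script.
    return len(scripts_in(identifier)) > 1

print(looks_mixed('flag'))   # False
print(looks_mixed('Аpple'))  # True: CYRILLIC capital A among LATIN letters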



> (A relevant reference point: 
> mixtures of spaces and tabs in Python indentation.)

Most editors have an option to display whitespace, and tabs and spaces look
different. Typically the tab is shown with an arrow, and the space by a
dot. If people *still* confuse them, the issue is easily managed by a
combination of "well don't do that" and TabError.


> [snip]
>> The Unicode consortium seems to disagree with you.
> 
> 
> 
> The Unicode consortium was certifiably insane when it went into the
> typesetting business.

They are not, and never have been, in the typesetting business. Perhaps
characters are not the only things easily confused *wink*

(Although some members of the consortium may be. But the consortium itself
isn't.)


> The pile-of-poo character was just frosting on 
> the cake.

Blame the Japanese mobile phone companies for that. When you pay your
membership fee, you get to object to the addition of characters too.
(Anyone, I think, can propose a new character, but only members get to
choose which proposals are accepted.)

But really, why should we object? Is "pile-of-poo" any more silly than any
of the other dingbats, graphics characters, and other non-alphabetical
characters? Unicode is not just for "letters of the alphabet".


-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-07 Thread Chris Angelico
On Fri, Apr 8, 2016 at 4:00 PM, Steven D'Aprano  wrote:
> Or for that matter:
>
> a = akjhvciwfdwkejfc2qweoduycwldvqspjcwuhoqwe9fhlcjbqvcbhsiauy37wkg() + 100
> b = 100 + akjhvciwfdwkejfc2qweoduycwldvqspjcwuhoqew9fhlcjbqvcbhsiauy37wkg()
>
> How easily can you tell them apart at a glance?

Ouch! Can't even align them top and bottom. This is evil.

> I think that, beyond normalisation, the compiler need not be too concerned
> by confusables. I wouldn't *object* to the compiler raising a warning if it
> detected confusable identifiers, or mixed script identifiers, but I think
> that's more the job for a linter or human code review.

The compiler should treat as identical anything that an editor should
reasonably treat as identical. I'm not sure whether multiple combining
characters on a single base character are forced into some order prior
to comparison or are kept in the order they were typed, but my gut
feeling is that they should be considered identical.
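
(A quick check suggests normalisation does force an order -- canonical
ordering sorts combining marks by combining class; code points picked just
for illustration:)

>>> import unicodedata
>>> a = 'e\u0323\u0301'   # e + dot below + acute
>>> b = 'e\u0301\u0323'   # e + acute + dot below
>>> a == b
False
>>> unicodedata.normalize('NFC', a) == unicodedata.normalize('NFC', b)
True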

> They are not, and never have been, in the typesetting business. Perhaps
> characters are not the only things easily confused *wink*

Peter is definitely a character. So are you. QUITE a character. :)

> But really, why should we object? Is "pile-of-poo" any more silly than any
> of the other dingbats, graphics characters, and other non-alphabetical
> characters? Unicode is not just for "letters of the alphabet".

It's less silly than "ZERO-WIDTH NON-BREAKING SPACE", which isn't a
space at all, it's a joiner. Go figure.

(History's a wonderful thing, ain't it? So's backward compatibility
and a guarantee that names will never be changed.)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-08 Thread Peter Pearson
On Fri, 08 Apr 2016 16:00:10 +1000, Steven D'Aprano  wrote:
> On Fri, 8 Apr 2016 02:51 am, Peter Pearson wrote:
>> 
>> The Unicode consortium was certifiably insane when it went into the
>> typesetting business.
>
> They are not, and never have been, in the typesetting business. Perhaps
> characters are not the only things easily confused *wink*

Defining codepoints that deal with appearance but not with meaning is
going into the typesetting business.  Examples: ligatures, and spaces of
varying widths with specific typesetting properties like being non-breaking.

Typesetting done in MS Word using such Unicode codepoints will never
be more than a goofy approximation to real typesetting (e.g., TeX), but
it will cost a huge amount of everybody's time, with the current discussion
of ligatures in variable names being just a straw in the wind.  Getting
all the world's writing systems into a single, coherent standard was
an extraordinarily ambitious, monumental undertaking, and I'm baffled
that the urge to broaden its scope in this irrelevant direction was
entertained at all.

(Should this have been in cranky-geezer font?)

-- 
To email me, substitute nowhere->runbox, invalid->com.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-08 Thread Marko Rauhamaa
Peter Pearson :

> On Fri, 08 Apr 2016 16:00:10 +1000, Steven D'Aprano  
> wrote:
>> They are not, and never have been, in the typesetting business.
>> Perhaps characters are not the only things easily confused *wink*
>
> Defining codepoints that deal with appearance but not with meaning is
> going into the typesetting business. Examples: ligatures, and spaces
> of varying widths with specific typesetting properties like being
> non-breaking.
>
> Typesetting done in MS Word using such Unicode codepoints will never
> be more than a goofy approximation to real typesetting (e.g., TeX),
> but it will cost a huge amount of everybody's time, with the current
> discussion of ligatures in variable names being just a straw in the
> wind. Getting all the world's writing systems into a single, coherent
> standard was an extraordinarily ambitious, monumental undertaking, and
> I'm baffled that the urge to broaden its scope in this irrelevant
> direction was entertained at all.

I agree completely but at the same time have a lot of understanding for
the reasons why Unicode had to become such a mess. Part of it is
historical, part of it is political, yet part of it is in the
unavoidable messiness of trying to define what a character is.

For example, is "ä" one character or two: "a" plus "¨"? Is "i" one
character or two: "ı" plus "˙"? Is writing linear or two-dimensional?
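
(Unicode's answer is "both", as a Python 3 prompt shows:)

>>> import unicodedata
>>> len(unicodedata.normalize('NFC', 'a\u0308'))   # a + combining diaeresis
1
>>> len(unicodedata.normalize('NFD', '\u00e4'))    # precomposed ä
2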

Unicode heroically and definitively solved the problems ASCII had posed
but introduced a bag of new, trickier problems.

(As for ligatures, I understand that there might be quite a bit of
legacy software that dedicated code points and code pages for ligatures.
Translating that legacy software to Unicode was made more
straightforward by introducing analogous codepoints to Unicode. Unicode
has quite many such codepoints: µ, K, Ω etc.)


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-08 Thread Chris Angelico
On Sat, Apr 9, 2016 at 3:44 AM, Marko Rauhamaa  wrote:
> Unicode heroically and definitively solved the problems ASCII had posed
> but introduced a bag of new, trickier problems.
>
> (As for ligatures, I understand that there might be quite a bit of
> legacy software that dedicated code points and code pages for ligatures.
> Translating that legacy software to Unicode was made more
> straightforward by introducing analogous codepoints to Unicode. Unicode
> has quite many such codepoints: µ, K, Ω etc.)

More specifically, Unicode solved the problems that *codepages* had
posed. And one of the principles of its design was that every
character in every legacy encoding had a direct representation as a
Unicode codepoint, allowing bidirectional transcoding for
compatibility. Perhaps if Unicode had existed from the dawn of
computing, we'd have less characters; but backward compatibility is
way too important to let a narrow purity argument sway it.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-08 Thread Rustom Mody
On Friday, April 8, 2016 at 10:24:17 AM UTC+5:30, Chris Angelico wrote:
> On Fri, Apr 8, 2016 at 2:43 PM, Rustom Mody  wrote:
> > No I am not clever/criminal enough to know how to write a text that is 
> > visually
> > close to
> > print "Hello World"
> > but is internally closer to
> > rm -rf /
> >
> > For me this:
> > >>> Α = 1
> > >>> A = 2
> > >>> Α + 1 == A
> > True
> 
> >
> >
> > is cure enough that I am not amused
> 
> To me, the above is a contrived example. And you can contrive examples
> that are just as confusing while still being ASCII-only, like
> swimmer/swirnmer in many fonts, or I and l, or any number of other
> visually-confusing glyphs. I propose that we ban the letters 'r' and
> 'l' from identifiers, to ensure that people can't mess with
> themselves.

swirnmer and swimmer are distinguished by squinting a bit;
А and A only by digging down into the hex.
If you categorize them as similar/same... well I am not arguing...
I'll come to you when I am short of straw...


> 
> > Specifically as far as I am concerned if python were to throw back say
> > a ligature in an identifier as a syntax error -- exactly what python2 does 
> > --
> > I think it would be perfectly fine and a more sane choice
> 
> The ligature is handled straight-forwardly: it gets decomposed into
> its component letters. I'm not seeing a problem here.

Yes... there is no problem... HERE [I did say python gets this right where
haskell for example gets it wrong].
What's wrong is the whole approach of swallowing gobs of characters that
need not be legal at all and then getting indigestion:

Note the "non-normative" in
https://docs.python.org/3/reference/lexical_analysis.html#identifiers

If a language reference is not normative what is?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-08 Thread Peter Pearson
On Sat, 9 Apr 2016 03:50:16 +1000, Chris Angelico  wrote:
> On Sat, Apr 9, 2016 at 3:44 AM, Marko Rauhamaa  wrote:
[snip]
>> (As for ligatures, I understand that there might be quite a bit of
>> legacy software that dedicated code points and code pages for ligatures.
>> Translating that legacy software to Unicode was made more
>> straightforward by introducing analogous codepoints to Unicode. Unicode
>> has quite many such codepoints: µ, K, Ω etc.)
>
> More specifically, Unicode solved the problems that *codepages* had
> posed. And one of the principles of its design was that every
> character in every legacy encoding had a direct representation as a
> Unicode codepoint, allowing bidirectional transcoding for
> compatibility. Perhaps if Unicode had existed from the dawn of
> computing, we'd have less characters; but backward compatibility is
> way too important to let a narrow purity argument sway it.

I guess with that historical perspective the current situation
seems almost inevitable.  Thanks.  And thanks to Steven D'Aprano
for other relevant insights.

-- 
To email me, substitute nowhere->runbox, invalid->com.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-08 Thread Rustom Mody
On Friday, April 8, 2016 at 11:14:21 PM UTC+5:30, Marko Rauhamaa wrote:
> Peter Pearson :
> 
> > On Fri, 08 Apr 2016 16:00:10 +1000, Steven D'Aprano  wrote:
> >> They are not, and never have been, in the typesetting business.
> >> Perhaps characters are not the only things easily confused *wink*
> >
> > Defining codepoints that deal with appearance but not with meaning is
> > going into the typesetting business. Examples: ligatures, and spaces
> > of varying widths with specific typesetting properties like being
> > non-breaking.
> >
> > Typesetting done in MS Word using such Unicode codepoints will never
> > be more than a goofy approximation to real typesetting (e.g., TeX),
> > but it will cost a huge amount of everybody's time, with the current
> > discussion of ligatures in variable names being just a straw in the
> > wind. Getting all the world's writing systems into a single, coherent
> > standard was an extraordinarily ambitious, monumental undertaking, and
> > I'm baffled that the urge to broaden its scope in this irrelevant
> > direction was entertained at all.
> 
> I agree completely but at the same time have a lot of understanding for
> the reasons why Unicode had to become such a mess. Part of it is
> historical, part of it is political, yet part of it is in the
> unavoidable messiness of trying to define what a character is.

There are standards and standards.
Just because they are standard does not make them useful, well-designed,
reasonable, etc.

It's reasonably likely that all our keyboards start QWERT...
Doesn't make it a sane design.

Likewise, using NFKC to define the equivalence relation on identifiers
is analogous to saying: since QWERTY has been in use for over a hundred years,
it's a perfectly good design. Just because NFKC has the stamp of the Unicode
consortium does not straightaway make it useful for all purposes.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-08 Thread Rustom Mody
On Friday, April 8, 2016 at 11:33:38 PM UTC+5:30, Peter Pearson wrote:
> On Sat, 9 Apr 2016 03:50:16 +1000, Chris Angelico wrote:
> > On Sat, Apr 9, 2016 at 3:44 AM, Marko Rauhamaa  wrote:
> [snip]
> >> (As for ligatures, I understand that there might be quite a bit of
> >> legacy software that dedicated code points and code pages for ligatures.
> >> Translating that legacy software to Unicode was made more
> >> straightforward by introducing analogous codepoints to Unicode. Unicode
> >> has quite many such codepoints: µ, K, Ω etc.)
> >
> > More specifically, Unicode solved the problems that *codepages* had
> > posed. And one of the principles of its design was that every
> > character in every legacy encoding had a direct representation as a
> > Unicode codepoint, allowing bidirectional transcoding for
> > compatibility. Perhaps if Unicode had existed from the dawn of
> > computing, we'd have less characters; but backward compatibility is
> > way too important to let a narrow purity argument sway it.
> 
> I guess with that historical perspective the current situation
> seems almost inevitable.  Thanks.  And thanks to Steven D'Aprano
> for other relevant insights.

Strange view.
In fact the Unicode standard itself encourages not using the standard in its
entirety:

5.12 Deprecation

In the Unicode Standard, the term deprecation is used somewhat differently than 
it is in some other standards. Deprecation is used to mean that a character or 
other feature is strongly discouraged from use. This should not, however, be 
taken as indicating that anything has been removed from the standard, nor that 
anything is planned for removal from the standard. Any such change is 
constrained by the Unicode Consortium Stability Policies [Stability].

For the Unicode Character Database, there are two important types of 
deprecation to be noted. First, an encoded character may be deprecated. Second, 
a character property may be deprecated.

When an encoded character is strongly discouraged from use, it is given the 
property value Deprecated=True. The Deprecated property is a binary property 
defined specifically to carry this information about Unicode characters. Very 
few characters are ever formally deprecated this way; it is not enough that a 
character be uncommon, obsolete, disliked, or not preferred. Only those few 
characters which have been determined by the UTC to have serious architectural 
defects or which have been determined to cause significant implementation 
problems are ever deprecated. Even in the most severe cases, such as the 
deprecated format control characters (U+206A..U+206F), an encoded character is 
never removed from the standard. Furthermore, although deprecated characters 
are strongly discouraged from use, and should be avoided in favor of other, 
more appropriate mechanisms, they may occur in data. Conformant implementations 
of Unicode processes such as Unicode normalization must handle even deprecated 
characters correctly.

I read this as saying that -- in addition to officially deprecated chars --
there ARE "uncommon, obsolete, disliked, or not preferred" chars
which sensible users should avoid using, even though Unicode as a standard is
compelled to keep supporting them.

Which translates into:
- python as a language *implementing* unicode (eg in strings) needs to
do it completely if it is to be standard compliant (see the sketch below)
- python as a *user* of unicode (eg in identifiers) can (and IMHO should)
use better judgement
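As a quick illustration of the first point, a minimal sketch (stdlib only)
showing that normalization passes the deprecated format controls
U+206A..U+206F through untouched, exactly as conformance requires:

    import unicodedata

    for cp in range(0x206A, 0x2070):
        ch = chr(cp)
        # A conformant normalizer must not drop or alter these.
        assert unicodedata.normalize("NFKC", ch) == ch
        print("U+%04X" % cp, unicodedata.name(ch, "<unnamed>"))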
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-08 Thread Rustom Mody
Adding link

On Friday, April 8, 2016 at 11:48:07 PM UTC+5:30, Rustom Mody wrote:

> 5.12 Deprecation
> 
> In the Unicode Standard, the term deprecation is used somewhat differently 
> than it is in some other standards. Deprecation is used to mean that a 
> character or other feature is strongly discouraged from use. This should not, 
> however, be taken as indicating that anything has been removed from the 
> standard, nor that anything is planned for removal from the standard. Any 
> such change is constrained by the Unicode Consortium Stability Policies 
> [Stability].
> 
> For the Unicode Character Database, there are two important types of 
> deprecation to be noted. First, an encoded character may be deprecated. 
> Second, a character property may be deprecated.
> 
> When an encoded character is strongly discouraged from use, it is given the 
> property value Deprecated=True. The Deprecated property is a binary property 
> defined specifically to carry this information about Unicode characters. Very 
> few characters are ever formally deprecated this way; it is not enough that a 
> character be uncommon, obsolete, disliked, or not preferred. Only those few 
> characters which have been determined by the UTC to have serious 
> architectural defects or which have been determined to cause significant 
> implementation problems are ever deprecated. Even in the most severe cases, 
> such as the deprecated format control characters (U+206A..U+206F), an encoded 
> character is never removed from the standard. Furthermore, although 
> deprecated characters are strongly discouraged from use, and should be 
> avoided in favor of other, more appropriate mechanisms, they may occur in 
> data. Conformant implementations of Unicode processes such as Unicode
> normalization must handle even deprecated characters correctly.



Link: http://unicode.org/reports/tr44/#Deprecation
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-08 Thread Steven D'Aprano
On Sat, 9 Apr 2016 03:21 am, Peter Pearson wrote:

> On Fri, 08 Apr 2016 16:00:10 +1000, Steven D'Aprano 
> wrote:
>> On Fri, 8 Apr 2016 02:51 am, Peter Pearson wrote:
>>> 
>>> The Unicode consortium was certifiably insane when it went into the
>>> typesetting business.
>>
>> They are not, and never have been, in the typesetting business. Perhaps
>> characters are not the only things easily confused *wink*
> 
> Defining codepoints that deal with appearance but not with meaning is
> going into the typesetting business.  Examples: ligatures, and spaces of
> varying widths with specific typesetting properties like being
> non-breaking.

Both of which are covered by the requirement that Unicode is capable of
representing legacy encodings/code pages.

Examples: MacRoman contains fl and fi ligatures, and NBSP. 

Non-breaking space is not so much a typesetting property as a semantic
property, that is, it deals with *meaning* (exactly what you suggested it
doesn't deal with). It is a space which doesn't break words.

Ligatures are a good example -- the Unicode consortium have explicitly
refused to add other ligatures beyond the handful needed for backwards
compatibility because they maintain that it is a typesetting issue that is
best handled by the font. There's even a FAQ about that very issue, and I
quote:

"The existing ligatures exist basically for compatibility and round-tripping
with non-Unicode character sets. Their use is discouraged. No more will be
encoded in any circumstances."

http://www.unicode.org/faq/ligature_digraph.html#Lig2


Unicode currently contains something of the order of one hundred and ten
thousand defined code points. I'm sure that if you went through the entire
list, with a sufficiently loose definition of "typesetting", you could
probably find some that exist only for presentation, and aren't covered by
the legacy encoding clause. So what? One swallow does not mean the season
is spring. Unicode makes an explicit rejection of being responsible for
typesetting. See their discussion on presentation forms:

http://www.unicode.org/faq/ligature_digraph.html#PForms

But I will grant you that sometimes there's a grey area between presentation
and semantics, and the Unicode consortium has to make a decision one way or
another. Those decisions may not always be completely consistent, and may
be driven by political and/or popular demand.

E.g. the Consortium explicitly state that stylistic issues such as bold,
italic, superscript etc are up to the layout engine or markup, and
shouldn't be part of the Unicode character set. They insist that they only
show representative glyphs for code points, and that font designers and
vendors are free (within certain limits) to modify the presentation as
desired. Nevertheless, there are specialist characters with distinct
formatting, and variant selectors for specifying a specific glyph, and
emoji modifiers for specifying skin tone.

But when you get down to fundamentals, character sets and alphabets have
always blurred the line between presentation and meaning. W ("double-u")
was, once upon a time, UU, and & (ampersand) started off as a ligature
of "et" (Latin for "and"). There are always going to be cases where
well-meaning people can agree to disagree on whether or not adding the
character to Unicode was justified or not.




-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-08 Thread Marko Rauhamaa
Steven D'Aprano :

> But when you get down to fundamentals, character sets and alphabets have
> always blurred the line between presentation and meaning. W ("double-u")
> was, once upon a time, UU

But as every Finnish-speaker now knows, "w" is only an old-fashioned
typographic variant of the glyph "v". We still have people who write
"Wirtanen" or "Waltari" to make their last names look respectable and
19th-century-ish.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-09 Thread alister
On Fri, 08 Apr 2016 20:20:02 -0400, Dennis Lee Bieber wrote:

> On Fri, 8 Apr 2016 11:04:53 -0700 (PDT), Rustom Mody
>  declaimed the following:
> 
>>Its reasonably likely that all our keyboards start QWERT...
>> Doesn't make it a sane design.
>>
>   It was a sane design -- for early mechanical typewriters. It fulfills
> its goal of slowing down a typist to reduce jamming print-heads at the
> platen.* And since so many of us who had formal touch typing training
> probably learned on said mechanical typewriters, it hangs around.
> Fortunately, even though the typewriters at school had European
> dead-keys, we were plain English and I never had to pick them up.
> 
>   For a few years I did have problems with ()... They were on different
> keys (8 and 9, respectively) on old typewriters (the type that also had
> no 1) vs IBM Selectrics (never used by be) and computer terminals...
> 
> 
> 
> * Except I kept jamming two letters of my last name... I and E are
> reached with the same finger on opposite hands, which made a fast
> stroke-pair (compare moving the same finger on both hands to moving
> different fingers).


the design of qwerty was not to "slow" the typist but to ensure that the 
hammers for letters commonly used together are spaced widely apart, 
reducing the portion of their travel arc where they could jam.
I and E are actually such a pair, which is why they are at opposite ends 
of the hammer rack (I doubt that is the correct technical term);
they are on opposite hands to make typing them faster.
Unfortunately, as you found, it is still possible to jam them if they are 
hit almost simultaneously.




-- 
There's a trick to the Graceful Exit.  It begins with the vision to
recognize when a job, a life stage, a relationship is over -- and to let
go.  It means leaving what's over without denying its validity or its
past importance in our lives.  It involves a sense of future, a belief
that every exit line is an entry, that we are moving on, rather than out.
The trick of retiring well may be the trick of living well.  It's hard to
recognize that life isn't a holding action, but a process.  It's hard to
learn that we don't leave the best parts of ourselves behind, back in the
dugout or the office. We own what we learned back there.  The experiences
and the growth are grafted onto our lives.  And when we exit, we can take
ourselves along -- quite gracefully.
-- Ellen Goodman
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-09 Thread Ben Bacarisse
alister  writes:

> 
> the design of qwerty was not to "slow" the typist but to ensure that the 
> hammers for letters commonly used together are spaced widely apart, 
> reducing the portion of their travel arc where they could jam.
> I and E are actually such a pair, which is why they are at opposite ends 
> of the hammer rack (I doubt that is the correct technical term);
> they are on opposite hands to make typing them faster.
> Unfortunately, as you found, it is still possible to jam them if they are 
> hit almost simultaneously.
> 

The problem with that theory is that 'er/re' (this is e and r in either
order) is the 3rd most common pair in English but have been placed
together.  ou and et (in either order) are the 15th and 22nd most common
and they are separated by only one hammer position.  On the other hand,
the QWERTY layout puts jk together, but they almost never appear
together in English text.

-- 
Ben.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-09 Thread Ben Bacarisse
Ben Bacarisse  writes:

> alister  writes:
> 
>> 
>> the design of qwerty was not to "slow" the typist but to ensure that the 
>> hammers for letters commonly used together are spaced widely apart, 
>> reducing the portion of their travel arc where they could jam.
>> I and E are actually such a pair, which is why they are at opposite ends 
>> of the hammer rack (I doubt that is the correct technical term);
>> they are on opposite hands to make typing them faster.
>> Unfortunately, as you found, it is still possible to jam them if they are 
>> hit almost simultaneously.
>> 
>
> The problem with that theory is that 'er/re' (this is e and r in either
> order) is the 3rd most common pair in English but have been placed
> together.  ou and et (in either order) are the 15th and 22nd most common
> and they are separated by only one hammer position.  On the other hand,
> the QWERTY layout puts jk together, but they almost never appear
> together in English text.

This last part came out muddled.  It's obviously wise to put infrequent
combinations together (like jk), but j and k are both also rare letters
so putting them together represents a wasted opportunity for meeting the
supposed design objective.  Swapping, say, k and r, or splitting jk but
putting e in the middle would surely result in a net gain of "hammer
separation".

-- 
Ben.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-09 Thread Rustom Mody
On Saturday, April 9, 2016 at 7:14:05 PM UTC+5:30, Ben Bacarisse wrote:
> The problem with that theory is that 'er/re' (this is e and r in either
> order) is the 3rd most common pair in English but have been placed
> together.  ou and et (in either order) are the 15th and 22nd most common
> and they are separated by only one hammer position.  On the other hand,
> the QWERTY layout puts jk together, but they almost never appear
> together in English text.

Where do you get this (kind of) statistical data?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-09 Thread Ben Bacarisse
Rustom Mody  writes:

> On Saturday, April 9, 2016 at 7:14:05 PM UTC+5:30, Ben Bacarisse wrote:
>> The problem with that theory is that 'er/re' (this is e and r in either
>> order) is the 3rd most common pair in English but have been placed
>> together.  ou and et (in either order) are the 15th and 22nd most common
>> and they are separated by only one hammer position.  On the other hand,
>> the QWERTY layout puts jk together, but they almost never appear
>> together in English text.
>
> Where do you get this (kind of) statistical data?

It was generated by counting the pairs found in a corpus of texts taken
from Project Gutenberg.  The numbers do vary depending on what you pick
(for the complete works of Mark Twain er/re is second, for example), and
none of the texts are very modern (because of the source), but I
doubt that matters too much.
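For anyone who wants to reproduce the count, a rough sketch -- corpus.txt
below is a placeholder for whatever Gutenberg texts you concatenate:

    import re
    from collections import Counter

    text = open("corpus.txt", encoding="utf-8").read().lower()
    pairs = Counter()
    for word in re.findall(r"[a-z]+", text):
        for a, b in zip(word, word[1:]):
            pairs[tuple(sorted((a, b)))] += 1   # er and re count together
    for pair, n in pairs.most_common(25):
        print("".join(pair), n)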

-- 
Ben.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-09 Thread Mark Lawrence via Python-list

On 09/04/2016 17:08, Rustom Mody wrote:

On Saturday, April 9, 2016 at 7:14:05 PM UTC+5:30, Ben Bacarisse wrote:

The problem with that theory is that 'er/re' (this is e and r in either
order) is the 3rd most common pair in English but have been placed
together.  ou and et (in either order) are the 15th and 22nd most common
and they are separated by only one hammer position.  On the other hand,
the QWERTY layout puts jk together, but they almost never appear
together in English text.


Where do you get this (kind of) statistical data?



Again, where is the relevance to Python in this discussion, as we're on 
the main Python mailing list?  Please can the moderators take this stuff 
out, it is getting beyond the pale.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-09 Thread Stephen Hansen
On Sat, Apr 9, 2016, at 12:25 PM, Mark Lawrence via Python-list wrote:
> Again, where is the relevance to Python in this discussion, as we're on 
> the main Python mailing list?  Please can the moderators take this stuff 
> out, it is getting beyond the pale.

You need to come to grip with the fact that python-list is only
moderated in the vaguest sense of the word. 

Quote:

https://www.python.org/community/lists/
"Pretty much anything Python-related is fair game for discussion, and
the group is even fairly tolerant of off-topic digressions; there have
been entertaining discussions of topics such as floating point, good
software design, and other programming languages such as Lisp and
Forth."

If you don't like it, sorry. We all have our burdens to bear.

--S
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-10 Thread Gregory Ewing

Ben Bacarisse wrote:

The problem with that theory is that 'er/re' (this is e and r in either
order) is the 3rd most common pair in English but have been placed
together.


No, they haven't. The order of the characters in the type
basket goes down the slanted columns of keys, so E and R
are separated by D and C.
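A two-line check of that, assuming the usual mapping of basket order onto
the slanted QWERTY columns:

    basket = "qazwsxedcrfvtgbyhnujmik,ol.p;/"    # down each column in turn
    print(basket.index("r") - basket.index("e"))   # 3 -- D and C lie between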

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode normalisation [was Re: [beginner] What's wrong?]

2016-04-10 Thread Gregory Ewing

Steven D'Aprano :


But when you get down to fundamentals, character sets and alphabets have
always blurred the line between presentation and meaning. W ("double-u")
was, once upon a time, UU


And before that, it was VV, because the Romans used V the
way we now use U, and didn't have a letter U.

When U first appeared, it was just a cursive style of writing
a V. According to this, it wasn't until the 18th century that
the English alphabet got both U and V as separate letters:

http://boards.straightdope.com/sdmb/showthread.php?t=147677

Apparently "uu"/"vv" came to be known as "double u" prior to
that, and the name has persisted.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


From email addresses sometimes strange on this list - was Re: [beginner] What's wrong?

2016-04-04 Thread Michael Torrie
On 04/04/2016 08:04 AM, Mark Lawrence via Python-list wrote:
> On 02/04/2016 23:49, Michael Torrie wrote:
>> Mark, your messages are showing up to the list as being from "python,"
>> at least on my email.  Any reason for this?
>>
> 
> Assuming that you're referring to me, frankly I haven't a clue.  I read 
> this list with Thunderbird on Windows, I hit "reply" to something, I 
> type, I hit "send", job done.  Thereafter, as far as I'm concerned, a 
> miracle occurs and hundreds if not thousands of subscribers get to see 
> my reply.

Interesting.  The problem is definitely not on your end at all, though I
first noticed this with your recent posts. Other posts are showing up a
bit weirdly too.  The problem appears to be partly in my Thunderbird
client, and partly the mailing list gateway.  And maybe Gmail is
screwing things up too. Usenet-originating posts look fine.  For example:

From: Marko Rauhamaa 
Newsgroups: comp.lang.python

Whereas email ones are sometimes looking like this:

From: Mark Lawrence via Python-list 
Reply-To: Mark Lawrence 

Thunderbird on my machine is only seeing the From email address
(python-list@python.org) and I must have that in my address list
somewhere as "python."

What's odder is that my own messages show up as "From:
torr...@gmail.com" and not "via Python-list ".



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: From email addresses sometimes strange on this list - was Re: [beginner] What's wrong?

2016-04-04 Thread Chris Angelico
On Tue, Apr 5, 2016 at 8:55 AM, Michael Torrie  wrote:
> Usenet-originating posts look fine.  For example:
>
> From: Marko Rauhamaa 
> Newsgroups: comp.lang.python
>
> Whereas email ones are sometimes looking like this:
>
> From: Mark Lawrence via Python-list 
> Reply-To: Mark Lawrence 

Oh! That probably explains it. It's because of Yahoo and mailing
lists. Yahoo did stuff that breaks stuff, so Mailman breaks stuff
differently to make sure that only Yahoo people get messed up a bit.
It means their names and addresses get slightly obscured, but delivery
works.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: From email addresses sometimes strange on this list - was Re: [beginner] What's wrong?

2016-04-04 Thread Michael Torrie
On 04/04/2016 04:58 PM, Chris Angelico wrote:
> Oh! That probably explains it. It's because of Yahoo and mailing
> lists. Yahoo did stuff that breaks stuff, so Mailman breaks stuff
> differently to make sure that only Yahoo people get messed up a bit.
> It means their names and addresses get slightly obscured, but delivery
> works.

That explains it! The other folks with messages like that are coming
from Yahoo as well.  I can live with it.

Thank you!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: From email addresses sometimes strange on this list - was Re: [beginner] What's wrong?

2016-04-08 Thread Cameron Simpson

On 05Apr2016 08:58, Chris Angelico  wrote:

On Tue, Apr 5, 2016 at 8:55 AM, Michael Torrie  wrote:

Usenet-originating posts look fine.  For example:

From: Marko Rauhamaa 
Newsgroups: comp.lang.python

Whereas email ones are sometimes looking like this:

From: Mark Lawrence via Python-list 
Reply-To: Mark Lawrence 


Oh! That probably explains it. It's because of Yahoo and mailing
lists. Yahoo did stuff that breaks stuff, so Mailman breaks stuff
differently to make sure that only Yahoo people get messed up a bit.
It means their names and addresses get slightly obscured, but delivery
works.


It is Yahoo and Mailman and a funky spec called DKIM or DMARC (related, not 
identical).  These attach a signature tied to the originating host, and if 
Mailman forwarded the message unchanged the signature would break - people 
honouring it would decide the Mailman hosts were forging Mark's email.


Fortunately you can fix all this up on receipt, which is why I wasn't noticing 
this myself (I had in the past, and wrote myself a recipe for repair - my mail 
folders contain the repaired messages).


For Mark's messages I am using these mailfiler rules (the latter I think):

 from:s/.*/$reply_to/
   X-Yahoo-Newman-Id:/.
   from:python-list@python.org,python-id...@python.org,tu...@python.org

 from:s/.*/$reply_to/
   DKIM-Signature:/.
   from:python-list@python.org,python-id...@python.org,tu...@python.org

which just replaces the contents of the From: line with the contents of the 
Reply-To: line for this kind of message via the python lists.
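If you'd rather do that particular repair in Python than in mailfiler, a
rough (untested) sketch with the stdlib email package, reading one message
on stdin:

    import sys
    import email
    from email import policy

    msg = email.message_from_file(sys.stdin, policy=policy.default)
    # If the list munged From: and left the real author in Reply-To:,
    # copy it back.
    if msg["Reply-To"] and msg["From"] and "python-list" in msg["From"]:
        msg.replace_header("From", str(msg["Reply-To"]))
    sys.stdout.write(msg.as_string())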


Yahoo do something equivalent but more aggressive to lists hosted on Yahoo 
itself, such as sed-users. For that I have a couple of scripts - fix-dkim-from:


 https://bitbucket.org/cameron_simpson/css/src/tip/bin/fix-dkim-from

which is a sed script, and fix-dkim-from-swap:

 https://bitbucket.org/cameron_simpson/css/src/tip/bin/fix-dkim-from-swap

The former works on messages whose From: header is enough - it can be reversed 
in place. The latter is for messages where the from isn't enough, but there is 
another header containing the original (default "X-Original-From").


You can use these in systems like procmail, eg:

 :0whf
 * from:.*
 | fix-dkim-from

It is annoying, and I'm happy to help people utilise these recipes if possible.  
Most all-in-one mail readers (Thunderbird, GMail, Apple Mail etc) are a bit too 
dumb, but if you can do your mail collection separately from your reader you 
can usually insert something in the processing.


It is nowhere near as annoying as the usenet<->mail gateway which is eating 
message-ids; that is truly uncivilised.


Cheers,
Cameron Simpson 
--
https://mail.python.org/mailman/listinfo/python-list


QWERTY was not designed to intentionally slow typists down (was: Unicode normalisation [was Re: [beginner] What's wrong?])

2016-04-08 Thread Ben Finney
Dennis Lee Bieber  writes:

> [The QWERTY keyboard layout] was a sane design -- for early mechanical
> typewriters. It fulfills its goal of slowing down a typist to reduce
> jamming print-heads at the platen.

This is an often-repeated myth, with citations back as far as the 1970s.
It is false.

The design is intended to reduce jamming the print heads together, but
the goal of this is not to reduce speed, but to enable *fast* typing.

It aims to maximise the frequency with which (English-language) text has
consecutive letters alternating either side of the middle of the
keyboard. This should thus reduce collisions of nearby heads — and hence
*increase* the effective typing speed that can be achieved on such a
mechanical typewriter.

The degree to which this maximum was achieved is arguable. Certainly the
relevance to keyboards today, with no connection from the layout to
whether print heads will jam, is negligible.

What is not arguable is that there is no evidence the design had any
intention of *slowing* typists in any way. Quite the opposite, in fact.

<http://www.straightdope.com/columns/read/221/was-the-qwerty-keyboard-purposely-designed-to-slow-typists>,
and other links from the Wikipedia article
<https://en.wikipedia.org/wiki/QWERTY#History_and_purposes>, should
allow interested people to get the facts right on this canard.

-- 
 \ “I used to think that the brain was the most wonderful organ in |
  `\   my body. Then I realized who was telling me this.” —Emo Philips |
_o__)  |
Ben Finney

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: QWERTY was not designed to intentionally slow typists down (was: Unicode normalisation [was Re: [beginner] What's wrong?])

2016-04-08 Thread Steven D'Aprano
On Sat, 9 Apr 2016 10:43 am, Ben Finney wrote:

> Dennis Lee Bieber  writes:
> 
>> [The QWERTY keyboard layout] was a sane design -- for early mechanical
>> typewriters. It fulfills its goal of slowing down a typist to reduce
>> jamming print-heads at the platen.
> 
> This is an often-repeated myth, with citations back as far as the 1970s.
> It is false.
> 
> The design is intended to reduce jamming the print heads together, but
> the goal of this is not to reduce speed, but to enable *fast* typing.

And how did it enable fast typing? By *slowing down the typist*, and thus
having fewer jams.

Honestly, I have the greatest respect for the Straight Dope, but this is one
of those times when they miss the forest for the trees. The conventional
wisdom about typewriters isn't wrong -- or at least there's no evidence
that it's wrong.

As far as I can tell, *every single* argument against the conventional wisdom
comes down to an argument that it is ridiculous or silly that anyone might
have wanted to slow typing down. For example, Wikipedia links to this page:

http://www.smithsonianmag.com/arts-culture/fact-of-fiction-the-legend-of-the-qwerty-keyboard-49863249/?no-ist

which quotes researchers:

“The speed of Morse receiver should be equal to the Morse sender, of course.
If Sholes really arranged the keyboard to slow down the operator, the
operator became unable to catch up the Morse sender. We don’t believe that
Sholes had such a nonsense intention during his development of
Type-Writer.”

This is merely argument from personal incredibility:

http://rationalwiki.org/wiki/Argument_from_incredulity

and is trivially answerable: how well do you think the receiver can keep up
with the sender if they have to stop every few dozen keystrokes to unjam
the typewriter?

Wikipedia states:

"Contrary to popular belief, the QWERTY layout was not designed to slow the
typist down,[3]"

with the footnote [3] linking to

http://www.maltron.com/media/lillian_kditee_001.pdf

which clearly and prominently states in the THIRD paragraph:

"It has been said of the Sholes letter layout [QWERTY] this it would
probably have been chosen if the objective was to find the least
efficient -- in terms of learning time and speed achievable -- and the most
error producing character arrangement. This is not surprising when one
considers that a team of people spent one year developing this layout so
that it should provide THE GREATEST INHIBITION TO FAST KEYING. [Emphasis
added.] This was no Machiavellian plot, but necessary because the mechanism
of the early typewriters required slow operation."

This is the power of the "slowing typists down is a myth" meme: same
Wikipedia contributor takes an article which *clearly and obviously*
repeats the conventional narrative that QWERTY was designed to decrease the
number of key presses per second, and uses that to defend the counter-myth
that QWERTY wasn't designed to decrease the number of key presses per
second!

These are the historical facts:

- early typewriters had varying layouts, some of which allowed much more rapid
keying than QWERTY;

- early typewriters were prone to frequent and difficult jamming;

- Sholes spent significant time developing a layout which reduced the number
of jams by intentionally moving frequently typed characters far apart,
which has the effect of slowing down the rate at which the typist can hit
keys;

- which results in greater typing speed due to a reduced number of jams.

In other words the conventional story.

Jams have such a massively negative effect on typing speed that reducing the
number of jams gives you a *huge* win on overall speed even if the rate of
keying is significantly lower. At first glance, it may seem paradoxical,
but it's not. Which is faster?

- typing at a steady speed of (let's say) 100 words per minute;

- typing in bursts of (say) 200 wpm for a minute, followed by three minutes
of 0 wpm.

The second case averages half the speed of the first, even though the typist
is hitting keys at a faster rate. This shouldn't be surprising to any car
driver who has raced from one red light to the next, only to be caught up
and even overtaken by somebody driving at a more sedate speed who caught
nothing but green lights. Or to anyone who has heard the story of the
Tortoise and the Hare.
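The arithmetic, spelled out:

    # one 4-minute cycle: 200 wpm for 1 minute, then 0 wpm for 3 minutes
    print((200 * 1 + 0 * 3) / 4)   # 50.0 -- half the steady 100 wpm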

The moral of QWERTY is "less haste, more speed".

The myth of the "QWERTY myth" is based on the idea that people are unable to
distinguish between peak speed and average speed. But ironically, in my
experience, it's only those repeating the myth who seem confused by that
difference (as in the quote from the Smithsonian above). Most people don't
need the conventional narrative explained:

"Speed up typing by slowing the typist down? Yeah, that makes sense. When I
try to do things in a rush, I make more mistakes and end up taking longer
than I otherwise would have. This is exactly the same sort of principle."

while others, like our dear Cecil from the Straight Dope, wrongly imagine
that o

Re: QWERTY was not designed to intentionally slow typists down (was: Unicode normalisation [was Re: [beginner] What's wrong?])

2016-04-09 Thread Random832
On Fri, Apr 8, 2016, at 23:28, Steven D'Aprano wrote:
> And how did it enable fast typing? By *slowing down the typist*, and thus
> having fewer jams.

Er, no? The point is that type bars that are closer together collide
more easily *at the same actual typing speed* than ones that are further
apart - For Q to collide with P, they would have to both be nearly all
the way to the platen at the same time, whereas Q can collide with A
even a mere millimeter from the basket (or anywhere in between).

I don't understand where this idea that alternating hands slows you
down came from in the first place... I suspect it's people who
haven't really thought for a minute about the physical process of typing
(to type "ec" you have to physically move your left hand, to type "en"
your right hand can already be moving into place while your left hand
presses the first key. The former is clearly slower than the latter.)
This goes double for hunt-and-peck typing, where you have to move your
whole hand to press _any_ two keys on the same hand.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: QWERTY was not designed to intentionally slow typists down (was: Unicode normalisation [was Re: [beginner] What's wrong?])

2016-04-09 Thread Random832
On Fri, Apr 8, 2016, at 23:28, Steven D'Aprano wrote:
> This is the power of the "slowing typists down is a myth" meme: same
> Wikipedia contributor takes an article which *clearly and obviously*
> repeats the conventional narrative that QWERTY was designed to
> decrease the number of key presses per second, and uses that to defend
> the counter-myth that QWERTY wasn't designed to decrease the number of
> key presses per second!

Er, the footnote is clearly and obviously being used to cite the claim
that that is popularly believed, not the claim that it's incorrect.

> These are the historical facts:

> - Sholes spent significant time developing a layout which reduced the
>   number of jams by intentionally moving frequently typed characters
>   far apart, which has the effect of slowing down the rate at which
>   the typist can hit keys;

"Moving characters far apart has the effect of slowing down the rate at
which the typist can hit keys" is neither a fact nor historical. Keys
that are further apart *can be hit faster without jamming* due to the
specifics of the type-basket mechanism, and there's no reason to think
that they can't be hit with at least equal speed by the typist.

Take a typewriter. Press Q and A (right next to each other) at the same
time, and observe the distance from the type basket where the jam
occurs. Now press Q and P (on the opposite side of the basket from each
other) and observe where the jam occurs.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: QWERTY was not designed to intentionally slow typists down (was: Unicode normalisation [was Re: [beginner] What's wrong?])

2016-04-09 Thread pyotr filipivich
Dennis Lee Bieber  on Sat, 09 Apr 2016 14:52:50
-0400 typed in comp.lang.python  the following:
>On Sat, 09 Apr 2016 11:44:48 -0400, Random832 
>declaimed the following:
>
>>I don't understand where this idea that alternating hands slows you
>>down came from in the first place... I suspect it's people who
>
>   It's not (to my mind) the alternation that slows one down. It's the
>combination of putting common letters under weak fingers and some
>combinations that require the same hand/finger to slow one down.
>
>In "aspect", a is on the weakest left finger, with the s on a finger that
>many people have trouble moving independently from the middle finger (hmm,
>I seem to be okay moving the ring finger, but moving the middle finger
>tends to drag the ring with it). p is the weakest finger of the right hand.
>e&c use the same finger of the left hand, t is the strongest finger but one
>is coming off the lower-row reach of middle-finger c.
>
>"deaf" is all left hand, and the de is the same finger... "earth" except
>for the h is also all left hand, and rt are the same finger.
>
>   I suspect for any argument for one side, a corresponding counter can be
>made for the other side. There are only 5.5 vowels (the .5 is Y) in
>English, so they are likely more prevalent than the 20-odd consonants when
>taken singly. Yet A is on the weakest finger on the weakest (for most of
>the populace) hand. IOU OTOH are in a fast three-finger roll -- and worse,
>IO is fairly common (all the ***ion endings).

ASINTOER are the top eight English letters (not in any order, it
is just that "A Sin To Err" is easy to remember).
--  
pyotr filipivich
The fears of one class of men are not the measure of the rights of another. 
-- George Bancroft
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: QWERTY was not designed to intentionally slow typists down (was: Unicode normalisation [was Re: [beginner] What's wrong?])

2016-04-10 Thread Ian Kelly
On Sat, Apr 9, 2016 at 9:09 PM, pyotr filipivich  wrote:
> ASINTOER are the top eight English letters (not in any order, it
> is just that "A Sin To Err" is easy to remember).

What's so hard to remember about ETA OIN SHRDLU? Plus that even gives
you the top twelve. :-)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: QWERTY was not designed to intentionally slow typists down (was: Unicode normalisation [was Re: [beginner] What's wrong?])

2016-04-10 Thread pyotr filipivich
Ian Kelly  on Sun, 10 Apr 2016 07:43:13 -0600
typed in comp.lang.python  the following:
>On Sat, Apr 9, 2016 at 9:09 PM, pyotr filipivich  wrote:
>> ASINTOER are the top eight English letters (not in any order, it
>> is just that "A Sin To Err" is easy to remember).
>
>What's so hard to remember about ETA OIN SHRDLU? Plus that even gives
>you the top twelve. :-)

Depends on what you're looking for, I suppose.  In this case,
those eight get encoded differently than the other 20 characters.
--  
pyotr filipivich
The fears of one class of men are not the measure of the rights of another. 
-- George Bancroft
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: QWERTY was not designed to intentionally slow typists down (was: Unicode normalisation [was Re: [beginner] What's wrong?])

2016-04-17 Thread Steven D'Aprano
Oh no, it's the thread that wouldn't die! *wink*


On Sun, 10 Apr 2016 01:53 am, Random832 wrote:

> On Fri, Apr 8, 2016, at 23:28, Steven D'Aprano wrote:
>> This is the power of the "slowing typists down is a myth" meme: same
>> Wikipedia contributor takes an article which *clearly and obviously*
>> repeats the conventional narrative that QWERTY was designed to
>> decrease the number of key presses per second, and uses that to defend
>> the counter-myth that QWERTY wasn't designed to decrease the number of
>> key presses per second!
> 
> Er, the footnote is clearly and obviously being used to cite the claim
> that that is popularly believed, not the claim that it's incorrect.

That's not clear nor obvious to me. But I won't quibble, I'll accept that as
a plausible interpretation.


>> These are the historical facts:
> 
>> - Sholes spend significant time developing a layout which reduced the
>>   number of jams by intentionally moving frequently typed characters
>>   far apart, which has the effect of slowing down the rate at which
>>   the typist can hit keys;
> 
> "Moving characters far apart has the effect of slowing down the rate at
> which the typist can hit keys" is neither a fact nor historical.

Actually, yes it is. At least, according to this website:

http://www.mit.edu/~jcb/Dvorak/history.html

  [quote]
  Because typists at that time used the "hunt-and-peck" method,
  Sholes's arrangement increased the time it took for the typists
  to hit the keys for common two-letter combinations enough to
  ensure that each type bar had time to fall back sufficiently
  far to be out of the way before the next one came up.
  [end quote]


The QWERTY layout was first sold in 1873 while the first known use of
ten-fingered typing was in 1878, and touch-typing wasn't invented for
another decade, in 1888.

So I think it is pretty clear that *at the time QWERTY was invented*
it slowed down the rate at which keys were pressed, thus allowing an
overall greater typing speed thanks to the reduced jamming.

Short of a signed memo from Shole himself, commenting one way or another, I
don't think we're going to find anything more definitive.

Even though QWERTY wasn't designed with touch-typing in mind, it's
interesting to look at some of the weaknesses of the system. It is almost
as if it had been designed to make touch-typing as inefficient as
possible :-) Just consider the home keys. The home keys require the least
amount of finger or hand movement, and are therefore the fastest to reach.
With QWERTY, the eight home keys only cover a fraction over a quarter of
all key presses: ASDF JKL; have frequencies of

8.12% 6.28% 4.32% 2.30% 0.10% 0.69% 3.98% and effectively 0%

making a total of 25.79%. If you also include G and H as "virtual
home-keys", that rises to 33.74%.

But that's far less than the obvious tactic of using the most common
letters ETAOIN as the home keys, which would cover 51.18% just from those
six keys alone. The 19th century Blickensderfer typewriter used a similar
layout, with DHIATENSOR as the ten home keys. This would
allow the typist to make just under 74% of all alphabetical key presses
without moving the hands.

https://en.wikipedia.org/wiki/Blickensderfer_typewriter

Letter frequencies taken from here:

http://www.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html
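A quick sketch to check those sums against the Cornell table (percent
values copied from the page above):

    freq = dict(e=12.02, t=9.10, a=8.12, o=7.68, i=7.31, n=6.95, s=6.28,
                r=6.02, h=5.92, d=4.32, l=3.98, f=2.30, g=2.03, k=0.69,
                j=0.10)

    def coverage(keys):
        return round(sum(freq.get(c, 0.0) for c in keys), 2)

    print(coverage("asdfjkl;"))     # 25.79
    print(coverage("asdfghjkl;"))   # 33.74
    print(coverage("etaoin"))       # 51.18
    print(coverage("dhiatensor"))   # 73.72, i.e. just under 74%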


> Keys 
> that are further apart *can be hit faster without jamming* due to the
> specifics of the type-basket mechanism, and there's no reason to think
> that they can't be hit with at least equal speed by the typist.

You may be correct about that specific issue when it comes to touch typing,
but touch typing was 15 years in the future when Sholes invented QWERTY.
And unlike Guido, he didn't have a time-machine :-)



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: QWERTY was not designed to intentionally slow typists down (was: Unicode normalisation [was Re: [beginner] What's wrong?])

2016-04-17 Thread Random832
On Sun, Apr 17, 2016, at 21:39, Steven D'Aprano wrote:
> Oh no, it's the thread that wouldn't die! *wink*
>
> Actually, yes it is. At least, according to this website:
> 
> http://www.mit.edu/~jcb/Dvorak/history.html

I'd really rather see an instance of the claim not associated with
Dvorak marketing. It only holds up as an obvious inference from the
nature of how typing works if we assume *one*-finger hunt-and-peck
rather than two-finger. Your website describes two-finger as the method
that was being replaced by the 1878 introduction of ten-finger typing.

> The QWERTY layout was first sold in 1873 while the first known use of
> ten-fingered typing was in 1878, and touch-typing wasn't invented for
> another decade, in 1888.

Two-finger hunt-and-peck is sufficient for placing keys on opposite
hands to speed typing up rather than slow it down.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: QWERTY was not designed to intentionally slow typists down (was: Unicode normalisation [was Re: [beginner] What's wrong?])

2016-04-17 Thread Chris Angelico
On Mon, Apr 18, 2016 at 11:39 AM, Steven D'Aprano  wrote:
> With QWERTY, the eight home keys only cover a fraction over a quarter of
> all key presses: ASDF JKL; have frequencies of
>
> 8.12% 6.28% 4.32% 2.30% 0.10% 0.69% 3.98% and effectively 0%
>
> making a total of 25.79%. If you also include G and H as "virtual
> home-keys", that rises to 33.74%.

Hey, that's a little unfair. Remember, lots of people still have to
write C code, so the semicolon is an important character! :) In fact,
skimming the CPython source code (grouped by file extension) shows
that C code has more semicolons than j's or k's:

a 3.19% s 3.26% d 1.90% f 1.76% g 0.95% h 0.89% j 0.36% k 0.35% l 2.62% ; 1.40%

for a total of 16.69% of characters coming from the home row.
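(A rough sketch of how you might reproduce that -- "cpython" below is a
placeholder for wherever your checkout lives:)

    import pathlib
    from collections import Counter

    counts, total = Counter(), 0
    for path in pathlib.Path("cpython").rglob("*.c"):
        text = path.read_text(errors="ignore").lower()
        counts.update(text)
        total += len(text)
    for ch in "asdfghjkl;":
        print(ch, "%.2f%%" % (100 * counts[ch] / total))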

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: QWERTY was not designed to intentionally slow typists down (was: Unicode normalisation [was Re: [beginner] What's wrong?])

2016-04-18 Thread Steven D'Aprano
On Monday 18 April 2016 12:01, Random832 wrote:

> On Sun, Apr 17, 2016, at 21:39, Steven D'Aprano wrote:
>> Oh no, it's the thread that wouldn't die! *wink*
>>
>> Actually, yes it is. At least, according to this website:
>> 
>> http://www.mit.edu/~jcb/Dvorak/history.html
> 
> I'd really rather see an instance of the claim not associated with
> Dvorak marketing. 

So would I, but this is hardly Dvorak *marketing*. The author even points 
out that the famous case-study done by the US Navy was "biased, and at 
worst, fabricated".

http://www.mit.edu/~jcb/Dvorak/

And he too repeats the canard that "Contrary to popular opinion" QWERTY 
wasn't designed to slow typists down. (Even though he later goes on to 
support the popular opinion.)

You can also read the article in Reason magazine:

http://reason.com/archives/1996/06/01/typing-errors

You can skip the entire first page -- it is almost entirely a screed against 
government regulation and a defence of the almighty free market. But the 
article goes through some fairly compelling evidence that Dvorak keyboards 
are barely more efficient that QWERTY, and that there was plenty of 
competition in type-writers in the late 1800s.

I don't agree with the Reason article that they have disproven the 
conventional wisdom that QWERTY won the typewriter wars due to luck and 
path-dependence. The authors are (in my opinion) overly keen to dismiss 
path-dependence, for instance taking it as self-evidently true that the use 
of QWERTY in the US would have no influence over other countries' choice in 
key layout. But it does support the contention that, at the time, QWERTY was 
faster than the alternatives.

Unfortunately, what it doesn't talk about is whether or not the alternate 
layouts had fewer jams.

Wikipedia's article on QWERTY shows the various designs used by Sholes and 
Remington, leading to the modern layout

https://en.wikipedia.org/wiki/QWERTY

One serious problem for discussion is that the QWERTY keyboard we use now is 
*not* the same as that designed by Sholes. For instance, one anomaly is that 
two very common digraphs, ER and RE, are right next to each other. But 
that's not how Sholes laid out the keys. On his keyboard, the top row was 
initially AEI.?Y then changed to QWE.TY. Failure to recognise this leads to 
errors like this blogger's claim that it is "wrong" that QWERTY was designed 
to break apart common digraphs:

http://yasuoka.blogspot.com.au/2006/08/sholes-discovered-that-many-english.html

Even on a modern keyboard, out of the ten most common digraphs:

th he in er an re nd at on nt

only er/re use consecutive keys, and five out of the ten use alternate 
hands. Move the R back to its original position, and there are none with 
consecutive keys and seven with alternate hands.
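That five-of-ten tally is easy to check mechanically (a sketch, using the 
conventional touch-typing split of the keyboard between the hands):

    left = set("qwertasdfgzxcvb")   # the right hand gets the rest
    digraphs = "th he in er an re nd at on nt".split()
    alt = [d for d in digraphs if (d[0] in left) != (d[1] in left)]
    print(alt)   # ['th', 'he', 'an', 'nd', 'nt'] -- five of the ten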

> It only holds up as an obvious inference from the
> nature of how typing works if we assume *one*-finger hunt-and-peck
> rather than two-finger.

I don't agree, but neither can I prove it conclusively.


> Your website describes two-finger as the method
> that was being replaced by the 1878 introduction of ten-finger typing.
> 
>> The QWERTY layout was first sold in 1873 while the first known use of
>> ten-fingered typing was in 1878, and touch-typing wasn't invented for
>> another decade, in 1888.
> 
> Two-finger hunt-and-peck is sufficient for placing keys on opposite
> hands to speed typing up rather than slow it down.

Correct, once you take into account jamming. That's the whole point of 
separating the keys. But consider common letter combinations that can be 
typed by the one hand: QWERTY has a significant number of quite long words 
that can be typed with one hand, the *left* hand. That's actually quite 
harmful for both typing speed and accuracy.

Anyway, you seem to have ignored (or perhaps you just have nothing to say) 
my comments about the home keys. It seems clear to me that even with two-
finger typing, a layout that puts ETAOIN on the home keys, such as the 
Blickensderfer typewriter, would minimize the distance travelled by the 
fingers and improve typing speed -- but only so long as the problem of 
jamming was solved.

Interestingly, Wikipedia makes it clear that in the 19th century, the 
problem of jamming arms was already solved by doing away with the arms and 
using a wheel or a ball.



-- 
Steve

-- 
https://mail.python.org/mailman/listinfo/python-list

