Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-20 Thread rurpy

On May 17, 5:03 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> On May 16, 6:38 pm, [EMAIL PROTECTED] wrote:
> > Are you worried that some 3rd-party package you have
> > included in your software will have some non-ascii identifiers
> > buried in it somewhere?  Surely that is easy to check for?
> > Far easier than checking that it doesn't have some trojan
> > code in it, it seems to me.
>
> What do you mean, "check for"?  If, say, numeric starts using math
> characters (as has been suggested), I'm not exactly going to stop
> using numeric.  It'll still be a lot better than nothing, just
> slightly less better than it used to be.

The PEP explicitly states that no non-ascii identifiers
will be permitted in the standard library.  The opinions
expressed here seem almost unanimous that non-ascii
identifiers are a bad idea in any sort of shared public
code.  Why do you think the occurrence of non-ascii
identifiers in Numpy is likely?

> > > And I'm often not creating a stack trace procedure, I'm using the
> > > built-in python procedure.
> >
> > > And I'm often dealing with mailing lists, Usenet, etc where I don't
> > > know ahead of time what the other end's display capabilities are, how
> > > to fix them if they don't display what I'm trying to send, whether
> > > intervening systems will mangle things, etc.
> >
> > I think we all are in this position.  I always send plain
> > text mail to mailing lists, people I don't know etc.  But
> > that doesn't mean that email software should be constrained
> > to only 7-bit plain text, no attachments!  I frequently use
> > such capabilities when they are appropriate.
>
> Sure.  But when you're talking about maintaining code, there's a very
> high value to having all the existing tools work with it whether
> they're wide-character aware or not.

I agree.  On Windows I often use Notepad to edit
python files.  (There goes my credibility! :-)
So I don't like tab-only indent proposals that assume
I can set tabs to be an arbitrary number of spaces.
But tab-only indentation would affect every python
program and every python programmer.

In the case of non-ascii identifiers, the potential
gains are so big for non-english speakers, and (IMO)
the difficulty of working with non-ascii identifiers
times the probability of having to work with them
so low, that the former clearly outweighs the latter.

> > If your response is, "yes, but look at the problems html
> > email, virus-infected attachments etc. cause", the situation
> > is not the same.  You have little control over what kind of
> > email people send you but you do have control over what
> > code, libraries, patches, you choose to use in your
> > software.
> >
> > If you want to use ascii-only, do it!  Nobody is making
> > you deal with non-ascii code if you don't want to.
>
> Yes.  But it's not like this makes things so horribly awful that it's
> worth my time to reimplement large external libraries.  I remain at -0
> on the proposal;

> it'll cause some headaches for the majority of
> current Python programmers, but it may have some benefits to a
> sizeable minority

This is the crux of the matter, I think: the fear that
non-ascii identifiers will spread like a virus, infecting
program after program until every piece of Python code
is nothing but a mass of writhing, unintelligible
non-ascii characters.  (OK, maybe I am overstating a little. :-)

I (and I think other proponents) don't think this is
likely to happen, and the benefits to non-english
speakers of being able to write maintainable code far
outweigh the very rare case when it does occur.

> and may help bring in new coders.  And it's not
> going to cause flaming catastrophic death or anything.

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-20 Thread Christophe Cavalaria

Istvan Albert wrote:

> On May 19, 3:33 am, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> 
>> >That would be invalid syntax since the third line is an assignment
>> > with target identifiers separated only by spaces.
>>
>> Plus, the identifier starts with a number (even though ６ is not DIGIT
>> SIX, but FULLWIDTH DIGIT SIX, it's still of category Nd, and can't
>> start an identifier).
> 
> Actually both of these issues point to the real problem with this PEP.
> 
> I knew about them (note that the colon is also missing) alas I
> couldn't fix them.
> My editor could not remove a space or add a colon anymore; it
> would immediately change the rest of the characters to something
> crazy.
> 
> (Of course now someone might feel compelled to state that this is an
> editor problem, but I digress. The reality is that features need to
> adapt to reality; moreover, had I used a different editor I'd still be
> unable to write these characters.)

The reality is that the few users who care about having chinese in their
code *will* be using an editor that supports them.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread Istvan Albert

On May 19, 3:33 am, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:

> >That would be invalid syntax since the third line is an assignment
> > with target identifiers separated only by spaces.
>
> Plus, the identifier starts with a number (even though ６ is not DIGIT
> SIX, but FULLWIDTH DIGIT SIX, it's still of category Nd, and can't
> start an identifier).

Actually both of these issues point to the real problem with this PEP.

I knew about them (note that the colon is also missing) alas I
couldn't fix them.
My editor could not remove a space or add a colon anymore; it
would immediately change the rest of the characters to something
crazy.

(Of course now someone might feel compelled to state that this is an
editor problem, but I digress. The reality is that features need to
adapt to reality; moreover, had I used a different editor I'd still be
unable to write these characters.)

i.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread Peter Maas

Martin v. Löwis wrote:
> Python code is written by many people in the world who are not familiar
> with the English language, or even well-acquainted with the Latin
> writing system.

I believe that there is not a single programmer in the world who doesn't
know ASCII. It isn't hard to learn the latin alphabet and you have to know
it anyway to use the keywords and the other ASCII characters to write numbers,
punctuation etc. Most non-western alphabets have ASCII transcription rules
and contain ASCII as a subset. On the other hand non-ascii identifiers
lead to fragmentation and less understanding in the programming world so I
don't like them. I also don't like non-ascii domain names where the same
arguments apply.

Let the data be expressed with Unicode but the logic with ASCII.

-- 
Regards/Gruesse,

Peter Maas, Aachen
E-mail 'cGV0ZXIubWFhc0B1dGlsb2cuZGU=\n'.decode('base64')

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread René Fleschenberg

Martin v. Löwis wrote:
>>> Then get tools that match your working environment.
>> Integration with existing tools *is* something that a PEP should
>> consider. This one does not do that sufficiently, IMO.
> 
> What specific tools should be discussed, and what specific problems
> do you expect?

Systems that cannot display code parts correctly. I expect problems with
unreadable tracebacks, for example.

Also: are existing tools that somehow process Python source code, e.g. to
test whether it meets certain criteria (pylint & co) or to aid in
creating documentation (epydoc & co), fully unicode-ready?

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread René Fleschenberg

Martin v. Löwis wrote:
> I've reported this before, but happily do it again: I have lived many
> years without knowing what a "hub" is, and what "to pass" means if
> it's not the opposite of "to fail". Yet, I have used their technical
> meanings correctly all these years.

I was not speaking of the more general (non-technical) meanings, but of
the technical ones. The claim which I challenged was that people learn
just the "use" (syntax) but not the "meaning" (semantics) of these
terms. I think you are actually supporting my argument ;)

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread Richard Hanson

On Fri, 18 May 2007 06:28:03 +0200, Martin v. Löwis wrote:

[excellent as always exposition by Martin]

Thanks, Martin. 

> P.S. Anybody who wants to play with generating visualisations
> of the PEP, here are the functions I used:

[code snippets]

Thanks for those functions, too -- I've been exploring with them and
am slowly coming to some understanding.

 -- Richard Hanson

"To many native-English-speaking developers well versed in other
programming environments, Python is *already* a foreign language --
judging by the posts here in c.l.py over the years." ;-)
__


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread Martin v. Löwis

>> But you're making a strawman argument by using extended ASCII
>> characters that would work anyhow. How about debugging this (I wonder
>> will it even make it through?) :
>>
>> class ６자회담관련론조
>>６자회 = 0
>>６자회담관련 고귀 명=10
> 
>That would be invalid syntax since the third line is an assignment
> with target identifiers separated only by spaces.

Plus, the identifier starts with a number (even though ６ is not DIGIT
SIX, but FULLWIDTH DIGIT SIX, it's still of category Nd, and can't
start an identifier).
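
The category argument can be checked with the unicodedata module. What follows is a minimal sketch, assuming a Python 3 interpreter (where str.isidentifier() is available); the variable names are only illustrative.

```python
import unicodedata

fullwidth_six = "６"   # FULLWIDTH DIGIT SIX, not the ASCII digit "6"

name = unicodedata.name(fullwidth_six)
category = unicodedata.category(fullwidth_six)

# A character of category Nd may continue an identifier,
# but it may not be the first character.
starts_with_digit = "６자회"
digit_in_middle = "자회６"

print(name, category)
print(starts_with_digit.isidentifier())
print(digit_in_middle.isidentifier())
```

So the fullwidth digit is treated like any other decimal digit for the purpose of identifier syntax, regardless of how it looks on screen.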

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread Martin v. Löwis

> Providing a method that would translate an arbitrary string into a
> valid Python identifier would be helpful. It would be even more
> helpful if it could provide a way of converting untranslatable
> characters. However, I suspect that the translate (normalize?) routine
> in the unicode module will do.

Not at all. Unicode normalization only unifies different "spellings"
of the same character.
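
To illustrate what "unifying spellings" means, here is a small sketch, assuming Python 3 and the standard unicodedata module. It compares two ways of writing the same word and shows that the normalized result is still not ASCII.

```python
import unicodedata

decomposed = "Schlu\u0308ssel"   # "u" followed by COMBINING DIAERESIS
composed = "Schl\u00fcssel"      # precomposed LATIN SMALL LETTER U WITH DIAERESIS

# The two spellings differ as code point sequences...
differ_before = decomposed != composed

# ...but NFC maps both to the same sequence.
nfc = unicodedata.normalize("NFC", decomposed)
equal_after = nfc == composed

# Normalization does not transliterate: the result still contains
# a non-ASCII character.
still_non_ascii = any(ord(ch) > 127 for ch in nfc)

print(differ_before, equal_after, still_non_ascii)
```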

For transliteration, no simple algorithm exists, as it generally depends
on the language. However, if you just want any kind of ASCII string,
you can use the Unicode error handlers (PEP 293). For example, the
program

import unicodedata, codecs

def namereplace(exc):
    # Replace each unencodable character with N_<UNICODE_NAME>_.
    if isinstance(exc,
                  (UnicodeEncodeError, UnicodeTranslateError)):
        s = u""
        for c in exc.object[exc.start:exc.end]:
            s += "N_"+unicode(unicodedata.name(c).replace(" ","_"))+"_"
        return (s, exc.end)
    else:
        # exception instances have no __name__; use the class name
        raise TypeError("can't handle %s" % exc.__class__.__name__)

codecs.register_error("namereplace", namereplace)

print u"Schl\xfcssel".encode("ascii", "namereplace")

prints SchlN_LATIN_SMALL_LETTER_U_WITH_DIAERESIS_ssel.
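
The program above is written for Python 2. A rough Python 3 equivalent might look like the sketch below. The handler name "name_underscore" is an assumption chosen for this example: Python 3.5 and later already ship a built-in "namereplace" handler (which emits \N{...} escapes), and registering under the same name would replace it.

```python
import codecs
import unicodedata

def name_underscore(exc):
    """Replace each unencodable character with N_<UNICODE_NAME>_."""
    if isinstance(exc, (UnicodeEncodeError, UnicodeTranslateError)):
        pieces = []
        for ch in exc.object[exc.start:exc.end]:
            # unicodedata.name() raises ValueError for unnamed characters;
            # this sketch does not handle that case.
            pieces.append("N_" + unicodedata.name(ch).replace(" ", "_") + "_")
        return ("".join(pieces), exc.end)
    raise TypeError("can't handle %s" % type(exc).__name__)

# Hypothetical handler name, picked to avoid clashing with the built-in one.
codecs.register_error("name_underscore", name_underscore)

result = "Schl\xfcssel".encode("ascii", "name_underscore").decode("ascii")
print(result)
```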

HTH,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Javier Bezos

<@yahoo.com> wrote:

>> > Perhaps, but the treatment by your mail/news software plus the
>> > delightful Google Groups of the original text (which seemed intact in
>> > the original, although I don't have the fonts for the content) would
>> > suggest that not just social or cultural issues would be involved.
>>
>> The fact my Outlook changed the text is irrelevant
>> for something related to Python.
>
> On the contrary, it cuts to the heart of the problem.  There are
> hundreds of tools out there that programmers use, and mailing lists
> are certainly an incredibly valuable tool--introducing a change that
> makes code more likely to be silently mangled seems like a negative.

In such a case, the Python indentation should be
rejected (quite interesting you removed from my
post the part mentioning it). I can promise there
are Korean groups and there are no problems at
all in using Hangul (the Korean writing).

Javier
-
http://www.texytipografia.com 



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Gregor Horvath

[EMAIL PROTECTED] wrote:

> opposed.  But dismissing the fact that Outlook and other quite common
> tools may have severe problems with code seems naive (or disingenuous,
> but I don't think that's the case here).

Of course there is broken software out there. There are even editors
that mix tabs and spaces ;-) Python did not introduce braces to solve
this problem but encouraged the use of appropriate tools. It seems to work
for 99% of us. Same here.
It is the 21st century. Tools that destroy Unicode byte streams are
seriously broken. Face it. You cannot halt progress because of some
broken software. Fix or drop it instead.

I do not think that this will be a big problem because only a very small 
fraction of specialized local code will use Unicode identifiers anyway.

Unicode strings and comments are allowed today and I haven't heard of a
single case of strings destroyed by bad editors, although I
guess that Unicode strings in code are way more common than Unicode
identifiers would ever be.

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread John Roth

On May 13, 9:44 am, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> PEP 1 specifies that PEP authors need to collect feedback from the
> community. As the author of PEP 3131, I'd like to encourage comments
> to the PEP included below, either here (comp.lang.python), or to
> [EMAIL PROTECTED]
>
> In summary, this PEP proposes to allow non-ASCII letters as
> identifiers in Python. If the PEP is accepted, the following
> identifiers would also become valid as class, function, or
> variable names: Löffelstiel, changé, ошибка, or 売り場
> (hoping that the latter one means "counter").

I notice that Guido has approved it, so I'm looking at what it would
take to support it for Python FIT. The actual issue (for me) is
translating labels for cell columns (and similar) into Python
identifiers. After looking at the firestorm, I've come to the
conclusion that the old methods need to be retained not only for
backwards compatibility but also for people who want to translate
existing fixtures.

The guidelines in PEP 3131 for standard library code appear to be
adequate for code that's going to be contributed to the community. I
will most likely emphasize those in my documentation.

Providing a method that would translate an arbitrary string into a
valid Python identifier would be helpful. It would be even more
helpful if it could provide a way of converting untranslatable
characters. However, I suspect that the translate (normalize?) routine
in the unicode module will do.
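
As a rough illustration of the kind of method asked for here, the sketch below turns a column label into a valid identifier. It assumes Python 3; label_to_identifier is a hypothetical helper written for this example and is not part of Python FIT or the standard library. Characters that cannot appear in an identifier are replaced by underscores rather than transliterated.

```python
import keyword
import unicodedata

def label_to_identifier(label):
    """Map an arbitrary label to a valid Python 3 identifier (lossy)."""
    # NFKC is the normal form applied to identifiers by the final PEP.
    text = unicodedata.normalize("NFKC", label.strip())
    # Keep characters that may continue an identifier, replace the rest.
    chars = [ch if ("a" + ch).isidentifier() else "_" for ch in text]
    name = "".join(chars) or "_"
    # Guard against a leading digit and against reserved words.
    if not name.isidentifier() or keyword.iskeyword(name):
        name = "_" + name
    return name

print(label_to_identifier("2nd column"))
print(label_to_identifier("class"))
```

Different labels can collapse to the same identifier under this scheme, so a real fixture would still need some way of detecting clashes.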

John Roth
Python FIT


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Neil Hodgson

Istvan Albert:

> But you're making a strawman argument by using extended ASCII
> characters that would work anyhow. How about debugging this (I wonder
> will it even make it through?) :
> 
> class ６자회담관련론조
>６자회 = 0
>６자회담관련 고귀 명=10

That would be invalid syntax since the third line is an assignment 
with target identifiers separated only by spaces.

Neil

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Paul Boddie

Gregor Horvath wrote:
> Paul Boddie wrote:
>
> > Perhaps, but the treatment by your mail/news software plus the
> > delightful Google Groups of the original text (which seemed intact in
> > the original, although I don't have the fonts for the content) would
> > suggest that not just social or cultural issues would be involved.
>
> I do not see the point.
> Whether my editor or newsreader displays the text correctly or not makes no
> difference to me, since I do not understand a word of it anyway. It's a
> meaningless stream of bits to me.

But if your editor doesn't even bother to preserve those bits
correctly, it makes a big difference. When ６자회담관련론조 becomes 6???
because someone's tool did the equivalent of
unicode_obj.encode("iso-8859-1", "replace"), then the stream of bits
really does become meaningless. (We'll see if the former identifier
even resembles what I've just pasted later on, or whether it resembles
the latter.)
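
The loss described above is easy to reproduce. This is a minimal sketch, assuming Python 3, where str plays the role of the unicode_obj mentioned in the text. With a strict Latin-1 encoder and the "replace" handler, every character of this identifier, including the fullwidth digit, turns into a question mark.

```python
identifier = "６자회담관련론조"

# Lossy step: characters outside Latin-1 are replaced by "?".
mangled = identifier.encode("iso-8859-1", "replace")
restored = mangled.decode("iso-8859-1")

# A tool that keeps the text in UTF-8 preserves it exactly.
preserved = identifier.encode("utf-8").decode("utf-8")

print(mangled)
print(restored == identifier, preserved == identifier)
```

Once the replacement has happened there is no way to recover the original name from the bytes alone.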

> It's safe to assume that for people who are finding this meaningful
> their setup will display it correctly. Otherwise they could not work
> with their computer anyway.

Sure, it's all about "editor discipline" or "tool discipline" just as
I wrote. I'm in favour of the PEP, generally, but I worry about the
long explanations required when people find that their programs are
now ill-formed because someone made a quick edit in a bad editor.

Paul


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread [EMAIL PROTECTED]

On May 18, 1:47 pm, "Javier Bezos" <[EMAIL PROTECTED]> wrote:
> >> This question is more or less what a Korean who doesn't
> >> speak English would ask if he had to debug a program
> >> written in English.
>
> > Perhaps, but the treatment by your mail/news software plus the
> > delightful Google Groups of the original text (which seemed intact in
> > the original, although I don't have the fonts for the content) would
> > suggest that not just social or cultural issues would be involved.
>
> The fact my Outlook changed the text is irrelevant
> for something related to Python.

On the contrary, it cuts to the heart of the problem.  There are
hundreds of tools out there that programmers use, and mailing lists
are certainly an incredibly valuable tool--introducing a change that
makes code more likely to be silently mangled seems like a negative.

Of course, there are other benefits to the PEP, so I'm only barely
opposed.  But dismissing the fact that Outlook and other quite common
tools may have severe problems with code seems naive (or disingenuous,
but I don't think that's the case here).


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Javier Bezos


>> This question is more or less what a Korean who doesn't
>> speak English would ask if he had to debug a program
>> written in English.
>
> Perhaps, but the treatment by your mail/news software plus the
> delightful Google Groups of the original text (which seemed intact in
> the original, although I don't have the fonts for the content) would
> suggest that not just social or cultural issues would be involved.

The fact my Outlook changed the text is irrelevant
for something related to Python. And just remember
how Google mangled the indentation of Python code
some time ago. This was a technical issue which has
been solved, and no doubt my laziness (I didn't
switch to Unicode) won't prevent non-ASCII identifiers
from being properly shown in general.

Javier
-
http://www.texytipografia.com




Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Gregor Horvath

Paul Boddie wrote:

> Perhaps, but the treatment by your mail/news software plus the
> delightful Google Groups of the original text (which seemed intact in
> the original, although I don't have the fonts for the content) would
> suggest that not just social or cultural issues would be involved.

I do not see the point.
Whether my editor or newsreader displays the text correctly or not makes no
difference to me, since I do not understand a word of it anyway. It's a
meaningless stream of bits to me.
It's safe to assume that for people who are finding this meaningful
their setup will display it correctly. Otherwise they could not work
with their computer anyway.

Until now I have not found a single computer in my German domain that
cannot display: ß.

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Paul Boddie

On 18 May, 18:42, "Javier Bezos" <[EMAIL PROTECTED]> wrote:
> "Istvan Albert" <[EMAIL PROTECTED]> wrote:
>
> > How about debugging this (I wonder will it even make it through?) :
>
> > class 6???
>
>  >   6?? = 0
>  >  6? ?? ?=10
>
> This question is more or less what a Korean who doesn't
> speak English would ask if he had to debug a program
> written in English.

Perhaps, but the treatment by your mail/news software plus the
delightful Google Groups of the original text (which seemed intact in
the original, although I don't have the fonts for the content) would
suggest that not just social or cultural issues would be involved.
It's already more difficult than it ought to be to explain to people
why they have trouble printing text to the console, for example, and
if one considers issues with badly configured text editors putting the
wrong character values into programs, even if Python complains about
it, there's still going to be some explaining to do.

One thing that some people already dislike about Python is the
"editing discipline" required. Although I don't have much time for
people whose coding "skills" involve random edits using badly
configured editors, trashing the indentation and the appearance of the
code (regardless of the language involved), we do need to consider the
need to bring people "up to speed" gracefully by encouraging the
proper use of tools, and so on, all without making it seem really
difficult and discouraging people from learning the language.

Paul


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Gregor Horvath

Istvan Albert wrote:
> On May 17, 2:30 pm, Gregor Horvath <[EMAIL PROTECTED]> wrote:
> 
>> Is there any difference for you in debugging these code snippets?
> 
>> class Türstock(object):
> 
> Of course there is, how do I type the ü ? (I can copy/paste for
> example, but that gets old quick).
> 

I doubt that you can debug the code without Unicode chars. It seems that
you do not understand German and therefore you do not know what the
purpose of this program is.
Can you tell me if there is an error in the snippet without Unicode?

I would refuse to try to debug a program that I do not understand.
Avoiding Unicode does not help a bit in this regard.

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Javier Bezos

"Istvan Albert" <[EMAIL PROTECTED]> wrote:

> How about debugging this (I wonder will it even make it through?) :
>
> class 6???
 >   6?? = 0
 >  6? ?? ?=10

This question is more or less what a Korean who doesn't
speak English would ask if he had to debug a program
written in English.

> (I don't know what it means, just copied over some words
> from a japanese news site,

A Japanese speaking Korean, it seems. :-)

Javier
--
http://www.texytipografia.com 



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Istvan Albert

On May 17, 2:30 pm, Gregor Horvath <[EMAIL PROTECTED]> wrote:

> Is there any difference for you in debugging these code snippets?

> class Türstock(object):

Of course there is, how do I type the ü ? (I can copy/paste for
example, but that gets old quick).

But you're making a strawman argument by using extended ASCII
characters that would work anyhow. How about debugging this (I wonder
will it even make it through?) :

class ６자회담관련론조
   ６자회 = 0
   ６자회담관련 고귀 명=10

(I don't know what it means, just copied over some words from a
japanese news site, but the first thing it did was mess up my editor,
which would not type the colon anymore)

i.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Gregor Horvath

Hendrik van Rooyen wrote:

> I suppose that this "one language track" - mindedness of mine
> is why I find the mix of keywords and German or Afrikaans so 
> abhorrent - I cannot really help it, it feels as if I am eating a 
> sandwich, and that I bite on a stone in the bread. - It just jars.

Please come to Vienna and learn the local slang.
You would be surprised how beautiful and expressive a language mixed up 
of a lot of very different languages can be. Same for music. It's the 
secret of success of the music from Vienna. It's just a mix up of all 
the different cultures once living in a big multicultural kingdom.

A mix up of Python key words and German identifiers feels very natural 
for me. I live in cultural diversity and richness and love it.

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Thomas Bellman

"Martin v. Löwis" <[EMAIL PROTECTED]> wrote:

>> 3) Is or will there be a definitive and exhaustive listing (with
>> bitmap representations of the glyphs to avoid the font issues) of the
>> glyphs that the PEP 3131 would allow in identifiers? (Does this
>> question even make sense?)

> As for the list I generated in HTML: It might be possible to
> make it include bitmaps instead of HTML character references,
> but doing so is a licensing problem, as you need a license
> for a font that has all these characters. If you want to
> lookup a specific character, I recommend to go to the Unicode
> code charts, at

> http://www.unicode.org/charts/

My understanding is also that there are several east-asian
characters that display quite differently depending on whether
you are in Japan, Taiwan or mainland China.  So much differently
that for example a Japanese person will not be able to recognize
a character rendered in the Taiwanese or mainland Chinese way.


-- 
Thomas Bellman,   Lysator Computer Club,   Linköping University,  Sweden
"Adde parvum parvo magnus acervus erit"   ! bellman @ lysator.liu.se
  (From The Mythical Man-Month)   ! Make Love -- Nicht Wahr!

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Laurent Pointal

Long and interesting discussion with different points of view.

Personally, even if the PEP goes ahead (and it's accepted), I'll continue to use
identifiers as I currently do. But I understand those who want to be able to
use chars in their own language.

* for people who are not expert developers (non-pros, or in a learning
context), to be able to use names having meaning, and for pro developers
wanting to give a clear domain-specific meaning - mainly for languages not
based on latin characters, where the problem must be exacerbated.
They can already use unicode in strings (including documentation ones).

* for exchanging with other programming languages having such identifiers...
when they are really used (I include binding of table/column names in
relational databases).

* (not read, but I think present) this will allow developers to lock the
code so that it could not be easily taken/delocalized anywhere by anybody.


In the discussion I've seen that the problem of mixing chars having different
unicode numbers but the same representation (e.g. omega) is resolved (use of a
unicode attribute linked to representation AFAIU).

I've seen (on fclp) a post about speed; it should be verified, but I'm not sure we
will lose speed with unicode identifiers.

On the unicode editing, we have in 2007 enough correct editors supporting
unicode (I configure my Windows/Linux editors to use utf-8 by default).


I share the concern about having to read code from a project which may use such
identifiers (I don't read cyrillic, kanji or hindi), but this will
just give freedom to users.

This can be a pain for me in some cases, but is this a valid argument
to forbid this for other people who feel the need?


IMHO what we should have if the PEP goes on:

* reworking traceback to have a general option (like -T) to ensure
tracebacks print only pure ascii, to avoid encoding problems when
displaying errors on terminals.

* a possibility to specify for modules that they must *define* only
ascii-based names, like a from __future__ import asciionly. To be able to
enforce this policy in projects which request it.

* and, as many wrote, enforce that standard Python libraries use only ascii
identifiers.
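
The first wish in the list above (ASCII-only tracebacks) can be approximated at application level today. The sketch below assumes Python 3; ascii_traceback is a hypothetical helper and not an interpreter option such as the suggested -T. It formats the traceback as usual and then escapes every non-ASCII character with the standard "backslashreplace" error handler.

```python
import traceback

def ascii_traceback(exc):
    """Format an exception's traceback using ASCII characters only."""
    text = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__))
    # Non-ASCII characters become \xNN, \uNNNN or \UNNNNNNNN escapes.
    return text.encode("ascii", "backslashreplace").decode("ascii")

def türstock():
    raise ValueError("Tür klemmt")

try:
    türstock()
except ValueError as error:
    report = ascii_traceback(error)

print(report)
```

The escaped names are less readable, but they survive any terminal or log file that only handles 7-bit text.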




Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Torsten Bronger

Hallöchen!

Laurent Pointal writes:

> [...]
>
> Personally, even if the PEP goes ahead (and it's accepted), I'll continue
> to use identifiers as I currently do. [...]

Me too (mostly), although I do like the PEP.  While many people have
pointed out possible issues of the PEP, only few have tried to
estimate its actual impact.  I don't think that it will do harm to
Python code because the programmers will know when it's appropriate
to use it.  The potential trouble is too obvious for being ignored
accidentally.  And in the case of a bad programmer, you have more
serious problems than flawed identifier names, really.

But for private utilities for example, such identifiers are really a
nice thing to have.  The same is true for teaching in some cases.
And the small simulation program in my thesis would have been better
with some α and φ.  At least, the program would be closer to the
equations in the text then.

> [...]
>
> * a possibility to specify for modules that they must *define*
> only ascii-based names, like a from __future__ import asciionly. To
> be able to enforce this policy in projects which request it.

Please don't.  We're all adults.  If a maintainer is really
concerned about such a thing, he should write a trivial program that
ensures it.  After all, there are some other coding guidelines too
that could be enforced this way but aren't, for good reason.
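
Such a program could be along the following lines. This is only a sketch, assuming Python 3 and the standard tokenize module; non_ascii_names is a hypothetical helper name. It reports every identifier token that contains a character outside ASCII, together with its line number.

```python
import io
import tokenize

def non_ascii_names(source):
    """Return (name, line_number) pairs for non-ASCII identifiers."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and any(ord(ch) > 127 for ch in tok.string):
            found.append((tok.string, tok.start[0]))
    return found

sample = "alpha = 1\nα = 2\nprint(alpha + α)\n"
offenders = non_ascii_names(sample)
print(len(offenders), "non-ASCII identifier(s) found")
```

A maintainer could run a check like this from a commit hook or test suite and reject a change when the list is not empty. Strings and comments are not reported, since only NAME tokens are inspected.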

Tschö,
Torsten.

-- 
Torsten Bronger, aquisgrana, europa vetus
  Jabber ID: [EMAIL PROTECTED]
  (See http://ime.webhop.org for ICQ, MSN, etc.)

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Torsten Bronger

Hallöchen!

Martin v. Löwis writes:

>> In <[EMAIL PROTECTED]>, Nick Craig-Wood
>> wrote:
>> 
>>> My initial reaction is that it would be cool to use all those
>>> great symbols.  A variable called OHM etc!
>> 
>> This is a nice candidate for homoglyph confusion.  There's the
>> Greek letter omega (U+03A9) Ω and the SI unit symbol (U+2126) Ω,
>> and I think some omegas in the mathematical symbols area too.
>
> Under the PEP, identifiers are converted to normal form NFC, and
> we have
>
> py> unicodedata.normalize("NFC", u"\u2126")
> u'\u03a9'
>
> So, OHM SIGN compares equal to GREEK CAPITAL LETTER OMEGA. It can't
> be confused with it - it is equal to it by the proposed language
> semantics.

So different unicode sequences in the source code can denote the
same identifier?
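
This can be tried out directly. The sketch below assumes a Python 3 interpreter, where the accepted PEP is implemented (the final version normalizes identifiers to NFKC rather than NFC, which gives the same result for this character). A name assigned with OHM SIGN in the source is stored, and can be looked up, as GREEK CAPITAL LETTER OMEGA.

```python
import unicodedata

ohm_sign = "\u2126"      # OHM SIGN
greek_omega = "\u03a9"   # GREEK CAPITAL LETTER OMEGA

normalized = unicodedata.normalize("NFC", ohm_sign)

namespace = {}
exec(ohm_sign + " = 50", namespace)   # source text spells the name with OHM SIGN
via_omega = namespace[greek_omega]    # the stored name is the Greek letter

print(normalized == greek_omega, via_omega, ohm_sign in namespace)
```

So two different code point sequences in the source do denote the same identifier, as long as they normalize to the same form.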

Tschö,
Torsten.

-- 
Torsten Bronger, aquisgrana, europa vetus
  Jabber ID: [EMAIL PROTECTED]
  (See http://ime.webhop.org for ICQ, MSN, etc.)

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Paul Rubin

"Martin v. Löwis" <[EMAIL PROTECTED]> writes:
> If you doubt the claim, please indicate which of these three aspects
> you doubt:
> 1. there are programmers who desire to define classes and functions
>    with names in their native language.
> 2. those developers find the code clearer and more maintainable than
>    if they had to use English names.
> 3. code clarity and maintainability is important.

I think it can damage clarity and maintainability and if there's so
much demand for it then I'd propose this compromise: non-ascii
identifiers are allowed but they produce a compiler warning message
(including from eval and exec).  You can suppress the warning message
with a command line option.
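
A rough sketch of how such a warning could be prototyped outside the compiler, assuming Python 3.7 or later. The helper name `warn_on_non_ascii_identifiers` is hypothetical; nothing like it is part of the PEP or the standard library.

```python
import ast
import warnings

def warn_on_non_ascii_identifiers(source, filename="<string>"):
    """Hypothetical helper: warn once for each non-ASCII name in source."""
    found = set()
    for node in ast.walk(ast.parse(source, filename)):
        # These node fields hold names: variables, definitions,
        # attributes, and function arguments.
        for field in ("id", "name", "attr", "arg"):
            value = getattr(node, field, None)
            if isinstance(value, str) and not value.isascii():
                found.add(value)
    for name in sorted(found):
        warnings.warn("non-ASCII identifier: %r" % name, SyntaxWarning)
    return sorted(found)
```

Because it goes through the ordinary `warnings` machinery, the interpreter's existing `-W` option could silence it, which is roughly the command line switch asked for.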

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Hendrik van Rooyen

"Sion Arrowsmith" <[EMAIL PROTECTED]> wrote:

Hvr:
>>Would not like it at all, for the same reason I don't like re's -
>>It looks like random samples out of alphabet soup to me.
>
>What I meant was, would the use of "foreign" identifiers look so
>horrible to you if the core language had fewer English keywords?
>(Perhaps Perl, with its line-noise, was a poor choice of example.
>Maybe Lisp would be better, but I'm not so sure of my Lisp as to
>make such an assertion for it.)

I suppose it would jar less - but I avoid such languages, as the whole
thing kind of jars - I am not on the python group for nothing..

: - )

- Hendrik

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Hendrik van Rooyen

"Hendrik van Rooyen" <[EMAIL PROTECTED]> wrote:

> 
> 
> > > Now look me in the eye and tell me that you find
> > > the mix of proper German and English keywords
> > > beautiful.
> > 
> > I can't admit that, but I find that using German
> > class and method names is beautiful. The rest around
> > it (keywords and names from the standard library)
> > are not English - they are Python.
> > 
MvL:
> > (look me in the eye and tell me that "def" is
> > an English word, or that "getattr" is one)
> > 
> 
HvR:
> LOL - true - but a broken down assembler programmer like me
> does not use getattr - and def is short for define, and for and while
> and in are not German.

After an intense session of omphaloscopy, I would like another bite 
at this cherry.

I think my problem is something like this - when I see a line of code
like:

def frobnitz():

I do not actually see the word "def" - I see something like:

define a function with no arguments called frobnitz

This "expansion" process is involuntary, and immediate in my mind.

And this is immediately followed by an irritated reaction, like:

WTF is frobnitz? What is it supposed to do? What Idiot wrote this?

Similarly, when I encounter the word "getattr" - it is immediately
expanded to "get attribute" and this "expansion" is kind of
dependent on another thing, namely that my mind is in "English
mode" - I refer here to something that only happens rarely, but
with devastating effect, experienced only by people who can read
more than one language - I am referring to the phenomenon that you 
look at an unfamiliar piece of writing on say a signboard, with the 
wrong language "switch" set in your mind - and you cannot read it,
it makes no sense for a second or two - until you kind of step back 
mentally and have a more deliberate look at it, when it becomes 
obvious that it's not, say, English, but Afrikaans, or German, or vice 
versa.

So in a sense, I can look you in the eye and assert that "def" and 
"getattr" are in fact English words...  (for me, that is)

I suppose that this "one language track" - mindedness of mine
is why I find the mix of keywords and German or Afrikaans so 
abhorrent - I cannot really help it, it feels as if I am eating a 
sandwich, and that I bite on a stone in the bread. - It just jars.

Good luck with your PEP - I don't support it, but it is unlikely
that the Python-dev crowd and GvR would be swayed much
by the opinions of the egregious HvR.

Aesthetics aside, I think that the practical maintenance problems
(especially remote maintenance) are the rock on which this
ship could founder.

- Hendrik

--
Philip Larkin (English Poet) :
They fuck you up, your mom and dad -
They do not mean to, but they do.
They fill you with the faults they had,
and add some extra, just for you.

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Paul Rubin

"Martin v. Löwis" <[EMAIL PROTECTED]> writes:
> > Integration with existing tools *is* something that a PEP should
> > consider. This one does not do that sufficiently, IMO.
> What specific tools should be discussed, and what specific problems
> do you expect?

Emacs, whose unicode support is still pretty weak.

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Paul Rubin

"Martin v. Löwis" <[EMAIL PROTECTED]> writes:
> Now I understand it is meaning 12 in Merriam-Webster's dictionary,
> a) "to decline to bid, double, or redouble in a card game", or b)
> "to let something go by without accepting or taking
> advantage of it".

I never thought of it as having that meaning.  I thought of it in the
sense of going by something without stopping, like "I passed a post
office on my way to work today".

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> Possibly.  One Java program I remember had Japanese comments encoded
> in Shift-JIS.  Will Python be better here?  Will it support the source
> code encodings that programmers around the world expect?

It's not a question of "will it". It does today, starting from Python 2.3.

>> Another possible reason is that the programmers were unsure whether
>> non-ASCII identifiers are allowed.
> 
> If that's the case, I'm not sure how you can improve on that in Python.

It will change on its own over time. "Not allowed" could mean "not
permitted by policy". Indeed, the PEP explicitly mandates a policy
that bans non-ASCII characters from source (whether in identifiers or
comments) for Python itself, and encourages other projects to define
similar policies. What projects pick up such a policy, or pick a
different policy (e.g. all comments must be in Korean) remains to
be seen.
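
A project that adopts the ASCII-only policy can enforce it mechanically. Here is a minimal sketch; the function name `find_non_ascii` and the sample text are made up for illustration.

```python
def find_non_ascii(source_text):
    """Return (line, column, character) for every non-ASCII character."""
    hits = []
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:
                hits.append((lineno, col, ch))
    return hits

# A comment containing two non-ASCII letters on the second line.
sample = "x = 1\n# gr\u00fc\u00dfe\nname = 'ok'\n"
hits = find_non_ascii(sample)
```

This flags identifiers, comments, and string literals alike, which matches the policy described for Python itself; a project with a looser policy would need a finer check.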

Then, programmers will not be sure whether the language and the tools
allow it. For Python, it will be supported from 3.0, so people will
be worried initially whether their code needs to run on older Python
versions. When Python 3.5 comes along, people hopefully have lost
interest in supporting 2.x, so they will start using 3.x features,
including this one.

Now, it may be tempting to say "ok, so let's wait until 3.5, if people
won't use it before anyway". That is tricky logic: if we add it only
to 3.5, people won't be using it before 4.0. *Any* new feature
takes several years to get into wide acceptance, but years pass
surprisingly fast.

> There are lots of possible reasons why all these programmers around
> the world who want to use non-ASCII identifiers end-up not using them.
> One is simply that very few people ever really want to do so.  However,
> if you're to assume that they do, then you should look at the existing
> practice in other languages to find out what they did right and what
> they did wrong.  You don't have to speculate.

That's indeed how this PEP came about. There were early adopters, like
Java, then experience gained from it (resulting in PEP 263, implemented
in Python 2.3 on the Python side, and resulting in UAX#39 on the Unicode
consortium side), and that experience now flows into PEP 3131.

If you think I speculated in reasoning why people did not use the
feature in Java: sorry for expressing myself unclearly. I know for
a fact that the reasons I suggested were actual reasons given by
actual people. I'm just not sure whether this was an exhaustive
list (because I did not interview every programmer in the world),
and what statistical relevance each of these reasons had (because
I did not conduct scientific research to gain statistically
relevant data on usage of non-ASCII identifiers in different
regions of the world).

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> Currently, in Python 2.5, identifiers are specified as starting with
> an upper- or lowercase letter or underscore ('_') with the following
> "characters" of the identifier also optionally being a numerical digit
> ("0"..."9").
> 
> This current state seems easy to remember even if felt restrictive by
> many.
> 
> Contrariwise, the referenced document "UAX-31" is a bit obscure to me

It's actually very easy. The basic principle will stay: the first
character must be a letter or an underscore, followed by letters,
underscores, and digits.

The question really is "what is a letter"? what is an underscore?
what is a digit?

> 1) Will this allow me to use, say, a "right-arrow" glyph (if I can
> find one) to start my identifier? 

No. A right-arrow (such as U+2192, RIGHTWARDS ARROW) is a symbol
(general category Sm: Symbol, Math). See

http://unicode.org/Public/UNIDATA/UCD.html

for a list of general category values, and

http://unicode.org/Public/UNIDATA/UnicodeData.txt

for a textual description of all characters.

Now, there is a special case in that Unicode supports "combining
modifier characters", i.e. characters that are not characters
themselves, but modify previous characters, to add diacritical
marks to letters. Unicode has great flexibility in applying these,
to form characters that are not supported themselves. Among those,
there is U+20D7, COMBINING RIGHT ARROW ABOVE, which is of general
category Mn, Mark, Nonspacing.

In PEP 3131, such marks may not appear as the first character
(since they need to modify a base character), but as subsequent
characters. This allows you to form identifiers such as
v⃗ (which should render as a small letter v, with a vector
arrow on top).
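
For anyone who wants to try this, here is a small sketch assuming a Python 3 interpreter, where `str.isidentifier()` applies the interpreter's own identifier rule. The constant names are illustrative.

```python
import unicodedata

RIGHT_ARROW = "\u2192"       # RIGHTWARDS ARROW
COMBINING_ARROW = "\u20d7"   # COMBINING RIGHT ARROW ABOVE

arrow_category = unicodedata.category(RIGHT_ARROW)      # a math symbol
mark_category = unicodedata.category(COMBINING_ARROW)   # a nonspacing mark

checks = {
    # A symbol cannot start an identifier.
    "arrow first": (RIGHT_ARROW + "R").isidentifier(),
    # A combining mark has no base character to modify here.
    "mark first": (COMBINING_ARROW + "v").isidentifier(),
    # After a letter, the mark is an ordinary continuation character.
    "mark after letter": ("v" + COMBINING_ARROW).isidentifier(),
}
```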

> 2) Could an ``ID_Continue`` be used as an ``ID_Start`` if using a RTL
> (reversed or "mirrored") identifier? (Probably not, but I don't know.)

Unicode, and this PEP, always uses logical order, not rendering order.
What matters is in what order the characters appear in the source code
string.
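
A tiny illustration of logical order, assuming Python 3: whatever a bidirectional editor draws on screen, only the stored sequence of characters decides validity. The names below are illustrative.

```python
ALEF = "\u05d0"   # HEBREW LETTER ALEF, a right-to-left letter

letter_then_digit = ALEF + "1"   # stored order: letter, then digit
digit_then_letter = "1" + ALEF   # stored order: digit, then letter

# The first stored character must be a valid start character.
letter_first_ok = letter_then_digit.isidentifier()
digit_first_ok = digit_then_letter.isidentifier()
```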

RTL languages do pose a challenge, in particular since bidirectional
algorithms apparently aren't implemented correctly in many editors.

> 3) Is or will there be a definitive and exhaustive listing (with
> bitmap representations of the glyphs to avoid the font issues) of the
> glyphs that the PEP 3131 would allow in identifiers? (Does this
> question even make sense?)

It makes sense, but it is difficult to implement. The PEP already
links to a non-normative list that is exhaustive for Unicode 4.1.
Future Unicode versions may add additional characters, so a
list that is exhaustive now might not be in the future. The
Unicode consortium promises stability, meaning that what is an
identifier now won't be reclassified as a non-identifier in the
future, but the reverse is not true, as new code points get
assigned.
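
Since the set depends on the Unicode database bundled with the interpreter, one way to see what a given installation accepts is to ask it directly. This is only a sketch, assuming Python 3; `identifier_start_count` is a made-up name and the count will differ between Python versions.

```python
import sys
import unicodedata

def identifier_start_count():
    """Count code points the running interpreter accepts as a first character."""
    return sum(1 for cp in range(sys.maxunicode + 1) if chr(cp).isidentifier())

print("Unicode database version:", unicodedata.unidata_version)
print("possible first characters:", identifier_start_count())
```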

As for the list I generated in HTML: It might be possible to
make it include bitmaps instead of HTML character references,
but doing so is a licensing problem, as you need a license
for a font that has all these characters. If you want to
look up a specific character, I recommend going to the Unicode
code charts, at

http://www.unicode.org/charts/

Notice that an HTML page that includes individual bitmaps
for all characters would take *ages* to load.

Regards,
Martin

P.S. Anybody who wants to play with generating visualisations
of the PEP, here are the functions I used:

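import unicodedata

def isnorm(c):
    # True if the character is unchanged by NFC normalization
    return unicodedata.normalize("NFC", c) == c

def start(c):
    # True if c may begin an identifier
    if not isnorm(c):
        return False
    if unicodedata.category(c) in ('Lu', 'Ll', 'Lt', 'Lm', 'Lo', 'Nl'):
        return True
    if c==u'_':
        return True
    # additional start characters (Other_ID_Start)
    if c in u"\u2118\u212E\u309B\u309C":
        return True
    return False

def cont_only(c):
    # True if c may continue, but not begin, an identifier
    if not isnorm(c):
        return False
    if unicodedata.category(c) in ('Mn', 'Mc', 'Nd', 'Pc'):
        return True
    # Ethiopic digits (Other_ID_Continue)
    if 0x1369 <= ord(c) <= 0x1371:
        return True
    return False

def cont(c):
    # True if c may appear after the first character
    return start(c) or cont_only(c)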

The isnorm() aspect excludes characters from the list which
change under NFC. This excludes a few compatibility characters
which are allowed in source code, but become indistinguishable
from their canonical form semantically.
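
To see which characters that check drops, here is a sketch that scans one block (Letterlike Symbols, U+2100 to U+214F) for code points that NFC rewrites. The choice of block and the names are mine, for illustration only.

```python
import unicodedata

def changes_under_nfc(ch):
    """True if NFC normalization turns ch into something else."""
    return unicodedata.normalize("NFC", ch) != ch

# Map each affected code point to the text it becomes under NFC.
folded = {
    "U+%04X" % cp: unicodedata.normalize("NFC", chr(cp))
    for cp in range(0x2100, 0x2150)
    if changes_under_nfc(chr(cp))
}

for code, target in sorted(folded.items()):
    print(code, "->", ascii(target))
```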

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Gregor Horvath

[EMAIL PROTECTED] schrieb:

> With the second one, all my standard tools would work fine.  My users'
> setups will work with it.  And there's a much higher chance that all
> the intervening systems will work with it.
> 

Please fix your setup.
This is the 21st Century. Unicode is the default in Python 3000.
Wake up before it is too late for you.

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Ross Ridge

"Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
>One possible reason is that the tools processing the program would not
>know correctly what encoding the source file is in, and would fail
>when they guessed the encoding incorrectly. For comments, that is not
>a problem, as an incorrect encoding guess has no impact on the meaning
>of the program (if the compiler is able to read over the comment
>in the first place).

Possibly.  One Java program I remember had Japanese comments encoded
in Shift-JIS.  Will Python be better here?  Will it support the source
code encodings that programmers around the world expect?

>Another possible reason is that the programmers were unsure whether
>non-ASCII identifiers are allowed.

If that's the case, I'm not sure how you can improve on that in Python.

There are lots of possible reasons why all these programmers around
the world who want to use non-ASCII identifiers end-up not using them.
One is simply that very few people ever really want to do so.  However,
if you're to assume that they do, then you should look at the existing
practice in other languages to find out what they did right and what
they did wrong.  You don't have to speculate.

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  [EMAIL PROTECTED]
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Steve Holden

Martin v. Löwis wrote:
> Neil Hodgson schrieb:
>> Martin v. Löwis:
>>
>>> ... regardless of whether this PEP gets accepted
>>> or not (which it just did).
>>Which version can we expect this to be implemented in?
> 
> The PEP says 3.0, and the planned implementation also targets
> that release.
> 
Can we take it this change *won't* be backported to the 2.X series?

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden
-- Asciimercial -
Get on the web: Blog, lens and tag your way to fame!!
holdenweb.blogspot.com        squidoo.com/pythonology
tagged items:         del.icio.us/steve.holden/python
All these services currently offer free registration!
-- Thank You for Reading 

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread [EMAIL PROTECTED]

On May 16, 6:38 pm, [EMAIL PROTECTED] wrote:
> On May 16, 11:41 am, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> wrote:
>
> > Christophe wrote:
> snip...
> > > Who displays stack frames? Your code. Whose code includes unicode
> > > identifiers? Your code. Whose fault is it to create a stack trace
> > > display procedure that cannot handle unicode? You.
>
> > Thanks but no--I work with a _lot_ of code I didn't write, and looking
> > through stack traces from 3rd party packages is not uncommon.
>
> Are you worried that some 3rd-party package you have
> included in your software will have some non-ascii identifiers
> buried in it somewhere?  Surely that is easy to check for?
> Far easier than checking that it doesn't have some trojan
> code in it, it seems to me.

What do you mean, "check for"?  If, say, numeric starts using math
characters (as has been suggested), I'm not exactly going to stop
using numeric.  It'll still be a lot better than nothing, just
slightly less better than it used to be.

> > And I'm often not creating a stack trace procedure, I'm using the
> > built-in python procedure.
>
> > And I'm often dealing with mailing lists, Usenet, etc where I don't
> > know ahead of time what the other end's display capabilities are, how
> > to fix them if they don't display what I'm trying to send, whether
> > intervening systems will mangle things, etc.
>
> I think we all are in this position.  I always send plain
> text mail to mailing lists, people I don't know etc.  But
> that doesn't mean that email software should be constrained
> to only 7-bit plain text, no attachments!  I frequently use
> such capabilities when they are appropriate.

Sure.  But when you're talking about maintaining code, there's a very
high value to having all the existing tools work with it whether
they're wide-character aware or not.

> If your response is, "yes, but look at the problems html
> email, virus-infected attachments etc cause", the situation
> is not the same.  You have little control over what kind of
> email people send you but you do have control over what
> code, libraries, patches, you choose to use in your
> software.
>
> If you want to use ascii-only, do it!  Nobody is making
> you deal with non-ascii code if you don't want to.

Yes.  But it's not like this makes things so horribly awful that it's
worth my time to reimplement large external libraries.  I remain at -0
on the proposal; it'll cause some headaches for the majority of
current Python programmers, but it may have some benefits to a
sizeable minority and may help bring in new coders.  And it's not
going to cause flaming catastrophic death or anything.

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Richard Hanson

On Sun, 13 May 2007 17:44:39 +0200, Martin v. Löwis wrote:

> The syntax of identifiers in Python will be based on the Unicode
> standard annex UAX-31 [1]_, with elaboration and changes as defined
> below.
>
> Within the ASCII range (U+0001..U+007F), the valid characters for
> identifiers are the same as in Python 2.5.  This specification only
> introduces additional characters from outside the ASCII range.  For
> other characters, the classification uses the version of the Unicode
> Character Database as included in the ``unicodedata`` module.
>
> The identifier syntax is ``<ID_Start> <ID_Continue>*``.
>
> ``ID_Start`` is defined as all characters having one of the general
> categories uppercase letters (Lu), lowercase letters (Ll), titlecase
> letters (Lt), modifier letters (Lm), other letters (Lo), letter numbers
> (Nl), plus the underscore (XXX what are "stability extensions" listed in
> UAX 31).
>
> ``ID_Continue`` is defined as all characters in ``ID_Start``, plus
> nonspacing marks (Mn), spacing combining marks (Mc), decimal number
> (Nd), and connector punctuations (Pc).
>
>
> [...]
>
> .. [1] http://www.unicode.org/reports/tr31/

First, to Martin: Thanks for writing this PEP.

While I have been reading both sides of this debate and finding both
sides reasonable and understandable in the main, I have several
questions which seem to not have been raised in this thread so far. 

Currently, in Python 2.5, identifiers are specified as starting with
an upper- or lowercase letter or underscore ('_') with the following
"characters" of the identifier also optionally being a numerical digit
("0"..."9").

This current state seems easy to remember even if felt restrictive by
many.

Contrariwise, the referenced document "UAX-31" is a bit obscure to me
(which is not eased by the fact that various browsers render non-ASCII
characters differently or not at all depending on the setup and font
sets available). Further, a cursory perusing of the unicodedata module
seems to refer me back to the Unicode docs.

I note that UAX-31 seems to allow "ideographs" as ``ID_Start``, for
example. From my relative state of ignorance, several questions come
to mind:

1) Will this allow me to use, say, a "right-arrow" glyph (if I can
find one) to start my identifier? 

2) Could an ``ID_Continue`` be used as an ``ID_Start`` if using a RTL
(reversed or "mirrored") identifier? (Probably not, but I don't know.)

3) Is or will there be a definitive and exhaustive listing (with
bitmap representations of the glyphs to avoid the font issues) of the
glyphs that the PEP 3131 would allow in identifiers? (Does this
question even make sense?)

I have long programmed in RPL and have appreciated being able to use,
say, a "right arrow" symbol to start a name of a function (e.g., "->R"
or "->HMS" where the '->' is a single, right-arrow glyph).[1]

While it is not clear that identifiers I may wish to use would still
be prohibited under PEP 3131, I vote:

 +0

__
[1] RPL (HP's Dr. William Wickes' language and environment circa the
1980s) allows for a few specific "non-ASCII" glyphs as the start of a
name. I have solved my problem with my Python "appliance computer"
project by having up to three representations for my names: Python 2.x
acceptable names as the actual Python identifier, a Unicode text
display exposed to the end user, and also if needed, a bitmap display
exposed to the end user. So -- IAGNI. :-)

-- 
Richard Hanson

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

Neil Hodgson schrieb:
> Martin v. Löwis:
> 
>> ... regardless of whether this PEP gets accepted
>> or not (which it just did).
> 
>Which version can we expect this to be implemented in?

The PEP says 3.0, and the planned implementation also targets
that release.

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Neil Hodgson

Martin v. Löwis:

> ... regardless of whether this PEP gets accepted
> or not (which it just did).

Which version can we expect this to be implemented in?

Neil

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread [EMAIL PROTECTED]

On May 17, 2:30 pm, Gregor Horvath <[EMAIL PROTECTED]> wrote:
> Istvan Albert schrieb:
>
>
>
> > After the first time that your programmer friends need to fix a trivial
> > bug in a piece of code that does not display correctly in the terminal
> > I can assure you that their mellow acceptance will turn to something
> > entirely different.
>
> Is there any difference for you in debugging these code snippets?
>
> class Türstock(object):
[snip]
> class Tuerstock(object):

After finding a platform where those are different, I have to say
yes.  Absolutely.  In my normal setup they both display as "class
Tuerstock" (three letters 'T' 'u' 'e' starting the class name).  If,
say, an exception was raised, it'd be fruitless for me to grep or
search for "Tuerstock" in the first one, and I might wind up wasting a
fair amount of time if a user emailed that to me before realizing that
the stack trace was just wrong.  Even if I had extended character
support, there's no guarantee that all the users I'm supporting do.
If they do, there's no guarantee that some intervening email system
(or whatever) won't munge things.
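
One possible mitigation, not something proposed in this thread, is to escape non-ASCII characters before a trace leaves the machine, so that every hop sees the same bytes. A minimal sketch assuming Python 3; `ascii_safe_traceback` is a hypothetical helper.

```python
import traceback

def ascii_safe_traceback(exc):
    """Hypothetical helper: format a traceback using only ASCII characters."""
    text = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    # Non-ASCII characters become backslash escapes such as \xfc.
    return text.encode("ascii", "backslashreplace").decode("ascii")

# Define and call a function whose name contains a non-ASCII letter.
namespace = {}
exec("def T\u00fcrstock():\n    raise ValueError('kaputt')\n", namespace)
try:
    namespace["T\u00fcrstock"]()
except ValueError as exc:
    report = ascii_safe_traceback(exc)
```

The escaped name is ugly, but it can be searched for and survives 7-bit channels; it does nothing for tools that transliterate silently.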

With the second one, all my standard tools would work fine.  My users'
setups will work with it.  And there's a much higher chance that all
the intervening systems will work with it.

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

>> At the same time it takes some mental effort to analyze and understand
>> all the implications of a feature, and without taking that effort
>> "something" will always beat "nothing".
>>
> Indeed. For example, getattr() and friends now have to accept Unicode
> arguments, and presumably to canonicalize correctly to avoid errors, and
> treat equivalent Unicode and ASCII names as the same (question: if two
> strings compare equal, do they refer to the same name in a namespace?).

Actually, that is not an issue: In Python 3, there is no data type for
"ASCII string" anymore, so all __name__ attributes and __dict__ keys
are Unicode strings - regardless of whether this PEP gets accepted
or not (which it just did).
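
On the question of equal strings and names, a small sketch assuming Python 3: `getattr()` and friends use the string exactly as given, while names written in source code are normalized by the compiler first. The class and variable names are illustrative.

```python
class Box:
    pass

box = Box()
composed = "h\u00f6he"      # o-umlaut as a single code point
decomposed = "ho\u0308he"   # o followed by COMBINING DIAERESIS

setattr(box, composed, 3)

# The attribute functions do not normalize their string argument,
# so only the exact spelling that was stored is found.
found_composed = hasattr(box, composed)
found_decomposed = hasattr(box, decomposed)

# In source code the decomposed spelling is normalized while parsing,
# so it reaches the same attribute.
from_source = eval("box." + decomposed, {"box": box})
```

So strings that compare equal do refer to the same name, but strings that merely normalize to the same form do not when passed to `getattr()` directly.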

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Steve Holden

Gregor Horvath wrote:
> Istvan Albert schrieb:
>> After the first time that your programmer friends need to fix a trivial
>> bug in a piece of code that does not display correctly in the terminal
>> I can assure you that their mellow acceptance will turn to something
>> entirely different.
>>
> 
> Is there any difference for you in debugging these code snippets?
> 
> class Türstock(object):
>    höhe = 0
>    breite = 0
>    tiefe = 0
> 
>    def _get_fläche(self):
>       return self.höhe * self.breite
> 
>    fläche = property(_get_fläche)
> 
> #---
> 
> class Tuerstock(object):
>    hoehe = 0
>    breite = 0
>    tiefe = 0
> 
>    def _get_flaeche(self):
>       return self.hoehe * self.breite
> 
>    flaeche = property(_get_flaeche)
> 
> 
> I can tell you that for me and for my costumers this makes a big difference.
> 
So you are selling to the clothing market? [I think you meant 
"customers". God knows I have no room to be snitty about other people's 
typos. Just thought it might raise a smile].

> Whether this PEP gets accepted or not I am going to use German 
> identifiers and you have to be frightened to death by that fact ;-)
> 
That's fine - they will be at least as meaningful to you as my English 
ones would be to your countrymen who don't speak English.

I think we should remember that while programs are about communication 
there's no requirement for (most of) them to be universally comprehensible.

regards
  Steve

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Steve Holden

Istvan Albert wrote:
> On May 17, 9:07 am, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> 
>> up. I interviewed about 20 programmers (none of them Python users), and
>> most took the position "I might not use it myself, but it surely
>> can't hurt having it, and there surely are people who would use it".
> 
> Typically when you ask people about esoteric features that seemingly
> don't affect them but might be useful to someone, the majority will
> say yes. It's simply common courtesy; it's not like they have to do
> anything.
> 
> At the same time it takes some mental effort to analyze and understand
> all the implications of a feature, and without taking that effort
> "something" will always beat "nothing".
> 
Indeed. For example, getattr() and friends now have to accept Unicode 
arguments, and presumably to canonicalize correctly to avoid errors, and 
treat equivalent Unicode and ASCII names as the same (question: if two 
strings compare equal, do they refer to the same name in a namespace?).

> After the first time that your programmer friends need to fix a trivial
> bug in a piece of code that does not display correctly in the terminal
> I can assure you that their mellow acceptance will turn to something
> entirely different.
> 
And pretty quickly, too.  If anyone but Martin were the author of the 
PEP I'd have serious doubts, but if he thinks it's worth proposing 
there's at least a chance that it will eventually be implemented.

regards
  Steve

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Gregor Horvath

Istvan Albert schrieb:
> 
> After the first time that your programmer friends need to fix a trivial
> bug in a piece of code that does not display correctly in the terminal
> I can assure you that their mellow acceptance will turn to something
> entirely different.
> 

Is there any difference for you in debugging these code snippets?

class Türstock(object):
   höhe = 0
   breite = 0
   tiefe = 0

   def _get_fläche(self):
      return self.höhe * self.breite

   fläche = property(_get_fläche)

#---

class Tuerstock(object):
   hoehe = 0
   breite = 0
   tiefe = 0

   def _get_flaeche(self):
      return self.hoehe * self.breite

   flaeche = property(_get_flaeche)


I can tell you that for me and for my costumers this makes a big difference.

Whether this PEP gets accepted or not I am going to use German 
identifiers and you have to be frightened to death by that fact ;-)

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread rurpy

On May 17, 4:56 am, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
...
> (look me in the eye and tell me that "def" is
> an English word, or that "getattr" is one)

That's not quite fair.  They are not English
words, but they are derived from English and
have a mnemonic value to English speakers that
they don't (or only accidentally) have for
non-English speakers.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Istvan Albert

On May 17, 9:07 am, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:

> up. I interviewed about 20 programmers (none of them Python users), and
> most took the position "I might not use it myself, but it surely
> can't hurt having it, and there surely are people who would use it".

Typically when you ask people about esoteric features that seemingly
don't affect them but might be useful to someone, the majority will
say yes. It's simply common courtesy; it's not as if they have to do
anything.

At the same time it takes some mental effort to analyze and understand
all the implications of a feature, and without taking that effort
"something" will always beat "nothing".

After the first time that your programmer friends need to fix a trivial
bug in a piece of code that does not display correctly in the terminal
I can assure you that their mellow acceptance will turn to something
entirely different.

i.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> I'd suggest restricting identifiers under the rules of UTS-39,
> profile 2, "Highly Restrictive".  This limits mixing of scripts
> in a single identifier; you can't mix Hebrew and ASCII, for example,
> which prevents problems with mixing right to left and left to right
> scripts.  Domain names have similar restrictions.

That sounds interesting; however, I cannot find the document
you refer to. In TR 39 (also called Unicode Technical Standard #39),
at http://unicode.org/reports/tr39/ there is no mention
of numbered profiles, or of "Highly Restrictive".

Looking at the document, it seems 3.1., "General Security Profile
for Identifiers" might apply. IIUC, xidmodifications.txt would
have to be taken into account.

I'm not quite sure what that means; apparently, a number of
characters (listed as restricted) should not be used in
identifiers. OTOH, it also adds HYPHEN-MINUS and KATAKANA
MIDDLE DOT - which surely shouldn't apply to Python
identifiers, no? (at least HYPHEN-MINUS already has a meaning
in Python, and cannot possibly be part of an identifier).

Also, mixed-script detection might be considered, but it is
not clear to me how to interpret the algorithm in section
5, plus it says that this is just one of the possible
algorithms.

Finally, Confusable Detection is difficult to perform on
a single identifier - it seems you need two of them to
find out whether they are confusable.

In any case, I added this as an open issue to the PEP.
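
To make the mixed-script idea concrete, here is a rough sketch. It is not the algorithm from TR 39: Python's unicodedata module does not expose the Unicode Script property, so this approximates a character's script by the first word of its Unicode name. The function names are made up for illustration.

```python
import unicodedata


def scripts_in(identifier):
    """Return the set of (approximate) scripts used in an identifier."""
    scripts = set()
    for ch in identifier:
        # ASCII digits and the underscore are script-neutral here.
        if ch.isascii() and (ch.isdigit() or ch == "_"):
            continue
        name = unicodedata.name(ch, "")
        # "LATIN SMALL LETTER O WITH DIAERESIS" -> "LATIN"
        scripts.add(name.split(" ")[0] if name else "UNKNOWN")
    return scripts


def is_mixed_script(identifier):
    """True if the identifier appears to mix letters from several scripts."""
    return len(scripts_in(identifier)) > 1


print(scripts_in("höhe"))
# The second letter below is U+0430 CYRILLIC SMALL LETTER A.
print(is_mixed_script("p\u0430ypal"))
```

A check of this kind works on one identifier at a time. Confusable detection, as noted above, needs a pair of identifiers to compare.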

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Istvan Albert

On May 16, 11:09 pm, Gregor Horvath <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
>
> > On May 16, 12:54 pm, Gregor Horvath <[EMAIL PROTECTED]> wrote:
> >> Istvan Albert wrote:
>
> >> So the solution is to forbid Chinese XP ?

Who said anything like that? It's just an example of surprising and
unexpected difficulties that may arise even when doing trivial things,
and that proponents do not seem to want to admit to.

> Should computer programming only be easy accessible to a small fraction
> of privileged individuals who had the luck to be born in the correct
> countries?

> Should the unfounded and maybe xenophilous fear of loosing power and
> control of a small number of those already privileged be a guide for
> development?

Now that right there is your problem. You are reading a lot more into
this than you should: losing power, xenophilus(?) fear, privileged
individuals.

Just step back and think about it for a second. It's a PEP, and people
have different opinions; it is very unlikely that there is some
generic sinister agenda that one must subscribe to.

i.




Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread rurpy

On May 16, 8:49 pm, Gregor Horvath <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> >> 2) Create a way to internationalize the standard library (and possibly
> >> the language keywords, too). Ideally, create a general standardized way
> >> to internationalize code, possibly similar to how people
> >> internationalize strings today.
> >
> > Why?  Or more accurately, why before adopting the PEP?
> > The library is very usable by non-English speakers as long as
> > there is documentation in their native language.  It would be
>
> Microsoft once translated their VBA to foreign languages.
> I didn't use it because I was used to "English" code.
> If I program in mixed cultural contexts I have to use the lowest
> common denominator. Mixing the symbols of the programming language is confusing.

Yup, I agree wholeheartedly.  So do almost all
the other people who have responded in this thread.
In public code, open source code, code being worked
on by people from different countries, English is almost
always the best choice.

Nothing in the PEP interferes with or prevents this.
The PEP only allows non-ASCII identifiers when they
are appropriate: in code that is unlikely ever to
be touched by people who don't know that language.
(Obviously any language feature can be misused
but peer-pressure, documentation, and education
have been very effective in preventing such misuse.
There is no reason they shouldn't be effective
here too.)

And yes, some code will be developed in a single
language environment and then be found to be useful
to a wider audience.  It's not the end of the world.
It is no worse than when code written with a single
language UI becomes public: it will get
fixed so that it meets the standards for an internationally
collaborative project.  It seems to me that replacing
identifiers with English ones is fairly trivial,
isn't it?  One can find the identifiers by parsing
the program and replace them from a prepared table
of replacements.  This seems much easier than fixing
comments and docstrings, which needs to be done by
hand.  But the comment/docstring problem exists now
and has nothing to do with the PEP.
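
The table-driven replacement described above could be sketched roughly as follows, using the standard tokenize module. The function name and the replacement table are hypothetical. The sketch renames every matching NAME token, so it does not touch strings or comments, and it does not check for clashes with existing names.

```python
import io
import tokenize


def rename_identifiers(source, table):
    """Return source with identifiers replaced according to table."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        if tok.type == tokenize.NAME and text in table:
            text = table[text]
        # Passing (type, string) pairs lets untokenize rebuild the layout,
        # at the price of slightly different spacing in the output.
        tokens.append((tok.type, text))
    return tokenize.untokenize(tokens)


# Hypothetical German-to-ASCII replacement table.
table = {"höhe": "hoehe", "fläche": "flaeche"}
source = "höhe = 2\nbreite = 3\nfläche = höhe * breite\n"
converted = rename_identifiers(source, table)
print(converted)
```

The converted text is equivalent code, but its whitespace is not preserved exactly; a real tool would probably want to keep the original layout.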

> Long time ago at the age of 12 I learned programming using English
> Computer books. Then there were no German books at all. It was not easy.
> It would have been completely impossible if our school system had not
> been wise enough to teach us English early.
>
> I think millions of people are handicapped because of this.
> Any step to improve this is a good step for all of us. No doubt
> there are a lot of talents wasted because of this wall.

I agree that anyone who wants to be a programmer is
well advised to learn English.  I would also advise
anyone who wants to be a programmer to go to college.
But I have met very good programmers who were not
college graduates, and although I don't know any non-English
speakers, I am sure there are very good programmers
who don't know English.

There is a big difference between encouraging someone
to do something, and taking steps to make them do
something.

A lot of the English-only rhetoric in this thread seems
very reminiscent of arguments a decade+ ago regarding
wide characters and Unicode, and other i18n support.
"Computing is ascii-based, we don't need all this
crap, and besides, it doubles the memory used by strings!
English is good enough".  Except of course that it wasn't.

When technology demands that people adapt to it, it loses.
When technology adapts to the needs of people, it wins.

The fundamental question is whether language designers,
or the people writing the code, should be the ones to
decide what language identifiers are most appropriate
for their program.  Do language designers, all of whom
are English speakers, have the wisdom to decide for
programmers all over the world, and for years to come,
that they must learn English to use Python effectively?
And if they do, will the people affected agree, or
will they choose a different language?


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Gregor Horvath

Martin v. Löwis wrote:

> I've reported this before, but happily do it again: I have lived many
> years without knowing what a "hub" is, and what "to pass" means if
> it's not the opposite of "to fail". Yet, I have used their technical
> meanings correctly all these years.

That's not only true for computer terms.
In the German Viennese slang there are a lot of Italian, French, 
Hungarian, Czech, Hebrew and Serbo-Croatian words. Nobody knows their exact 
meaning in the original language (nor does the vast majority actually 
speak those languages), but all are used in the correct original context.

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> I claim that this is *completely unrealistic*. When learning Python, you
> *do* learn the actual meanings of English terms like "open",
> "exception", "if" and so on if you did not know them before. It would be
> extremely foolish not to do so.

Having taught students for many years now, I can report that this is
most certainly *not* the case. Many people only ever learn the technical
meaning of some term, and never grasp the English meaning. They could
look into a dictionary, but they would rather read the documentation.

I've reported this before, but happily do it again: I have lived many
years without knowing what a "hub" is, and what "to pass" means if
it's not the opposite of "to fail". Yet, I have used their technical
meanings correctly all these years.

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> However, what I want to see is how people deal with such issues when
> sharing their code: what are their experiences and what measures do
> they mandate to make it all work properly? You can see some
> discussions about various IDEs mandating UTF-8 as the default
> encoding, along with UTF-8 being the required encoding for various
> kinds of special Java configuration files. 

I believe the problem is solved when everybody uses Eclipse.
You can set a default encoding for all Java source files in a project,
and you check the project file into your source repository.
Eclipse both provides the editor and drives the compiler, and
does so in a consistent way.

> Yes, it should reduce confusion at a technical level. But what about
> the tools, the editors, and so on? If every computing environment had
> decent UTF-8 support, wouldn't it be easier to say that everything has
> to be in UTF-8? 

For both Python and Java, it's too much historical baggage already.
When source encodings were introduced to Python, allowing UTF-8
only was already proposed. People rejected it at the time, because
a) they had source files which weren't encoded in UTF-8, and
   were afraid of breaking them, and
b) their editors would not support UTF-8.

So even with Python 3, UTF-8 is *just* the default default encoding.
I would hope that all Python IDEs, over time, learn about this
default; until then, users may have to manually configure their
IDEs and editors. With a default of UTF-8, it's still simpler than
with PEP 263: you can say that .py files are UTF-8, and your
editor will guess incorrectly only if there is an encoding
declaration other than UTF-8.
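
An editor or tool can apply the same rule with the standard library. The following is a minimal sketch, assuming Python 3, where tokenize.detect_encoding looks for a PEP 263 declaration (or a BOM) and otherwise falls back to UTF-8; the helper name is made up.

```python
import io
import tokenize


def source_encoding(data):
    """Return the encoding Python would use for these source bytes."""
    encoding, _first_lines = tokenize.detect_encoding(io.BytesIO(data).readline)
    return encoding


# No declaration: the default applies.
print(source_encoding(b"x = 1\n"))
# An explicit PEP 263 declaration overrides the default.
print(source_encoding(b"# -*- coding: latin-1 -*-\nx = 1\n"))
```

Note that the declared name is reported in a normalized spelling, so "latin-1" comes back as "iso-8859-1".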

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> After 175 replies (and counting), the only thing that is clear is the
> controversy around this PEP. Most people are very strong for or
> against it, with little middle ground in between. I'm not saying that
> every change must meet 100% acceptance, but here there is definitely a
> strong opposition to it. Accepting this PEP would upset lots of people
> as it seems, and it's interesting that quite a few are not even native
> english speakers.

I believe there is a lot of middle ground, but those people don't speak
up. I interviewed about 20 programmers (none of them Python users), and
most took the position "I might not use it myself, but it surely
can't hurt having it, and there surely are people who would use it".
2 people were strongly in favor, and 3 were strongly opposed.

Of course, those people wouldn't take a lot of effort to defend their
position in a usenet group. So it is no surprise that the majority of
the responses come from people with strong feelings either way.

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> In the code I was looking at, identifiers were allowed to use non-ASCII
> characters.  For whatever reason, the programmers chose not to use non-ASCII
> identifiers even though they had no problem using non-ASCII characters
> in comments.

One possible reason is that the tools processing the program would not
know correctly what encoding the source file is in, and would fail
when they guessed the encoding incorrectly. For comments, that is not
a problem, as an incorrect encoding guess has no impact on the meaning
of the program (if the compiler is able to read over the comment
in the first place).

Another possible reason is that the programmers were unsure whether
non-ASCII identifiers are allowed.

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

René Fleschenberg wrote:
> Stefan Behnel wrote:
>> Then get tools that match your working environment.
> 
> Integration with existing tools *is* something that a PEP should
> consider. This one does not do that sufficiently, IMO.

What specific tools should be discussed, and what specific problems
do you expect?

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

>> So, please provide feedback, e.g. perhaps by answering these
>> questions:
>> - should non-ASCII identifiers be supported? why?
> 
> I think the biggest argument against this PEP is how little similar
> features are used in other languages and how poorly they are supported
> by third party utilities.  Your PEP gives very little thought to how
> the change would affect the standard Python library.  Are non-ASCII
> identifiers going to be poorly supported in Python's own library and
> utilities?

For other languages (in particular Java), one challenge is that
you don't know the source encoding - it's neither fixed, nor is
it given in the source code file itself.

Instead, the environment has to provide the source encoding, and that
makes it difficult to use. The JDK javac uses the encoding from the
locale, which is nonsensical if you check out source from a
repository. Eclipse has solved the problem: you can specify source
encoding on a per-project basis, and it uses that encoding
consistently in the editor and when running the compiler.

For Python, this problem was solved long ago: PEP 263 allows you to
specify the source encoding within the file, and there was
always a default encoding. The default encoding will change to
UTF-8 in Python 3.

IDLE has been supporting PEP 263 from the beginning, and several
other editors support it as well. Not sure what other tools
you have in mind, and what problems you expect.

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Sion Arrowsmith

Hendrik van Rooyen <[EMAIL PROTECTED]> wrote:
>"Sion Arrowsmith" <[EMAIL PROTECTED]> wrote:
>>Hendrik van Rooyen wrote:
>>>I still don't like the thought of the horrible mix of "foreign"
>>>identifiers and English keywords, coupled with the English 
>>>sentence construction.
>>How do you think you'd feel if Python had less in the way of
>>(conventionally used) English keywords/builtins. Like, say, Perl?
>Would not like it at all, for the same reason I don't like re's -
>It looks like random samples out of alphabet soup to me.

What I meant was, would the use of "foreign" identifiers look so
horrible to you if the core language had fewer English keywords?
(Perhaps Perl, with its line-noise, was a poor choice of example.
Maybe Lisp would be better, but I'm not so sure of my Lisp as to
make such an assertion for it.)

-- 
\S -- [EMAIL PROTECTED] -- http://www.chaos.org.uk/~sion/
   "Frankly I have no feelings towards penguins one way or the other"
-- Arthur C. Clarke
   her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Hendrik van Rooyen



> > Now look me in the eye and tell me that you find
> > the mix of proper German and English keywords
> > beautiful.
> 
> I can't admit that, but I find that using German
> class and method names is beautiful. The rest around
> it (keywords and names from the standard library)
> are not English - they are Python.
> 
> (look me in the eye and tell me that "def" is
> an English word, or that "getattr" is one)
> 
> Regards,
> Martin

LOL - true - but a broken-down assembler programmer like me
does not use getattr - and def is short for define, and for and while
and in are not German.

Looks like you have stirred up a hornets' nest...

- Hendrik



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> You could say the same about Python standard library and keywords then.
> Shouldn't these also have to be translated? One can even push things a
> little further: I don't know about the languages used in the countries
> you mention, but for example, a simple construction like 'if
> <condition>' will look weird to a Japanese (the Japanese language has
> a "post-fix" feel: the equivalent of the 'if' is put after the
> condition). So why enforce an English-like sentence structure?

The Python syntax does not use an English-like sentence structure.
In English, a statement follows the pretty strict sequence of subject,
predicate, object (SPO). In Python, statements don't have a subject;
some don't even have a verb (e.g. assignments).

Regardless, this PEP does not propose to change the syntax of the
language, because doing so would cause technical problems - unlike
the proposed PEP, which does not cause any technical problems to
the language implementation whatsoever (and only slight technical
problems to editors, which aren't worse than the ones caused by
PEP 263).

> You have a point here. When learning to program, or when programming for
> fun without any intention to do something serious, it may be better to
> have a language supporting "native" characters in identifiers. My
> problem is: if you allow these, how can you prevent them from going
> public someday?

You can't, and you shouldn't. What you can prevent is that the code
enters *your* project. I cannot see why you want to censor what code
other people publish.

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> IMO, the burden of proof is on you. If this PEP has the potential to
> introduce another hindrance for code-sharing, the supporters of this PEP
> should be required to provide a "damn good reason" for doing so. So far,
> you have failed to do that, in my opinion. All you have presented are
> vague notions of rare and isolated use-cases.

The PEP explicitly states what the damn good reason is: "Such developers
often desire to define classes and functions with names in their native
languages, rather than having to come up with an (often incorrect)
English translation of the concept they want to name."

So the reason is that with this PEP, code clarity and readability will
become better. It's the same reason as for many other features
introduced into Python recently, e.g. the with statement.

If you doubt the claim, please indicate which of these three aspects
you doubt:
1. there are programmers who desire to define classes and functions
   with names in their native language.
2. those developers find the code clearer and more maintainable than
   if they had to use English names.
3. code clarity and maintainability is important.

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> Consequently, Python's keywords and even the standard library can
> exist with names being "just symbols" for many people.

I already said this on the py3k list: until a week ago, I didn't know
why "pass" was chosen for the "no action" statement - with all my
English knowledge, I still could not understand why the opposite
of "fail" should mean "no action".

Still, I have been using "pass" for more than 10 years now, without
ever questioning what it means in English, and I've successfully
used it as a token. Except for the first draft of Das Python-Buch,
where I, from memory, thought the statement should be "skip";
I remembered it had four letters, and meant "go to the next line".

Now I understand it is meaning 12 in Merriam-Webster's dictionary,
a) "to decline to bid, double, or redouble in a card game", or b)
"to let something go by without accepting or taking
advantage of it".

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Bjoern Schliessmann

"Martin v. Löwis" wrote:
> I can't admit that, but I find that using German
> class and method names is beautiful. The rest around
> it (keywords and names from the standard library)
> are not English - they are Python.
> 
> (look me in the eye and tell me that "def" is
> an English word, or that "getattr" is one)

He's got a point (a small one though). For example:

- self (can be changed though)
- is 
- with
- isinstance
- try

Regards,


Björn

-- 
BOFH excuse #435:

Internet shut down due to maintenance


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> A possible modification to the PEP would be to permit identifiers to
> also include \u and \U escape sequences (as some other
> languages already do).

Several languages do that (e.g. C and C++), but I deliberately left
this out, as I cannot see this working in a practical way. Also,
it could be added later as another extension if there is an actual
need.

> I think this would remove several of the objections: such as being
> unable to tell at a glance whether someone is trying to spoof your
> variable names,

If you are willing to run a script on the patch you receive, you
can perform that check even without having support for the \u
syntax in the language - either you convert to the \u notation,
and then check manually (converting back if all is fine), or you
have an automated check (e.g. at commit time) that checks for
conformance to the style guide.
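
Such an automated check might look roughly like this. It is only a sketch of the idea: the function name is made up, and how it would be wired into a commit hook is left open.

```python
import io
import tokenize


def non_ascii_identifiers(source):
    """Return (line number, name) pairs for identifiers outside ASCII."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only NAME tokens are inspected; strings and comments may
        # contain any characters without being reported.
        if tok.type == tokenize.NAME and not tok.string.isascii():
            found.append((tok.start[0], tok.string))
    return found


patch = "höhe = 1\nname = 'ä'  # Größe\n"
print(non_ascii_identifiers(patch))
```

A project with an ASCII-only style guide could reject a patch whenever this list is non-empty.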

> or being unable to do minor maintenance on code using
> character sets which your editor doesn't support: you just run the
> script which would be included with every copy of Python to restrict the
> character set of the source files to whatever character set you feel
> happy with. The script should also be able to convert unrepresentable
> characters in strings and comments (although that last operation
> wouldn't be guaranteed reversible). 

Again, if it's reversible, you don't need support for it in the
language. You convert to your editor's supported Unicode subset,
edit, then convert back.

However, I somewhat doubt that this case "my editor cannot display
my source code" is likely to occur: if the editor cannot display
it, you likely have a ban on those characters, anyway.

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> Now look me in the eye and tell me that you find
> the mix of proper German and English keywords
> beautiful.

I can't admit that, but I find that using German
class and method names is beautiful. The rest around
it (keywords and names from the standard library)
are not English - they are Python.

(look me in the eye and tell me that "def" is
an English word, or that "getattr" is one)

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis

> PEP 3131 uses a similar definition to C# except that PEP 3131
> disallows formatting characters (category Cf). See section 9.4.2 of
> http://www.ecma-international.org/publications/standards/Ecma-334.htm

UAX#31 discusses formatting characters in 2.2, and recognizes that
there might be good reasons to allow (and ignore) them; however,
it recommends against doing so except in special cases.

So I decided to disallow them.
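
For illustration, a tool could detect such characters with the standard unicodedata module. This is a small sketch with a made-up helper name; it only reports the Cf category and says nothing about what the parser itself accepts.

```python
import unicodedata


def has_format_characters(name):
    """True if name contains a character of general category Cf."""
    return any(unicodedata.category(ch) == "Cf" for ch in name)


# U+200D ZERO WIDTH JOINER is a formatting character and is invisible
# in most displays, so the two names below look the same on screen.
print(has_format_characters("ab"))
print(has_format_characters("a\u200db"))
```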

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Gregor Horvath

Hendrik van Rooyen wrote:

> I can sympathise a little bit with a customer who tries to read code.
> Why that should be necessary, I cannot understand - does the stuff
> not work to the extent that the customer feels he has to help you?
> You do not talk as if you are incompetent, so I see no reason why 
> the customer should want to meddle in what you have written, unless
> he is paying you to train him to program, and as Eric Brunel has 
> pointed out, this mixing of languages is all right in a training environment.

That is highly domain- and customer-specific logic, which the 
customer knows best (for example, the variation logic of window and door 
manufacturers).
He has to understand the code, so that he can verify it's correct.
We are in fact developing it together.
Some customers are even coding this logic themselves. Some of them are 
not fluent in English, especially not in the computer domain.

Translating the logic into documentation is a waste of time if the
code is self-documenting and easy to grasp (as Python usually is). But
the code can only be self-documenting if it is written in the
domain-specific language of the customer. Sometimes these are words that
are not even used in general German. Even in German, different customers
name the same thing with different words. Talking and coding in the
language of the customer is a huge benefit.

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Duncan Booth

"Gabriel Genellina" <[EMAIL PROTECTED]> wrote:

> - Someone proposed using escape sequences of some kind, supported by  
> editor plugins, so there is no need to modify the parser.

I'm not sure whether my suggestion below is the same as or a variation
on this. 

> 
> - Refactoring tools should let you rename foreign identifiers into
> ASCII  only.

A possible modification to the PEP would be to permit identifiers to
also include \u and \U escape sequences (as some other
languages already do). Then you could have a script easily (and
reversibly) convert all identifiers to ascii or indeed any other
encoding or subset of unicode using escapes only for the unrepresentable
characters. 
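
As a rough sketch of the conversion I have in mind, here it is applied to a single identifier. The function names are made up, and none of this is part of the PEP. Since an identifier can never contain a backslash, the round trip is unambiguous.

```python
import re


def _escape_char(ch):
    """Return ch unchanged if ASCII, else its \\u or \\U escape."""
    code = ord(ch)
    if code < 128:
        return ch
    if code <= 0xFFFF:
        return "\\u%04x" % code
    return "\\U%08x" % code


def escape_identifier(name):
    """Rewrite an identifier using ASCII characters and escapes only."""
    return "".join(_escape_char(ch) for ch in name)


_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})|\\U([0-9a-fA-F]{8})")


def unescape_identifier(text):
    """Reverse escape_identifier."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)


escaped = escape_identifier("fläche")
print(escaped)
print(unescape_identifier(escaped))
```

Applying the same idea to whole source files is harder, because string literals may already contain text that looks like an escape.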

I think this would remove several of the objections: such as being
unable to tell at a glance whether someone is trying to spoof your
variable names, or being unable to do minor maintenance on code using
character sets which your editor doesn't support: you just run the
script which would be included with every copy of Python to restrict the
character set of the source files to whatever character set you feel
happy with. The script should also be able to convert unrepresentable
characters in strings and comments (although that last operation
wouldn't be guaranteed reversible). 

Of course it doesn't do anything for the objection about such
identifiers being ugly, but you can't have everything.

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Hendrik van Rooyen

"Gregor Horvath" <[EMAIL PROTECTED]> wrote:

> Hendrik van Rooyen wrote:
> 
> > It is not so much for technical reasons as for aesthetic 
> > ones - I find reading a mix of languages horrible, and I am
> > kind of surprised by the strength of my own reaction.
> 
> This is a matter of taste.

I agree - and about perceptions of quality. Of what is good, 
and not good. - If you haven't yet, read Robert Pirsig's book:
"Zen and the Art of Motorcycle Maintenance"

> In some programs I use German identifiers (not unicode). I and others 
> like the mix. My customers can understand the code better. (They are 
> only reading it)
> 

I can sympathise a little bit with a customer who tries to read code.
Why that should be necessary, I cannot understand - does the stuff
not work to the extent that the customer feels he has to help you?
You do not talk as if you are incompetent, so I see no reason why 
the customer should want to meddle in what you have written, unless
he is paying you to train him to program, and as Eric Brunel has 
pointed out, this mixing of languages is all right in a training environment.

> > 
> > "Beautiful is better than ugly"
> 
> Correct.
> But why do you think you should enforce your taste to all of us?

You misjudge me - the OP asked if I would use the feature, and I am 
speaking for myself when I explain why I would not use it.

> 
> With this logic you should all drive Alfa Romeos!
> 

Actually no - this is not about logic - my post clearly stated
that I was talking about feelings.  And the only logic that applies 
to feelings is the incontrovertible fact that they exist, and that it
makes good logical sense to acknowledge them, and to take that
into account in one's actions.

And as far as Alfas go - we have found here that they are rather 
soft - our dirt roads destroy them in no time.  : - (

- Hendrik


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Hendrik van Rooyen


"Sion Arrowsmith" <[EMAIL PROTECTED]> wrote:


>Hendrik van Rooyen wrote:
>
>>I still don't like the thought of the horrible mix of "foreign"
>>identifiers and English keywords, coupled with the English 
>>sentence construction.
>
>How do you think you'd feel if Python had less in the way of
>(conventionally used) English keywords/builtins. Like, say, Perl?

Would not like it at all, for the same reason I don't like re's -
It looks like random samples out of alphabet soup to me.

- Hendrik



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread rurpy

On May 16, 1:37 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> On May 16, 12:54 pm, Gregor Horvath <[EMAIL PROTECTED]> wrote:
>
> > Istvan Albert wrote:
>
> > > Here is something that just happened and relates to this subject: I
> > > had to help a student run some python code on her laptop, she had
> > > Windows XP that hid the extensions. I wanted to set it up such that
> > > the extension is shown. I don't have XP in front of me but when I do
> > > it takes me 15 seconds to do it. Now her Windows was set up with some
> > > asian fonts (Chinese, Korean not sure), looked extremely unfamiliar
> > > and I had no idea what the menu systems were. We have spent quite a
> > > bit of time figuring out how to accomplish the task. I had her read me
> > > back the options, but something like "hide extensions" comes out quite
> > > a bit different. Surprisingly tedious and frustrating experience.
>
> > So the solution is to forbid Chinese XP ?
>
> It's one solution, depending on your support needs.
>
> Independent of Python, several companies I've worked at in Ecuador
> (entirely composed of native Spanish-speaking Ecuadoreans) use the
> English-language OS/application installations--they of course have the
> Spanish dictionaries and use Spanish in their documents, but for them,
> having localized application menus generates a lot more problems than
> it solves.

Isn't the point of PEP-3131 free choice?  How would
Ecuadoreans feel if their government mandated all
computers must use English?



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Gabriel Genellina

On Mon, 14 May 2007 13:30:42 -0300, <[EMAIL PROTECTED]> wrote:

> Although probably not-sufficient to overcome this built-in
> bias, it would be interesting if some bi-lingual readers would
> raise this issue in some non-english Python discussion
> groups to see if the opposition to this idea is as strong
> there as it is here.

Survey results from a Spanish-speaking group and a local group from  
Argentina:
Yes:6
No: 3
Total:  9

Comments summary:

- Spanish requires only a few characters beyond the ASCII letters
(ñáéíóúü), so Spanish developers have no great need for Unicode
identifiers.

- Python can be embedded and extended using libraries - in those cases,  
what matters mostly is the domain specific usage. Letting the final users  
write their scripts/tasklets/etc using domain-specific and  
language-specific names would be a great thing.

- Would be nice if class attribute names could correspond to table column
names directly; would be nice to use the Greek letter pi, for example, in
math formulae.

- Others raised concerns already seen here: source code legibility; being
unable to type identifiers; the risk of keywords being translated; and that
you can't know in advance whether your code will become widely published,
so it is best to use English identifiers from the start.

- Someone proposed using escape sequences of some kind, supported by  
editor plugins, so there is no need to modify the parser.

- Refactoring tools should let you rename foreign identifiers into ASCII  
only.
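
The renaming idea in the last point can be sketched roughly as follows.
This is only an illustration, assuming Python 3; the function name
to_ascii_identifier and the "uXXXX" fallback spelling are made up for the
example and are not part of any existing refactoring tool.

```python
import unicodedata

def to_ascii_identifier(name):
    """Return an ASCII-only spelling of an identifier (hypothetical helper)."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    pieces = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop the accent, keep the base letter
        # Characters with no ASCII base letter get a "uXXXX" placeholder.
        pieces.append(ch if ch.isascii() else f"u{ord(ch):04x}")
    return "".join(pieces)

print(to_ascii_identifier("año"))
print(to_ascii_identifier("pingüino"))
print(to_ascii_identifier("π"))
```

For the Spanish letters listed above this simply drops the accents; for
scripts without a Latin base letter the result is mechanical and not
meant to be readable.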

-- 
Gabriel Genellina


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Gregor Horvath

[EMAIL PROTECTED] wrote:

> On May 16, 12:54 pm, Gregor Horvath <[EMAIL PROTECTED]> wrote:
>> Istvan Albert wrote:
>>
>> So the solution is to forbid Chinese XP ?
>>
> 
> It's one solution, depending on your support needs.
> 

That would be a rather arrogant solution.
You would consider dropping the language and culture of millions of 
users because a small number of support team staff does not understand 
it? I would recommend to drop the support team and the management that 
even considers this.

This PEP is not a technical question.
Technically it would not change much.

The underlying question is a philosophical one.
Should computer programming only be easily accessible to a small fraction 
of privileged individuals who had the luck to be born in the correct 
countries?

Should the unfounded and maybe xenophobic fear of losing power and 
control of a small number of those already privileged be a guide for 
development?

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Gregor Horvath

[EMAIL PROTECTED] wrote:

>> 2) Create a way to internationalize the standard library (and possibly
>> the language keywords, too). Ideally, create a general standardized way
>> to internationalize code, possibly similar to how people
>> internationalize strings today.
> 
> Why?  Or more accurately why before adopting the PEP?
> The library is very usable by non-english speakers as long as
> there is documentation in their native language.  It would be

Microsoft once translated their VBA to foreign languages.
I didn't use it because I was used to "English" code.
If I program in mixed cultural contexts I have to use the lowest common 
denominator. Mixing the symbols of the programming language is confusing.

A long time ago, at the age of 12, I learned programming using English 
computer books. There were no German books at all then. It was not easy. 
It would have been completely impossible if our school system had not 
been wise enough to teach us English early.

I think millions of people are handicapped because of this.
Any step to improve this is a good step for all of us. No doubt 
there is a lot of talent wasted because of this wall.

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread rurpy

On May 16, 11:41 am, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:
> Christophe wrote:
snip...
> > Who displays stack frames? Your code. Whose code includes unicode
> > identifiers? Your code. Whose fault is it to create a stack trace
> > display procedure that cannot handle unicode? You.
>
> Thanks but no--I work with a _lot_ of code I didn't write, and looking
> through stack traces from 3rd party packages is not uncommon.

Are you worried that some 3rd-party package you have
included in your software will have some non-ascii identifiers
buried in it somewhere?  Surely that is easy to check for?
Far easier than checking that it doesn't have some trojan
code in it, it seems to me.
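
As a rough illustration of how such a check might look, here is a
minimal sketch. It assumes a Python 3.7+ interpreter (one that already
accepts non-ASCII identifiers and has str.isascii); the function name
non_ascii_identifiers is made up for the example.

```python
import io
import tokenize

def non_ascii_identifiers(source):
    """Return (line, column, name) for each identifier with non-ASCII characters."""
    found = []
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        # NAME tokens cover identifiers and keywords; strings and comments
        # are separate token types and are deliberately ignored here.
        if tok.type == tokenize.NAME and not tok.string.isascii():
            found.append((tok.start[0], tok.start[1], tok.string))
    return found

sample = "zähler = 1\ncounter = zähler + 1\ntext = 'ошибка'\n"
for line, col, name in non_ascii_identifiers(sample):
    print(line, col, name)
```

Running the same function over every .py file of a package would give
a quick list of places to look at; non-ASCII text inside string literals
is not reported.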

> And I'm often not creating a stack trace procedure, I'm using the
> built-in python procedure.
>
> And I'm often dealing with mailing lists, Usenet, etc where I don't
> know ahead of time what the other end's display capabilities are, how
> to fix them if they don't display what I'm trying to send, whether
> intervening systems will mangle things, etc.

I think we all are in this position.  I always send plain
text mail to mailing lists, people I don't know etc.  But
that doesn't mean that email software should be constrained
to only 7-bit plain text, no attachments!  I frequently use
such capabilities when they are appropriate.

If your response is, "yes, but look at the problems html
email, virus-infected attachments etc cause", the situation
is not the same.  You have little control over what kind of
email people send you but you do have control over what
code, libraries, patches, you choose to use in your
software.

If you want to use ascii-only, do it!  Nobody is making
you deal with non-ascii code if you don't want to.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread rurpy

On May 16, 1:44 am, René Fleschenberg <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > I'm not sure how you conclude that no problem exists.
> > - Meaningful identifiers are critical in creating good code.
>
> I agree.
>
> > - Non-english speakers can not create or understand
> >   english identifiers hence can't create good code nor
> >   easily grok existing code.
>
> I agree that this is a problem, but please understand that this problem is
> _not_ solved by allowing non-ASCII identifiers!
>
> > Considering the vastly greater number of non-English
> > speakers in the world, who are thus unable to use
> > Python effectively, seems like a problem to me.
>
> Yes, but this problem is not really addressed by the PEP.

I agree that the PEP does not provide a perfect solution
(whatever that is) to the difficulties faced by non-english
speaking Python users, but it provides a big and useful
improvement.

> If you want to
> do something about this:
> 1) Translate documentation.

Done.  (In some cases.)  For example here are the Standard
Library and Python Tutorial in Japanese:

http://www.python.jp/doc/release/lib/lib.html
http://www.python.jp/doc/release/tut/tut.html

(I mentioned this yesterday in
http://groups.google.com/group/comp.lang.python/msg/6ca67e21e9dc5358?hl=en&;
but I can't criticize anyone for missing messages in this
humongous discussion :-)

> 2) Create a way to internationalize the standard library (and possibly
> the language keywords, too). Ideally, create a general standardized way
> to internationalize code, possibly similar to how people
> internationalize strings today.

Why?  Or more accurately why before adopting the PEP?
The library is very usable by non-english speakers as long as
there is documentation in their native language.  It would be
nice to have an internationalized standard library but there is
no reason why this should be a prerequisite to the PEP.

> When that is done, non-ASCII identifiers could become useful. But of
> course, doing that might create a hog of other problems.

I disagree, non-ascii identifiers are an immensely useful
change, right now.   Python is somewhat useable by non-english
speakers now, but the identifier issue is a significant barrier.
If I can't write code with identifiers that are meaningful to
me (and my non-fluent-in-english colleagues) then I either
write code that is hard to understand by anyone (using ascii
transliterations) or write code understandable to you but
not me (using english).  Neither option makes sense and
in practice I just use some language other than Python.

> > That all programers know enough english to create and
> > understand english identifiers is currently speculation or
> > based on tiny personally observed samples.
>
> It is based on a look at the current Python environment. You do *at
> least* have the problem that the standard library uses English names.

I don't know every nook and corner of the standard library.
Even as an english speaker, I only look up names for things
I already know.  Those same things I recognise in code
because I use them.  Otherwise I look in the index (which
is in my native language if I am not fluent in english).  True,
I can't use docstrings effectively.  And true, I can't guess at
the use of an unfamiliar name (but I have documentation).
Neither of those prevents my effective use of Python, nor
negates the immense value to me of being able to write code
that I and my colleagues can maintain.

So I see the use of english names in the standard lib as
a small problem, certainly not a reason to reject the PEP.

> This assumes that there is documentation in the native language that is
> good enough (i.e. almost as good as the official one), which I can tell
> is not the case for German.

There is a chicken-and-egg problem here.  Why would
many non-english speaking people want to adopt Python
if they cannot write maintainable (for them) programs in it?
If there aren't many non-english speaking Python users,
why would anyone want to put the effort into translating
docs for them?  This is particularly true for people that
use scripts not based on latin letters.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Matthew Woodcraft

Eric Brunel <[EMAIL PROTECTED]> wrote:
> Joke aside, this just means that I won't ever be able to program math in  
> ADA, because I have absolutely no idea on how to do a 'pi' character on my  
> keyboard.

Just in case it wasn't clear: you could of course continue to use the
old name 'Pi' instead.

-M-


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread [EMAIL PROTECTED]

On May 16, 12:54 pm, Gregor Horvath <[EMAIL PROTECTED]> wrote:
> Istvan Albert wrote:
>
> > Here is something that just happened and relates to this subject: I
> > had to help a student run some python code on her laptop, she had
> > Windows XP that hid the extensions. I wanted to set it up such that
> > the extension is shown. I don't have XP in front of me but when I do
> > it takes me 15 seconds to do it. Now her Windows was set up with some
> > asian fonts (Chinese, Korean not sure), looked extremely unfamiliar
> > and I had no idea what the menu systems were. We have spent quite a
> > bit of time figuring out how to accomplish the task. I had her read me
> > back the options, but something like "hide extensions" comes out quite
> > a bit different. Surprisingly tedious and frustrating experience.
>
> So the solution is to forbid Chinese XP ?
>

It's one solution, depending on your support needs.

Independent of Python, several companies I've worked at in Ecuador
(entirely composed of native Spanish-speaking Ecuadoreans) use the
English-language OS/application installations--they of course have the
Spanish dictionaries and use Spanish in their documents, but for them,
having localized application menus generates a lot more problems than
it solves.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread rurpy

"Hendrik van Rooyen" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> <[EMAIL PROTECTED]> wrote:

> > First "while" is a keyword and will remain "while" so
> > that has nothing to do with anything.
>
> I think this cuts right down to why I oppose the PEP.
> It is not so much for technical reasons as for aesthetic
> ones - I find reading a mix of languages horrible, and I am
> kind of surprised by the strength of my own reaction.

But to reiterate, most public code will remain english
because that is the only practical way of managing an
international project.

I don't understand this almost pathological fear that
if the PEP is adopted, the world will be deluged by
a torrent of non-english programs.  99.9% of such programs
will be born and die in an environment where only speakers
of those languages will touch them.

The few that leak into the wider world will have to
be internationalized before most people will consider
adopting them, volunteering to maintain them, etc.

And as has already been pointed out, this is already the
case.  How can you maintain a python program written
with only ascii identifiers but transliterated from
a non-english language and with documentation, comments,
prompts and messages in that language?

This situation exists right now and it hasn't caused
the end of python-programming-as-we-know-it.

> If I try to analyse my feelings, I think that really the PEP
> does not go far enough, in a sense, and from memory
> it seems to me that only E Brunel, R Fleschenberg and
> to a lesser extent the Martellibot seem to somehow think
> in a similar way as I do, but I seem to have an extreme
> case of the disease...
>
> And the summaries of reasons for and against have left
> out objections based on this feeling of ugliness of mixed
> language.
>
> Interestingly, the people who seem to think a bit like that all
> seem to be non native English speakers who are fluent in
> English.

I have read that people who move to, or become citizens
of a new country often become far more patriotic and
defensive of their new country than their native-born
compatriots.

> While the support seems to come from people whose English
> is perfectly adequate, but who are unsure to the extent that they
> apologise for their "bad" English.
>
> Is this a pattern that you have identified? - I don't know.
>
> I still don't like the thought of the horrible mix of "foreign"
> identifiers and English keywords, coupled with the English
> sentence construction.  And that, in a nutshell, is the main
> reason for my rather vehement opposition to this PEP.
>
> The other stuff about sharing and my inability to even type
> the OP's name correctly with the umlaut is kind of secondary
> to this feeling of revulsion.

Interesting explanation, thanks.  I personally feel a
lot of the reaction against the PEP involves psychological
drivers like loss of control and loss of status, but I am
not a psychologist, so it would be too much work for me
to try to defend that, so I won't try.

I'll just say I think that making Python (significantly!!)
more accessible to non-English speakers is far too important,
both to those potential new users and to Python itself,
to be decided by "feelings".

> "Beautiful is better than ugly"

"Beauty is in the eye of the beholder"


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread laurentszyster

On May 13, 5:44 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> PEP 1 specifies that PEP authors need to collect feedback from the
> community. As the author of PEP 3131, I'd like to encourage comments
> to the PEP included below, either here (comp.lang.python), or to
> [EMAIL PROTECTED]
>
> In summary, this PEP proposes to allow non-ASCII letters as
> identifiers in Python. If the PEP is accepted, the following
> identifiers would also become valid as class, function, or
> variable names: Löffelstiel, changé, ошибка, or 売り場
> (hoping that the latter one means "counter").

+1

If only for one simple reason: JSON objects have Unicode names and it
may be convenient for a Python web agent to be able to instantiate/serialize
any such object as-is, to/from a Python class instead of
a dictionary.
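
A minimal sketch of the idea, assuming Python 3 (where non-ASCII
identifiers are accepted). The helper name load_as_object and the sample
document are made up for the illustration; json.loads with object_hook
and types.SimpleNamespace are standard library.

```python
import json
from types import SimpleNamespace

def load_as_object(text):
    """Parse JSON so that each object becomes attributes rather than dict keys."""
    return json.loads(text, object_hook=lambda d: SimpleNamespace(**d))

doc = '{"prénom": "Zoé", "âge": 30}'
person = load_as_object(doc)

# The JSON member names are used directly as attribute names.
print(person.prénom, person.âge)

# Going back to JSON: vars() exposes the attributes as a dict again.
print(json.dumps(vars(person), ensure_ascii=False))
```

One caveat: Python normalizes identifiers in source code to NFKC, so a
JSON name stored in a different normalization form would still need
getattr() with the exact string.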

Regards,


Laurent Szyster


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread rurpy

On May 16, 9:04 am, "Eric Brunel" <[EMAIL PROTECTED]> wrote:
> On Wed, 16 May 2007 16:29:27 +0200, Neil Hodgson  
>
> <[EMAIL PROTECTED]> wrote:
...snip...
> > Each of these can be handled reasonably considering their frequency of  
> > occurrence. I have never learned Japanese but have had to deal with  
> > Japanese text at a couple of jobs and it isn't that big of a problem.  
> > Its certainly not "virtually impossible" nor is there "absolutely no way  
> > of entering the word" (売り場). I think you should moderate your  
> > exaggerations.
>
> I do admit it was a bit exaggerated: there actually are ways. You know it,  
> and I know it. But what about the average guy, not knowing anything about  
> Japanese, kanji, radicals and stroke counts? How will he manage to enter  
> these funny-looking characters, perhaps not even knowing it's Japanese?  
> And does he have to learn a new input method each time he stumbles across  
> identifiers written in a character set he doesn't know? And even if he  
> finds a way, the chances are that it will be terribly inefficient. Having  
> to pay attention on how you can type the things you want is a really big  
> problem when it comes to programming: you have a lot of other things to  
> think about.

What does this have to do with the adoption of PEP-3131?  Are you
saying that if non-english speakers are allowed to use non-english
identifiers in their code, that you will have to *write* code in a
language you don't know using a script you don't know?

If you, for some extremely improbable reason *have* to modify the
code, then you will be cutting and pasting the existing variables.
If you are creating new variables, then given that you don't know
the language and have no idea what to name the variable, the
mechanics of entering it are the least of your problems.  Name
the new variable in ascii and leave it to a native speaker to fix
later.

If the answer to that is, "see, non-english Python is bad", then
arguments against *enforcing* english-only python are elsewhere in the
thread so I won't repeat them here.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread [EMAIL PROTECTED]

Christophe wrote:
> [EMAIL PROTECTED] wrote:
> > Christophe wrote:
> >> [EMAIL PROTECTED] wrote:
> >>> Steven D'Aprano wrote:
>  I would find it useful to be able to use non-ASCII characters for heavily
>  mathematical programs. There would be a closer correspondence between the
>  code and the mathematical equations if one could write D(u*p) instead of
>  delta(mu*pi).
> >>> Just as one risk here:
> >>> When reading the above on Google groups, it showed up as "if one could
> >>> write ?(u*p)..."
> >>> When quoting it for response, it showed up as "could write D(u*p)".
> >>>
> >>> I'm sure that the symbol you used was neither a capital letter d nor a
> >>> question mark.
> >>>
> >>> Using identifiers that are so prone to corruption when posting in a
> >>> rather popular forum seems dangerous to me--and I'd guess that a lot
> >>> of source code highlighters, email lists, etc have similar problems.
> >>> I'd even be surprised if some programming tools didn't have similar
> >>> problems.
> >> So, it was google groups that continuously corrupted the good UTF-8
> >> posts by force converting them to ISO-8859-1?
> >>
> >> Of course, there's also the possibility that it is a problem on *your*
> >> side
> >
> > Well, that's part of the point isn't it?  It seems incredibly naive to
> > me to think that you could use whatever symbol was intended and have
> > it show up, and the "well fix your machine!" argument doesn't fly.  A
> > lot of the time programmers have to look at stack traces on end-user's
> > machines (whatever they may be) to help debug.  They have to look at
> > code on the (GUI-less) production servers over a terminal link.  They
> > have to use all kinds of environments where they can't install the
> > latest and greatest fonts.  Promoting code that becomes very hard to
> > read and debug in real situations seems like a sound negative to me.
>
> Who displays stack frames? Your code. Whose code includes unicode
> identifiers? Your code. Whose fault is it to create a stack trace
> display procedure that cannot handle unicode? You.

Thanks but no--I work with a _lot_ of code I didn't write, and looking
through stack traces from 3rd party packages is not uncommon.

And I'm often not creating a stack trace procedure, I'm using the
built-in python procedure.

And I'm often dealing with mailing lists, Usenet, etc where I don't
know ahead of time what the other end's display capabilities are, how
to fix them if they don't display what I'm trying to send, whether
intervening systems will mangle things, etc.

> Even if you don't
> make use of them, you still have to fix the stack trace display
> procedure because the exception error message can include unicode text
> *today*

It can, but having identifiers in portable characters at least allows
some ability to navigate the code.  Display of strings is safe by
default anyway, as they can contain all sorts of data.

> You should know that displaying and editing UTF-8 text as if it was
> latin-1 works very very well.

Given that we've already seen one (fairly simple) character posted in
this thread that displayed differently in the HTML view than in the
edit--and neither place as the symbol originally intended--I'm going
to reserve judgement on this statement.   I don't know whether the
problem was with Google, my browser, or something else, but I do know
that it made interchange of information difficult and that I'm using a
fairly recent (within the last 3 years) out-of-the-box setup.
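
The kind of damage described here can be reproduced with a short sketch.
It assumes the symbol originally intended was the Greek capital delta
(U+0394), which the thread does not confirm, and it does not claim to
show what Google Groups actually did.

```python
symbol = "\u0394"  # assumed original character: GREEK CAPITAL LETTER DELTA

utf8_bytes = symbol.encode("utf-8")

# Reading the UTF-8 bytes as if they were Latin-1 yields two unrelated
# characters instead of one delta.
as_latin1 = utf8_bytes.decode("latin-1")

# Forcing the text into ASCII loses the character altogether.
as_ascii = symbol.encode("ascii", "replace").decode("ascii")

# The Latin-1 misreading is reversible as long as the bytes are untouched.
round_trip = as_latin1.encode("latin-1").decode("utf-8")

print(utf8_bytes, repr(as_latin1), as_ascii, round_trip == symbol)
```

The round trip is why editing UTF-8 as Latin-1 can work; the ASCII
replacement is the case where the information is gone for good.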

> Also, Terminals have support for UTF-8 encodings already. Or you could
> always use kate+fish to edit your script on the distant server without
> such problems (fish is a KDE protocol used to access a computer with ssh
> as if it was a hard disk and kate is the standard text/code editor) It's
> a matter of tools.

You don't always get to pick your tools.  It's very nice to have
things work with standard setups, be they brand new Windows boxes or
the c. 1993 mail server in the office or your wife's handheld that
you've grabbed to help do emergency troubleshooting from your vacation
or whatever.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Gregor Horvath

Istvan Albert wrote:

> Here is something that just happened and relates to this subject: I
> had to help a student run some python code on her laptop, she had
> Windows XP that hid the extensions. I wanted to set it up such that
> the extension is shown. I don't have XP in front of me but when I do
> it takes me 15 seconds to do it. Now her Windows was set up with some
> asian fonts (Chinese, Korean not sure), looked extremely unfamiliar
> and I had no idea what the menu systems were. We have spent quite a
> bit of time figuring out how to accomplish the task. I had her read me
> back the options, but something like "hide extensions" comes out quite
> a bit different. Surprisingly tedious and frustrating experience.
> 

So the solution is to forbid Chinese XP ?

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Istvan Albert

As a non-native English speaker,

On May 13, 11:44 am, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:

> - should non-ASCII identifiers be supported? why?

No. I don't think it adds much, I think it will be a little used
feature (as it should be), every python instructor will start their
class by saying here is a feature that you should stay away from
because you never know where your code ends up.

> - would you use them if it was possible to do so? in what cases?

No. The only possible uses I can think of are intentionally
obfuscating code.

Here is something that just happened and relates to this subject: I
had to help a student run some python code on her laptop, she had
Windows XP that hid the extensions. I wanted to set it up such that
the extension is shown. I don't have XP in front of me but when I do
it takes me 15 seconds to do it. Now her Windows was set up with some
asian fonts (Chinese, Korean not sure), looked extremely unfamiliar
and I had no idea what the menu systems were. We have spent quite a
bit of time figuring out how to accomplish the task. I had her read me
back the options, but something like "hide extensions" comes out quite
a bit different. Surprisingly tedious and frustrating experience.

Anyway, something to keep in mind. In the end features like this may
end up hurting those it was meant to help.

i.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Gregor Horvath

Eric Brunel wrote:
> 
> The point is that today, I have a reasonable chance of being able to 
> read, understand and edit any Python code. With PEP 3131, it will no 
> longer be true. That's what bugs me.

That's just not true. I and others in this thread have stated that they 
use German or other languages as identifiers today but are forced to 
make a stupid and unreadable translation to ASCII.

> 
> Same question again and again: how does he know that non-Russian 
> speakers will *ever* get in touch with his code and/or need to update it?

If you didn't get non-English comments and identifiers until now, you 
will not get any with this PEP either. And if you do get them, today or 
with the PEP, it doesn't make a difference to you if some glyphs are not 
properly displayed, does it?

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Eric Brunel

On Wed, 16 May 2007 17:14:32 +0200, Gregor Horvath <[EMAIL PROTECTED]>  
wrote:

> Eric Brunel wrote:
>
>> Highly improbable in the general context. If I stumble on a source code  
>> in Chinese, Russian or Hebrew, I wouldn't be able to figure out a  
>> single sound.
>
> If you get source code in a programming language that you don't know you  
> can't figure out a single sound either.
> How is that different?

What kind of argument is that? If it was carved in stone, I would not be  
able to enter it in my computer without rewriting it. So what?

The point is that today, I have a reasonable chance of being able to read,  
understand and edit any Python code. With PEP 3131, it will no longer be  
true. That's what bugs me.

> If someone decides to make *his* identifiers in Russian he's taking into  
> account that non-Russian speakers are not going to be able to read the  
> code.

Same question again and again: how does he know that non-Russian speakers  
will *ever* get in touch with his code and/or need to update it?
-- 
python -c "print ''.join([chr(154 - ord(c)) for c in  
'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-'])"

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Paul Boddie

On 16 May, 15:49, Carsten Haese <[EMAIL PROTECTED]> wrote:
>
> [*] And if you respond that they must know "some" English in the form of
> keywords and such, the answer is no, they need not. It is not hard for
> Europeans to learn to visually recognize a handful of simple Chinese
> characters without having to learn their pronunciation or even their
> actual meaning. By the same token, a Chinese person can easily learn to
> recognize "if", "while", "print" and so on visually as symbols, without
> having to learn anything beyond what those symbols do in a Python
> program.

I think this is a crucial point being made here. Taking a page from
the python.jp site, from which an example was posted elsewhere in the
discussion, we see a sprinkling of Latin-based identifiers much like a
number of other Japanese sites:

http://www.python.jp/Zope/pythondoc_jp/

I know hardly anything about the Japanese language and have heard only
anecdotal tales of English proficiency amongst Japanese speakers, but
is it really likely that readers of that page (particularly newcomers)
know the special pronunciation of "LaTeX" (or even most English
readers unfamiliar with that technology) and the derivation of that
name, that "Q" specifically means "question", that "HTML" specifically
means "Hypertext Markup Language", and so on? It seems to me that
modern Japanese culture and society is familiar with such "symbols"
without there being any convincing argument to suggest that this is
only the case because "they all must know English".

Consequently, Python's keywords and even the standard library can
exist with names being "just symbols" for many people. It would be
interesting to explore the notion of localised versions of the
library; the means of providing interoperability between programs and
library versions in different languages would be one of the many
challenges involved.

Paul


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Gregor Horvath

Eric Brunel wrote:

> Highly improbable in the general context. If I stumble on a source code 
> in Chinese, Russian or Hebrew, I wouldn't be able to figure out a single 
> sound.

If you get source code in a programming language that you don't know you 
can't figure out a single sound either.
How is that different?

If someone decides to make *his* identifiers in Russian he's taking into 
account that non-Russian speakers are not going to be able to read the 
code.
If someone decides to program in Fortran he takes into account that the 
average Python programmer can not read the code.

How is that different?

It's the choice of the author.
Taking away the choice is not a good thing.
Following this logic we should forbid all other programming languages 
except Python so everyone can read every code in the world.

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Eric Brunel

On Wed, 16 May 2007 16:29:27 +0200, Neil Hodgson  
<[EMAIL PROTECTED]> wrote:

> Eric Brunel:
>
>> Have you ever tried to enter anything more than 2 or 3 characters like  
>> that?
>
> No, only for examples. Lengthy texts are either already available  
> digitally or are entered by someone skilled in the language.
>
>> I did. It just takes ages. Come on: are you really serious about
>> entering *identifiers* in a *program* this way?
>
> Are you really serious about entry of identifiers in another  
> language  being a problem?

Yes.

> Most of the time your identifiers will be available by selection  
> from an autocompletion list or through cut and paste.

Auto-completion lists have always caused me more disturbance than help.  
Since - AFAIK - you have to type some characters before they can be of any  
help, I don't think they can help much here. I also did have to copy/paste  
identifiers to program (because of a broken keyboard, IIRC), and found it  
extremely difficult to handle. Constant movements to get every identifier  
- either by keyboard or with the mouse - are not only unnecessary, but  
also completely break my concentration. Programming this way takes me 4  
or 5 times longer than being able to type characters directly.

> Less commonly, you'll know what they sound like.

Highly improbable in the general context. If I stumble on a source code in  
Chinese, Russian or Hebrew, I wouldn't be able to figure out a single  
sound.

> Even more rarely you'll only have a printed document.

I wonder how that could be of any help.

> Each of these can be handled reasonably considering their frequency of  
> occurrence. I have never learned Japanese but have had to deal with  
> Japanese text at a couple of jobs and it isn't that big of a problem.  
> Its certainly not "virtually impossible" nor is there "absolutely no way  
> of entering the word" (売り場). I think you should moderate your  
> exaggerations.

I do admit it was a bit exaggerated: there actually are ways. You know it,  
and I know it. But what about the average guy, not knowing anything about  
Japanese, kanji, radicals and stroke counts? How will he manage to enter  
these funny-looking characters, perhaps not even knowing it's Japanese?  
And does he have to learn a new input method each time he stumbles across  
identifiers written in a character set he doesn't know? And even if he  
finds a way, the chances are that it will be terribly inefficient. Having  
to pay attention to how you can type the things you want is a really big  
problem when it comes to programming: you have a lot of other things to  
think about.
-- 
python -c "print ''.join([chr(154 - ord(c)) for c in  
'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-'])"

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Neil Hodgson

Eric Brunel:

> Have you ever tried to enter anything more than 2 or 3 characters like 
> that? 

No, only for examples. Lengthy texts are either already available 
digitally or are entered by someone skilled in the language.

> I did. It just takes ages. Come on: are you really serious about
> entering *identifiers* in a *program* this way?

Are you really serious about entry of identifiers in another 
language  being a problem?

Most of the time your identifiers will be available by selection 
from an autocompletion list or through cut and paste. Less commonly, 
you'll know what they sound like. Even more rarely you'll only have a 
printed document. Each of these can be handled reasonably considering 
their frequency of occurrence. I have never learned Japanese but have 
had to deal with Japanese text at a couple of jobs and it isn't that big 
of a problem. It's certainly not "virtually impossible" nor is there 
"absolutely no way of entering the word" (売り場). I think you should 
moderate your exaggerations.

Is there a realistic scenario in which foreign character set 
identifier entry would be difficult for you?

Neil

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Eric Brunel

On Wed, 16 May 2007 15:46:10 +0200, Neil Hodgson  
<[EMAIL PROTECTED]> wrote:

> Eric Brunel:
>
>> Funny you talk about Japanese, a language I'm a bit familiar with and  
>> for which I actually know some input methods. The thing is, these only  
>> work if you know the transcription to the latin alphabet of the word  
>> you want to type, which closely match its pronunciation. So if you  
>> don't know that 売り場 is pronounced "uriba" for example, you have  
>> absolutely no way of entering the word. Even if you could choose among  
>> a list of characters, are you aware that there are almost 2000 "basic"  
>> Chinese characters used in the Japanese language? And if I'm not  
>> mistaken, there are several tens of thousands characters in the Chinese  
>> language itself. This makes typing them virtually impossible if you  
>> don't know the language and/or have the correct keyboard.
>
> It is nowhere near that difficult. There are several ways to  
> approach this, including breaking up each character into pieces and  
> looking through the subset of characters that use that piece (the  
> Radical part of the IME). For 売, you can start with the cross with a  
> short bottom stroke (at the top of the character) 士, for 場 look for  
> the crossy thing on the left 土. The middle character is simple looking  
> so probably not Chinese so found it in Hiragana. Another approach is to  
> count strokes (Strokes section of the IME) and look through the  
> characters with that number of strokes. Within lists, the characters are  
> ordered from simplest to more complex so you can get a feel for where to  
> look.

Have you ever tried to enter anything more than 2 or 3 characters like  
that? I did. It just takes ages. Come on: are you really serious about  
entering *identifiers* in a *program* this way?
-- 
python -c "print ''.join([chr(154 - ord(c)) for c in  
'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-'])"

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Neil Hodgson

Eric Brunel:

> Funny you talk about Japanese, a language I'm a bit familiar with and 
> for which I actually know some input methods. The thing is, these only 
> work if you know the transcription to the latin alphabet of the word you 
> want to type, which closely match its pronunciation. So if you don't 
> know that 売り場 is pronounced "uriba" for example, you have absolutely 
> no way of entering the word. Even if you could choose among a list of 
> characters, are you aware that there are almost 2000 "basic" Chinese 
> characters used in the Japanese language? And if I'm not mistaken, there 
> are several tens of thousands characters in the Chinese language itself. 
> This makes typing them virtually impossible if you don't know the 
> language and/or have the correct keyboard.

It is nowhere near that difficult. There are several ways to 
approach this, including breaking up each character into pieces and 
looking through the subset of characters that use that piece (the 
Radical part of the IME). For 売, you can start with the cross with a 
short bottom stroke (at the top of the character) 士, for 場 look for 
the crossy thing on the left 土. The middle character is simple-looking, 
so it is probably not Chinese; I found it in Hiragana. Another approach is to 
count strokes (Strokes section of the IME) and look through the 
characters with that number of strokes. Within lists, the characters are 
ordered from simplest to more complex so you can get a feel for where to 
look.

Neil

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Carsten Haese

On Wed, 2007-05-16 at 14:35 +0200, René Fleschenberg wrote:
> You have misread my statements.
> 
> Carsten Haese wrote:
> > There is evidence against your assertions that knowing some English is a
> > prerequisite for programming 
> 
> I think it is a prerequisite for "real" programming. Yes, I can imagine
> that if you use Python as a teaching tool for Chinese 12 year-olds, then
> it might be nice to be able to spell identifiers with Chinese
> characters. However, IMO this is such a special use-case that it is
> justified to require the people who need this to explicitly enable it,
> by using a patched interpreter or by enabling an interpreter option for
> example.

There you go again with "real" programming. Nobody that I'm aware of
dictates that Python must only be used for real programming.

It sounds like you are acknowledging that there are use cases for
allowing non-ASCII identifiers after all. Making some switch for
enabling this feature is a compromise that has been suggested on this
thread before, including by yours truly. I wouldn't even be opposed to
making this switch be off by default, as long as the feature is there
for people who need it.

> > in Python and that people won't use non-ASCII
> > identifiers if they could. 
> 
> I did not assert that at all, where did you get the impression that I
> do? If I were convinced that no one would use it, I would not have such a
> big problem with it. I fear that it *will* be used "in the wild" if the
> PEP in its current form is accepted and that I personally *will* have to
> deal with such code.

Yes, I apologize, I completely mangled your assertion. I don't know what
I was thinking when I wrote that. In reality you asserted, and I'll
quote verbatim this time: "It is naive to believe that you can program
in Python without understanding any English once you can use your native
characters in identifiers." It is precisely this assertion that is being
disproved by HYRY's students who *do* program in Python without
understanding any English[*], using native characters in identifiers.
But they have to launder their programs before they can run them.

[*] And if you respond that they must know "some" English in the form of
keywords and such, the answer is no, they need not. It is not hard for
Europeans to learn to visually recognize a handful of simple Chinese
characters without having to learn their pronunciation or even their
actual meaning. By the same token, a Chinese person can easily learn to
recognize "if", "while", "print" and so on visually as symbols, without
having to learn anything beyond what those symbols do in a Python
program.
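
As an illustration of the footnote, here is a minimal sketch of what such a
program might look like, assuming an interpreter that accepts non-ASCII
identifiers as the PEP proposes (Python 3 does). The Chinese names are
hypothetical and chosen only for this example; the keywords remain the same
visual "symbols" for every reader.

```python
# Keywords (def, for, if, return) are unchanged; only the names picked by the
# author are Chinese. All identifiers here are illustrative assumptions.
def 求和(数字列表):          # "sum" of a "list of numbers"
    总和 = 0                 # running "total"
    for 数 in 数字列表:      # each "number" in the list
        if 数 > 0:           # keep only the positive ones
            总和 += 数
    return 总和


print(求和([1, -2, 3]))
```

A reader who knows no Chinese can still follow the control flow from the
keywords alone, which is the mirror image of the situation described above.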

Regards,

-- 
Carsten Haese
http://informixdb.sourceforge.net


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Marc 'BlackJack' Rintsch

In <[EMAIL PROTECTED]>, Stefan Behnel wrote:

> René Fleschenberg wrote:
>> Marc 'BlackJack' Rintsch wrote:
>>> There are potential users of Python who don't know much English or no
>>> English at all.  This includes kids, old people, people from countries
>>> that have "letters" that are not that easy to transliterate like European
>>> languages, people who just want to learn Python for fun or to customize
>>> their applications like office suites or GIS software with a Python
>>> scripting option.
>> 
>> Make it an interpreter option that can be turned on for those cases.
> 
> No. Make "ASCII-only" an interpreter option that can be turned on for the
> cases where it is really required.

Make no interpreter options and use `pylint` and `pychecker` for checking
if the sources follow your style guide with respect to identifiers.
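
A check of this kind can be sketched with the standard library alone. This is
not how `pylint` or `pychecker` work internally and it uses none of their
APIs; it is only a rough illustration, assuming Python 3.7 or later (for
`str.isascii`). The function name `find_non_ascii_identifiers` is hypothetical.

```python
import io
import tokenize


def find_non_ascii_identifiers(source):
    """Return (line, column, name) for each NAME token that is not pure ASCII."""
    hits = []
    # generate_tokens() expects a readline callable that yields str lines.
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        # NAME covers identifiers and keywords; strings and comments are
        # separate token types, so non-ASCII text inside them is ignored.
        if tok.type == tokenize.NAME and not tok.string.isascii():
            hits.append((tok.start[0], tok.start[1], tok.string))
    return hits


sample = "größe = 3\nprint(größe + 1)\n"
for line, col, name in find_non_ascii_identifiers(sample):
    print(f"line {line}, column {col}: {name}")
```

Run over a source tree from a commit hook or a test, something like this lets
a project enforce an ASCII-only style rule without any interpreter option.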

Ciao,
Marc 'BlackJack' Rintsch

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Sion Arrowsmith

Hendrik van Rooyen <[EMAIL PROTECTED]> wrote:
>I still don't like the thought of the horrible mix of "foreign"
>identifiers and English keywords, coupled with the English 
>sentence construction.

How do you think you'd feel if Python had less in the way of
(conventionally used) English keywords/builtins? Like, say, Perl?

-- 
\S -- [EMAIL PROTECTED] -- http://www.chaos.org.uk/~sion/
   "Frankly I have no feelings towards penguins one way or the other"
-- Arthur C. Clarke
   her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Javier Bezos

"Eric Brunel" <[EMAIL PROTECTED]> wrote in message

> Funny you talk about Japanese, a language I'm a bit familiar with and for
> which I actually know some input methods. The thing is, these only work if
> you know the transcription to the latin alphabet of the word you want to
> type, which closely match its pronunciation. So if you don't know that 売り場
> is pronounced "uriba" for example, you have absolutely no way of
> entering the word.

Actually, you can draw the character (in XP, at
least) entirely or in part and the system shows a
list of characters with similar shapes. IIRC, there is
a similar tool on Macs. Of course, I'm not saying
this lets you enter kanji in an easy and fast way,
but it's certainly not impossible at all, even if
you don't know the pronunciation.

Javier
---
http://www.texytipografia.com





Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Sion Arrowsmith

Steven D'Aprano  <[EMAIL PROTECTED]> wrote:
>On Tue, 15 May 2007 09:09:30 +0200, Eric Brunel wrote:
>> Joke aside, this just means that I won't ever be able to program math in
>> ADA, because I have absolutely no idea on how to do a 'pi' character on
>> my keyboard.
>Maybe you should find out then? Personal ignorance is never an excuse for 
>rejecting technology.

The funny thing is, I could have told you exactly how to type a 'pi'
character 18 years ago, when my main use of computers was typesetting
on a Mac. These days ... I've just spent 20 minutes trying to find out
how to insert one into this text (composed in emacs on a remote
machine, connected via ssh from konsole).

-- 
\S -- [EMAIL PROTECTED] -- http://www.chaos.org.uk/~sion/
   "Frankly I have no feelings towards penguins one way or the other"
-- Arthur C. Clarke
   her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Jochen Schulz

* René Fleschenberg:
> Stefan Behnel wrote:
>> 
>> [...] They are just tools. Even if you do not
>> understand English, they will not get in your way. You just learn them.
> 
> I claim that this is *completely unrealistic*. When learning Python, you
> *do* learn the actual meanings of English terms like "open",
> "exception", "if" and so on if you did not know them before.

This is certainly true for easy words like "open" and "in". But there
are a lot of counterexamples.

When learning something new, you always learn a lot of new concepts and
you get to know what they are called in this specific context. For
example, when you learn to program, you might stumble upon the idea of
"exceptions", which you can raise/throw and except/catch. But even if
you know how to use that concept and understand what it does, you do not
necessarily know the "usual" meaning of the word outside of your domain.

As far as I can tell, quite often these are the terms that even enter
the native language without any translation (even though there are
perfect translations for the words in their original meaning).  German
examples are "exceptions", "client" and "server", "mail", "hub" and
"switch", "web" and many, many more. Nobody who uses these terms has to
know their exact meaning in his native language as long as he speaks to
Germans or stays in the domain where he learned them.

I read a lot of English text every day but I am sometimes still
surprised to learn that a word I already knew has a meaning outside
of computing. "Hub" is a nice example of that. I was very surprised to
learn that even my bike has this. ;-)

J.
-- 
If I was Mark Chapman I would have shot John Lennon with a water pistol.
[Agree]   [Disagree]
 

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread George Sakkis

On May 13, 11:44 am, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:

> (snipped)
>
> So, please provide feedback, e.g. perhaps by answering these
> questions:
> - should non-ASCII identifiers be supported? why?

Initially I was at -1 but from this thread it seems that many closed
(or semi-closed) environments would benefit from such a change. I'm
still concerned though about the segregation this feature encourages.
In my (admittedly limited) experience in non-English-speaking
environments, everyone used to have some basic command of English and
was encouraged to use proper English identifiers; OTOH, the hodgepodge
of English keywords/stdlib/3rd-party symbols with application
identifiers transliterated to ASCII was looked down on as clumsy and in
fact less readable.

Bottom line, -0.

> - would you use them if it was possible to do so? in what cases?

No, and I would refuse to maintain code that did use them*.

George

* Unless I start teaching programming to preschoolers or something.

