Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Steve Holden
Martin v. Löwis wrote:
 Please, if you have a *new* idea that doesn't have a failure mode, by
 all means post it.  But don't resurrect a pointless bikeshed.
 
 While I completely agree that it is pointless to reiterate the same
 arguments over and over, I disagree that the bikeshed metaphor applies.
 This metaphor (IIUC) describes a trivial design issue that is merely
 a matter of taste, rather than having deep technical implications.
 Using Unicode or bytes for strings is not of that kind.
 
+1

These issues are very important because they affect everyone. Even
though very few people actually understand them. Including me, which is
why I've been so quiet on this thread.

regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Georg Brandl
Barry Warsaw wrote:
 On Dec 4, 2008, at 6:21 PM, Martin v. Löwis wrote:
 
 I can't find any docs built for Python 3.0 (not 3.1a0).

 The Windows installation has new 3.0 doc dated Dec 3, so it was  
 built,
 just not posted correctly.
 
 That doesn't mean very much. I built it on my local machine. Anybody
 with subversion and python could do that; the documentation is in
 subversion.
 
 Whether or not it appears on the web site as part of the release
 process is an entirely different matter. It used to be that the
 doc maintainer (Fred Drake) was part of the release team and release
 process. I think Georg is complaining that he is release maintainer,
 but not part of the release process.
 
 I've asked Georg to update PEP 101 to make his role as Documentation  
 Expert explicit.  Unfortunately we only debug major releases once (or  
 twice) every 18 months.  But next time, we'll get that part right for  
 sure!

Done that now. Since release.py builds the docs all right, there's not
much left for me to do except check that everything is ok.

 In the meantime, I'll make sure Georg is involved in point releases  
 moving forward.

That's good. Thanks!

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Adam Olsen
On Fri, Dec 5, 2008 at 12:00 AM, Martin v. Löwis [EMAIL PROTECTED] wrote:
 Please, if you have a *new* idea that doesn't have a failure mode, by
 all means post it.  But don't resurrect a pointless bikeshed.

 While I completely agree that it is pointless to reiterate the same
 arguments over and over, I disagree that the bikeshed metaphor applies.
 This metaphor (IIUC) describes a trivial design issue that is merely
 a matter of taste, rather than having deep technical implications.
 Using Unicode or bytes for strings is not of that kind.

That we need to support both unicode and bytes is important, but
already seems to have consensus.  However, they present two distinct
usage patterns:

* unicode text, presentable to the user, interacts with all manner of
standardized APIs
* bytes, limited to local, internal use.  Only approximated forms can
be presented to the user, only custom formats can be saved externally

None of the proposals have turned these into a single use case.  All
they do is trade off various forms of subtly switching back and forth,
which leads to failure.  Debating which subtle failure is better is a
bikeshed.

Not only that, but we already have a solution that makes the choice
explicit, avoiding the subtle failure.  This is the solution already
in use for os file and path functions.  It's the solution Guido
supports.


-- 
Adam Olsen, aka Rhamphoryncus


Re: [Python-Dev] Taint Mode in Python 3.0

2008-12-05 Thread Nick Coghlan
Maciej Fijalkowski wrote:
 Hello,
 
 The thing is pypy's taint code is broken. Basically you don't only
 need to patch all places that return pyobject, but also all places
 that might modify anything. (All side effects) For example innocently
 looking call to addition might end up calling arbitrary python code
 (and have arbitrary side effects). There is a question how do you
 approach such things?

Taint isn't an easy problem, but PyPy is still a *much* better platform
for that kind of experimentation than CPython.

RPython, objects spaces, the code generation, etc all give you much more
powerful tools to play with than the raw C code of the reference
interpreter.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan
[EMAIL PROTECTED] wrote:
 At least this time I think I've encapsulated pretty much my entire
 argument here, so if you don't buy it, we can probably just agree to
 disagree :).

Glyph, the only point I would add to your message is this one:

Adding a blessed way to encode arbitrary binary data into a Python 3.0
str object strikes me as giving up on one of the key advances in the new
version of the language.

8-bit strings were a problem in Python 2.x because they blurred the
boundary between arbitrary binary data and ASCII or latin-1 character data.

One of the most interesting aspects of Python 3.0 is its attempt to get
developers to be explicit about this distinction (both in the code and
in their own minds) by enforcing separation between arbitrary binary
data (held in bytes and bytearray instances) and character data (held in
str instances).

I don't understand how tunneling arbitrary binary data through str
instances (*regardless* of encoding mechanism) can possibly fail to
recreate exactly the same "is it text or binary data?" ambiguity
problems that the str/bytes split is intended to eliminate. And if that
happens, then what exactly was the point in moving to an all Unicode
string model for Py3k?
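
The separation Nick describes can be seen in a few lines. This is only a minimal sketch, not code from the thread; the helper name try_concat is made up for illustration.

```python
# Text (str) and binary data (bytes) are distinct types in Python 3
# and are never combined implicitly.
text = "PATH"
raw = b"PATH"


def try_concat(a, b):
    """Hypothetical helper: return a + b, or the name of the exception raised."""
    try:
        return a + b
    except TypeError as exc:
        return type(exc).__name__


mixed = try_concat(text, raw)            # str + bytes is refused
explicit = text + raw.decode("ascii")    # an explicit decode states the intent
equal = (text == raw)                    # identical ASCII content, still not equal
```

Tunneling undecodable bytes through str would remove exactly this kind of type-level check, which appears to be the concern raised above.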

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Victor Stinner
On Friday 05 December 2008 00:39:24, Martin v. Löwis wrote:
 5) represent all environment variables in Unicode strings,
including the ones that currently fail to decode.
(then do the same to file names, then drop the byte-oriented
 file operations again)

Please, don't do that! Bytes are not characters!

-- 
Victor Stinner aka haypo
http://www.haypocalc.com/blog/


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Ulrich Eckhardt
On Friday 05 December 2008, Adam Olsen wrote:
 Many of the windows APIs use UTF-16 without validating it.  They'll
 pass through invalid strings until they hit something that does
 validate, at which point it'll blow up.

 I suspect that it doesn't happen very often in practice, as having
 only one encoding makes it quite clear that it's a broken file name,
 not a mixed encoding environment.

Actually, I wouldn't say that's a problem at all. The point is that stuff that 
is blissfully unaware of encodings typically uses some ASCII-de(p)rived text. 
Those char-strings are translated according to the current locale, which then 
does the filtering and validation. The result may be gibberish (GIGO 
principle) but at least it's UTF-16 gibberish. ;)

Uli

-- 
Sator Laser GmbH
Managing Director: Thorsten Föcking, Amtsgericht Hamburg HR B62 932

**
   Visit our website at http://www.satorlaser.de/
**
This e-mail, including all attachments, is intended only for the addressee 
and may contain confidential information. Please notify the sender 
immediately if you are not the intended recipient. In that case the e-mail 
must be deleted and must not be read, forwarded, published, or otherwise 
used.
E-mails can be read by third parties and may contain viruses and unauthorized 
changes. Sator Laser GmbH is not responsible for these consequences.

**



Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Victor Stinner
Hi,

On Thursday 04 December 2008 21:02:19, Toshio Kuratomi wrote:
 I opened up bug http://bugs.python.org/issue4006 a while ago and it was
 suggested in the report that it's not a bug but a feature and so I
 should come here to see about getting the feature changed :-)

Yeah, I prefer to discuss such changes on the mailing list.

 These mixed encodings can occur for a variety of reasons.  Here's an
 example that isn't too contrived :-)
 (...)
 Furthermore, they don't want to suffer from the space loss of using 
 utf-8 to encode Japanese so they use shift-jis everywhere.

space loss? Really? If you configure your server correctly, you should get 
UTF-8 even if the file system is Shift-JIS. But it would be much easier to 
use UTF-8 everywhere.

Hum... I don't think that the discussion is about one specific server, but the 
lack of bytes environment variables in Python3 :-)

 1) return mixed unicode and byte types in ...

NO!

 2) return only byte types in os.environ

Hum... Most users have UTF-8 everywhere (eg. all Windows users ;-)), and 
Python3 already uses Unicode everywhere (input(), open(), filenames, ...).

 3) silently ignore non-decodable value when accessing os.environ['PATH']
 as we do now but allow access to the full information via
 os.environ[b'PATH'] and os.getenvb()

I don't like os.environ[b'PATH']. I prefer to always get the same result 
type... But os.listdir() doesn't respect that :-(

   os.listdir(str) -> list of str
   os.listdir(bytes) -> list of bytes

I would prefer a similar API for easier migration from Python2/Python3
(unicode). os.environb sounds like the best choice for me.
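
The os.listdir() behaviour mentioned above (the result type follows the argument type) can be checked with a short sketch. It assumes a Python version newer than this thread: os.fsencode() and tempfile.TemporaryDirectory were added in Python 3.2. The file name example.txt is arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the directory listing is not empty.
    with open(os.path.join(tmp, "example.txt"), "w"):
        pass

    as_text = os.listdir(tmp)                 # str argument -> list of str
    as_bytes = os.listdir(os.fsencode(tmp))   # bytes argument -> list of bytes
```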


But there are open questions (already asked in the bug tracker):

(a) Should os.environ be updated if os.environb is changed? If yes, how?
   os.environb['PATH'] = '\xff' (or any invalid string in the system 
 default encoding)
   => os.environ['PATH'] = ???

(b) Should os.environb be updated if os.environ is changed? If yes, how?

The problem comes with non-Unicode locales (eg. latin-1 or ASCII): most charsets 
are unable to encode the whole Unicode charset (eg. codes >= 65535).

   os.environ['PATH'] = chr(0x1)
   => os.environb['PATH'] = ???

(c) Same question when a key is deleted (del os.environ['PATH']).
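
Questions (a) and (b) come down to conversions that have no answer in one direction. The sketch below only illustrates that; the helper names decode_or_none and encode_or_none and the sample values are assumptions, and UTF-8 and latin-1 merely stand in for "the system default encoding".

```python
def decode_or_none(raw, encoding):
    """Hypothetical helper: decode bytes, or return None if they are invalid."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


def encode_or_none(text, encoding):
    """Hypothetical helper: encode text, or return None if the charset is too small."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# (a) a bytes value that has no str counterpart under a UTF-8 locale
bytes_to_text = decode_or_none(b"/usr/bin:\xff", "utf-8")

# (b) a str value that has no bytes counterpart under a latin-1 locale
text_to_bytes = encode_or_none("/opt/\u20ac/bin", "latin-1")

# A plain ASCII value converts both ways without loss.
round_trip = decode_or_none(encode_or_none("/usr/bin", "latin-1"), "latin-1")
```

In both failing cases one of the two mappings would have to drop, replace, or reject the entry, which is why the two views can drift apart.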

If Python 3.1 will have os.environ and os.environb, I'm quite sure that some 
modules will use os.environ and others will prefer os.environb. If the two 
environments differ, the two sets of modules will behave differently :-/

It would be maybe easier if os.environ supports bytes and unicode keys. But we 
have to keep these assertions:
   os.environ[bytes] -> bytes
   os.environ[str] -> str

 4) raise an exception when non-decodable values are *accessed* and
 continue as in #3.

I like os.listdir() behaviour: just *ignore* non-decodable files. If you 
really want to access these files, use a bytes directory name ;-)

 I think that the ease of debugging is lost when we silently ignore an error.

Guido gave a good example. If your directory contains a non-decodable 
filename (eg. ???.txt), glob('*.py') would fail because of the evil 
filename. With the current behaviour, you're unable to list all files but 
glob('*.py') will list all Python scripts!

And now that Python3 is released, it may be a bad idea to change the behaviour 
(of os.environ) in Python 3.1 :-/

 The bug report I opened suggests creating a PEP to address this issue.

Please try to answer my questions about os.environ and os.environb 
consistency.

I also like bytes environment variables. I need them for my fuzzing program. 
The lack of bytes variables is a regression from Python2 (for my program). On 
UNIX, filenames are bytes and the environment variables are bytes. For the 
best interoperability, Python3 should support bytes. But the default choice 
should always be characters (unicode) and to never mix the bytes and str 
types ;-)

---

As usual, it goes faster if someone writes a patch :-) I could try to work on 
it.

-- 
Victor Stinner aka haypo
http://www.haypocalc.com/blog/


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Ulrich Eckhardt
On Friday 05 December 2008, Guido van Rossum wrote:
 At the risk of bringing up something that was already rejected, let me
 propose something that follows the path taken in 3.0 for filenames,
 rather than doubling back:

 For os.environ, os.getenv() and os.putenv(), I think a similar
 approach as used for os.listdir() and os.getcwd() makes sense: let
 os.environ skip variables whose name or value is undecodable, and have
 a separate os.environb() which contains bytes; let os.getenv() and
 os.putenv() do the right thing when the arguments passed in are bytes.

 For sys.argv, because it's positional, you can't skip undecodable
 values, so I propose to use error=replace for the decoding; again, we
 can add sys.argvb that contains the raw bytes values. The various
 os.exec*() and os.spawn*() calls (as well as os.system(), os.popen()
 and the subprocess module) should all accept bytes as well as strings.

 On Windows, the bytes APIs should probably not exist.

 I predict that most developers can get away with not using the bytes
 APIs at all. The small minority that needs to be robust if not all
 filenames use the system encoding can use the bytes APIs.
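
What decoding with the "replace" error handler would do to positional arguments can be sketched as follows. The argument values are invented, and UTF-8 is assumed as the system encoding.

```python
# Hypothetical raw argv as the OS would deliver it on UNIX.
raw_argv = [b"prog", b"caf\xc3\xa9.txt", b"caf\xe9.txt"]

# Decode every argument; undecodable bytes become U+FFFD instead of
# raising, so no position is skipped.
argv = [arg.decode("utf-8", errors="replace") for arg in raw_argv]

same_length = len(argv) == len(raw_argv)

# The replacement is lossy: the third argument cannot be turned back
# into the original bytes, which is what a raw sys.argvb would be for.
lossy = argv[2].encode("utf-8") != raw_argv[2]
```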

I know some of those developers, you can contact them via 
[EMAIL PROTECTED] Seriously, what would you suggest to someone that 
wants to handle paths in a portable way? Using the Unicode variants of 
functions is fubar, because encoding/decoding is not universally possible. 
Using the byte variant is equally fubar, because e.g. on MS Windows it is not 
supported, except through a very lossy roundtrip through the locale's 
codepage, limiting your functionality.

I actually think it is about time to give up on trying to think about a path 
as a string. Ditto for data received from os.environ or sys.argv. There are 
only very few things that are universal to them and a reliable encoding is 
none of them. Then, once you have let that idea go, meditate a bit over the 
Zen.

What I propose is that paths must be treated as OS-specific, with the only 
common reliable operations being joining them, concatenating them and 
splitting them into segments divided by the (again, OS-specific) separator. 
Other operations, like e.g. appending a string or converting it to a string 
in order to display it can fail. And if they fail, they should fail noisily. 
In 99% of all cases, using the default encoding will work and do what people 
expect, which is why I would make this conversion automatic. In all other 
cases, it will at least not fail silently (which would lead to garbage and 
data loss) and allow more sophisticated applications to handle it.
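
One way to read this proposal is an opaque path type that only offers structural operations and fails loudly on conversion. The class below is a hypothetical sketch, not an existing API: the name OSPath, its methods, the bytes representation, and the UTF-8 default are all assumptions, and it makes the conversion an explicit call rather than automatic.

```python
class OSPath:
    """Hypothetical opaque path: raw bytes plus structural operations only."""

    def __init__(self, raw, sep=b"/"):
        self._raw = bytes(raw)
        self._sep = sep

    def join(self, other):
        # Joining never needs to know the encoding of either part.
        return OSPath(self._raw.rstrip(self._sep) + self._sep + other._raw,
                      self._sep)

    def segments(self):
        # Splitting on the separator is also encoding-independent.
        return [OSPath(part, self._sep)
                for part in self._raw.split(self._sep) if part]

    def to_text(self, encoding="utf-8"):
        # Strict decoding: raises UnicodeDecodeError instead of
        # silently producing garbage.
        return self._raw.decode(encoding, errors="strict")


good = OSPath(b"/home").join(OSPath(b"user"))
bad = OSPath(b"/home").join(OSPath(b"\xff\xfe"))

good_text = good.to_text()
segment_count = len(bad.segments())   # structure is still usable

try:
    bad.to_text()
    bad_failed = False
except UnicodeDecodeError:
    bad_failed = True
```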

Uli




[Python-Dev] Fix for frame_setlineno() in frameobject.c function

2008-12-05 Thread Fabien . Bouleau
Hello,

This concerns a known bug in the frame_setlineno() function for Python 
2.5.x and 2.6.x (maybe in earlier versions too). It is not possible to use 
this function when the address or line offset is greater than 127. The 
problem comes from the lnotab variable, which is typed char* and is therefore 
implicitly signed char* (on platforms where plain char is signed). Any value 
above 127 becomes a negative number.

The fix is very simple (applied on the Python 2.6.1 version of the source 
code):

--- frameobject.c   Thu Oct 02 19:39:50 2008
+++ frameobject_fixed.c Fri Dec 05 11:27:42 2008
@@ -119,8 +119,8 @@
        line = f->f_code->co_firstlineno;
        new_lasti = -1;
        for (offset = 0; offset < lnotab_len; offset += 2) {
-               addr += lnotab[offset];
-               line += lnotab[offset+1];
+               addr += ((unsigned char*)lnotab)[offset];
+               line += ((unsigned char*)lnotab)[offset+1];
                if (line >= new_lineno) {
                        new_lasti = addr;
                        new_lineno = line;


It would be nice to fix it for Python 2.5 and above, in order to have a 
proper MSI installer for Windows.
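
The effect of reading the table through a signed char can be reproduced in Python with the struct module. This is a sketch of the arithmetic only, not the CPython code; the table contents, the starting line, and the function names are invented.

```python
import struct


def lnotab_increments(lnotab, signed):
    """Read each byte as a signed ('b') or unsigned ('B') char."""
    fmt = "b" if signed else "B"
    return [struct.unpack(fmt, lnotab[i:i + 1])[0] for i in range(len(lnotab))]


def walk(lnotab, first_line, signed):
    """Accumulate (addr, line) from (addr_increment, line_increment) pairs."""
    values = lnotab_increments(lnotab, signed)
    addr, line, result = 0, first_line, []
    for offset in range(0, len(values), 2):
        addr += values[offset]
        line += values[offset + 1]
        result.append((addr, line))
    return result


# Hypothetical table whose second address increment (200) exceeds 127.
table = bytes([6, 1, 200, 2])

signed_result = walk(table, 10, signed=True)      # 200 is read as -56
unsigned_result = walk(table, 10, signed=False)   # 200 stays 200
```

With the signed reading the accumulated address goes negative, which matches the failure described above.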

Best regards,
Fabien Bouleau



DISCLAIMER: 
This e-mail contains proprietary information some or all of which may be 
legally privileged. It is for the intended recipient only. If an addressing or 
transmission error has misdirected this e-mail, please notify the author by 
replying to this e-mail. If you are not the intended recipient you must not 
use, disclose, distribute, copy, print, or rely on this e-mail.


Re: [Python-Dev] RELEASED Python 3.0 final FFT

2008-12-05 Thread Lambert, David W (ST)
http://code.activestate.com/recipes/576550/ 

This recipe shows how to use gsl FFT with python 3.

ctypes is really good!


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Jean-Paul Calderone

On Thu, 4 Dec 2008 22:05:05 -0800, Guido van Rossum [EMAIL PROTECTED] wrote:

On Thu, Dec 4, 2008 at 9:40 PM,  [EMAIL PROTECTED] wrote:

The default case, the case of the user without the wherewithal
to understand the nuances of the distinction between 2.x and 3.x, is a user
who should use 2.x.


Not at all clear. If they're not sensitive to those nuances it's just
as likely that they're a casual developer (e.g. a student just
learning to program). Such users are unlikely to start using major 3rd
party packages like Twisted or Django, which would be completely
overwhelming to someone just learning.


That seems like it would be right to me, but two or three times a month
someone shows up in the Twisted IRC channel who is learning both Python
and Twisted at the same time.  So apparently there are a lot of people
for whom this isn't overwhelming.

Jean-Paul


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Eduardo O. Padoan
On Fri, Dec 5, 2008 at 12:35 AM, A.M. Kuchling [EMAIL PROTECTED] wrote:
 On Thu, Dec 04, 2008 at 05:29:31PM -0800, Raymond Hettinger wrote:
 Here's a bright idea.  On the 3.0 release page, include a box listing
 which major third-party apps have been converted.  Update it
 once every couple of weeks.  That way, we're not explicitly

 That's an excellent idea.  We could have a webpage, or start a
 topic-specific weblog for posting announcements.

 I've started a draft of a 3.0 FAQ in the wiki at
 http://wiki.python.org/moin/Python3000/FAQ.  Once it's finished we
 can move it into the 3.0 release pages.  Everyone please edit and
 improve it!

Sometime ago I started a page on the wiki to collect reports of early
migrations by the community:
http://wiki.python.org/moin/Early2to3Migrations

Maybe this would be relevant to point on the FAQ.

 --amk




-- 
Eduardo de Oliveira Padoan
http://djangopeople.net/edcrypt/
Distrust those in whom the desire to punish is strong. -- Goethe,
Nietzsche, Dostoevsky


Re: [Python-Dev] Fix for frame_setlineno() in frameobject.c function

2008-12-05 Thread Benjamin Peterson
Hi,
Please post this on the issue tracker. http://bugs.python.org

On Fri, Dec 5, 2008 at 4:42 AM,  [EMAIL PROTECTED] wrote:
 Hello,

 This concerns a known bug in the frame_setlineno() function for Python
 2.5.x and 2.6.x (maybe in earlier versions too). It is not possible to use
 this function when the address or line offset is greater than 127. The
 problem comes from the lnotab variable, which is typed char* and is therefore
 implicitly signed char* (on platforms where plain char is signed). Any value
 above 127 becomes a negative number.

 The fix is very simple (applied on the Python 2.6.1 version of the source
 code):

 --- frameobject.c   Thu Oct 02 19:39:50 2008
 +++ frameobject_fixed.c Fri Dec 05 11:27:42 2008
 @@ -119,8 +119,8 @@
         line = f->f_code->co_firstlineno;
         new_lasti = -1;
         for (offset = 0; offset < lnotab_len; offset += 2) {
 -               addr += lnotab[offset];
 -               line += lnotab[offset+1];
 +               addr += ((unsigned char*)lnotab)[offset];
 +               line += ((unsigned char*)lnotab)[offset+1];
                 if (line >= new_lineno) {
                         new_lasti = addr;
                         new_lineno = line;





-- 
Cheers,
Benjamin Peterson
There's nothing quite as beautiful as an oboe... except a chicken
stuck in a vacuum cleaner.


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread James Y Knight

On Dec 5, 2008, at 5:27 AM, Ulrich Eckhardt wrote:
 Using the byte variant is equally fubar, because e.g. on MS Windows it is
 not supported, except through a very lossy roundtrip through the locale's
 codepage, limiting your functionality.



Yeah, IMO the whole mess could have been avoided by keeping the
filename/args/environ simply *bytes*, as they really are on unix. Then, make
the Windows version of python use (always! not dependent upon locale!)
utf-8 to decode the utf-8 bytestring to the UTF-16 that the Windows
platform APIs expect (and vice versa). And never use the ASCII variant
of the windows APIs.


This would mean that all *inputs* would succeed, but some *outputs*  
would not, on Windows. But that's not a new kind of failure: NUL has  
never been allowed in argv/environ, and filenames have all sorts of  
platform-dependent restrictions.
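
The conversion described here can be sketched as two small helpers. Both function names are hypothetical, nothing below calls a real Windows API, and unpaired surrogates in the UTF-16 input are ignored for simplicity.

```python
def to_windows_wide(name):
    """Hypothetical: bytes name (UTF-8) -> UTF-16-LE for a wide-character API."""
    if b"\x00" in name:
        raise ValueError("NUL is not allowed in file names, argv, or environ")
    # Invalid UTF-8 raises UnicodeDecodeError: an *output* that cannot succeed.
    return name.decode("utf-8").encode("utf-16-le")


def from_windows_wide(wide):
    """Hypothetical: UTF-16-LE from the platform -> bytes name (UTF-8)."""
    return wide.decode("utf-16-le").encode("utf-8")


def outcome(name):
    """Return 'ok' or the name of the exception raised for this output."""
    try:
        to_windows_wide(name)
        return "ok"
    except ValueError as exc:   # UnicodeDecodeError is a ValueError subclass
        return type(exc).__name__


name = "caf\u00e9.txt".encode("utf-8")
round_trip = from_windows_wide(to_windows_wide(name)) == name
results = [outcome(name), outcome(b"\xff.txt"), outcome(b"a\x00b")]
```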


But unfortunately, it's too late for that solution...

James


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Terry Reedy wrote:
 Toshio Kuratomi wrote:

 I would think life would be ultimately easier if either the file server
 or the shell server automatically translated file names from jis and
 utf8 and back, so that the PATH on the *nix shell server is entirely
 utf8.

 This is not possible because no part of the computer knows what the
 encoding is.  To the computer, it's just a sequence of bytes.  Unlike
 xml or the windows filesystem (winfs? ntfs?) where the encoding is
 specified as part of the document/filesystem there's nothing to tell
 what encoding the filenames are in.
 
 I thought you said that the file server keeps all filenames in shift-jis,
 and the shell server all in utf-8.

Yes.  But this is part of the setup of the example to keep things
simple.  The fileserver or shell server could themselves be of mixed
encodings (for instance, if it was serving home directories to users all
over the world each user might be using a different encoding.)

  If so, then the shell server could
 know if it were told so.
 

Where are you going to store that information?  In order for python to
run without errors, will it have to be configured on each system it's
installed on to know the encoding of each filename?  Or are we going to
try to talk each *NIX vendor into creating new filesystems that record
that information and after a five year span of time declare that python
will not run on other filesystems in corner cases?

I think that this way does not hold a reasonable expectation of keeping
python a portable language.
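
The point that a file name is "just a sequence of bytes" can be illustrated with one name in two encodings. This is a sketch; the sample name is arbitrary and the variable names are invented.

```python
# Three characters meaning "Japanese language", written with escapes.
name = "\u65e5\u672c\u8a9e"

sjis_bytes = name.encode("shift_jis")   # what a shift-jis file server stores
utf8_bytes = name.encode("utf-8")       # what a utf-8 shell server expects

# Nothing in the bytes says which encoding produced them. Guessing
# UTF-8 fails outright; guessing latin-1 "succeeds" with wrong text.
try:
    sjis_bytes.decode("utf-8")
    utf8_guess_failed = False
except UnicodeDecodeError:
    utf8_guess_failed = True

latin1_guess = sjis_bytes.decode("latin-1")
```

Only out-of-band knowledge of the encoding recovers the original name, which is the information the thread says the system does not record.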

-Toshio





[Python-Dev] Python security: draft article on the wiki

2008-12-05 Thread Victor Stinner
Hi,

I started to write a short article about Python security on the wiki:

   http://wiki.python.org/moin/Security

Nothing useful yet.

-- 
Victor Stinner aka haypo
http://www.haypocalc.com/blog/


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread skip

    Martin> There is. There have been the following trove classifiers
    Martin> defined for a few weeks now:

    Martin> Programming Language :: Python :: 2
    Martin> Programming Language :: Python :: 2.3
    Martin> Programming Language :: Python :: 2.4
    Martin> Programming Language :: Python :: 2.5
    Martin> Programming Language :: Python :: 2.6
    Martin> Programming Language :: Python :: 2.7
    Martin> Programming Language :: Python :: 3
    Martin> Programming Language :: Python :: 3.0
    Martin> Programming Language :: Python :: 3.1

Good.  Now we just need to populate them.  I take it the classifiers without
minor numbers imply any known minor version (e.g., 2 == 2.3 and greater)?
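
Populating the classifiers means listing them in a package's metadata. The sketch below shows such a list and a small helper that extracts the declared versions; the package's choice of versions and the helper name are invented, and the sketch takes no position on what a bare major version implies.

```python
# Hypothetical classifier list for a package that supports 2.6 and 3.0.
CLASSIFIERS = [
    "Programming Language :: Python :: 2",
    "Programming Language :: Python :: 2.6",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.0",
]

PREFIX = "Programming Language :: Python :: "


def declared_versions(classifiers):
    """Return the version part of every Python-version classifier."""
    return [c[len(PREFIX):] for c in classifiers if c.startswith(PREFIX)]


versions = declared_versions(CLASSIFIERS)
```

A list like this would typically be passed as the classifiers argument of setup() in a package's setup.py.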

Skip


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread A.M. Kuchling
On Fri, Dec 05, 2008 at 05:40:46AM -, [EMAIL PROTECTED] wrote:
 For most users, especially new users who have yet to be impressed with  
 Python's power, 2.x is much better.  It's not like library support is  
 one small check-box on the language's feature sheet: most of the  
 attractive things about Python are libraries.  Of course I am not free  

Here I agree, sort of.  Newbies may not understand what they're giving
up in terms of libraries.  (The 'sort of' is because, having learned
3.0, learning the changes for 2.6 is certainly much easier than
learning a first programming language is.)

 The third (albeit much less likely) option is that you're learning  
 Python to learn to interact with a system that's scriptable in embedded  
 Python, like Blender or Gimp.  I don't think there's a single system of  
 that variety which uses 3.0 yet, and these will likely be even slower to  
 move than libraries.  

Let me note that if some application embeds Python for a specialized
purpose, where the only modules imported are either user-written or
part of the application, it seems much *easier* to move to Python 3
because the scripts don't use arbitrary third-party libraries.  Python
embedded in an e-mail MTA might use libraries for DNS or file I/O or
databases and has to be cautious about versions; Python in Gimp
probably doesn't, in practice.

--amk


Re: [Python-Dev] Python + Java Integration

2008-12-05 Thread Bill Janssen
 One thing that would help Python in this debate (or, perhaps simply  
 put it in the running, at least as a next Java candidate) would be  
 if Python had an easier migration path for Java developers that  
 currently rely upon various third-party libraries.  The wealth of  
 third-party libraries available for Java has always been one of its  
 great strengths.  Ergo, if Python had an easy-to-use, recommended way  
 to use those libraries within the Python environment, that would be a  
 significant advantage to present to Java developers and those who  
 would choose Ruby over Java.  Platform compatibility is always a huge  
 motivator for those looking to migrate or upgrade.

Personally, I'm using Andi Vajda's JCC for this purpose.  Recommended.
The nice thing about it is that it turns jar files into Python modules;
you don't need the source.

http://pypi.python.org/pypi/JCC

Bill


[Python-Dev] Summary of Python tracker Issues

2008-12-05 Thread Python tracker

ACTIVITY SUMMARY (11/28/08 - 12/05/08)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 2233 open (+55) / 14139 closed (+41) / 16372 total (+96)

Open issues with patches:   753

Average duration of open issues: 705 days.
Median duration of open issues: 2193 days.

Open Issues Breakdown
   open     2214 (+54)
pending       19 ( +1)

Issues Created Or Reopened (96)
___

Coding cookie crashes IDLE                                        11/28/08
CLOSED http://bugs.python.org/issue4454    created  tjreedy

No Windows List in IDLE if several windows have the same title    11/28/08
CLOSED http://bugs.python.org/issue4455    created  amaury.forgeotdarc
       patch

xmlrpc is broken                                                  11/28/08
CLOSED http://bugs.python.org/issue4456    created  benjamin.peterson

__import__ documentation obsolete                                 11/29/08
       http://bugs.python.org/issue4457    created  stevenjd

getopt.gnu_getopt() loses dash argument                           11/29/08
CLOSED http://bugs.python.org/issue4458    created  muntyan

bdist_rpm assumes python                                          11/29/08
       http://bugs.python.org/issue4459    created  John5342

The parameter of PyInt_AsSsize_t() is not checked to see if it i  11/29/08
CLOSED http://bugs.python.org/issue4460    created  CWRU_Researcher1

parameters of PyLong_FromString() are not checked for NULL        11/29/08
       http://bugs.python.org/issue4461    created  CWRU_Researcher1
       patch

result of PyList_GetItem() not validated                          11/29/08
CLOSED http://bugs.python.org/issue4462    created  CWRU_Researcher1

Parameters and result of PyList_GetItem() are not validated       11/29/08
CLOSED http://bugs.python.org/issue4463    created  CWRU_Researcher1

PyList_GetItem() result and parameters not fully validated        11/29/08
CLOSED http://bugs.python.org/issue4464    created  CWRU_Researcher1

The result of set_copy() is not checked for NULL                  11/29/08
CLOSED http://bugs.python.org/issue4465    created  CWRU_Researcher1

The return value of PyFile_FromFile is not checked for NULL       11/29/08
CLOSED http://bugs.python.org/issue4466    created  CWRU_Researcher1

return value of PyUnicode_AsEncodedString() is not checked for N  11/29/08
CLOSED http://bugs.python.org/issue4467    created  CWRU_Researcher1

Restore chapter enumeration in Python docs                        11/30/08
CLOSED http://bugs.python.org/issue4468    created  schluehk

CVE-2008-5031 multiple integer overflows                          11/30/08
       http://bugs.python.org/issue4469    created  doko

smtplib SMTP_SSL not working.                                     11/30/08
       http://bugs.python.org/issue4470    created  lcatucci
       patch

IMAP4 missing support for starttls                                11/30/08
       http://bugs.python.org/issue4471    created  lcatucci
       patch

Is shared lib building broken on trunk?                           11/30/08
       http://bugs.python.org/issue4472    created  skip.montanaro

POP3 missing support for starttls 

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Victor Stinner wrote:
 Hi,
 
 On Thursday 04 December 2008 21:02:19, Toshio Kuratomi wrote:
 
 These mixed encodings can occur for a variety of reasons.  Here's an
 example that isn't too contrived :-)
 (...)
 Furthermore, they don't want to suffer from the space loss of using 
 utf-8 to encode Japanese so they use shift-jis everywhere.
 
 space loss? Really? If you configure your server correctly, you should get 
 UTF-8 even if the file system is Shift-JIS. But it would be much easier to 
 use UTF-8 everywhere.
 
 Hum... I don't think that the discussion is about one specific server, but
 the lack of bytes environment variables in Python3 :-)

Yep.  I can't change the logicalness of the policies of a different
organization, only code my application to deal with it :-)

 1) return mixed unicode and byte types in ...
 
 NO!
 
It's nice that we agree... but I would prefer if you leave enough
context so that others can see that we agree as well :-)

 2) return only byte types in os.environ
 
 Hum... Most users have UTF-8 everywhere (eg. all Windows users ;-)), and 
 Python3 already uses Unicode everywhere (input(), open(), filenames, ...).

We're also in agreement here.

 3) silently ignore non-decodable value when accessing os.environ['PATH']
 as we do now but allow access to the full information via
 os.environ[b'PATH'] and os.getenvb()
 
 I don't like os.environ[b'PATH']. I prefer to always get the same result 
 type... But os.listdir() doesn't respect that :-(
 
os.listdir(str) -> list of str
os.listdir(bytes) -> list of bytes
 
 I would prefer a similar API for easier migration from Python2/Python3
 (unicode). os.environb sounds like the best choice for me.
 
nod.  After thinking about how it would be used in subprocess calls I
agree.  os.environb would allow us to retrieve the full dict as bytes.
os.environ[b''] only works on individual keys.  Also os.getenv serves
the same purpose as os.environ[b''] would whereas os.environb would have
 its own uses.
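
For illustration only, here is a sketch of the two access styles side by
side. It assumes Python 3.2 or later, where os.environb,
os.supports_bytes_environ and os.fsencode() exist; none of them is
available in 3.0, and the helper names are hypothetical:

```python
import os


def list_both_ways(path="."):
    """List a directory twice: once as str names, once as bytes names."""
    text_names = os.listdir(path)                # str in, list of str out
    byte_names = os.listdir(os.fsencode(path))   # bytes in, list of bytes out
    return text_names, byte_names


def environ_as_bytes():
    """Return the whole environment as a bytes dict."""
    if os.supports_bytes_environ:                # False on Windows
        return dict(os.environb)
    # Fallback where there is no native bytes environment.
    return {os.fsencode(k): os.fsencode(v) for k, v in os.environ.items()}
```

The second helper is the "full dict as bytes" case that matters for
subprocess calls: the result can be handed on without decoding anything.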

 
 But they are open questions (already asked in the bug tracker):
 
I answered these in the bug tracker.  Here are the answers for the
mailing list:

 (a) Should os.environ be updated if os.environb is changed? If yes, how?
os.environb['PATH'] = '\xff' (or any invalid string in the system 
  default encoding)
=> os.environ['PATH'] = ???
 
The underlying environment that both variables reflect should be updated
but what is displayed by os.environ should continue to follow the same
rules.  So if we follow option #3::
 os.environb['PATH'] = b'\xff'
 os.environ['PATH']  => raises KeyError because PATH is not a key in
 the unicode decoded environment.

(option #4 would issue a UnicodeDecodeError instead of a KeyError)

Similarly, if you start with a variable in os.environb that can only be
represented as bytes and your program transforms it into something that
is decodable it should then show up in os.environ.

 (b) Should os.environb be updated if os.environ is changed? If yes, how?
 
 The problem comes with non-Unicode locales (eg. latin-1 or ASCII): most
 charsets are unable to encode the whole Unicode charset (eg. codes >= 65535).
 
os.environ['PATH'] = chr(0x1)
=> os.environb['PATH'] = ???

Ah, this is a good question.  I misunderstood what you were getting at
when you posted this to the bug report.  I see several options but the
one that seems the most sane is to raise UnicodeEncodeError when setting
the value.  With that, proper code to set an environment variable might
look like this::

    $ LANG=C python3.0
    >>> import os
    >>> variable = chr(0x1)
    >>> try:
    ...     # Unicode aware locales
    ...     os.environ['MYVAR'] = variable
    ... except UnicodeEncodeError:
    ...     # Non-Unicode locales
    ...     os.environb['MYVAR'] = bytes(variable, encoding='utf8')

 (c) Same question when a key is deleted (del os.environ['PATH']).
 
Update the underlying env so both os.environ and os.environb reflect the
change.  Deleting should not hold the problems that updating does.

 If Python 3.1 has both os.environ and os.environb, I'm quite sure that some
 modules will use os.environ and others will prefer os.environb. If the two
 environments are different, the two sets of modules will behave differently :-/
 
Exactly.  So making sure they hold the same information is a priority.
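
The answers to (a) through (c) can be modelled with a single bytes store
and a text view on top of it. This is a toy sketch of option #3 only, not
how the os module is implemented; the class and method names are
hypothetical:

```python
class SharedEnv:
    """Toy model: one bytes store, read either raw or as decoded text."""

    def __init__(self, encoding="ascii"):
        self.encoding = encoding
        self.raw = {}  # bytes -> bytes, stands in for the real environment

    def get_text(self, key):
        value = self.raw[key.encode(self.encoding)]  # KeyError if absent
        try:
            return value.decode(self.encoding)
        except UnicodeDecodeError:
            # Option #3: an undecodable value looks like a missing key.
            raise KeyError(key) from None

    def set_text(self, key, value):
        # Encode first, so a UnicodeEncodeError leaves the store untouched.
        encoded = value.encode(self.encoding)
        self.raw[key.encode(self.encoding)] = encoded

    def delete(self, key):
        # Deleting changes the one underlying store, so both views agree.
        del self.raw[key.encode(self.encoding)]
```

Under option #4 the except branch would re-raise the UnicodeDecodeError
instead of turning it into a KeyError.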

 It would be maybe easier if os.environ supports bytes and unicode keys. But
 we have to keep these assertions:
os.environ[bytes] -> bytes
os.environ[str] -> str
 
I think the same choices have to be made here.  If LANG=C, we still have
to decide what to do when os.environ[str] is set to a non-ASCII string.

Additionally, the subprocess question makes using the key value
undesirable compared with having a separate os.environb that accesses
the same underlying data.

 4) raise an exception when non-decodable values are *accessed* and
 continue as in #3.
 
 I like os.listdir() behaviour: just *ignore* non-decodable files. If you 
 really want to access these 

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Guido van Rossum
On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt [EMAIL PROTECTED] wrote:
 Seriously, what would you suggest to someone that
 wants to handle paths in a portable way? Using the Unicode variants of
 functions is fubar, because encoding/decoding is not universally possible.
 Using the byte variant is equally fubar, because e.g. on MS Windows it is not
 supported, except through a very lossy roundtrip through the locale's
 codepage, limiting your functionality.

Write a lightweight abstraction layer that uses Unicode when possible
and bytes otherwise. You'd need to write a few functions for the path
handling code you need, with a platform check or two sprinkled in.

Writing such an abstraction for the purpose of one specific
application is usually simple enough. However, writing a similar
abstraction that serves all apps and all use cases is hard. I hope
that eventually someone will come up with one though -- the failure of
earlier path object proposals notwithstanding.
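
A per-application layer of this kind can be very small. The sketch below
is only an illustration, assuming Python 3; the helper name text_or_bytes
is hypothetical and not part of any library:

```python
import sys


def text_or_bytes(raw, encoding=None):
    """Return raw decoded to str if possible, else the original bytes."""
    encoding = encoding or sys.getfilesystemencoding()
    try:
        return raw.decode(encoding)  # strict: no silent replacement
    except UnicodeDecodeError:
        return raw                   # caller must cope with bytes here
```

Callers then need an isinstance() check wherever the distinction matters,
which is exactly why a general-purpose version is much harder to design.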

 I actually think it is about time to give up on trying to think about a path
 as a string. Ditto for data received from os.environ or sys.argv. There are
 only very few things that are universal to them and a reliable encoding is
 none of them. Then, once you have let that idea go, meditate a bit over the
 Zen.

This sounds too pessimistic to me. I expect that in five years it will
be universally accepted that these variables must be encoded in a
standard encoding. People are never going to give up thinking about
filenames etc. as strings, because that's what they are conceptually.
The problem is purely one of encoding, and that's where Unix/Linux are
behind the curve, since (so far) they haven't taken the plunge and
picked a universal standard encoding, the way Windows and Mac OS X
have done.

 What I propose is that paths must be treated as OS-specific, with the only
 common reliable operations being joining them, concatenating them and
 splitting them into segments divided by the (again, OS-specific) separator.
 Other operations, like e.g. appending a string or converting it to a string
 in order to display it can fail. And if they fail, they should fail noisily.

That's bad though, since filenames are being displayed all the time
(e.g. in error messages).

 In 99% of all cases, using the default encoding will work and do what people
 expect, which is why I would make this conversion automatic. In all other
 cases, it will at least not fail silently (which would lead to garbage and
 data loss) and allow more sophisticated applications to handle it.

I think the "always fail noisily" approach isn't the best approach.
E.g. if I am globbing for *.py, and there's an undecodable .txt file
in a directory, its presence shouldn't cause the glob to fail.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Ted Leung

On Dec 4, 2008, at 7:59 PM, [EMAIL PROTECTED] wrote:



On 02:35 am, [EMAIL PROTECTED] wrote:

On Thu, Dec 04, 2008 at 05:29:31PM -0800, Raymond Hettinger wrote:
Here's a bright idea.  On the 3.0 release page, include a box  
listing

which major third-party apps have been converted.  Update it
once every couple of weeks.  That way, we're not explicitly


That's an excellent idea.  We could have a webpage, or start a
topic-specific weblog for posting announcements.

I've started a draft of a 3.0 FAQ in the wiki at
http://wiki.python.org/moin/Python3000/FAQ.  Once it's finished we
can move it into the 3.0 release pages.  Everyone please edit and
improve it!


It occurs to me that this specific idea (the box with the list of  
supported applications / libraries) should be implementable as a  
simple query against PyPI.  I don't know if it actually is :), but  
it should be.  In general it would be nice to know whether one's  
favorite tools were available for *any* new Python version.


I agree with this.   Plus it might act as an incentive for people to  
port libraries faster...


Ted


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Guido van Rossum
On Thu, Dec 4, 2008 at 11:27 PM,  [EMAIL PROTECTED] wrote:
 With all due respect, for me, library support and serious use are
 synonymous.

Glyph, I cannot have a discussion with you if every single post of
yours is longer than my combined daily output. Please spend some time
writing shorter posts. I'm sure I'm not the only one here with a short
attention span. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Fred Drake

On Dec 5, 2008, at 10:25 AM, [EMAIL PROTECTED] wrote:
Good.  Now we just need to populate them.  I take it the classifiers  
without
minor numbers imply any known minor version (e.g., 2 == 2.3 and  
greater)?



This is an excellent question, Skip.

There was already Programming Language :: Python, provided by many  
packages.  I think version compatibility relationships meant by each  
of these classifiers should be made explicit, wherever it is that  
documentation for classifiers is provided.


I don't recall having seen any such documentation; hopefully I just  
need to be hit by another clue.



  -Fred

--
Fred Drake   fdrake at acm.org



[Python-Dev] __import__ docs follow-up

2008-12-05 Thread Georg Brandl
Hi,

as a follow-up to the thread a few days ago, and the bug report, I've
rewritten most of the __import__ docs.  I've attached the suggested patch
to the issue http://bugs.python.org/issue4457.

I'd be glad for reviews. Also, I'd like to ask for opinions on whether this
"winning idiom" (as a bug comment calls it) should be in the docs instead of
the getattr() helper function:

 import sys
 __import__('x.y.z')
 mod = sys.modules['x.y.z']
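
For comparison, a short sketch of what the idiom works around, using the
standard-library package xml.dom as a stand-in for 'x.y.z'.
importlib.import_module() (available from Python 2.7 and 3.1) is shown
alongside; whether it belongs in these docs is an assumption, not
something decided in this thread:

```python
import importlib
import sys

# __import__ with a dotted name returns the top-level package ...
top = __import__('xml.dom')

# ... so the submodule itself has to be fetched from sys.modules.
mod = sys.modules['xml.dom']

# importlib.import_module() returns the submodule directly.
same = importlib.import_module('xml.dom')
```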

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



[Python-Dev] ANN: new python-porting mailing list

2008-12-05 Thread Georg Brandl
Hi all,

to facilitate discussion about porting Python code between different versions
(mainly of course from 2.x to 3.x), we've created a new mailing list

   [EMAIL PROTECTED]

It is a public mailing list open to everyone.  We expect active participation
of many people porting their libraries/programs, and hope that the list can
be a help to all wanting to go this (not always smooth :-) way.

@python-dev: it would of course be nice to have more than a few developers
on that list ;-)

regards,
Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] Merging flow

2008-12-05 Thread Mark Dickinson
On Thu, Dec 4, 2008 at 3:12 PM, Christian Heimes [EMAIL PROTECTED] wrote:
 Flow diagram
 

 trunk ---> release26-maint
   \->  py3k   ---> release30-maint


I'm running into problems making this work, with a trivial change:
I committed r67590 (which adds a single assert to ast.c) to the
trunk, then merged to 2.6 and py3k in r67592 and r67595 respectively.
Then I tried:

../svnmerge.py merge -r67595

from the root directory of a clean copy of the release30-maint
branch (svn status gives no output), and got conflicts on '.':

property 'svnmerge-integrated' set on '.'

property 'svnmerge-blocked' set on '.'

--- Merging r67595 into '.':
U    Python/ast.c
 C   .

property 'svnmerge-integrated' set on '.'

property 'svnmerge-blocked' deleted from '.'.

I now have a new file dir_conflicts.prej that looks something like:

Trying to change property 'svnmerge-integrated' from
'/python/trunk:1-61437,...,67528,67590', but property has been locally
changed from
'/python/branches/py3k:1-67498,67522-67524,67539,67541,67559,67588' to
'/python/trunk:1-61437,...,67467,67484,67528'.

(where the ... abbreviates a big long list of revision numbers).

Did I mess up somewhere, or does svnmerge not work on
a revision that was itself the result of an svnmerge?

Mark


Re: [Python-Dev] ANN: new python-porting mailing list

2008-12-05 Thread Brett Cannon
On Fri, Dec 5, 2008 at 10:36, Georg Brandl [EMAIL PROTECTED] wrote:
 Hi all,

 to facilitate discussion about porting Python code between different versions
 (mainly of course from 2.x to 3.x), we've created a new mailing list

   [EMAIL PROTECTED]

 It is a public mailing list open to everyone.  We expect active participation
 of many people porting their libraries/programs, and hope that the list can
 be a help to all wanting to go this (not always smooth :-) way.


The mailing list URL is
http://mail.python.org/mailman/listinfo/python-porting for those who
don't want to search on the mail.python.org home page (which looks
really dated at this point).

-Brett


Re: [Python-Dev] Merging flow

2008-12-05 Thread Brett Cannon
On Fri, Dec 5, 2008 at 11:20, Mark Dickinson [EMAIL PROTECTED] wrote:
 On Thu, Dec 4, 2008 at 3:12 PM, Christian Heimes [EMAIL PROTECTED] wrote:
 Flow diagram
 

 trunk ---> release26-maint
   \->  py3k   ---> release30-maint


 I'm running into problems making this work, with a trivial change:
 I committed r67590 (which adds a single assert to ast.c) to the
 trunk, then merged to 2.6 and py3k in r67592 and r67595 respectively.
 Then I tried:

 ../svnmerge.py merge -r67595

 from the root directory of a clean copy of the release30-maint
 branch (svn status gives no output), and got conflicts on '.':

 property 'svnmerge-integrated' set on '.'

 property 'svnmerge-blocked' set on '.'

 --- Merging r67595 into '.':
 U    Python/ast.c
  C   .

 property 'svnmerge-integrated' set on '.'

 property 'svnmerge-blocked' deleted from '.'.

 I now have a new file dir_conflicts.prej that looks something like:

 Trying to change property 'svnmerge-integrated' from
 '/python/trunk:1-61437,...,67528,67590', but property has been locally
 changed from
 '/python/branches/py3k:1-67498,67522-67524,67539,67541,67559,67588' to
 '/python/trunk:1-61437,...,67467,67484,67528'.

 (where the ... abbreviates a big long list of revision numbers).

 Did I mess up somewhere, or does svnmerge not work on
 a revision that was itself the result of an svnmerge?

Someone might know better than me, but I am willing to bet you can't
svnmerge a svnmerge revision, since the svnmerge revision contains
changes to the metadata on '.' that will conflict with the new svnmerge
values that the svnmerge you are trying to do would write. But if I am
right about this, won't that require blocking the py3k svnmerge
revision on release30-maint?

Ugh. Is this getting to the point that we can only svnmerge between
trunk and py3k, and the maintenance branches just have to be managed
the old-fashioned way?

And I have pinged the people helping me with the DVCS PEP in hopes of
getting us moved off of svn sooner rather than later.

-Brett


Re: [Python-Dev] Merging flow

2008-12-05 Thread Fred Drake

On Dec 5, 2008, at 2:20 PM, Mark Dickinson wrote:

Did I mess up somewhere, or does svnmerge not work on
a revision that was itself the result of an svnmerge?


I ran into this yesterday as well with my patch to the cgi module.   
The work-around was to revert the change to that property and edit it  
manually.


I think this is a significant issue, since editing that property is  
about as error-prone as it can be.  I've not really looked at the code  
in svnmerge.py, so I'm not sure how hard it would be to fix.



  -Fred

--
Fred Drake   fdrake at acm.org



Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Gregor Lingl



[EMAIL PROTECTED] wrote:


To be fair, if someone asked me specifically about educating non- 
programmer adults about programming, I would probably at least 
*mention* py3, if not recommend it outright.  The improved consistency 
is worth a lot in an educational setting.  (But, if one is educating 
children and interested in soliciting their genuine enthusiasm, 
whiz-bang graphics are really a must-have, not a negotiable extra.)

As a non-native English speaker I'm not sure I understand correctly
what you mean by whiz-bang graphics. Nevertheless I'd like to point
you to the new turtle graphics module (which has been part of the standard
library since 2.6). At least it was designed especially for use in the
educational domain. Moreover, the source distribution also contains a
bunch of some ten example scripts.


Regards,
Gregor



Re: [Python-Dev] ANN: new python-porting mailing list

2008-12-05 Thread skip

Georg[EMAIL PROTECTED]

Georg It is a public mailing list open to everyone.  We expect active
Georg participation of many people porting their libraries/programs,
Georg and hope that the list can be a help to all wanting to go this
Georg (not always smooth :-) way.

I trust you will announce this in python-list and python-announce-list if
you haven't already?

Skip


Re: [Python-Dev] ANN: new python-porting mailing list

2008-12-05 Thread Georg Brandl
[EMAIL PROTECTED] wrote:
 Georg[EMAIL PROTECTED]
 
 Georg It is a public mailing list open to everyone.  We expect active
 Georg participation of many people porting their libraries/programs,
 Georg and hope that the list can be a help to all wanting to go this
 Georg (not always smooth :-) way.
 
 I trust you will announce this in python-list and python-announce-list if
 you haven't already?

I've sent it to python-announce, it's in the moderator queue.  I'm not on
python-list so I can't answer followups.  If you'd like to do an
announcement there, I'd be happy :)

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Mike Klaas


On 5-Dec-08, at 8:40 AM, A.M. Kuchling wrote:


On Fri, Dec 05, 2008 at 05:40:46AM -, [EMAIL PROTECTED] wrote:
For most users, especially new users who have yet to be impressed  
with
Python's power, 2.x is much better.  It's not like library  
support is

one small check-box on the language's feature sheet: most of the
attractive things about Python are libraries.  Of course I am not  
free


Here I agree, sort of.  Newbies may not understand what they're giving
up in terms of libraries.  (The 'sort of' is because, having learned
3.0, learning the changes for 2.6 is certainly much easier than
learning a first programming language is.)


For possible insight, here is a current discussion on the topic:

http://www.reddit.com/r/programming/comments/7hlra/ask_progit_ive_got_the_itch_to_learn_python_since/

(note that these would be programmers interested in learning python,  
not people trying to learn programming)


-Mike


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Guido van Rossum wrote:
 On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt [EMAIL PROTECTED] wrote:
 In 99% of all cases, using the default encoding will work and do what people
 expect, which is why I would make this conversion automatic. In all other
 cases, it will at least not fail silently (which would lead to garbage and
 data loss) and allow more sophisticated applications to handle it.
 
 I think the always fail noisily approach isn't the best approach.
 E.g. if I am globbing for *.py, and there's an undecodable .txt file
 in a directory, its presence shouldn't cause the glob to fail.
 
But why should it make glob() fail?  This sounds like an implementation
detail of glob.  Here's some pseudo-code::

    def glob(pattern):
        string = False
        if isinstance(pattern, str):
            string = True
            if platform == 'POSIX':
                pattern = bytes(pattern, encoding=defaultencoding)
        rawfiles = os.listdir(os.path.dirname(pattern) or pattern)
        if string and platform == 'POSIX':
            return [str(f) for f in rawfiles if match(f, pattern)]
        else:
            return rawfiles

This way the traceback occurs if anything in the result set is
undecodable.  What am I missing?
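
The match-first, decode-afterwards idea in the pseudo-code can be written
as a runnable sketch. It assumes the raw names are already available as
bytes; the function name glob_names and its default encoding are
hypothetical, and fnmatch.fnmatchcase() is used because it accepts bytes
for both arguments:

```python
import fnmatch


def glob_names(raw_names, pattern, encoding="utf-8"):
    """Match bytes names against a str pattern, then decode the matches."""
    raw_pattern = pattern.encode(encoding)
    matches = [n for n in raw_names if fnmatch.fnmatchcase(n, raw_pattern)]
    # Strict decoding: only an undecodable *match* raises an error.
    return [n.decode(encoding) for n in matches]
```

With this ordering an undecodable .txt file never reaches the decode step
when the pattern is *.py, while an undecodable .py file still raises
UnicodeDecodeError.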

-Toshio





Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Guido van Rossum
On Fri, Dec 5, 2008 at 12:05 PM, Toshio Kuratomi [EMAIL PROTECTED] wrote:
 Guido van Rossum wrote:
 On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt [EMAIL PROTECTED] wrote:
 In 99% of all cases, using the default encoding will work and do what people
 expect, which is why I would make this conversion automatic. In all other
 cases, it will at least not fail silently (which would lead to garbage and
 data loss) and allow more sophisticated applications to handle it.

 I think the always fail noisily approach isn't the best approach.
 E.g. if I am globbing for *.py, and there's an undecodable .txt file
 in a directory, its presence shouldn't cause the glob to fail.

 But why should it make glob() fail?  This sounds like an implementation
 detail of glob.

Glob was just an example. Many use cases for directory traversal
couldn't care less if they see *all* files.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Guido van Rossum wrote:
 Glob was just an example. Many use cases for directory traversal
 couldn't care less if they see *all* files.
 
Okay.  Makes it harder to prove correct or not if I don't know what the
use case is :-)  I can't think of a single use case off-hand.

Even your example of a ??.txt file making retrieval of *.py files fail
is a little broken.  If there was a ??.py file that was undecodable the
program would most likely want to know that file existed.

-Toshio





Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Tres Seaver
Gregor Lingl wrote:
 
 [EMAIL PROTECTED] wrote:
 To be fair, if someone asked me specifically about educating non- 
 programmer adults about programming, I would probably at least 
 *mention* py3, if not recommend it outright.  The improved consistency 
 is worth a lot in an educational setting.  (But, if one is educating 
 children and interested in soliciting their genuine enthusiasm, 
 whiz-bang graphics are really a must-have, not a negotiable extra.)
 As a non-native English speaker I'm not sure I understand correctly
 what you mean by whiz-bang graphics. Nevertheless I'd like to point
 you to the new turtle graphics module (which has been part of the standard
 library since 2.6). At least it was designed especially for use in the
 educational domain. Moreover, the source distribution also contains a
 bunch of some ten example scripts.

I'm pretty sure he meant that turtle graphics are not whiz-bang (in this
century, at least).  Being able to do pygame-style OpenGL stuff would be
whiz-bang[1] in my book.


[1] http://www.merriam-webster.com/dictionary/whizbang


Tres.
- --
===
Tres Seaver  +1 540-429-0999  [EMAIL PROTECTED]
Palladion Software   Excellence by Designhttp://palladion.com



Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Guido van Rossum wrote:
 At the risk of bringing up something that was already rejected, let me
 propose something that follows the path taken in 3.0 for filenames,
 rather than doubling back:
 
 For os.environ, os.getenv() and os.putenv(), I think a similar
 approach as used for os.listdir() and os.getcwd() makes sense: let
 os.environ skip variables whose name or value is undecodable, and have
 a separate os.environb() which contains bytes; let os.getenv() and
 os.putenv() do the right thing when the arguments passed in are bytes.
 
I prefer the method used by file.read() where an error is thrown when
accessing undecodable data.  I think in time python programmers will
consider not throwing an exception a wart in python3.  However, this is
enough to allow programmers to do the right thing once an error is
reported by users and the cause has been tracked down so it doesn't
block fixing errors as the current code does.

And it's not like anyone expected python3 to be wart-free just because
the python2 warts were fixed ;-)

 For sys.argv, because it's positional, you can't skip undecodable
 values, so I propose to use error=replace for the decoding; again, we
 can add sys.argvb that contains the raw bytes values. The various
 os.exec*() and os.spawn*() calls (as well as os.system(), os.popen()
 and the subprocess module) should all accept bytes as well as strings.
 
This also seems sane with the same comment about throwing errors.
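
To make the trade-off concrete, here is a small sketch of what replacement
decoding does to an undecodable argument. The function decode_args is
hypothetical; errors="replace" is the standard keyword of bytes.decode():

```python
def decode_args(raw_args, encoding="utf-8"):
    """Decode raw argv entries, substituting U+FFFD for undecodable bytes."""
    return [arg.decode(encoding, errors="replace") for arg in raw_args]


# The Latin-1 byte 0xE9 is not valid UTF-8 here, so it becomes U+FFFD.
args = decode_args([b"prog", b"caf\xe9"])
```

The positions stay intact, but the original byte is lost in the decoded
list, which is why a separate bytes view of the arguments would still be
needed to recover it.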

-Toshio





Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Victor Stinner
Hi,

  But they are open questions (already asked in the bug tracker):

 I answered these in the bug tracker.  Here are the answers for the
 mailing list:

Oh, sorry. I didn't follow the end of the discussion on the bug tracker.

 os.environb['PATH'] = '\xff'
 = os.environ['PATH'] = ???

  os.environ['PATH'] = raises KeyError because PATH is not a key in
 the unicode decoded environment.

Ok, good answer :-)

 os.environ['PATH'] = chr(0x1)
 = os.environb['PATH'] = ???

 raise UnicodeEncodeError when setting the value.

Ok, it's consistent with the current behaviour.

$ LANG=C ./python
Python 3.0rc3+ (py3k:67498M, Dec  4 2008, 17:45:54)
>>> import os
>>> os.environ['x'] = '\xff'
>>> os.environ['x']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/haypo/prog/py3k/Lib/io.py", line 1491, in write
    b = encoder.encode(s)
  File "/home/haypo/prog/py3k/Lib/encodings/ascii.py", line 22, in encode
    return codecs.ascii_encode(input, self.errors)[0]
UnicodeEncodeError: 'ascii' codec can't encode character '\xff' in position 1: ordinal not in range(128)

Oh, that's strange :-p The error is delayed until we read the value.

  It would be maybe easier if os.environ supports bytes and unicode keys.
  But we have to keep these assertions:
 os.environ[bytes] - bytes
 os.environ[str] - str

 I think the same choices have to be made here.  If LANG=C, we still have
 to decide what to do when os.environ[str] is set to a non-ASCii string.

If the charset is US-ASCII, os.environ will drop non-ASCII values. But most 
variables are ASCII only. Examples with my shell:

$ env
XCURSOR_THEME=kubuntu
LANG=fr_FR.UTF-8
EDITOR=vim
HOME=/home/haypo
...

 Additionally, the subprocess question makes using the key value
 undesirable compared with having a separate os.environb that accesses
 the same underlying data.

The user should be able to choose bytes or unicode. Examples:
 - subprocess.Popen('ls') => use unicode environment (os.environ)
 - subprocess.Popen(b'ls') => use bytes environment (os.environb)

 Here's my problem with it, though.  With these semantics any program
 that works on arbitrary files and runs on *NIX has to check
 os.listdir(b'') and do the conversion manually.

Only programs that have to support a strange environment like yours (mixing 
Shift-JIS and UTF-8) :-) Most programs don't have to support such charset 
mixtures.

We can imagine a higher-level library working on UNIX and Windows (bytes or 
Unicode). But that would come later.

 I think the desired behaviour assuming the existence of a nondecodable
 file is this:

I prefer the current behaviour :-)

 Why do you think that glob.glob('*.py') is special and should not traceback?

It's not special. glob() reuses listdir(), and it was an example to show 
that it just works.

 I just differ in that I think lack of tracebacks when
 UnicodeDecodeErrors are encountered is a wart in python3 that did not
 exist in python2.

Right.

-- 
Victor Stinner aka haypo
http://www.haypocalc.com/blog/


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan
Toshio Kuratomi wrote:
 Guido van Rossum wrote:
 Glob was just an example. Many use cases for directory traversal
 couldn't care less if they see *all* files.

 Okay.  Makes it harder to prove correct or not if I don't know what the
 use case is :-)  I can't think of a single use case off-hand.
 
 Even your example of a ??.txt file making retrieval of *.py files fail
 is a little broken.  If there was a ??.py file that was undecodable the
 program would most likely want to know that file existed.

Why? Most programs won't be able to do anything with it. And if the
program *can* do something with it... that's what the bytes version of
the APIs are for.

Cheers,
Nick.


-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---


Re: [Python-Dev] __import__ docs follow-up

2008-12-05 Thread Nick Coghlan
Georg Brandl wrote:
 Hi,
 
 as a follow-up to the thread a few days ago, and the bug report, I've
 rewritten most of the __import__ docs.  I've attached the suggested patch
 to the issue http://bugs.python.org/issue4457.
 
 I'd be glad for reviews. Also, I'd like to ask about opinions if this
 winning idiom (as a bug comment states) should be in it, instead of
 the getattr() helper function:
 
 import sys
 __import__('x.y.z')
 mod = sys.modules['x.y.z']

That way is a lot cleaner than other mechanisms I've seen (including the
current mechanism in the docs). Making that the recommended way of doing
a dynamic import seems like a good idea to me.
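
A short runnable sketch of the idiom quoted above, using a standard library module in place of the placeholder 'x.y.z'. The helper name import_dotted is an assumption for illustration.

```python
import sys


def import_dotted(name):
    """Import a dotted module name and return the innermost module."""
    # __import__('a.b.c') returns the top-level package 'a', so the
    # fully qualified module is looked up in sys.modules afterwards.
    __import__(name)
    return sys.modules[name]


mod = import_dotted("xml.dom.minidom")
print(mod.__name__)
```

Later Python versions also provide importlib.import_module(), which returns the innermost module directly.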

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Victor Stinner wrote:
 It would be maybe easier if os.environ supports bytes and unicode keys.
 But we have to keep these assertions:
os.environ[bytes] - bytes
os.environ[str] - str
 I think the same choices have to be made here.  If LANG=C, we still have
 to decide what to do when os.environ[str] is set to a non-ASCii string.
 
 If the charset is US-ASCII, os.environ will drop non-ASCII values. But most 
 variables are ASCII only. Examples with my shell:
 
Yes.  But you still have the question of what to do when:
os.environ[str] = chr(0x1)

So I don't think it makes things simpler than having separate os.environ
and os.environb that update the same data behind the scenes.

 Additionally, the subprocess question makes using the key value
 undesirable compared with having a separate os.environb that accesses
 the same underlying data.
 
 The user should be able to choose bytes or unicode. Examples:

The subprocess question was posed further up the thread as basically --
does the user need to access os.environb in order to override things in
the environment when calling subprocess?  I think the answer to that is
yes since you might want to start with your environment and modify it
slightly when you call programs via subprocess.  If you just try to copy
os.environ and os.environ only iterates through the decodable env vars,
that doesn't work.  If you have an os.environb to copy it becomes possible.
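
A sketch of the "start with your environment and modify it slightly" pattern described above. It assumes a modern Python 3 with subprocess.run(); the helper name and DEMO_VAR are hypothetical. Under the proposal where os.environ skips undecodable variables, the dict(os.environ) copy below would silently omit them, which is the failure being described.

```python
import os
import subprocess
import sys


def run_with_overrides(args, overrides):
    """Run a child process with the parent's environment plus overrides."""
    env = dict(os.environ)   # copy whatever the text view exposes
    env.update(overrides)    # then change only what is needed
    return subprocess.run(args, env=env, capture_output=True,
                          text=True, check=True)


child_code = "import os; print(os.environ['DEMO_VAR'])"
result = run_with_overrides([sys.executable, "-c", child_code],
                            {"DEMO_VAR": "hello"})
print(result.stdout.strip())
```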

  - subprocess.Popen('ls') = use unicode environment (os.environ)
  - subprocess.Popen(b'ls') = use bytes environment (os.environb)
 
That's... not expected to me :-(

If I never touch os.environ and invoke subprocess the normal way, I'd
still expect the whole environment to be passed on to the program being
called.  This is how invoking programs manually, shell scripting,
invoking programs from perl, python2, etc work.

Also, it's not really a good fit with the other things that key off of
the initial argument.  os.listdir(b'.') changes the output to bytes.
subprocess.Popen(b'ls') would change what environment gets input into
the call.
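
For comparison, a minimal sketch of the existing convention referred to here: the type of the argument to os.listdir() selects the type of the results, and nothing else.

```python
import os

# A str path yields str names; a bytes path yields bytes names.
text_names = os.listdir(".")
byte_names = os.listdir(b".")

print(all(isinstance(name, str) for name in text_names))
print(all(isinstance(name, bytes) for name in byte_names))
```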

 Here's my problem with it, though.  With these semantics any program
 that works on arbitrary files and runs on *NIX has to check
 os.listdir(b'') and do the conversion manually.
 
 Only programs that have to support strange environment like yours (mixing 
 Shift-JIS and UTF-8) :-) Most programs don't have to support these charset 
 mixture.
 
Any program that is intended to be distributed, accesses arbitrary
files, and works on *nix platforms needs to take this into account.
Just because the environment inside of my organization is sane doesn't
mean that when we release the code to customers, clients, or the free
software community that the places it runs will be as strict about these
things.

Are most programs specific to one organization or are they distributed
to other people?  I can't answer that... everything I work on (except
passwords:-) is distributed -- from sys admin cronjobs to web
applications since I'm lucky that my whole job is devoted to working on
free software.

-Toshio





Re: [Python-Dev] Merging flow

2008-12-05 Thread Nick Coghlan
Fred Drake wrote:
 On Dec 5, 2008, at 2:20 PM, Mark Dickinson wrote:
 Did I mess up somewhere, or does svnmerge not work on
 a revision that was itself the result of an svnmerge?
 
 I ran into this yesterday as well with my patch to the cgi module.  The
 work-around was to revert the change to that property and edit it manually.
 
 I think this is a significant issue, since editing that property is
 about as error-prone as it can be.  I've not really looked at the code
 in svnmerge.py, so I'm not sure how hard it would be to fix.

I think we're discovering the real reasons why people generally prefer
to use a DVCS when trying to manage multiple branches :P

For now it looks like we might have to maintain 3.0 manually, with
svnmerge only helping out for trunk->2.6 and trunk->py3k...

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan
Toshio Kuratomi wrote:
 Are most programs specific to one organization or are they distributed
 to other people?

The former. That's pretty well documented in assorted IT literature
('shrink-wrap' and open source commodity software are still relatively
new players on the scene that started to shift the balance the other
way, but now the server side elements of web services are shifting it
back again).

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---


Re: [Python-Dev] Merging flow

2008-12-05 Thread Christian Heimes

Nick Coghlan wrote:

I think we're discovering the real reasons why people generally prefer
to use a DVCS when trying to manage multiple branches :P

For now it looks like we might have to maintain 3.0 manually, with
svnmerge only helping out for trunk-2.6 and trunk-py3k...


The problem seems to be trunk -> py3k -> 3.0. I had no issues with py3k 
-> 3.0.


Christian


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Nick Coghlan wrote:
 Toshio Kuratomi wrote:
 Are most programs specific to one organization or are they distributed
 to other people?
 
 The former. That's pretty well documented in assorted IT literature
 ('shrink-wrap' and open source commodity software are still relatively
 new players on the scene that started to shift the balance the other
 way, but now the server side elements of web services are shifting it
 back again).
 
Cool.  So it's only people writing code to be shared with the larger
community or written for multiple customers that are affected by bugs
like this. :-/

-Toshio





Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Nick Coghlan wrote:
 Toshio Kuratomi wrote:
 Guido van Rossum wrote:
 Glob was just an example. Many use cases for directory traversal
 couldn't care less if they see *all* files.

 Okay.  Makes it harder to prove correct or not if I don't know what the
 use case is :-)  I can't think of a single use case off-hand.

 Even your example of a ??.txt file making retrieval of *.py files fail
 is a little broken.  If there was a ??.py file that was undecodable the
 program would most likely want to know that file existed.
 
 Why? Most programs won't be able to do anything with it. And if the
 program *can* do something with it... that's what the bytes version of
 the APIs are for.
 
Nonsense.  A program can do tons of things with a non-decodable
filename.  Where it's limited is non-decodable filedata.

For instance, if you have a graphical text editor, you need to let the
user select files to load.  To do that you need to list all the files in
a directory, even the ones that aren't decodable.  The ones that aren't
decodable need to substitute something like:
  str(filename, errors='replace') + '(Filename not encoded in UTF8)'
in the file listing that the user sees.  When the file is loaded, it
needs to access the actual raw filename.  The file can then be loaded
and operated upon and even saved back to disk using the raw, undecodable
filename.

If you have a file manager, you need to code something that lets the
user move the file around.  Once again, the program loads the raw
filenames.  It transforms the name into something representable to the
user.  It displays that.  The user selects it and asks that it be moved
to another location.  Then the program uses the raw filename to move
from one location to another.

If you have a backup program, you need to list all the files in a
directory.  Then you need to copy those files to another location.  Once
again you have to retrieve the byte version of any non-decodable filenames.
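
The pattern shared by the three examples above (show a readable label, operate on the raw name) can be sketched roughly as follows. The helper names are hypothetical, and UTF-8 is assumed as the expected encoding, as in the snippet above.

```python
import os


def display_name(raw_name, encoding="utf-8"):
    """Return a label suitable for showing to the user."""
    try:
        return raw_name.decode(encoding)
    except UnicodeDecodeError:
        # Keep as much of the name as possible and flag the problem.
        return (raw_name.decode(encoding, errors="replace")
                + " (Filename not encoded in UTF8)")


def list_for_display(directory=b"."):
    """Pair every raw filename with the label shown to the user."""
    return [(display_name(raw), raw) for raw in os.listdir(directory)]


print(display_name(b"notes.txt"))
print(display_name(b"caf\xe9.txt"))
```

The program shows the first element of each pair and passes the second, untouched, to open(), os.rename(), or a copy routine.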

-Toshio





Re: [Python-Dev] Merging flow

2008-12-05 Thread Fred Drake

On Dec 5, 2008, at 5:31 PM, Nick Coghlan wrote:

I think we're discovering the real reasons why people generally prefer
to use a DVCS when trying to manage multiple branches :P


Really?  I don't.  The issue has nothing to do with someone  
maintaining private change sets, or wanting to do development with  
local commits without having access to commit to the project.


I expect (and someone from work has said they do as well) that  
Subversion 1.5's merge tracking would have handled this situation.



For now it looks like we might have to maintain 3.0 manually, with
svnmerge only helping out for trunk-2.6 and trunk-py3k...



I don't know if I'll have time to look at svnmerge this weekend (with  
house guests and all), but I really don't expect it's a difficult  
problem to solve in the tool.  The behavior suggests that this tiered  
set of branch relationships wasn't expected.



  -Fred

--
Fred Drake   fdrake at acm.org



[Python-Dev] Merging flow

2008-12-05 Thread Jim Jewett
Nick Coghlan wrote:

 For now it looks like we might have to maintain 3.0 manually, with
 svnmerge only helping out for trunk-2.6 and trunk-py3k

Does it make the bookkeeping horrible if you merge from trunk straight
to 3.0, and then blocked svnmerged changes from propagating?

-jJ


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Martin v. Löwis
 Good.  Now we just need to populate them.  I take it the classifiers without
 minor numbers imply any known minor version (e.g., 2 == 2.3 and greater)?

Perhaps. As usual, they mean what people use them for.

I intended them to mean 2.x and 3.x, respectively, with no constraint on
x (i.e. including possibly 2.0 and 2.1). In particular, presence of "2"
and absence of "3" is meant to indicate "I know that it won't work on
Python 3".
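
As an illustration of that reading, a sketch that interprets a package's classifier list. The two classifier strings are the real trove classifiers; the helper function is hypothetical and only encodes the interpretation given above.

```python
PY2 = "Programming Language :: Python :: 2"
PY3 = "Programming Language :: Python :: 3"


def known_not_to_work_on_python3(classifiers):
    """True when the package declares Python 2 but not Python 3."""
    return PY2 in classifiers and PY3 not in classifiers


print(known_not_to_work_on_python3([PY2]))
print(known_not_to_work_on_python3([PY2, PY3]))
```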

Regards,
Martin


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread rdmurray

On Fri, 5 Dec 2008 at 12:11, Guido van Rossum wrote:

On Fri, Dec 5, 2008 at 12:05 PM, Toshio Kuratomi [EMAIL PROTECTED] wrote:

Guido van Rossum wrote:

On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt [EMAIL PROTECTED] wrote:

In 99% of all cases, using the default encoding will work and do what people
expect, which is why I would make this conversion automatic. In all other
cases, it will at least not fail silently (which would lead to garbage and
data loss) and allow more sophisticated applications to handle it.


I think the always fail noisily approach isn't the best approach.
E.g. if I am globbing for *.py, and there's an undecodable .txt file
in a directory, its presence shouldn't cause the glob to fail.


But why should it make glob() fail?  This sounds like an implementation
detail of glob.


Glob was just an example. Many use cases for directory traversal
couldn't care less if they see *all* files.


I agree with Toshio.  The only use case I can think of for not seeing
all files is when selecting a subset, and if the thing that does the
selecting only generates a traceback if a file that falls into the
subset is undecodable, then I don't see a problem.  That is, if I'm
selecting a subset of the files in a directory, and one of that subset
is undecodable, I _want_ a traceback, because I'll be wanting _all_
of the files that match my selection criteria.(*)

So I'm curious to hear your use cases where undecodable files are
"don't care".

(*) More specifically, I want the program of a developer who didn't think
about the fact that users might have files with undecodable filenames
in their directory to generate a traceback rather than silently losing
those files.  (This is spoken to both by the principle of least
surprise and the zen rule that errors should never pass silently :)
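
A sketch of the behaviour argued for here: select on the raw names first, then decode only the matches, so that an undecodable name raises only when it falls inside the selected subset. select_names is a hypothetical helper and UTF-8 is an assumed encoding; this is not how glob is implemented.

```python
import fnmatch


def select_names(raw_names, raw_pattern, encoding="utf-8"):
    """Match bytes names against a bytes pattern, then decode the matches."""
    matches = [name for name in raw_names
               if fnmatch.fnmatchcase(name, raw_pattern)]
    # Decoding is strict: an undecodable match raises UnicodeDecodeError.
    return [name.decode(encoding) for name in matches]


# The undecodable .txt file is outside the subset, so nothing fails.
print(select_names([b"a.py", b"\xff.txt"], b"*.py"))
```

With [b"\xff.py"] as input the same call raises UnicodeDecodeError, because the undecodable file is one of the requested matches.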

--RDM


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan
Toshio Kuratomi wrote:
 Nick Coghlan wrote:
 Toshio Kuratomi wrote:
 Guido van Rossum wrote:
 Glob was just an example. Many use cases for directory traversal
 couldn't care less if they see *all* files.

 Okay.  Makes it harder to prove correct or not if I don't know what the
 use case is :-)  I can't think of a single use case off-hand.

 Even your example of a ??.txt file making retrieval of *.py files fail
 is a little broken.  If there was a ??.py file that was undecodable the
 program would most likely want to know that file existed.
 Why? Most programs won't be able to do anything with it. And if the
 program *can* do something with it... that's what the bytes version of
 the APIs are for.

 Nonsense.  A program can do tons of things with a non-decodable
 filename.  Where it's limited is non-decodable filedata.

You can't display a non-decodable filename to the user, hence the user
will have no idea what they're working on. Non-filesystem related apps
have no business trying to deal with insane filenames.

Linux is moving towards a standard of UTF-8 for filenames, and once we
get to the point where the idea of encoding filenames and environment
variables any other way is seen as crazy, then the Python 3 approach
will work seamlessly.

In the meantime, raw bytes APIs will provide an alternative for those
that disagree with that philosophy.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Thomas Wouters
On Fri, Dec 5, 2008 at 19:10, Guido van Rossum [EMAIL PROTECTED] wrote:

 On Thu, Dec 4, 2008 at 11:27 PM,  [EMAIL PROTECTED] wrote:
  With all due respect, for me, library support and serious use are
  synonymous.

 Glyph, I cannot have a discussion with you if every single post of
 yours is longer than my combined daily output. Please spend some time
 writing shorter posts. I'm sure I'm not the only one here with a short
 attention span. :-)


Allow me to paraphrase glyph (with whom I'm in complete agreement, for what
it's worth): many newbies will be disappointed by Python if they start with
Python 3.0 and discover that most of the cool possibilities they had heard
about are 'being worked on' and not quite ready. I don't doubt that 3.0 will
be easier for the new programmer to learn, but I do not believe the average
"Oh, I heard about Python, let's learn it" person should be pointed to 3.0
right now. They should be encouraged to learn 2.6 -- or even 2.5.

In spite of Python being a programming language, there is a difference
between 'casual user of the language' and 'library developer'; 3.0 is
certainly a must for all actual library developers, and I'm sure most of
them know about 3.0 by now. We're talking about first impressions for people
without that knowledge.

-- 
Thomas Wouters [EMAIL PROTECTED]

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Michael Urman
On Fri, Dec 5, 2008 at 18:48, Nick Coghlan [EMAIL PROTECTED] wrote:
 Toshio Kuratomi wrote:
 Nick Coghlan wrote:
 Toshio Kuratomi wrote:
 Guido van Rossum wrote:
 Glob was just an example. Many use cases for directory traversal
 couldn't care less if they see *all* files.

 Okay.  Makes it harder to prove correct or not if I don't know what the
 use case is :-)  I can't think of a single use case off-hand.

 Even your example of a ??.txt file making retrieval of *.py files fail
 is a little broken.  If there was a ??.py file that was undecodable the
 program would most likely want to know that file existed.
 Why? Most programs won't be able to do anything with it. And if the
 program *can* do something with it... that's what the bytes version of
 the APIs are for.

 Nonsense.  A program can do tons of things with a non-decodable
 filename.  Where it's limited is non-decodable filedata.

 You can't display a non-decodable filename to the user, hence the user
 will have no idea what they're working on. Non-filesystem related apps
 have no business trying to deal with insane filenames.

And what of python's batteries---does a library that takes filenames
or directories from a controlling program and processes the contents
of the file need to care whether the file can be encoded properly? Is
said library filesystem related or not?

Won't it be awful when it's the directory name, and processing the
file works if you change into its directory, but not if you're outside
of it? And if there's an error during processing and the library
reports a full filename using os.path.abspath('file.ext'), but cannot get
the results?

 Linux is moving towards a standard of UTF-8 for filenames, and once we
 get to the point where the idea of encoding filenames and environment
 variables any other way is seen as crazy, then the Python 3 approach
 will work seamlessly.

 In the meantime, raw bytes APIs will provide an alternative for those
 that disagree with that philosophy.

And until that time, it's agony for the library writers who didn't
think they needed to care, but find that their users (other
developers) do.
-- 
Michael Urman


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Steven D'Aprano
On Sat, 6 Dec 2008 09:18:47 am Nick Coghlan wrote:
 Toshio Kuratomi wrote:
  Guido van Rossum wrote:
  Glob was just an example. Many use cases for directory traversal
  couldn't care less if they see *all* files.
 
  Okay.  Makes it harder to prove correct or not if I don't know what
  the use case is :-)  I can't think of a single use case off-hand.
 
  Even your example of a ??.txt file making retrieval of *.py files
  fail is a little broken.  If there was a ??.py file that was
  undecodable the program would most likely want to know that file
  existed.

 Why? Most programs won't be able to do anything with it.

But the program can report a sensible error message, so the user can fix 
the problem.

I'd rather have the Python API report errors than silence them, at least 
by default. I don't suppose it's on the table for functions to grow an 
extra argument that tells them to skip broken file names and 
environment variables? 

What I have in mind is something like:

os.listdir(path, silence_errors=False) -> list_of_strings

By default, if a filename in path is not a valid string, an exception is 
raised, with the guilty file name given in bytes as an attribute of the 
exception. If silence_errors is true, the invalid file names are 
silently skipped.
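
A rough sketch of how such a flag could behave, written as a wrapper rather than as a change to os.listdir() itself. The silence_errors parameter exists only in this proposal; the function names and the UTF-8 default are assumptions. The sketch relies on the fact that UnicodeDecodeError already carries the offending bytes in its object attribute.

```python
import os


def decode_names(raw_names, encoding="utf-8", silence_errors=False):
    """Decode bytes filenames, raising or skipping on invalid ones."""
    names = []
    for raw in raw_names:
        try:
            names.append(raw.decode(encoding))
        except UnicodeDecodeError:
            if not silence_errors:
                raise  # exc.object is the guilty filename, in bytes
    return names


def listdir(path, silence_errors=False):
    """Hypothetical listdir() variant with the proposed flag."""
    raw_names = os.listdir(os.fsencode(path))
    return decode_names(raw_names, silence_errors=silence_errors)


print(decode_names([b"a.py", b"\xff.py"], silence_errors=True))
```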



-- 
Steven


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan
Toshio Kuratomi wrote:
 Nick Coghlan wrote:
 Toshio Kuratomi wrote:
 Are most programs specific to one organization or are they distributed
 to other people?
 The former. That's pretty well documented in assorted IT literature
 ('shrink-wrap' and open source commodity software are still relatively
 new players on the scene that started to shift the balance the other
 way, but now the server side elements of web services are shifting it
 back again).

 Cool.  So it's only people writing code to be shared with the larger
 community or written for multiple customers that are affected by bugs
 like this. :-/

True, but it's still a fairly important problem to have a solution to.
Even internally in large organisations there can be some pretty insane
environments as cruft accumulates over the years.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Martin v. Löwis
 There was already Programming Language :: Python, provided by many
 packages.  I think version compatibility relationships meant by each of
 these classifiers should be made explicit, wherever it is that
 documentation for classifiers is provided.
 
 I don't recall having seen any such documentation; hopefully I just need
 to be hit by another clue.

There is no documentation for classifiers whatsoever. I don't think
nuances matter much, anyway.

Regards,
Martin


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Martin v. Löwis
 5) represent all environment variables in Unicode strings,
including the ones that currently fail to decode.
(then do the same to file names, then drop the byte-oriented
 file operations again)
 
 Please, don't do that! Bytes are not characters!

And environment variables, command line arguments, and file names
are not bytes, but characters.

Regards,
Martin


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread James Y Knight

On Dec 5, 2008, at 7:48 PM, Nick Coghlan wrote:

You can't display a non-decodable filename to the user, hence the user
will have no idea what they're working on. Non-filesystem related apps
have no business trying to deal with insane filenames.


Sigh, same arguments, all over again.

Again, *both* KDE and Gnome apps display non-decodable filenames to  
the user, and let the user work with the files. They display as good a  
rendition as they can, using a replacement character as appropriate.  
In some earlier versions, KDE did not work at all on poorly-encoded  
files, and, users submitted bug reports. People do care, it does  
happen in real life, and it is a bug in your software if you cannot  
deal with the users' files. They just want the software to work. If it  
shows something weird in the window titlebar, that's a bit irritating  
but at least it doesn't get in the way of working.



Linux is moving towards a standard of UTF-8 for filenames, and once we
get to the point where the idea of encoding filenames and environment
variables any other way is seen as crazy, then the Python 3 approach
will work seamlessly.


I seriously doubt that Linux would ever enforce utf-8 filenames/env vars/ 
command arguments. Oddly encoded strings will always be with us in  
some form or another.


Now, perhaps you use crontab? At least on the systems I have, programs  
run by cron don't have any locale environment variables set, and so  
default to the C locale. So utf-8 encoded filenames/etc will fail,  
by default, for any python3 program run under cron.
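
The decoding step behind that failure can be sketched in isolation. This only shows what happens when UTF-8 bytes are decoded with the ASCII codec that the C locale implies; it makes no claim about how a particular Python version chooses its encoding. The helper name is hypothetical.

```python
def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are not valid."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


name = "r\u00e9sum\u00e9.txt"
raw = name.encode("utf-8")

print(try_decode(raw, "utf-8"))   # decodes when the locale says UTF-8
print(try_decode(raw, "ascii"))   # fails when the locale is C (ASCII)
```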


I'd like to make an analogy: what if Python3 couldn't deal with  
filenames with spaces in them on unix? Most filenames don't have  
spaces in them, so it should be okay, right? And those people who  
really need to deal with space-containing filenames can use this other  
API variant, instead of the recommended and most obvious one. That'd  
be okay, right? No, of course it wouldn't be okay!


James


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Michael Urman
On Fri, Dec 5, 2008 at 19:22, Martin v. Löwis [EMAIL PROTECTED] wrote:
 Please, don't do that! Bytes are not characters!

 And environment variables, command line arguments, and file names
 are not bytes, but characters.

On Windows NT, sure. On Unix they're still bytes no matter how much we
want them to be characters.

This difference, and secondarily the way python 3 tries to sweep it
under the rug, seem to be the roots of the problem.

-- 
Michael Urman


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Steven D'Aprano
On Sat, 6 Dec 2008 12:47:45 pm Guido van Rossum wrote:
 But I disagree that most of the cool possibilities they have heard
 about are necessarily third party libraries. Python's standard
 library has lots of stuff to offer.

+1 on that. I've been using Python for a decade now, and the first third 
party library I've downloaded and used was Pyparsing a month or two 
ago. I'll be the first to admit that my programs tend to be on the 
small side, but they're useful to me. The lack of third party libraries 
for Python 3 is not necessarily a show-stopper.


-- 
Steven


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Martin v. Löwis
 And environment variables, command line arguments, and file names
 are not bytes, but characters.
 
 On Windows NT, sure. On Unix they're still bytes no matter how much we
 want them to be characters.

Only in the API of the OS itself. Treating them as bytes in the
application is a mistake. The bytes are intended to represent
characters, so Python should treat them as what they are.
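
The two positions (bytes at the OS level, characters in the application) can be illustrated side by side. This is a sketch only, and it assumes a Python version that has the `surrogateescape` error handler, which was added after this thread (Python 3.1, PEP 383). The helper names `to_text` and `to_bytes` are hypothetical.

```python
def to_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes, carrying undecodable bytes through as lone surrogates."""
    return raw.decode(encoding, "surrogateescape")


def to_bytes(text: str, encoding: str = "utf-8") -> bytes:
    """Reverse to_text(), restoring the original bytes exactly."""
    return text.encode(encoding, "surrogateescape")


raw_name = b"caf\xe9.txt"        # Latin-1 bytes, not valid UTF-8
text_name = to_text(raw_name)    # a str, usable with str-based APIs
assert to_bytes(text_name) == raw_name
```

The application works with `str` throughout, while the exact bytes the OS supplied remain recoverable.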

Regards,
Martin


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Steven D'Aprano
On Sat, 6 Dec 2008 11:48:27 am Nick Coghlan wrote:
 Toshio Kuratomi wrote:
  Nick Coghlan wrote:
...
  Why? Most programs won't be able to do anything with it. And if
  the program *can* do something with it... that's what the bytes
  version of the APIs are for.
 
  Nonsense.  A program can do tons of things with a non-decodable
  filename.  Where it's limited is non-decodable filedata.

 You can't display a non-decodable filename to the user, hence the
 user will have no idea what they're working on. Non-filesystem
 related apps have no business trying to deal with insane filenames.

I don't agree. Putting my user's hat on, I know what I would expect: the 
app should display *some* name, it doesn't matter exactly what, so long 
as:

* it's as close as possible to the real name; 

* it is unique in that directory (doesn't shadow another file); and

* it's enough to identify the file so I can read/save/delete/rename the 
file.
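
One way the three requirements above might be met, as a rough sketch: show an escaped form of any undecodable bytes, and keep the raw bytes alongside for the actual file operations. The function name `display_names` is hypothetical, and decoding with `backslashreplace` assumes Python 3.5 or later.

```python
def display_names(raw_names, encoding="utf-8"):
    """Map a unique display string to each raw file name (bytes)."""
    table = {}
    for raw in raw_names:
        # Undecodable bytes are shown as \xNN, close to the real name.
        shown = raw.decode(encoding, "backslashreplace")
        # Disambiguate in the unlikely case two names render identically.
        candidate, n = shown, 1
        while candidate in table:
            n += 1
            candidate = f"{shown} ({n})"
        table[candidate] = raw
    return table


names = display_names([b"caf\xc3\xa9.txt", b"caf\xe9.txt"])
for shown, raw in names.items():
    print(shown, "->", raw)
```

The display string is only ever shown to the user; opening, renaming, or deleting the file would go through the stored bytes.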

I think there are analogous situations: long-time Windows users will be 
used to seeing files listed as longfilename.txt in some applications 
and longfi~1.txt in others. Under POSIX, file names can contain 
unprintable control characters, and the shell will print them at least 
three ways, depending on context. E.g. for a file name containing a 
formfeed, I get one of ? \f or ^L in bash.
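
The three renderings mentioned here (`?`, `\f`, `^L`) are simple to reproduce. The following is an illustrative sketch, not a description of how bash itself chooses between them; the names `render` and `C_ESCAPES` are made up for the example.

```python
C_ESCAPES = {"\a": "\\a", "\b": "\\b", "\t": "\\t", "\n": "\\n",
             "\v": "\\v", "\f": "\\f", "\r": "\\r"}
STYLES = ("question", "escape", "caret")


def is_control(ch: str) -> bool:
    """True for ASCII control characters (0x00-0x1F and DEL)."""
    return ord(ch) < 0x20 or ord(ch) == 0x7F


def render(name: str, style: str) -> str:
    """Render control characters in `name` using one of three styles."""
    if style not in STYLES:
        raise ValueError(f"unknown style: {style!r}")
    out = []
    for ch in name:
        if not is_control(ch):
            out.append(ch)
        elif style == "question":
            out.append("?")
        elif style == "escape":
            # Use the C-style escape if one exists, else a hex escape.
            out.append(C_ESCAPES.get(ch, f"\\x{ord(ch):02x}"))
        else:
            # Caret notation flips bit 6: form feed (0x0C) becomes ^L.
            out.append("^" + chr(ord(ch) ^ 0x40))
    return "".join(out)


for style in STYLES:
    print(style, render("a\fb", style))
```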

Applications can deal with such weird file names. KDE's file manager 
(konqueror) and file selection dialog both show the character as a 
small square, presumably the font's missing character glyph, and KDE 
apps can open and save the file. Still speaking as a user, I think it 
is quite reasonable to expect applications to deal with undisplayable 
filenames: displaying the name and opening the file are orthogonal 
concepts, although I accept that command-line interfaces will have 
difficulty with file names that can't be typed by the user!

I appreciate that broken unicode is more difficult to deal with than 
unprintable control characters, but the basic principle is the same.


-- 
Steven


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Bill Janssen
Thomas Wouters [EMAIL PROTECTED] wrote:

 Allow me to paraphrase glyph (with whom I'm in complete agreement, for what
 it's worth): many newbies will be disappointed by Python if they start with
 Python 3.0 and discover that most of the cool possibilities they had heard
 about are 'being worked on' and not quite ready. I don't doubt that 3.0 will
 be easier for the new programmer to learn, but I do not believe the average
 "Oh, I heard about Python, let's learn it" person should be pointed to 3.0
 right now. They should be encouraged to learn 2.6 -- or even 2.5.

I think that's right.

I was asked this question today, and it comes up (to me) fairly often at
PARC.  I usually suggest using the Python version that's standard for
the user's platform, if they use OS X or Linux (and most do), which is
typically 2.5 (for OS X Leopard), and 2.4 (for Linux -- may be out of date).
For Windows users, I suggest the latest release (2.6).

Bill


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Tres Seaver
Ulrich Eckhardt wrote:
 On Friday 05 December 2008, Guido van Rossum wrote:
 At the risk of bringing up something that was already rejected, let me
 propose something that follows the path taken in 3.0 for filenames,
 rather than doubling back:

 For os.environ, os.getenv() and os.putenv(), I think a similar
 approach as used for os.listdir() and os.getcwd() makes sense: let
 os.environ skip variables whose name or value is undecodable, and have
 a separate os.environb() which contains bytes; let os.getenv() and
 os.putenv() do the right thing when the arguments passed in are bytes.

 For sys.argv, because it's positional, you can't skip undecodable
 values, so I propose to use error=replace for the decoding; again, we
 can add sys.argvb that contains the raw bytes values. The various
 os.exec*() and os.spawn*() calls (as well as os.system(), os.popen()
 and the subprocess module) should all accept bytes as well as strings.

 On Windows, the bytes APIs should probably not exist.

 I predict that most developers can get away with not using the bytes
 APIs at all. The small minority that needs to be robust if not all
 filenames use the system encoding can use the bytes APIs.
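
The two decoding policies in this proposal can be sketched as pure functions over raw bytes. This is an illustration under stated assumptions, not the behaviour of Python 3.0: the helper names `decode_environ` and `decode_argv` are hypothetical, and UTF-8 stands in for whatever the system encoding is.

```python
def decode_environ(raw_env, encoding="utf-8"):
    """Text view of the environment: skip entries that do not decode."""
    text_env = {}
    for key, value in raw_env.items():
        try:
            text_env[key.decode(encoding)] = value.decode(encoding)
        except UnicodeDecodeError:
            continue  # still reachable through the bytes mapping
    return text_env


def decode_argv(raw_argv, encoding="utf-8"):
    """Text view of argv: positions must be kept, so replace bad bytes."""
    return [arg.decode(encoding, errors="replace") for arg in raw_argv]


raw_env = {b"HOME": b"/home/user", b"BROKEN": b"\xff"}
print(decode_environ(raw_env))
print(decode_argv([b"prog", b"caf\xe9"]))
```

Skipping works for the environment because it is a mapping. For argv, dropping an element would shift every later argument, which is why replacement is proposed there instead.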
 
 I know some of those developers, you can contact them via 
 [EMAIL PROTECTED] Seriously, what would you suggest to someone that 
 wants to handle paths in a portable way? Using the Unicode variants of 
 functions is fubar, because encoding/decoding is not universally possible. 
 Using the byte variant is equally fubar, because e.g. on MS Windows it is not 
 supported, except through a very lossy roundtrip through the locale's 
 codepage, limiting your functionality.
 
 I actually think it is about time to give up on trying to think about a path 
 as a string. Ditto for data received from os.environ or sys.argv. There are 
 only very few things that are universal to them and a reliable encoding is 
 none of them. Then, once you have let that idea go, meditate a bit over the 
 Zen.
 
 What I propose is that paths must be treated as OS-specific, with the only 
 common reliable operations being joining them, concatenating them and 
 splitting them into segments divided by the (again, OS-specific) separator. 
 Other operations, like e.g. appending a string or converting it to a string 
 in order to display it can fail. And if they fail, they should fail noisily. 
 In 99% of all cases, using the default encoding will work and do what people 
 expect, which is why I would make this conversion automatic. In all other 
 cases, it will at least not fail silently (which would lead to garbage and 
 data loss) and allow more sophisticated applications to handle it.
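
A rough sketch of what such an OS-specific path object might look like. The class `OSPath` and its methods are hypothetical and exist in no library. The proposal makes the text conversion automatic, whereas this sketch uses an explicit method to keep the failure point visible.

```python
import os


class OSPath:
    """Hypothetical path object: opaque bytes plus a few safe operations."""

    def __init__(self, raw: bytes):
        self.raw = raw

    def join(self, other: "OSPath") -> "OSPath":
        """Join two paths using the OS-specific separator."""
        return OSPath(os.path.join(self.raw, other.raw))

    def segments(self):
        """Split into components on the OS-specific separator."""
        sep = os.sep.encode("ascii")
        return [OSPath(part) for part in self.raw.split(sep) if part]

    def to_text(self, encoding: str = "utf-8") -> str:
        """Convert for display; raises UnicodeDecodeError if undecodable."""
        return self.raw.decode(encoding)  # strict: fail noisily


path = OSPath(b"home").join(OSPath(b"caf\xe9"))
print([segment.raw for segment in path.segments()])
try:
    path.to_text()
except UnicodeDecodeError as exc:
    print("cannot display:", exc)
```

Joining and splitting never look inside the bytes, so they cannot fail on encoding grounds; only the conversion to text can.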

Amen!  The idea that paths, environment variables, and stuff pulled off
of sockets can be treated as text rather than byte strings is just
wishful thinking.


Tres.
-- 
===
Tres Seaver  +1 540-429-0999  [EMAIL PROTECTED]
Palladion Software   "Excellence by Design"   http://palladion.com



Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread rdmurray

On Sat, 6 Dec 2008 at 13:06, Steven D'Aprano wrote:

Applications can deal with such weird file names. KDE's file manager
(konqueror) and file selection dialog both show the character as a
small square, presumably the font's missing character glyph, and KDE
apps can open and save the file. Still speaking as a user, I think it
is quite reasonable to expect applications to deal with undisplayable
filenames: displaying the name and opening the file are orthogonal


Agreed.  I would file a bug report if an application couldn't
handle a file that validly exists in my file system, no matter
how broken the filename might appear to be.


concepts, although I accept that command-line interfaces will have
difficulty with file names that can't be typed by the user!


Difficult, but not impossible: tab completion in the shell can allow
the user to submit otherwise difficult to type filenames to a program.
Which means python should be able to handle such things in argument
strings, so that my python utilities can manipulate such files when
specified as command line arguments, and a sensible error should be
generated by default if the program hasn't been written in such a way
that it can handle such input.
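
For reference, Python versions released after this thread (3.2 and later) document a way to recover the original argument bytes on Unix through `os.fsencode`. The wrapper name `raw_args` below is hypothetical, and the sketch assumes a POSIX system.

```python
import os
import sys


def raw_args(argv=None):
    """Return command-line arguments as the bytes the OS passed in."""
    if argv is None:
        argv = sys.argv[1:]
    # os.fsencode reverses the decoding Python applied to argv at startup.
    return [os.fsencode(arg) for arg in argv]


for raw in raw_args():
    print(raw)
```

A utility that only passes its arguments on to `open()` or `os.rename()` would not even need the bytes, since those calls accept the decoded strings directly.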

It would be wonderful if all Unix variants would switch to all UTF-8 (I
have done so on my own machines...I think :).  But it is a slow process.

--RDM


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread glyph


On 5 Dec, 06:10 pm, [EMAIL PROTECTED] wrote:

On Thu, Dec 4, 2008 at 11:27 PM,  [EMAIL PROTECTED] wrote:

With all due respect, for me, library support and serious use are
synonymous.


Glyph, I cannot have a discussion with you if every single post of
yours is longer than my combined daily output. Please spend some time
writing shorter posts. I'm sure I'm not the only one here with a short
attention span. :-)


I already spend a lot of time trying to remove extraneous details.  The 
drafts of these messages are usually 3x as long :).  So, trying to keep 
it short:


Thomas paraphrased my point pretty well.  The importance of libraries 
cannot be overemphasized.  Maybe you're right and the stdlib is enough 
for a large audience, but I don't know that audience.  Everyone I know 
who uses Python, uses it because of a library.  In some cases, an 
equivalent library exists for another language, and Python wins because 
it has a nicer syntax.  But, in no case does Python win where it 
*doesn't* have the library.


I think that the marketing for py3 needs to target library vendors 
before targeting novices.  If the novices are targeted first, they are 
going to have a bad experience when python libraries don't work with 
py3, and library maintainers are going to have a bad experience when 
clueless newbies harass them to update their software without 
understanding the magnitude of the work to do so.


I've been predicting this for years, but two days into Python 3's 
release, I've already seen real-world examples of this pattern in 
#twisted.  I can tell these people to downgrade to py2 when they come 
ask me for help, but I don't think most of them ask for help.  They just 
get angry and learn Java instead.



Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Stephen J. Turnbull
Nick Coghlan writes:

  True, but it's still a fairly important problem to have a solution to.
  Even internally in large organisations there can be some pretty insane
  environments as cruft accumulates over the years.

M&A and globalization make it inevitable.

Toshio will remember the Mizuho April Fool's Day fiasco (a couple of
large banks merged, and when they reopened as a merged entity called
Mizuho, the ATM system immediately crashed).

Japan being a country that doesn't believe in GAAP, such mergers are a
very difficult problem.  I don't know the details, but I wouldn't even
be surprised if encodings played a role in that mess because Japanese
companies often have their own internal variants of the national
standard JIS encoding.


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Stephen J. Turnbull
Martin v. Löwis writes:
   5) represent all environment variables in Unicode strings,
  including the ones that currently fail to decode.
  (then do the same to file names, then drop the byte-oriented
   file operations again)
   
   Please, don't do that! Bytes are not characters!
  
  And environment variables, command line arguments, and file names
  are not bytes, but characters.

Unfortunately, both POSIX and OS implementation practice (including,
for example, VFAT file systems: NT-derived OSes are not safe!) say
otherwise, and that makes your line of argument extremely dangerous.

Remember, in a fight between human custom and machine programming, the
machine can always win by crashing.  For that reason, bytes must be
the underlying representation, always available, although I think it's
essential to make a text representation easily accessible, and even
the default.  Humans who would rather kvetch about the machine's
breakage than get a useful answer can (and should---problems will be
rare for most usage patterns) use the text representation.  Humans who
want reliability or debuggability, on the other hand, should have
something that cannot be mistaken for text immediately available.
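
A sketch of the arrangement described here: bytes as the underlying representation, with a text view that is easy to reach. The class `OSString` is hypothetical, and UTF-8 is only an assumed default encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OSString:
    """Hypothetical value from the OS: bytes underneath, text on request."""

    raw: bytes
    encoding: str = "utf-8"

    @property
    def text(self) -> str:
        """Convenient text view; undecodable bytes become U+FFFD."""
        return self.raw.decode(self.encoding, errors="replace")

    @property
    def is_clean(self) -> bool:
        """True if the text view loses no information."""
        try:
            self.raw.decode(self.encoding)
        except UnicodeDecodeError:
            return False
        return True


value = OSString(b"caf\xe9")
print(value.text, value.is_clean, value.raw)
```

Code that wants reliability reads `raw`, which cannot be mistaken for text; code that wants convenience reads `text` and accepts the occasional replacement character.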




Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread glyph

On 01:47 am, [EMAIL PROTECTED] wrote:

In spite of Python being a programming language, there is a difference
between 'casual user of the language' and 'library developer'; 3.0 is
certainly a must for all actual library developers, and I'm sure most of
them know about 3.0 by now. We're talking about first impressions for
people without that knowledge.


Well, if most library developers already know 3.0 by now, I would hope
they aren't going to sit on their hands, and will solve the issues at hand!


The best thing for 3.0 adoption would be a 3.0 welcoming committee.  A 
group of hackers wandering from one popular open source library to 
another, writing patches for 3.x compatibility issues.  There must be 
lots of people who care about 3.x adoption, and this is probably the 
most effective way they can reach that goal.


Each time I am going to fix a 3.0 compatibility issue, I have a choice: 
I can either make Twisted itself better (add features, fix bugs), or I 
can keep Twisted exactly the same but do lots of work so it will work on 
3.0.  It seems pretty clear to me that, to the extent that I have time 
for Twisted, fixing bugs in the HTTP implementation would be a better 
deal than puzzling through a megabyte of diffs generated by 2to3, trying 
to understand where it went wrong, and how.


This doesn't mean I'm sitting on my hands.  It just means I have 
better things to be doing with my hands.  (To be precise, 1054 better 
things to do, re: Twisted.  Add in the Divmod projects and it's more 
like 3000.)


Of course the distant threat of an unmaintained 2.x series is enough to 
motivate me to push a *little* in this direction, but it doesn't make me 
happy about it.


I think this is exactly what the marketing effort around 3.0 needs to be 
doing: making a positive case for library and application authors to 
spend time to update to 3.x.  This is a lot of work, and many (I might 
even say most) of us need a lot of cajoling.  Free patches are a good 
incentive :).



Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Bugbee, Larry

There has been some discussion here that users should use the str or
byte function variant based on what is relevant to their system, for
example when getting a list of file names or opening a file.  That
thought process really doesn't do much for those of us that write code
that needs to run on any platform type, without alteration or the
addition of complex if-statements and/or exceptions.

Whatever the resolution here, and those of you addressing this thorny
issue have my admiration, the solution should be such that it gives
consistent behavior regardless of platform type and doesn't require the
programmer to know of all the minute details of each possible target
platform.  

That may not be possible for a while, so interim solutions should be
such that they minimize later pain.  If that means hiding implementation
details behind a new function, so be it.  Then, at least, the body of
one's app is not burdened with this problem later when conditions
change.
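
As an illustration of hiding the details behind one function, the sketch below returns a printable name together with a handle for file operations, so the calling code never branches on platform. The name `list_entries` is hypothetical, and the escaping step assumes the surrogate-based decoding that later Python versions apply to file names on POSIX.

```python
import os


def list_entries(directory):
    """Return sorted (display_name, handle) pairs for `directory`.

    `handle` is whatever os.listdir() returned and can be passed to
    os.path.join() and open(); `display_name` is always safe to print.
    """
    entries = []
    for handle in os.listdir(directory):
        # Lone surrogates (undecodable bytes) cannot be printed; escape them.
        shown = handle.encode("utf-8", "backslashreplace").decode("utf-8")
        entries.append((shown, handle))
    return sorted(entries)


for shown, handle in list_entries("."):
    print(shown)
```

If the underlying rules change later, only the body of this one function needs to change.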

I'm glad I'm not the only one with hard problems.  ;-)

Larry
