date:20081205

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Steve Holden

Martin v. Löwis wrote:
>> Please, if you have a *new* idea that doesn't have a failure mode, by
>> all means post it.  But don't resurrect a pointless bikeshed.
> 
> While I completely agree that it is pointless to reiterate the same
> arguments over and over, I disagree that the bikeshed metaphor applies.
> This metaphor (IIUC) describes a trivial design issue that is merely
> a matter of taste, rather than having deep technical implications.
> Using Unicode or bytes for strings is not of that kind.
> 
+1

These issues are very important because they affect everyone, even
though very few people actually understand them. That includes me, which
is why I've been so quiet on this thread.

regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Georg Brandl

Barry Warsaw wrote:
> On Dec 4, 2008, at 6:21 PM, Martin v. Löwis wrote:
> 
>>>> I can't find any docs built for Python 3.0 (not 3.1a0).
>>>
>>> The Windows installation has new 3.0 doc dated Dec 3, so it was built,
>>> just not posted correctly.
> 
>> That doesn't mean very much. I built it on my local machine. Anybody
>> with subversion and python could do that; the documentation is in
>> subversion.
> 
>> Whether or not it appears on the web site as part of the release
>> process is an entirely different matter. It used to be that the
>> doc maintainer (Fred Drake) was part of the release team and release
>> process. I think Georg is complaining that he is release maintainer,
>> but not part of the release process.
> 
> I've asked Georg to update PEP 101 to make his role as Documentation  
> Expert explicit.  Unfortunately we only debug major releases once (or  
> twice) every 18 months.  But next time, we'll get that part right for  
> sure!

Done that now. Since release.py builds the docs all right, there's not
much left for me to do except check that everything is ok.

> In the meantime, I'll make sure Georg is involved in point releases  
> moving forward.

That's good. Thanks!

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Adam Olsen

On Fri, Dec 5, 2008 at 12:00 AM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
>> Please, if you have a *new* idea that doesn't have a failure mode, by
>> all means post it.  But don't resurrect a pointless bikeshed.
>
> While I completely agree that it is pointless to reiterate the same
> arguments over and over, I disagree that the bikeshed metaphor applies.
> This metaphor (IIUC) describes a trivial design issue that is merely
> a matter of taste, rather than having deep technical implications.
> Using Unicode or bytes for strings is not of that kind.

That we need to support both unicode and bytes is important, but
already seems to have consensus.  However, they present two distinct
usage patterns:

* unicode text, presentable to the user, interacts with all manner of
standardized APIs
* bytes, limited to local, internal use.  Only approximated forms can
be presented to the user, only custom formats can be saved externally

None of the proposals have turned these into a single use case.  All
they do is trade off various forms of subtly switching back and forth,
which leads to failure.  Debating which subtle failure is better is a
bikeshed.

Not only that, but we already have a solution that makes the choice
explicit, avoiding the subtle failure.  This is the solution already
in use for os file & path functions.  It's the solution Guido
supports.
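
As a rough illustration of what "making the choice explicit" looks like in the os file and path functions, here is a minimal sketch. It only uses calls that exist in Python 3 (os.getcwd(), os.getcwdb(), os.path.join()); the path components are made-up examples.

```python
import os

# The caller picks the type explicitly by choosing the function.
text_cwd = os.getcwd()    # str, decoded for presentation to the user
raw_cwd = os.getcwdb()    # bytes, as reported by the operating system

# Path helpers keep whichever type they are given ...
text_path = os.path.join("spam", "eggs.txt")
raw_path = os.path.join(b"spam", b"eggs.txt")

# ... and refuse to switch silently between the two.
try:
    os.path.join("spam", b"eggs.txt")
    mixed_allowed = True
except TypeError:
    mixed_allowed = False

print(type(text_path).__name__, type(raw_path).__name__, mixed_allowed)
```

The point of the sketch is that a mismatch surfaces as an immediate TypeError rather than as a subtle failure later on.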

-- 
Adam Olsen, aka Rhamphoryncus

Re: [Python-Dev] Taint Mode in Python 3.0

2008-12-05 Thread Nick Coghlan

Maciej Fijalkowski wrote:
> Hello,
> 
> The thing is pypy's taint code is broken. Basically you don't only
> need to patch all places that return pyobject, but also all places
> that might modify anything (all side effects). For example, an
> innocent-looking call to addition might end up calling arbitrary python
> code (and have arbitrary side effects). The question is: how do you
> approach such things?

Taint isn't an easy problem, but PyPy is still a *much* better platform
for that kind of experimentation than CPython.

RPython, object spaces, the code generation, etc. all give you much more
powerful tools to play with than the raw C code of the reference
interpreter.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan

[EMAIL PROTECTED] wrote:
> At least this time I think I've encapsulated pretty much my entire
> argument here, so if you don't buy it, we can probably just agree to
> disagree :).

Glyph, the only point I would add to your message is this one:

Adding a "blessed" way to encode arbitrary binary data into a Python 3.0
str object strikes me as giving up on one of the key advances in the new
version of the language.

8-bit strings were a problem in Python 2.x because they blurred the
boundary between arbitrary binary data and ASCII or latin-1 character data.

One of the most interesting aspects of Python 3.0 is its attempt to get
developers to be explicit about this distinction (both in the code and
in their own minds) by enforcing separation between arbitrary binary
data (held in bytes and bytearray instances) and character data (held in
str instances).

I don't understand how tunneling arbitrary binary data through str
instances (*regardless* of encoding mechanism) can possibly fail to
recreate exactly the same "is it text or binary data?" ambiguity
problems that the str/bytes split is intended to eliminate. And if that
happens, then what exactly was the point in moving to an all Unicode
string model for Py3k?
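
A small sketch of the separation described above, using made-up sample values: Python 3 refuses to combine str and bytes implicitly, and crossing the boundary takes an explicit encode() or decode() with a named encoding.

```python
data = b"\xff\xfe raw bytes"   # arbitrary binary data
text = "caf\u00e9"             # character data

# Mixing the two types is rejected instead of guessing an encoding.
try:
    text + data
    mixing_rejected = False
except TypeError:
    mixing_rejected = True

# Crossing the boundary is always an explicit, named conversion.
encoded = text.encode("utf-8")     # str -> bytes
decoded = encoded.decode("utf-8")  # bytes -> str

print(mixing_rejected, encoded, decoded == text)
```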

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Victor Stinner

On Friday 05 December 2008 00:39:24, Martin v. Löwis wrote:
> 5) represent all environment variables in Unicode strings,
>    including the ones that currently fail to decode.
>    (then do the same to file names, then drop the byte-oriented
>    file operations again)

Please, don't do that! Bytes are not characters!

-- 
Victor Stinner aka haypo
http://www.haypocalc.com/blog/

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Ulrich Eckhardt

On Friday 05 December 2008, [EMAIL PROTECTED] wrote:
> Filenames and environment variables would all need to be encoded or
> decoded according to this magic encoding.

Those, and commandline arguments, too.

Uli

-- 
Sator Laser GmbH
Managing Director: Thorsten Föcking, Amtsgericht Hamburg HR B62 932

**
   Visit our website at 
**
This e-mail, including all attachments, is intended only for the addressee 
and may contain confidential information. Please notify the sender 
immediately if you are not the intended recipient. In that case the e-mail 
must be deleted and must not be read, forwarded, published or otherwise used.
E-mails can be read by third parties and may contain viruses and unauthorized 
changes. Sator Laser GmbH is not responsible for these consequences.

**


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Ulrich Eckhardt

On Friday 05 December 2008, Adam Olsen wrote:
> Many of the windows APIs use UTF-16 without validating it.  They'll
> pass through invalid strings until they hit something that does
> validate, at which point it'll blow up.
>
> I suspect that it doesn't happen very often in practice, as having
> only one encoding makes it quite clear that it's a broken file name,
> not a mixed encoding environment.

Actually, I wouldn't say that's a problem at all. The point is that stuff that 
is blissfully unaware of encodings typically uses some ASCII-de(p)rived text. 
Those char-strings are translated according to the current locale, which then 
does the filtering and validation. The result may be gibberish (GIGO 
principle) but at least it's UTF-16 gibberish. ;)

Uli


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Victor Stinner

Hi,

On Thursday 04 December 2008 21:02:19, Toshio Kuratomi wrote:
> I opened up bug http://bugs.python.org/issue4006 a while ago and it was
> suggested in the report that it's not a bug but a feature and so I
> should come here to see about getting the feature changed :-)

Yeah, I prefer to discuss such changes on the mailing list.

> These mixed encodings can occur for a variety of reasons.  Here's an
> example that isn't too contrived :-)
> (...)
> Furthermore, they don't want to suffer from the space loss of using 
> utf-8 to encode Japanese so they use shift-jis everywhere.

"space loss"? Really? If you configure your server correctly, you should get 
UTF-8 even if the file system is Shift-JIS. But it would be much easier to 
use UTF-8 everywhere.

Hum... I don't think that the discussion is about one specific server, but the 
lack of bytes environment variables in Python3 :-)

> 1) return mixed unicode and byte types in ...

NO!

> 2) return only byte types in os.environ

Hum... Most users have UTF-8 everywhere (e.g. all Windows users ;-)), and 
Python3 already uses Unicode everywhere (input(), open(), filenames, ...).

> 3) silently ignore non-decodable value when accessing os.environ['PATH']
> as we do now but allow access to the full information via
> os.environ[b'PATH'] and os.getenvb()

I don't like os.environ[b'PATH']. I prefer to always get the same result 
type... But os.listdir() doesn't respect that :-(

   os.listdir(str) -> list of str
   os.listdir(bytes) -> list of bytes

I would prefer a similar API for easier migration from Python2 to Python3
(unicode). os.environb sounds like the best choice for me.
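
The os.listdir() rule above (the result type follows the argument type) can be checked with a short sketch. It creates a throwaway directory with one made-up file name; tempfile.TemporaryDirectory and os.fsencode() come from later Python 3 releases than the 3.0 discussed here, so treat this as an illustration for a current interpreter.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # One file with a plain ASCII name, so both views can represent it.
    open(os.path.join(tmp, "example.txt"), "w").close()

    as_text = os.listdir(tmp)                # str in  -> list of str
    as_bytes = os.listdir(os.fsencode(tmp))  # bytes in -> list of bytes

print(as_text, as_bytes)
```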


But there are open questions (already asked in the bug tracker):

(a) Should os.environ be updated if os.environb is changed? If yes, how?
   os.environb[b'PATH'] = b'\xff' (or any invalid string in the system
   default encoding)
   => os.environ['PATH'] = ???

(b) Should os.environb be updated if os.environ is changed? If yes, how?

The problem comes with non-Unicode locales (e.g. latin-1 or ASCII): most 
charsets are unable to encode the whole Unicode charset (e.g. codes >= 65535).

   os.environ['PATH'] = chr(0x1)
   => os.environb['PATH'] = ???

(c) Same question when a key is deleted (del os.environ['PATH']).
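
To make questions (a) and (b) concrete, here is a small sketch of the two conversions that can fail when one view has to be derived from the other. The helper names try_decode and try_encode are made up for this illustration, and the encodings are just examples of a Unicode and a non-Unicode locale.

```python
def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are not valid."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the charset cannot express the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# (a) bytes -> str: b'\xff' is not valid UTF-8, so there is no str value.
from_bytes = try_decode(b"\xff", "utf-8")

# (b) str -> bytes: U+20AC (euro sign) does not exist in latin-1.
from_text = try_encode("\u20ac", "latin-1")

# The common case works in both directions.
plain = try_encode("/usr/bin", "latin-1")

print(from_bytes, from_text, plain)
```

Whatever answer is chosen for (a) and (b) has to say what happens in the two None cases: skip the variable, raise, or substitute something.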

If Python 3.1 has both os.environ and os.environb, I'm quite sure that some 
modules will use os.environ and others will prefer os.environb. If the two 
environments differ, the two sets of modules will behave differently :-/

It might be easier if os.environ supported bytes and unicode keys. But we 
would have to keep these assertions:
   os.environ[bytes] -> bytes
   os.environ[str] -> str
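
A minimal sketch of a mapping that keeps those two assertions, assuming a single bytes store underneath. DualKeyEnviron is a hypothetical name, not an existing API, and letting undecodable values raise on str access is only one of the possible choices discussed in this thread.

```python
class DualKeyEnviron:
    """Hypothetical sketch: one bytes store, looked up with str or bytes keys."""

    def __init__(self, raw, encoding="utf-8"):
        self._raw = dict(raw)      # bytes -> bytes, the single source of truth
        self._encoding = encoding

    def __getitem__(self, key):
        if isinstance(key, bytes):
            return self._raw[key]                   # bytes key -> bytes value
        if isinstance(key, str):
            value = self._raw[key.encode(self._encoding)]
            return value.decode(self._encoding)     # str key -> str value
        raise TypeError("key must be str or bytes")

    def __setitem__(self, key, value):
        if type(key) is not type(value) or not isinstance(key, (str, bytes)):
            raise TypeError("key and value must both be str or both be bytes")
        if isinstance(key, str):
            key = key.encode(self._encoding)
            value = value.encode(self._encoding)
        self._raw[key] = value


env = DualKeyEnviron({b"PATH": b"/usr/bin", b"BAD": b"\xff"})
print(env["PATH"], env[b"PATH"], env[b"BAD"])
```

In this sketch env["BAD"] raises UnicodeDecodeError, while env[b"BAD"] still returns the raw value.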

> 4) raise an exception when non-decodable values are *accessed* and
> continue as in #3.

I like os.listdir() behaviour: just *ignore* non-decodable files. If you 
really want to access these files, use a bytes directory name ;-)

> I think that the ease of debugging is lost when we silently ignore an error.

Guido gave a good example. If your directory contains a non-decodable 
filename (e.g. "???.txt") and listing it raises an exception, glob('*.py') 
will fail because of the evil filename. With the current behaviour, you're 
unable to list all files but glob('*.py') will list all Python scripts!

And now that Python3 is released, it may be a bad idea to change the 
behaviour (of os.environ) in Python 3.1 :-/

> The bug report I opened suggests creating a PEP to address this issue.

Please try to answer my questions about os.environ and os.environb 
consistency.

I also like bytes environment variables. I need them for my fuzzing program. 
The lack of bytes variables is a regression from Python2 (for my program). On 
UNIX, filenames are bytes and the environment variables are bytes. For the 
best interoperability, Python3 should support bytes. But the default choice 
should always be characters (unicode) and to never mix the bytes and str 
types ;-)

---

As usual, it goes faster if someone writes a patch :-) I could try to work on 
it.

-- 
Victor Stinner aka haypo
http://www.haypocalc.com/blog/

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Ulrich Eckhardt

On Friday 05 December 2008, Guido van Rossum wrote:
> At the risk of bringing up something that was already rejected, let me
> propose something that follows the path taken in 3.0 for filenames,
> rather than doubling back:
>
> For os.environ, os.getenv() and os.putenv(), I think a similar
> approach as used for os.listdir() and os.getcwd() makes sense: let
> os.environ skip variables whose name or value is undecodable, and have
> a separate os.environb() which contains bytes; let os.getenv() and
> os.putenv() do the right thing when the arguments passed in are bytes.
>
> For sys.argv, because it's positional, you can't skip undecodable
> values, so I propose to use error=replace for the decoding; again, we
> can add sys.argvb that contains the raw bytes values. The various
> os.exec*() and os.spawn*() calls (as well as os.system(), os.popen()
> and the subprocess module) should all accept bytes as well as strings.
>
> On Windows, the bytes APIs should probably not exist.
>
> I predict that most developers can get away with not using the bytes
> APIs at all. The small minority that needs to be robust if not all
> filenames use the system encoding can use the bytes APIs.
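
The two decoding rules in the quoted proposal can be sketched as follows. The function names decode_environ and decode_argv, the sample data, and the fixed "utf-8" encoding are assumptions made for this illustration only; they are not the actual implementation.

```python
def decode_environ(raw_env, encoding="utf-8"):
    """Text view of the environment: undecodable names or values are skipped."""
    decoded = {}
    for name, value in raw_env.items():
        try:
            decoded[name.decode(encoding)] = value.decode(encoding)
        except UnicodeDecodeError:
            continue  # left out of the text view, still present in raw_env
    return decoded


def decode_argv(raw_argv, encoding="utf-8"):
    """Text view of argv: positions are kept, bad bytes become U+FFFD."""
    return [arg.decode(encoding, errors="replace") for arg in raw_argv]


raw_env = {b"HOME": b"/home/user", b"WEIRD": b"\xff\xfe"}
raw_argv = [b"prog", b"caf\xe9"]

text_env = decode_environ(raw_env)
text_argv = decode_argv(raw_argv)
print(text_env, text_argv)
```

Skipping works for a mapping because nothing depends on position; argv is positional, so every entry has to survive in some form.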

I know some of those developers, you can contact them via 
[EMAIL PROTECTED] Seriously, what would you suggest to someone that 
wants to handle paths in a portable way? Using the Unicode variants of 
functions is fubar, because encoding/decoding is not universally possible. 
Using the byte variant is equally fubar, because e.g. on MS Windows it is not 
supported, except through a very lossy roundtrip through the locale's 
codepage, limiting your functionality.

I actually think it is about time to give up on trying to think about a path 
as a string. Ditto for data received from os.environ or sys.argv. There are 
only very few things that are universal to them and a reliable encoding is 
none of them. Then, once you have let that idea go, meditate a bit over the 
Zen.

What I propose is that paths must be treated as OS-specific, with the only 
common reliable operations being joining them, concatenating them and 
splitting them into segments divided by the (again, OS-specific) separator. 
Other operations, like e.g. appending a string or converting it to a string 
in order to display it can fail. And if they fail, they should fail noisily. 
In 99% of all cases, using the default encoding will work and do what people 
expect, which is why I would make this conversion automatic. In all other 
cases, it will at least not fail silently (which would lead to garbage and 
data loss) and allow more sophisticated applications to handle it.
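
A rough sketch of such an opaque path type, assuming a POSIX-style bytes representation. OSPath and its method names are hypothetical and only illustrate the idea: joining and splitting always work, while conversion to text is explicit and fails noisily.

```python
class OSPath:
    """Hypothetical opaque path: raw bytes plus the OS-specific separator."""

    def __init__(self, raw, sep=b"/"):
        self._raw = raw
        self._sep = sep

    def join(self, other):
        # Joining never needs to know the encoding.
        return OSPath(self._raw.rstrip(self._sep) + self._sep + other._raw,
                      self._sep)

    def segments(self):
        # Splitting on the separator is also encoding-independent.
        return [OSPath(part, self._sep)
                for part in self._raw.split(self._sep) if part]

    def to_text(self, encoding="utf-8"):
        # Conversion for display can fail, and fails with an exception
        # rather than producing garbage.
        return self._raw.decode(encoding)


good = OSPath(b"/home").join(OSPath(b"user"))
bad = OSPath(b"/home").join(OSPath(b"\xff"))

print(good.to_text(), len(bad.segments()))
```

Here bad can still be joined, split and handed back to the OS, but bad.to_text() raises UnicodeDecodeError.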

Uli


[Python-Dev] Fix for frame_setlineno() in frameobject.c function

2008-12-05 Thread Fabien . Bouleau

Hello,

This concerns a known bug in the frame_setlineno() function for Python 
2.5.x and 2.6.x (maybe in earlier versions too). It is not possible to use 
this function when the address or line offset is greater than 127. The 
problem comes from the lnotab variable, which is typed char* and is therefore 
implicitly signed char* on compilers where plain char is signed. Any value 
above 127 becomes a negative number.

The fix is very simple (applied on the Python 2.6.1 version of the source 
code):

--- frameobject.c       Thu Oct 02 19:39:50 2008
+++ frameobject_fixed.c Fri Dec 05 11:27:42 2008
@@ -119,8 +119,8 @@
     line = f->f_code->co_firstlineno;
     new_lasti = -1;
     for (offset = 0; offset < lnotab_len; offset += 2) {
-        addr += lnotab[offset];
-        line += lnotab[offset+1];
+        addr += ((unsigned char*)lnotab)[offset];
+        line += ((unsigned char*)lnotab)[offset+1];
         if (line >= new_lineno) {
             new_lasti = addr;
             new_lineno = line;
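
The effect of reading such a byte as signed rather than unsigned can be reproduced in Python with the struct module. This is only an illustration of the arithmetic; the value 200 is an arbitrary example of an offset above 127.

```python
import struct

raw = bytes([200])  # one table entry with a value above 127

# 'b' reads the byte as a signed char, 'B' as an unsigned char.
as_signed = struct.unpack("b", raw)[0]
as_unsigned = struct.unpack("B", raw)[0]

print(as_signed, as_unsigned)  # -56 200
```

With the signed interpretation the running addr and line totals would go down instead of up, which is what the cast in the patch avoids.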


It would be nice to fix it for Python 2.5 and above, in order to have a 
proper MSI installer for Windows.

Best regards,
Fabien Bouleau



DISCLAIMER: 
This e-mail contains proprietary information some or all of which may be 
legally privileged. It is for the intended recipient only. If an addressing or 
transmission error has misdirected this e-mail, please notify the author by 
replying to this e-mail. If you are not the intended recipient you must not 
use, disclose, distribute, copy, print, or rely on this e-mail.

Re: [Python-Dev] RELEASED Python 3.0 final FFT

2008-12-05 Thread

http://code.activestate.com/recipes/576550/ 

This recipe shows how to use gsl FFT with python 3.

ctypes is really good!

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Jean-Paul Calderone


On Thu, 4 Dec 2008 22:05:05 -0800, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> On Thu, Dec 4, 2008 at 9:40 PM,  <[EMAIL PROTECTED]> wrote:
>> The default case, the case of the user without the wherewithal
>> to understand the nuances of the distinction between 2.x and 3.x, is a user
>> who should use 2.x.
>
> Not at all clear. If they're not sensitive to those nuances it's just
> as likely that they're a casual developer (e.g. a student just
> learning to program). Such users are unlikely to start using major 3rd
> party packages like Twisted or Django, which would be completely
> overwhelming to someone just learning.

That seems like it would be right to me, but two or three times a month
someone shows up in the Twisted IRC channel who is learning both Python
and Twisted at the same time.  So apparently there are a lot of people
for whom this isn't overwhelming.

Jean-Paul

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Eduardo O. Padoan

On Fri, Dec 5, 2008 at 12:35 AM, A.M. Kuchling <[EMAIL PROTECTED]> wrote:
> On Thu, Dec 04, 2008 at 05:29:31PM -0800, Raymond Hettinger wrote:
>> Here's a bright idea.  On the 3.0 release page, include a box listing
>> which major third-party apps have been converted.  Update it
>> once every couple of weeks.  That way, we're not explicitly
>
> That's an excellent idea.  We could have a webpage, or start a
> topic-specific weblog for posting announcements.
>
> I've started a draft of a 3.0 FAQ in the wiki at
> .  Once it's finished we
> can move it into the 3.0 release pages.  Everyone please edit and
> improve it!

Some time ago I started a page on the wiki to collect reports of early
migrations by the community:
http://wiki.python.org/moin/Early2to3Migrations

Maybe this would be relevant to link from the FAQ.

> --amk



-- 
Eduardo de Oliveira Padoan
http://djangopeople.net/edcrypt/
"Distrust those in whom the desire to punish is strong." -- Goethe,
Nietzsche, Dostoevsky

Re: [Python-Dev] Fix for frame_setlineno() in frameobject.c function

2008-12-05 Thread Benjamin Peterson

Hi,
Please post this on the issue tracker. http://bugs.python.org

On Fri, Dec 5, 2008 at 4:42 AM,  <[EMAIL PROTECTED]> wrote:
> Hello,
>
> This concerns a known bug in the frame_setlineno() function for Python
> 2.5.x and 2.6.x (maybe in earlier versions too). It is not possible to use
> this function when the address or line offset is greater than 127. The
> problem comes from the lnotab variable which is typed char*, therefore
> implicitly signed char*. Any value above 127 becomes a negative number.
>
> The fix is very simple (applied on the Python 2.6.1 version of the source
> code):
>
> --- frameobject.c   Thu Oct 02 19:39:50 2008
> +++ frameobject_fixed.c Fri Dec 05 11:27:42 2008
> @@ -119,8 +119,8 @@
>line = f->f_code->co_firstlineno;
>new_lasti = -1;
>for (offset = 0; offset < lnotab_len; offset += 2) {
> -   addr += lnotab[offset];
> -   line += lnotab[offset+1];
> +   addr += ((unsigned char*)lnotab)[offset];
> +   line += ((unsigned char*)lnotab)[offset+1];
>if (line >= new_lineno) {
>new_lasti = addr;
>new_lineno = line;
>




-- 
Cheers,
Benjamin Peterson
"There's nothing quite as beautiful as an oboe... except a chicken
stuck in a vacuum cleaner."

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread James Y Knight


On Dec 5, 2008, at 5:27 AM, Ulrich Eckhardt wrote:
> Using the byte variant is equally fubar, because e.g. on MS Windows it is
> not supported, except through a very lossy roundtrip through the locale's
> codepage, limiting your functionality.



Yeah, IMO the whole mess could have been avoided by keeping the
filename/args/environ simply *bytes*, like it really is, on unix. Then, make
the Windows version of python use (always! not dependent upon locale!)
utf-8 to decode the utf-8 bytestring to the UTF-16 that the Windows
platform APIs expect (and vice versa). And never use the ASCII variant
of the windows APIs.


This would mean that all *inputs* would succeed, but some *outputs*  
would not, on Windows. But that's not a new kind of failure: NUL has  
never been allowed in argv/environ, and filenames have all sorts of  
platform-dependent restrictions.


But unfortunately, it's too late for that solution...

James

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi

Terry Reedy wrote:
> Toshio Kuratomi wrote:
>>
>>> I would think life would be ultimately easier if either the file server
>>> or the shell server automatically translated file names from jis and
>>> utf8 and back, so that the PATH on the *nix shell server is entirely
>>> utf8.
>>
>> This is not possible because no part of the computer knows what the
>> encoding is.  To the computer, it's just a sequence of bytes.  Unlike
>> xml or the windows filesystem (winfs? ntfs?) where the encoding is
>> specified as part of the document/filesystem there's nothing to tell
>> what encoding the filenames are in.
> 
> I thought you said that the file server keeps all filenames in shift-jis,
> and the shell server all in utf-8.

Yes.  But this is part of the setup of the example to keep things
simple.  The fileserver or shell server could themselves be of mixed
encodings (for instance, if it was serving home directories to users all
over the world each user might be using a different encoding.)

>  If so, then the shell server could
> know if it were told so.
> 

Where are you going to store that information?  In order for python to
run without errors, will it have to be configured on each system it's
installed on to know the encoding of each filename?  Or are we going to
try to talk each *NIX vendor into creating new filesystems that record
that information and after a five year span of time declare that python
will not run on other filesystems in corner cases?

I think that this way does not hold a reasonable expectation of keeping
python a portable language.

-Toshio


[Python-Dev] Python security: draft article on the wiki

2008-12-05 Thread Victor Stinner

Hi,

I started to write a short article about Python security on the wiki:

   http://wiki.python.org/moin/Security

Nothing useful yet.

-- 
Victor Stinner aka haypo
http://www.haypocalc.com/blog/

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread skip


Martin> There is. There have been the following trove classifiers
Martin> defined for a few weeks now:

Martin> Programming Language :: Python :: 2
Martin> Programming Language :: Python :: 2.3
Martin> Programming Language :: Python :: 2.4
Martin> Programming Language :: Python :: 2.5
Martin> Programming Language :: Python :: 2.6
Martin> Programming Language :: Python :: 2.7
Martin> Programming Language :: Python :: 3
Martin> Programming Language :: Python :: 3.0
Martin> Programming Language :: Python :: 3.1

Good.  Now we just need to populate them.  I take it the classifiers without
minor numbers imply any known minor version (e.g., 2 ==> 2.3 and greater)?
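
As a small illustration of what populating them could look like, here is a sketch of a classifier list for a made-up package, with a helper that pulls out the declared Python versions. The helper name declared_versions is hypothetical, and the sketch does not answer how a bare "2" or "3" should be interpreted.

```python
PREFIX = "Programming Language :: Python :: "

# Classifiers a (made-up) package might declare in its metadata.
classifiers = [
    "Programming Language :: Python :: 2",
    "Programming Language :: Python :: 2.6",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.0",
    "Topic :: Software Development :: Libraries",
]


def declared_versions(items):
    """Return the version part of every Python-version classifier."""
    return [item[len(PREFIX):] for item in items if item.startswith(PREFIX)]


versions = declared_versions(classifiers)
print(versions)
```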

Skip

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread A.M. Kuchling

On Fri, Dec 05, 2008 at 05:40:46AM -, [EMAIL PROTECTED] wrote:
> For most users, especially new users who have yet to be impressed with  
> Python's power, 2.x is much better.  It's not like "library support" is  
> one small check-box on the language's feature sheet: most of the  
> attractive things about Python are libraries.  Of course I am not free  

Here I agree, sort of.  Newbies may not understand what they're giving
up in terms of libraries.  (The 'sort of' is because, having learned
3.0, learning the changes for 2.6 is certainly much easier than
learning a first programming language is.)

> The third (albeit much less likely) option is that you're learning  
> Python to learn to interact with a system that's scriptable in embedded  
> Python, like Blender or Gimp.  I don't think there's a single system of  
> that variety which uses 3.0 yet, and these will likely be even slower to  
> move than libraries.  

Let me note that if some application embeds Python for a specialized
purpose, where the only modules imported are either user-written or
part of the application, it seems much *easier* to move to Python 3
because the scripts don't use arbitrary third-party libraries.  Python
embedded in an e-mail MTA might use libraries for DNS or file I/O or
databases and has to be cautious about versions; Python in Gimp
probably doesn't, in practice.

--amk

Re: [Python-Dev] Python + Java Integration

2008-12-05 Thread Bill Janssen

> One thing that would help Python in this "debate" (or, perhaps simply  
> put it in the running, at least as a "next Java" candidate) would be  
> if Python had an easier migration path for Java developers that  
> currently rely upon various third-party libraries.  The wealth of  
> third-party libraries available for Java has always been one of its  
> great strengths.  Ergo, if Python had an easy-to-use, recommended way  
> to use those libraries within the Python environment, that would be a  
> significant advantage to present to Java developers and those who  
> would choose Ruby over Java.  Platform compatibility is always a huge  
> motivator for those looking to migrate or upgrade.

Personally, I'm using Andi Vajda's JCC for this purpose.  Recommended.
The nice thing about it is that it turns jar files into Python modules;
you don't need the source.

http://pypi.python.org/pypi/JCC

Bill

[Python-Dev] Summary of Python tracker Issues

2008-12-05 Thread Python tracker


ACTIVITY SUMMARY (11/28/08 - 12/05/08)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 2233 open (+55) / 14139 closed (+41) / 16372 total (+96)

Open issues with patches:   753

Average duration of open issues: 705 days.
Median duration of open issues: 2193 days.

Open Issues Breakdown
   open    2214 (+54)
pending      19 ( +1)

Issues Created Or Reopened (96)
___

Coding cookie crashes IDLE                                       11/28/08
CLOSED http://bugs.python.org/issue4454    created  tjreedy

No Windows List in IDLE if several windows have the same title   11/28/08
CLOSED http://bugs.python.org/issue4455    created  amaury.forgeotdarc
       patch

xmlrpc is broken                                                 11/28/08
CLOSED http://bugs.python.org/issue4456    created  benjamin.peterson

__import__ documentation obsolete                                11/29/08
       http://bugs.python.org/issue4457    created  stevenjd

getopt.gnu_getopt() loses dash argument                          11/29/08
CLOSED http://bugs.python.org/issue4458    created  muntyan

bdist_rpm assumes python                                         11/29/08
       http://bugs.python.org/issue4459    created  John5342

The parameter of PyInt_AsSsize_t() is not checked to see if it i 11/29/08
CLOSED http://bugs.python.org/issue4460    created  CWRU_Researcher1

parameters of PyLong_FromString() are not checked for NULL       11/29/08
       http://bugs.python.org/issue4461    created  CWRU_Researcher1
       patch

result of PyList_GetItem() not validated                         11/29/08
CLOSED http://bugs.python.org/issue4462    created  CWRU_Researcher1

Parameters and result of PyList_GetItem() are not validated      11/29/08
CLOSED http://bugs.python.org/issue4463    created  CWRU_Researcher1

PyList_GetItem() result and parameters not fully validated       11/29/08
CLOSED http://bugs.python.org/issue4464    created  CWRU_Researcher1

The result of set_copy() is not checked for NULL                 11/29/08
CLOSED http://bugs.python.org/issue4465    created  CWRU_Researcher1

The return value of PyFile_FromFile is not checked for NULL      11/29/08
CLOSED http://bugs.python.org/issue4466    created  CWRU_Researcher1

return value of PyUnicode_AsEncodedString() is not checked for N 11/29/08
CLOSED http://bugs.python.org/issue4467    created  CWRU_Researcher1

Restore chapter enumeration in Python docs                       11/30/08
CLOSED http://bugs.python.org/issue4468    created  schluehk

CVE-2008-5031 multiple integer overflows                         11/30/08
       http://bugs.python.org/issue4469    created  doko

smtplib SMTP_SSL not working.                                    11/30/08
       http://bugs.python.org/issue4470    created  lcatucci
       patch

IMAP4 missing support for starttls                               11/30/08
       http://bugs.python.org/issue4471    created  lcatucci
       patch

Is shared lib building broken on trunk?                          11/30/08
       http://bugs.python.org/issue4472    created  skip.montanaro

POP3 missing support for starttls

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi

Victor Stinner wrote:
> Hi,
> 
> On Thursday 04 December 2008 21:02:19, Toshio Kuratomi wrote:
> 
>> These mixed encodings can occur for a variety of reasons.  Here's an
>> example that isn't too contrived :-)
>> (...)
>> Furthermore, they don't want to suffer from the space loss of using 
>> utf-8 to encode Japanese so they use shift-jis everywhere.
> 
> "space loss"? Really? If you configure your server correctly, you should get 
> UTF-8 even if the file system is Shift-JIS. But it would be much easier to 
> use UTF-8 everywhere.
> 
> Hum... I don't think that the discussion is about one specific server, but
> the lack of bytes environment variables in Python3 :-)
>
Yep.  I can't change the logicalness of the policies of a different
organization, only code my application to deal with it :-)

>> 1) return mixed unicode and byte types in ...
> 
> NO!
> 
It's nice that we agree... but I would prefer if you leave enough
context so that others can see that we agree as well :-)

>> 2) return only byte types in os.environ
> 
> Hum... Most users have UTF-8 everywhere (eg. all Windows users ;-)), and 
> Python3 already use Unicode everywhere (input(), open(), filenames, ...).
>
We're also in agreement here.

>> 3) silently ignore non-decodable value when accessing os.environ['PATH']
>> as we do now but allow access to the full information via
>> os.environ[b'PATH'] and os.getenvb()
> 
> I don't like os.environ[b'PATH']. I prefer to always get the same result 
> type... But os.listdir() doesn't respect that :-(
> 
>    os.listdir(str) -> list of str
>    os.listdir(bytes) -> list of bytes
> 
> I would prefer a similar API for easier migration from Python2/Python3
> (unicode). os.environb sounds like the best choice for me.
> 
After thinking about how it would be used in subprocess calls, I
agree.  os.environb would allow us to retrieve the full dict as bytes.
os.environ[b''] only works on individual keys.  Also, os.getenv serves
the same purpose as os.environ[b''] would, whereas os.environb would have
its own uses.

> 
> But they are open questions (already asked in the bug tracker):
> 
I answered these in the bug tracker.  Here are the answers for the
mailing list:

> (a) Should os.environ be updated if os.environb is changed? If yes, how?
>    os.environb['PATH'] = '\xff' (or any invalid string in the system
>    default encoding)
>    => os.environ['PATH'] = ???
> 
The underlying environment that both variables reflect should be updated
but what is displayed by os.environ should continue to follow the same
rules.  So if we follow option #3::

    os.environb['PATH'] = b'\xff'
    os.environ['PATH']  # raises KeyError because PATH is not a key in
                        # the unicode decoded environment.

(option #4 would issue a UnicodeDecodeError instead of a KeyError)

Similarly, if you start with a variable in os.environb that can only be
represented as bytes and your program transforms it into something that
is decodable it should then show up in os.environ.
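
The synchronisation rule described here can be modelled with a toy that lives entirely outside the real os module. This is only a sketch under stated assumptions: it follows option #3 (an undecodable value looks like a missing key in the str view), and the class name `EnvModel`, its method names and the UTF-8 default are illustrative, not part of any proposal in this thread.

```python
class EnvModel:
    """Toy model: one bytes store, read through a bytes view and a str view."""

    def __init__(self, encoding="utf-8"):
        self.encoding = encoding
        self.raw = {}  # bytes -> bytes, stands in for the process environment

    def setb(self, key, value):
        # Writing through the bytes view always succeeds.
        self.raw[key] = value

    def getb(self, key):
        return self.raw[key]

    def get(self, key):
        # str view: an undecodable value behaves as if the key were absent.
        try:
            return self.raw[key.encode(self.encoding)].decode(self.encoding)
        except UnicodeDecodeError:
            raise KeyError(key) from None

    def delete(self, key):
        # Deleting removes the entry from the shared store, so both views agree.
        del self.raw[key.encode(self.encoding)]


env = EnvModel()
env.setb(b"PATH", b"\xff")
try:
    env.get("PATH")
    visible = True
except KeyError:
    visible = False

env.setb(b"PATH", b"/usr/bin")  # now decodable, so the str view shows it
```

Because both views read the same store, a value that becomes decodable appears in the str view without any extra bookkeeping. Option #4 would differ only in letting the UnicodeDecodeError propagate instead of converting it to KeyError.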

> (b) Should os.environb be updated if os.environ is changed? If yes, how?
> 
> The problem comes with non-Unicode locale (eg. latin-1 or ASCII): most
> charset are unable to encode the whole Unicode charset (eg. codes >= 65535).
>
>    os.environ['PATH'] = chr(0x1)
>    => os.environb['PATH'] = ???
>
Ah, this is a good question.  I misunderstood what you were getting at
when you posted this to the bug report.  I see several options but the
one that seems the most sane is to raise UnicodeEncodeError when setting
the value.  With that, proper code to set an environment variable might
look like this::

    $ LANG=C python3.0
    >>> import os
    >>> variable = chr(0x1)
    >>> try:
    ...     # Unicode aware locales
    ...     os.environ['MYVAR'] = variable
    ... except UnicodeEncodeError:
    ...     # Non-Unicode locales
    ...     os.environb['MYVAR'] = bytes(variable, encoding='utf8')
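
For readers on a later interpreter, the same try/except pattern can be written against the API that eventually shipped: `os.environb` and `os.supports_bytes_environ` were added in Python 3.2, and `os.environb` takes bytes for both keys and values. The helper name `set_env` and the choice of UTF-8 for the fallback are assumptions made for this sketch. Whether the except branch is ever taken depends on the interpreter version and locale.

```python
import os


def set_env(name, value):
    """Set an environment variable, falling back to UTF-8 bytes.

    Returns "str" or "bytes" to say which path was taken.
    """
    try:
        os.environ[name] = value
        return "str"
    except UnicodeEncodeError:
        if not os.supports_bytes_environ:
            # No bytes view on this platform (e.g. Windows): nothing to fall back to.
            raise
        os.environb[name.encode("utf-8")] = value.encode("utf-8")
        return "bytes"


how = set_env("MYVAR", "hello")
```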

> (c) Same question when a key is deleted (del os.environ['PATH']).
> 
Update the underlying env so both os.environ and os.environb reflect the
change.  Deleting should not hold the problems that updating does.

> If Python 3.1 will have os.environ and os.environb, I'm quite sure that some 
> modules will user os.environ and other will prefer os.environb. If both 
> environments are differents, the two modules set will work differently :-/
> 
Exactly.  So making sure they hold the same information is a priority.

> It would be maybe easier if os.environ supports bytes and unicode keys. But
> we have to keep these assertions:
>    os.environ[bytes] -> bytes
>    os.environ[str] -> str
> 
I think the same choices have to be made here.  If LANG=C, we still have
to decide what to do when os.environ[str] is set to a non-ASCII string.

Additionally, the subprocess question makes using the key value
undesirable compared with having a separate os.environb that accesses
the same underlying data.

>> 4) raise an exception when non-decodable values are *accessed* and
>> continue as in #

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Guido van Rossum

On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt <[EMAIL PROTECTED]> wrote:
> Seriously, what would you suggest to someone that
> wants to handle paths in a portable way? Using the Unicode variants of
> functions is fubar, because encoding/decoding is not universally possible.
> Using the byte variant is equally fubar, because e.g. on MS Windows it is not
> supported, except through a very lossy roundtrip through the locale's
> codepage, limiting your functionality.

Write a lightweight abstraction layer that uses Unicode when possible
and bytes otherwise. You'd need to write a few functions for the path
handling code you need, with a platform check or two sprinkled in.

Writing such an abstraction for the purpose of one specific
application is usually simple enough. However, writing a similar
abstraction that serves all apps and all use cases is hard. I hope
that eventually someone will come up with one though -- the failure of
earlier path object proposals notwithstanding.
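
As an illustration of how small such an application-specific layer can be, here is a sketch under stated assumptions: it only covers directory listing, and the names `Entry`, `to_entry` and `list_entries` are invented for this example. It relies on `os.fsencode`, which exists in Python 3.2 and later.

```python
import os
import sys
from typing import List, NamedTuple, Optional, Union


class Entry(NamedTuple):
    raw: Union[str, bytes]  # the value to hand back to os functions
    text: Optional[str]     # decoded name, or None if it cannot be decoded


def to_entry(raw: bytes, encoding: str) -> Entry:
    try:
        return Entry(raw, raw.decode(encoding))
    except UnicodeDecodeError:
        return Entry(raw, None)


def list_entries(path: str = ".") -> List[Entry]:
    if os.name == "nt":
        # Platform check: Windows filenames are natively Unicode.
        return [Entry(name, name) for name in os.listdir(path)]
    encoding = sys.getfilesystemencoding()
    # Elsewhere, list as bytes and decode each name where possible.
    return [to_entry(name, encoding) for name in os.listdir(os.fsencode(path))]
```

Keeping `raw` next to `text` lets the application display the names it can decode while still being able to open, rename or delete the ones it cannot.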

> I actually think it is about time to give up on trying to think about a path
> as a string. Dito for data received from os.environ or sys.argv. There are
> only very few things that are universal to them and a reliable encoding is
> none of them. Then, once you have let that idea go, meditate a bit over the
> Zen.

This sounds too pessimistic to me. I expect that in five years it will
be universally accepted that these variables must be encoded in a
standard encoding. People are never going to give up thinking about
filenames etc. as strings, because that's what they are conceptually.
The problem is purely one of encoding, and that's where Unix/Linux are
behind the curve, since (so far) they haven't taken the plunge and
picked a universal standard encoding, the way Windows and Mac OS X
have done.

> What I propose is that paths must be treated as OS-specific, with the only
> common reliable operations being joining them, concatenating them and
> splitting them into segments divided by the (again, OS-specific) separator.
> Other operations, like e.g. appending a string or converting it to a string
> in order to display it can fail. And if they fail, they should fail noisily.

That's bad though, since filenames are being displayed all the time
(e.g. in error messages).

> In 99% of all cases, using the default encoding will work and do what people
> expect, which is why I would make this conversion automatic. In all other
> cases, it will at least not fail silently (which would lead to garbage and
> data loss) and allow more sophisticated applications to handle it.

I think the "always fail noisily" approach isn't the best approach.
E.g. if I am globbing for *.py, and there's an undecodable .txt file
in a directory, its presence shouldn't cause the glob to fail.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Ted Leung


On Dec 4, 2008, at 7:59 PM, [EMAIL PROTECTED] wrote:

> On 02:35 am, [EMAIL PROTECTED] wrote:
>> On Thu, Dec 04, 2008 at 05:29:31PM -0800, Raymond Hettinger wrote:
>>> Here's a bright idea.  On the 3.0 release page, include a box listing
>>> which major third-party apps have been converted.  Update it
>>> once every couple of weeks.  That way, we're not explicitly
>>
>> That's an excellent idea.  We could have a webpage, or start a
>> topic-specific weblog for posting announcements.
>>
>> I've started a draft of a 3.0 FAQ in the wiki at
>> .  Once it's finished we
>> can move it into the 3.0 release pages.  Everyone please edit and
>> improve it!
>
> It occurs to me that this specific idea (the box with the list of
> supported applications / libraries) should be implementable as a
> simple query against PyPI.  I don't know if it actually is :), but
> it should be.  In general it would be nice to know whether one's
> favorite tools were available for *any* new Python version.


I agree with this.   Plus it might act as an incentive for people to  
port libraries faster...
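
The filtering half of such a query is simple once the metadata is in hand. The sketch below assumes packages declare the "Programming Language :: Python :: 3" trove classifier (or a minor-version classifier under it); the function name and the project names are made up, and nothing here talks to PyPI itself.

```python
PY3 = "Programming Language :: Python :: 3"


def supports_python3(classifiers):
    """True if any classifier is the Python 3 one or a minor version under it."""
    return any(c == PY3 or c.startswith(PY3 + ".") for c in classifiers)


# Hypothetical metadata, keyed by made-up project names.
projects = {
    "example-ported": ["Programming Language :: Python :: 2",
                       "Programming Language :: Python :: 3"],
    "example-legacy": ["Programming Language :: Python :: 2.5"],
}
ported = sorted(name for name, cls in projects.items() if supports_python3(cls))
```

A list like `ported` is what the proposed box on the release page would display; how the metadata is fetched is left open.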


Ted

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Guido van Rossum

On Thu, Dec 4, 2008 at 11:27 PM,  <[EMAIL PROTECTED]> wrote:
> With all due respect, for me, "library support" and "serious use" are
> synonymous.

Glyph, I cannot have a discussion with you if every single post of
yours is longer than my combined daily output. Please spend some time
writing shorter posts. I'm sure I'm not the only one here with a short
attention span. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Fred Drake


On Dec 5, 2008, at 10:25 AM, [EMAIL PROTECTED] wrote:
> Good.  Now we just need to populate them.  I take it the classifiers
> without minor numbers imply any known minor version (e.g., 2 ==> 2.3
> and greater)?

This is an excellent question, Skip.

There was already "Programming Language :: Python", provided by many  
packages.  I think version compatibility relationships meant by each  
of these classifiers should be made explicit, wherever it is that  
documentation for classifiers is provided.


I don't recall having seen any such documentation; hopefully I just  
need to be hit by another clue.



  -Fred

--
Fred Drake   


[Python-Dev] import docs follow-up

2008-12-05 Thread Georg Brandl

Hi,

as a follow-up to the thread a few days ago, and the bug report, I've
rewritten most of the __import__ docs.  I've attached the suggested patch
to the issue .

I'd be glad for reviews. Also, I'd like to ask about opinions if this
"winning idiom" (as a bug comment states) should be in it, instead of
the getattr() helper function:

>>> import sys
>>> __import__('x.y.z')
>>> mod = sys.modules['x.y.z']
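
The difference between the two approaches is easier to see on a real dotted name. In this sketch the standard-library module `xml.dom.minidom` stands in for `x.y.z`, and the helper name `import_by_getattr` is illustrative only.

```python
import sys


def import_by_getattr(name):
    # __import__ returns the top-level package, so walk down attribute by attribute.
    module = __import__(name)
    for part in name.split(".")[1:]:
        module = getattr(module, part)
    return module


top = __import__("xml.dom.minidom")     # the top-level package "xml"
leaf = sys.modules["xml.dom.minidom"]   # the submodule that was asked for
```

Both routes end at the same module object; the `sys.modules` lookup simply avoids the loop. Later Python versions also provide `importlib.import_module` for this purpose.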

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


[Python-Dev] ANN: new python-porting mailing list

2008-12-05 Thread Georg Brandl

Hi all,

to facilitate discussion about porting Python code between different versions
(mainly of course from 2.x to 3.x), we've created a new mailing list

   [EMAIL PROTECTED]

It is a public mailing list open to everyone.  We expect active participation
of many people porting their libraries/programs, and hope that the list can
be a help to all wanting to go this (not always smooth :-) way.

@python-dev: it would of course be nice to have more than a few developers
on that list ;-)

regards,
Georg


Re: [Python-Dev] Merging flow

2008-12-05 Thread Mark Dickinson

On Thu, Dec 4, 2008 at 3:12 PM, Christian Heimes <[EMAIL PROTECTED]> wrote:
> Flow diagram
> 
>
> trunk ---> release26-maint
>   \->  py3k   ---> release30-maint
>

I'm running into problems making this work, with a trivial change:
I committed r67590 (which adds a single assert to ast.c) to the
trunk, then merged to 2.6 and py3k in r67592 and r67595 respectively.
Then I tried:

../svnmerge.py merge -r67595

from the root directory of a clean copy of the release30-maint
branch (svn status gives no output), and got conflicts on '.':

property 'svnmerge-integrated' set on '.'

property 'svnmerge-blocked' set on '.'

--- Merging r67595 into '.':
U    Python/ast.c
 C   .

property 'svnmerge-integrated' set on '.'

property 'svnmerge-blocked' deleted from '.'.

I now have a new file dir_conflicts.prej that looks something like:

Trying to change property 'svnmerge-integrated' from
'/python/trunk:1-61437,...,67528,67590', but property has been locally
changed from
'/python/branches/py3k:1-67498,67522-67524,67539,67541,67559,67588' to
'/python/trunk:1-61437,...,67467,67484,67528'.

(where the ... abbreviates a big long list of revision numbers).

Did I mess up somewhere, or does svnmerge not work on
a revision that was itself the result of an svnmerge?

Mark

Re: [Python-Dev] ANN: new python-porting mailing list

2008-12-05 Thread Brett Cannon

On Fri, Dec 5, 2008 at 10:36, Georg Brandl <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> to facilitate discussion about porting Python code between different versions
> (mainly of course from 2.x to 3.x), we've created a new mailing list
>
>   [EMAIL PROTECTED]
>
> It is a public mailing list open to everyone.  We expect active participation
> of many people porting their libraries/programs, and hope that the list can
> be a help to all wanting to go this (not always smooth :-) way.
>

The mailing list URL is
http://mail.python.org/mailman/listinfo/python-porting for those who
don't want to search on the mail.python.org home page (which looks
really dated at this point).

-Brett

Re: [Python-Dev] Merging flow

2008-12-05 Thread Brett Cannon

On Fri, Dec 5, 2008 at 11:20, Mark Dickinson <[EMAIL PROTECTED]> wrote:
> On Thu, Dec 4, 2008 at 3:12 PM, Christian Heimes <[EMAIL PROTECTED]> wrote:
>> Flow diagram
>> 
>>
>> trunk ---> release26-maint
>>   \->  py3k   ---> release30-maint
>>
>
> I'm running into problems making this work, with a trivial change:
> I committed r67590 (which adds a single assert to ast.c) to the
> trunk, then merged to 2.6 and py3k in r67592 and r67595 respectively.
> Then I tried:
>
> ../svnmerge.py merge -r67595
>
> from the root directory of a clean copy of the release30-maint
> branch (svn status gives no output), and got conflicts on '.':
>
> property 'svnmerge-integrated' set on '.'
>
> property 'svnmerge-blocked' set on '.'
>
> --- Merging r67595 into '.':
> U    Python/ast.c
>  C   .
>
> property 'svnmerge-integrated' set on '.'
>
> property 'svnmerge-blocked' deleted from '.'.
>
> I now have a new file dir_conflicts.prej that looks something like:
>
> Trying to change property 'svnmerge-integrated' from
> '/python/trunk:1-61437,...,67528,67590', but property has been locally
> changed from
> '/python/branches/py3k:1-67498,67522-67524,67539,67541,67559,67588' to
> '/python/trunk:1-61437,...,67467,67484,67528'.
>
> (where the ... abbreviates a big long list of revision numbers).
>
> Did I mess up somewhere, or does svnmerge not work on
> a revision that was itself the result of an svnmerge?

Someone might know better than me, but I am willing to bet you can't
svnmerge a svnmerge revision, since the svnmerge revision contains
changes to the metadata on '.' that will conflict with the new svnmerge
values written by the merge you are trying to do. But if I am
right about this, won't that require blocking the py3k svnmerge
revision on release30-maint?

Ugh. Is this getting to the point that we can only svnmerge between
trunk and py3k and the maintenance branches just have to be managed
the old-fashioned way?

And I have pinged the people helping me with the DVCS PEP in hopes of
getting us moved off of svn sooner rather than later.

-Brett

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Gregor Lingl




Guido van Rossum wrote:
> I hear some folks are considering advertising 3.0 as experimental or
> not ready for serious use yet.
>
> I think that's too negative -- we should encourage people to use it,
> period. They'll have to decide for themselves whether they can live
> with the lack of ported 3rd party libraries -- which may resolve
> itself soon enough.

I'd find it useful to have a special, regularly updated index of
libraries already ported to 3.0 somewhere on python.org.


Gregor

Re: [Python-Dev] Merging flow

2008-12-05 Thread Fred Drake


On Dec 5, 2008, at 2:20 PM, Mark Dickinson wrote:
> Did I mess up somewhere, or does svnmerge not work on
> a revision that was itself the result of an svnmerge?


I ran into this yesterday as well with my patch to the cgi module.   
The work-around was to revert the change to that property and edit it  
manually.


I think this is a significant issue, since editing that property is  
about as error-prone as it can be.  I've not really looked at the code  
in svnmerge.py, so I'm not sure how hard it would be to fix.



  -Fred

--
Fred Drake   


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Gregor Lingl




[EMAIL PROTECTED] wrote:
> To be fair, if someone asked me specifically about educating
> non-programmer adults about programming, I would probably at least
> *mention* py3, if not recommend it outright.  The improved consistency
> is worth a lot in an educational setting.  (But, if one is educating
> children and interested in soliciting their genuine enthusiasm,
> whiz-bang graphics are really a must-have, not a negotiable extra.)

As a non-native English speaker I'm not sure I understand correctly
what you mean by whiz-bang graphics. Nevertheless I'd like to point
you to the new turtle graphics module (which has been part of the
standard library since 2.6). At least it was designed especially for
use in the educational domain. Moreover, the source distribution also
contains about ten example scripts.


Regards,
Gregor


Re: [Python-Dev] ANN: new python-porting mailing list

2008-12-05 Thread skip


Georg>[EMAIL PROTECTED]

Georg> It is a public mailing list open to everyone.  We expect active
Georg> participation of many people porting their libraries/programs,
Georg> and hope that the list can be a help to all wanting to go this
Georg> (not always smooth :-) way.

I trust you will announce this in python-list and python-announce-list if
you haven't already?

Skip

Re: [Python-Dev] ANN: new python-porting mailing list

2008-12-05 Thread Georg Brandl

[EMAIL PROTECTED] wrote:
> Georg>[EMAIL PROTECTED]
> 
> Georg> It is a public mailing list open to everyone.  We expect active
> Georg> participation of many people porting their libraries/programs,
> Georg> and hope that the list can be a help to all wanting to go this
> Georg> (not always smooth :-) way.
> 
> I trust you will announce this in python-list and python-announce-list if
> you haven't already?

I've sent it to python-announce, it's in the moderator queue.  I'm not on
python-list so I can't answer followups.  If you'd like to do an
announcement there, I'd be happy :)

Georg


Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Mike Klaas



On 5-Dec-08, at 8:40 AM, A.M. Kuchling wrote:

> On Fri, Dec 05, 2008 at 05:40:46AM -, [EMAIL PROTECTED] wrote:
>> For most users, especially new users who have yet to be impressed with
>> Python's power, 2.x is much better.  It's not like "library support" is
>> one small check-box on the language's feature sheet: most of the
>> attractive things about Python are libraries.  Of course I am not free
>
> Here I agree, sort of.  Newbies may not understand what they're giving
> up in terms of libraries.  (The 'sort of' is because, having learned
> 3.0, learning the changes for 2.6 is certainly much easier than
> learning a first programming language is.)


For possible insight, here is a current discussion on the topic:

http://www.reddit.com/r/programming/comments/7hlra/ask_progit_ive_got_the_itch_to_learn_python_since/

(note that these would be programmers interested in learning python,  
not people trying to learn programming)


-Mike

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi

Guido van Rossum wrote:
> On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt <[EMAIL PROTECTED]> wrote:
>> In 99% of all cases, using the default encoding will work and do what people
>> expect, which is why I would make this conversion automatic. In all other
>> cases, it will at least not fail silently (which would lead to garbage and
>> data loss) and allow more sophisticated applications to handle it.
> 
> I think the "always fail noisily" approach isn't the best approach.
> E.g. if I am globbing for *.py, and there's an undecodable .txt file
> in a directory, its presence shouldn't cause the glob to fail.
> 
But why should it make glob() fail?  This sounds like an implementation
detail of glob.  Here's some pseudo-code::

    def glob(pattern):
        string = False
        if isinstance(pattern, str):
            string = True
            if platform == 'POSIX':
                pattern = bytes(pattern, encoding=defaultencoding)
        rawfiles = os.listdir(os.path.dirname(pattern) or pattern)
        if string and platform == 'POSIX':
            return [str(f) for f in rawfiles if match(f, pattern)]
        else:
            return rawfiles

This way the traceback occurs if anything in the result set is
undecodable.  What am I missing?
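
A runnable sketch of the idea in the pseudo-code above, under several assumptions: it handles a single directory, uses `fnmatch` for the matching step, lets `sys.getfilesystemencoding()` stand in for `defaultencoding`, and drops the platform check for brevity. The function name `glob_bytes_first` is invented for this example and is not the real `glob` module.

```python
import fnmatch
import os
import sys


def glob_bytes_first(pattern, encoding=None):
    """List one directory as bytes, match, then decode only the matches."""
    encoding = encoding or sys.getfilesystemencoding()
    want_str = isinstance(pattern, str)
    if want_str:
        pattern = pattern.encode(encoding)
    dirname, basename = os.path.split(pattern)
    rawfiles = os.listdir(dirname or b".")
    matches = [f for f in rawfiles if fnmatch.fnmatchcase(f, basename)]
    if want_str:
        # Raises UnicodeDecodeError only if a *matching* name is undecodable.
        return [f.decode(encoding) for f in matches]
    return matches
```

With this ordering an undecodable .txt file never reaches the decode step when the pattern is *.py, while an undecodable .py file still produces a traceback.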

-Toshio




Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Guido van Rossum

On Fri, Dec 5, 2008 at 12:05 PM, Toshio Kuratomi <[EMAIL PROTECTED]> wrote:
> Guido van Rossum wrote:
>> On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt <[EMAIL PROTECTED]> wrote:
>>> In 99% of all cases, using the default encoding will work and do what people
>>> expect, which is why I would make this conversion automatic. In all other
>>> cases, it will at least not fail silently (which would lead to garbage and
>>> data loss) and allow more sophisticated applications to handle it.
>>
>> I think the "always fail noisily" approach isn't the best approach.
>> E.g. if I am globbing for *.py, and there's an undecodable .txt file
>> in a directory, its presence shouldn't cause the glob to fail.
>>
> But why should it make glob() fail?  This sounds like an implementation
> detail of glob.

Glob was just an example. Many use cases for directory traversal
couldn't care less if they see *all* files.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi

Guido van Rossum wrote:
> Glob was just an example. Many use cases for directory traversal
> couldn't care less if they see *all* files.
> 
Okay.  Makes it harder to prove correct or not if I don't know what the
use case is :-)  I can't think of a single use case off-hand.

Even your example of a ??.txt file making retrieval of *.py files fail
is a little broken.  If there was a ??.py file that was undecodable the
program would most likely want to know that file existed.

-Toshio




Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Tres Seaver


Gregor Lingl wrote:
> 
> [EMAIL PROTECTED] schrieb:
>> To be fair, if someone asked me specifically about educating non- 
>> programmer adults about programming, I would probably at least 
>> *mention* py3, if not recommend it outright.  The improved consistency 
>> is worth a lot in an educational setting.  (But, if one is educating 
>> children and interested in soliciting their genuine enthusiasm, 
>> whiz-bang graphics are really a must-have, not a negotiable extra.)
> As a non native English speaker I'm not sure if I understand correctly, 
> what you mean with whiz-bang graphics. Nevertheless I'd like to point 
> you to the new turtle graphics module (which is part of the standard 
> librarys since 2.6). At least it was designed especially for use in the 
> educational  domain. Moreover the source-distribution also contains a 
> bunch of some ten example scripts.

I'm pretty sure he means that turtle graphics are not "whiz-bang" (in this
century, at least).  Being able to do pygame-style OpenGL stuff would be
"whiz bang"[1] in my book.


[1] http://www.merriam-webster.com/dictionary/whizbang


Tres.
--
Tres Seaver  +1 540-429-0999  [EMAIL PROTECTED]
Palladion Software   "Excellence by Design"   http://palladion.com


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi

Guido van Rossum wrote:
> At the risk of bringing up something that was already rejected, let me
> propose something that follows the path taken in 3.0 for filenames,
> rather than doubling back:
> 
> For os.environ, os.getenv() and os.putenv(), I think a similar
> approach as used for os.listdir() and os.getcwd() makes sense: let
> os.environ skip variables whose name or value is undecodable, and have
> a separate os.environb() which contains bytes; let os.getenv() and
> os.putenv() do the right thing when the arguments passed in are bytes.
> 
I prefer the method used by file.read() where an error is thrown when
accessing undecodable data.  I think in time python programmers will
consider not throwing an exception a wart in python3.  However, this is
enough to allow programmers to do the right thing once an error is
reported by users and the cause has been tracked down so it doesn't
block fixing errors as the current code does.

And it's not like anyone expected python3 to be wart-free just because
the python2 warts were fixed ;-)

> For sys.argv, because it's positional, you can't skip undecodable
> values, so I propose to use error=replace for the decoding; again, we
> can add sys.argvb that contains the raw bytes values. The various
> os.exec*() and os.spawn*() calls (as well as os.system(), os.popen()
> and the subprocess module) should all accept bytes as well as strings.
> 
This also seems sane with the same comment about throwing errors.

-Toshio




Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Victor Stinner

Hi,

> > But they are open questions (already asked in the bug tracker):
>
> I answered these in the bug tracker.  Here are the answers for the
> mailing list:

Oh, sorry. I didn't follow the end of the discussion on the bug tracker.

> >os.environb['PATH'] = '\xff'
> >=> os.environ['PATH'] = ???
>
>  os.environ['PATH'] => raises KeyError because PATH is not a key in
> the unicode decoded environment.

Ok, good answer :-)

> >os.environ['PATH'] = chr(0x1)
> >=> os.environb['PATH'] = ???
>
> raise UnicodeEncodeError when setting the value.

Ok, it's consistent with the current behaviour.

$ LANG=C ./python
Python 3.0rc3+ (py3k:67498M, Dec  4 2008, 17:45:54)
>>> import os
>>> os.environ['x'] = '\xff'
>>> os.environ['x']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/haypo/prog/py3k/Lib/io.py", line 1491, in write
    b = encoder.encode(s)
  File "/home/haypo/prog/py3k/Lib/encodings/ascii.py", line 22, in encode
    return codecs.ascii_encode(input, self.errors)[0]
UnicodeEncodeError: 'ascii' codec can't encode character '\xff' in position 1: ordinal not in range(128)

Oh, that's strange :-p The error is delayed when we read the value.

> > It would be maybe easier if os.environ supports bytes and unicode keys.
> > But we have to keep these assertions:
> >os.environ[bytes] -> bytes
> >os.environ[str] -> str
>
> I think the same choices have to be made here.  If LANG=C, we still have
> to decide what to do when os.environ[str] is set to a non-ASCii string.

If the charset is US-ASCII, os.environ will drop non-ASCII values. But most 
variables are ASCII only. Examples with my shell:

$ env
XCURSOR_THEME=kubuntu
LANG=fr_FR.UTF-8
EDITOR=vim
HOME=/home/haypo
...

> Additionally, the subprocess question makes using the key value
> undesirable compared with having a separate os.environb that accesses
> the same underlying data.

The user should be able to choose bytes or unicode. Examples:
 - subprocess.Popen('ls') => use unicode environment (os.environ)
 - subprocess.Popen(b'ls') => use bytes environment (os.environb)

> Here's my problem with it, though.  With these semantics any program
> that works on arbitrary files and runs on *NIX has to check
> os.listdir(b'') and do the conversion manually.

Only programs that have to support a strange environment like yours (mixing 
Shift-JIS and UTF-8) :-) Most programs don't have to support such a charset 
mixture.

We can imagine a higher-level library working on UNIX and Windows (bytes or 
Unicode). But that would come later.

> I think the desired behaviour assuming the existence of a nondecodable
> file is this:

I prefer the current behaviour :-)

> Why do you think that glob.glob('*.py') is special and should not traceback?

It's not special. glob() reuses listdir(), and it was an example to show 
that "it just works".

> I just differ in that I think lack of tracebacks when
> UnicodeDecodeErrors are encountered is a wart in python3 that did not
> exist in python2.

Right.

-- 
Victor Stinner aka haypo
http://www.haypocalc.com/blog/

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan

Toshio Kuratomi wrote:
> Guido van Rossum wrote:
>> Glob was just an example. Many use cases for directory traversal
>> couldn't care less if they see *all* files.
>>
> Okay.  Makes it harder to prove correct or not if I don't know what the
> use case is :-)  I can't think of a single use case off-hand.
> 
> Even your example of a ??.txt file making retrieval of *.py files fail
> is a little broken.  If there was a ??.py file that was undecodable the
> program would most likely want to know that file existed.

Why? Most programs won't be able to do anything with it. And if the
program *can* do something with it... that's what the bytes version of
the APIs are for.

Cheers,
Nick.


-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---

Re: [Python-Dev] import docs follow-up

2008-12-05 Thread Nick Coghlan

Georg Brandl wrote:
> Hi,
> 
> as a follow-up to the thread a few days ago, and the bug report, I've
> rewritten most of the __import__ docs.  I've attached the suggested patch
> to the issue .
> 
> I'd be glad for reviews. Also, I'd like to ask about opinions if this
> "winning idiom" (as a bug comment states) should be in it, instead of
> the getattr() helper function:
> 
>     import sys
>     __import__('x.y.z')
>     mod = sys.modules['x.y.z']

That way is a lot cleaner than other mechanisms I've seen (including the
current mechanism in the docs). Making that the recommended way of doing
a dynamic import seems like a good idea to me.
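
A minimal sketch of that idiom wrapped in a helper, for illustration only.
The name import_dotted is hypothetical and not part of the proposed docs
patch; the module xml.dom.minidom is just a convenient stdlib example.

```python
import sys


def import_dotted(name):
    """Import a dotted module name and return the innermost module.

    __import__('x.y.z') returns the top-level package 'x', so the
    submodule is looked up in sys.modules after the import has run.
    """
    __import__(name)
    return sys.modules[name]


# __import__ alone hands back the top-level package...
top = __import__("xml.dom.minidom")
# ...while the sys.modules lookup gives the module that was asked for.
leaf = import_dotted("xml.dom.minidom")
```

Later Python releases also offer importlib.import_module(), which returns
the innermost module directly.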

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi

Victor Stinner wrote:
>>> It would be maybe easier if os.environ supports bytes and unicode keys.
>>> But we have to keep these assertions:
>>>os.environ[bytes] -> bytes
>>>os.environ[str] -> str
>> I think the same choices have to be made here.  If LANG=C, we still have
>> to decide what to do when os.environ[str] is set to a non-ASCii string.
> 
> If the charset is US-ASCII, os.environ will drop non-ASCII values. But most 
> variables are ASCII only. Examples with my shell:
> 
Yes.  But you still have the question of what to do when:
os.environ[str] = chr(0x1)

So I don't think it makes things simpler than having separate os.environ
and os.environb that update the same data behind the scenes.

>> Additionally, the subprocess question makes using the key value
>> undesirable compared with having a separate os.environb that accesses
>> the same underlying data.
> 
> The user should be able to choose bytes or unicode. Examples:

The subprocess question was posed further up the thread as basically --
does the user need to access os.environb in order to override things in
the environment when calling subprocess?  I think the answer to that is
yes since you might want to start with your environment and modify it
slightly when you call programs via subprocess.  If you just try to copy
os.environ and os.environ only iterates through the decodable env vars,
that doesn't work.  If you have an os.environb to copy it becomes possible.
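
A rough sketch of the "copy and modify slightly" case, assuming a bytes
view of the environment exists. Later releases (Python 3.2 on POSIX) did
ship os.environb and os.supports_bytes_environ; the helper name
environ_copy_with and the variable DEMO_VAR are hypothetical.

```python
import os


def environ_copy_with(name, value):
    """Return a copy of the whole environment with one variable overridden.

    Copying the bytes view keeps variables whose values cannot be decoded;
    where no bytes view is available, fall back to the str view.
    """
    if os.supports_bytes_environ:
        env = dict(os.environb)
        env[os.fsencode(name)] = os.fsencode(value)
    else:
        env = dict(os.environ)
        env[name] = value
    return env


child_env = environ_copy_with("DEMO_VAR", "1")
```

The resulting mapping can be handed to subprocess.Popen(..., env=child_env),
so the child sees every inherited variable plus the override.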

>  - subprocess.Popen('ls') => use unicode environment (os.environ)
>  - subprocess.Popen(b'ls') => use bytes environment (os.environb)
> 
That's... not expected to me :-(

If I never touch os.environ and invoke subprocess the normal way, I'd
still expect the whole environment to be passed on to the program being
called.  This is how invoking programs manually, shell scripting,
invoking programs from perl, python2, etc work.

Also, it's not really a good fit with the other things that key off of
the initial argument.  os.listdir(b'.') changes the output to bytes.
subprocess.Popen(b'ls') would change what environment gets input into
the call.

>> Here's my problem with it, though.  With these semantics any program
>> that works on arbitrary files and runs on *NIX has to check
>> os.listdir(b'') and do the conversion manually.
> 
> Only programs that have to support strange environment like yours (mixing 
> Shift-JIS and UTF-8) :-) Most programs don't have to support these charset 
> mixture.
> 
Any program that is intended to be distributed, accesses arbitrary
files, and works on *nix platforms needs to take this into account.
Just because the environment inside of my organization is sane doesn't
mean that when we release the code to customers, clients, or the free
software community that the places it runs will be as strict about these
things.

Are most programs specific to one organization or are they distributed
to other people?  I can't answer that... everything I work on (except
passwords:-) is distributed -- from sys admin cronjobs to web
applications since I'm lucky that my whole job is devoted to working on
free software.

-Toshio


Re: [Python-Dev] Merging flow

2008-12-05 Thread Nick Coghlan

Fred Drake wrote:
> On Dec 5, 2008, at 2:20 PM, Mark Dickinson wrote:
>> Did I mess up somewhere, or does svnmerge not work on
>> a revision that was itself the result of an svnmerge?
> 
> I ran into this yesterday as well with my patch to the cgi module.  The
> work-around was to revert the change to that property and edit it manually.
> 
> I think this is a significant issue, since editing that property is
> about as error-prone as it can be.  I've not really looked at the code
> in svnmerge.py, so I'm not sure how hard it would be to fix.

I think we're discovering the real reasons why people generally prefer
to use a DVCS when trying to manage multiple branches :P

For now it looks like we might have to maintain 3.0 manually, with
svnmerge only helping out for trunk->2.6 and trunk->py3k...

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan

Toshio Kuratomi wrote:
> Are most programs specific to one organization or are they distributed
> to other people?

The former. That's pretty well documented in assorted IT literature
('shrink-wrap' and open source commodity software are still relatively
new players on the scene that started to shift the balance the other
way, but now the server side elements of web services are shifting it
back again).

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---

Re: [Python-Dev] Merging flow

2008-12-05 Thread Christian Heimes


Nick Coghlan wrote:

> I think we're discovering the real reasons why people generally prefer
> to use a DVCS when trying to manage multiple branches :P
>
> For now it looks like we might have to maintain 3.0 manually, with
> svnmerge only helping out for trunk->2.6 and trunk->py3k...


The problem seems to be trunk -> py3k -> 3.0. I had no issues with py3k 
-> 3.0.


Christian

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi

Nick Coghlan wrote:
> Toshio Kuratomi wrote:
>> Are most programs specific to one organization or are they distributed
>> to other people?
> 
> The former. That's pretty well documented in assorted IT literature
> ('shrink-wrap' and open source commodity software are still relatively
> new players on the scene that started to shift the balance the other
> way, but now the server side elements of web services are shifting it
> back again).
> 
Cool.  So it's only people writing code to be shared with the larger
community or written for multiple customers that are affected by bugs
like this. :-/

-Toshio




Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi

Nick Coghlan wrote:
> Toshio Kuratomi wrote:
>> Guido van Rossum wrote:
>>> Glob was just an example. Many use cases for directory traversal
>>> couldn't care less if they see *all* files.
>>>
>> Okay.  Makes it harder to prove correct or not if I don't know what the
>> use case is :-)  I can't think of a single use case off-hand.
>>
>> Even your example of a ??.txt file making retrieval of *.py files fail
>> is a little broken.  If there was a ??.py file that was undecodable the
>> program would most likely want to know that file existed.
> 
> Why? Most programs won't be able to do anything with it. And if the
> program *can* do something with it... that's what the bytes version of
> the APIs are for.
> 
Nonsense.  A program can do tons of things with a non-decodable
filename.  Where it's limited is non-decodable filedata.

For instance, if you have a graphical text editor, you need to let the
user select files to load.  To do that you need to list all the files in
a directory, even the ones that aren't decodable.  For the ones that
aren't decodable, the editor needs to substitute something like:
  str(filename, errors='replace') + '(Filename not encoded in UTF8)'
in the file listing that the user sees.  When the file is loaded, it
needs to access the actual raw filename.  The file can then be loaded
and operated upon and even saved back to disk using the raw, undecodable
filename.

If you have a file manager, you need to code something that lets the
user move the file around.  Once again, the program loads the raw
filenames.  It transforms the name into something representable to the
user.  It displays that.  The user selects it and asks that it be moved
to another location.  Then the program uses the raw filename to move
from one location to another.

If you have a backup program, you need to list all the files in a
directory.  Then you need to copy those files to another location.  Once
again you have to retrieve the byte version of any non-decodable filenames.
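
The pattern common to all three cases can be sketched roughly as below:
keep the raw bytes name for file operations and derive a display string
for the user. This is only an illustration; display_name and listing are
hypothetical helper names, and UTF-8 is assumed as the expected encoding.

```python
def display_name(raw, encoding="utf-8"):
    """Return something presentable for a raw (bytes) filename."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Undecodable bytes become U+FFFD; flag the entry for the user.
        return (raw.decode(encoding, errors="replace")
                + " (Filename not encoded in UTF8)")


def listing(raw_names):
    """Pair each display string with the raw name used for real I/O."""
    return [(display_name(raw), raw) for raw in raw_names]


# Raw names such as os.listdir(b'.') would return them.
entries = listing([b"notes.txt", b"\xff.txt"])
```

The program shows the first element of each pair and passes the second,
unchanged, to open(), os.rename() and friends.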

-Toshio


Re: [Python-Dev] Merging flow

2008-12-05 Thread Fred Drake


On Dec 5, 2008, at 5:31 PM, Nick Coghlan wrote:

> I think we're discovering the real reasons why people generally prefer
> to use a DVCS when trying to manage multiple branches :P


Really?  I don't.  The issue has nothing to do with someone  
maintaining private change sets, or wanting to do development with  
local commits without having access to commit to the project.


I expect (and someone from work has said they do as well) that  
Subversion 1.5's merge tracking would have handled this situation.



> For now it looks like we might have to maintain 3.0 manually, with
> svnmerge only helping out for trunk->2.6 and trunk->py3k...



I don't know if I'll have time to look at svnmerge this weekend (with  
house guests and all), but I really don't expect it's a difficult  
problem to solve in the tool.  The behavior suggests that this tiered  
set of branch relationships wasn't expected.



  -Fred

--
Fred Drake   


[Python-Dev] Merging flow

2008-12-05 Thread Jim Jewett

Nick Coghlan wrote:

> For now it looks like we might have to maintain 3.0 manually, with
> svnmerge only helping out for trunk->2.6 and trunk->py3k

Does it make the bookkeeping horrible if you merge from trunk straight
to 3.0, and then block the svnmerged changes from propagating?

-jJ

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Martin v. Löwis

> Good.  Now we just need to populate them.  I take it the classifiers without
> minor numbers imply any known minor version (e.g., 2 ==> 2.3 and greater)?

Perhaps. As usual, they mean what people use them for.

I intended them to mean 2.x and 3.x, respectively, with no constraint on
x (i.e. including possibly 2.0 and 2.1). In particular, presence of "2"
and absence of "3" is meant to indicate "I know that it won't work on
Python 3".

Regards,
Martin

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread rdmurray


On Fri, 5 Dec 2008 at 12:11, Guido van Rossum wrote:
> On Fri, Dec 5, 2008 at 12:05 PM, Toshio Kuratomi <[EMAIL PROTECTED]> wrote:
>> Guido van Rossum wrote:
>>> On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt <[EMAIL PROTECTED]> wrote:
>>>> In 99% of all cases, using the default encoding will work and do what people
>>>> expect, which is why I would make this conversion automatic. In all other
>>>> cases, it will at least not fail silently (which would lead to garbage and
>>>> data loss) and allow more sophisticated applications to handle it.
>>>
>>> I think the "always fail noisily" approach isn't the best approach.
>>> E.g. if I am globbing for *.py, and there's an undecodable .txt file
>>> in a directory, its presence shouldn't cause the glob to fail.
>>
>> But why should it make glob() fail?  This sounds like an implementation
>> detail of glob.
>
> Glob was just an example. Many use cases for directory traversal
> couldn't care less if they see *all* files.

I agree with Toshio.  The only use case I can think of for not seeing
all files is when selecting a subset, and if the thing that does the
selecting only generates a traceback if a file that falls into the
subset is undecodable, then I don't see a problem.  That is, if I'm
selecting a subset of the files in a directory, and one of that subset
is undecodable, I _want_ a traceback, because I'll be wanting _all_
of the files that match my selection criteria.(*)

So I'm curious to hear your use cases where undecodable files are
"don't care".

(*) More specifically, I want the program of a developer who didn't think
about the fact that users might have files with undecodable filenames
in their directory to generate a traceback rather than silently losing
those files.  (This is spoken to both by the principle of least
surprise and the zen rule that errors should never pass silently :)

--RDM

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan

Toshio Kuratomi wrote:
> Nick Coghlan wrote:
>> Toshio Kuratomi wrote:
>>> Guido van Rossum wrote:
>>>> Glob was just an example. Many use cases for directory traversal
>>>> couldn't care less if they see *all* files.
>>>>
>>> Okay.  Makes it harder to prove correct or not if I don't know what the
>>> use case is :-)  I can't think of a single use case off-hand.
>>>
>>> Even your example of a ??.txt file making retrieval of *.py files fail
>>> is a little broken.  If there was a ??.py file that was undecodable the
>>> program would most likely want to know that file existed.
>> Why? Most programs won't be able to do anything with it. And if the
>> program *can* do something with it... that's what the bytes version of
>> the APIs are for.
>>
> Nonsense.  A program can do tons of things with a non-decodable
> filename.  Where it's limited is non-decodable filedata.

You can't display a non-decodable filename to the user, hence the user
will have no idea what they're working on. Non-filesystem related apps
have no business trying to deal with insane filenames.

Linux is moving towards a standard of UTF-8 for filenames, and once we
get to the point where the idea of encoding filenames and environment
variables any other way is seen as crazy, then the Python 3 approach
will work seamlessly.

In the meantime, raw bytes APIs will provide an alternative for those
that disagree with that philosophy.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Thomas Wouters

On Fri, Dec 5, 2008 at 19:10, Guido van Rossum <[EMAIL PROTECTED]> wrote:

> On Thu, Dec 4, 2008 at 11:27 PM,  <[EMAIL PROTECTED]> wrote:
> > With all due respect, for me, "library support" and "serious use" are
> > synonymous.
>
> Glyph, I cannot have a discussion with you if every single post of
> yours is longer than my combined daily output. Please spend some time
> writing shorter posts. I'm sure I'm not the only one here with a short
> attention span. :-)

Allow me to paraphrase glyph (with whom I'm in complete agreement, for what
it's worth): many newbies will be disappointed by Python if they start with
Python 3.0 and discover that most of the cool possibilities they had heard
about are 'being worked on' and not quite ready. I don't doubt that 3.0 will
be easier for the new programmer to learn, but I do not believe the average
"Oh, I heard about Python, let's learn it" person should be pointed to 3.0
right now. They should be encouraged to learn 2.6 -- or even 2.5.

In spite of Python being a programming language, there is a difference
between 'casual user of the language' and 'library developer'; 3.0 is
certainly a must for all actual library developers, and I'm sure most of
them know about 3.0 by now. We're talking about first impressions for people
without that knowledge.

-- 
Thomas Wouters <[EMAIL PROTECTED]>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Michael Urman

On Fri, Dec 5, 2008 at 18:48, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> Toshio Kuratomi wrote:
>> Nick Coghlan wrote:
>>> Toshio Kuratomi wrote:
>>>> Guido van Rossum wrote:
>>>>> Glob was just an example. Many use cases for directory traversal
>>>>> couldn't care less if they see *all* files.
>>>>>
>>>> Okay.  Makes it harder to prove correct or not if I don't know what the
>>>> use case is :-)  I can't think of a single use case off-hand.
>>>>
>>>> Even your example of a ??.txt file making retrieval of *.py files fail
>>>> is a little broken.  If there was a ??.py file that was undecodable the
>>>> program would most likely want to know that file existed.
>>> Why? Most programs won't be able to do anything with it. And if the
>>> program *can* do something with it... that's what the bytes version of
>>> the APIs are for.
>>>
>> Nonsense.  A program can do tons of things with a non-decodable
>> filename.  Where it's limited is non-decodable filedata.
>
> You can't display a non-decodable filename to the user, hence the user
> will have no idea what they're working on. Non-filesystem related apps
> have no business trying to deal with insane filenames.

And what of python's batteries---does a library that takes filenames
or directories from a controlling program and processes the contents
of the file need to care whether the file can be encoded properly? Is
said library filesystem related or not?

Won't it be awful when it's the directory name, and processing the
file works if you change into its directory, but not if you're outside
of it? And if there's an error during processing and the library
reports a full filename using os.path.abspath("file.ext"), but cannot get
the results?

> Linux is moving towards a standard of UTF-8 for filenames, and once we
> get to the point where the idea of encoding filenames and environment
> variables any other way is seen as crazy, then the Python 3 approach
> will work seamlessly.
>
> In the meantime, raw bytes APIs will provide an alternative for those
> that disagree with that philosophy.

And until that time, it's agony for the library writers who didn't
think they needed to care, but find that their users (other
developers) do.
-- 
Michael Urman

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Steven D'Aprano

On Sat, 6 Dec 2008 09:18:47 am Nick Coghlan wrote:
> Toshio Kuratomi wrote:
> > Guido van Rossum wrote:
> >> Glob was just an example. Many use cases for directory traversal
> >> couldn't care less if they see *all* files.
> >
> > Okay.  Makes it harder to prove correct or not if I don't know what
> > the use case is :-)  I can't think of a single use case off-hand.
> >
> > Even your example of a ??.txt file making retrieval of *.py files
> > fail is a little broken.  If there was a ??.py file that was
> > undecodable the program would most likely want to know that file
> > existed.
>
> Why? Most programs won't be able to do anything with it.

But the program can report a sensible error message, so the user can fix 
the problem.

I'd rather have the Python API report errors than silence them, at least 
by default. I don't suppose it's on the table for functions to grow an 
extra argument that tells them to skip broken file names and 
environment variables? 

What I have in mind is something like:

os.listdir(path, silence_errors=False) -> list_of_strings

By default, if a filename in path is not a valid string, an exception is 
raised, with the guilty file name given in bytes as an attribute of the 
exception. If silence_errors is true, the invalid file names are 
silently skipped.
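
To make the idea concrete, here is a rough sketch of those semantics
applied to a list of raw names. Nothing here is an existing API:
silence_errors is not a real os.listdir() parameter, and decode_names and
the raw_name attribute are hypothetical names chosen for illustration.

```python
def decode_names(raw_names, silence_errors=False, encoding="utf-8"):
    """Decode raw (bytes) filenames, raising on the first bad one by default.

    With silence_errors=True, undecodable names are skipped instead.
    """
    names = []
    for raw in raw_names:
        try:
            names.append(raw.decode(encoding))
        except UnicodeDecodeError as exc:
            if silence_errors:
                continue
            # Hypothetical attribute: expose the guilty name as bytes.
            exc.raw_name = raw
            raise
    return names


raw = [b"a.py", b"\xff.py", b"b.py"]
quiet = decode_names(raw, silence_errors=True)
```

A real implementation would take a path and read the raw names itself,
for example via os.listdir() with a bytes argument.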

-- 
Steven

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan

Toshio Kuratomi wrote:
> Nick Coghlan wrote:
>> Toshio Kuratomi wrote:
>>> Are most programs specific to one organization or are they distributed
>>> to other people?
>> The former. That's pretty well documented in assorted IT literature
>> ('shrink-wrap' and open source commodity software are still relatively
>> new players on the scene that started to shift the balance the other
>> way, but now the server side elements of web services are shifting it
>> back again).
>>
> Cool.  So it's only people writing code to be shared with the larger
> community or written for multiple customers that are affected by bugs
> like this. :-/

True, but it's still a fairly important problem to have a solution to.
Even internally in large organisations there can be some pretty insane
environments as cruft accumulates over the years.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Martin v. Löwis

> There was already "Programming Language :: Python", provided by many
> packages.  I think version compatibility relationships meant by each of
> these classifiers should be made explicit, wherever it is that
> documentation for classifiers is provided.
> 
> I don't recall having seen any such documentation; hopefully I just need
> to be hit by another clue.

There is no documentation for classifiers whatsoever. I don't think
nuances matter much, anyway.

Regards,
Martin

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Martin v. Löwis

>> 5) represent all environment variables in Unicode strings,
>>including the ones that currently fail to decode.
>>(then do the same to file names, then drop the byte-oriented
>> file operations again)
> 
> Please, don't do that! Bytes are not characters!

And environment variables, command line arguments, and file names
are not bytes, but characters.

Regards,
Martin

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread James Y Knight


On Dec 5, 2008, at 7:48 PM, Nick Coghlan wrote:

> You can't display a non-decodable filename to the user, hence the user
> will have no idea what they're working on. Non-filesystem related apps
> have no business trying to deal with insane filenames.


Sigh, same arguments, all over again.

Again, *both* KDE and Gnome apps display non-decodable filenames to  
the user, and let the user work with the files. They display as good a  
rendition as they can, using a replacement character as appropriate.  
In some earlier versions, KDE did not work at all on poorly-encoded
files, and users submitted bug reports. People do care, it does
happen in real life, and it is a bug in your software if you cannot  
deal with the users' files. They just want the software to work. If it  
shows something weird in the window titlebar, that's a bit irritating  
but at least it doesn't get in the way of working.



> Linux is moving towards a standard of UTF-8 for filenames, and once we
> get to the point where the idea of encoding filenames and environment
> variables any other way is seen as crazy, then the Python 3 approach
> will work seamlessly.


I seriously doubt that Linux would ever enforce utf-8 for filenames, env
vars, or command arguments. Oddly encoded strings will always be with us
in some form or another.


Now, perhaps you use crontab? At least on the systems I have, programs  
run by cron don't have any locale environment variables set, and so  
default to the "C" locale. So utf-8 encoded filenames/etc will fail,  
by default, for any python3 program run under cron.
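A minimal sketch of the failure mode described above, assuming a UTF-8 encoded filename and a process whose locale encoding is effectively plain ASCII (as under the "C" locale). The filename and the helper decode_filename are made up for illustration, and the decoding is done by hand here rather than by the interpreter:

```python
# Hypothetical filename, as the raw bytes the kernel would hand back.
raw_name = "caf\u00e9.txt".encode("utf-8")   # b'caf\xc3\xa9.txt'


def decode_filename(raw, encoding):
    """Strictly decode a raw filename; return None if it cannot be decoded."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# With a UTF-8 locale the name decodes cleanly.
utf8_result = decode_filename(raw_name, "utf-8")

# With an ASCII-only locale the same bytes are undecodable, even though
# the file itself is perfectly valid.
ascii_result = decode_filename(raw_name, "ascii")
```

The point of the sketch is only that the same bytes succeed or fail depending on the environment the program was started in, not on the file.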


I'd like to make an analogy: what if Python3 couldn't deal with  
filenames with spaces in them on unix? Most filenames don't have  
spaces in them, so it should be okay, right? And those people who  
really need to deal with space-containing filenames can use this other  
API variant, instead of the recommended and most obvious one. That'd  
be okay, right? No, of course it wouldn't be okay!


James

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Guido van Rossum

On Fri, Dec 5, 2008 at 4:49 PM, Thomas Wouters <[EMAIL PROTECTED]> wrote:
> On Fri, Dec 5, 2008 at 19:10, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>>
>> On Thu, Dec 4, 2008 at 11:27 PM,  <[EMAIL PROTECTED]> wrote:
>> > With all due respect, for me, "library support" and "serious use" are
>> > synonymous.
>>
>> Glyph, I cannot have a discussion with you if every single post of
>> yours is longer than my combined daily output. Please spend some time
>> writing shorter posts. I'm sure I'm not the only one here with a short
>> attention span. :-)
>
> Allow me to paraphrase glyph (with whom I'm in complete agreement, for what
> it's worth): many newbies will be disappointed by Python if they start with
> Python 3.0 and discover that most of the cool possibilities they had heard
> about are 'being worked on' and not quite ready. I don't doubt that 3.0 will
> be easier for the new programmer to learn, but I do not believe the average
> "Oh, I heard about Python, let's learn it" person should be pointed to 3.0
> right now. They should be encouraged to learn 2.6 -- or even 2.5.

Thanks for the summary! Maybe Glyph should just pipe his email through you. :-)

Without more context it's impossible to make a good recommendation.
Most people probably want to learn Python because they want to access
some system for which Python is required -- whether that's Blender,
Google App Engine, their Nokia cell phone, or something that some of
their colleagues have written (most Googlers learning Python fall in
that category :-). In that case they don't have a choice -- they
should learn the version that is used by the system they want to use.
Obviously that's going to be 2.x in most cases, at least for a while.

But I disagree that "most of the cool possibilities they have heard
about" are necessarily third party libraries. Python's standard
library has lots of stuff to offer.

> In spite of Python being a programming language, there is a difference
> between 'casual user of the language' and 'library developer'; 3.0 is
> certainly a must for all actual library developers, and I'm sure most of
> them know about 3.0 by now. We're talking about first impressions for people
> without that knowledge.

Well if most library developers already know 3.0 by now, I would hope
they aren't going to sit on their hands, and solve the issues at hand!
In the meantime, I don't mind if people learn 3.0 first and 2.6
second. It's probably easier that way than the other way around. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Michael Urman

On Fri, Dec 5, 2008 at 19:22, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
>> Please, don't do that! Bytes are not characters!
>
> And environment variables, command line arguments, and file names
> are not bytes, but characters.

On Windows NT, sure. On Unix they're still bytes no matter how much we
want them to be characters.

This difference, and secondarily the way python 3 tries to sweep it
under the rug, seem to be the roots of the problem.

-- 
Michael Urman

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Steven D'Aprano

On Sat, 6 Dec 2008 12:47:45 pm Guido van Rossum wrote:
> But I disagree that "most of the cool possibilities they have heard
> about" are necessarily third party libraries. Python's standard
> library has lots of stuff to offer.

+1 on that. I've been using Python for a decade now, and the first third
party library I've downloaded and used was Pyparsing a month or two
ago. I'll be the first to admit that my programs tend to be on the
small side, but they're useful to me. The lack of third party libraries
for Python 3 is not necessarily a show-stopper.


-- 
Steven

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Martin v. Löwis

>> And environment variables, command line arguments, and file names
>> are not bytes, but characters.
> 
> On Windows NT, sure. On Unix they're still bytes no matter how much we
> want them to be characters.

Only in the API of the OS itself. Treating them as bytes in the
application is a mistake. The bytes are intended to represent
characters, so Python should treat them as what they are.

Regards,
Martin

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Steven D'Aprano

On Sat, 6 Dec 2008 11:48:27 am Nick Coghlan wrote:
> Toshio Kuratomi wrote:
> > Nick Coghlan wrote:
...
> >> Why? Most programs won't be able to do anything with it. And if
> >> the program *can* do something with it... that's what the bytes
> >> version of the APIs are for.
> >
> > Nonsense.  A program can do tons of things with a non-decodable
> > filename.  Where it's limited is non-decodable filedata.
>
> You can't display a non-decodable filename to the user, hence the
> user will have no idea what they're working on. Non-filesystem
> related apps have no business trying to deal with insane filenames.

I don't agree. Putting my user's hat on, I know what I would expect: the 
app should display *some* name, it doesn't matter exactly what, so long 
as:

* it's as close as possible to the "real" name; 

* it is unique in that directory (doesn't shadow another file); and

* it's enough to identify the file so I can read/save/delete/rename the 
file.
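The three requirements listed above can be sketched roughly as follows. This is only an illustration, assuming filenames arrive as bytes and that U+FFFD is an acceptable stand-in for undecodable bytes; display_names and the sample names are hypothetical:

```python
def display_names(raw_names, encoding="utf-8"):
    """Map each raw (bytes) filename to a unique, printable display name."""
    result = {}
    used = set()
    for raw in raw_names:
        # Undecodable bytes become U+FFFD, so the rest of the name stays
        # as close as possible to the real one.
        name = raw.decode(encoding, errors="replace")
        candidate = name
        counter = 1
        # Two different raw names may collapse to the same text; add a
        # numeric suffix so no entry shadows another.
        while candidate in used:
            counter += 1
            candidate = "%s (%d)" % (name, counter)
        used.add(candidate)
        # Keep the original bytes so the file can still be opened,
        # renamed or deleted.
        result[candidate] = raw
    return result


listing = display_names([b"report.txt", b"caf\xe9.txt", b"caf\xe8.txt"])
```

The display name is used only for showing the entry to the user; every actual file operation would go through the bytes stored alongside it.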

I think there are analogous situations: long-time Windows users will be 
used to seeing files listed as "longfilename.txt" in some applications 
and "longfi~1.txt" in another. Under POSIX, file names can contain 
unprintable ctrl characters, and the shell will print them at least 
three ways, depending on context. E.g. for a file containing a 
formfeed, I get one of ? \f or ^L in bash.

Applications can deal with such weird file names. KDE's file manager 
(konqueror) and file selection dialog both show the character as a 
small square, presumably the font's missing character glyph, and KDE 
apps can open and save the file. Still speaking as a user, I think it 
is quite reasonable to expect applications to deal with undisplayable 
filenames: displaying the name and opening the file are orthogonal 
concepts, although I accept that command-line interfaces will have 
difficulty with file names that can't be typed by the user!

I appreciate that broken unicode is more difficult to deal with than 
unprintable control characters, but the basic principle is the same.

-- 
Steven

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread Bill Janssen

Thomas Wouters <[EMAIL PROTECTED]> wrote:

> Allow me to paraphrase glyph (with whom I'm in complete agreement, for what
> it's worth): many newbies will be disappointed by Python if they start with
> Python 3.0 and discover that most of the cool possibilities they had heard
> about are 'being worked on' and not quite ready. I don't doubt that 3.0 will
> be easier for the new programmer to learn, but I do not believe the average
> "Oh, I heard about Python, let's learn it" person should be pointed to 3.0
> right now. They should be encouraged to learn 2.6 -- or even 2.5.

I think that's right.

I was asked this question today, and it comes up (to me) fairly often at
PARC.  I usually suggest using the Python version that's standard for
the user's platform, if they use OS X or Linux (and most do), which is
typically 2.5 (for OS X Leopard), and 2.4 (for Linux -- may be out of date).
For Windows users, I suggest the latest release (2.6).

Bill

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Tres Seaver

Ulrich Eckhardt wrote:
> On Friday 05 December 2008, Guido van Rossum wrote:
>> At the risk of bringing up something that was already rejected, let me
>> propose something that follows the path taken in 3.0 for filenames,
>> rather than doubling back:
>>
>> For os.environ, os.getenv() and os.putenv(), I think a similar
>> approach as used for os.listdir() and os.getcwd() makes sense: let
>> os.environ skip variables whose name or value is undecodable, and have
>> a separate os.environb() which contains bytes; let os.getenv() and
>> os.putenv() do the right thing when the arguments passed in are bytes.
>>
>> For sys.argv, because it's positional, you can't skip undecodable
>> values, so I propose to use error=replace for the decoding; again, we
>> can add sys.argvb that contains the raw bytes values. The various
>> os.exec*() and os.spawn*() calls (as well as os.system(), os.popen()
>> and the subprocess module) should all accept bytes as well as strings.
>>
>> On Windows, the bytes APIs should probably not exist.
>>
>> I predict that most developers can get away with not using the bytes
>> APIs at all. The small minority that needs to be robust if not all
>> filenames use the system encoding can use the bytes APIs.
> 
> I know some of those developers, you can contact them via 
> [EMAIL PROTECTED] Seriously, what would you suggest to someone that 
> wants to handle paths in a portable way? Using the Unicode variants of 
> functions is fubar, because encoding/decoding is not universally possible. 
> Using the byte variant is equally fubar, because e.g. on MS Windows it is not 
> supported, except through a very lossy roundtrip through the locale's 
> codepage, limiting your functionality.
> 
> I actually think it is about time to give up on trying to think about a path 
> as a string. Ditto for data received from os.environ or sys.argv. There are
> only very few things that are universal to them and a reliable encoding is 
> none of them. Then, once you have let that idea go, meditate a bit over the 
> Zen.
> 
> What I propose is that paths must be treated as OS-specific, with the only 
> common reliable operations being joining them, concatenating them and 
> splitting them into segments divided by the (again, OS-specific) separator. 
> Other operations, like e.g. appending a string or converting it to a string 
> in order to display it can fail. And if they fail, they should fail noisily. 
> In 99% of all cases, using the default encoding will work and do what people 
> expect, which is why I would make this conversion automatic. In all other 
> cases, it will at least not fail silently (which would lead to garbage and 
> data loss) and allow more sophisticated applications to handle it.

Amen!  The idea that paths, environment variables, and stuff pulled off
of sockets can be treated as text rather than byte strings is just wishful
thinking.
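The path type proposed in the quoted message could look something like the sketch below. It is an assumption-laden illustration, not an existing API: RawPath, its methods, and the default separator are all hypothetical, and only the operations named in the proposal (joining, splitting into segments, and a conversion to text that fails noisily) are shown.

```python
class RawPath:
    """Hypothetical path type: bytes inside, text only on request."""

    def __init__(self, raw, sep=b"/"):
        self.raw = raw
        self.sep = sep

    def join(self, other):
        # Joining works on the raw bytes, so it can never fail to decode.
        joined = self.raw.rstrip(self.sep) + self.sep + other.raw
        return RawPath(joined, self.sep)

    def segments(self):
        # Splitting on the OS-specific separator is likewise encoding-free.
        return [part for part in self.raw.split(self.sep) if part]

    def to_text(self, encoding="utf-8"):
        # Conversion for display may fail; strict decoding makes it fail
        # noisily instead of silently producing garbage.
        return self.raw.decode(encoding)


good = RawPath(b"/home/user").join(RawPath(b"notes.txt"))
bad = RawPath(b"/home/user").join(RawPath(b"caf\xe9.txt"))
```

Here good.to_text() succeeds, while bad.to_text() raises UnicodeDecodeError; both paths can still be joined and split without any decoding.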


Tres.
-- 
Tres Seaver  +1 540-429-0999  [EMAIL PROTECTED]
Palladion Software   "Excellence by Design"   http://palladion.com


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread rdmurray


On Sat, 6 Dec 2008 at 13:06, Steven D'Aprano wrote:

> Applications can deal with such weird file names. KDE's file manager
> (konqueror) and file selection dialog both show the character as a
> small square, presumably the font's missing character glyph, and KDE
> apps can open and save the file. Still speaking as a user, I think it
> is quite reasonable to expect applications to deal with undisplayable
> filenames: displaying the name and opening the file are orthogonal


Agreed.  I would file a bug report if an application couldn't
handle a file that validly exists in my file system, no matter
how broken the filename might appear to be.


> concepts, although I accept that command-line interfaces will have
> difficulty with file names that can't be typed by the user!


Difficult, but not impossible: tab completion in the shell can allow
the user to submit otherwise difficult to type filenames to a program.
Which means python should be able to handle such things in argument
strings, so that my python utilities can manipulate such files when
specified as command line arguments, and a sensible error should be
generated by default if the program hasn't been written in such a way
that it can handle such input.
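As a rough sketch of the "sensible error by default" behaviour described above, assuming the arguments are available as raw bytes; decode_args is a hypothetical helper, not something Python provides:

```python
def decode_args(raw_args, encoding="utf-8"):
    """Decode raw command-line arguments, failing with a clear message."""
    decoded = []
    for position, raw in enumerate(raw_args):
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError as exc:
            # Name the offending argument instead of guessing at its text.
            raise ValueError(
                "argument %d (%r) is not valid %s" % (position, raw, encoding)
            ) from exc
    return decoded
```

A program written to cope with such input would skip this helper and work with the raw bytes directly; one that was not would at least stop with a message pointing at the argument that caused the problem.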

It would be wonderful if all Unix variants would switch to all UTF-8 (I
have done so on my own machines...I think :).  But it is a slow process.

--RDM

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread glyph



On 5 Dec, 06:10 pm, [EMAIL PROTECTED] wrote:
> On Thu, Dec 4, 2008 at 11:27 PM,  <[EMAIL PROTECTED]> wrote:
>> With all due respect, for me, "library support" and "serious use" are
>> synonymous.
>
> Glyph, I cannot have a discussion with you if every single post of
> yours is longer than my combined daily output. Please spend some time
> writing shorter posts. I'm sure I'm not the only one here with a short
> attention span. :-)


I already spend a lot of time trying to remove extraneous details.  The 
drafts of these messages are usually 3x as long :).  So, trying to keep 
it short:


Thomas paraphrased my point pretty well.  The importance of libraries 
cannot be overemphasized.  Maybe you're right and the stdlib is enough 
for a large audience, but I don't know that audience.  Everyone I know 
who uses Python, uses it because of a library.  In some cases, an 
equivalent library exists for another language, and Python wins because 
it has a nicer syntax.  But, in no case does Python win where it 
*doesn't* have the library.


I think that the marketing for py3 needs to target library vendors 
before targeting novices.  If the novices are targeted first, they are 
going to have a bad experience when "python" libraries don't work with 
py3, and library maintainers are going to have a bad experience when 
clueless newbies harass them to update their software without 
understanding the magnitude of the work to do so.


I've been predicting this for years, but two days into Python 3's 
release, I've already seen real-world examples of this pattern in 
#twisted.  I can tell these people to "downgrade" to py2 when they come 
ask me for help, but I don't think most of them ask for help.  They just 
get angry and learn Java instead.


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Stephen J. Turnbull

Nick Coghlan writes:

 > True, but it's still a fairly important problem to have a solution to.
 > Even internally in large organisations there can be some pretty insane
 > environments as cruft accumulates over the years.

M&A and globalization make it inevitable.

Toshio will remember the Mizuho April Fool's Day fiasco (a couple of
large banks merged, and when they reopened as a merged entity called
"Mizuho", the ATM system immediately crashed).

Japan being a country that doesn't believe in GAAP, such mergers are a
very difficult problem.  I don't know the details, but I wouldn't even
be surprised if encodings played a role in that mess because Japanese
companies often have their own internal variants of the national
standard JIS encoding.

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Stephen J. Turnbull

"Martin v. Löwis" writes:
 > >> 5) represent all environment variables in Unicode strings,
 > >>    including the ones that currently fail to decode.
 > >>    (then do the same to file names, then drop the byte-oriented
 > >>    file operations again)
 > > 
 > > Please, don't do that! Bytes are not characters!
 > 
 > And environment variables, command line arguments, and file names
 > are not bytes, but characters.

Unfortunately, both POSIX and OS implementation practice (including,
for example, VFAT file systems: NT-derived OSes are not safe!) say
otherwise, and that makes your line of argument extremely dangerous.

Remember, in a fight between human custom and machine programming, the
machine can always win by crashing.  For that reason, bytes must be
the underlying representation, always available, although I think it's
essential to make a text representation easily accessible, and even
the default.  Humans who would rather kvetch about the machine's
breakage than get a useful answer can (and should---problems will be
rare for most usage patterns) use the text representation.  Humans who
want reliability or debuggability, on the other hand, should have
something that cannot be mistaken for text immediately available.
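One way to picture "bytes underneath, text by default" for the environment is the sketch below. It assumes the raw environment is available as a bytes-to-bytes mapping and borrows the skip-what-cannot-be-decoded idea from earlier in the thread; text_view and the sample variables are hypothetical:

```python
def text_view(raw_env, encoding="utf-8"):
    """Return the decodable part of a bytes environment as str -> str."""
    view = {}
    for raw_key, raw_value in raw_env.items():
        try:
            view[raw_key.decode(encoding)] = raw_value.decode(encoding)
        except UnicodeDecodeError:
            # Left out of the text view; still present in raw_env.
            continue
    return view


# Hypothetical environment with one value that is not valid UTF-8.
raw_env = {b"HOME": b"/home/user", b"LEGACY": b"caf\xe9"}
env = text_view(raw_env)
```

Code that only needs convenience reads env; code that needs reliability or debuggability reads raw_env, which cannot be mistaken for text.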


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Stephen J. Turnbull

Guido van Rossum writes:

 > This sounds too pessimistic to me. I expect that in five years it will
 > be universally accepted that these variables must be encoded in a
 > standard encoding.

Archival material will not catch up until the plastic rots.  And I bet
it takes ten years before the Japanese accept the same standard
encoding as the rest of the world (the Japanese cellphone system and
iMode still speak Shift JIS).  Five years should be plenty of time,
but big Japanese companies are very sensitive to (and resistant to)
anything that might tend to open their turf to invaders.

 > People are never going to give up thinking about filenames etc. as
 > strings, because that's what they are conceptually.

People can't win this one 100%; they have to choose between
convenience with occasional fatal errors, or reliability.  Python
should not make it hard to achieve either.  The default should be
convenience, of course, but there should be a layer where "decodable
per standard" values and "not decoded" values are different types.
This is why Martin's proposal (or any other proposal to use strings
with invalid values) is nearly unacceptable, really.

What those who want reliability would have to do is to immediately
decode all strings from the system into something like what Toshio
proposes.  This would be a lot more reliable if done by Python rather
than an explicitly imported library, though, and would be available
for debugging of cases where the default "values are text"
representation falls down.

The same "text on the surface, bytes in the background" type could be
used by the email module (which already implements something like
this).

 > The problem is purely one of encoding,

No, it's not.  It's that strings (as understood by people) and system
"text" are different types (even on Mac: VFAT and NFS filesystem
filenames for example), and Python is not type-safe in this sense.

There ought to be a "you think this is text but I'm keeping an
accurate backup just in case" type for this.
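A type of that kind might be sketched as follows. This is purely illustrative: SystemText and its attributes are hypothetical names, and the choice of U+FFFD replacement for the readable form is an assumption, not something proposed in the thread.

```python
class SystemText:
    """Hypothetical value: readable text plus the exact original bytes."""

    def __init__(self, raw, encoding="utf-8"):
        self.raw = raw
        try:
            self.text = raw.decode(encoding)
            self.exact = True          # "decodable per standard"
        except UnicodeDecodeError:
            # Best-effort text for humans; the bytes remain authoritative.
            self.text = raw.decode(encoding, errors="replace")
            self.exact = False         # "not decoded"

    def __str__(self):
        return self.text

    def __bytes__(self):
        return self.raw


clean = SystemText(b"notes.txt")
broken = SystemText(b"caf\xe9.txt")
```

The exact flag is what separates the two cases at the type level: convenience code can call str() on either value, while careful code checks the flag and falls back to bytes() when the text is only an approximation.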

Re: [Python-Dev] RELEASED Python 3.0 final

2008-12-05 Thread glyph


On 01:47 am, [EMAIL PROTECTED] wrote:
>> In spite of Python being a programming language, there is a difference
>> between 'casual user of the language' and 'library developer'; 3.0 is
>> certainly a must for all actual library developers, and I'm sure most of
>> them know about 3.0 by now. We're talking about first impressions for people
>> without that knowledge.
>
> Well if most library developers already know 3.0 by now, I would hope
> they aren't going to sit on their hands, and solve the issues at hand!


The best thing for 3.0 adoption would be a 3.0 "welcoming committee".  A 
group of hackers wandering from one popular open source library to 
another, writing patches for 3.x compatibility issues.  There must be 
lots of people who care about 3.x adoption, and this is probably the 
most effective way they can reach that goal.


Each time I am going to fix a 3.0 compatibility issue, I have a choice: 
I can either make Twisted itself better (add features, fix bugs), or I 
can keep Twisted exactly the same but do lots of work so it will work on 
3.0.  It seems pretty clear to me that, to the extent that I have time 
for Twisted, fixing bugs in the HTTP implementation would be a better 
deal than puzzling through a megabyte of diffs generated by 2to3, trying 
to understand where it went wrong, and how.


This doesn't mean I'm "sitting on my hands".  It just means I have 
better things to be doing with my hands.  (To be precise, 1054 better 
things to do, re: Twisted.  Add in the Divmod projects and it's more 
like 3000.)


Of course the distant threat of an unmaintained 2.x series is enough to 
motivate me to push a *little* in this direction, but it doesn't make me 
happy about it.


I think this is exactly what the marketing effort around 3.0 needs to be 
doing: making a positive case for library and application authors to 
spend time to update to 3.x.  This is a lot of work, and many (I might 
even say most) of us need a lot of cajoling.  Free patches are a good 
incentive :).


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Bugbee, Larry


There has been some discussion here that users should use the str or
byte function variant based on what is relevant to their system, for
example when getting a list of file names or opening a file.  That
thought process really doesn't do much for those of us that write code
that needs to run on any platform type, without alteration or the
addition of complex if-statements and/or exceptions.

Whatever the resolution here, and those of you addressing this thorny
issue have my admiration, the solution should be such that it gives
consistent behavior regardless of platform type and doesn't require the
programmer to know of all the minute details of each possible target
platform.  

That may not be possible for a while, so interim solutions should be
such that it minimizes later pain.  If that means hiding "implementation
details" behind a new function, so be it.  Then, at least, the body of
one's app is not burdened with this problem later when conditions
change.

I'm glad I'm not the only one with hard problems.  ;-)

Larry

