* Adam Olsen wrote: 

> On Thu, Dec 4, 2008 at 1:02 PM, Toshio Kuratomi <[EMAIL PROTECTED]> wrote:
> > I opened up bug http://bugs.python.org/issue4006 a while ago and it was
> > suggested in the report that it's not a bug but a feature and so I
> > should come here to see about getting the feature changed :-)
> >
> > I have a specific problem with os.environ and a somewhat less important
> > architectural issue with the unicode/bytes handling in certain os.*
> > modules.  I'll start with the important one:
> >
> > Currently in Python 3 there's no way to get at environment variables
> > that are not encoded in the system default encoding.  My understanding
> > is that this isn't a problem on Windows systems but on *nix this is a
> > huge problem.  Environment variables on *nix are a sequence of non-null
> > bytes.  These bytes are almost always "characters" but they do not have
> > to be.  Further, there is nothing that requires that the characters be
> > in the same encoding; some of the characters could be in the UTF-8
> > character set while others are in latin-1, shift-jis, or big-5.
>
> Multiple encoding environments are best described as "batshit insane".
>  It's impossible to handle any of it correctly *as text*, which is why
> UTF-8 is becoming a universal standard.  For everybody's sanity python
> should continue to push it.

Here's an example that will become popular soon, I guess: CGI scripts and,
of course, WSGI applications. All of them get their environment in an unknown
encoding. In the worst case one can blow up the application simply by
sending strange header lines over the wire. But there's more: consider
running the server in the C locale; then probably even a single 8-bit char
might break something (?).
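
A minimal sketch of that failure mode, assuming a header value arrives as
Latin-1 bytes. The sample value and the helper name try_decode are made up
for illustration:

```python
# Hypothetical header value as it might arrive over the wire:
# the word "café" encoded as Latin-1, so the last byte is 0xE9.
raw_header = b"caf\xe9"

def try_decode(raw, encoding):
    """Return the decoded string, or None if the bytes are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

# 0xE9 opens a multi-byte UTF-8 sequence that never completes.
as_utf8 = try_decode(raw_header, "utf-8")
# ASCII (what a C locale typically implies) rejects any byte >= 0x80.
as_ascii = try_decode(raw_header, "ascii")
# Latin-1 maps every byte to a character, so it never fails,
# but nothing guarantees it is the right guess.
as_latin1 = try_decode(raw_header, "latin-1")
```

The point is only that the same bytes are undecodable under two plausible
defaults and silently "fine" under a third.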

> However, some pragmatism is also possible.  Many uses of PATH may
> allow it to be treated as black-box bytes, rather than text.  The
> minimal solution I see is to make os.getenv() and os.putenv() switch
> to byte modes when given byte arguments, as os.listdir() does.  This
> use case doesn't require the ability to iterate over all environment
> variables, as os.environb would allow.
>
> I do wonder if controlling the environment given to a subprocess
> requires os.environb, but it may be too obscure to really matter.

IMHO, environment variables are not text. They are bytes by definition and
should be treated as such.
I know, Windows has Unicode-enabled env vars on demand, but
they cause nothing but trouble over there in Apache's httpd (when passing
them to CGI scripts, oh well...).
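
For illustration only: later Python 3 releases (3.2 and up) expose a bytes
view of the environment on POSIX as os.environb and os.getenvb, guarded by
os.supports_bytes_environ. A small sketch, assuming such an interpreter; the
variable name DEMO_RAW and the function roundtrip_demo are hypothetical:

```python
import os

RAW_VALUE = b"caf\xe9"  # Latin-1 bytes, not valid UTF-8

def roundtrip_demo():
    """Store raw bytes in the environment and read them back two ways."""
    if not os.supports_bytes_environ:
        # Platforms with a native text environment (e.g. Windows) have no bytes view.
        return None
    os.environb[b"DEMO_RAW"] = RAW_VALUE
    try:
        # Bytes in, bytes out: no decoding is involved.
        as_bytes = os.getenvb(b"DEMO_RAW")
        # The text view decodes with the filesystem encoding and its error
        # handler; os.fsencode reverses that step to recover the bytes.
        as_text = os.environ["DEMO_RAW"]
        return as_bytes, os.fsencode(as_text)
    finally:
        # Clean up so the demo leaves the environment unchanged.
        del os.environb[b"DEMO_RAW"]

result = roundtrip_demo()
```

On a POSIX system both elements of result should equal RAW_VALUE, which is
what treating the value as black-box bytes amounts to.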

nd