[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-05-07 Thread Gregory P. Smith

Gregory P. Smith g...@krypto.org added the comment:

+.. function:: fsencode(value)
+
+   Encode *value* to bytes for use in the file system, environment variables or
+   the command line.  Use :func:`sys.getfilesystemencoding` and
+   ``'surrogateescape'`` error handler for str, and keep bytes unchanged.

I'd word the latter sentence as:

Uses :func:`sys.getfilesystemencoding` and ``'surrogateescape'`` error handler 
for strings and returns bytes unchanged.
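
A minimal Python sketch of the behaviour this wording describes, for illustration only. `fsencode_sketch` is a hypothetical name, not the function from the patch, and raising `TypeError` for other types is an assumption.

```python
import sys


def fsencode_sketch(value):
    """Encode str with the filesystem encoding; return bytes unchanged."""
    if isinstance(value, bytes):
        # Bytes are already in the form the operating system expects.
        return value
    if isinstance(value, str):
        # surrogateescape turns lone surrogates U+DC80..U+DCFF back into
        # the raw bytes 0x80..0xFF they stand for.
        return value.encode(sys.getfilesystemencoding(), "surrogateescape")
    # Assumption: anything else is rejected.
    raise TypeError("expected str or bytes, not %s" % type(value).__name__)
```

With this sketch, a name that was decoded with the same encoding and error handler encodes back to its original bytes, and a bytes argument passes through untouched.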


Otherwise I think this patch looks good.  +1

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8514
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-05-06 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@haypocalc.com:


Removed file: http://bugs.python.org/file17096/os_path_fs_encode_decode-3.patch




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-05-06 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@haypocalc.com:


Removed file: http://bugs.python.org/file17154/issue8514.patch




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-05-06 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

New short, simple and clean patch: add os.fsencode() for Unix only.

--

Don't create it for Windows, to encourage the use of unicode on Windows (and 
using MBCS is a bad idea). fsdecode() was also a bad idea: it's better to keep 
bytes unchanged on Unix, and that's now possible thanks to os.environb and 
os.getenvb().

--
Added file: http://bugs.python.org/file17241/fsencode.patch




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-05-04 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

I think that fsencode() (and fsdecode()) should be specific to POSIX. I don't 
know of any good reason to encode a nice, correctly encoded unicode string to 
the ugly MBCS encoding.

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-05-03 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

I agree with Martin regarding the os.environ changes. Victor, please
open a new ticket for this.

Martin: As you probably know, these issues are managed as
micro-mailing lists. Discussions on these lists often result in new
aspects which then drift off to new issues. That's normal business
and we are all well aware of this. Please stop yelling all about the
place and change your tone ! Thanks.

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-05-03 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

loewis I really, really, REALLY think that it is bad to mix issues.
loewis This makes patch review impossible.

I tried to, but it looks difficult :-) Anyway, I opened #8603.

 This specific issue is about introducing an fsdecode and fsencode 
 function; this is what the bug title says, and what the initial patch
 did.

I know, but the two topics (fs*code() and os.environb) are very close and 
related. My os.environb implementation uses fsencode()/fsdecode().

 FWIW, I'm +0 on adding these functions. MAL, please stop messing
 issue subjects. (...)

I think that we cannot decide correctly about fs*code() until we have decided 
about os.environb.

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-05-03 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

 I think that we cannot decide correctly about fs*code() until we decided for 
 os.environb.

Why is that? In msg104063, you claim that you want to create these
functions to deal with file names (not environment variables), in
msg104064, you claim that #8513 (which is about the program name in
subprocess) would benefit from these functions. Do these use cases
become invalid if os.environb becomes available?

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-05-03 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

STINNER Victor wrote:
 STINNER Victor victor.stin...@haypocalc.com added the comment:
 
 Why is that? In msg104063, you claim that you want to create these
 functions to deal with file names (not environment variables)
 
 Yes, but my os_path_fs_encode_decode-3.patch uses it in getenv() which 
 is maybe a bad idea: os.environb may avoid this.

IIUC, that usage is an equivalent transformation, i.e. the code doesn't
change its behavior. It is mere refactorization.

So *if* these functions are accepted, this change is a good idea
regardless of the os.environb introduction (unless I'm missing
something, and there is indeed a behavior change).

 in msg104064, you claim that #8513 (which is about the program name in
 subprocess) would benefit from these functions. Do these use cases
 become invalid if os.environb becomes available?
 
 #8513 is also related to environment variables: subprocess._execute_child() 
 calls os.get_exec_path() which search the PATH environment variable.
 It would be nice to support bytes environment variable in the env
 argument of Popen constructor (bytes key and/or value).

I still fail to see why this would make this issue block on the
os.environb introduction. Whether this gets introduced or not, the
program name issue remains, no?

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-05-03 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

 IIUC, that usage is an equivalent transformation, i.e. the code doesn't
 change its behavior. It is mere refactorization.

I changed os.getenv() to accept byte string key (in a previous commit), but I 
don't like this hack. If we have os.environb, os.getenv() shouldn't support 
bytes anymore (but use str only, as before).

--

I worked a little more on fsencode()/os.environb, trying to fix all issues. 
fsdecode() is no longer needed if we have os.environb, and fsencode() can be 
simplified to:

  def fsencode(value):
      return value.encode(sys.getfilesystemencoding(), 'surrogateescape')

fsdecode() leads to mojibake.

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-05-02 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

I really, really, REALLY think that it is bad to mix issues. This makes patch 
review impossible.

This specific issue is about introducing an fsdecode and fsencode function; 
this is what the bug title says, and what the initial patch did.

Whether or not byte-oriented access to environment variables is also needed is 
a *separate* issue. -1 on dealing with that in this report.

FWIW, I'm +0 on adding these functions. MAL, please stop messing with issue 
subjects. If you are fundamentally opposed to adding such functions, please 
request that a PEP be written or something. Otherwise, I accept the original 
patch.

I'm -1 on issue8514.patch; it is out-of-scope of the issue.

--
resolution:  -> accepted




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-05-01 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

In posixmodule.c, the following snippet doesn't make sense anymore:

    if (k == NULL) {
        PyErr_Clear();
        continue;
    }

If memory allocation of the bytes object fails, we should error out.
(same for if (v == NULL) a bit later)

--
nosy: +pitrou




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-30 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

STINNER Victor wrote:
 
 STINNER Victor victor.stin...@haypocalc.com added the comment:
 
 On Monday 26 April 2010 13:06:48, you wrote:
 I don't see what environment variables have to do with the file
 system.
 
 A POSIX system only offers *one* function about the encoding: 
 nl_langinfo(CODESET) and Python3 uses it for the filenames, environment 
 variables and the command line arguments.
 
 Are you suggesting that Python3 should support a different encoding for 
 environment variables and the file system? How would the user configure it?

It's better to let the application decide how to solve this problem
and in order to allow for this, the encodings must be adjustable.

By using fsencode() and fsdecode() in stdlib functions, you basically
prevent this kind of adjustment, since they hardcode the use of
a single encoding which is guessed by looking at nl_langinfo(CODESET).

Note that applications may well use completely different encodings
in the environment and for things like pipes than what the user
set up for her GUI environment.

In the end, this will only lead to the same kind of mess we've
had with sys.setdefaultencoding() in Python 2.x, only this
time with sys.setfilesystemencoding() and I'd like to avoid that.

 Since Python3 chose to store environment variables as unicode strings on 
 Windows and POSIX, in this specific case you should reconvert the value to 
 byte strings using fsencode() and then manipulate byte strings. Because 
 Python3 uses surrogateescape, you will get the original byte string values.

Well, yes, but that's a kludge, isn't it?

If you know that e.g. your environment variables are going to have
Latin-1 data (say some content-type variable has this information),
but the user's default LANG setting is UTF-8, Python will fetch the
data as broken Unicode data, you then have to convert it back to bytes
and then back to Unicode using the correct Latin-1 encoding.

It would be a lot better to have the application provide the
encoding to the os.getenv() function and have Python do the
correct decoding right from the start.
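
The detour described above can be sketched as follows. This is an illustration, not code from any patch in this issue: it hard-codes UTF-8 in place of the locale encoding, and the Latin-1 value is made up.

```python
# Assumption: the locale encoding is UTF-8, but the variable holds Latin-1 data.
raw = b"caf\xe9"  # "cafe" with an acute accent, encoded as Latin-1

# Step 1: what Python hands back when it decodes with the locale encoding.
# The undecodable byte 0xE9 becomes the lone surrogate U+DCE9.
fetched = raw.decode("utf-8", "surrogateescape")

# Step 2: convert back to the original bytes.
back_to_bytes = fetched.encode("utf-8", "surrogateescape")

# Step 3: decode again, this time with the encoding the data really uses.
recovered = back_to_bytes.decode("latin-1")
```

An `encoding` argument on the lookup itself would collapse these three steps into one decode of the raw bytes.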

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-30 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

On Friday 30 April 2010 15:58:28, you wrote:
 It's better to let the application decide how to solve this problem
 and in order to allow for this, the encodings must be adjustable.

On POSIX, use byte strings to avoid encoding issues. Examples:

   subprocess.call(['env'], env={b'TEST': b'a\xff-'}) # env
   subprocess.call(['echo', b'a\xff-']) # command line
   open('a\xff-') # filename
   os.getenv(b'a\xff-') # get env (result as unicode)
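
A runnable variant of the environment example, as a sketch only. The helper name `child_sees_bytes_env` is hypothetical, and the code assumes a POSIX system and a Python version that provides `subprocess.run` and `os.environb`.

```python
import os
import subprocess
import sys


def child_sees_bytes_env():
    """Return the raw value a child process sees, or None off POSIX."""
    if os.name != "posix":
        return None
    # Start from the current environment as bytes and add one
    # variable whose value is not valid UTF-8.
    env = dict(os.environb)
    env[b"TEST"] = b"a\xff-"
    # The child writes the raw bytes of TEST to its standard output.
    child_code = "import os, sys; sys.stdout.buffer.write(os.environb[b'TEST'])"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

Because the value never passes through a str on the way to the child, no encoding choice is involved in the parent.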

Are you talking about issues on Windows?

 By using fsencode() and fsdecode() in stdlib functions, you basically
 prevent this kind of adjustment, ...

Not if you use byte strings. On POSIX, a unicode string is always converted 
at the end for the system call (using sys.getfilesystemencoding()).

 If you know that e.g. your environment variables are going to have
 Latin-1 data (say some content-type variable has this information),
 but the user's default LANG setting is UTF-8, Python will fetch the
 data as broken Unicode data, you then have to convert it back to bytes
 and then back to Unicode using the correct Latin-1 encoding.
 
 It would be a lot better to have the application provide the
 encoding to the os.getenv() function and have Python do the
 correct decoding right from the start.

You mean that os.getenv() should have an optional argument? Something like:

  def getenv(key, default=None, encoding=None):
      value = environ.get(key, default)
      if encoding:
          value = value.encode(sys.getfilesystemencoding(), 'surrogateescape')
          value = value.decode(encoding, 'surrogateescape')
      return value

There are many indirect calls to os.getenv() (eg. by using os.environ.get()):
 - curses uses TERM
 - webbrowser uses PROGRAMFILES (path)
 - distutils.msvc9compiler uses VS%0.f0COMNTOOLS % version (path)
 - wsgiref.util uses HTTP_HOST, SERVER_NAME,  SCRIPT_NAME, ... (url)
 - platform uses PROCESSOR_ARCHITEW6432
 - sysconfig uses PYTHONUSERBASE, APPDATA, ... (path)
 - idlelib.PyShell uses IDLESTARTUP and PYTHONSTARTUP (path)
 - ...

How would you specify the correct encoding in indirect calls?

If your application gets variables in *mixed* encoding, I think that your 
program should start by reencoding variables:

  for name, encoding in (('PATH', 'latin1'), ...):
      value = os.getenv(name)
      value = value.encode(sys.getfilesystemencoding(), 'surrogateescape')
      value = value.decode(encoding, 'surrogateescape')
      os.setenv(name, value)

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-30 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

STINNER Victor wrote:
 
 STINNER Victor victor.stin...@haypocalc.com added the comment:
 
 On Friday 30 April 2010 15:58:28, you wrote:
 It's better to let the application decide how to solve this problem
 and in order to allow for this, the encodings must be adjustable.
 
 On POSIX, use byte strings to avoid encoding issues. Examples:
 
    subprocess.call(['env'], env={b'TEST': b'a\xff-'}) # env
    subprocess.call(['echo', b'a\xff-']) # command line
    open('a\xff-') # filename
    os.getenv(b'a\xff-') # get env (result as unicode)
 
 Are you talking about issues on Windows?

The issues normally occur on the way in, not the way out of Python,
so I don't see how using bytes would help.

 By using fsencode() and fsdecode() in stdlib functions, you basically
 prevent this kind of adjustment, ...
 
 Not if you use byte strings. On POSIX, a unicode string is always converted 
 at the end for the system call (using sys.getfilesystemencoding()).

Right and that's a problem since the file system encoding
doesn't need to have anything to do with what you have in
the environment.

 If you know that e.g. your environment variables are going to have
 Latin-1 data (say some content-type variable has this information),
 but the user's default LANG setting is UTF-8, Python will fetch the
 data as broken Unicode data, you then have to convert it back to bytes
 and then back to Unicode using the correct Latin-1 encoding.

 It would be a lot better to have the application provide the
 encoding to the os.getenv() function and have Python do the
 correct decoding right from the start.
 
 You mean that os.getenv() should have an optional argument? Something like:

Yes.

   def getenv(key, default=None, encoding=None):
       value = environ.get(key, default)
       if encoding:
           value = value.encode(sys.getfilesystemencoding(), 'surrogateescape')
           value = value.decode(encoding, 'surrogateescape')
       return value

No, you store the environment data as bytes and only
decode in getenv() based on the given encoding or using
the file system encoding or default encoding (UTF-8)
as default.

It would probably also be worthwhile to add the encoding
parameter to os.environ.get().
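
One way to picture this proposal is the sketch below. It is not the API that was adopted: `getenv_decoded` and its `environb` parameter are hypothetical, and the default falls back to `os.environb`, which only exists on POSIX.

```python
import sys


def getenv_decoded(key, default=None, encoding=None, environb=None):
    """Look up key in a bytes environment and decode only on the way out."""
    if environb is None:
        import os
        environb = os.environb  # assumption: POSIX; absent on Windows
    fs_encoding = sys.getfilesystemencoding()
    raw = environb.get(key.encode(fs_encoding, "surrogateescape"))
    if raw is None:
        return default
    # Decode with the caller's encoding, else the file system encoding.
    return raw.decode(encoding or fs_encoding, "surrogateescape")


# Usage with a stand-in environment holding one Latin-1 value.
fake_env = {b"NAME": b"caf\xe9"}
name = getenv_decoded("NAME", encoding="latin-1", environb=fake_env)
missing = getenv_decoded("MISSING", default="n/a", environb=fake_env)
```

Since the stored data stays as bytes, each caller can pick its own encoding without a decode, encode, decode detour.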

 There are many indirect calls to os.getenv() (eg. by using os.environ.get()):
  - curses uses TERM
  - webbrowser uses PROGRAMFILES (path)
  - distutils.msvc9compiler uses VS%0.f0COMNTOOLS % version (path)
  - wsgiref.util uses HTTP_HOST, SERVER_NAME,  SCRIPT_NAME, ... (url)
  - platform uses PROCESSOR_ARCHITEW6432
  - sysconfig uses PYTHONUSERBASE, APPDATA, ... (path)
  - idlelib.PyShell uses IDLESTARTUP and PYTHONSTARTUP (path)
  - ...
 
 How would you specify the correct encoding in indirect calls?

In all of the above cases, the application (in this case the
various modules) knows which encoding to expect and can
add the right encoding parameter to the os.getenv() call.

E.g. the cgi module can use the content-type passed in as
environment parameter to determine the encoding, most other
modules will just use ASCII or the file system encoding
if they are dealing with paths or file names.

 If your application gets variables in *mixed* encoding, I think that your 
 program should start by reencoding variables:
 
   for name, encoding in (('PATH', 'latin1'), ...):
       value = os.getenv(name)
       value = value.encode(sys.getfilesystemencoding(), 'surrogateescape')
       value = value.decode(encoding, 'surrogateescape')
       os.setenv(name, value)

Which is a kludge as I mentioned in my previous comment:

value = os.getenv(name, encoding=encoding)
my_environ[name] = value

reads much better.

Also note that os.setenv() won't work since that'll use the
file system encoding for encoding the value back into the C
process environment array. You'd end up with mojibake in
your C environment array.
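
The write-back problem can be shown with plain encode and decode calls. This is only a sketch: UTF-8 stands in for the file system encoding, and the value is invented.

```python
# Assumption: the file system encoding is UTF-8; the data is Latin-1.
original = b"caf\xe9"              # bytes found in the C environment
text = original.decode("latin-1")  # correctly decoded by the application

# Writing the str back encodes it with the file system encoding,
# so the C environment receives different bytes than it started with.
written_back = text.encode("utf-8", "surrogateescape")
changed = written_back != original
```

Any other process that still expects Latin-1 in that variable would now read the two UTF-8 bytes as two separate characters.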

The point I want to make is that adding fsencode() and
fsdecode() will help refactor the code a bit, but it
shouldn't be used as an excuse for not making the encoding
explicit.

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-30 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

 No, you store the environment data as bytes and only
 decode in getenv() ...

Yes, this is the best solution for POSIX. We maybe also need an 
os.getenvb() -> bytes function, maybe only on POSIX.

But I think that Windows should continue to use unicode environment variables. 
Should os.getenv(key, encoding=...) reencode the value on Windows?

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-30 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

STINNER Victor wrote:
 
 STINNER Victor victor.stin...@haypocalc.com added the comment:
 
 No, you store the environment data as bytes and only
 decode in getenv() ...
 
 Yes, this is the best solution for POSIX. We maybe also need an 
 os.getenvb() -> bytes function, maybe only on POSIX.

Yes, plus a os.setenvb() function to pass the data back to the C level
array.

 But I think that Windows should continue to use unicode environment 
 variables. Should os.getenv(key, encoding=...) reencode the value on Windows?

Good idea. That would make applications more easily portable between
Windows and POSIX.

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-30 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Ok, here is a first version of my patch to implement os.environb:
 - os.environb is the bytes version of os.environ, both are synchronized
 - os.environ(b).data stores bytes keys and values on POSIX (but unicode on 
Windows)
 - create os.getenvb() -> bytes
 - os.environb and os.getenvb() are not available on Windows nor OS/2
 - os.environ(b) and os.getenv(b)() accept both byte and unicode keys: that's 
maybe a stupid idea, I don't know yet :-)
 - fix #8513: subprocess: support bytes program name on POSIX
 - create os.fsencode() and os.fsdecode()

The patch is not done (the documentation should be updated), but it's a new 
step to help the discussion. I haven't tried it on Windows.

I already tried twice to write os.environb some months ago, but I failed (it was 
too complex for me). os.environ and os.environb now share the same data 
dictionary, and their methods convert inputs and outputs if necessary.
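
The synchronisation between the two mappings can be observed with a short sketch like the one below. It relies on `os.environb`, `os.supports_bytes_environ` and `os.fsencode` as they exist in later Python releases, which may differ from this draft patch. The helper and the variable name `ISSUE8514_DEMO` are made up.

```python
import os


def environb_sync_demo(name="ISSUE8514_DEMO"):
    """Return (bytes view, re-encoded str view), or None without os.environb."""
    if not os.supports_bytes_environ:
        return None
    bname = os.fsencode(name)
    try:
        # A str written through os.environ is visible as bytes.
        os.environ[name] = "value"
        seen_as_bytes = os.environb[bname]
        # Undecodable bytes written through os.environb are visible as
        # a str with surrogate escapes, which encodes back losslessly.
        os.environb[bname] = b"a\xff-"
        seen_as_text = os.environ[name]
        return seen_as_bytes, os.fsencode(seen_as_text)
    finally:
        # Leave the process environment as it was found.
        os.environ.pop(name, None)
```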

--
Added file: http://bugs.python.org/file17154/issue8514.patch




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-26 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

 They're also useful for dealing with environment variables 
 which are not strictly filesystem (fs) related but also suffer 
 from the same issue requiring surrogate escape.

Yes, Python3 decodes environment variables using 
sys.getfilesystemencoding()+surrogateescape. And since my last fix on 
os.execve(), subprocess (and os.execv(p)e) also use surrogateescape to encode 
environment variables.

And yes again, I also patched os.getenv() to decode bytes name to unicode using 
sys.getfilesystemencoding()+surrogateescape.

 But other than just calling these os.encode and os.decode

*fs*encode() and *fs*decode() is a reference to the encoding: 
sys.get*filesystem*encoding().

 I just wanted to point the other use out

See also issue #8513.

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-26 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Oh! In Python3, ntpath.expanduser() supports bytes paths and uses 
sys.getfilesystemencoding() to encode a unicode environment variable to a byte 
string.

Should we remove bytes path support in ntpath.expanduser(), or support bytes in 
ntpath.fsencode()/.fsdecode()?

(sys.getfilesystemencoding() is mbcs on Windows)

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-26 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

On Monday 26 April 2010 13:06:48, you wrote:
 I don't see what environment variables have to do with the file
 system.

A POSIX system only offers *one* function about the encoding: 
nl_langinfo(CODESET) and Python3 uses it for the filenames, environment 
variables and the command line arguments.

Are you suggesting that Python3 should support a different encoding for 
environment variables and the file system? How would the user configure it?

About filenames, Python3 chooses the encoding using the locale, but the user 
cannot change it: sys.setfilesystemencoding() is removed by the site module.

 Also note that mbcs on Windows is a meta-encoding. The
 implementation of that encoding depends on the locale used by
 the Windows user. It's just a coincidence that this may actually
 work for the environment variables on Windows as well, but there's
 no guarantee.

os.getenv() should raise a TypeError on Windows if key is a byte string.

os.getenv() didn't support byte strings. I patched it to support byte strings 
(issue #8391, r80421). But I don't like my fix because we should reject 
byte strings *on Windows*. I would like to factor out the type check for 
all operations on the file system and environment variables into 
fsencode()/fsdecode().

 On Unix, you often have the case that the environment variables
 use mixed encodings, e.g. the CGI interface is a good example
 where this happens per definition. The CGI environment can
 includes file system paths, data encoded in Latin-1 (or some
 other encoding), etc.

Since Python3 chose to store environment variables as unicode strings on 
Windows and POSIX, in this specific case you should reconvert the value to 
byte strings using fsencode() and then manipulate byte strings. Because 
Python3 uses surrogateescape, you will get the original byte string values.
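
The lossless round trip claimed here can be checked with a small sketch. UTF-8 is hard-coded as an assumption in place of sys.getfilesystemencoding(), and the byte string is the one used elsewhere in this thread.

```python
raw = b"a\xff-"  # 0xFF can never appear in valid UTF-8

# Decoding maps the undecodable byte to the lone surrogate U+DCFF.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same error handler restores the original bytes.
restored = text.encode("utf-8", "surrogateescape")

# Without the error handler, the lone surrogate cannot be encoded.
try:
    text.encode("utf-8")
    strict_encode_works = True
except UnicodeEncodeError:
    strict_encode_works = False
```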

My patch should help both cases: people using unicode objects and people using 
the native OS type (bytes on POSIX). As written in my previous message, you 
can still use byte strings if you want. My patch doesn't change that (on POSIX 
systems).

--
title: Create fs_encode() and fs_decode() functions in os.path -> Create 
fsencode() and fsdecode() functions in os.path




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-26 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Version 3 of the patch: also fix os.getenv(), which now rejects bytes on Windows 
(one of the goals of this issue).

--
Added file: http://bugs.python.org/file17096/os_path_fs_encode_decode-3.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8514
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-26 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@haypocalc.com:


Removed file: http://bugs.python.org/file17082/os_path_fs_encode_decode-2.patch




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-25 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Update path: rename fs_encode/fs_decode to fsencode/fsdecode.

--
title: Create fs_encode() and fs_decode() functions in os.path -> Create 
fsencode() and fsdecode() functions in os.path
Added file: http://bugs.python.org/file17082/os_path_fs_encode_decode-2.patch




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-25 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@haypocalc.com:


Removed file: http://bugs.python.org/file17061/os_path_fs_encode_decode.patch




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-25 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Oops, Update path: I mean Update patch ;-)

--




[issue8514] Create fsencode() and fsdecode() functions in os.path

2010-04-25 Thread Gregory P. Smith

Gregory P. Smith g...@krypto.org added the comment:

I'm +0.7 on fsencode/fsdecode going into os.path.

My bikeshed 0.7?  They're also useful for dealing with environment variables 
which are not strictly filesystem (fs) related but also suffer from the same 
issue requiring surrogate escape.  But other than just calling these os.encode 
and os.decode I don't have any brilliant alternate naming suggestions.  
Thoughts?  I could easily live with os.path.fsencode/fsdecode, I just wanted to 
point the other use out.

--
nosy: +gregory.p.smith
