[issue11022] locale.setlocale() doesn't change I/O codec, os.environ does

2011-02-01 Thread Steffen Daode Nurpmeso

Steffen Daode Nurpmeso sdao...@googlemail.com added the comment:

Most of this is much too loud for a newbie who is about to read PEP 7 anyway.  
And if this community has chosen to try (?!?) not to break compatibility with 
code that has no notion of a locale setting (i.e. naively uses other code in 
that spirit), then this is simply the way it is.  Thus: you're right.  I do 
agree with what you say; we have an (8-bit) C++ library here which does this 
in its setup():

// Initialize those Locale variables we're responsible for
Locale::_ctype_cclass = Locale::_posix_cclass;
Locale::_ctype_ccase = Locale::_posix_ccase;

(Like I said: we went completely crazy here and avoid system libraries whenever 
possible, at least directly, doing the stuff ourselves and only with 
syscalls.)

Besides that, I would agree with myself that unthreaded init, an optional 
embedder locale argument, cleanup of .getprefer...() and other removals of 
setlocale() calls are/would be good design decisions.  And of course: keeping 
the thing simple and understandable for a normal user is something to keep in 
mind.

After the end (I have to excuse myself once again for a book):
I, for example, opened issue 11059 on Saturday because the HG repo was (2.7 may 
still be) not cloneable, and I did so at selenic, too.  Notes on that:
- pitrou closed it because this tracker is of course for Python bugs.   (I 
asked him to decide - thanks.)
- The selenic people told me that I added my trace to a completely wrong issue. 
 (Just searched - that's more than shown in trace dump.)
- I've found out that many, *many* issues seem to have been created due to this 
repo failure at python.org (at selenic), and I've added a note that they 
possibly should include a prominent notice that people should look for most 
recent call last before creating a new one.  (I guess that most of these 
people are programmers - who else uses HG?)
- Conclusion: maybe even os.environ[]= == locale.setlocale() is not 
simple-minded enough.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11022
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11022] locale.setlocale() doesn't change I/O codec, os.environ does

2011-01-31 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Attached patch replaces locale.getpreferredencoding() by 
locale.getpreferredencoding(False) in _io.TextIOWrapper and _pyio.TextIOWrapper.
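
As a rough way to check the effect of this change on a given interpreter, the sketch below opens a file without an encoding argument and compares the codec picked by TextIOWrapper with locale.getpreferredencoding(False). The helper names normalized() and default_open_encoding() are made up for illustration, and the comparison assumes an interpreter in which the patched behaviour (or a later equivalent) is in place.

```python
import codecs
import locale
import os
import tempfile


def normalized(name):
    """Return the canonical codec name, so 'UTF-8' and 'utf8' compare equal."""
    return codecs.lookup(name).name


def default_open_encoding():
    """Report the encoding open() picks when none is passed explicitly."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path) as f:  # deliberately no encoding= argument
            return f.encoding
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("open() default:             ", default_open_encoding())
    print("getpreferredencoding(False):", locale.getpreferredencoding(False))
```

Codec names are normalized before comparing because the two calls may spell the same codec differently.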

--
keywords: +patch
Added file: http://bugs.python.org/file20637/io_dont_set_locale.patch




[issue11022] locale.setlocale() doesn't change I/O codec, os.environ does

2011-01-31 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
stage:  -> patch review




[issue11022] locale.setlocale() doesn't change I/O codec, os.environ does

2011-01-31 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

Steffen: I'm not sure what your post means, but I think there is a chance you 
might be confused about something.  Python should *never* change the locale 
from the C locale.  A Python *program* can do so, by calling setlocale, but 
Python itself should not.  This is because when an arbitrary Python program is 
run, it needs to run in the C locale *unless it chooses otherwise*.  To do 
anything else would produce myriad portability problems for any code that is 
affected by locale settings (especially when the programmer doesn't know that 
it is so affected).

This is orthogonal to the issue of deciding what encoding to use for various 
bits of I/O, where Python may need to discover what locale the user has chosen 
as a default.  It's too bad libc makes this so hard to do safely.
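
For a program that does choose a different locale, a minimal sketch of doing so in a contained way might look like this. The context manager temporary_locale() is a hypothetical helper, not part of the locale module; it relies only on setlocale(), whose one-argument form queries the current setting without changing it. Note that setlocale() affects the whole process, so this is not safe to use while other threads depend on the locale.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_locale(category, name):
    """Switch one locale category for the duration of a with-block."""
    saved = locale.setlocale(category)  # one-argument form only queries
    try:
        yield locale.setlocale(category, name)
    finally:
        locale.setlocale(category, saved)  # restore whatever was active before


if __name__ == "__main__":
    print("before:", locale.setlocale(locale.LC_CTYPE))
    with temporary_locale(locale.LC_CTYPE, "C") as active:
        print("inside:", active)
    print("after: ", locale.setlocale(locale.LC_CTYPE))
```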

--
nosy: +r.david.murray




[issue11022] locale.setlocale() doesn't change I/O codec, os.environ does

2011-01-29 Thread Steffen Daode Nurpmeso

Steffen Daode Nurpmeso sdao...@googlemail.com added the comment:

Also with respect to Issue 6203, I could talk about a project which did not 
link against anything in the end, only ld(1) and syscalls and the undocumented 
third 'char **envp' arg to UNIX main()s.
Thus: all of you should be *very* happy about the warm and cosy environment of 
LibC etc.!
You've decided to re-Python as Py3k; I guess it has got something to do with, 
let me describe it as, UNICODE.
Thus: you need a locale.

- Environment: has an encoding, though keys are ok to parse in ASCII
  (unless your OS allows wide characters *optionally*).
  Still, LC_ values may be specified in a *lot* of different ways,
  but one thing is true: it's hard to do in plain C without being
  able to use stuff which *may* depend upon an initialized library
- Path names: have an encoding
- Console I/O: has an encoding
- File I/O: this is all dumb bytes, just do what you want

Conclusion: you need a locale.

- Hardcode defaults
- Spread specific things all across the implementation.
  I.e., in path access, use some os.path._sysdep.default_codeset(),
  in console I/O do os.console._sysdep.default_codeset() etc.
  (i'm lying about names)
- Perform an initial global initialization

So - what are you all talking about?
No one - and I really mean NO ONE - can assume that a full-blown environment 
like python(1) can be used as an isolated sandbox thing like ECMAScript!  
File I/O, child processes ...  Shall an entire interpreter lifecycle be 
possible in a signal(3) handler (uuhh, just kiddin')?  Even if that would be 
true for 2.7 (don't know), in Py3k there is graceful and seamless UNICODE 
support.
You need a locale.

I would indeed insist on the following:
- The interpreter *has* to be initialized in the cosy LibC
  (or whatever native thing) environment.
  Like this it embeds itself seamlessly in there.
  This *has* to be performed in an *unthreaded* state.
  If you are really concerned about anything here,
  add an additional argument (or is it there yet?  I did *not*
  look in there - I would/will need long months to get an idea
  of the entire python(1) system) to your interpreter's setup()-like
  thing, or allow NULL to nevertheless use setlocale() directly.
  Like this the embedder can choose herself which approach she
  wants to adhere to.
- Even if 3.DID_IT ends up with a lot of 'encoding=STRING' instead
  of 'codec=None' (aka 'codec=codec_instance'), I would implement
  the system in a way that a change at a single place is automatically
  reflected all through the system (on a no-arg-then-use-default
  basis).
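
The "change at a single place" idea in the last bullet could be sketched roughly as follows. This is purely illustrative: the module-level setting and the helpers set_default_encoding(), get_default_encoding() and open_text() are hypothetical names and are not part of Python's I/O stack.

```python
import locale
import os
import tempfile

_default_encoding = None  # hypothetical process-wide setting; None = ask the locale


def set_default_encoding(name):
    """Change the default in one place; pass None to fall back to the locale."""
    global _default_encoding
    _default_encoding = name


def get_default_encoding():
    """Return the explicit default if one is set, else the current locale's encoding."""
    if _default_encoding is not None:
        return _default_encoding
    return locale.getpreferredencoding(False)


def open_text(path, mode="r", encoding=None):
    """Open a text file, using the central default when no encoding is given."""
    return open(path, mode, encoding=encoding or get_default_encoding())


if __name__ == "__main__":
    set_default_encoding("latin-1")
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")
        with open_text(path, "w") as f:  # no encoding argument: central default
            f.write("\xf6")
        with open(path, "rb") as f:
            print(f.read())
    set_default_encoding(None)
```

Every caller that omits the encoding argument picks up the central setting, which is the no-arg-then-use-default behaviour described above.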

After the end:
someone who earned about 150 bucks from me for two books I bought
almost a decade ago, once I'd started Thinking In ... programming,
said some years ago (as I read in the German magazine c't):
In Python I am even more productive than with Java.
(I always was in doubt about that person - someone who is productive
in Java, who may that be?)
Thanks for python(1), and have a nice weekend.

--




[issue11022] locale.setlocale() doesn't change I/O codec, os.environ does

2011-01-28 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:


--
title: locale.setlocale() doesn't change I/O codec, os.environ[] does -> 
locale.setlocale() doesn't change I/O codec, os.environ does




[issue11022] locale.setlocale() doesn't change I/O codec, os.environ[] does

2011-01-27 Thread Steffen Daode Nurpmeso

New submission from Steffen Daode Nurpmeso sdao...@googlemail.com:

This bug may be based on the same problem as Issue 6203.
- My system locale is en_GB.UTF-8.
- Given a latin1 text file, open()+ will fail with
  'UnicodeDecodeError: 'utf8' codec can't decode byte 0xf6...'
- Using locale.setlocale(..., ...)
- Re-open causes same error, I/O layer codec has not been changed!
- Using os.environ[LC_ALL] = ...
- Re-open works properly, I/O layer codec has been changed.
P.S.: I am new to Python, so please don't assume I can help in solving the problem!

--
components: Library (Lib)
messages: 127177
nosy: sdaoden
priority: normal
severity: normal
status: open
title: locale.setlocale() doesn't change I/O codec, os.environ[] does
versions: Python 3.1




[issue11022] locale.setlocale() doesn't change I/O codec, os.environ[] does

2011-01-27 Thread Steffen Daode Nurpmeso

Changes by Steffen Daode Nurpmeso sdao...@googlemail.com:


--
type:  -> behavior




[issue11022] locale.setlocale() doesn't change I/O codec, os.environ[] does

2011-01-27 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

> - Using locale.setlocale(..., ...)
> - Re-open causes same error, I/O layer codec has not been changed!

Yes, this is the expected behaviour with the current code.

TextIOWrapper indirectly uses locale.getpreferredencoding() to choose your file 
encoding. If locale has the CODESET constant, this function sets LC_CTYPE to "" 
and uses nl_langinfo(CODESET) to get the locale encoding.

locale.getpreferredencoding() has an option to not set LC_CTYPE to "": 
locale.getpreferredencoding(False).

Example:
---
$ python3.1
Type "help", "copyright", "credits" or "license" for more information.
>>> from locale import getpreferredencoding, setlocale, LC_CTYPE
>>> from locale import nl_langinfo, CODESET

>>> setlocale(LC_CTYPE, None)
'fr_FR.utf8'
>>> getpreferredencoding()
'UTF-8'
>>> getpreferredencoding(False)
'UTF-8'

>>> setlocale(LC_CTYPE, 'fr_FR.iso88591')
'fr_FR.iso88591'
>>> nl_langinfo(CODESET)
'ISO-8859-1'
>>> getpreferredencoding()
'UTF-8'
>>> getpreferredencoding(False)
'ISO-8859-1'
---

Setting LC_CTYPE directly changes the result of nl_langinfo(CODESET), but not 
the result of getpreferredencoding(), because getpreferredencoding() doesn't 
care about the current locale: it uses its own LC_CTYPE value ("").

getpreferredencoding(False) uses the current locale and gives the expected 
result.
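
To inspect these values on another system without typing the session by hand, a small sketch such as the following may help. The function name codeset_report() is invented for this note; it only queries the locale and never changes it, and it skips nl_langinfo() where the platform does not provide it.

```python
import locale


def codeset_report():
    """Collect the values discussed above without changing the locale."""
    report = {
        "LC_CTYPE": locale.setlocale(locale.LC_CTYPE),  # query only
        "getpreferredencoding(False)": locale.getpreferredencoding(False),
    }
    # nl_langinfo() and CODESET exist only on some platforms (mostly Unix).
    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        report["nl_langinfo(CODESET)"] = locale.nl_langinfo(locale.CODESET)
    return report


if __name__ == "__main__":
    for key, value in codeset_report().items():
        print(f"{key}: {value}")
```

getpreferredencoding() with its default argument is left out on purpose, since calling it temporarily switches LC_CTYPE as described above.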

> - Using os.environ[LC_ALL] = ...
> - Re-open works properly, I/O layer codec has been changed.

Setting LC_ALL works because getpreferredencoding() sets LC_CTYPE to "", which 
makes it read the current values of the LC_ALL and LC_CTYPE environment 
variables.

--

Actually, TextIOWrapper doesn't use the current locale; it only uses 
(indirectly) the environment variables. I don't know which behaviour is better.

If you would like TextIOWrapper to use your current locale, use: 
open(filename, encoding=locale.getpreferredencoding(False)).

Anyway, I don't understand why you change your locale, because you know 
that your file encoding is Latin1. Why don't you directly use: open(filename, 
encoding='latin1')?
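
A minimal sketch of that suggestion, using a throwaway file that contains the byte 0xf6 from the original report. The helpers write_latin1_sample() and read_text() are illustrative names only.

```python
import os
import tempfile


def write_latin1_sample(path):
    """Create a small Latin-1 file containing the byte 0xf6 (o with diaeresis)."""
    with open(path, "wb") as f:
        f.write("sch\xf6n".encode("latin-1"))


def read_text(path, encoding):
    """Read a whole text file with an explicitly chosen encoding."""
    with open(path, encoding=encoding) as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.txt")
        write_latin1_sample(sample)
        print(ascii(read_text(sample, "latin-1")))
        try:
            read_text(sample, "utf-8")
        except UnicodeDecodeError as exc:
            print("utf-8 failed:", exc)
```

Passing the encoding explicitly makes the result independent of both the current locale and the environment variables.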

--
nosy: +haypo




[issue11022] locale.setlocale() doesn't change I/O codec, os.environ[] does

2011-01-27 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

> This bug may be based on same problem as Issue 6203.

Nope, the two issues are different. Here you want TextIOWrapper to read your 
current locale, and not your environment variables. Issue #6203 asks why 
LC_CTYPE is not C by default but is instead the user locale's LC_CTYPE (read 
from the LC_ALL or LC_CTYPE environment variables).

--




[issue11022] locale.setlocale() doesn't change I/O codec, os.environ[] does

2011-01-27 Thread Steffen Daode Nurpmeso

Steffen Daode Nurpmeso sdao...@googlemail.com added the comment:

> Anyway, I don't understand why you change your locale,
> because you know that your file encoding is Latin1. Why don't you
> directly use: open(filename, encoding='latin1')?

Fortunately Issue 9124 is being solved soon due to the very active
happy hacker haypo ...
I have read haypo's add-on to Issue 6203 and, since he refers to
this issue here, I'll add some thoughts of mine, though they possibly
do not belong in a bug tracker ...

My misunderstanding was based upon an old project of mine,
where I used the environment to initialize the library state
upon program startup only, but afterwards the entire handling was
centralized in a Locale class (changes therein were dispatched
to and thus reflected by a TextCodec etc. - you may see Issue 9727,
though my solution was hardwired).
That way, turning one screw managed the entire system.

If Python were my project, I would change this code,
because I do not see a real difference between os.environ[LC_]=
and locale.setlocale(LC_,)!
Both cases indicate the user's desire to change a specific locale
setting and thus - of course - all the changes which that implies!
So why should there be a difference?

What I really have to say is that the (3.1) implementation of 
getpreferredencoding() is a horror, not only with respect to SMP
(it's a no-go there, even with locking, but that's not present).
If Python were mine (after thinking for one hour without any
feedback from anybody else), I would do the following:
- upon program startup, init LibC environment:
  setlocale(LC_ALL, "");
  (see 
http://pubs.opengroup.org/onlinepubs/009695399/functions/setlocale.html)
  Then init this very basic codeset in an unthreaded state:
  global_very_default_codeset = nl_langinfo(CODESET);
  After that, rename this terrible do_setlocale argument to
  use_locale_active_on_program_startup.
  Then I would start a discussion whether such an argument is useful at
  all, because you possibly always say False.
  Do ya???
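
In Python terms, the start-up step proposed above might be sketched like this. It illustrates the proposal only and does not describe what CPython does; the cache variable and the functions init_startup_codeset() and startup_codeset() are hypothetical, and the fallback branch for platforms without nl_langinfo() is an assumption added here.

```python
import locale

_startup_codeset = None  # hypothetical cache, filled once during start-up


def init_startup_codeset():
    """Adopt the user's locale once and remember its codeset.

    Meant to be called a single time, before any threads are started.
    """
    global _startup_codeset
    if _startup_codeset is None:
        try:
            locale.setlocale(locale.LC_ALL, "")  # take settings from the environment
        except locale.Error:
            pass  # unsupported locale in the environment: keep the current one
        if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
            _startup_codeset = locale.nl_langinfo(locale.CODESET)
        else:
            # Assumed fallback where nl_langinfo() is missing; queries only.
            _startup_codeset = locale.getpreferredencoding(False)
    return _startup_codeset


def startup_codeset():
    """Return the cached codeset, initializing it on first use."""
    return init_startup_codeset()


if __name__ == "__main__":
    print(init_startup_codeset())
```

Later setlocale() calls do not alter the cached value, which is the "locale active on program startup" semantics the message asks for.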

--




[issue11022] locale.setlocale() doesn't change I/O codec, os.environ[] does

2011-01-27 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:


--
nosy: +Arfrever




[issue11022] locale.setlocale() doesn't change I/O codec, os.environ[] does

2011-01-27 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> Both cases indicate the users desire to change a specific locale
> setting and thus - of course - all the changes which that implies!
> So why should there be a difference?

I don't think it's intentional. I would be +1 on changing to 
getpreferredencoding(False).

--
components: +IO
nosy: +loewis, pitrou
versions: +Python 3.2
