On 20/01/2010 21:37, M.-A. Lemburg wrote:
David Malcolm wrote:
I'm thinking of making this downstream change to Fedora's site.py (and
possibly in future RHEL releases) so that the default encoding
automatically picks up the encoding from the locale:

  def setencoding():
      """Set the string encoding used by the Unicode implementation.  The
      default is 'ascii', but if you're willing to experiment, you can
      change this."""
      encoding = "ascii" # Default value set by _PyUnicode_Init()
-    if 0:
+    if 1:
          # Enable to support locale aware default string encodings.
          import locale
          loc = locale.getdefaultlocale()
          if loc[1]:
              encoding = loc[1]
      if 0:
          # Enable to switch off string to Unicode coercion and implicit
          # Unicode to string conversion.
          encoding = "undefined"
      if encoding != "ascii":
          # On Non-Unicode builds this will raise an AttributeError...
          sys.setdefaultencoding(encoding) # Needs Python Unicode build !

I've written up extensive notes on the change and the history of the
issue here:
https://fedoraproject.org/wiki/Features/PythonEncodingUsesSystemLocale

Please let me know if there are any errors on that page!

The aim is to avoid strange behavior changes when running a script
within a shell pipeline or cron job as opposed to at a tty, and to capture
some of the bizarre corner cases (for example, I found the behavior of
the pango/pygtk modules particularly surprising).

I mention it here as a "heads-up" about the change:
   - in case other distributions may want to do the same (or already do
     so, though in my very brief survey no-one else seemed to),
   - in case doing so breaks things in a way I'm not expecting (can
     anyone see any flaws in my arguments?), and
   - in case other people find my notes on the issue useful.

Hope this is helpful; can anyone see any potential problems with this
change?

Yes: such a change is unsupported by Python. The code you are
changing should really have been removed many releases ago -
it was originally only intended to serve as a basis for experimentation
on choosing the "right" default encoding.

The only supported default encodings in Python are:

  Python 2.x: ASCII
  Python 3.x: UTF-8

Is this true? I thought the default encoding in Python 3 was platform specific (e.g. cp1252 on Windows). That means files written using the default encoding on one platform may not be read correctly on another platform. Slightly off topic for this discussion, I realise.

Michael
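
For what it's worth, Python 3 reports more than one "default" encoding, which may be where the two statements above diverge. The following is only a minimal sketch, assuming a Python 3 interpreter; the helper name describe_encodings is made up for illustration and is not part of any library. It prints the string default encoding next to the locale-derived one, which is the one that varies by platform.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings the interpreter reports (hypothetical helper)."""
    return {
        # The interpreter-wide default for str <-> bytes conversions.
        "default": sys.getdefaultencoding(),
        # Derived from the platform and locale settings; this one can differ
        # from machine to machine.
        "locale_preferred": locale.getpreferredencoding(False),
        # Used when converting file names.
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in sorted(describe_encodings().items()):
        print("%-17s %s" % (name, value))
```

If portability of written files is the concern, passing an explicit encoding argument when opening a text file sidesteps the locale-derived value altogether.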

If you change these, you are on your own and strange things will
start to happen. The default encoding does not only affect
the translation between Python and the outside world, but also
all internal conversions between 8-bit strings and Unicode.
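
One way to stay independent of the default encoding is to convert explicitly wherever byte strings and Unicode meet, so no implicit conversion is ever triggered. The sketch below is an illustration only: to_text and join_text are hypothetical helper names, and "utf-8" as the fallback codec is an assumption, not something this thread prescribes.

```python
def to_text(value, encoding="utf-8"):
    """Return text, decoding byte strings with an explicitly named codec.

    Hypothetical helper: the codec is always passed in, so the result
    never depends on the interpreter's default encoding.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def join_text(parts, encoding="utf-8"):
    """Concatenate a mix of byte strings and text without implicit coercion."""
    return u"".join(to_text(part, encoding) for part in parts)


if __name__ == "__main__":
    mixed = [b"caf\xc3\xa9", u" au lait"]
    print(repr(join_text(mixed)))
```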

Hacks like what's happening in the pango module (setting the
default encoding to 'utf-8' by reloading the site module in
order to get the sys.setdefaultencoding() API back) are just
downright wrong and will cause serious problems since Unicode
objects cache their default encoded representation.

Please don't enable the use of a locale based default encoding.

If all you want to achieve is getting the encodings of
stdout and stdin set up correctly for pipes, you should
instead change the .encoding attribute of those (only).
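
A rough sketch of keeping the fix local to one stream follows. Note that it uses a different technique from the one suggested above: rather than touching the .encoding attribute, it wraps a byte stream in a codecs writer. The function name wrap_for_output, the fallback to "ascii", and the errors="replace" policy are all assumptions made for illustration.

```python
import codecs
import io
import locale


def wrap_for_output(byte_stream, encoding=None):
    """Wrap a byte stream so that text written to it is encoded explicitly.

    Hypothetical helper: only this one stream is affected; the
    interpreter-wide default encoding is left untouched.
    """
    if encoding is None:
        # Assumption: fall back to the locale's encoding, then to ASCII.
        encoding = locale.getpreferredencoding(False) or "ascii"
    writer_class = codecs.getwriter(encoding)
    return writer_class(byte_stream, errors="replace")


if __name__ == "__main__":
    # An in-memory buffer stands in for a pipe here.
    buf = io.BytesIO()
    out = wrap_for_output(buf, "utf-8")
    out.write(u"caf\xe9\n")
    print(repr(buf.getvalue()))
```

In a real script the wrapped object would be the byte-level stdout (sys.stdout.buffer on Python 3) rather than an in-memory buffer.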



--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog


