[issue19846] Python 3 raises Unicode errors with the C locale

2017-12-18 Thread STINNER Victor
STINNER Victor added the comment: Follow-up: PEP 538 (bpo-28180) and PEP 540 (bpo-29240) have been accepted and implemented in Python 3.7!

[issue19846] Python 3 raises Unicode errors with the C locale

2016-12-20 Thread Nick Coghlan
Nick Coghlan added the comment: Also see http://bugs.python.org/issue28180 for a more recent proposal to tackle this by coercing the C locale to the C.UTF-8 locale

[issue19846] Python 3 raises Unicode errors with the C locale

2016-04-22 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: Removed message: http://bugs.python.org/msg263975

[issue19846] Python 3 raises Unicode errors with the C locale

2016-04-22 Thread SilentGhost
Changes by SilentGhost: nosy: +Sworddragon, a.badger, bkabrda, haypo, jwilk, larry, lemburg, loewis, ncoghlan, pitrou, r.david.murray, serhiy.storchaka, terry.reedy

[issue19846] Python 3 raises Unicode errors with the C locale

2016-04-22 Thread lissacoffey
lissacoffey added the comment: Using an environment variable is not the holy grail for this. On writing a non-single-user application you can't expect the user to set extra environment variables. If compatibility is the only reason in my opinion it would be much better to include something li

[issue19846] Python 3 raises Unicode errors with the C locale

2015-05-17 Thread Terry J. Reedy
Changes by Terry J. Reedy: stage: patch review -> resolved

[issue19846] Python 3 raises Unicode errors with the C locale

2014-12-07 Thread Terry J. Reedy
Terry J. Reedy added the comment: Since Victor's alternative in #19977 has been applied, should this issue be closed?

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-21 Thread Jakub Wilk
Changes by Jakub Wilk: nosy: +jwilk

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-13 Thread Nick Coghlan
Nick Coghlan added the comment: Thanks Victor - I now agree that trying to guess another encoding is a bad idea, and that enabling surrogateescape for the standard streams under the C locale is a better way to go.
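
For readers unfamiliar with the error handler mentioned above, here is a minimal sketch of what surrogateescape does. It assumes the "ascii" codec as a stand-in for the encoding Python selects under the C locale; the sample string is illustrative only.

```python
# Bytes as they might arrive from a UTF-8 terminal or filesystem
# while Python believes the encoding is ASCII (C locale).
raw = "café".encode("utf-8")  # b'caf\xc3\xa9'

# Strict decoding fails on the first non-ASCII byte.
try:
    raw.decode("ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# surrogateescape maps each undecodable byte to a lone surrogate
# (U+DC80..U+DCFF) instead of raising.
text = raw.decode("ascii", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("ascii", errors="surrogateescape")
```

The point of the handler is that unknown bytes pass through Python unchanged, so no guess about the real encoding is needed.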

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-13 Thread STINNER Victor
STINNER Victor added the comment: I propose modifying the error handler; the encoding cannot be changed. See my message explaining why it's not possible to change the encoding: http://bugs.python.org/issue19846#msg205675

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-13 Thread STINNER Victor
STINNER Victor added the comment: I created issue #19977 as a follow-up to this one: "Use surrogateescape error handler for sys.stdout on UNIX for the C locale".

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-13 Thread Toshio Kuratomi
Toshio Kuratomi added the comment: It's not a bug for upstart, systemd, sysvinit, cron, etc to use LANG=C. The POSIX locale is the only locale guaranteed to exist on a system. Therefore these low level services should be using LANG=C. Embedded systems, thin clients, and other low memory or

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-13 Thread Sworddragon
Sworddragon added the comment: > https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/1235483 After opening many hundred tickets I would say: With luck this ticket will get a response within the next year. But in the worst case it will be simply refused. > I found examples using "LANG=$LAN

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-13 Thread STINNER Victor
STINNER Victor added the comment: By the way, Java behaves as Python: with LANG=C, Java uses ASCII: http://stackoverflow.com/questions/13415975/cant-read-utf-8-filenames-when-launched-as-an-upstart-service > udev and Upstart are not setting LANG So it's an issue in udev and Upstart. See for ex

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-13 Thread Sworddragon
Sworddragon added the comment: By the way I have found a valid use case for LANG=C. udev and Upstart are not setting LANG which will result in the ascii encoding for invoked Python scripts. This could be a problem since these applications are commonly dealing with non-ascii filesystems.

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-13 Thread Nick Coghlan
Nick Coghlan added the comment: There's an alternative to trying to force a different encoding for the standard streams when the OS claims ASCII as the OS encoding: we can default to surrogateescape as the error handler, on the assumption that whatever the *real* OS encoding is, it definitely isn

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-13 Thread Sworddragon
Sworddragon added the comment: > Instead, open() determines the default encoding by calling the same function > that's used to initialize Py_FileSystemDefaultEncoding: get_locale_encoding() > in Python/pythonrun.c. Which on POSIX systems calls the POSIX function > nl_langinfo(). open() will

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-13 Thread Larry Hastings
Larry Hastings added the comment: > "The fact that write() -> open() relies on sys.getfilesystemencoding() > (respectively locale.getpreferredencoding()) at default as encoding is > either a defect or a bad design (I leave the decision to you)." > > Or am I overlooking something? First, you shou

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-13 Thread Sworddragon
Sworddragon added the comment: >> The fact that write() uses sys.getfilesystemencoding() is either >> a defect or a bad design (I leave the decision to you). > I have good news for you. write() does not call sys.getfilesystemencoding(), > because the encoding is set at the time the > file is op

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-10 Thread Toshio Kuratomi
Toshio Kuratomi added the comment: Yes, it returns a list but unless I'm missing something in the general case it's the caller's responsibility to loop through the charsets to test for failure and try again. This is not done automatically. In the specific case we're talking about, first get_f

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-10 Thread STINNER Victor
STINNER Victor added the comment: > It would be interesting to test this approach (try utf-8 or use the locale encoding) ... Oh, it may be easy to implement it for decoders, but what about encoders? Should os.fsencode() always use UTF-8??

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-10 Thread STINNER Victor
STINNER Victor added the comment: 2013/12/10 Toshio Kuratomi : > if G_FILENAME_ENCODING: > charset = the first charset listed in G_FILENAME_ENCODING > if charset == '@locale': > charset = charset of user's locale > elif G_BROKEN_FILENAMES: > charset = charset of user's locale
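
The quoted pseudocode can be sketched as a small Python function. This is only an illustration: the function name and parameters are hypothetical, and the final `else` branch is an assumption, since the quoted message is cut off before it (the GLib documentation describes UTF-8 as the fallback).

```python
def filename_charset(environ, locale_charset):
    """Pick the filename charset following the quoted pseudocode.

    `environ` is a mapping of environment variables and `locale_charset`
    is the charset of the user's locale; both names are illustrative.
    """
    encodings = environ.get("G_FILENAME_ENCODING")
    if encodings:
        # Only the first entry of the comma-separated list decides the result.
        charset = encodings.split(",")[0].strip()
        if charset == "@locale":
            charset = locale_charset
    elif "G_BROKEN_FILENAMES" in environ:
        charset = locale_charset
    else:
        # Assumption: this branch is not visible in the truncated quote.
        charset = "UTF-8"
    return charset
```

Note that nothing in this selection tries a second charset after a failure, which matches the remark elsewhere in the thread that looping over the list is left to the caller.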

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-10 Thread Toshio Kuratomi
Toshio Kuratomi added the comment: Looking at the glib code, this looks like the SO post is closer to the truth. The API documentation for g_filename_to_utf8() is over-simplified to the point of confusion. This section of the glib API document is closer to what the code is doing: https://de

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-10 Thread STINNER Victor
STINNER Victor added the comment: 2013/12/10 Martin v. Löwis : > From what I read, it appears that the SO posting is plain wrong. Consider, > for example, > > https://developer.gnome.org/glib/stable/glib-Character-Set-Conversion.html#g-filename-to-utf8 > > # Converts a string which is in the e

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-09 Thread Martin v . Löwis
Martin v. Löwis added the comment: From what I read, it appears that the SO posting is plain wrong. Consider, for example, https://developer.gnome.org/glib/stable/glib-Character-Set-Conversion.html#g-filename-to-utf8 # Converts a string which is in the encoding used by GLib for filenames #

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-09 Thread Nick Coghlan
Nick Coghlan added the comment: I confess I didn't independently verify the glib claim in the Stack Overflow post. However, Toshio's post covers the specific error case we were discussing at Flock (and I had misremembered), where the standard streams are classed as "OS APIs" for the purpose of d

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-09 Thread Antoine Pitrou
Antoine Pitrou added the comment: > It's simply not always true: some Linux distros would be better handled > like OS X, where we always use UTF-8, regardless of what the locale says. Perhaps by the 3.5 timeframe we can default to utf-8 on all Unix systems?

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-09 Thread Martin v . Löwis
Martin v. Löwis added the comment: Nick: which glib functions are you specifically referring to? Many of them don't deal with strings at all, and of those that do, many are encoding-agnostic (i.e. it is correct to claim that they operate on UTF-8, but likewise also correct that they operate on

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-09 Thread Nick Coghlan
Nick Coghlan added the comment: There's a wrong assumption here: glib applications on Linux use UTF-8 regardless of locale. That's the part I have a problem with: the assumption that the locale will correctly specify the encoding to use for OS APIs on modern Linux systems. It's simply not always

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-09 Thread Toshio Kuratomi
Toshio Kuratomi added the comment: Ahh... added to the nosy list and bug closed all before I got up for the day ;-) A few words: I do think that python is broken here. I do not think that translating everything to utf-8 if ascii is the locale's encoding is the solution. As I would state it,

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-09 Thread STINNER Victor
STINNER Victor added the comment: > There is a big difference between environment variables and internal calls: > Environment variables are user-space while builtin/library functions are > developer-space. You can reopen sys.stdout with a different encoding and replace sys.stdout. I don't rem
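
A minimal sketch of the "reopen sys.stdout" approach mentioned above. The helper name `rewrap` is hypothetical, and the demonstration runs against an in-memory buffer rather than the real stdout so that it has no side effects.

```python
import io


def rewrap(binary_stream, encoding="utf-8", errors="strict"):
    """Return a new text stream over an existing binary stream."""
    return io.TextIOWrapper(
        binary_stream,
        encoding=encoding,
        errors=errors,
        line_buffering=True,
    )


# Demonstration on an in-memory buffer standing in for sys.stdout.buffer.
demo_buffer = io.BytesIO()
demo_stream = rewrap(demo_buffer)
demo_stream.write("café\n")
demo_stream.flush()
demo_output = demo_buffer.getvalue()
```

In an application, the equivalent would be `sys.stdout = rewrap(sys.stdout.buffer)` early at start-up. On Python 3.7 and later, `sys.stdout.reconfigure(encoding=..., errors=...)` achieves a similar effect without creating a new wrapper.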

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-09 Thread Sworddragon
Sworddragon added the comment: > If the environment variable is not enough There is a big difference between environment variables and internal calls: Environment variables are user-space while builtin/library functions are developer-space. > I have good news for you. write() does not cal

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-09 Thread Larry Hastings
Larry Hastings added the comment: > The fact that write() uses sys.getfilesystemencoding() is either > a defect or a bad design (I leave the decision to you). I have good news for you. write() does not call sys.getfilesystemencoding(), because the encoding is set at the time the file is opened.
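
The claim that the encoding is fixed when the file is opened can be checked with a short sketch. The temporary path and sample text are illustrative; the explicit `encoding="ascii"` stands in for what the C locale would select by default.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# The encoding is chosen once, by open(), and stored on the file object.
with open(path, "w", encoding="ascii") as f:
    encoding_at_open = f.encoding
    try:
        # write() only applies the encoding already stored on f.
        f.write("café")
        write_failed = False
    except UnicodeEncodeError:
        write_failed = True

# Without an explicit argument, open() picks a default at open time.
with open(path, "w") as f:
    default_encoding = f.encoding
```

Passing `encoding=` explicitly to open() is therefore the way to make file output independent of the locale.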

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-09 Thread STINNER Victor
STINNER Victor added the comment: > The fact that write() uses sys.getfilesystemencoding() is either a defect or > a bad design (I leave the decision to you). "Standard streams (sys.stdin, sys.stdout, sys.stderr) use the locale encoding. sys.stdin and sys.stdout use the strict error handler,

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-09 Thread Sworddragon
Sworddragon added the comment: > I'm closing the issue as invalid, because Python 3 behaviour is correct and must not be changed. The fact that write() uses sys.getfilesystemencoding() is either a defect or a bad design (I leave the decision to you). But I'm still missing a reply to my s

[issue19846] Python 3 raises Unicode errors with the C locale

2013-12-09 Thread STINNER Victor
STINNER Victor added the comment: I'm closing the issue as invalid, because Python 3 behaviour is correct and must not be changed. Standard streams (sys.stdin, sys.stdout, sys.stderr) use the locale encoding. sys.stdin and sys.stdout use the strict error handler, sys.stderr uses the backsla
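
The difference between the two error handlers named above can be seen with a small sketch. It assumes the "ascii" codec as the locale encoding under the C locale; the sample string is illustrative.

```python
import sys

text = "café"

# The strict handler (described above for sys.stdin and sys.stdout)
# raises on characters the encoding cannot represent.
try:
    text.encode("ascii", errors="strict")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# The backslashreplace handler (described above for sys.stderr)
# substitutes an escape sequence, so error messages can always be printed.
escaped = text.encode("ascii", errors="backslashreplace")

# The handlers actually in effect can be inspected at runtime; getattr is
# used because a replaced stream object may not expose this attribute.
stdout_errors = getattr(sys.stdout, "errors", None)
stderr_errors = getattr(sys.stderr, "errors", None)
```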