[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-06-20 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:


New changeset f87d2038fadd9c067d50fb2f1d7c2f37b9f3893a by Miss Islington (bot) 
in branch '3.10':
bpo-43667: Add news fragment for Solaris changes (GH-26405) (GH-26498)
https://github.com/python/cpython/commit/f87d2038fadd9c067d50fb2f1d7c2f37b9f3893a


--
nosy: +pablogsal

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-06-02 Thread miss-islington


Change by miss-islington :


--
pull_requests: +25094
pull_request: https://github.com/python/cpython/pull/26498




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-27 Thread STINNER Victor


STINNER Victor  added the comment:

I merged your PR and backported it to add a NEWS entry, thanks.

--




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-27 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 427232f9d221d54870fa3e89bd1dac55cf42243f by Miss Islington (bot) 
in branch '3.9':
bpo-43667: Add news fragment for Solaris changes (GH-26405) (GH-26410)
https://github.com/python/cpython/commit/427232f9d221d54870fa3e89bd1dac55cf42243f


--




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-27 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 0574b0686d76e6f9199f800b5f32bd56eaff3c77 by Miss Islington (bot) 
in branch '3.10':
bpo-43667: Add news fragment for Solaris changes (GH-26405) (GH-26409)
https://github.com/python/cpython/commit/0574b0686d76e6f9199f800b5f32bd56eaff3c77


--




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-27 Thread miss-islington


Change by miss-islington :


--
nosy: +miss-islington
nosy_count: 3.0 -> 4.0
pull_requests: +25003
pull_request: https://github.com/python/cpython/pull/26409




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-27 Thread miss-islington


Change by miss-islington :


--
pull_requests: +25004
pull_request: https://github.com/python/cpython/pull/26410




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-27 Thread STINNER Victor

STINNER Victor  added the comment:


New changeset 164a4f46d1606e21d82babc010e397a9116e6730 by Jakub Kulík in branch 
'main':
bpo-43667: Add news fragment for Solaris changes (GH-26405)
https://github.com/python/cpython/commit/164a4f46d1606e21d82babc010e397a9116e6730


--




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-27 Thread Jakub Kulik


Change by Jakub Kulik :


--
pull_requests: +24998
pull_request: https://github.com/python/cpython/pull/26405




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-25 Thread STINNER Victor


STINNER Victor  added the comment:

I am closing the issue, but you can still reference the bpo issue number in the
PR that adds the changelog (NEWS) entry.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-25 Thread Jakub Kulik


Jakub Kulik  added the comment:

Sorry for the delayed response.

Considering that we are not delivering or using 3.8 in any way and this issue 
doesn't seem to impact anybody else, we can omit the backport to 3.8. I will 
prepare another PR with a news fragment, and after that, this can be considered 
solved and closed.

--
versions: +Python 3.11 -Python 3.8




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-25 Thread STINNER Victor


STINNER Victor  added the comment:

Do you want to attempt to backport the fix to 3.8, or can this issue be closed?

--




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-21 Thread STINNER Victor


STINNER Victor  added the comment:

A backport to 3.8 may be more complicated. It's up to you to decide whether you
want to backport it or not. I merged your 3.9 backport; it looks very close to
the change made in the main branch.

--




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-21 Thread STINNER Victor

STINNER Victor  added the comment:


New changeset d3cc68900dc99966007112f884779895daefc7db by Jakub Kulík in branch 
'3.9':
[3.9] bpo-43667: Fix broken Unicode encoding in non-UTF locales on Solaris 
(GH-25096) (GH-25847)
https://github.com/python/cpython/commit/d3cc68900dc99966007112f884779895daefc7db


--




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-03 Thread Jakub Kulik


Change by Jakub Kulik :


--
components: +Unicode -Tests
versions: +Python 3.10, Python 3.8, Python 3.9 -Python 3.11




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-03 Thread Sujal Patel


Change by Sujal Patel :


--
components: +Tests -Unicode
versions: +Python 3.11 -Python 3.10, Python 3.7, Python 3.8, Python 3.9




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-05-03 Thread Jakub Kulik


Change by Jakub Kulik :


--
pull_requests: +24530
pull_request: https://github.com/python/cpython/pull/25847




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-04-30 Thread STINNER Victor

STINNER Victor  added the comment:


New changeset 9032cf5cb1e33c0349089cfb0f6bf11ed3c30e86 by Jakub Kulík in branch 
'master':
bpo-43667: Fix broken Unicode encoding in non-UTF locales on Solaris (GH-25096)
https://github.com/python/cpython/commit/9032cf5cb1e33c0349089cfb0f6bf11ed3c30e86


--




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-03-30 Thread Jakub Kulik


Jakub Kulik  added the comment:

I forgot to mention: this affects Oracle Solaris. I tested this on SmartOS and
could not reproduce it there, as it seems to use a Unicode representation for
all locales. Based on the documentation, this might affect other systems as
well (e.g. HP-UX specifically says: 'These values may not be compatible with
values obtained by specifying other locales that are supported'), but it's hard
to tell without testing that.

This one-liner fails with ValueError: character U+30000069 is not in range
[U+0000; U+10ffff] if the issue is present:

python3.7 -c 'import datetime; import locale; locale.setlocale(locale.LC_ALL,"es_ES.ISO8859-1"); datetime.date(2001, 1, 3).strftime("%a")'
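
The same check can be written as a small script that reports what happened instead of only raising. This is a minimal sketch, not part of the attached patch: the helper name `check_strftime` and its status strings are made up for illustration, and it assumes the locale from the one-liner may not be installed on the machine running it.

```python
import datetime
import locale


def check_strftime(locale_name="es_ES.ISO8859-1"):
    """Return a short status string describing how strftime("%a") behaves.

    check_strftime is a hypothetical helper; it is not part of CPython.
    """
    saved = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        try:
            locale.setlocale(locale.LC_ALL, locale_name)
        except locale.Error:
            # The requested locale is not installed on this system.
            return "locale unavailable"
        try:
            name = datetime.date(2001, 1, 3).strftime("%a")
        except ValueError as exc:
            # On an affected build, the conversion of the wide-character
            # result to str fails here.
            return "affected: %s" % exc
        return "ok: %r" % name
    finally:
        # Restore whatever locale was active before the check.
        locale.setlocale(locale.LC_ALL, saved)


print(check_strftime())
```

A result starting with "ok:" only shows that no error was raised; it does not rule out the other symptom, output with wrong symbols.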

--




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-03-30 Thread Jakub Kulik


Change by Jakub Kulik :


--
keywords: +patch
pull_requests: +23840
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/25096




[issue43667] Solaris: Fix broken Unicode encoding in non-UTF locales

2021-03-30 Thread Jakub Kulik


New submission from Jakub Kulik :

On Linux, wchar_t values correspond to Unicode code points; however, that does
not have to be the case, as the standard allows any arbitrary representation
to be used, and Solaris makes use of that freedom.

In Oracle Solaris, the internal form of wchar_t is specific to a locale; in the 
Unicode locales, wchar_t has the UTF-32 Unicode encoding form, and other 
locales have different representations [1].

This is an issue because Python expects wchar_t to correspond with Unicode,
which on Oracle Solaris with a non-UTF locale results either in errors (values
are outside the Unicode range) or in output with different symbols.
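
To illustrate the first failure mode, here is a minimal sketch of the range check involved. The value 0x30000069 is only an assumed example of a locale-specific wide-character value, and `is_unicode_code_point` is a made-up helper name, not a CPython function.

```python
import sys

# Largest valid Unicode code point (0x10FFFF).
MAX_CODE_POINT = sys.maxunicode


def is_unicode_code_point(value):
    """Return True if value can be interpreted as a Unicode code point."""
    return 0 <= value <= MAX_CODE_POINT


samples = {
    "Unicode value for 'é'": 0xE9,
    # Assumed example of a locale-specific wchar_t value, for illustration.
    "hypothetical locale-specific value": 0x30000069,
}

for label, value in samples.items():
    if is_unicode_code_point(value):
        print(label, "->", repr(chr(value)))
    else:
        # chr(value) would raise ValueError for such a value.
        print(label, "-> outside the Unicode range")
```

The second failure mode is harder to detect: a locale-specific value that happens to fall inside the valid range is accepted and simply produces a different symbol.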

Unicode locales work as expected, but they are not an acceptable workaround for 
some Oracle Solaris users that cannot use Unicode encoding for various reasons.


Because of that, we fixed it a few months ago with a patch to
`PyUnicode_FromWideChar`, which handles the conversion to Unicode (attached in
the PR). It has been tested over the last half year, and we haven't seen any
related issues since.
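
The idea of converting locale-specific wide characters before building a string can be sketched in Python with a toy model. Everything here is an assumption made for illustration: the tag value, the bit layout, and the function names are invented and describe neither the real Solaris representation nor the code in the PR.

```python
# Toy model: a wide character from a non-Unicode locale is assumed to carry
# a tag in its top bits and the low 7 bits of the original byte.
TAG = 0x30000000       # hypothetical tag value
TAG_MASK = 0xF0000000  # hypothetical mask selecting the tag bits


def to_code_point(wide_value, encoding="iso8859-1"):
    """Map one toy wide-character value to a Unicode code point."""
    if wide_value & TAG_MASK != TAG:
        # Untagged values (plain ASCII in this model) are used as-is.
        return wide_value
    # Rebuild the original byte by restoring its high bit, then decode it
    # with the locale's encoding to obtain the real code point.
    original_byte = (wide_value & 0x7F) | 0x80
    return ord(bytes([original_byte]).decode(encoding))


def from_wide_chars(values, encoding="iso8859-1"):
    """Build a str from a sequence of toy wide-character values."""
    return "".join(chr(to_code_point(v, encoding)) for v in values)


print(from_wide_chars([0x6D, 0x69, 0x30000069]))  # prints: mié
```

The point of the sketch is only the ordering: the values are translated to Unicode first, and the string is built afterwards, so the range check never sees a locale-specific value.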

Is something like this acceptable, or should it be fixed in a different place
or in a different way? All comments are appreciated.

[1] https://docs.oracle.com/cd/E36784_01/html/E39536/gmwkm.html

--
components: Unicode
messages: 389813
nosy: ezio.melotti, kulikjak, vstinner
priority: normal
severity: normal
status: open
title: Solaris: Fix broken Unicode encoding in non-UTF locales
versions: Python 3.10, Python 3.7, Python 3.8, Python 3.9
