[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2017-06-20 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

This would change the documented behavior. Even if we allow this change in a new 
feature release, it can't be made in maintained releases.

A tuple of integers uses excessive memory and is slow. A bytes object is more 
compact (though it may be less compact than a string) and faster. But on a 
little-endian platform every wchar_t would have to be converted to big-endian 
for byte-wise comparison of bytes objects to give the right order.
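
The byte-order point can be illustrated with a small sketch. It assumes 32-bit code values packed with the standard struct module; the helper name pack_codes and the sample values are made up for the illustration and do not come from any patch.

```python
import struct


def pack_codes(codes, byteorder):
    """Pack 32-bit unsigned code values; byteorder is '<' (little) or '>' (big)."""
    return struct.pack(f"{byteorder}{len(codes)}I", *codes)


# Hypothetical one-element "transformed strings": 0x0100 sorts after 0x0002.
high, low = [0x0100], [0x0002]

# Big-endian bytes compare in the same order as the integers.
big_ok = pack_codes(high, ">") > pack_codes(low, ">")

# Little-endian bytes put the least significant byte first, so the order flips.
little_ok = pack_codes(high, "<") > pack_codes(low, "<")

print(big_ok, little_ok)
```

Only the big-endian packing keeps the numeric order under plain bytes comparison, which is why a conversion step would be needed on little-endian machines.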

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2017-06-20 Thread STINNER Victor

STINNER Victor added the comment:

> Agree, it's more a Python limitation.

What do you think of changing locale.strxfrm() from str to bytes or tuple? I 
prefer a tuple.

But again, I'm not super motivated by this change. IMHO there are more severe 
issues that should be fixed in Solaris.

--




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2017-06-20 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Agree, it's more a Python limitation.

--




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2017-06-20 Thread STINNER Victor

STINNER Victor added the comment:

> It is possible to use the special "encoding" for transformed strings on 
> platforms with broken wcsxfrm().

I wouldn't say that the function is wrong. wchar_t is 32 bits wide, so the
function is free to use numbers > 0x10ffff. It's more a Python
limitation, no?

--




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2017-06-20 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
components: +Extension Modules -Interpreter Core




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2017-06-20 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

It is possible to use the special "encoding" for transformed strings on 
platforms with broken wcsxfrm().

All codes < 0x10000 are not changed. Codes >= 0x10000 are encoded as a pair: 
0x10000 + (code >> 16), code & 0xffff.
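
A minimal sketch of that encoding follows. It reads the threshold as 0x10000 and the mask as 0xffff, which is an assumption wherever the constants are truncated in the archived text, and the function name encode_transformed is hypothetical. It is not the code of any actual patch.

```python
def encode_transformed(codes):
    """Map raw wcsxfrm() code values to a str that Python can hold.

    Values below 0x10000 are kept as they are. Larger values become two
    characters: 0x10000 + (code >> 16), then code & 0xffff.
    """
    out = []
    for code in codes:
        if code < 0x10000:
            out.append(chr(code))
        else:
            # For 32-bit input the high part is at most 0x1ffff, a valid code point.
            out.append(chr(0x10000 + (code >> 16)))
            out.append(chr(code & 0xFFFF))
    return "".join(out)


# The value from the Solaris 10 traceback becomes two valid code points.
encoded = encode_transformed([0x101010E])
print([hex(ord(ch)) for ch in encoded])
```

Because every escaped value starts with a character above 0xffff, it still sorts after all unescaped values, so comparisons between encoded results keep the original order.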

--
components: +Interpreter Core
type:  -> behavior
versions: +Python 3.5, Python 3.6, Python 3.7 -Python 3.3, Python 3.4




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2017-06-20 Thread Antoine Pitrou

Changes by Antoine Pitrou :


--
nosy:  -pitrou




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2017-06-20 Thread STINNER Victor

STINNER Victor added the comment:

Currently, the function is documented to return a string:
https://docs.python.org/dev/library/locale.html#locale.strxfrm
"Transforms a string to one that can be used in locale-aware comparisons."

The problem is that we don't have enough developers who care about 
Solaris/Illumos to fix these issues (propose patches).

test_locale is just *one* example. The curses module has been broken for years on 
Solaris, if I recall correctly...

--




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2017-06-20 Thread STINNER Victor

STINNER Victor added the comment:

A solution for that would be to return the raw byte string or a list of 
integers, rather than a Unicode string.

I don't think that the locale.strxfrm() result is supposed to be displayed in a 
terminal. It should only be used to compare two strings, or as a key 
function for list.sort() for example.
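
For illustration, the typical key-function use looks like the sketch below. The helper name sort_by_locale is made up. The result depends on whatever LC_COLLATE locale is active, so no particular locale is set here.

```python
import locale


def sort_by_locale(words):
    """Return words sorted by the collation rules of the current locale."""
    # The transformed strings are only compared with each other, never displayed.
    return sorted(words, key=locale.strxfrm)


print(sort_by_locale(["b", "c", "a"]))
```

Since only the relative order of the transformed values matters here, the concrete return type (str, bytes or tuple) is invisible to this kind of caller.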

--




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2017-06-20 Thread Peter

Peter added the comment:

I'm getting the same 2 errors in Python 3.4.6 on Solaris 11.

Comes up when you run 'gmake test' or

./python -W default -bb -E -W error::BytesWarning -m test -r -w -j 0 -v test_locale.py

--
nosy: +petriborg




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2017-03-10 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Maybe issue15954 is related to this issue. Is this issue still reproducible?

--
nosy: +serhiy.storchaka




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2012-10-17 Thread Trent Nelson

Trent Nelson added the comment:

With the caveat that I know absolutely nothing about locales, here's what I've 
been able to reduce the problem down to:

zinc (alias s11, Solaris 11 x64):
>>> locale.setlocale(locale.LC_ALL, 'C')
'C'
>>> locale.strxfrm('a')
'a'
>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
'en_US.UTF-8'
>>> locale.strxfrm('a')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: character U+10105a3 is not in range [U+0000; U+10ffff]
 

nitrogen (alias s10, Solaris 10 SPARC):

>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
'en_US.UTF-8'
>>> locale.strxfrm('a')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: character U+101010e is not in range [U+0000; U+10ffff]

Not sure how relevant it is, but on both those Solaris boxes, locale.LC_ALL 
is 6, whereas on BSD and OS X it always seems to be 0.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue16258
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2012-10-17 Thread Jesús Cea Avión

Jesús Cea Avión added the comment:

I can reproduce this on my x86 Solaris 10 update 10.

--




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2012-10-17 Thread Antoine Pitrou

Antoine Pitrou added the comment:

With the system Python on s10:

Python 2.6.8 (unknown, Apr 13 2012, 17:08:12) [C] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.strxfrm('a')
'a'
>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
'en_US.UTF-8'
>>> locale.strxfrm('a')
'\x01\x01\x01\x0e\x01\x01\x01\x01\x01\x01\x01\x02\x01\x01\x0fi\x01\x01\x01\x01'
>>> locale.strxfrm('a').decode('utf-8')
u'\x01\x01\x01\x0e\x01\x01\x01\x01\x01\x01\x01\x02\x01\x01\x0fi\x01\x01\x01\x01'

The difference between Python 2 and Python 3 is that Python 3 uses wcsxfrm, not 
strxfrm. Apparently Solaris' wcsxfrm is some broken thing that returns the same 
thing as strxfrm, cast to a wchar_t *, hence the character U+101010e 
(corresponding to the '\x01\x01\x01\x0e' bytestring above).
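
As a quick check of that correspondence, here is a sketch that reads the four bytes as one 32-bit value and tries to turn it into a character. Big-endian is assumed because the s10 box is SPARC, and the variable names are only illustrative.

```python
# First four bytes of the Python 2 strxfrm('a') result shown above.
raw = b"\x01\x01\x01\x0e"

# Read them as a single big-endian 32-bit value, as if cast to wchar_t.
code = int.from_bytes(raw, "big")
print(hex(code))

# A Python str only accepts code points up to sys.maxunicode (0x10ffff),
# so building a character from this value fails.
try:
    chr(code)
    accepted = True
except ValueError:
    accepted = False
print(accepted)
```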

--
nosy: +loewis, pitrou




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2012-10-17 Thread Jesús Cea Avión

Jesús Cea Avión added the comment:

BTW, this works in python 3.2:

x86, 32 bit python, Solaris 10 update 10:


Python 3.2.3 (default, Apr 12 2012, 13:29:13) 
[GCC 4.7.0] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
'en_US.UTF-8'
>>> locale.strxfrm('a')
'���\U00010f69�'


--
keywords: +3.3regression




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2012-10-17 Thread Antoine Pitrou

Antoine Pitrou added the comment:

It only works on Python 3.2 because PyUnicode_FromWideChar is more permissive, 
it seems. The first character in the wchar_t string returned by Solaris is 
still 0x101010e.

--




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2012-10-17 Thread Antoine Pitrou

Antoine Pitrou added the comment:

(by the way, I also tried a memset() before calling wcsxfrm(): no change)

--




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2012-10-17 Thread Jesús Cea Avión

Changes by Jesús Cea Avión j...@jcea.es:


--
nosy: +haypo




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2012-10-17 Thread STINNER Victor

STINNER Victor added the comment:

Python 3.2 rejects characters outside the range U+0000-U+10ffff in
some operations, but not everywhere. I fixed Python 3.3 to be more
strict and always reject characters outside this range. I noticed the
Solaris issue with mbstowcs() on locale encodings other than
UTF-8: #13441. I asked on python-dev whether it's more important to be
strict on Unicode, or whether we need to handle the wcsxfrm() issue:
http://mail.python.org/pipermail/python-dev/2011-December/114759.html

Stefan Krah answered: Yes, if the cause is a broken mbstowcs() that
sounds good.
http://mail.python.org/pipermail/python-dev/2011-December/114781.html

I asked for help on OpenIndiana IRC channel, but nobody had a locale
encoding different than UTF-8. I didn't have access to a Solaris box,
so I chose to skip failing tests on Solaris.

My commit 2a2d0872d993 (and 7ffe3d304487) skips many locales to
work around this issue in test__locale.

--




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2012-10-16 Thread Trent Nelson

New submission from Trent Nelson:

======================================================================
ERROR: test_strxfrm (test.test_locale.TestEnUSCollation)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/cpython/buildslave/3.x.snakebite-solaris10-u10ga2-sparc64-1/build/Lib/test/test_locale.py", line 346, in test_strxfrm
    self.assertLess(locale.strxfrm('a'), locale.strxfrm('b'))
ValueError: character U+101010e is not in range [U+0000; U+10ffff]

======================================================================
ERROR: test_strxfrm_with_diacritic (test.test_locale.TestEnUSCollation)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/cpython/buildslave/3.x.snakebite-solaris10-u10ga2-sparc64-1/build/Lib/test/test_locale.py", line 367, in test_strxfrm_with_diacritic
    self.assertLess(locale.strxfrm('à'), locale.strxfrm('b'))
ValueError: character U+101010e is not in range [U+0000; U+10ffff]

--

Haven't investigated yet.

--
messages: 173124
nosy: trent
priority: normal
severity: normal
status: open
title: test_local.TestEnUSCollection failures on Solaris 10
versions: Python 3.3, Python 3.4




[issue16258] test_local.TestEnUSCollection failures on Solaris 10

2012-10-16 Thread Jesús Cea Avión

Changes by Jesús Cea Avión j...@jcea.es:


--
nosy: +jcea
