Mark Dickinson dicki...@gmail.com added the comment:
Committed to py3k, r70452.
Since this is partway between a bugfix and a new feature, I suggest that
it's not worth merging it to 3.0 (or 2.6). It should be backported to
2.7, however; I'll do this after verifying that the py3k buildbots
Mark Dickinson dicki...@gmail.com added the comment:
Backported to the trunk in r70454. Thanks, all!
--
resolution: -> fixed
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4474
Changes by STINNER Victor victor.stin...@haypocalc.com:
Removed file:
http://bugs.python.org/file13167/unicode_fromwidechar_surrogate-6.patch
Mark Dickinson dicki...@gmail.com added the comment:
Good catch! Added defined(SIZEOF_WCHAR_T) to the testcapi code as well,
and removed the change to PC/pyconfig.h, since we don't need it any
more...
Added file:
http://bugs.python.org/file13210/unicode_fromwidechar_surrogate-7.patch
Changes by STINNER Victor victor.stin...@haypocalc.com:
Removed file:
http://bugs.python.org/file13166/unicode_fromwidechar_surrogate-5.patch
Changes by STINNER Victor victor.stin...@haypocalc.com:
Removed file:
http://bugs.python.org/file12890/unicode_fromwidechar_surrogate-4.patch
STINNER Victor victor.stin...@haypocalc.com added the comment:
add defined(SIZEOF_WCHAR_T) check
I don't understand why SIZEOF_WCHAR_T could be unset, but the patch
version 6 only checks defined(SIZEOF_WCHAR_T) in unicodeobject.c, not
in _testcapimodule.c (#if SIZEOF_WCHAR_T == 4).
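A minimal sketch of the kind of guard discussed here, not the committed patch. It assumes SIZEOF_WCHAR_T would normally come from pyconfig.h; the names DEMO_WIDE_WCHAR and demo_utf16_units are hypothetical and exist only for illustration. It also shows the single #if/#else/#endif layout, with two complete versions of one function.

```c
#include <stddef.h>
#include <wchar.h>

/* SIZEOF_WCHAR_T is normally provided by pyconfig.h. This standalone
   sketch does not define it, so the #else branch is compiled unless
   the macro is supplied, e.g. with -DSIZEOF_WCHAR_T=4. */
#if defined(SIZEOF_WCHAR_T) && (SIZEOF_WCHAR_T > 2)
#define DEMO_WIDE_WCHAR 1   /* hypothetical name */
#else
#define DEMO_WIDE_WCHAR 0
#endif

#if DEMO_WIDE_WCHAR
/* Wide wchar_t: every element above U+FFFF needs two UTF-16 units. */
size_t demo_utf16_units(const wchar_t *w, size_t size)
{
    size_t units = size;
    for (size_t i = 0; i < size; i++) {
        if (w[i] > 0xFFFF)
            units++;
    }
    return units;
}
#else
/* 16-bit wchar_t, or size unknown: one unit per element. */
size_t demo_utf16_units(const wchar_t *w, size_t size)
{
    (void)w;
    return size;
}
#endif
```

Testing defined(SIZEOF_WCHAR_T) first keeps the intent explicit: an undefined macro silently evaluates to 0 inside #if, which hides a missing configuration value.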
Mark Dickinson dicki...@gmail.com added the comment:
Updated Victor's patch:
- applies cleanly against newly whitespace-normalized unicodeobject.c
- renamed USE_WCHAR_SURROGATE to CONVERT_WCHAR_TO_SURROGATES
- add defined(SIZEOF_WCHAR_T) check
I find the patched version of
Marc-Andre Lemburg m...@egenix.com added the comment:
On 2009-02-24 20:39, Mark Dickinson wrote:
Mark Dickinson dicki...@gmail.com added the comment:
Updated Victor's patch:
- applies cleanly against newly whitespace-normalized unicodeobject.c
- renamed USE_WCHAR_SURROGATE to
Mark Dickinson dicki...@gmail.com added the comment:
It would be better to have a single #ifdef #else #endif
Yes, of course it would. :)
Mark Dickinson dicki...@gmail.com added the comment:
New patch, with two separate versions of PyUnicode_FromWideChar.
Added file:
http://bugs.python.org/file13167/unicode_fromwidechar_surrogate-6.patch
Marc-Andre Lemburg m...@egenix.com added the comment:
On 2009-02-24 21:50, Mark Dickinson wrote:
Mark Dickinson dicki...@gmail.com added the comment:
New patch, with two separate versions of PyUnicode_FromWideChar.
Thanks, much better :-)
STINNER Victor victor.stin...@haypocalc.com added the comment:
For lemburg, updated patch:
- Move USE_WCHAR_SURROGATE define outside PyUnicode_FromWideChar()
(and indent the defines, sorry)
- Add #define SIZEOF_WCHAR_T 2 to PC/pyconfig.h
Added file:
Changes by STINNER Victor victor.stin...@haypocalc.com:
Removed file:
http://bugs.python.org/file12822/unicode_fromwidechar_surrogate-3.patch
Marc-Andre Lemburg m...@egenix.com added the comment:
On 2009-01-26 17:56, STINNER Victor wrote:
STINNER Victor victor.stin...@haypocalc.com added the comment:
@marketdickinson, @lemburg: ping! I updated the patch, does it look
better?
Yes, but there are a few things that still need
STINNER Victor victor.stin...@haypocalc.com added the comment:
@marketdickinson, @lemburg: ping! I updated the patch, does it look
better?
STINNER Victor victor.stin...@haypocalc.com added the comment:
Also note that on platforms with 16-bit wchar_t, the comparison
(0xffff < *w) will always be false, so an additional check for
(Py_UNICODE_SIZE > 2) is needed.
Yes, but the right test is (SIZEOF_WCHAR_T > 2). I wrote a new test:
Changes by STINNER Victor victor.stin...@haypocalc.com:
Removed file:
http://bugs.python.org/file12776/unicode_fromwidechar_surrogate-2.patch
Marc-Andre Lemburg m...@egenix.com added the comment:
On 2009-01-18 22:59, Mark Dickinson wrote:
Mark Dickinson dicki...@gmail.com added the comment:
Looks good to me.
I'm not in a position to test with 16-bit wchar_t, but I can't see why
anything would go wrong. I think we can take
Mark Dickinson dicki...@gmail.com added the comment:
Looks good to me.
I'm not in a position to test with 16-bit wchar_t, but I can't see why
anything would go wrong. I think we can take our chances: check this in
and watch the buildbots for signs of trouble.
Some minor whitespace issues in
Mark Dickinson dicki...@gmail.com added the comment:
Thanks for the patch, Victor!
Looks pretty good at first glance, except that it seems that the UTF-32 to
UTF-16 translation is skipped if HAVE_USABLE_WCHAR_T is defined. Is that
deliberate?
A test would be good, too.
STINNER Victor victor.stin...@haypocalc.com added the comment:
Looks pretty good at first glance, except that it seems that the UTF-32 to
UTF-16 translation is skipped if HAVE_USABLE_WCHAR_T is defined. Is that
deliberate?
#ifdef HAVE_USABLE_WCHAR_T
memcpy(unicode->str, w, size *
Mark Dickinson dicki...@gmail.com added the comment:
I understand this code as: sizeof(wchar_t) == sizeof(Py_UNICODE). If I
misunderstood the code, it's a heap overflow :-)
Yep, sorry. You're right.
A test would be good, too.
PyUnicode_FromWideChar() is not a public API. Should I write
Marc-Andre Lemburg m...@egenix.com added the comment:
On 2009-01-17 14:00, STINNER Victor wrote:
A test would be good, too.
PyUnicode_FromWideChar() is not a public API. Should I write a function in
_testcapi?
It is a public C API. Regardless of this aspect, we should always
add tests for
Marc-Andre Lemburg m...@egenix.com added the comment:
On 2009-01-17 14:00, STINNER Victor wrote:
STINNER Victor victor.stin...@haypocalc.com added the comment:
Looks pretty good at first glance, except that it seems that the UTF-32 to
UTF-16 translation is skipped if HAVE_USABLE_WCHAR_T is
STINNER Victor victor.stin...@haypocalc.com added the comment:
Updated patch including a test in _testcapi module: create two
PyUnicode objects from wide string (PyUnicode_FromWideChar) and
another from utf-8 (PyUnicode_FromString) and compare the value. Patch
is still for py3k branch and can
STINNER Victor victor.stin...@haypocalc.com added the comment:
I ran my test on py3k on Linux with 32-bit wchar_t:
- 16-bit Py_UNICODE: the test fails without the
PyUnicode_FromWideChar() patch
- 32-bit Py_UNICODE: the test passes without the patch, so the issue only
affects 16-bit Py_UNICODE
Can
Changes by STINNER Victor victor.stin...@haypocalc.com:
Removed file:
http://bugs.python.org/file12773/unicode_fromwidechar_surrogate.patch
STINNER Victor victor.stin...@haypocalc.com added the comment:
(with the full patch, all tests pass with 16 or 32 bits Py_UNICODE)
STINNER Victor victor.stin...@haypocalc.com added the comment:
Patch fixing PyUnicode_FromWideChar() for UCS-2 builds: create
surrogates for characters > U+FFFF, like PyUnicode_FromOrdinal() does.
--
keywords: +patch
Added file:
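The surrogate creation described in this comment can be sketched as below. This is an illustration of the standard UTF-32 to UTF-16 conversion, not the code from the attached patch; demo_utf32_to_utf16 is a hypothetical name, and the caller is assumed to supply a destination buffer of at least 2 * n units.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert n UTF-32 code points to UTF-16 code units.
   dst must have room for 2 * n units. Returns the number of units
   written. Input is assumed to be valid (no value above 0x10FFFF). */
size_t demo_utf32_to_utf16(const uint32_t *src, size_t n, uint16_t *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t ch = src[i];
        if (ch > 0xFFFF) {
            /* Outside the BMP: split into a high and a low surrogate. */
            ch -= 0x10000;
            dst[out++] = (uint16_t)(0xD800 | (ch >> 10));
            dst[out++] = (uint16_t)(0xDC00 | (ch & 0x3FF));
        } else {
            /* BMP character: fits in one 16-bit unit. */
            dst[out++] = (uint16_t)ch;
        }
    }
    return out;
}
```

For example, U+1D11E becomes the pair 0xD834 0xDD1E, while U+0041 is copied unchanged.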
STINNER Victor victor.stin...@haypocalc.com added the comment:
Note: I wrote my patch against py3k r68646.
Mark Dickinson [EMAIL PROTECTED] added the comment:
Just to be clear, the defect in PyUnicode_FromWideChar is present both in
Python 2.x and Python 3.x.
The problem with command-line arguments only occurs in Python 3.x, since
2.x doesn't use PyUnicode_FromWideChar in converting arguments.
I
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment:
This is due to the function downcasting the wchar_t values to
Py_UNICODE, which is a 2-byte value if you build Python as UCS2 version
on Unix.
Most Unixes ship with UCS4 builds, so you don't see the problem there.
Mac OS X ships with a
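The downcast described above can be illustrated with a small sketch. It assumes a 32-bit source value and a 16-bit destination type, standing in for wchar_t and a UCS-2 Py_UNICODE; demo_truncate_to_ucs2 is a hypothetical name, not a CPython function.

```c
#include <stdint.h>

/* Narrowing a 32-bit code point to 16 bits keeps only the low 16 bits,
   so any character above U+FFFF silently turns into a different one. */
uint16_t demo_truncate_to_ucs2(uint32_t code_point)
{
    return (uint16_t)code_point;
}
```

With this narrowing, U+1D11E comes out as U+D11E, an unrelated BMP character, while BMP characters such as U+010D are unaffected.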
Roumen Petrov [EMAIL PROTECTED] added the comment:
Marc-Andre explained it all. For the record, my version is from trunk and
Python is built with default options. Since the system Tcl limits UTF-8 to 3
bytes, Python is built for UCS-2.
In the report, the output from Python shows the character as 010d (UCS-2).
May
Changes by STINNER Victor [EMAIL PROTECTED]:
--
nosy: +haypo
New submission from Mark Dickinson [EMAIL PROTECTED]:
On systems (Linux, OS X) where sizeof(wchar_t) is 4 and wchar_t arrays are
usually encoded as UTF-32, it looks as though PyUnicode_FromWideChar
simply truncates the 32-bit characters to 16-bits, thus giving incorrect
results for characters
Changes by Martin v. Löwis [EMAIL PROTECTED]:
--
versions: +Python 2.6, Python 2.7, Python 3.0
Mark Dickinson [EMAIL PROTECTED] added the comment:
it is fine on linux
Interesting. Which version of Python is that? And is Py_UNICODE 2 bytes
or 4 bytes for that build of Python?