[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-01-07 Thread Ezio Melotti
Changes by Ezio Melotti:
title: UnicodeEncodeError - I can't even see license -> Use Py_UCS4 instead of Py_UNICODE in unicodectype.c
versions: +Python 3.1, Python 3.2 -Python 3.0

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-01-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: I don't see the point in changing the various conversion APIs in the unicode database to return Py_UCS4 when there are no conversions that map code points between BMP and non-BMP. In order to solve the problem in question (unicode_repr() failing), we shou

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-01-08 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment:
> I don't see the point in changing the various conversion APIs in the
> unicode database to return Py_UCS4 when there are no conversions that
> map code points between BMP and non-BMP.
For consistency: if Py_UNICODE_ISPRINTABLE is changed to take Py_UCS4

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-01-10 Thread Ezio Melotti
Changes by Ezio Melotti:
superseder: -> UCS4 build incorrectly translates cases for non-BMP code points

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-01-10 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Amaury Forgeot d'Arc wrote:
> Amaury Forgeot d'Arc added the comment:
>
>> I don't see the point in changing the various conversion APIs in the
>> unicode database to return Py_UCS4 when there are no conversions that
>> map code points between BMP and n

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-07 Thread Amaury Forgeot d'Arc
Changes by Amaury Forgeot d'Arc:
assignee: -> amaury.forgeotdarc

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-07 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: Now I wonder whether it's reasonable to consider this character U+10000 (LINEAR B SYLLABLE B008 A) as printable with repr(). Yes, its category is "Lo", but is there a font which can display it?

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-07 Thread Ezio Melotti
Ezio Melotti added the comment: Given that '\U00010000'.isprintable() returns True, I would say yes. If someone needs to print this char and has an appropriate font to do it, I don't see why it shouldn't work.
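
The claim can be checked from a current Python 3 interpreter. This is a minimal sketch, assuming the character under discussion is the one named LINEAR B SYLLABLE B008 A earlier in the thread and that the interpreter's Unicode database includes it; the variable names are illustrative only.

```python
import unicodedata

# Look the character up by the name quoted in the thread instead of
# hard-coding an escape sequence.
ch = unicodedata.lookup("LINEAR B SYLLABLE B008 A")

code_point = ord(ch)                 # lies outside the BMP (above 0xFFFF)
category = unicodedata.category(ch)  # general category from the database
printable = ch.isprintable()         # the property under discussion

print(f"U+{code_point:04X} category={category} printable={printable}")
```

Whether a terminal font can actually draw the glyph is a separate question that this check cannot answer.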

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Ezio Melotti wrote:
> Ezio Melotti added the comment:
>
> Given that '\U00010000'.isprintable() returns True, I would say yes. If
> someone needs to print this char and has an appropriate font to do it, I
> don't see why it shouldn't work.
Note that

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-08 Thread Ezio Melotti
Ezio Melotti added the comment: [This should probably be discussed on python-dev or in another issue, so feel free to move the conversation there.] The current implementation considers printable """all the characters except those characters defined in the Unicode character database as followi
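
As a rough illustration of a category-based definition: the Python documentation for str.isprintable() describes nonprintable characters as those whose general category is "Other" or "Separator", with the ASCII space as the one exception. The helper below is a hypothetical Python rendering of that documented rule, not the C implementation touched by this patch.

```python
import unicodedata


def is_printable_by_category(ch):
    """Hypothetical helper: apply the documented category rule to one character."""
    if ch == " ":
        return True  # the ASCII space is the documented exception
    # Categories starting with "C" are "Other" (control, format, surrogate,
    # private use, unassigned); those starting with "Z" are "Separator".
    return unicodedata.category(ch)[0] not in ("C", "Z")


samples = ("a", " ", "\n", "\u00a0", "\U00010000")
for sample in samples:
    print(repr(sample), is_printable_by_category(sample), sample.isprintable())
```

Under this rule the answer depends only on the Unicode database, not on whether any installed font can render the character.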

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-08 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: I suggest to go ahead and apply this patch, at least it correctly selects "printable" characters, whatever this means. I filed issue9198 to decide whether chr(0x10000) should be printable.

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Ezio Melotti wrote:
> Ezio Melotti added the comment:
>
> [This should probably be discussed on python-dev or in another issue, so feel
> free to move the conversation there.]
>
> The current implementation considers printable """all the characters ex

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Amaury Forgeot d'Arc wrote:
> Amaury Forgeot d'Arc added the comment:
>
> I suggest to go ahead and apply this patch, at least it correctly selects
> "printable" characters, whatever this means.
> I filed issue9198 to decide whether chr(0x10000) should

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-08 Thread Ezio Melotti
Ezio Melotti added the comment:
Amaury, before applying the patch consider replacing the tab characters before the comments with spaces. The use of tabs is discouraged.
Marc-Andre Lemburg wrote:
> I was never a fan of the Unicode repr() change to begin with. The
> repr() of an object should w

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Ezio Melotti wrote:
> Marc-Andre Lemburg wrote:
>> I was never a fan of the Unicode repr() change to begin with. The
>> repr() of an object should work in almost all cases.
>
> I still think that #5110 should be fixed (there's also a patch to fix the
> iss

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-08 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment:
> consider replacing the tab characters before the comments with spaces
It's actually already the case in my working copy.

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-08 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: A new patch, generated on top of r82662
Added file: http://bugs.python.org/file17909/unicodectype_ucs4_4.patch

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Amaury Forgeot d'Arc wrote:
> Amaury Forgeot d'Arc added the comment:
>
> A new patch, generated on top of r82662
Could you explain what this bit is about?
@@ -349,7 +313,7 @@ configure Python using --with-wctype-functions. This reduces the

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-08 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: A new patch that doesn't remove an important check, avoids a crash when the C macro is called with a huge number. thanks Ezio.
Added file: http://bugs.python.org/file17911/unicodectype_ucs4_5.patch

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-08 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment:
> Could you explain what this bit is about?
> -#if defined(HAVE_USABLE_WCHAR_T) && defined(WANT_WCTYPE_FUNCTIONS)
> +#if defined(Py_UNICODE_WIDE) && defined(WANT_WCTYPE_FUNCTIONS)
On Windows at least, HAVE_USABLE_WCHAR_T is True, this means that Py_Unico

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-09 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Amaury Forgeot d'Arc wrote:
> Amaury Forgeot d'Arc added the comment:
>
>> Could you explain what this bit is about?
>> -#if defined(HAVE_USABLE_WCHAR_T) && defined(WANT_WCTYPE_FUNCTIONS)
>> +#if defined(Py_UNICODE_WIDE) && defined(WANT_WCTYPE_FUNCTION

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-09 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Amaury Forgeot d'Arc wrote:
> Amaury Forgeot d'Arc added the comment:
>
> A new patch that doesn't remove an important check, avoids a crash when the C
> macro is called with a huge number. thanks Ezio.
Could you please be more specific on what you ch

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-09 Thread Ezio Melotti
Ezio Melotti added the comment: The 'if' in 'gettyperecord'. (I would also rewrite that as "if (code > 0x10FFFF)", it looks more readable to me.) The patch seems OK to me. In the NEWS message 'python' should be capitalized and I would also mention .isprintable() and possibly other functions t
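
The idea behind such a range check can be sketched in Python. This is only an illustration under stated assumptions: `category_or_default` is a hypothetical helper, "Cn" (unassigned) stands in for whatever default record the C code returns, and 0x10FFFF is the highest code point Unicode defines. It is not the code of `gettyperecord` itself.

```python
import unicodedata

MAX_CODE_POINT = 0x10FFFF  # highest code point defined by Unicode


def category_or_default(code):
    """Hypothetical helper: look up a category, guarding against huge inputs."""
    if not 0 <= code <= MAX_CODE_POINT:
        # Out of range: return a default instead of indexing past the tables.
        return "Cn"
    return unicodedata.category(chr(code))


print(category_or_default(ord("A")))
print(category_or_default(0x110000))
```

Without the guard, `chr(0x110000)` raises ValueError in Python; the thread indicates that the unguarded C macro could crash when called with a huge number.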

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-09 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Ezio Melotti wrote:
> Ezio Melotti added the comment:
>
> The 'if' in 'gettyperecord'. (I would also rewrite that as "if (code >
> 0x10FFFF)", it looks more readable to me.)
Ah, good catch!
> The patch seems OK to me. In the NEWS message 'python' sh

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-09 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: str.isprintable() & co. are not changed by this patch, because they enumerate Py_UNICODE units and do not join surrogates. See issue9200
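
To illustrate why a per-unit check can give a different answer: on a narrow (UTF-16) build a non-BMP character is stored as two surrogate code units, and a lone surrogate is not printable. The sketch below emulates that on a current interpreter; `split_surrogates` and `join_surrogates` are hypothetical helpers, not functions from the patch.

```python
def split_surrogates(code_point):
    """Split a non-BMP code point into a UTF-16 (high, low) surrogate pair."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def join_surrogates(high, low):
    """Combine a (high, low) surrogate pair back into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = split_surrogates(0x10000)

# Checking each code unit on its own: both are surrogates (category "Cs").
unit_results = [chr(high).isprintable(), chr(low).isprintable()]

# Checking the joined code point: this is the character's real property.
joined_result = chr(join_surrogates(high, low)).isprintable()

print(unit_results, joined_result)
```

A method that walks code units would need a joining step like this before consulting the database.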

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-07-09 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment:
In this 6th patch, the wctype part was changed as suggested. There is one more condition, Py_UNICODE_WIDE:
-#if defined(HAVE_USABLE_WCHAR_T) && defined(WANT_WCTYPE_FUNCTIONS)
+#if defined(WANT_WCTYPE_FUNCTIONS) && defined(HAVE_USABLE_WCHAR_T) && defined(

[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

2010-08-19 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: Committed with r84177.
resolution: -> fixed
stage: patch review -> committed/rejected
status: open -> closed