Ezio Melotti <[email protected]> added the comment:
As I said in msg142175, I think the Py_UNICODE_IS{HIGH|LOW|}SURROGATE and
Py_UNICODE_JOIN_SURROGATES macros can be committed without a leading _ in 3.3
and with a leading _ in 2.7/3.2. They should go in unicodeobject.h and be
public in 3.3+.
Regarding the name, it would be fine with me to use
Py_UNICODE_IS_HIGH_SURROGATE. Other IS* macros don't separate words with
underscores, but JOIN_SURROGATES and other proposed macros (e.g.
PUT_NEXT/WRITE_NEXT) do. Also, these macros are not related to any existing
API such as isalpha. I think HIGH/LOW are fine; we can mention lead/trail in
the doc.
Regarding the implementation, we could use Victor's version if it's faster and
has no other side effects.
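
For reference, a minimal sketch of what such macros could look like, based
only on the UTF-16 surrogate ranges. The SKETCH_* names and the
sketch_decode_pair helper are hypothetical, and this is not necessarily
Victor's implementation or what would end up in unicodeobject.h:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only.
   High (lead) surrogates: U+D800..U+DBFF.
   Low (trail) surrogates: U+DC00..U+DFFF. */
#define SKETCH_IS_SURROGATE(ch)      (0xD800 <= (ch) && (ch) <= 0xDFFF)
#define SKETCH_IS_HIGH_SURROGATE(ch) (0xD800 <= (ch) && (ch) <= 0xDBFF)
#define SKETCH_IS_LOW_SURROGATE(ch)  (0xDC00 <= (ch) && (ch) <= 0xDFFF)

/* Combine a high/low pair into a code point in U+10000..U+10FFFF.
   Each argument is evaluated exactly once. */
#define SKETCH_JOIN_SURROGATES(high, low) \
    (((((uint32_t)(high) & 0x03FF) << 10) | \
      ((uint32_t)(low) & 0x03FF)) + 0x10000)

/* Checked wrapper: asserts that the pair is well formed before joining. */
static inline uint32_t
sketch_decode_pair(uint16_t high, uint16_t low)
{
    assert(SKETCH_IS_HIGH_SURROGATE(high));
    assert(SKETCH_IS_LOW_SURROGATE(low));
    return SKETCH_JOIN_SURROGATES(high, low);
}
```

The range comparisons evaluate their argument twice, so an argument with side
effects (e.g. *p++) would misbehave; that is the kind of side effect worth
checking in any macro implementation.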
Regarding the other macros:
* _Py_UNICODE_NEXT and _Py_UNICODE_PUT_NEXT are useful, so once we have agreed
on the names they can go in. They can be private in all three branches and
made public in 3.4 if they work well;
* IS_NONBMP doesn't simplify the code much, but it makes it more readable. ICU
has U_IS_BMP, but in most cases we want to check for non-BMP, so if we
add this macro it might be ok to have it check for non-BMP;
* I'm not sure HIGH_SURROGATE/LOW_SURROGATE are useful with _Py_UNICODE_NEXT.
If they are, they should get better names, because the current ones are not
clear about what they do.
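
To make the intent of the NEXT/PUT_NEXT and non-BMP helpers concrete, here is
a rough sketch written as functions over a UTF-16 buffer. The names
sketch_next and sketch_put_next, their signatures, and the choice to pass
unpaired surrogates through unchanged are all assumptions made for
illustration, not the proposed macros:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read one code point from [*p, end) and advance *p.
   A high surrogate followed by a low surrogate is joined into one non-BMP
   code point; an unpaired surrogate is returned as-is. */
static uint32_t
sketch_next(const uint16_t **p, const uint16_t *end)
{
    uint32_t ch;

    assert(*p < end);
    ch = *(*p)++;
    if (0xD800 <= ch && ch <= 0xDBFF          /* high (lead) surrogate */
        && *p < end
        && 0xDC00 <= **p && **p <= 0xDFFF) {  /* low (trail) surrogate */
        ch = (((ch & 0x03FF) << 10) | (uint32_t)(**p & 0x03FF)) + 0x10000;
        (*p)++;
    }
    return ch;
}

/* Hypothetical helper: write one code point at *p and advance *p.
   The caller must guarantee room for two code units. */
static void
sketch_put_next(uint16_t **p, uint32_t ch)
{
    assert(ch <= 0x10FFFF);
    if (ch >= 0x10000) {                      /* the "is non-BMP" check */
        ch -= 0x10000;
        *(*p)++ = (uint16_t)(0xD800 | (ch >> 10));     /* high surrogate */
        *(*p)++ = (uint16_t)(0xDC00 | (ch & 0x03FF));  /* low surrogate */
    }
    else {
        *(*p)++ = (uint16_t)ch;
    }
}
```

In this sketch the two expressions that build the surrogates in
sketch_put_next are what separate HIGH_SURROGATE/LOW_SURROGATE helpers would
presumably compute.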
Unless someone disagrees, I'll prepare a patch that adds
Py_UNICODE_IS_{HIGH_|LOW_|}SURROGATE and Py_UNICODE_JOIN_SURROGATES to
unicodeobject.h with Victor's implementation, uses them where necessary, and
commit it (after a review).
We can think about the rest later.
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue10542>
_______________________________________