Hi again,
Stefan Behnel wrote:
> ------------------------
> @@ -4309,7 +4336,11 @@ static int __Pyx_InitStrings(__Pyx_Strin
> if (t->is_unicode) {
> *t->p = PyUnicode_DecodeUTF8(t->s, t->n - 1, NULL);
> } else {
> + #if PY_MAJOR_VERSION < 3
> *t->p = PyString_FromStringAndSize(t->s, t->n - 1);
> + #else
> + *t->p = PyUnicode_FromStringAndSize(t->s, t->n - 1);
> + #endif
> }
> if (!*t->p)
> return -1;
> ------------------------
I think this should read:

    if (t->is_unicode) {
    #if PY_MAJOR_VERSION < 3
        *t->p = PyUnicode_DecodeUTF8(t->s, t->n - 1, NULL);
    #else
        *t->p = PyUnicode_FromStringAndSize(t->s, t->n - 1);
    #endif
    } else {
        *t->p = PyString_FromStringAndSize(t->s, t->n - 1);
    }

Also this:
------------------------
@@ -4289,7 +4312,11 @@ static int __Pyx_InternStrings(__Pyx_Int
""","""
static int __Pyx_InternStrings(__Pyx_InternTabEntry *t) {
while (t->p) {
+ #if PY_MAJOR_VERSION < 3
*t->p = PyString_InternFromString(t->s);
+ #else
+ *t->p = PyUnicode_InternFromString(t->s);
+ #endif
if (!*t->p)
return -1;
++t;
------------------------
The issue here is that we currently do not intern unicode strings at all, so
this function must continue to return byte strings.
The actual problem should be fixed in the compiler, which should know how to
distinguish byte strings from unicode strings in its interned string
dictionary, and generate code similar to that for the normal string table (i.e.
with a "unicode" flag). See the add_py_string() method in Symtab.py for a
starting point.
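To illustrate the idea, here is a rough sketch of an interned string dictionary that keeps byte strings and unicode strings apart by keying on the flag as well as the text. Everything in it is an assumption made for illustration: the InternTable class, its method names, the "__pyx_b_"/"__pyx_u_" prefixes and the row layout are made up and do not reflect the real add_py_string() code in Symtab.py.

```python
class InternTable:
    """Hypothetical stand-in for the compiler's interned string dictionary."""

    def __init__(self):
        # Maps (text, is_unicode) -> C name of the generated string constant.
        # Keying on the flag keeps b"name" and u"name" as separate entries.
        self._entries = {}

    def add(self, text, is_unicode):
        """Return the C name for this string, creating an entry if needed."""
        key = (text, bool(is_unicode))
        if key not in self._entries:
            # Made-up naming scheme, not the one Cython actually uses.
            prefix = "__pyx_u_" if is_unicode else "__pyx_b_"
            self._entries[key] = "%s%d" % (prefix, len(self._entries))
        return self._entries[key]

    def c_table_rows(self):
        """Render one C initializer row per entry, carrying the unicode flag.

        C string escaping is deliberately left out of this sketch.
        """
        rows = []
        for (text, is_unicode), cname in self._entries.items():
            rows.append('{&%s, "%s", %d}' % (cname, text, int(is_unicode)))
        return rows
```

With a flag in each row, the generated intern function could pick the byte string or unicode constructor per entry, the same way the normal string table does with t->is_unicode.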
I noticed that cython-devel-py3 is up, but I'll wait for Lisandro to commit
his patch before I start working on it.
Stefan
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev