2011/11/7 Stefan Behnel <[email protected]>:
> Vitja Makarov, 07.11.2011 19:28:
>>
>> 2011/11/6 Stefan Behnel:
>>>
>>> Vitja Makarov, 06.11.2011 18:10:
>>>>
>>>> When a file encoding is specified, cython generates two PyObject
>>>> entries for string consts: one for the variable name and one for the
>>>> string constant.
>>>
>>> That's because the content may actually become different after decoding,
>>> even if the encoded byte sequence is identical. Note that decoding is
>>> only done in Py3. In Py2, the byte sequence is used, so both values are
>>> identical.
>>
>> If they are identical after decoding, isn't it better to have only
>> one of them?
>
> Well, yes. That's not trivial, though, because the decision is taken at C
> compile time. And the benefit tends to be negligible, because this case is
> really rare and the affected strings tend to be quite short.
>
>
>>>> Here is a minimal example:
>>>> $ cat cplus.pyx
>>>> # -*- coding: koi8-r -*-
>>>> wtf = 'wtf'
>>>>
>>>> It generates the following code:
>>>>
>>>> /* Implementation of 'cplus' */
>>>> static char __pyx_k__wtf[] = "wtf";
>>>> static char __pyx_k____main__[] = "__main__";
>>>> static char __pyx_k____test__[] = "__test__";
>>>> static PyObject *__pyx_n_s____main__;
>>>> static PyObject *__pyx_n_s____test__;
>>>> static PyObject *__pyx_n_s__wtf;
>>>> static PyObject *__pyx_n_s__wtf;
>>>>
>>>> ...
>>>>
>>>> static __Pyx_StringTabEntry __pyx_string_tab[] = {
>>>>   {&__pyx_n_s____main__, __pyx_k____main__, sizeof(__pyx_k____main__), 0, 0, 1, 1},
>>>>   {&__pyx_n_s____test__, __pyx_k____test__, sizeof(__pyx_k____test__), 0, 0, 1, 1},
>>>>   {&__pyx_n_s__wtf, __pyx_k__wtf, sizeof(__pyx_k__wtf), "koi8-r", 0, 1, 1},
>>>>   {&__pyx_n_s__wtf, __pyx_k__wtf, sizeof(__pyx_k__wtf), 0, 0, 1, 1},
>>>>   {0, 0, 0, 0, 0, 0, 0}
>>>> };
>>>
>>> Both Python object variables should have different cnames.
>>
>> What about adding an encoding suffix?
>
> Yes, I think that would fix it, although it could be a bit misleading when
> reading the C code with a Py3 context in mind. But using a counter doesn't
> make it very readable, either.
>
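
To make the quoted point about decoding concrete, here is a small Python 3 sketch showing that one byte sequence can decode to the same or to different text depending on the encoding. The helper name, the sample bytes and the use of latin-1 as the second encoding are assumptions chosen only for illustration; they are not taken from the Cython sources.

```python
def decodes_identically(raw, enc_a, enc_b):
    """Return True if the bytes decode to the same text under both encodings."""
    return raw.decode(enc_a) == raw.decode(enc_b)

# Pure ASCII bytes: koi8-r and latin-1 both map them to the same characters.
raw_ascii = b"wtf"

# Three non-ASCII bytes (illustrative only): Cyrillic letters in koi8-r,
# but unrelated Latin-1 characters when decoded as latin-1.
raw_high = b"\xd7\xd4\xc6"

same_ascii = decodes_identically(raw_ascii, "koi8-r", "latin-1")
same_high = decodes_identically(raw_high, "koi8-r", "latin-1")
```

For the 'wtf' case both decodings agree, which is why the two entries end up identical; that is not guaranteed for arbitrary byte sequences, so the two objects cannot be merged in general.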
Ok. I've fixed it here
https://github.com/vitek/cython/compare/file_encoding_T770

Now it produces the following identifiers:

static PyObject *__pyx_n_s__wtf;
static PyObject *__pyx_n_s_koi8r__wtf;

--
vitja.
_______________________________________________
cython-devel mailing list
[email protected]
http://mail.python.org/mailman/listinfo/cython-devel
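
For readers who want to see how identifiers of the shape shown above could be derived, here is a rough Python sketch. The function `string_cname` and its parameters are hypothetical and only reproduce the two names from this message; the actual change in the linked branch may be implemented differently.

```python
import re

def string_cname(name, encoding=None, prefix="__pyx_n_s_"):
    """Hypothetical helper: build a C variable name for an interned string.

    Without an encoding the result looks like "__pyx_n_s__wtf"; with one,
    a sanitized encoding tag is placed in front of the name.
    """
    if encoding:
        # Drop characters that are not valid in a C identifier,
        # e.g. "koi8-r" becomes "koi8r".
        tag = re.sub(r"[^A-Za-z0-9]", "", encoding.lower())
        return "%s%s__%s" % (prefix, tag, name)
    return "%s_%s" % (prefix, name)

plain = string_cname("wtf")
encoded = string_cname("wtf", "koi8-r")
```

With this scheme the two string table entries from the minimal example no longer share one cname, so the generated C declares two distinct variables.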
