On 5/11/08, Robert Bradshaw <[EMAIL PROTECTED]> wrote:
> It also offers the advantage that the lookup strings don't need to be
> re-allocated each time they're needed.

That's very true for Cython, as it has the table. Interestingly, the Python
sources use PyString_InternFromString() in many places, but this function
does actually create a new temporary string first; in the end you still get
the interned string.

> > On 5/10/08, Stefan Behnel <[EMAIL PROTECTED]> wrote:
> >> Hi,
> >>
> >> I'm wondering how to continue the support for this feature given the
> >> fact that identifiers are Unicode strings in Py3. We currently only
> >> intern byte strings that look like Python identifiers, so in Py3, they
> >> simply no longer look like identifiers, as they are not Unicode strings.
> >>
> >> I can see four ways how to deal with this:
> >>
> >> 1) drop string interning completely
> >>
> >> 2) disable string interning in Py3 and use normally created byte
> >> strings instead
> >>
> >> 3) keep separate sets of identifier-like byte strings and unicode
> >> strings in the compiler and write them into the C file. Then, depending
> >> on the Python version, either intern the byte strings or the unicode
> >> strings, and create the other set as un-interned strings.
> >>
> >> 4) keep the information if a string should be interned for all strings
> >> we deal with (bytes and unicode), remove the intern tab and merge it
> >> with the general string tab by adding an additional field "intern".
> >> Then __Pyx_InitStrings() would create the strings differently depending
> >> on the compile time Python version, i.e., it would intern Unicode
> >> identifiers in Py3 and byte string identifiers in Py2, and create
> >> everything else as normal strings.
> >>
> >> Personally, I favour 4) - although I could live with 1) - but since I'm
> >> not quite sure what the original intention of string interning was
> >> (saving memory?), I'd like to hear other opinions first.
> >>
> >> Stefan

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
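
To make option 4) from the quoted message a bit more concrete, here is a rough Python-level sketch of a single merged string table with an "intern" field. It is not Cython's actual __Pyx_InitStrings(); the names STRING_TABLE and init_strings are made up for illustration, and the sketch assumes it runs under Python 3, where only unicode strings can be interned. Under Py2 the choice would presumably be mirrored, interning the byte-string identifiers instead. The last lines also illustrate the point about temporary strings: a string built at run time is a separate object, but interning it hands back the canonical one already held in the table.

```python
import sys

# Hypothetical merged string table: (text, is_unicode, intern).
# The "intern" field replaces the separate intern table of option 4.
STRING_TABLE = [
    ("append", True, True),          # identifier-like unicode string
    ("append", False, True),         # identifier-like byte string
    ("hello world", True, False),    # ordinary unicode string
    ("hello world", False, False),   # ordinary byte string
]


def init_strings(table):
    """Create all string objects once, interning only unicode identifiers.

    Assumes Python 3: byte strings are created as plain bytes objects
    even when their "intern" field is set.
    """
    result = []
    for text, is_unicode, intern in table:
        if is_unicode:
            obj = sys.intern(text) if intern else text
        else:
            obj = text.encode("ascii")
        result.append(obj)
    return result


strings = init_strings(STRING_TABLE)

# Build "append" at run time, so it starts out as a temporary object.
temp = "".join(["app", "end"])

# Interning the temporary returns the canonical object from the table.
canonical = sys.intern(temp)
print(canonical is strings[0])
```

Because the table keeps a reference to every object it creates, a lookup by identity against the table entry stays valid for the lifetime of the module, and no string needs to be allocated again at the point of use.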
