Re: [Python-Dev] PEP 393 review

Martin v. Löwis Mon, 29 Aug 2011 12:22:55 -0700

On 29.08.2011 11:03, Dirkjan Ochtman wrote:
> On Sun, Aug 28, 2011 at 21:47, "Martin v. Löwis" <[email protected]> wrote:
>>  result strings. In PEP 393, a buffer must be scanned for the
>>  highest code point, which means that each byte must be inspected
>>  twice (a second time when the copying occurs).
> 
> This may be a silly question: are there things in place to optimize
> this for the case where two strings are combined? E.g. highest
> character in combined string is max(highest character in either of the
> strings).


Yes, PyUnicode_Concat does exactly that:

    maxchar = PyUnicode_MAX_CHAR_VALUE(u);
    if (PyUnicode_MAX_CHAR_VALUE(v) > maxchar)
        maxchar = PyUnicode_MAX_CHAR_VALUE(v);

    /* Concat the two Unicode strings */
    w = (PyUnicodeObject *) PyUnicode_New(
                            PyUnicode_GET_LENGTH(u) + PyUnicode_GET_LENGTH(v),
                            maxchar);
    if (w == NULL)
        goto onError;
    PyUnicode_CopyCharacters(w, 0, u, 0, PyUnicode_GET_LENGTH(u));
    PyUnicode_CopyCharacters(w, PyUnicode_GET_LENGTH(u), v, 0,
                             PyUnicode_GET_LENGTH(v));
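
To illustrate why no rescan is needed here, the following is a minimal standalone sketch, not CPython code. The helpers scan_max_char, concat_max_char and char_width are hypothetical names made up for this example; the width thresholds follow the 1/2/4-byte representations described in PEP 393. It contrasts the O(n) scan needed for a raw buffer with the O(1) combination of two maxima that are already known.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CPython function): scan a buffer of code
   points for the highest one. This is the O(n) pass that is needed when
   a string is built from raw data whose maximum is not yet known. */
static uint32_t scan_max_char(const uint32_t *buf, size_t len)
{
    uint32_t maxchar = 0;
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] > maxchar)
            maxchar = buf[i];
    }
    return maxchar;
}

/* Hypothetical helper: when both operands already carry their maximum,
   the maximum of the concatenation is the larger of the two. No
   character of either operand is inspected. */
static uint32_t concat_max_char(uint32_t max_u, uint32_t max_v)
{
    return max_u > max_v ? max_u : max_v;
}

/* Hypothetical helper: bytes per character for a given maximum,
   following the 1-, 2- and 4-byte forms described in PEP 393. */
static unsigned char_width(uint32_t maxchar)
{
    if (maxchar < 0x100)
        return 1;
    if (maxchar < 0x10000)
        return 2;
    return 4;
}
```

For example, joining a string whose highest code point is 0xE9 with one whose highest is 0x4E2D gives concat_max_char(0xE9, 0x4E2D) == 0x4E2D, so the result is allocated with 2 bytes per character and both operands are copied once.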

> Also, this PEP makes me wonder if there should be a way to distinguish
> between language PEPs and (CPython) implementation PEPs, by adding a
> tag or using the PEP number ranges somehow.

Well, no. This would equally apply to every single patch, and is just
not feasible. Instead, alternative implementations typically target a
CPython version, and then find out what features they need to implement
to claim conformance.

Regards,
Martin