On 11/7/2013 4:38 PM, Victor Stinner wrote:
> 2013/11/7 Benjamin Peterson <[email protected]>:
>> 2013/11/7 victor.stinner <[email protected]>:
>>> http://hg.python.org/cpython/rev/99afa4c74436
>>> changeset: 86995:99afa4c74436
>>> user: Victor Stinner <[email protected]>
>>> date: Thu Nov 07 13:33:36 2013 +0100
>>> summary:
>>> Fix _Py_normalize_encoding(): ensure that buffer is big enough to store
>>> "utf-8" if the input string is NULL
>>>
>>> files:
>>> Objects/unicodeobject.c | 2 ++
>>> 1 files changed, 2 insertions(+), 0 deletions(-)
>>>
>>>
>>> diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
>>> --- a/Objects/unicodeobject.c
>>> +++ b/Objects/unicodeobject.c
>>> @@ -2983,6 +2983,8 @@
>>> char *l_end;
>>>
>>> if (encoding == NULL) {
>>> + if (lower_len < 6)
>>
>> How about doing something like strlen("utf-8") rather than hardcoding that?
>
> Full code:
>
>     if (encoding == NULL) {
>         if (lower_len < 6)
>             return 0;
>         strcpy(lower, "utf-8");
>         return 1;
>     }
>
> In my opinion, it is easy to guess that 6 is len("utf-8") + 1 byte for NUL.
>
> Calling strlen() at runtime may slow down a function in the fast path
> of PyUnicode_Decode() and PyUnicode_AsEncodedString(), which are
> important functions. I know that some compilers can evaluate strlen()
> at compile time, but I don't see the need to replace 6 with
> strlen("utf-8")+1.

Then how about at least a comment about how 6 is derived?

    if (lower_len < 6) /* 6 == strlen("utf-8") + 1 */
        return 0;

Eric.
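
For reference, there is a third option between the hardcoded 6 and a runtime strlen(): sizeof applied to a string literal is a compile-time constant and already counts the terminating NUL, so it yields 6 here at no runtime cost. A minimal sketch, assuming a C11 compiler; DEFAULT_ENCODING and copy_default_encoding() are hypothetical names used only for illustration, not the actual CPython code:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <string.h>   /* strcpy, strcmp, strlen */

/* Hypothetical macro: the default encoding name as a string literal. */
#define DEFAULT_ENCODING "utf-8"

/* sizeof on a string literal includes the terminating NUL byte, so this
   is strlen("utf-8") + 1, checked at compile time. */
static_assert(sizeof(DEFAULT_ENCODING) == 6,
              "5 characters plus the NUL byte");

/* Hypothetical stand-in for the NULL-encoding branch discussed above.
   Returns 1 on success, 0 if the buffer is too small. */
static int
copy_default_encoding(char *lower, size_t lower_len)
{
    if (lower_len < sizeof(DEFAULT_ENCODING))
        return 0;
    strcpy(lower, DEFAULT_ENCODING);
    return 1;
}
```

With this form the size check and the copied literal cannot drift apart if the default name ever changes, which may or may not matter for a constant as stable as "utf-8".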