STINNER Victor <vstin...@python.org> added the comment:

> Since the behavior was changed, I think we need a versionchanged directive.

Done.


> But I am not sure that it could be triggered in real code without using some 
> broken extensions.

_PyUnicode_Ready() can fail in two cases: the Py_UNICODE* string contains a 
character outside [U+0000; U+10ffff] character range, or a memory allocation 
failed. Both cases look very unlikely, knowning that _PyUnicode_Ready() is only 
called if PyUnicode_IsIdentifier() is called on a string using the Python 
legacy C API (Py_UNICODE*).


> It may be safer to return 0 instead of returning -1 or crashing if 
> PyUnicode_READY() fails.

Functions which don't use PyObject* as return type have the tradition of 
returning -1. A few examples:

* PyUnicode_GetLength()
* PyUnicode_CopyCharacters()
* PyUnicode_ReadChar() (return "(Py_UCS4)-1" on error)
* PyLong_AsUnsignedLong() (return "(unsigned long)-1" on error)
* etc.

I don't see why PyUnicode_IsIdentifier() deserves to behave differently.


> If return -1 and set an exception, it can lead to an unexpected behavior if 
> the calling code expects only 0/1, and maybe even to crash in debug build due 
> to set an exception and returning value.

I updated my PR to document the behavior change in the "Changes in the C API" 
section of What's New in Python 3.9 documentation.

Previously, the code crashed Python (Py_FatalError). So I'm confident that this 
case never occurred ever in the wild. We got zero bug report about that.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39500>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to