https://github.com/python/cpython/commit/e792f4bc2e712bb6e2143599d2b88dd339de83e6
commit: e792f4bc2e712bb6e2143599d2b88dd339de83e6
branch: main
author: Peter Bierma <[email protected]>
committer: vstinner <[email protected]>
date: 2025-01-20T16:54:29+01:00
summary:
Docs C API: Clarify what happens when null bytes are passed to `PyUnicode_AsUTF8` (#127458)

Co-authored-by: Stan U. <[email protected]>
Co-authored-by: Tomas R. <[email protected]>
Co-authored-by: Victor Stinner <[email protected]>

files:
M Doc/c-api/unicode.rst

diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index f19b86a8dbfb66..94110d48ed7d85 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1054,6 +1054,15 @@ These are the UTF-8 codec APIs:
 
    As :c:func:`PyUnicode_AsUTF8AndSize`, but does not store the size.
 
+   .. warning::
+
+      This function does not have any special behavior for
+      `null characters <https://en.wikipedia.org/wiki/Null_character>`_ embedded within
+      *unicode*. As a result, strings containing null characters will remain in the returned
+      string, which some C functions might interpret as the end of the string, leading to
+      truncation. If truncation is an issue, it is recommended to use :c:func:`PyUnicode_AsUTF8AndSize`
+      instead.
+
    .. versionadded:: 3.3
 
    .. versionchanged:: 3.7

_______________________________________________
Python-checkins mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-checkins.python.org/
Member address: [email protected]
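
The truncation described in the new warning can be reproduced without writing an extension module. What follows is only a rough sketch, not part of the commit: it assumes CPython and uses ctypes.pythonapi as a stand-in for a C caller. The variable names (text, truncated, full) are illustrative. Declaring the result of PyUnicode_AsUTF8 as c_char_p makes ctypes read up to the first null byte, the way strlen-style C functions would, while PyUnicode_AsUTF8AndSize also reports the real length so the whole buffer can be copied.

```python
import ctypes

api = ctypes.pythonapi  # CPython only; calls are made with the GIL held

# const char *PyUnicode_AsUTF8(PyObject *unicode)
api.PyUnicode_AsUTF8.argtypes = [ctypes.py_object]
# c_char_p results are converted by scanning for the first NUL byte,
# which mimics a C function that treats the buffer as a C string.
api.PyUnicode_AsUTF8.restype = ctypes.c_char_p

# const char *PyUnicode_AsUTF8AndSize(PyObject *unicode, Py_ssize_t *size)
api.PyUnicode_AsUTF8AndSize.argtypes = [
    ctypes.py_object,
    ctypes.POINTER(ctypes.c_ssize_t),
]
# Keep the raw address so the bytes can be read with an explicit length.
api.PyUnicode_AsUTF8AndSize.restype = ctypes.c_void_p

text = "abc\0def"  # illustrative string with an embedded null character

# Without the size, a C-string reader stops at the embedded null.
truncated = api.PyUnicode_AsUTF8(text)

# With the size, all bytes can be recovered. The buffer belongs to `text`,
# so `text` must stay alive while the pointer is in use.
size = ctypes.c_ssize_t()
ptr = api.PyUnicode_AsUTF8AndSize(text, ctypes.byref(size))
full = ctypes.string_at(ptr, size.value)

print(truncated, full, size.value)
```

In an actual C extension, the equivalent would presumably be to pass the length obtained from PyUnicode_AsUTF8AndSize to length-aware functions (memcpy, fwrite) rather than relying on strlen.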
