New submission from william.ayd <[email protected]>:
With the attached extension module, if I run the following in the REPL:
>>> import libtest
>>>
>>> libtest.error_if_not_utf8("foo")
'foo'
>>> libtest.error_if_not_utf8("\ud83d")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud83d' in position 0: surrogates not allowed
>>> libtest.error_if_not_utf8("foo")
'foo'
Things seem OK so far. But the next invocation of
>>> libtest.error_if_not_utf8("\ud83d")
then causes a segfault. Note that the order of the calls seems to matter:
simply repeating the call with the invalid surrogate doesn't cause the segfault.
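For comparison, the expected (non-crashing) behaviour of the same call sequence can be sketched in pure Python. This is only an illustration: the `error_if_not_utf8` below is a hypothetical stand-in built on `str.encode`, not the code in testmodule.c, and `probe` is a helper introduced here so the sequence can run without a traceback.

```python
def error_if_not_utf8(value):
    """Hypothetical pure-Python stand-in for the C function in testmodule.c."""
    # With the default 'strict' error handler, encoding a lone surrogate
    # raises UnicodeEncodeError; otherwise the string is returned unchanged.
    value.encode("utf-8")
    return value


def probe(value):
    """Return the result, or the error reason if the string is not valid UTF-8."""
    try:
        return error_if_not_utf8(value)
    except UnicodeEncodeError as exc:
        return exc.reason


# Same order of calls as in the report: valid, invalid, valid, invalid.
results = [probe(s) for s in ("foo", "\ud83d", "foo", "\ud83d")]
print(results)
```

In this sketch every call with the lone surrogate fails the same way, regardless of what was called before it, which is what one would presumably expect from the extension module as well.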
----------
files: testmodule.c
messages: 358755
nosy: william.ayd
priority: normal
severity: normal
status: open
title: PyUnicode_AsUTF8AndSize Sometimes Segfaults With Incomplete Surrogate Pair
Added file: https://bugs.python.org/file48798/testmodule.c
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue39113>
_______________________________________