[issue20368] Tkinter: handle the null character
Roundup Robot added the comment:

New changeset d83ce3a2d954 by Christian Heimes in branch '3.3':
Issue #20515: Fix NULL pointer dereference introduced by issue #20368
http://hg.python.org/cpython/rev/d83ce3a2d954

New changeset 145032f626d3 by Christian Heimes in branch 'default':
Issue #20515: Fix NULL pointer dereference introduced by issue #20368
http://hg.python.org/cpython/rev/145032f626d3

--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20368
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20368] Tkinter: handle the null character
Serhiy Storchaka added the comment:

> But, how many of the replacement sites are exercised by the tests?

I added tests for most of the replacement sites, and the updated patch has even more tests.

split() and splitlist() -- tested. Unfortunately they are tested only with bytes arguments, because these methods reject a unicode string argument that contains NUL.

Tcl_Obj.string, Tcl_Obj.typename and Tcl_Obj.__str__() -- not tested. There are no explicit tests for these properties and methods. It seems that Tcl_Obj.typename can't be tested for NUL.

eval(), evalfile() -- tested.

Variable's methods -- tested.

exprstring() -- tested. I added tests for exprstring(), exprdouble(), exprlong() and exprboolean() in the patch.

record() -- not tested. There are no explicit tests for record() and I have no idea how it can be used in Python.

C functions:

FromObj() and Tkapp_CallResult() -- implicitly tested in a lot of tests, in particular in test_passing_values and test_user_command.

PythonCmd() -- tested in test_user_command.

> There are a few changes that seem unrelated to nulls, which might have been left for another patch.

They just make the code more robust. For example, Tcl can be compiled with TCL_UTF_MAX=6. In this case Python will work correctly most of the time, but it can work incorrectly or crash on specific rare data. With the proposed changes it will raise SystemError early.

Yes, it is worth a separate issue.

> Do you know if this code block is tested.

It is implicitly tested in many tests which test non-ASCII strings.

--
Added file: http://bugs.python.org/file33884/tkinter_null_character_2.patch
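The "fail early" idea in the comment above can be pictured with a rough Python analogue of building a string from fixed-width code units. This is only a sketch: the function name from_kind_and_data is hypothetical, and it assumes that unit widths other than 1, 2 or 4 bytes are rejected with SystemError, which is how the C function PyUnicode_FromKindAndData treats an invalid kind. It does not reproduce the patch itself.

```python
import sys


def from_kind_and_data(kind: int, data: bytes) -> str:
    """Hypothetical analogue of PyUnicode_FromKindAndData (sketch only).

    `kind` is the width of one code unit in bytes, for example the value
    of sizeof(Tcl_UniChar) on the C side.
    """
    # Reject an unsupported width immediately instead of misreading the buffer.
    if kind not in (1, 2, 4):
        raise SystemError("invalid kind")
    if len(data) % kind:
        raise ValueError("buffer length is not a multiple of the unit width")
    # Read each native-endian code unit and turn it into one character.
    units = (
        int.from_bytes(data[i:i + kind], sys.byteorder)
        for i in range(0, len(data), kind)
    )
    return "".join(chr(unit) for unit in units)


# Two 2-byte units: 'A' and the euro sign.
buffer = (0x41).to_bytes(2, sys.byteorder) + (0x20AC).to_bytes(2, sys.byteorder)
print(from_kind_and_data(2, buffer))
```

Passing the real unit width keeps a single code path for every supported build configuration, and an unexpected width surfaces as an exception rather than as corrupted data.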
[issue20368] Tkinter: handle the null character
Terry J. Reedy added the comment:

With the additional tests, it seems reasonable to apply.
[issue20368] Tkinter: handle the null character
Roundup Robot added the comment:

New changeset a6ba6db9edb4 by Serhiy Storchaka in branch '2.7':
Issue #20368: Add tests for Tkinter methods exprstring(), exprdouble(),
http://hg.python.org/cpython/rev/a6ba6db9edb4

New changeset 825c8db8b1e2 by Serhiy Storchaka in branch '3.3':
Issue #20368: Add tests for Tkinter methods exprstring(), exprdouble(),
http://hg.python.org/cpython/rev/825c8db8b1e2

New changeset 28ec384e7dcc by Serhiy Storchaka in branch 'default':
Issue #20368: Add tests for Tkinter methods exprstring(), exprdouble(),
http://hg.python.org/cpython/rev/28ec384e7dcc

New changeset 65c29c07bb31 by Serhiy Storchaka in branch '2.7':
Issue #20368: The null character now correctly passed from Tcl to Python (in
http://hg.python.org/cpython/rev/65c29c07bb31

New changeset 08e3343f01a5 by Serhiy Storchaka in branch '3.3':
Issue #20368: The null character now correctly passed from Tcl to Python.
http://hg.python.org/cpython/rev/08e3343f01a5

New changeset 321b714653e3 by Serhiy Storchaka in branch 'default':
Issue #20368: The null character now correctly passed from Tcl to Python.
http://hg.python.org/cpython/rev/321b714653e3

--
nosy: +python-dev
[issue20368] Tkinter: handle the null character
Changes by Serhiy Storchaka storch...@gmail.com:

--
resolution: -> fixed
stage: patch review -> committed/rejected
status: open -> closed
[issue20368] Tkinter: handle the null character
Serhiy Storchaka added the comment:

If there are no objections I'll commit this patch tomorrow.

--
assignee: -> serhiy.storchaka
[issue20368] Tkinter: handle the null character
Terry J. Reedy added the comment:

The core of the patch is a wrapper that traps UnicodeDecodeErrors, corrects the strings, and re-decodes. A Python version might look like

    def unicodeFromTclStringAndSize(s, size):
        try:
            return PyUnicode_DecodeUTF8(s, size, NULL)
        except UnicodeDecodeError:
            if b'\xc0\x80' in s:
                # bytes are immutable, so the result must be rebound
                s = s.replace(b'\xc0\x80', b'\x00')
                return PyUnicode_DecodeUTF8(s, len(s), NULL)
            else:
                raise

This is used in a couple of additional wrappers, and all direct decode calls are replaced with wrappers. New tests are added. Overall, a great idea, and I want to see this patch in 3.4.

But, how many of the replacement sites are exercised by the tests?

There are a few changes that seem unrelated to nulls, which might have been left for another patch. Example:

    -#if TCL_UTF_MAX==3
             return PyUnicode_FromKindAndData(
    -            PyUnicode_2BYTE_KIND, Tcl_GetUnicode(value),
    +            sizeof(Tcl_UniChar), Tcl_GetUnicode(value),
                 Tcl_GetCharLength(value));
    -#else
    -        return PyUnicode_FromKindAndData(
    -            PyUnicode_4BYTE_KIND, Tcl_GetUnicode(value),
    -            Tcl_GetCharLength(value));
    -#endif

Do you know if this code block is tested?
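For readers who want to try the idea, here is a minimal runnable sketch of the same decode-then-repair approach in pure Python. The name decode_tcl_string is hypothetical, and the sketch only illustrates the technique described in the comment; the real patch is C code in _tkinter.

```python
def decode_tcl_string(data: bytes) -> str:
    """Decode bytes produced by Tcl, tolerating its encoding of NUL.

    Strict UTF-8 decoding is tried first. Only if it fails and the
    two-byte sequence C0 80 is present is that sequence replaced by a
    real zero byte before decoding again.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        if b"\xc0\x80" in data:
            # bytes.replace returns a new object, so use its result directly.
            return data.replace(b"\xc0\x80", b"\x00").decode("utf-8")
        raise


print(repr(decode_tcl_string(b"spam\xc0\x80eggs")))
```

Trying the strict decode first keeps the common case cheap: ordinary strings never pay for the scan and the copy, and data that is invalid for some other reason still raises UnicodeDecodeError.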
[issue20368] Tkinter: handle the null character
New submission from Serhiy Storchaka:

Tcl/Tk uses a modified UTF-8 encoding to represent strings as C strings (char*). Because C strings are NUL-terminated, the null character is represented as the illegal UTF-8 sequence \xc0\x80. The current Tkinter code is not fully aware of this. It has special handling of the \xc0\x80 string (i.e. an encoded single null character) in one place, but doesn't handle an encoded null character contained in a larger string. As a result, Tkinter may truncate strings containing the null character, or return a wrong result.

The proposed patch fixes many issues with the null character (when converting from Tcl to Python strings). NUL is still forbidden in string arguments of many methods.

Also the patch enhances error handling for variable-related commands.

--
components: Extension Modules, Tkinter
files: tkinter_null_character.patch
keywords: patch
messages: 208954
nosy: gpolo, kbk, loewis, roger.serwy, serhiy.storchaka, terry.reedy
priority: normal
severity: normal
stage: patch review
status: open
title: Tkinter: handle the null character
type: behavior
versions: Python 2.7, Python 3.3, Python 3.4
Added file: http://bugs.python.org/file33659/tkinter_null_character.patch
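The truncation problem described in this report can be illustrated without Tcl at all. The sketch below uses two hypothetical helpers: encode_modified_utf8 covers only the NUL substitution discussed here, and as_c_string mimics code that reads a buffer as a NUL-terminated char*. Neither is part of Tkinter.

```python
def encode_modified_utf8(text: str) -> bytes:
    """Encode text as UTF-8, writing U+0000 as the two bytes C0 80.

    In UTF-8 a zero byte can only come from U+0000, so a plain byte
    replacement is sufficient for this illustration.
    """
    return text.encode("utf-8").replace(b"\x00", b"\xc0\x80")


def as_c_string(buffer: bytes) -> bytes:
    """Mimic reading a NUL-terminated char*: stop at the first zero byte."""
    return buffer.split(b"\x00", 1)[0]


text = "spam\x00eggs"
plain = text.encode("utf-8")
modified = encode_modified_utf8(text)

print(as_c_string(plain))     # b'spam' (everything after NUL is lost)
print(as_c_string(modified))  # b'spam\xc0\x80eggs' (full content survives)
```

The modified form survives a trip through a C string, but it is no longer valid UTF-8, which is why the receiving side has to recognise \xc0\x80 anywhere in the string and not only when it is the whole string.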