Serhiy Storchaka added the comment:

> I am also not clear on the relation between the UnicodeDecodeError and tuple 
> splitting. Does '_flatten((self._w, cmd)))' call split or splitlist on the 
> tuple arg? Is so, do you know why a problem with that would lead to the 
> UDError? Does your patch fix the leading '0' regression?

The traceback is misleading. The full statement is:

            for x in self.tk.split(
                    self.tk.call(_flatten((self._w, cmd)))):

Where cmd is ('entryconfigure', index). The UnicodeDecodeError was raised 
neither by _flatten() nor by call(), but by split().

When you run `./python -m idlelib.idle \\0.py`, call() returns and split() gets a 
tuple of tuples: (('-activebackground', '', '', '', ''), ('-activeforeground', 
'', '', '', ''), ('-accelerator', '', '', '', ''), ('-background', '', '', '', 
''), ('-bitmap', '', '', '', ''), ('-columnbreak', '', '', 0, 0), ('-command', 
'', '', '', '3067328620open_recent_file'), ('-compound', 'compound', 
'Compound', <index object: 'none'>, 'none'), ('-font', '', '', '', ''), 
('-foreground', '', '', '', ''), ('-hidemargin', '', '', 0, 0), ('-image', '', 
'', '', ''), ('-label', '', '', '', '1 /home/serhiy/py/cpython/\\0.py'), 
('-state', '', '', <index object: 'normal'>, 'normal'), ('-underline', '', '', 
-1, 0)). When wantobjects in Lib/tkinter/__init__.py is set to 0, split() gets a 
string r"{-activebackground {} {} {} {}} {-activeforeground {} {} {} {}} 
{-accelerator {} {} {} {}} {-background {} {} {} {}} {-bitmap {} {} {} {}} 
{-columnbreak {} {} 0 0} {-command {} {} {} 3067013228open_recent_file} 
{-compound compound Compound none none} {-font {} {} {} {}} 
{-foreground {} {} {} {}} 
{-hidemargin {} {} 0 0} {-image {} {} {} {}} {-label {} {} {} {1 
/home/serhiy/py/cpython/\0.py}} {-state {} {} normal normal} {-underline {} {} 
-1 0}".  split() then tries to split its argument recursively. When it splits '1 
/home/serhiy/py/cpython/\\0.py', it interprets '\\0' as a backslash substitution 
for octal code 0, which means the character with code 0. Tcl uses a modified UTF-8 
encoding in which the null character is encoded as b'\xC0\x80'. This byte 
sequence is invalid UTF-8, which is why UnicodeDecodeError was raised (the patch 
for issue13153 handles b'\xC0\x80' more correctly). If you try '\101.py', 
split() will translate it to 'A.py'.
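
The two effects described above can be reproduced without Tcl. What follows is only a rough sketch: substitute_octal() and is_valid_utf8() are hypothetical helpers written for illustration, not part of tkinter or Tcl, and the regular expression only approximates Tcl's octal backslash substitution (a backslash followed by one to three octal digits).

```python
import re

def substitute_octal(text):
    """Hypothetical helper: replace a backslash followed by 1-3 octal
    digits with the character of that code (a simplified stand-in for
    Tcl's backslash substitution, not Tcl's actual implementation)."""
    return re.sub(r'\\([0-7]{1,3})',
                  lambda m: chr(int(m.group(1), 8)),
                  text)

def is_valid_utf8(data):
    """Hypothetical helper: report whether Python's strict UTF-8
    decoder accepts the given bytes."""
    try:
        data.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True

# Octal 101 is 65, the code of 'A'.
print(repr(substitute_octal(r'\101.py')))

# Octal 0 yields the null character in front of '.py'.
print(repr(substitute_octal(r'\0.py')))

# Modified UTF-8 encodes the null character as C0 80, which the
# strict UTF-8 decoder rejects; a plain zero byte is accepted.
print(is_valid_utf8(b'\xC0\x80'))
print(is_valid_utf8(b'\x00'))
```

The first call shows why a file name containing '\101' comes back altered, and the third shows why the two-byte null encoding ends in a UnicodeDecodeError when decoded as strict UTF-8.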

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19020>
_______________________________________