Steve Dower added the comment: I'm working on this as part of my fix for issue1602. I'm not yet sure how this will come out. Compatibility with GNU readline seems to be the biggest issue: if we want to keep it, then we can't allow embedded '\0' in the encoded text (i.e. UTF-16 cannot be used, which implies that sys.stdin.encoding cannot always be used directly).
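To illustrate the embedded '\0' problem, here is a rough sketch in plain Python (not the actual C code path). The helper c_string_view is a hypothetical stand-in for any consumer that treats its input as a NUL-terminated char*, as the GNU readline interface does; the sample text is arbitrary.

```python
# Sketch only: shows why UTF-16 text cannot pass through a NUL-terminated
# char* interface, while UTF-8 text (without U+0000) can.

def c_string_view(data: bytes) -> bytes:
    """Hypothetical helper: what a NUL-terminated char* consumer would see."""
    return data.split(b"\0", 1)[0]

text = "\u03c0 = 3.14"  # arbitrary sample: a non-ASCII character plus ASCII

utf16 = text.encode("utf-16-le")
utf8 = text.encode("utf-8")

# Every ASCII character becomes <byte> 0x00 in UTF-16-LE, so the data
# contains NUL bytes and a C-string reader stops at the first one.
has_nul_utf16 = b"\0" in utf16
truncated = c_string_view(utf16)

# UTF-8 never produces a 0x00 byte unless the text itself contains U+0000.
has_nul_utf8 = b"\0" in utf8
intact = c_string_view(utf8)

print(has_nul_utf16, truncated)
print(has_nul_utf8, intact == utf8)
```

With UTF-16-LE the C-string view is cut off after the first ASCII character, whereas the UTF-8 bytes survive unchanged.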
Adding __readlinehook__ as an alternative may be feasible, but it would be a decent amount of work given how we call into the current readline implementation. Unfortunately, it looks like detecting when a readline hook has been added is going to involve significant changes to the tokenizer, which I really don't want to make. The easiest approach wrt issue1602 seems to be to special-case the console by re-encoding from utf-16-le to utf-8 and forcing the encoding in the tokenizer to utf-8 (instead of sys.stdin.encoding) in this case. I'll start here so that at least we can parse Unicode from the interactive prompt.

----------
assignee:  -> steve.dower
versions: +Python 3.6 -Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17620>
_______________________________________