This situation is a bit different from coding cookies. Those are used when we have bytes from a source file but don't know their encoding. During an interactive session, the tokenizer always knows the encoding of the bytes. I would therefore expect PyCF_SOURCE_IS_UTF8 to always be set in an interactive session, so that bytes containing encoded non-ASCII characters are interpreted correctly.

Why am I talking about PyCF_SOURCE_IS_UTF8? eval(u"u'\u03b1'") -> u'\u03b1', but eval(u"u'\u03b1'".encode('utf-8')) -> u'\xce\xb1'. I understand that in the second case eval has no idea how the given bytes are encoded. But the first case is actually implemented by encoding to UTF-8 and setting PyCF_SOURCE_IS_UTF8. That's why I'm talking about the flag.
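To make the two results concrete, here is a minimal sketch that only emulates the two interpretations of the same bytes. It does not call the tokenizer or set any compiler flag. The helper name decode_literal_bytes is hypothetical, and using a Latin-1 decode to stand in for the "encoding unknown" case is an assumption chosen because it reproduces the u'\xce\xb1' result reported above.

```python
def decode_literal_bytes(raw, source_is_utf8):
    """Hypothetical helper: emulate how the bytes of a unicode literal
    end up as code points, depending on whether UTF-8 is assumed."""
    if source_is_utf8:
        # Bytes are known to be UTF-8: multi-byte sequences are combined.
        return raw.decode("utf-8")
    # Encoding unknown: each byte becomes the code point of the same value.
    # Latin-1 is used here only as an assumption that mimics this effect.
    return raw.decode("latin-1")


# The UTF-8 encoding of GREEK SMALL LETTER ALPHA is the two bytes CE B1.
raw = u"\u03b1".encode("utf-8")

with_flag = decode_literal_bytes(raw, True)      # one character, U+03B1
without_flag = decode_literal_bytes(raw, False)  # two characters, U+00CE U+00B1

print(repr(with_flag))
print(repr(without_flag))
```

The sketch uses u"" literals and single-argument print calls so that it should run unchanged on Python 2.7 and on Python 3.3 or later.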
Regards, Drekin

On Wed, Apr 29, 2015 at 9:25 AM, Nick Coghlan <ncogh...@gmail.com> wrote:

> On 29 April 2015 at 06:20, Adam Bartoš <dre...@gmail.com> wrote:
> > Hello,
> >
> > is it possible to somehow tell Python 2.7 to compile a code entered in the
> > interactive session with the flag PyCF_SOURCE_IS_UTF8 set? I'm considering
> > adding support for Python 2 in my package
> > (https://github.com/Drekin/win-unicode-console) and I have run into the fact
> > that when u"α" is entered in the interactive session, it results in
> > u"\xce\xb1" rather than u"\u03b1". As this seems to be a highly specialized
> > question, I'm asking it here.
>
> As far as I am aware, we don't have the equivalent of a "coding
> cookie" for the interactive interpreter, so if anyone else knows how
> to do it, I'll be learning something too :)
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev