I have code that uses the modulefinder:
    mf = modulefinder.ModuleFinder()
    mf.run_script(self.main_program)
With Python 3.7 everything works without problems, but Python 3.8 produces a
traceback (TB). Here is the end of the TB:
  File "C:\Python38.Win64\lib\modulefinder.py", line 326, in import_module
    m = self.load_module(fqname, fp, pathname, stuff)
  File "C:\Python38.Win64\lib\modulefinder.py", line 344, in load_module
    co = compile(fp.read()+'\n', pathname, 'exec')
  File "C:\Python38.Win64\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 308: character maps to <undefined>
I added this debug print in both the 3.7 and 3.8 code of modulefinder.py:
    def load_module(self, fqname, fp, pathname, file_info):
        print('QQQ load_module(%r, %r, %r, %r)' % (fqname, fp, pathname,
                                                   file_info))
The file that causes the TB is functools.py, which contains non-ASCII text
at offset 308.
The debug shows this for functools.py:
QQQ load_module('functools', <_io.TextIOWrapper
name='C:\\Python37.win64\\lib\\functools.py' mode='r' encoding='utf-8'>,
'C:\\Python37.win64\\lib\\functools.py', ('.py', 'r', 1))
QQQ load_module('functools', <_io.TextIOWrapper
name='C:\\Python38.Win64\\lib\\functools.py' mode='r' encoding='cp1252'>,
'C:\\Python38.Win64\\lib\\functools.py', ('.py', 'r', 1))
In 3.7 the fp is opened with encoding UTF-8, but in 3.8 it is cp1252.
The code in modulefinder does not seem to handle encoding when opening .py
files.
Adding an explicit coding comment to functools.py did not help.
So the default encoding is used, which according to the docs for open()
is locale.getpreferredencoding(False).
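To illustrate the difference between the two ways of opening a source file, here is a minimal sketch. It compares the encoding that a plain open() would fall back to with the encoding that tokenize.open() detects from the file itself (BOM or coding comment, defaulting to UTF-8). The temporary file and the variable names are made up for the illustration; this is not what modulefinder itself does.

```python
import locale
import os
import tempfile
import tokenize

# Write a small UTF-8 source file containing a non-ASCII character.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample.py")
with open(path, "w", encoding="utf-8") as f:
    f.write("# caf\u00e9\nx = 1\n")

# The locale-based default that open(path) falls back to when no
# encoding argument is given (cp1252 on the Windows setup described here).
default_encoding = locale.getpreferredencoding(False)

# tokenize.open() inspects the file for a BOM or a coding comment and
# falls back to UTF-8, independent of the locale.
with tokenize.open(path) as src:
    detected_encoding = src.encoding
    text = src.read()

print("open() default:  ", default_encoding)
print("tokenize.open(): ", detected_encoding)
```

On a machine where the first line prints cp1252, reading a UTF-8 source file through a plain open() can fail in the same way as the TB above.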
How did modulefinder end up with utf-8 encoding being used? It does seem to
look at the chcp setting.
On both 3.7 and 3.8 I see that locale.getpreferredencoding(False) returns
'cp1252'.
I have not figured out how the 3.7 code manages to use utf-8, which is
required to get things working.
I can work around this by setting PYTHONUTF8=1, but I want to change the
behaviour from within Python.
I have failed to find a way to change what is returned by
locale.getpreferredencoding(False) from within Python. Is setting
PYTHONUTF8 the only way?
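One possible alternative to changing the default encoding is to stop relying on it. The sketch below is a hypothetical subclass (SourceEncodingModuleFinder is my own name, not part of the stdlib) that ignores the text-mode fp it is handed and re-opens the source with tokenize.open(). It assumes load_module keeps the signature shown in the debug print above, that the third item of file_info is 1 for Python source, and that mode 'r' marks a text handle. I have not verified it against every 3.8 release.

```python
import modulefinder
import os
import tempfile
import tokenize


class SourceEncodingModuleFinder(modulefinder.ModuleFinder):
    """Hypothetical subclass: re-open text-mode sources with tokenize.open()."""

    def load_module(self, fqname, fp, pathname, file_info):
        suffix, mode, kind = file_info
        # Assumption: kind == 1 means "Python source" (as in the debug
        # output above) and mode 'r' means fp is a text handle whose
        # encoding came from the locale default.
        if fp is not None and kind == 1 and mode == "r":
            with tokenize.open(pathname) as src:
                return super().load_module(fqname, src, pathname, file_info)
        # Anything else (binary handles, packages, compiled files) is
        # passed through unchanged.
        return super().load_module(fqname, fp, pathname, file_info)


# Small demo: a UTF-8 file containing a byte (0x81) that cp1252 cannot decode.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo_mod.py")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("# \u00c1 encodes to bytes C3 81 in UTF-8\nvalue = 1\n")

finder = SourceEncodingModuleFinder()
# Simulate the handle from the TB: text mode, wrong encoding.
with open(demo_path, "r", encoding="cp1252") as bad_fp:
    module = finder.load_module("demo_mod", bad_fp, demo_path, (".py", "r", 1))

print(module)
```

In my code this would mean replacing modulefinder.ModuleFinder() with SourceEncodingModuleFinder() and calling run_script() as before.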
Barry