New submission from Nicholas Feix <snip3rn...@googlemail.com>:

The modulefinder._find_module(...) function returns file objects text mode for 
source modules using the system encoding.
ModuleFinder.load_module(...) can run into decoding issues when the source file 
encoding does not match the system default.

The prior implementation imp.find_module(...) detected the encoding correctly 
using the tokenize.detect_encoding(...) function.
With the following code segment the detection would work again with UTF-8 BOM 
and PEP 263 type cookies.

    encoding = None
    if 'b' not in mode:
        with open(file_path, 'rb') as file:
            encoding = tokenize.detect_encoding(file.readline)[0]

    file = open(file_path, mode, encoding=encoding)

----------
components: Library (Lib)
messages: 359259
nosy: Nicholas Feix
priority: normal
severity: normal
status: open
title: Modulefinder does not consider source file encoding
type: behavior
versions: Python 3.8, Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39206>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to