New submission from Graham Dumpleton:

In an embedded system, the 'python' executable is itself not run and the 
Python interpreter is initialised in process explicitly using Py_Initialize(). 
In order to find the location of the Python installation, an elaborate sequence 
of checks is run, as implemented in calculate_path() of Modules/getpath.c.

The primary mechanism is usually to search for a 'python' executable on PATH 
and use that as a starting point. From there it backtracks up the file 
system from the bin directory to arrive at what would be the perceived 
equivalent of PYTHONHOME. The lib/pythonX.Y directory under that for the 
matching version X.Y of Python being initialised would then be used.
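
The PATH-based search described above can be approximated in a few lines of Python. This is only a sketch of the idea, not what getpath.c does: the function name guess_prefix_from_path is hypothetical, and the real code also checks for landmark files such as os.py before accepting a prefix.

```python
import os
import shutil
import sys


def guess_prefix_from_path(exe_name="python", path=None):
    """Guess (prefix, stdlib_dir) from the first exe_name found on PATH.

    Hypothetical helper for illustration only. Returns None when no
    matching executable is found.
    """
    exe = shutil.which(exe_name, path=path)
    if exe is None:
        return None
    bin_dir = os.path.dirname(os.path.abspath(exe))
    # Back up from .../bin to what is assumed to be the installation prefix.
    prefix = os.path.dirname(bin_dir)
    # Use the version of the running interpreter as the "X.Y" being initialised.
    version = "{}.{}".format(*sys.version_info[:2])
    stdlib_dir = os.path.join(prefix, "lib", "python" + version)
    return prefix, stdlib_dir
```

Nothing in this derivation looks at where the Python shared library was actually loaded from, which is why a 'python' found earlier on PATH can lead to the wrong prefix.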

Problems can often occur with the way this search is done though.

For example, suppose someone is not using the system Python installation but 
has installed a different version of Python under /usr/local. At run time, the 
correct Python shared library would be loaded from /usr/local/lib, but 
because the 'python' executable is found in /usr/bin, /usr is used as 
sys.prefix instead of /usr/local.

This can cause two distinct problems.

The first is that there is no Python installation at all under /usr 
corresponding to the Python version which was embedded, with the result that 
the 'site' module cannot be imported and initialisation fails.

The second is that there is a Python installation of the same major/minor but 
potentially a different patch revision, or compiled with different binary API 
flags or different Unicode character width. The Python interpreter in this case 
may well be able to start up, but the mismatch in the Python modules or 
extension modules and the core Python library that was actually linked can 
cause odd errors or crashes to occur.

Anyway, that is the background.

The way an embedded system overcame this problem was to use 
Py_SetPythonHome() to forcibly override what should be used for PYTHONHOME, so 
that the correct installation was found and used at runtime.

This works quite happily even for Python virtual environments constructed 
using 'virtualenv', allowing the embedded system to be run in a virtual 
environment distinct from the main Python installation it was created from.

It doesn't work, however, if the virtual environment was created using pyvenv.

One can easily illustrate the problem without even using an embedded system.

$ which python3.4
/Library/Frameworks/Python.framework/Versions/3.4/bin/python3.4

$ pyvenv-3.4 py34-pyvenv

$ py34-pyvenv/bin/python
Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 00:54:21)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.prefix
'/private/tmp/py34-pyvenv'
>>> sys.path
['', '/Library/Frameworks/Python.framework/Versions/3.4/lib/python34.zip', 
'/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4', 
'/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/plat-darwin', 
'/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/lib-dynload', 
'/private/tmp/py34-pyvenv/lib/python3.4/site-packages']

$ PYTHONHOME=/tmp/py34-pyvenv python3.4
Fatal Python error: Py_Initialize: unable to load the file system codec
ImportError: No module named 'encodings'
Abort trap: 6

The basic problem is that in a pyvenv virtual environment, there is no 
duplication of stuff in lib/pythonX.Y, with the only thing in there being the 
site-packages directory.

When you start up the 'python' executable direct from the pyvenv virtual 
environment, the startup sequence checks know this and consult the pyvenv.cfg 
to extract the:

home = /Library/Frameworks/Python.framework/Versions/3.4/bin

setting and from that derive where the actual run time files are.

When PYTHONHOME or Py_SetPythonHome() is used, then the getpath.c checks 
blindly believe that is the authoritative value:

 * Step 2. See if the $PYTHONHOME environment variable points to the
 * installed location of the Python libraries.  If $PYTHONHOME is set, then
 * it points to prefix and exec_prefix.  $PYTHONHOME can be a single
 * directory, which is used for both, or the prefix and exec_prefix
 * directories separated by a colon.

    /* If PYTHONHOME is set, we believe it unconditionally */
    if (home) {
        wchar_t *delim;
        wcsncpy(prefix, home, MAXPATHLEN);
        prefix[MAXPATHLEN] = L'\0';
        delim = wcschr(prefix, DELIM);
        if (delim)
            *delim = L'\0';
        joinpath(prefix, lib_python);
        joinpath(prefix, LANDMARK);
        return 1;
    }

Because of this, the problem above occurs: the proper runtime directories for 
the standard library aren't included in sys.path, with the result that the 
'encodings' module cannot even be found.

What I believe should occur is that PYTHONHOME should not be believed 
unconditionally. Instead there should be a check to see if that directory 
contains a pyvenv.cfg file and if there is one, realise it is a pyvenv style 
virtual environment and do the same sort of adjustments which would be made 
based on looking at what that pyvenv.cfg file contains.
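
A minimal sketch of the check being proposed, written in Python rather than C to keep it short. The function name effective_home is hypothetical, and treating the parent of the 'home' directory as the prefix is a simplifying assumption; getpath.c instead searches upwards from that directory for a landmark file.

```python
import os


def effective_home(python_home):
    """Return the directory to treat as the prefix for python_home.

    Hypothetical helper. If python_home contains a pyvenv.cfg with a
    'home' key, return the parent of that directory (assumed to be the
    base installation's prefix). Otherwise return python_home unchanged.
    """
    cfg_path = os.path.join(python_home, "pyvenv.cfg")
    try:
        with open(cfg_path, encoding="utf-8") as cfg:
            for line in cfg:
                key, sep, value = line.partition("=")
                if sep and key.strip() == "home":
                    # 'home' names the base installation's bin directory.
                    return os.path.dirname(value.strip().rstrip(os.sep))
    except OSError:
        pass  # No readable pyvenv.cfg: treat as an ordinary installation.
    return python_home
```

With a check along these lines, a PYTHONHOME pointing at an ordinary installation or a virtualenv would behave as before, and only a directory containing pyvenv.cfg would be redirected to the base installation.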

For the record this issue is affecting Apache/mod_wsgi and right now the only 
workaround I have is to tell people that, in addition to setting the 
configuration setting corresponding to PYTHONHOME, they need to use 
configuration settings that have the same effect as doing:

PYTHONPATH=/Library/Frameworks/Python.framework/Versions/3.4/lib/python34.zip:/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4:/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/plat-darwin:/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/lib-dynload

so that the correct runtime files are found.
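
For illustration, the PYTHONPATH value above could be generated rather than typed by hand. This is a sketch only: the function name base_pythonpath and its parameters are hypothetical, and the list of entries simply mirrors the sys.path shown earlier for Python 3.4 on darwin.

```python
import os


def base_pythonpath(base_prefix, version="3.4", platform="darwin"):
    """Build a PYTHONPATH string covering the base installation's stdlib.

    Hypothetical helper. The four entries follow the sys.path layout
    shown in the report; other versions or platforms may differ.
    """
    major, minor = version.split(".")
    lib = os.path.join(base_prefix, "lib")
    stdlib = os.path.join(lib, "python" + version)
    entries = [
        os.path.join(lib, "python{}{}.zip".format(major, minor)),
        stdlib,
        os.path.join(stdlib, "plat-" + platform),
        os.path.join(stdlib, "lib-dynload"),
    ]
    # os.pathsep is ':' on POSIX systems, matching the value above.
    return os.pathsep.join(entries)
```

On a POSIX system, calling it with the framework prefix from the report, /Library/Frameworks/Python.framework/Versions/3.4, yields the same colon-separated string as the workaround.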

I am still trying to work out a more permanent workaround I can add to the 
mod_wsgi code itself, since I can't rely on a fix for existing Python versions 
with pyvenv support.

The only other option is to tell people not to use pyvenv and to use 
virtualenv instead.

Right now I can offer no actual patch, as that getpath.c code is scary enough 
that I am not even sure at this point where the check should be incorporated 
or how.

The only thing I can surmise is that, because the current check for pyvenv.cfg 
happens before the search for the prefix and is based on the executable's 
location, the file isn't consulted when PYTHONHOME is set.

    /* Search for an environment configuration file, first in the
       executable's directory and then in the parent directory.
       If found, open it for use when searching for prefixes.
    */

    {
        wchar_t tmpbuffer[MAXPATHLEN+1];
        wchar_t *env_cfg = L"pyvenv.cfg";
        FILE * env_file = NULL;

        wcscpy(tmpbuffer, argv0_path);

        joinpath(tmpbuffer, env_cfg);
        env_file = _Py_wfopen(tmpbuffer, L"r");
        if (env_file == NULL) {
            errno = 0;
            reduce(tmpbuffer);
            reduce(tmpbuffer);
            joinpath(tmpbuffer, env_cfg);
            env_file = _Py_wfopen(tmpbuffer, L"r");
            if (env_file == NULL) {
                errno = 0;
            }
        }
        if (env_file != NULL) {
            /* Look for a 'home' variable and set argv0_path to it, if found */
            if (find_env_config_value(env_file, L"home", tmpbuffer)) {
                wcscpy(argv0_path, tmpbuffer);
            }
            fclose(env_file);
            env_file = NULL;
        }
    }

    pfound = search_for_prefix(argv0_path, home, _prefix, lib_python);
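
The lookup order in the excerpt above can be restated as a short Python sketch, to make it plain that the search starts from the executable's directory and never from PYTHONHOME. The function name find_env_cfg is hypothetical, and this is an approximation of the C code, not a translation of it.

```python
import os


def find_env_cfg(argv0_dir, cfg_name="pyvenv.cfg"):
    """Return the path of the first cfg_name found, or None.

    Hypothetical helper. Looks in argv0_dir (the executable's directory)
    and then in its parent, in that order.
    """
    for directory in (argv0_dir, os.path.dirname(argv0_dir)):
        candidate = os.path.join(directory, cfg_name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

When 'python3.4' from the base installation is run with PYTHONHOME pointing at the virtual environment, neither of these two directories is inside the virtual environment, so its pyvenv.cfg is never opened.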

----------
messages: 225434
nosy: grahamd
priority: normal
severity: normal
status: open
title: pyvenv style virtual environments unusable in an embedded system
versions: Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22213>
_______________________________________