Two comments:
1. (bug): The acquire() call should be *outside* the try ... finally
block. You do not want to release a lock that you did not aquire.
2. (optimization): If you are not planning to change the path, you do
not have to aquire the lock. Aquiring a lock is quite expensive, so the
double-check will have much less impact on performance.
if config.has_key("PythonPath"):
pathstring = config["PythonPath"]
if not _path_cache.has_key(pathstring):
# acquire must be done outside the try block
_path_cache_lock.acquire()
try:
# double-check, because two threads might come
# to the same conclusion up until here.
if not _path_cache.has_key(pathstring):
newpath = eval(pathstring)
_path_cache[pathstring] = None
sys.path[:] = newpath
finally:
_path_cache_lock.release()
Mike Looijmans
Philips Natlab / Topic Automation
Graham Dumpleton (JIRA) wrote:
Problems with PythonPath directive.
-----------------------------------
Key: MODPYTHON-114
URL: http://issues.apache.org/jira/browse/MODPYTHON-114
Project: mod_python
Type: Bug
Components: core
Versions: 3.2, 3.1.4
Reporter: Graham Dumpleton
The "PythonPath" setting can be used to define the value of the Python
"sys.path" variable. It is this variable which defines the list of directories that
Python will search in when looking for a module to be imported.
Although the actual reassignment of "sys.path" by mod_python does not in itself
present a problem due to assignment in Python being thread safe by definition, the
context in which the assignment occurs is not thread safe and a race condition exists.
This exists as the top level mod_python dispatcher will consult the existing value of "sys.path" and the last
value for the "PythonPath" setting encountered before then making a decision to modify "sys.path".
If multiple requests are being serviced in distinct threads within the context of the same interpreter instance, and
each at the same time decide they want to modify the value of "sys.path", only one might ultimately succeed
in setting it to the value it wants and any modification required by the other may be lost.
In the worst case scenario, this can result in the importation of any subsequent modules within
that request failing due to a required directory not being present in "sys.path". It is
possible that this situation may resolve itself and go away on a subsequent request, but due to how
mod_python caches the last value of "PythonPath" in a global variable this will be
dependent on what other requests arrive.
At the least, for mod_python to resolve the problem itself would require a request to arrive in the interim which targeted the URL which
was not the last to cache its raw setting for "PythonPath". This only works though due to a further issue whereby alternate
requests against URLs with different "PythonPath" settings will cause "sys.path" to be extended everytime if the
"PythonPath" setting references "sys.path". This results in "sys.path" continually growing over time due to
directories being added multiple times.
The problematic code in apache.py is:
if config.has_key("PythonPath"):
# we want to do as little evaling as possible,
# so we remember the path in un-evaled form and
# compare it
global _path
pathstring = config["PythonPath"]
if pathstring != _path:
_path = pathstring
newpath = eval(pathstring)
if sys.path != newpath:
sys.path[:] = newpath
To fix the problems, the processing of PythonPath directive should be protected
by a thread mutex lock and a dictionary should be used to store all seen values
of PythonPath rather than a single variable. If a value for PythonPath has ever
been seen, and not just from the last change, then no update would be necessary.
if config.has_key("PythonPath"):
try:
_path_cache_lock.acquire()
pathstring = config["PythonPath"]
if not _path_cache.has_key(pathstring):
newpath = eval(pathstring)
_path_cache[pathstring] = None
sys.path[:] = newpath
finally:
_path_cache_lock.release()
There shouldn't really be a need to check whether the new path is different to
the current value of sys.path as that scenario shouldn't occur at that point.
The two parts of this problem were previously catalogued as ISSUE 15 and ISSUE
16 in my list of problems with the module importing system. The list can be
found at:
http://www.dscpl.com.au/articles/modpython-003.html