[ https://issues.apache.org/jira/browse/THRIFT-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15843432#comment-15843432 ]
Chandler May commented on THRIFT-4042:
--------------------------------------

Updated with minimal working example.

> ExtractionError when using accelerated thrift in a multiprocess test
> --------------------------------------------------------------------
>
>                 Key: THRIFT-4042
>                 URL: https://issues.apache.org/jira/browse/THRIFT-4042
>             Project: Thrift
>          Issue Type: Bug
>          Components: Python - Library
>    Affects Versions: 0.10.0
>            Reporter: Chandler May
>
> We recently switched to thrift 0.10.0 with accelerated protocols and started
> seeing ExtractionError being sporadically raised in tests that use the
> accelerated protocols (fastbinary module) and multiprocessing module.
>
> Example traceback:
> {code}
> Traceback (most recent call last):
>   File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "/tmp/thrift-4042/thrift4042.py", line 15, in <module>
>     load_fastbinary()
>   File "/tmp/thrift-4042/thrift4042.py", line 9, in load_fastbinary
>     return TCompactProtocolAccelerated(None, fallback=False)
>   File "build/bdist.linux-x86_64/egg/thrift/protocol/TCompactProtocol.py", line 449, in __init__
>   File "build/bdist.linux-x86_64/egg/thrift/protocol/fastbinary.py", line 7, in <module>
>   File "build/bdist.linux-x86_64/egg/thrift/protocol/fastbinary.py", line 4, in __bootstrap__
>   File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1203, in resource_filename
>     self, resource_name
>   File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1715, in get_resource_filename
>     self._extract_resource(manager, self._eager_to_zip(name))
>   File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1745, in _extract_resource
>     self.egg_name, self._parts(zip_path)
>   File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1270, in get_cache_path
>     self.extraction_error()
>   File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1250, in extraction_error
>     raise err
> pkg_resources.ExtractionError: Can't extract file(s) to egg cache
>
> The following error occurred while trying to extract file(s) to the Python egg
> cache:
>
>   [Errno 17] File exists: '/root/.cache/Python-Eggs'
>
> The Python egg cache directory is currently set to:
>
>   /root/.cache/Python-Eggs
>
> Perhaps your account does not have write access to this directory?  You can
> change the cache directory by setting the PYTHON_EGG_CACHE environment
> variable to point to an accessible directory.
> {code}
>
> There is a MWE using Docker at https://github.com/cjmay/thrift-4042 .  I
> reproduced the error on the first run, but you may need to try several runs
> to reproduce it as it is a race condition.
>
> This bug only happens when using the new accelerated protocol in thrift
> 0.10.0 and only happens when thrift has been installed as an egg.  When the
> accelerated protocol is used the fastbinary module is extracted from the
> installed thrift egg to the Python egg cache.  I believe the extraction of
> the module file itself is atomic (using the rename syscall) but the creation
> of its parent directory is non-atomic and sometimes results in an error if
> the fastbinary module is loaded simultaneously by several processes.  Note
> this only occurs the first time the module is loaded, as after that point the
> parent directory already exists.  (This is the justification for using Docker
> for the MWE.)
>
> I believe this is the same error as:
> http://dev.list.galaxyproject.org/python-egg-cache-exists-error-td4656276.html
> http://www.georgevreilly.com/blog/2015/01/28/PythonEggCache.html
>
> I believe the documentation indicates this problem can be worked around by
> setting {{zip_safe}} to {{False}} in {{setup.py}}:
> http://setuptools.readthedocs.io/en/latest/setuptools.html

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
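
For illustration, the directory-creation race described in the issue can be sketched in isolation. This is a minimal sketch and not the pkg_resources code: the helper names `ensure_dir_racy`, `ensure_dir_safe`, and `create_concurrently` are hypothetical, threads stand in for the test's processes, and it assumes the failure comes from a check-then-create sequence on the cache directory, as the "[Errno 17] File exists" message suggests.

```python
import errno
import os
import tempfile
import threading


def ensure_dir_racy(path):
    # Check-then-create: two workers can both see "missing" and both call
    # makedirs; the loser then fails with [Errno 17] File exists.
    if not os.path.isdir(path):
        os.makedirs(path)


def ensure_dir_safe(path):
    # Create unconditionally and treat "already exists as a directory"
    # as success, so losing the race is harmless.
    try:
        os.makedirs(path)
    except OSError as err:
        if err.errno != errno.EEXIST or not os.path.isdir(path):
            raise


def create_concurrently(path, create, workers=8):
    # Run `create(path)` from several threads and collect any OSError raised.
    errors = []

    def worker():
        try:
            create(path)
        except OSError as err:
            errors.append(err)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Calling `create_concurrently(path, ensure_dir_safe)` on a fresh path should always return an empty list, while the same call with `ensure_dir_racy` may occasionally return an error with errno 17, depending on scheduling.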