Hi everyone, I am currently investigating an interesting bug in homeassistant which is a bit hard to reproduce. I already dug so deep that I ask here, because I think the root cause has to do with how Python locates and loads modules/packages.
The "story" is: In homeassistant (short HA) some modules are installed at runtime during first start ("bootstrapping"), and for some reason loading the modules fails directly after installing the modules. Subsequent starts work fine. The modules that are installed during bootstrapping are installed to `~/.homeassistant/deps/lib/python3.7/site-packages`. Installing the modules happens in a different thread than where the modules are imported. (However as far as I can tell the importing thread waits for the install to finish.) Now the interesting thing is that when I start HA in verbose mode, I can see that Python is trying not all of the paths that are in `sys.path` (in the following snippet I cut out some stuff for brevity. Link to the full log at the end). As one can see the `~/.homeassistant/deps/lib/python3.7/site-packages` location is not tried and hence the import fails: ``` # trying /home/pyuser/.homeassistant/sqlalchemy.cpython-37m-x86_64-linux-gnu.so # trying /home/pyuser/.homeassistant/sqlalchemy.abi3.so # trying /home/pyuser/.homeassistant/sqlalchemy.so # trying /home/pyuser/.homeassistant/sqlalchemy.py # trying /home/pyuser/.homeassistant/sqlalchemy.pyc # trying /home/pyuser/sqlalchemy.cpython-37m-x86_64-linux-gnu.so [...] # trying /usr/local/lib/python3.7/sqlalchemy.cpython-37m-x86_64-linux-gnu.so [...] # trying /usr/local/lib/python3.7/lib-dynload/sqlalchemy.cpython-37m-x86_64-linux-gnu.so [...] # trying /home/pyuser/.local/lib/python3.7/site-packages/sqlalchemy.cpython-37m-x86_64-linux-gnu.so [...] # trying /usr/local/lib/python3.7/site-packages/sqlalchemy.cpython-37m-x86_64-linux-gnu.so [...] Exception in thread Recorder: Traceback (most recent call last): File "/usr/local/lib/python3.7/threading.py", line 917, in _bootstrap_inner self.run() File "/home/pyuser/.local/lib/python3.7/site-packages/homeassistant/components/recorder/__init__.py", line 187, in run from .models import States, Events File "<frozen importlib._bootstrap>", line 983, in _find_and_load File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked File "<frozen importlib._bootstrap>", line 677, in _load_unlocked File "<frozen importlib._bootstrap_external>", line 728, in exec_module File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed File "/home/pyuser/.local/lib/python3.7/site-packages/homeassistant/components/recorder/models.py", line 8, in <module> from sqlalchemy import ( File "<frozen importlib._bootstrap>", line 983, in _find_and_load File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked ModuleNotFoundError: No module named 'sqlalchemy' ``` I also checked that this path is also really in `sys.path` at the time the import happens. This is `sys.path` printed out directly before the `import sqlalchemy statement`: ``` ['/home/pyuser/.homeassistant', '/home/pyuser/.homeassistant/deps/lib/python3.7/site-packages', '/home/pyuser', '/usr/local/lib/python37.zip', '/usr/local/lib/python3.7', '/usr/local/lib/python3.7/lib-dynload', '/home/pyuser/.local/lib/python3.7/site-packages', '/usr/local/lib/python3.7/site-packages'] ``` What I already tried is to do a importlib.invalidate_caches() after installing the modules. However this does not fix it. So what is happening here? Why does the import skip `~/.homeassistant/deps/lib/python3.7/site-packages`? Any tips on how I can investigate further? Link to the full log: https://gist.github.com/StefanD986/fbf08426ceaad657929b973cd5add25a#file-homeassistant_verbose_first_start-txt-L5787 -- https://mail.python.org/mailman/listinfo/python-list