robomics commented on issue #44227: URL: https://github.com/apache/arrow/issues/44227#issuecomment-2424065785
Hi @amoeba, apologies for not coming back to you earlier.

I understand your concerns regarding deleting things on behalf of users without their explicit consent. That being said, I still believe it would be useful to provide an automated way for users to fix their python environment when the environment is broken due to updating pyarrow.

These are my reasons:

- While it is true that the docs state that `create_library_symlinks()` should be run only once, they do not make it clear that installing pyarrow -> `create_library_symlinks()` -> upgrading pyarrow will lead to broken setups on *NIX systems.
- Manually fixing broken environments cannot be done easily (with e.g. `rm /tmp/venv/lib/python3.*/site-packages/pyarrow/libarrow*.so`), because that would delete some shared libs that are not symlinks (e.g. `libarrow_python.so`). The simplest way I can think of to safely remove dangling symlinks with shell commands is something like `find "$(python -c 'import os, pyarrow; print(os.path.dirname(pyarrow.__file__), end="")')" -type l -delete`. This works fine, but I am not sure it is a good recommendation to give to every user.
- Manually uninstalling and re-installing pyarrow with pip will not fix the broken links (because pip refuses to delete files that it has not created).
- When linking against e.g. `libarrow` shipped with the wheels, dangling links can lead to misleading errors.

When I stumbled on this issue for the first time I was relying on CMake's `find_library()` to get the path to the wheels' `libarrow` (see [here](https://github.com/paulsengroup/hictkpy/blob/af153ef7b67010779e7e98d6c02c4b9da4376be1/cmake/modules/FindPyarrow.cmake) for more context). `find_library()` was failing claiming that `libarrow.so` didn't exist, even though a file with that name indeed existed on my machine. My first reaction was deleting and re-creating my dev venv, which solved the problem for a few hours.
Then something upgraded the pyarrow version installed in my venv and I was back at square one. Figuring out that the problem was a few dangling symlinks took more time than I'd like to admit :D.

How about we change the signature of `create_library_symlinks()` to something like:

```python3
def create_library_symlinks(repair_broken_symlinks: bool = False):
    # impl
    ...
```

and then we update the implementation to unlink dangling symlinks only when `repair_broken_symlinks=True`.

It may also be worth updating the docs to explain that up/downgrading pyarrow after calling `create_library_symlinks()` leads to broken symlinks that can cause all sorts of problems, and that this can be fixed by calling `create_library_symlinks(repair_broken_symlinks=True)` (or by manually uninstalling pyarrow, deleting its package folder, reinstalling pyarrow, and finally calling `create_library_symlinks()`).
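To make the repair step concrete, here is a minimal sketch of how dangling symlinks could be detected and removed only on explicit opt-in. The helper names `find_dangling_symlinks` and `remove_dangling_symlinks` are hypothetical and this is not pyarrow's actual implementation; the sketch only assumes that the links live directly inside the package directory.

```python
from pathlib import Path


def find_dangling_symlinks(directory):
    """Return the symlinks directly inside `directory` whose target is missing."""
    return sorted(
        entry
        for entry in Path(directory).iterdir()
        # is_symlink() inspects the link itself, while exists() follows it,
        # so a dangling link is a symlink that does not "exist".
        if entry.is_symlink() and not entry.exists()
    )


def remove_dangling_symlinks(directory, repair_broken_symlinks=False):
    """Report dangling symlinks; unlink them only when explicitly requested.

    Hypothetical helper: the flag mirrors the opt-in parameter proposed above.
    Regular files and valid symlinks are never touched.
    """
    dangling = find_dangling_symlinks(directory)
    if repair_broken_symlinks:
        for link in dangling:
            link.unlink()
    return dangling
```

Calling `remove_dangling_symlinks(os.path.dirname(pyarrow.__file__))` would only list the broken links, so nothing is deleted without the user passing `repair_broken_symlinks=True`. Unlike the `find ... -type l -delete` one-liner, it also leaves symlinks that still resolve in place.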
