[ https://issues.apache.org/jira/browse/ARROW-17319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575884#comment-17575884 ]
Antoine Pitrou commented on ARROW-17319:
----------------------------------------

cc [~kou]

> pyarrow seems to set default CPU affinity to 0 on shutdown, crashes if CPU 0 is not available
> ---------------------------------------------------------------------------------------------
>
>                 Key: ARROW-17319
>                 URL: https://issues.apache.org/jira/browse/ARROW-17319
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 9.0.0
>         Environment: Ubuntu 20.02 / Python 3.8.10 (default, Jun 22 2022, 20:18:18)
> $ pip list
> Package         Version
> --------------- -------
> numpy           1.23.1
> pandas          1.4.3
> pip             20.0.2
> pkg-resources   0.0.0
> pyarrow         9.0.0
> python-dateutil 2.8.2
> pytz            2022.1
> setuptools      44.0.0
> six             1.16.0
>            Reporter: Mike Gevaert
>            Priority: Major
>
> I get the following traceback when exiting python after loading
> {{pyarrow.parquet}}
> {code}
> Python 3.8.10 (default, Jun 22 2022, 20:18:18)
> [GCC 9.4.0] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> os.getpid()
> 25106
> >>> import pyarrow.parquet
> >>>
> Fatal error condition occurred in /opt/vcpkg/buildtrees/aws-c-io/src/9e6648842a-364b708815.clean/source/event_loop.c:72: aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, el_group, &thread_options) == AWS_OP_SUCCESS
> Exiting Application
> ################################################################################
> Stack trace:
> ################################################################################
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x200af06) [0x7f831b2b3f06]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x20028e5) [0x7f831b2ab8e5]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x1f27e09) [0x7f831b1d0e09]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x200ba3d) [0x7f831b2b4a3d]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x1f25948) [0x7f831b1ce948]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x200ba3d) [0x7f831b2b4a3d]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x1ee0b46) [0x7f831b189b46]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x194546a) [0x7f831abee46a]
> /lib/x86_64-linux-gnu/libc.so.6(+0x468a7) [0x7f831c6188a7]
> /lib/x86_64-linux-gnu/libc.so.6(on_exit+0) [0x7f831c618a60]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfa) [0x7f831c5f608a]
> {code}
> To replicate this, one needs to make sure that CPU 0 isn't available to
> schedule tasks on. In our HPC environment, that happens due to Slurm using
> cgroups to constrain CPU usage.
> On a Linux workstation, one should be able to:
> 1) open python as a normal user
> 2) get the pid
> 3) as root:
> {code}
> cd /sys/fs/cgroup/cpuset/
> mkdir pyarrow
> cd pyarrow
> echo 0 > cpuset.mems
> echo 1 > cpuset.cpus # sets the cgroup to only have access to cpu 1
> echo $PID > tasks
> {code}
> Then, in the python environment:
> {code}
> import pyarrow.parquet
> exit()
> {code}
> which should trigger the crash.
> Sadly, I couldn't track down which versions of {{aws-c-common}} and
> {{aws-c-io}} are being used for the 9.0.0 py38 manylinux wheels.
> (libarrow.so.900 has BuildID[sha1]=dd6c5a2efd5cacf09657780a58c40f7c930e4df1)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
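
Before importing {{pyarrow.parquet}} in the reproduction steps quoted above, it may help to confirm from inside the Python session that the cgroup change actually took effect and CPU 0 is no longer schedulable. The following is only a minimal sketch, assuming Linux (where {{os.sched_getaffinity}} exists); the helper names {{allowed_cpus}} and {{cpu0_excluded}} are hypothetical and are not part of pyarrow or the original report.

```python
import os


def allowed_cpus(pid=0):
    """Return the set of CPU indices the process may run on (pid 0 = this process)."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: this should reflect restrictions applied through a cpuset cgroup.
        return set(os.sched_getaffinity(pid))
    # Assumption: on platforms without sched_getaffinity, fall back to all CPUs.
    return set(range(os.cpu_count() or 1))


def cpu0_excluded(cpus):
    """True if CPU 0 is not in the given set of schedulable CPUs."""
    return 0 not in cpus


if __name__ == "__main__":
    cpus = allowed_cpus()
    print("schedulable CPUs:", sorted(cpus))
    print("CPU 0 excluded:", cpu0_excluded(cpus))
```

If "CPU 0 excluded" prints False after step 3, the process was probably not moved into the new cgroup, and the crash would not be expected to reproduce.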