[issue42849] pool worker can't be terminated

2021-01-15 Thread ppperry


ppperry added the comment:

duplicate of issue22393?

--
nosy: +ppperry

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42849] pool worker can't be terminated

2021-01-08 Thread Terry J. Reedy


Terry J. Reedy added the comment:

3.7 no longer gets bug fixes, only security fixes.  This needs to be verified
on a current release.

--
nosy: +davin, pitrou, terry.reedy




[issue42849] pool worker can't be terminated

2021-01-06 Thread Zhesi Huang


New submission from Zhesi Huang:

I have a case where a pool worker process can't be terminated. I tried
kill -SIGTERM and kill -SIGINT, but neither terminates the worker process,
so the pool's __exit__ hangs forever.

```
import multiprocessing
import os


class NonDaemonProcess(multiprocessing.Process):
    # https://stackoverflow.com/questions/6974695/python-process-pool-non-daemonic

    # make 'daemon' attribute always return False
    def _get_daemon(self):
        return False

    def _set_daemon(self, value):
        pass

    daemon = property(_get_daemon, _set_daemon)


def wrapper_copy_parallel(output_name, local_output_path, obs_path):
    log_thread('upload the content of [%s] outputs' % output_name)
    try:
        if os.path.exists(local_output_path):
            log_thread('%s has %d files to be uploaded' % (output_name, len(file_list)))

            xx

            log_thread('upload the content of [%s] outputs successfully' % output_name)
            log_thread('it can be accessed at obs path [%s]' % obs_path)
        else:
            log_thread('local output path is not found, skip upload the content of [%s] outputs' % output_name)
    except Exception as upload_exception:
        err_thread('upload the content of [%s] outputs failed: %s' % (output_name, str(upload_exception)))
        return 255

    return 0


def upload_to_s3():
    """
    upload the content of local path to s3, handle action [on_completed]
    :return:
    """
    outputs = []
    for local_output_path, (output_name, obs_path, action, _, _) in local_to_target.items():
        if action == ACTION_ON_COMPLETED:
            outputs.append((output_name, local_output_path, obs_path))

    if len(outputs) == 0:
        return 0

    with NonDaemonPool(processes=len(outputs)) as pool:
        results = pool.starmap(wrapper_copy_parallel, outputs)

    for result in results:
        if result != 0:
            return result

    return 0
```
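
The snippet above uses `NonDaemonPool` but does not show how it is defined. A minimal sketch of one common way to build such a pool, similar to the context-based answers on the Stack Overflow question linked above, might look like this. `NonDaemonContext` is a hypothetical name, and the whole block is an assumption about the setup, not the reporter's actual code.

```python
import multiprocessing
import multiprocessing.pool


class NonDaemonProcess(multiprocessing.Process):
    """Process whose 'daemon' flag always reads False, so it may have children."""

    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        # Silently ignore Pool's attempt to mark the worker as daemonic.
        pass


class NonDaemonContext(type(multiprocessing.get_context())):
    # Hypothetical name: a context of the default start method that
    # hands out NonDaemonProcess workers instead of the stock Process.
    Process = NonDaemonProcess


class NonDaemonPool(multiprocessing.pool.Pool):
    def __init__(self, *args, **kwargs):
        # Force the pool to create its workers through our context.
        kwargs["context"] = NonDaemonContext()
        super().__init__(*args, **kwargs)
```

With this in place, `with NonDaemonPool(processes=n) as pool:` behaves like a regular `Pool`, except that its workers are allowed to start child processes of their own.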

stdout/stderr log

[ma-test Service Log][INFO][2021/01/05 03:07:49,218]: registered signal handler
[ma-test Service Log][INFO][2021/01/05 03:09:40,299]: output-handler finalizing
[ma-test Service Log][INFO][2021-01-05 03:09:40,309][NonDaemonPoolWorker-1]: child process calling self.run()
[ma-test Service Log][INFO][2021-01-05 03:09:40,311][NonDaemonPoolWorker-1]: upload the content of [] outputs
[ma-test Service Log][INFO][2021-01-05 03:09:41,331][Process-1:1]: child process calling self.run()
[ma-test Service Log][INFO][2021-01-05 03:09:41,333][Process-1:2]: child process calling self.run()
[ma-test Service Log][INFO][2021-01-05 03:09:41,338][Process-1:3]: child process calling self.run()
[ma-test Service Log][INFO][2021-01-05 03:09:41,351][Process-1:1]: process shutting down
[ma-test Service Log][INFO][2021-01-05 03:09:41,351][Process-1:1]: process exiting with exitcode 0
[ma-test Service Log][INFO][2021-01-05 03:09:41,386][Process-1:2]: process shutting down
[ma-test Service Log][INFO][2021-01-05 03:09:41,386][Process-1:2]: process exiting with exitcode 0
[ma-test Service Log][INFO][2021-01-05 03:09:41,410][Process-1:3]: process shutting down
[ma-test Service Log][INFO][2021-01-05 03:09:41,410][Process-1:3]: process exiting with exitcode 0
[ma-test Service Log][INFO][2021-01-05 03:09:41,415][NonDaemonPoolWorker-1]: upload the content of [] outputs successfully
[ma-test Service Log][INFO][2021-01-05 03:09:41,415][NonDaemonPoolWorker-1]: it can be accessed at obs path [s3://ma-test-algorancher-intel/model_evaluation/6e5746ff-2839-400a-ba93-df38311415f4/dac957b0-b43b-43e2-ab19-0b45672a7ea0/]

Python stack trace of the process with pid 18

>>>
Interrupting process at following point:
  File "/home/ma-user/runtime-scripts-v2/init-container/outputs-handler.py", line 396, in <module>
    ret_code = upload_to_obs()
  File "/home/ma-user/runtime-scripts-v2/init-container/outputs-handler.py", line 287, in upload_to_obs
    results = pool.starmap(wrapper_copy_parallel, outputs)
  File "/home/ma-user/miniconda3/lib/python3.7/multiprocessing/pool.py", line 623, in __exit__
    self.terminate()
  File "/home/ma-user/miniconda3/lib/python3.7/multiprocessing/pool.py", line 548, in terminate
    self._terminate()
  File "/home/ma-user/miniconda3/lib/python3.7/multiprocessing/util.py", line 201, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/home/ma-user/miniconda3/lib/python3.7/multiprocessing/pool.py", line 617, in _terminate_pool
    p.join()
  File "/home/ma-user/miniconda3/lib/python3.7/multiprocessing/process.py", line 140, in join
    res = self._popen.wait(timeout)
  File "/home/ma-user/miniconda3/lib/python3.7/multiprocessing/popen_fork.py", line 48, in wait
    return self.poll(os.WNOHANG if timeout == 0.0 else 0)
  File "/home/ma-user/miniconda3/lib/python3.7/multiprocessing/popen_fork.py", line 28, in poll
    pid, sts = os.waitpid(self.pid, flag)
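
The trace shows `_terminate_pool` blocked in a bare `p.join()`, which ends up in a blocking `os.waitpid` on a worker that never exits. For processes you start and manage yourself, one way to avoid waiting forever is to join with a timeout and escalate to `Process.kill()` (available since Python 3.7). The sketch below is only an illustration of that pattern: `stop_process` and `grace` are hypothetical names, and nothing in this thread confirms it as the fix for this issue.

```python
def stop_process(proc, grace=5.0):
    """Ask proc to exit, wait up to `grace` seconds, then force-kill it.

    `proc` is expected to be a started multiprocessing.Process.
    Returns its exit code, or None if it is still running afterwards.
    """
    proc.terminate()       # sends SIGTERM on POSIX
    proc.join(grace)       # bounded wait instead of a bare join()
    if proc.is_alive():
        proc.kill()        # sends SIGKILL on POSIX, which cannot be caught
        proc.join(grace)
    return proc.exitcode
```

This does not change what `Pool.__exit__` does internally. It only shows the bounded-wait pattern, and it does nothing for grandchildren that the killed process may have started.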

gdb backtrace of the process with pid 18

#0  0x7f15c2c5ff7b in waitpid () from /lib/x86_64-linux-gnu/libpthread.so.0
#1