tkonolige commented on PR #11434:
URL: https://github.com/apache/tvm/pull/11434#issuecomment-1137505533
I think something between our environments is different :). Here is what I
get from your script (I modified it a little so it went slower):
```
[09:23:44] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 1 iter1.1259e+15
[09:23:44] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 2 iter2.2518e+15
[09:23:44] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 3 iter3.3777e+15
[09:23:44] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 4 iter4.5036e+15
[09:23:44] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 5 iter5.6295e+15
[09:23:44] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 6 iter6.7554e+15
[09:23:44] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 7 iter7.8813e+15
[09:23:44] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 8 iter9.0072e+15
[09:23:44] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 9 iter1.01331e+16
[09:23:44] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 10 iter1.1259e+16
[09:23:44] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 1 iter1.1259e+16
[09:23:44] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 2 iter2.2518e+16
[09:23:45] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 3 iter3.3777e+16
[09:23:45] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 4 iter4.5036e+16
[09:23:45] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 5 iter5.6295e+16
[09:23:45] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 6 iter6.7554e+16
[09:23:45] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 7 iter7.8813e+16
-------------------------- snip -----------------------------------
[09:23:47] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 96 iter1.08086e+18
[09:23:47] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 97 iter1.09212e+18
[09:23:47] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 98 iter1.10338e+18
[09:23:47] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 99 iter1.11464e+18
[09:23:47] /home/tristan/octoml/tvm/src/support/ffi_testing.cc:184: Finished counting for 100 iter1.1259e+18
Traceback (most recent call last):
  File "timer-debug.py", line 11, in <module>
    proc.recv()
  File "/home/tristan/octoml/tvm/python/tvm/contrib/popen_pool.py", line 297, in recv
    raise TimeoutError()
TimeoutError
```
The timeout error only occurs after the C++ function finishes.
This is Python 3.8.12 on Pop!_OS 21.04.
> Here is why: python's multi-threading is backed by real system threads and
> they are guarded by GIL. So although one thread enters the FFI as a long
> running function, another thread(the watcher) can continue to run in python
> interpreter (as the GIL has been released by the long running function) without
> a problem, as a result the timeout signal back to the parent process will
> continue to function, then the parent process signals kill to the popen worker.
Is this still true in the case where we call Python -> C++ -> Python -> C++?
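
To make the quoted mechanism concrete, here is a minimal sketch of a watcher thread timing a long-running call that releases the GIL. It is not the `popen_pool` implementation: `run_with_watcher` and `gil_releasing_call` are hypothetical names, `time.sleep` stands in for a long-running FFI function, and the watcher only records the timeout instead of signalling a parent process. It also does not model the nested call chain asked about above.

```python
import threading
import time


def run_with_watcher(long_running, timeout_sec):
    """Run long_running() on the calling thread while a watcher thread times it.

    Returns (result, fired_after), where fired_after is the number of seconds
    after which the watcher noticed the timeout, or None if the call finished
    in time. Hypothetical helper, not part of TVM.
    """
    finished = threading.Event()
    fired = []  # filled in by the watcher if the timeout elapses
    start = time.monotonic()

    def watcher():
        # Event.wait returns False when the timeout elapses before the call
        # finishes. A real worker would notify the parent process here.
        if not finished.wait(timeout_sec):
            fired.append(time.monotonic() - start)

    thread = threading.Thread(target=watcher, daemon=True)
    thread.start()
    try:
        result = long_running()
    finally:
        finished.set()
        thread.join()
    return result, (fired[0] if fired else None)


def gil_releasing_call():
    # Assumption: stands in for a long-running FFI call. time.sleep releases
    # the GIL while it blocks, so the watcher thread is free to run.
    time.sleep(0.5)
    return "done"


result, fired_after = run_with_watcher(gil_releasing_call, timeout_sec=0.05)
print(result, fired_after is not None and fired_after < 0.5)
```

If the watcher fires well before the blocking call returns, the GIL was released during the call. A watcher that only fires once the call has returned would match the behaviour in the log above, where the `TimeoutError` shows up only after the last iteration.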