Antoni Segura Puimedon has posted comments on this change. Change subject: threadpool: docs: add design notes ......................................................................
Patch Set 1: Code-Review-1 (12 comments) http://gerrit.ovirt.org/#/c/29190/1/lib/threadpool/README.rst File lib/threadpool/README.rst: Line 8: Description Line 9: =========== Line 10: Line 11: This is the implementation of a plain thread pool with a few additions to Line 12: handle blocking task, and compatible with `Futures`_. s/`Futures`/Python Futures/ Line 13: Line 14: Rationale Line 15: ========= Line 16: Line 15: ========= Line 16: Line 17: Usually, the canonical way to deal with blocking I/O is to make the I/O channel Line 18: (usually file or socket handles) not-blocking. Line 19: However, it is possible that the I/O operations are encapsulated in a third party this line goes over 80chars. Also: it is possible that the I/O operations that are encapsulated in a third party library do not provide a non-blocking version. Line 20: library that may not be make use of not-blocking I/O. Line 21: Line 22: If that is the case, the solution pool to deal with this API is quickly Line 23: deplenished because most form of event loop/reactor patterns requires Line 35: A common solution is to replenish the worker pool adding new threads to Line 36: replace the blocked ones. In some circumstances, this may lead to Line 37: a horde of zombie threads leaked (consider the example of network Line 38: being unreachable for a broken cable). Line 39: Moreover, the client code may be want to be notified if a task is detected s/may be want to be/may want to be/ Line 40: as stuck (for some definition of 'stuck') and possibly take some Line 41: countermeasures. Line 42: Line 43: WatchedThreadPool provides a solution for the above scenario, by implementing Line 49: Line 50: The countermeasures implemented in WatchedThreadPool are the following. Line 51: Line 52: 1. each worker takes a task exclusively. Line 53: each task must be took exactly by one worker and exactly once for each s/must be took/must be taken/ Line 54: request. Thus, at any given time at most one worker thread can get stuck Line 55: on an unresponsive task. Line 56: In the case of a periodic task, the task is reinjected in the work queue Line 57: by the worker thread itself, once it is done. Line 53: each task must be took exactly by one worker and exactly once for each Line 54: request. Thus, at any given time at most one worker thread can get stuck Line 55: on an unresponsive task. Line 56: In the case of a periodic task, the task is reinjected in the work queue Line 57: by the worker thread itself, once it is done. This last sentence is not very clear. Please, define what happens with the previous instance of the periodic task. Line 58: Line 59: 2. each worker thread is made capable to answer to a couple of simple yet Line 60: important queries: Line 61: what are you doing? Line 58: Line 59: 2. each worker thread is made capable to answer to a couple of simple yet Line 60: important queries: Line 61: what are you doing? Line 62: how long is this task taking? Make this two queries an unordered list. Line 63: Line 64: 3. a watcher thread is added to the pool. Line 65: the watcher thread does not do any real work, but instead periodically Line 66: checks the health of each active worker thread, and detects long running tasks. Line 62: how long is this task taking? Line 63: Line 64: 3. a watcher thread is added to the pool. Line 65: the watcher thread does not do any real work, but instead periodically Line 66: checks the health of each active worker thread, and detects long running tasks. This line goes over 80 characters. Line 67: Then the watcher can implement any stuck-detection policy. Line 68: The simplest one is to consider a thread 'stuck' if any task elaboration Line 69: time exceeds a given threashold Line 70: Line 65: the watcher thread does not do any real work, but instead periodically Line 66: checks the health of each active worker thread, and detects long running tasks. Line 67: Then the watcher can implement any stuck-detection policy. Line 68: The simplest one is to consider a thread 'stuck' if any task elaboration Line 69: time exceeds a given threashold s/threashold/threshold/ Line 70: Line 71: 4. the worker threads detected as 'stuck' are transparently detached by the Line 72: pool and replaced. Line 73: This is transparently done by the watcher thread, which ensures the pool Line 72: pool and replaced. Line 73: This is transparently done by the watcher thread, which ensures the pool Line 74: has a constant active workforce. The threads detected as stuck are taken out Line 75: the pool and put into a limbo until they eventually unblock. Line 76: The task detected as blocked is *not* automatically retried. This is Tasks detected as blocked are *not* automatically retried. Line 77: very important because, coupled with point #1, avoids the lemmings effect Line 78: with more and more worker threads frozen attempting to do a blocking task. Line 79: Line 80: With all in place, WatchedThreadPool can effectively minimize the waste Line 74: has a constant active workforce. The threads detected as stuck are taken out Line 75: the pool and put into a limbo until they eventually unblock. Line 76: The task detected as blocked is *not* automatically retried. This is Line 77: very important because, coupled with point #1, avoids the lemmings effect Line 78: with more and more worker threads frozen attempting to do a blocking task. s/with/of/ Line 79: Line 80: With all in place, WatchedThreadPool can effectively minimize the waste Line 81: of resource, and can deal with unresponsive tasks gracefully. Line 82: Line 77: very important because, coupled with point #1, avoids the lemmings effect Line 78: with more and more worker threads frozen attempting to do a blocking task. Line 79: Line 80: With all in place, WatchedThreadPool can effectively minimize the waste Line 81: of resource, and can deal with unresponsive tasks gracefully. s/resource/resources/ Line 82: Line 83: Futures Line 84: ======= Line 85: Line 88: A backport_ for python 2.x is available as well. Line 89: WatchedThreadPool provides an integration module to work with this package Line 90: seamlessly. Line 91: Line 92: .. _concurrent.futures: https://docs.python.org/3/library/concurrent.futures.html more than 80 characters line. Line 93: .. _backport: https://pypi.python.org/pypi/futures Line 94: Line 95: Line 96: Tests -- To view, visit http://gerrit.ovirt.org/29190 To unsubscribe, visit http://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I400799e300f5d012dae5d158c4379fe57db1bd37 Gerrit-PatchSet: 1 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Francesco Romani <[email protected]> Gerrit-Reviewer: Antoni Segura Puimedon <[email protected]> Gerrit-Reviewer: Dan Kenigsberg <[email protected]> Gerrit-Reviewer: Dima Kuznetsov <[email protected]> Gerrit-Reviewer: Federico Simoncelli <[email protected]> Gerrit-Reviewer: Michal Skrivanek <[email protected]> Gerrit-Reviewer: Nir Soffer <[email protected]> Gerrit-Reviewer: Saggi Mizrahi <[email protected]> Gerrit-Reviewer: Vinzenz Feenstra <[email protected]> Gerrit-Reviewer: Yaniv Bronhaim <[email protected]> Gerrit-Reviewer: [email protected] Gerrit-HasComments: Yes _______________________________________________ vdsm-patches mailing list [email protected] https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches
