[ https://issues.apache.org/jira/browse/IGNITE-20451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vyacheslav Koptilin resolved IGNITE-20451.
------------------------------------------
    Resolution: Duplicate

> Introduce WorkerRegistery
> -------------------------
>
>                 Key: IGNITE-20451
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20451
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Vyacheslav Koptilin
>            Priority: Major
>              Labels: ignite-3
>
> Each Ignite node has a number of system-critical threads. We should implement
> a periodic check that calls the failure handler when one of the following
> conditions is detected:
> - A critical thread is no longer alive.
> - A critical thread hangs for a long time, e.g. while executing a task
> extracted from the task queue.
> If a failure condition is detected, the call stacks of all threads should be
> logged before the failure handler is invoked.
> Implementations based on a separate diagnostic thread seem fragile, because
> this thread becomes a vulnerable point with respect to thread termination and
> CPU resource starvation. So we should use a self-monitoring approach: critical
> threads themselves should monitor each other.
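
For illustration only, the self-monitoring check described in the issue could be sketched roughly as follows. This is not Ignite's implementation: the class CriticalWorkerRegistry, its methods, the timeout parameter, and the Consumer-based failure handler are all hypothetical names chosen for this sketch. Each critical worker records a heartbeat between tasks and, from its own loop, checks its peers for termination or a stale heartbeat.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical registry of critical workers (illustrative, not Ignite's API). */
final class CriticalWorkerRegistry {
    /** A registered thread together with the time of its last heartbeat. */
    private static final class Worker {
        final Thread thread;
        volatile long lastHeartbeatNanos;

        Worker(Thread thread, long nowNanos) {
            this.thread = thread;
            this.lastHeartbeatNanos = nowNanos;
        }
    }

    private final Map<String, Worker> workers = new ConcurrentHashMap<>();
    private final long maxSilenceNanos;
    private final Consumer<String> failureHandler;

    /** maxSilenceMillis is an assumed threshold after which a worker counts as hung. */
    CriticalWorkerRegistry(long maxSilenceMillis, Consumer<String> failureHandler) {
        this.maxSilenceNanos = TimeUnit.MILLISECONDS.toNanos(maxSilenceMillis);
        this.failureHandler = failureHandler;
    }

    void register(String name, Thread thread) {
        workers.put(name, new Worker(thread, System.nanoTime()));
    }

    /** Called by a worker between tasks to signal that it is making progress. */
    void heartbeat(String name) {
        Worker w = workers.get(name);
        if (w != null) {
            w.lastHeartbeatNanos = System.nanoTime();
        }
    }

    /** Called by a worker from its own loop; returns the failed peer's name or null. */
    String checkPeers(String self) {
        return checkPeers(self, System.nanoTime());
    }

    /** Same check with an explicit clock reading, which keeps the logic testable. */
    String checkPeers(String self, long nowNanos) {
        for (Map.Entry<String, Worker> e : workers.entrySet()) {
            if (e.getKey().equals(self)) {
                continue; // a worker does not judge itself
            }
            Worker w = e.getValue();
            String reason = null;
            if (!w.thread.isAlive()) {
                reason = "is not alive";
            } else if (nowNanos - w.lastHeartbeatNanos > maxSilenceNanos) {
                reason = "has not reported progress in time";
            }
            if (reason != null) {
                // Log all call stacks first, then invoke the failure handler.
                System.err.print(dumpAllThreads());
                failureHandler.accept(e.getKey() + " " + reason);
                return e.getKey();
            }
        }
        return null;
    }

    /** Builds a textual dump of the call stacks of all live threads. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            sb.append("Thread ").append(e.getKey().getName()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

In such a design, a worker loop would call heartbeat(name) after each task and checkPeers(name) every so often, so no dedicated diagnostic thread is required. How often to check, and what the failure handler does, are left open here.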