Andrey Mashenkov created IGNITE-17964:
-----------------------------------------

             Summary: Potential deadlock in discovery thread while updating SQL statistics.
                 Key: IGNITE-17964
                 URL: https://issues.apache.org/jira/browse/IGNITE-17964
             Project: Ignite
          Issue Type: Bug
          Components: sql
            Reporter: Andrey Mashenkov


On node start/activation, IgniteStatisticsConfigurationManager initializes and tries to clean up orphaned records (e.g. for tables that were dropped before the node stopped or crashed).
To do that, the *stat-mgmt* thread updates the distributed metastorage synchronously while holding the read lock.
Underneath, the metastorage sends a request via discovery. The discovery component then receives the response to that message and gets stuck trying to acquire the write lock in order to complete the future.
So the *stat-mgmt* thread holds the read lock while waiting for the future, and the *disco-notify* thread cannot complete the future until it gets the write lock: the two threads inevitably deadlock.

We should avoid any synchronous operation on the distributed metastorage under the read lock.
Let's replace the synchronous CAS deep inside the closure (see IgniteStatisticsConfigurationManager.updateLocalStatistics) with an async CAS, and pull its future up outside both the closure and the read lock.
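
The proposed fix can be illustrated with a minimal, self-contained sketch. It does not use Ignite's actual classes or API: `AsyncCasSketch`, `compareAndSetAsync` and `updateUnderReadLock` are hypothetical names, the in-memory map stands in for the distributed metastorage, and a single-threaded executor stands in for the *disco-notify* thread. It only shows the general pattern of starting the CAS under the read lock and waiting for its future after the lock is released.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the statistics manager; not Ignite code. */
class AsyncCasSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Stand-in for the distributed metastorage. */
    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    /** Stand-in for the disco-notify thread (daemon, so it never blocks JVM exit). */
    private final ExecutorService discoNotify = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "disco-notify");
        t.setDaemon(true);
        return t;
    });

    /** Assumed async CAS: the result is completed on disco-notify under the write lock. */
    CompletableFuture<Boolean> compareAndSetAsync(String key, String expected, String updated) {
        CompletableFuture<Boolean> fut = new CompletableFuture<>();
        discoNotify.submit(() -> {
            lock.writeLock().lock(); // Blocks while any thread holds the read lock.
            try {
                boolean ok = expected == null
                    ? store.putIfAbsent(key, updated) == null
                    : store.replace(key, expected, updated);
                fut.complete(ok);
            } finally {
                lock.writeLock().unlock();
            }
        });
        return fut;
    }

    /** Starts the CAS under the read lock, but waits for it only after unlocking. */
    boolean updateUnderReadLock(String key, String expected, String updated) {
        CompletableFuture<Boolean> fut;
        lock.readLock().lock();
        try {
            // Calling fut.get() here would deadlock: disco-notify needs the
            // write lock to complete the future, and we still hold the read lock.
            fut = compareAndSetAsync(key, expected, updated);
        } finally {
            lock.readLock().unlock();
        }
        try {
            return fut.get(5, TimeUnit.SECONDS); // Safe: no lock is held while waiting.
        } catch (Exception e) {
            throw new IllegalStateException("CAS did not complete", e);
        }
    }

    public static void main(String[] args) {
        AsyncCasSketch sketch = new AsyncCasSketch();
        System.out.println(sketch.updateUnderReadLock("tbl", null, "v1"));
    }
}
```

Moving the `fut.get(...)` call inside the `try` block that holds the read lock reproduces the hang described above, which in this sketch surfaces as a timeout after five seconds.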