Andrey Mashenkov created IGNITE-17964:
-----------------------------------------

             Summary: Potential deadlock in discovery thread while updating SQL 
statistics.
                 Key: IGNITE-17964
                 URL: https://issues.apache.org/jira/browse/IGNITE-17964
             Project: Ignite
          Issue Type: Bug
          Components: sql
            Reporter: Andrey Mashenkov


On node start/activation, IgniteStatisticsConfigurationManager initializes and 
tries to clean up orphaned records (e.g. for tables that were dropped before 
node stop/crash).
To do that, the *stat-mgmt* thread updates the distributed metastorage 
synchronously under the read-lock.
Underneath, the metastorage sends a request via discovery. The discovery 
component then receives the answer to that message and gets stuck trying to 
acquire the write-lock in order to complete the future.
So the *stat-mgmt* and *disco-notify* threads inevitably deadlock: the former 
holds the read-lock while waiting for the future, and the latter cannot 
complete the future without the write-lock.
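
The lock cycle can be reproduced outside Ignite with plain JDK primitives. 
The sketch below is only an illustration, not Ignite code: the class, method 
and thread names are hypothetical, a CompletableFuture stands in for the 
metastorage future, and bounded waits replace the real indefinite blocking so 
that the example terminates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the stat-mgmt / disco-notify interaction. */
public class StatDeadlockSketch {
    /** Returns true if the synchronous wait under the read lock stalled. */
    public static boolean syncUpdateUnderReadLockStalls() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CompletableFuture<Boolean> casFut = new CompletableFuture<>();
        CountDownLatch readLockHeld = new CountDownLatch(1);

        // Plays the role of disco-notify: needs the write lock to complete the future.
        Thread disco = new Thread(() -> {
            try {
                readLockHeld.await();
                // Bounded wait for the demo; the real thread would block indefinitely.
                if (lock.writeLock().tryLock(500, TimeUnit.MILLISECONDS)) {
                    try {
                        casFut.complete(true);
                    } finally {
                        lock.writeLock().unlock();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "disco-notify-sketch");
        disco.start();

        // Plays the role of stat-mgmt: waits for the result while holding the read lock.
        lock.readLock().lock();
        try {
            readLockHeld.countDown();
            casFut.get(200, TimeUnit.MILLISECONDS);
            return false; // Not reachable here: the writer cannot get in while we hold the read lock.
        } catch (TimeoutException e) {
            return true; // The wait stalled; without the timeout this would be a deadlock.
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            lock.readLock().unlock();
            try {
                disco.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

The future can only be completed once the read lock is released, which the 
waiting thread never does on its own.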

We should avoid any synchronous operation on the distributed metastorage under 
the read-lock.

Let’s rewrite the synchronous CAS deep inside the closure (see 
IgniteStatisticsConfigurationManager.updateLocalStatistics) as an async CAS and 
pull its future up outside the closure and the read-lock.
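
A minimal sketch of the proposed shape, again using only JDK primitives: the 
closure under the read-lock merely starts the operation and hands back its 
future, and the wait happens after the lock is released. All names here are 
hypothetical stand-ins; the actual change in 
IgniteStatisticsConfigurationManager may look different.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical illustration of waiting for an async CAS outside the read lock. */
public class AsyncCasSketch {
    /** Returns the CAS result; completes because no lock is held while waiting. */
    public static boolean asyncUpdateOutsideReadLockCompletes() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CompletableFuture<Boolean> casFut = new CompletableFuture<>();
        CountDownLatch requestSent = new CountDownLatch(1);

        // Plays the role of disco-notify: takes the write lock, then completes the future.
        Thread disco = new Thread(() -> {
            try {
                requestSent.await();
                lock.writeLock().lock();
                try {
                    casFut.complete(true);
                } finally {
                    lock.writeLock().unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "disco-notify-sketch");
        disco.start();

        CompletableFuture<Boolean> fut;

        lock.readLock().lock();
        try {
            // Stand-in for an async CAS: only start the request, do not wait for it here.
            requestSent.countDown();
            fut = casFut;
        } finally {
            lock.readLock().unlock();
        }

        try {
            // The wait happens with no lock held, so the writer can make progress.
            boolean res = fut.get(5, TimeUnit.SECONDS);
            disco.join();
            return res;
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the read lock is released before the wait, the thread that completes 
the future is free to take the write lock.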




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
