leventov edited a comment on issue #6028: Error in SqlMetadataRuleManagerTest
URL: 
https://github.com/apache/incubator-druid/issues/6028#issuecomment-406756225
 
 
   I think the problem is that `SQLMetadataRuleManager.poll()` is scheduled to 
be called asynchronously in a dedicated executor in 
`SQLMetadataRuleManager.start()`. `poll()` makes an inner join of the rules 
table with itself, that appears to be enough to cause a deadlock when the drop 
of this table is called in parallel (it is called in 
`SQLMetadataRuleManagerTest.cleanup()`.
   
   The problem was introduced in #5554, where the 
`SQLMetadataRuleManagerTest.testMultipleStopAndStart()` was added which calls 
`start()` and `stop()` repeatedly. Before, `SQLMetadataRuleManager.start()` was 
not ever called in the context of `SQLMetadataRuleManagerTest`.
   
   Extra coincidental problem that leads to this condition is that `stop()` 
(that pairs `start()` calls in `testMultipleStopAndStart()`) is not actually 
synchronous. By the time when `stop()` exists, some `poll()` may still be 
executed. To ensure this is not the case, `exec.awaitTermination()` should be 
called. But this is awkward, because we don't know what timeout we should use. 
Alternatively, `poll()` could be synchronized together with `start()` and 
`stop()` (the `started` flag should be checked under the lock in `poll()`).
   
   Extra:
    - There is no point of `future` and `exec` being volatile.
    - `future.cancel` is redundant before `shutdownNow()`.
    - There is a race on `retryStartTime` updates in `poll()`.
    - It feels to me that `SQLMetadataRuleManager` mixes two separate 
abstractions - one of scheduled polling, and another of the access to the 
database itself (including the implementation of `poll()`). But I'm uncertain 
about this.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org

Reply via email to