[ 
https://issues.apache.org/jira/browse/CASSANDRA-12146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-12146:
------------------------------------------
    Fix Version/s:     (was: 3.9)
                   3.8

> Use dedicated executor for sending JMX notifications
> ----------------------------------------------------
>
>                 Key: CASSANDRA-12146
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12146
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Observability
>            Reporter: Stefan Podkowinski
>            Assignee: Stefan Podkowinski
>             Fix For: 2.2.8, 3.0.9, 3.8
>
>         Attachments: 12146-2.2.patch
>
>
> I'm currently looking into an issue with our repair process where we 
> notice a significant delay at the end of the repair task, before nodetool 
> actually terminates. At the same time, JMX NOTIF_LOST errors are reported 
> in nodetool during most repair runs.
> Currently {{StorageService.repairAsync(keyspace, options)}} is called through 
> JMX, which will start a new thread executing RepairRunnable using the 
> provided options. StorageService itself extends 
> NotificationBroadcasterSupport and will send JMX progress notifications 
> emitted from RepairRunnable (or during bootstrap). If you take a closer look 
> at {{RepairRunnable}}, {{JMXProgressSupport}} and 
> {{StorageService/NotificationBroadcasterSupport.sendNotification}} you'll 
> notice that this all happens within the calling thread, i.e. RepairRunnable. 
> Given the lost notifications and all kinds of potential networking-related 
> issues, I'm not really comfortable having the repair coordinator thread 
> running in the JMX stack. Fortunately, NotificationBroadcasterSupport accepts 
> a custom executor as a constructor argument. See the attached patch.
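
For illustration, here is a minimal sketch of the idea described above. It is not the attached patch. The class names, thread name and notification type are hypothetical; only the JDK's NotificationBroadcasterSupport(Executor) constructor and sendNotification are real API. With no executor supplied, listeners are invoked on the thread that calls sendNotification; with a dedicated executor, delivery is handed off to that executor's thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;

public class DedicatedNotificationExecutorSketch {

    // Hypothetical stand-in for a broadcaster such as StorageService.
    static class ProgressBroadcaster extends NotificationBroadcasterSupport {
        ProgressBroadcaster(ExecutorService executor) {
            // Listener callbacks are dispatched through this executor
            // instead of running on the thread calling sendNotification().
            super(executor);
        }
    }

    // Sends one notification and returns the name of the thread
    // on which the listener was invoked.
    static String deliverOnce() {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "jmx-notification-sketch"); // assumed name
            t.setDaemon(true);
            return t;
        });
        try {
            ProgressBroadcaster broadcaster = new ProgressBroadcaster(executor);
            AtomicReference<String> deliveryThread = new AtomicReference<>();
            CountDownLatch delivered = new CountDownLatch(1);

            broadcaster.addNotificationListener((notification, handback) -> {
                deliveryThread.set(Thread.currentThread().getName());
                delivered.countDown();
            }, null, null);

            // The caller (think: the repair thread) returns right after
            // queueing the job; a slow listener no longer blocks it.
            broadcaster.sendNotification(
                new Notification("progress", "repair-sketch", 1L, "repair progress"));

            if (!delivered.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("notification was not delivered");
            }
            return deliveryThread.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("listener ran on: " + deliverOnce());
    }
}
```

A single-threaded executor is used here so that notifications are still delivered in the order they were sent; whether the actual patch makes the same choice is not stated in this message.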



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
