[ 
https://issues.apache.org/jira/browse/AURORA-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998865#comment-15998865
 ] 

Mehrdad Nurolahzade commented on AURORA-1869:
---------------------------------------------

{{TaskStatusHandlerImpl}} acquires the {{LogStorage}} write lock to process 
every status update received from the Mesos master, so during implicit and 
explicit reconciliation the lock is acquired once per task in the cluster 
(tens of thousands of times in our cluster). 

According to data extracted from one of our production clusters, over 99.9% of 
reconciliation status update events are in fact {{NOOP}} status updates (as 
described in the issue). The storage write lock contention induced by these 
status updates can be eliminated by adopting the double-checked locking 
pattern (as was done in [AURORA-1820]).
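
A minimal sketch of that double-checked pattern, assuming the storage is 
guarded by a read-write lock. {{GuardedTaskStore}} and its methods are 
hypothetical stand-ins for illustration, not Aurora's actual storage API: the 
update is first compared against the stored state under the shared read lock, 
and the write lock is taken (and the check repeated) only when the update is 
not a {{NOOP}}.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the scheduler's task storage (not Aurora's API).
class GuardedTaskStore {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, String> states = new HashMap<>();
  private int writeLockAcquisitions = 0;

  /** Returns true if the stored state changed, false if the update was a NOOP. */
  boolean applyStatusUpdate(String taskId, String reportedState) {
    // First check: shared read lock, so NOOP updates never contend for the write lock.
    lock.readLock().lock();
    try {
      if (reportedState.equals(states.get(taskId))) {
        return false;
      }
    } finally {
      lock.readLock().unlock();
    }

    // Second check: the state may have changed between releasing the read lock
    // and acquiring the write lock, so re-validate before mutating.
    lock.writeLock().lock();
    try {
      writeLockAcquisitions++;
      if (reportedState.equals(states.get(taskId))) {
        return false;
      }
      states.put(taskId, reportedState);
      return true;
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Number of times the write lock was taken; exposed only to illustrate the saving. */
  int writeLockAcquisitions() {
    lock.readLock().lock();
    try {
      return writeLockAcquisitions;
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

With this shape, a reconciliation pass in which nearly every update matches 
the stored state takes the write lock only for the few updates that actually 
change something.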

This explains why combining reconciliation status update processing with 
other expensive processes such as snapshotting can be fatal for the scheduler. 
Because the lock is not fair, it does not guarantee any particular access 
order. Therefore, a snapshot waiting for the write lock may keep its structures 
on the heap for a few seconds before they can be written to {{LogStorage}} and 
garbage collected.

> Investigate the status update processing overhead
> -------------------------------------------------
>
>                 Key: AURORA-1869
>                 URL: https://issues.apache.org/jira/browse/AURORA-1869
>             Project: Aurora
>          Issue Type: Task
>          Components: Scheduler
>            Reporter: Mehrdad Nurolahzade
>            Priority: Minor
>
> There is a peculiar similarity between the number of task status 
> update events received from Mesos and the number of JVM threads started by 
> the system 
> ([graphview|http://192.168.33.7:8081/graphview?query=rate(jvm_threads_started)%0Arate(scheduler_status_update_events)]).
> It seems like a new thread is started every time a status update event is 
> processed.
> {{TaskStatusHandlerImpl}} is a single-threaded service, therefore it should 
> not instantiate new threads. Looking at status update reasons/results, the 
> majority of status updates are associated with {{RECONCILIATION}} and should 
> result in {{NOOP}}. Therefore, they should have no impact on the internal 
> workers. The task state machine should short-circuit and return upon 
> realizing that the task’s reported new state matches the scheduler’s view.
> {code:title=TaskStateMachine.updateState()}
> if (stateMachine.getState() == taskState) {
>   return new TransitionResult(NOOP, ImmutableSet.of());
> }
> {code}
> Given the volume of status update events received upon reconciliation, this 
> overhead should be avoided if possible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
