Aaron Gresch created STORM-3602:
-----------------------------------

             Summary: loadaware shuffle can overload local worker
                 Key: STORM-3602
                 URL: https://issues.apache.org/jira/browse/STORM-3602
             Project: Apache Storm
          Issue Type: Bug
            Reporter: Aaron Gresch
            Assignee: Aaron Gresch


We saw a worker become overloaded and tuples time out with loadaware shuffle 
enabled.  On investigation, we found that the code allows switching from host 
local to worker local whenever the load average is below the low water mark.  
It should be checking the load on the worker instead. 

 

What's happening is that the worker is overloaded while the host has many idle 
host local tasks, so the grouping switches to HOST_LOCAL.  The average load 
calculated across all of the host's tasks is then below the low water mark, so 
it immediately switches back to the overloaded worker local task.
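
The flapping described above can be reproduced with a minimal sketch. This is 
not the Storm implementation: the class, the method names, the two-value Scope 
enum, the watermark values and the load numbers are all hypothetical and chosen 
only to illustrate the two checks (averaging over the host's tasks versus 
checking the worker's own load before narrowing the scope).

```java
import java.util.List;

public class ScopeSwitchSketch {
    // Hypothetical two-level scope; only the levels named in the report.
    enum Scope { WORKER_LOCAL, HOST_LOCAL }

    static double average(List<Double> loads) {
        return loads.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
    }

    // Check as described in the report: narrowing back to WORKER_LOCAL is
    // decided from the average load over all of the host's tasks.
    static Scope nextScopeReported(Scope current, List<Double> workerLoads,
                                   List<Double> hostLoads, double low, double high) {
        if (current == Scope.WORKER_LOCAL) {
            return average(workerLoads) > high ? Scope.HOST_LOCAL : Scope.WORKER_LOCAL;
        }
        return average(hostLoads) < low ? Scope.WORKER_LOCAL : Scope.HOST_LOCAL;
    }

    // Check suggested in the report: narrowing back to WORKER_LOCAL is
    // decided from the load on the worker's own tasks.
    static Scope nextScopeSuggested(Scope current, List<Double> workerLoads,
                                    List<Double> hostLoads, double low, double high) {
        if (current == Scope.WORKER_LOCAL) {
            return average(workerLoads) > high ? Scope.HOST_LOCAL : Scope.WORKER_LOCAL;
        }
        return average(workerLoads) < low ? Scope.WORKER_LOCAL : Scope.HOST_LOCAL;
    }

    public static void main(String[] args) {
        // Assumed numbers: one overloaded worker-local task, nine idle tasks
        // elsewhere on the same host.
        List<Double> workerLoads = List.of(0.95);
        List<Double> hostLoads = List.of(0.95, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0);
        double low = 0.2;   // assumed low water mark
        double high = 0.8;  // assumed high water mark

        Scope widened = nextScopeReported(Scope.WORKER_LOCAL, workerLoads, hostLoads, low, high);
        System.out.println("after overload: " + widened);
        System.out.println("reported check:  "
            + nextScopeReported(widened, workerLoads, hostLoads, low, high));
        System.out.println("suggested check: "
            + nextScopeSuggested(widened, workerLoads, hostLoads, low, high));
    }
}
```

With these assumed numbers, the host-wide average is pulled below the low water 
mark by the idle tasks, so the reported check returns to WORKER_LOCAL right 
after leaving it, while the suggested check stays at HOST_LOCAL for as long as 
the worker itself is still loaded.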

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
