Stig Rohde Døssing created STORM-1750:
-----------------------------------------

             Summary: Report-error-and-die may not kill the worker
                 Key: STORM-1750
                 URL: https://issues.apache.org/jira/browse/STORM-1750
             Project: Apache Storm
          Issue Type: Bug
    Affects Versions: 0.10.0, 1.0.0, 2.0.0
            Reporter: Stig Rohde Døssing
            Assignee: Stig Rohde Døssing
            Priority: Critical


The report-error-and-die function in executor.clj calls report-error, which can 
throw exceptions if Curator runs into any kind of trouble while registering the 
error. I suspect this may happen with network errors, but it can also happen if 
two executors for the same component throw exceptions at the same time and no 
errors have been registered for the component previously. This is because both 
calls to report-error-and-die update the lastErrorPath: each can see that the 
node is missing and try to create it, and ZkStateStorage set_data doesn't catch 
the NodeExistsException that the losing create call may throw.
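
To illustrate the race, here is a minimal sketch of a check-then-create write 
that tolerates losing to a concurrent writer. It is not Storm's ZkStateStorage 
and does not use Curator: the class SetDataRaceSketch, its in-memory map, and 
the local NodeExistsException are all hypothetical stand-ins chosen so the 
example is self-contained.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for a ZooKeeper-backed store.
class SetDataRaceSketch {
    // Local stand-in for ZooKeeper's NodeExistsException.
    static class NodeExistsException extends Exception {
        NodeExistsException(String path) {
            super("node already exists: " + path);
        }
    }

    private final ConcurrentMap<String, byte[]> nodes = new ConcurrentHashMap<>();

    boolean exists(String path) {
        return nodes.containsKey(path);
    }

    // Behaves like a create call: fails if the node is already present.
    void create(String path, byte[] data) throws NodeExistsException {
        if (nodes.putIfAbsent(path, data) != null) {
            throw new NodeExistsException(path);
        }
    }

    // Overwrites the data of a node.
    void update(String path, byte[] data) {
        nodes.put(path, data);
    }

    byte[] get(String path) {
        return nodes.get(path);
    }

    // Check-then-create write that does not fail when another writer
    // creates the node between the exists() check and the create() call.
    void setData(String path, byte[] data) {
        if (exists(path)) {
            update(path, data);
            return;
        }
        try {
            create(path, data);
        } catch (NodeExistsException e) {
            // Lost the race: the node exists now, so update it instead.
            update(path, data);
        }
    }
}
```

With this shape, two writers that both observe a missing node both succeed, 
and the later write determines the stored data.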

If an exception is thrown from report-error, the suicide-fn is never called, 
and the worker keeps running without the crashed executor.
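
One possible way to guarantee that the worker dies is to treat error reporting 
as best effort and run the suicide step in a finally block. The sketch below 
is written in Java rather than Clojure and is only an illustration under 
assumptions: ReportErrorAndDieSketch, reportError, and suicide are hypothetical 
names standing in for report-error-and-die, report-error, and suicide-fn, and 
this is not necessarily how the issue should be fixed in executor.clj.

```java
import java.util.function.Consumer;

// Hypothetical sketch; the names do not come from the Storm code base.
class ReportErrorAndDieSketch {
    // reportError stands in for report-error, suicide for suicide-fn.
    static void reportErrorAndDie(Throwable error,
                                  Consumer<Throwable> reportError,
                                  Runnable suicide) {
        try {
            reportError.accept(error);
        } catch (RuntimeException reportFailure) {
            // Reporting is best effort; in this sketch the failure is only logged.
            System.err.println("Could not report error: " + reportFailure);
        } finally {
            // Runs whether or not reporting succeeded.
            suicide.run();
        }
    }
}
```

Calling reportErrorAndDie with a reportError that throws still invokes 
suicide, which is the behavior the current function lacks.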



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
