Dimuthu Upeksha created AIRAVATA-2742:
-----------------------------------------

             Summary: Helix Controller throws an Exception when the participant is killed
                 Key: AIRAVATA-2742
                 URL: https://issues.apache.org/jira/browse/AIRAVATA-2742
             Project: Airavata
          Issue Type: Bug
          Components: helix implementation
    Affects Versions: 0.18
            Reporter: Dimuthu Upeksha


This was a sporadic issue that occurred only once in the test setup. There were 
5 - 10 tasks running in the Participant when the Participant was killed 
externally with SIGTERM (kill <process-id>). Once the Participant was started 
again, it did not pick up the tasks that it had been running at the time it was 
killed. Surprisingly, the respective workflows were still in IN_PROGRESS status. 
The Helix Controller log showed the following error for each Workflow. This 
seems like a bug in Helix, and I posted the issue to the Helix mailing list 
(Subject: Sporadic issue when restarting a Participant). 

 
2018-04-06 15:10:57,766 [Thread-3] ERROR o.a.h.c.s.BestPossibleStateCalcStage - Error computing assignment for resource Workflow_of_process_PROCESS_7f6c8a54-b50f-4bdb-aafd-59ce87276527-POST-b5e39e07-2d8e-4309-be5a-f5b6067f9a24_TASK_cc8039e5-f054-4dea-8c7f-07c98077b117. Skipping.
java.lang.NullPointerException: Name is null
        at java.lang.Enum.valueOf(Enum.java:236)
        at org.apache.helix.task.TaskPartitionState.valueOf(TaskPartitionState.java:25)
        at org.apache.helix.task.JobRebalancer.computeResourceMapping(JobRebalancer.java:272)
        at org.apache.helix.task.JobRebalancer.computeBestPossiblePartitionState(JobRebalancer.java:140)
        at org.apache.helix.controller.stages.BestPossibleStateCalcStage.compute(BestPossibleStateCalcStage.java:171)
        at org.apache.helix.controller.stages.BestPossibleStateCalcStage.process(BestPossibleStateCalcStage.java:66)
        at org.apache.helix.controller.pipeline.Pipeline.handle(Pipeline.java:48)
        at org.apache.helix.controller.GenericHelixController.handleEvent(GenericHelixController.java:295)
        at org.apache.helix.controller.GenericHelixController$ClusterEventProcessor.run(GenericHelixController.java:595)
2018-04-06 15:11:00,385 [Thread-3] ERROR o.a.h.c.s.BestPossibleStateCalcStage - Error computing assignment for resource Workflow_of_process_PROCESS_2b69b499-c527-4c9d-8b2b-db17366f5f81-POST-c67607ae-9177-4a02-af8a-8b3751eea4ff_TASK_1ea6876d-f2ec-4139-a15d-0e64a80a3025. Skipping.
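
For reference, the top frames of the trace match what the JDK does when a null 
name is passed to an enum's valueOf(): it throws a NullPointerException with the 
message "Name is null". The sketch below only reproduces that JDK behaviour. 
PartitionState is a hypothetical stand-in for 
org.apache.helix.task.TaskPartitionState, and parse() is an illustrative helper, 
not Helix code. Why JobRebalancer ends up with a null state name after the 
Participant restart is not established here; presumably the state of a 
partition is missing for the restarted instance, but that is an assumption.

```java
public class EnumNullDemo {

    // Hypothetical stand-in for org.apache.helix.task.TaskPartitionState;
    // the real enum has more values.
    enum PartitionState { INIT, RUNNING, COMPLETED }

    // Illustrative helper: converts a state name to the enum constant and
    // reports the NullPointerException raised for a null name.
    static String parse(String name) {
        try {
            return PartitionState.valueOf(name).name();
        } catch (NullPointerException e) {
            // Enum.valueOf(Class, String) rejects a null name explicitly.
            return "NPE: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("RUNNING")); // a valid state name
        System.out.println(parse(null));      // same failure mode as in the log
    }
}
```

If this is indeed the failure mode, a null check before the valueOf() call 
would avoid the exception, although it would not explain why the state name is 
missing in the first place.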
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
