Robert Nettleton created AMBARI-12720:
-----------------------------------------

             Summary: Blueprint Logical Request stuck in waiting mode during 
large cluster deployments
                 Key: AMBARI-12720
                 URL: https://issues.apache.org/jira/browse/AMBARI-12720
             Project: Ambari
          Issue Type: Bug
          Components: ambari-server
    Affects Versions: 2.1.0
            Reporter: Robert Nettleton
            Assignee: Robert Nettleton
            Priority: Critical
             Fix For: 2.1.1


During Blueprint deployments of large clusters (50 or more nodes), a logical 
request intermittently never completes: one or more expected host 
registrations do not complete, so the request cannot be fully resolved.  As a 
result, the UI shows the logical request as pending, and the cluster 
deployment never finishes.

The failure tends to occur under heavy load with large cluster sizes.  It also 
occurs more frequently when hosts in the cluster are registered with the 
TopologyManager during the Blueprint configuration phase.

This appears to be a concurrency problem with the TopologyManager.
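
For illustration only, here is a minimal Java sketch of one way a race of this 
kind could leave a request waiting forever.  The root cause has not been 
confirmed, and the names below (HostMatcher, onHostRegistered, onHostRequest) 
are hypothetical and are not taken from the Ambari source.  The assumption is 
that a registering host checks for pending requests while a new request checks 
for available hosts; if those two check-then-enqueue steps interleave, each 
side can miss the other.  Making each step atomic under one lock avoids that.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration; this is NOT Ambari's TopologyManager code.
class HostMatcher {
    private final Deque<String> pendingRequests = new ArrayDeque<>();
    private final Deque<String> availableHosts = new ArrayDeque<>();
    private int matched = 0;

    // A host registers: pair it with a waiting request, or park the host.
    // "synchronized" makes the check and the enqueue a single atomic step,
    // so a request arriving at the same moment cannot slip in between them.
    synchronized void onHostRegistered(String host) {
        if (pendingRequests.poll() != null) {
            matched++;
        } else {
            availableHosts.add(host);
        }
    }

    // A request needs a host: take a parked host, or wait as pending.
    // It shares the same lock as onHostRegistered.
    synchronized void onHostRequest(String requestId) {
        if (availableHosts.poll() != null) {
            matched++;
        } else {
            pendingRequests.add(requestId);
        }
    }

    synchronized int matched() {
        return matched;
    }

    // Number of request/host pairs that are both waiting and so missed
    // each other.  With the locking above this is always zero.
    synchronized int stuck() {
        return Math.min(pendingRequests.size(), availableHosts.size());
    }
}
```

Without the shared lock, a host could see no pending request while a request 
sees no available host, and both would then wait indefinitely, which would 
match the "pending" state described above.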

I'm working on a fix for this and will submit a patch shortly.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)