[ 
https://issues.apache.org/jira/browse/AMBARI-25604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiguo Wu updated AMBARI-25604:
-------------------------------
    Fix Version/s: 2.8.0

> During blueprint deploy tasks sometimes fail due to KeyError on large clusters
> ------------------------------------------------------------------------------
>
>                 Key: AMBARI-25604
>                 URL: https://issues.apache.org/jira/browse/AMBARI-25604
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: Andrew Onischuk
>            Assignee: Andrew Onischuk
>            Priority: Major
>             Fix For: 2.8.0, 2.7.6
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Since AMBARI-23660, blueprint deploy no longer relies on the topology cache,
> so the correct topology is sent with the command. However, the topology from
> the topology event can still be wrong, as described in AMBARI-23660.
> The problem occurs when the agent still tries to process the broken topology
> from the event. The agent needs to handle this failure with a warning;
> currently it fails the whole command.
> {code:java}
> ERROR 2020-12-10 06:30:09,350 CustomServiceOrchestrator.py:459 - Caught an exception while executing custom service command: <type 'exceptions.KeyError'>: 10; 10
> Traceback (most recent call last):
>   File "/usr/lib/ambari-agent/lib/ambari_agent/CustomServiceOrchestrator.py", line 324, in runCommand
>     command = self.generate_command(command_header)
>   File "/usr/lib/ambari-agent/lib/ambari_agent/CustomServiceOrchestrator.py", line 507, in generate_command
>     command_dict = self.configuration_builder.get_configuration(cluster_id, service_name, component_name, required_config_timestamp)
>   File "/usr/lib/ambari-agent/lib/ambari_agent/ConfigurationBuilder.py", line 43, in get_configuration
>     'clusterHostInfo': self.topology_cache.get_cluster_host_info(cluster_id),
>   File "/usr/lib/ambari-agent/lib/ambari_agent/Utils.py", line 230, in newFunction
>     return f(*args, **kw)
>   File "/usr/lib/ambari-agent/lib/ambari_agent/ClusterTopologyCache.py", line 112, in get_cluster_host_info
>     hostnames = [self.hosts_to_id[cluster_id][host_id].hostName for host_id in component_dict.hostIds]
> KeyError: 10
> {code}
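
For illustration only, the "handle this failure with a warning" behaviour
described above could look roughly like the sketch below. This is not the
patch committed for AMBARI-25604. The function name resolve_hostnames and
the plain-dict host records are assumptions made to keep the example
self-contained; the real agent code looks hosts up by attribute
(host.hostName) inside ClusterTopologyCache.

```python
import logging

logger = logging.getLogger(__name__)


def resolve_hostnames(hosts_by_id, host_ids):
    """Map host ids to hostnames, skipping ids missing from the cache.

    hosts_by_id: dict of host id -> {"hostName": ...} (hypothetical shape,
                 standing in for self.hosts_to_id[cluster_id]).
    host_ids:    iterable of host ids taken from a component entry.
    """
    hostnames = []
    missing = []
    for host_id in host_ids:
        host = hosts_by_id.get(host_id)
        if host is None:
            # A stale or broken topology event can reference unknown ids;
            # remember them instead of raising KeyError.
            missing.append(host_id)
            continue
        hostnames.append(host["hostName"])
    if missing:
        logger.warning(
            "Skipping host ids absent from the topology cache: %s", missing
        )
    return hostnames
```

With a lookup like this, a host id such as 10 that is absent from the cache
would produce a warning in the agent log rather than failing the whole
command.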



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
