[ 
https://issues.apache.org/jira/browse/AMBARI-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandor Magyari updated AMBARI-15393:
------------------------------------
    Attachment: AMBARI-15393.patch

> Add stderr output of Ambari auto-recovery commands in agent log
> ---------------------------------------------------------------
>
>                 Key: AMBARI-15393
>                 URL: https://issues.apache.org/jira/browse/AMBARI-15393
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-agent
>    Affects Versions: 2.2.1
>            Reporter: Sandor Magyari
>            Assignee: Sandor Magyari
>            Priority: Critical
>             Fix For: 2.2.2
>
>         Attachments: AMBARI-15393.patch, AMBARI-15393_branch-2.2.patch
>
>
> Users rely on Ambari auto-recovery logic to recover from component start
> failures during cluster creation. The idea is to improve reliability (through
> retries) by sacrificing some latency.
> In some cases we see that cluster creation fails because a component start
> fails and auto-recovery is unable to start those components for up to 2 hours,
> most often on headnodes for HIVE_SERVER, OOZIE_SERVER, and NAMENODE components.
> These kinds of problems are hard to investigate later, because auto-recovery
> output files are neither sent to the server side nor saved in the Ambari agent
> logs; they are only stored on the agent.
> The solution is to add a new option, log_auto_execute_errors, to the logging
> section of ambari-agent.ini. When it is enabled, the agent appends the stderr
> of the auto-recovery command to the agent log.
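
For illustration only, the behaviour described above could look roughly like the sketch below. This is not the attached patch: the function names, the logger name, and the assumption that the option takes a boolean-style value (1/true/yes) are all hypothetical. Only the option name log_auto_execute_errors and the logging section come from the description.

```python
import configparser
import logging

# Hypothetical logger name; the real agent's logger may be named differently.
logger = logging.getLogger("ambari_agent_sketch")


def should_log_auto_execute_errors(ini_text):
    """Return True if [logging] log_auto_execute_errors is enabled.

    Assumption: the option is boolean-like and defaults to off when the
    section or the option is missing.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.getboolean("logging", "log_auto_execute_errors", fallback=False)


def report_auto_recovery_stderr(command_name, stderr_text, enabled, log=logger):
    """Append the stderr of an auto-recovery command to the agent log.

    Returns True if something was written, False otherwise.
    """
    if not enabled or not stderr_text:
        return False
    log.error("Auto-recovery command %s wrote to stderr:\n%s",
              command_name, stderr_text)
    return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    sample_ini = "[logging]\nlog_auto_execute_errors = 1\n"
    enabled = should_log_auto_execute_errors(sample_ini)
    report_auto_recovery_stderr("HIVE_SERVER START", "example stderr text", enabled)
```

Defaulting to off in this sketch keeps the agent log unchanged for clusters that do not set the option.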



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
