[ 
https://issues.apache.org/jira/browse/OOZIE-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010513#comment-17010513
 ] 

Junfan Zhang commented on OOZIE-3569:
-------------------------------------

[~asalamon74] OK, I will extract the error file check into a new method.

We are applying this patch gradually in our production environment, because 
applying it in one step is harmful. If an SSH action was started before the 
patch was applied, no success file is generated for it. When that action ends, 
the success file is checked and the action is misjudged as a failure.

So I first deployed only the part that touches the success file, without 
verifying the success file at the end of the action. After one week, I will 
apply the patch completely, including the success file check.
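
The two-phase rollout above can be sketched roughly as follows. This is only 
an illustration, not the code in the patch: the function name check_action, 
the marker file names "success" and "error", and the enforce flag are all 
hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: these names do not come from Oozie or from the patch.

# check_action DIR PROCESS_ALIVE ENFORCE_SUCCESS
#   DIR              directory holding the marker files
#   PROCESS_ALIVE    "true" while the remote process still exists
#   ENFORCE_SUCCESS  "false" in phase 1 (touch only), "true" in phase 2
check_action() {
  local dir=$1 alive=$2 enforce=$3
  if [ "$alive" = "true" ]; then
    echo RUNNING
  elif [ -e "$dir/error" ]; then
    echo ERROR
  elif [ "$enforce" = "true" ] && [ ! -e "$dir/success" ]; then
    # Phase 2 only: a finished action without a success marker is a failure.
    echo ERROR
  else
    echo OK
  fi
}

# An action started before the rollout: finished, with neither marker file.
old_action=$(mktemp -d)
check_action "$old_action" false false   # phase 1: prints OK
check_action "$old_action" false true    # phase 2: prints ERROR
```

With the flag off, actions that started before the rollout still finish as 
OK. The flag would only be turned on once every running action had been 
started by a wrapper that writes the success file.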

After 2 weeks of stable running, I will reply here.



> SSH Action should add checking success file
> -------------------------------------------
>
>                 Key: OOZIE-3569
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3569
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Junfan Zhang
>            Assignee: Junfan Zhang
>            Priority: Major
>         Attachments: OOZIE-3569-v1.patch
>
>
> *Phenomenon* 
> Currently, the {{SSH Action}} check works as follows: 
> First, it checks the {{Oozie}} ppid. When the pgid does not 
> exist, it checks whether there is an error file. If the error file does not 
> exist, {{Oozie}} sets the action status to {{OK}}.
> However, when the {{Oozie}} pgid is killed externally, the action is 
> incorrectly determined to be successful.
> *Solution*
> In ssh-wrapper.sh, when the command executes successfully, {{Oozie}} should 
> touch an empty success file, just as it touches the error file.
> In the {{SshActionExecutor}} check method, Oozie should also check that the 
> success file exists.
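
The wrapper side of the proposed solution might look roughly like the sketch 
below. It is not the actual ssh-wrapper.sh: the function name run_wrapped and 
the marker file names "success" and "error" are assumptions made for 
illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the function and file names are assumptions.

# run_wrapped DIR COMMAND [ARGS...]
# Runs COMMAND and records the outcome as an empty marker file in DIR.
run_wrapped() {
  local dir=$1
  shift
  if "$@"; then
    touch "$dir/success"   # command exited with status 0
  else
    touch "$dir/error"     # command exited with a non-zero status
  fi
}

dir=$(mktemp -d)
run_wrapped "$dir" true
ls "$dir"   # prints: success
```

If the wrapper is killed externally before the command returns, neither 
marker file is written, so a check that requires the success file can tell 
that case apart from a normal successful exit.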



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
