[ 
https://issues.apache.org/jira/browse/OOZIE-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TIAN XING updated OOZIE-3156:
-----------------------------
    Remaining Estimate:     (was: 1h)
     Original Estimate:     (was: 1h)
           Description: 
When the {{check()}} method of {{SshActionExecutor}} is invoked, Oozie connects 
to the host over ssh and checks whether the action's shell pid still exists (by 
checking the return value of the command {{ssh $hostIp ps -p $pid}}) to 
determine whether the action is still running.

However, there are cases where Oozie fails to connect to the host during the 
action status check (e.g., the host is under heavy load, or the network is 
unreliable).

In such cases, the return value of the command {{ssh $hostIp ps -p $pid}} is 
255 (ssh exits with the exit status of the remote command, or with 255 if an 
error occurred).

According to the current logic of the {{getActionStatus()}} method in 
{{SshActionExecutor}}, the action status is then determined to be OK, which may 
not be correct. 

  was:
When {{check()}} method of {{SshActionExecutor}} gets invoked, oozie will ssh 
connect to the host and check whether action shell pid is still there (by 
checking the returned value of command `{{ssh $hostIp ps -p $pid}}` ) to 
determine whether the action is running or not.

However, there are cases where oozie fails to connect to the host during action 
status check (e.g., the host is under heavy load, or network is bad etc.).

In such cases, the return value of the command `{{ssh $hostIp ps -p $pid}}` 
will be 255 (ssh command exits with the exit status of the remote command or 
with 255 if an error occurred.).

According to the current logic of method {{getActionStatus()}} in 
{{SshActionExecutor}}, the action status will be determined as OK which may not 
be correct. 


> SSH action status turns OK wrongly when failed to connect to host
> -----------------------------------------------------------------
>
>                 Key: OOZIE-3156
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3156
>             Project: Oozie
>          Issue Type: Bug
>          Components: action
>    Affects Versions: 4.0.0, 4.1.0, 4.2.0, 4.3.0
>            Reporter: TIAN XING
>             Fix For: 4.3.0
>
>         Attachments: ssh-check-bug.patch
>
>
> When the {{check()}} method of {{SshActionExecutor}} is invoked, Oozie 
> connects to the host over ssh and checks whether the action's shell pid still 
> exists (by checking the return value of the command 
> {{ssh $hostIp ps -p $pid}}) to determine whether the action is still running.
> However, there are cases where Oozie fails to connect to the host during the 
> action status check (e.g., the host is under heavy load, or the network is 
> unreliable).
> In such cases, the return value of the command {{ssh $hostIp ps -p $pid}} 
> is 255 (ssh exits with the exit status of the remote command, or with 255 if 
> an error occurred).
> According to the current logic of the {{getActionStatus()}} method in 
> {{SshActionExecutor}}, the action status is then determined to be OK, which 
> may not be correct. 
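
The distinction the description calls for can be sketched roughly as follows. 
This is an illustration only, not the code in the attached patch and not the 
actual body of {{getActionStatus()}}: the class name, the enum, and both 
method names are hypothetical. It assumes that {{ps -p $pid}} exits with 0 when 
the pid exists and with a non-zero status otherwise, and that 255 is reserved 
for an ssh-level failure.

```java
import java.io.IOException;

// Hypothetical sketch: names below are illustrative and do not exist in Oozie.
public class SshCheckSketch {

    /** Outcome of a single status probe. */
    public enum ProbeResult { RUNNING, FINISHED, CONNECTION_ERROR }

    /** Exit status ssh itself returns when it fails (e.g. cannot connect). */
    static final int SSH_ERROR_EXIT_CODE = 255;

    /** Maps the exit code of "ssh host ps -p pid" to a probe result. */
    public static ProbeResult classify(int exitCode) {
        if (exitCode == 0) {
            // ps ran on the remote host and found the pid.
            return ProbeResult.RUNNING;
        }
        if (exitCode == SSH_ERROR_EXIT_CODE) {
            // ssh failed, so nothing is known about the pid.
            return ProbeResult.CONNECTION_ERROR;
        }
        // ps ran on the remote host and did not find the pid; whether the
        // action ended as OK or ERROR has to be decided separately.
        return ProbeResult.FINISHED;
    }

    /** Runs the probe command locally and classifies its exit code. */
    public static ProbeResult probe(String host, String pid)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder("ssh", host, "ps", "-p", pid).start();
        return classify(process.waitFor());
    }
}
```

With such a split, a caller could treat CONNECTION_ERROR as a transient 
condition (for example, leave the action in its running state and check again 
later) instead of falling through to OK.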



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
