WangMeng created OOZIE-2568:
-------------------------------

Summary: SSH action can not retry automaticly when it failed
Key: OOZIE-2568
URL: https://issues.apache.org/jira/browse/OOZIE-2568
Project: Oozie
Issue Type: Bug
Components: core
Affects Versions: 4.2.0
Reporter: WangMeng

There is a bug in the automatic retry of the SSH action. For example, I have configured the following retry property:

{code}
<property>
  <name>oozie.service.LiteWorkflowStoreService.user.retry.error.code.ext</name>
  <value>ALL</value>
</property>
{code}

My SSH action is:

{code}
<action name="ssh-afbb" retry-max="3" retry-interval="1">
  <ssh xmlns="uri:oozie:ssh-action:0.1">
    <host>wangmeng@XXXX</host>
    <command>sh /data/wangmeng/hue_sh4.sh</command>
    <capture-output/>
  </ssh>
  <ok to="End"/>
  <error to="Kill"/>
</action>
{code}

When this action fails, the logs suggest that it is retried automatically, for example:

{code}
Start action [0000000-160612140701137-oozie-oozi-W@ssh-afbb] with user-retry state : userRetryCount [1], userRetryMax [3], userRetryInterval [1]
{code}

However, the command is not actually re-run. The reason is that when the previous PID exists in the XXXX.pid file in this SSH action’s log directory, the SSH action will not launch a new process to re-run the command, regardless of whether the process with that PID has finished.
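
The pid-file behaviour described above can be illustrated with a small sketch. This is not Oozie's actual code: the function names and the pid-file handling are hypothetical and only model the reported symptom next to one possible alternative, which is to treat a pid file as blocking only while the recorded process is still alive. It assumes a POSIX host, where sending signal 0 checks for a process without affecting it.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence/permission check
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but is owned by another user
    return True


def should_launch_reported(pid_file: str) -> bool:
    """Model of the reported behaviour: any existing pid file blocks a new launch."""
    return not os.path.exists(pid_file)


def should_launch_with_liveness_check(pid_file: str) -> bool:
    """Hypothetical alternative: only a pid file whose process is alive blocks a launch."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return True  # no pid file, or unreadable contents: nothing is known to be running
    return not pid_is_running(pid)
```

With a stale pid file left over from a failed attempt, `should_launch_reported` returns `False` (no re-run, matching the symptom), while `should_launch_with_liveness_check` returns `True` and would allow the retry to start a new process. PIDs can be reused by the operating system, so a real fix would likely also need to remove or overwrite the pid file when a retry begins.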