[ https://issues.apache.org/jira/browse/OOZIE-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331536#comment-15331536 ]
WangMeng commented on OOZIE-2568:
---------------------------------

[~puru] [~satyaharish] [~shwethags] [~virag] [~devaraj], could anyone review this simple patch? Thanks.

> SSH action pretends to retry automatically when it failed
> ---------------------------------------------------------
>
>                 Key: OOZIE-2568
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2568
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 4.2.0
>            Reporter: WangMeng
>         Attachments: OOZIE-2568.01.patch
>
>
> There is a bug in the automatic retry of the SSH action.
> For example, I have configured the following retry property:
> {code}
> <name>oozie.service.LiteWorkflowStoreService.user.retry.error.code.ext</name>
> <value>ALL</value>
> {code}
> And my SSH action is:
> {code}
> <action name="ssh-afbb" retry-max="3" retry-interval="1">
>     <ssh xmlns="uri:oozie:ssh-action:0.1">
>         <host>wangmeng@XXXX</host>
>         <command>sh /data/wangmeng/hue_sh4.sh</command>
>         <capture-output/>
>     </ssh>
>     <ok to="End"/>
>     <error to="Kill"/>
> </action>
> {code}
> When this action fails, the logs suggest that it retries automatically. For example:
> {code}
> Start action [0000000-160612140701137-oozie-oozi-W@ssh-afbb] with user-retry state : userRetryCount [1], userRetryMax [3], userRetryInterval [1]
> {code}
> However, it does not actually rerun the command "sh /data/wangmeng/hue_sh4.sh".
> The reason is that if the previous PID exists in the XXXX.pid file in this SSH
> action's log dir, the SSH action does not launch a new process to rerun the
> command, and it never checks whether the process with that PID has finished.
> In my tests, I found that this process had already finished by the time Oozie
> reran the action.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
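
To illustrate the check that the description says is missing, here is a minimal sketch in Python. It is not Oozie's code and not the content of OOZIE-2568.01.patch. The function names `pid_is_running` and `should_launch` and the `pid_file` argument are hypothetical, and the sketch assumes a POSIX system where the PID file holds a single integer. The idea is that a recorded PID should suppress a relaunch only while that process is still alive.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only)."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not one process
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks that the PID exists
    except ProcessLookupError:
        return False  # no such process: the earlier attempt has finished
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def should_launch(pid_file: str) -> bool:
    """Hypothetical retry decision: launch unless the recorded PID is still alive."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return True  # no usable PID recorded, so nothing can be running
    return not pid_is_running(pid)
```

With a check of this kind, a retry that finds a stale PID file would start the command again instead of silently skipping it. PID reuse by an unrelated process is not handled here and would need extra care.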