[ https://issues.apache.org/jira/browse/OOZIE-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143677#comment-14143677 ]

Robert Kanter commented on OOZIE-2017:
--------------------------------------

I did some digging and it looks like the problem is this method:
{code:java}
        private boolean checkCoordRunningStatus(HashMap<CoordinatorAction.Status, Integer> coordActionStatus,
                int coordActionsCount, Job.Status[] coordStatus) {
            boolean ret = false;
            if (coordStatus[0] != Job.Status.PREP) {
                if (coordActionStatus.containsKey(CoordinatorAction.Status.KILLED)
                        || coordActionStatus.containsKey(CoordinatorAction.Status.FAILED)
                        || coordActionStatus.containsKey(CoordinatorAction.Status.TIMEDOUT)) {
                    coordStatus[0] = Job.Status.RUNNINGWITHERROR;
                }
                else {
                    coordStatus[0] = Job.Status.RUNNING;
                }
                ret = true;
            }
            return ret;
        }
{code}

I think the fix is simply to also check that {{coordStatus\[0]}} is not
PREPSUSPENDED (and we may as well add PREPPAUSED while we're at it), in
addition to the existing PREP check.  I'll try to verify this.
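
To make the idea concrete, here is a rough, unverified sketch of what the extra
check might look like. It is not the actual patch: the {{JobStatus}} and
{{ActionStatus}} enums are hypothetical stand-ins for Oozie's {{Job.Status}} and
{{CoordinatorAction.Status}} so the snippet compiles on its own, the class name
is made up, and the unused {{coordActionsCount}} parameter is dropped.

```java
import java.util.Map;

class CoordStatusSketch {
    // Hypothetical stand-ins for Oozie's Job.Status and CoordinatorAction.Status,
    // reduced to the values this sketch needs.
    enum JobStatus { PREP, PREPSUSPENDED, PREPPAUSED, RUNNING, RUNNINGWITHERROR }
    enum ActionStatus { SUCCEEDED, KILLED, FAILED, TIMEDOUT }

    // Returns true and updates coordStatus[0] only when the job is not in one
    // of the PREP-family states; otherwise the status is left untouched.
    static boolean checkCoordRunningStatus(Map<ActionStatus, Integer> coordActionStatus,
            JobStatus[] coordStatus) {
        JobStatus current = coordStatus[0];
        // Proposed change: treat PREPSUSPENDED and PREPPAUSED like PREP.
        if (current == JobStatus.PREP
                || current == JobStatus.PREPSUSPENDED
                || current == JobStatus.PREPPAUSED) {
            return false;
        }
        if (coordActionStatus.containsKey(ActionStatus.KILLED)
                || coordActionStatus.containsKey(ActionStatus.FAILED)
                || coordActionStatus.containsKey(ActionStatus.TIMEDOUT)) {
            coordStatus[0] = JobStatus.RUNNINGWITHERROR;
        }
        else {
            coordStatus[0] = JobStatus.RUNNING;
        }
        return true;
    }
}
```

With this guard, a job in PREPSUSPENDED should keep its status when the check
runs, which is the behavior the reproduction steps in the issue expect.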

> On startup, StatusTransitService can resume Coordinators that were in PREPSUSPENDED
> -----------------------------------------------------------------------------------
>
>                 Key: OOZIE-2017
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2017
>             Project: Oozie
>          Issue Type: Bug
>          Components: coordinator
>    Affects Versions: trunk, 4.0.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>             Fix For: trunk
>
>
> You can reproduce this issue easily:
> # Submit a coordinator job that starts in the future
> #- It enters PREP state
> # Suspend the coordinator job
> #- It enters PREPSUSPENDED state
> # Restart Oozie and wait about a minute or so
> #- The job transitions back to PREP state by itself
> The log shows that the StatusTransitService is doing it.  



