[ https://issues.apache.org/jira/browse/OOZIE-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253129#comment-15253129 ]
Hadoop QA commented on OOZIE-2509:
----------------------------------

Testing JIRA OOZIE-2509

Cleaning local git workspace

----------------------------

{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:green}+1 RAW_PATCH_ANALYSIS{color}
.    {color:green}+1{color} the patch does not introduce any @author tags
.    {color:green}+1{color} the patch does not introduce any tabs
.    {color:green}+1{color} the patch does not introduce any trailing spaces
.    {color:green}+1{color} the patch does not introduce any line longer than 132
.    {color:green}+1{color} the patch does adds/modifies 6 testcase(s)
{color:green}+1 RAT{color}
.    {color:green}+1{color} the patch does not seem to introduce new RAT warnings
{color:green}+1 JAVADOC{color}
.    {color:green}+1{color} the patch does not seem to introduce new Javadoc warnings
{color:green}+1 COMPILE{color}
.    {color:green}+1{color} HEAD compiles
.    {color:green}+1{color} patch compiles
.    {color:green}+1{color} the patch does not seem to introduce new javac warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.    {color:green}+1{color} the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
.    {color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.    Tests run: 1778
{color:green}+1 DISTRO{color}
.    {color:green}+1{color} distro tarball builds with the patch

----------------------------
{color:green}*+1 Overall result, good!, no -1s*{color}

The full output of the test-patch run is available at
https://builds.apache.org/job/oozie-trunk-precommit-build/2843/

> SLA job status can stuck in running state
> -----------------------------------------
>
>                 Key: OOZIE-2509
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2509
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Purshotam Shah
>            Assignee: Purshotam Shah
>         Attachments: OOZIE-2509-V1.patch, OOZIE-2509-V2.patch,
>                      OOZIE-2509-V3.patch, OOZIE-2509-V4.patch, OOZIE-2509-V5.patch,
>                      OOZIE-2509-V6.patch
>
>
> There are a few places where the job status is not updated properly.
> 1. Receiving events out of order.
> Ex. "oozie.service.EventHandlerService.batch.size" is set to 50 and
> "oozie.service.EventHandlerService.worker.threads" is set to 15, which means
> that 15 threads process events in batches of 50.
> The 51st event can therefore be processed before the 49th event.
> If the 49th is a job-started event and the 51st is a job-completed event, the job
> status will get overridden to running.
> 2.
> {code}
> case COORDINATOR_ACTION:
>     CoordinatorActionBean ca = jpaService.execute(new CoordActionGetForSLAJPAExecutor(slaCalc.getId()));
>     if (ca.isTerminalWithFailure()) {
>         isEndMiss = ended = true;
>         slaCalc.setActualEnd(ca.getLastModifiedTime());
>     }
>     if (ca.getExternalId() != null) {
>         wf = jpaService.execute(new WorkflowJobGetForSLAJPAExecutor(ca.getExternalId()));
>         if (wf.getEndTime() != null) {
>             ended = true;
>             if (wf.getEndTime().getTime() > slaCalc.getExpectedEnd().getTime()) {
>                 isEndMiss = true;
>             }
>         }
>         slaCalc.setActualEnd(wf.getEndTime());
>         slaCalc.setActualStart(wf.getStartTime());
>     }
> {code}
> Oozie checks the wf status and updates the SLA status with the coord job status.
> We might have a case where the coord is still running, but the wf has ended.
> 3. HistoryPurgeWorker updates the end time but doesn't update the status.
> 4. There are a few other locking issues.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
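
For readers following item 1 of the quoted description, here is a minimal sketch of one way a status update could be made insensitive to event ordering. It is not taken from the attached patches and does not use Oozie's classes: SlaStatusGuard, its Status enum, and the job id are hypothetical names chosen for illustration. The idea is simply that a late "started" event should not move a job back from a terminal state.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SlaStatusGuard {
    /** Hypothetical lifecycle states; a higher ordinal means later in the lifecycle. */
    public enum Status { NOT_STARTED, RUNNING, ENDED }

    private final Map<String, Status> statusByJob = new ConcurrentHashMap<>();

    /**
     * Applies an event only if it moves the job forward in its lifecycle.
     * ConcurrentHashMap.merge performs the compare-and-update atomically,
     * so concurrent worker threads cannot interleave between check and write.
     * Returns the status stored after the call.
     */
    public Status apply(String jobId, Status incoming) {
        return statusByJob.merge(jobId, incoming,
                (current, next) -> next.ordinal() > current.ordinal() ? next : current);
    }

    public static void main(String[] args) {
        SlaStatusGuard guard = new SlaStatusGuard();
        // The "completed" event is processed first, the "started" event arrives late.
        guard.apply("job-1", Status.ENDED);
        Status after = guard.apply("job-1", Status.RUNNING);
        System.out.println(after); // prints ENDED
    }
}
```

Ordering the states and refusing backward transitions is only one possible approach; comparing event timestamps would be another, and the sketch does not say which approach the patch takes.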