[ 
https://issues.apache.org/jira/browse/OOZIE-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264939#comment-15264939
 ] 

Hadoop QA commented on OOZIE-2509:
----------------------------------

Testing JIRA OOZIE-2509

Cleaning local git workspace

----------------------------

{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:green}+1 RAW_PATCH_ANALYSIS{color}
.    {color:green}+1{color} the patch does not introduce any @author tags
.    {color:green}+1{color} the patch does not introduce any tabs
.    {color:green}+1{color} the patch does not introduce any trailing spaces
.    {color:green}+1{color} the patch does not introduce any line longer than 132
.    {color:green}+1{color} the patch does adds/modifies 7 testcase(s)
{color:green}+1 RAT{color}
.    {color:green}+1{color} the patch does not seem to introduce new RAT warnings
{color:green}+1 JAVADOC{color}
.    {color:green}+1{color} the patch does not seem to introduce new Javadoc warnings
{color:green}+1 COMPILE{color}
.    {color:green}+1{color} HEAD compiles
.    {color:green}+1{color} patch compiles
.    {color:green}+1{color} the patch does not seem to introduce new javac warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.    {color:green}+1{color} the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
.    {color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.    Tests run: 1780
.    Tests failed: 1
.    Tests errors: 0

.    The patch failed the following testcases:

.      testCoordStatusTransitServiceForTimeout(org.apache.oozie.service.TestStatusTransitService)

{color:green}+1 DISTRO{color}
.    {color:green}+1{color} distro tarball builds with the patch 

----------------------------
{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/2856/

> SLA job status can stuck in running state
> -----------------------------------------
>
>                 Key: OOZIE-2509
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2509
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Purshotam Shah
>            Assignee: Purshotam Shah
>         Attachments: OOZIE-2509-V1.patch, OOZIE-2509-V2.patch, 
> OOZIE-2509-V3.patch, OOZIE-2509-V4.patch, OOZIE-2509-V5.patch, 
> OOZIE-2509-V6.patch, OOZIE-2509-V7.patch, OOZIE-2509-V8.patch
>
>
> There are a few places where the job status is not updated properly:
> 1. Events are received out of order.
> For example, "oozie.service.EventHandlerService.batch.size" is set to 50 and
> "oozie.service.EventHandlerService.worker.threads" is set to 15, which means
> that 15 threads process events in batches of 50.
> The 51st event can then be processed before the 49th event.
> If the 49th is a job-started event and the 51st is a job-completed event, the
> job status will be overwritten to running.
> 2.
> {code}
> case COORDINATOR_ACTION:
>                     CoordinatorActionBean ca = jpaService.execute(new CoordActionGetForSLAJPAExecutor(slaCalc.getId()));
>                     if (ca.isTerminalWithFailure()) {
>                         isEndMiss = ended = true;
>                         slaCalc.setActualEnd(ca.getLastModifiedTime());
>                     }
>                     if (ca.getExternalId() != null) {
>                         wf = jpaService.execute(new WorkflowJobGetForSLAJPAExecutor(ca.getExternalId()));
>                         if (wf.getEndTime() != null) {
>                             ended = true;
>                             if (wf.getEndTime().getTime() > slaCalc.getExpectedEnd().getTime()) {
>                                 isEndMiss = true;
>                             }
>                         }
>                         slaCalc.setActualEnd(wf.getEndTime());
>                         slaCalc.setActualStart(wf.getStartTime());
>                     }
> {code}
> Oozie checks the wf status and updates the SLA status with the coord job status.
> We might have a case where the coord is still running, but the wf has ended.
> 3. HistoryPurgeWorker updates the end time but doesn't update the status.
> 4. There are a few other locking issues.
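
The out-of-order case in item 1 can be illustrated with a minimal sketch. It is not taken from the attached patches and does not use Oozie's API: the class SlaStatusGuard, its Status enum, and the apply() method are hypothetical names chosen for illustration. The sketch assumes one possible guard, namely treating terminal states as sticky, so that a late "started" event cannot move a finished job back to running.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Oozie.
public class SlaStatusGuard {

    public enum Status { NOT_STARTED, RUNNING, SUCCEEDED, FAILED, KILLED }

    // States after which no further status change is accepted.
    private static final Set<Status> TERMINAL =
            EnumSet.of(Status.SUCCEEDED, Status.FAILED, Status.KILLED);

    private Status current = Status.NOT_STARTED;

    // Applies an incoming event status. Returns false, leaving the state
    // unchanged, when the job has already reached a terminal state.
    // synchronized because several worker threads may deliver events.
    public synchronized boolean apply(Status incoming) {
        if (TERMINAL.contains(current)) {
            return false;
        }
        current = incoming;
        return true;
    }

    public synchronized Status current() {
        return current;
    }

    public static void main(String[] args) {
        SlaStatusGuard job = new SlaStatusGuard();
        job.apply(Status.SUCCEEDED); // "completed" event is processed first
        job.apply(Status.RUNNING);   // late "started" event is ignored
        System.out.println(job.current());
    }
}
```

Without the terminal-state check, the second call would overwrite the status with RUNNING, which is the behavior described in item 1. A real fix might instead compare event timestamps or sequence numbers; that choice is outside what this sketch assumes.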



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
