[ https://issues.apache.org/jira/browse/OOZIE-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated OOZIE-2579:
--------------------------------
    Description: 
There are tests in TestBulkWorkflowXCommand which perform bulk killing.

These tests sometimes fail because {{externalChildIDs}} is set to 
"00000001-dummy-oozie-wrkf-W", which cannot be parsed as a job ID and causes an 
action to fail:

{code}
org.apache.oozie.action.ActionExecutorException: IllegalArgumentException: JobId string : 00000001-dummy-oozie-wrkf-W is not properly formed
        at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.kill(JavaActionExecutor.java:1614)
        at org.apache.oozie.command.wf.ActionKillXCommand.execute(ActionKillXCommand.java:146)
        at org.apache.oozie.command.wf.ActionKillXCommand.execute(ActionKillXCommand.java:1)
        at org.apache.oozie.command.XCommand.call(XCommand.java:287)
        at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:331)
        at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:260)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: JobId string : 00000001-dummy-oozie-wrkf-W is not properly formed
        at org.apache.hadoop.mapreduce.JobID.forName(JobID.java:154)
        at org.apache.hadoop.mapred.JobID.forName(JobID.java:78)
        at org.apache.oozie.action.hadoop.MapReduceActionExecutor.getRunningJob(MapReduceActionExecutor.java:342)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.kill(JavaActionExecutor.java:1604)
        ... 10 more
{code}
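
The innermost frames of the trace show the rejection coming from JobID.forName. As a rough illustration of why the string is rejected, here is a minimal sketch that assumes job IDs follow the job_<identifier>_<number> shape (for example job_200707121733_0003, the example given in Hadoop's JobID Javadoc). The class and method names are hypothetical, and this is not Hadoop's actual parser:

```java
// Sketch only: checks the assumed "job_<identifier>_<number>" shape.
// This is NOT Hadoop's JobID.forName; the class and method names are hypothetical.
final class JobIdFormatSketch {

    /** Returns true if the string has the shape job_<identifier>_<integer>. */
    static boolean looksLikeJobId(String s) {
        if (s == null) {
            return false;
        }
        String[] parts = s.split("_");
        if (parts.length != 3 || !parts[0].equals("job")) {
            return false;
        }
        try {
            Integer.parseInt(parts[2]); // the trailing part must be a number
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The dummy ID from the test contains no underscores, so it is rejected.
        System.out.println(looksLikeJobId("00000001-dummy-oozie-wrkf-W"));
        // The Javadoc example has the expected three underscore-separated parts.
        System.out.println(looksLikeJobId("job_200707121733_0003"));
    }
}
```

Under this assumed rule the dummy value fails at the first check, because splitting on underscores yields a single part.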

Because the ActionKillXCommand runs on a separate thread, it races with the 
main test logic. The test expects the job status to be "KILLED", but the 
ActionKillXCommand sometimes gets the chance to update it to "FAILED" before 
the test checks it.
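
The race can be pictured with a small toy model. This is only a sketch, not Oozie code: every name in it is hypothetical, a shared status string stands in for the job status, and latches force each of the two orderings that the real, unsynchronized code could take on its own.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

// Sketch only: a toy model of the race, not Oozie code. All names are hypothetical.
final class KillRaceSketch {

    /** Returns the status the "test" observes for the chosen thread ordering. */
    static String observedStatus(boolean asyncKillRunsFirst) {
        // Status as left by the bulk kill on the main thread.
        AtomicReference<String> status = new AtomicReference<>("KILLED");
        CountDownLatch mayRun = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        ExecutorService queue = Executors.newSingleThreadExecutor();
        try {
            // Stands in for the queued kill command that fails on the bad job ID.
            queue.submit(() -> {
                try {
                    mayRun.await();
                    status.set("FAILED");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    finished.countDown();
                }
            });
            if (asyncKillRunsFirst) {
                mayRun.countDown();
                finished.await();
            }
            String observed = status.get(); // the point where the test asserts
            mayRun.countDown();             // let the background task finish either way
            finished.await();
            return observed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            queue.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(observedStatus(true));  // background update lands first
        System.out.println(observedStatus(false)); // test reads the status first
    }
}
```

In the model, the test sees "FAILED" only when the background task completes before the status is read, which corresponds to the intermittent failure described above.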

Solution: set a proper (parseable) job id:
{code}
action.setExternalChildIDs("job-dummy-1");
{code}

  was:
There are tests in TestBulkWorkflowXCommand which perform bulk killing.

This might fail sometimes, because the {{externalChildIDs}} is set to 
"00000001-dummy-oozie-wrkf-W ". This causes an action to fail:

{code}
org.apache.oozie.action.ActionExecutorException: IllegalArgumentException: JobId string : 00000001-dummy-oozie-wrkf-W is not properly formed
        at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.kill(JavaActionExecutor.java:1614)
        at org.apache.oozie.command.wf.ActionKillXCommand.execute(ActionKillXCommand.java:146)
        at org.apache.oozie.command.wf.ActionKillXCommand.execute(ActionKillXCommand.java:1)
        at org.apache.oozie.command.XCommand.call(XCommand.java:287)
        at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:331)
        at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:260)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: JobId string : 00000001-dummy-oozie-wrkf-W is not properly formed
        at org.apache.hadoop.mapreduce.JobID.forName(JobID.java:154)
        at org.apache.hadoop.mapred.JobID.forName(JobID.java:78)
        at org.apache.oozie.action.hadoop.MapReduceActionExecutor.getRunningJob(MapReduceActionExecutor.java:342)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.kill(JavaActionExecutor.java:1604)
        ... 10 more
{code}

Since this code runs on a separate thread, it might randomly interfere with the 
main test logic, which expects the job status to be "KILLED", but sometimes the 
ActionKillXCommand has a chance to update it to "FAILED".

Solution: set a proper (parseable) job id:
{code}
action.setExternalChildIDs("job-dummy-1");
{code}


> Bulk kill tests in TestBulkWorkflowXCommand might fail because of a race 
> condition
> ----------------------------------------------------------------------------------
>
>                 Key: OOZIE-2579
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2579
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Minor
>
> There are tests in TestBulkWorkflowXCommand which perform bulk killing.
> These tests sometimes fail because {{externalChildIDs}} is set to 
> "00000001-dummy-oozie-wrkf-W", which cannot be parsed as a job ID and causes 
> an action to fail:
> {code}
> org.apache.oozie.action.ActionExecutorException: IllegalArgumentException: JobId string : 00000001-dummy-oozie-wrkf-W is not properly formed
>       at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
>       at org.apache.oozie.action.hadoop.JavaActionExecutor.kill(JavaActionExecutor.java:1614)
>       at org.apache.oozie.command.wf.ActionKillXCommand.execute(ActionKillXCommand.java:146)
>       at org.apache.oozie.command.wf.ActionKillXCommand.execute(ActionKillXCommand.java:1)
>       at org.apache.oozie.command.XCommand.call(XCommand.java:287)
>       at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:331)
>       at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:260)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: JobId string : 00000001-dummy-oozie-wrkf-W is not properly formed
>       at org.apache.hadoop.mapreduce.JobID.forName(JobID.java:154)
>       at org.apache.hadoop.mapred.JobID.forName(JobID.java:78)
>       at org.apache.oozie.action.hadoop.MapReduceActionExecutor.getRunningJob(MapReduceActionExecutor.java:342)
>       at org.apache.oozie.action.hadoop.JavaActionExecutor.kill(JavaActionExecutor.java:1604)
>       ... 10 more
> {code}
> Because the ActionKillXCommand runs on a separate thread, it races with the 
> main test logic. The test expects the job status to be "KILLED", but the 
> ActionKillXCommand sometimes gets the chance to update it to "FAILED" before 
> the test checks it.
> Solution: set a proper (parseable) job id:
> {code}
> action.setExternalChildIDs("job-dummy-1");
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
