peter liu created OOZIE-1953:
--------------------------------

             Summary: Oozie workflow is stuck in RUNNING state if folder 'oozie-oozi' is deleted while it is running
                 Key: OOZIE-1953
                 URL: https://issues.apache.org/jira/browse/OOZIE-1953
             Project: Oozie
          Issue Type: Bug
    Affects Versions: 4.0.0
            Reporter: peter liu


Steps to reproduce:
After starting a workflow, delete the auto-created folder 'oozie-oozi' on HDFS 
while the workflow is still running. The workflow then stays stuck in RUNNING 
status and is never killed. 

From the log below, Oozie appears to be stuck in an infinite loop, repeatedly 
trying to find the generated action files:

{quote}
Caused by: java.io.FileNotFoundException: File hdfs://d0e003ash1013.mgmt.symcpe.net/user/test_user1/oozie-oozi/0002981-140710172352455-oozie-oozi-W/SDKAction--map-reduce does not exist.
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:654)
        at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
        at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
        at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:708)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:708)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper$1.run(LauncherMapperHelper.java:277)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper$1.run(LauncherMapperHelper.java:263)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper.getActionData(LauncherMapperHelper.java:263)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1073)
        ... 7 more
2014-07-30 20:11:09,622  WARN MapReduceActionExecutor:542 - USER[test_user1] GROUP[-] TOKEN[] APP[WorkFlowSDKAction] JOB[0002981-140710172352455-oozie-oozi-W] ACTION[0002981-140710172352455-oozie-oozi-W@SDKAction] Exception in check(). Message[File hdfs://d0e003ash1013.mgmt.symcpe.net/user/test_user1/oozie-oozi/0002981-140710172352455-oozie-oozi-W/SDKAction--map-reduce does not exist.]
java.io.FileNotFoundException: File hdfs://d0e003ash1013.mgmt.symcpe.net/user/test_user1/oozie-oozi/0002981-140710172352455-oozie-oozi-W/SDKAction--map-reduce does not exist.
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:654)
        at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
        at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
        at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:708)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:708)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper$1.run(LauncherMapperHelper.java:277)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper$1.run(LauncherMapperHelper.java:263)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper.getActionData(LauncherMapperHelper.java:263)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1073)
        at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:177)
        at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:56)
        at org.apache.oozie.command.XCommand.call(XCommand.java:280)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
2014-07-30 20:11:09,622  WARN ActionCheckXCommand:542 - USER[test_user1] GROUP[-] TOKEN[] APP[WorkFlowSDKAction] JOB[0002981-140710172352455-oozie-oozi-W] ACTION[0002981-140710172352455-oozie-oozi-W@SDKAction] Exception while executing check(). Error Code [JA008], Message[JA008: File hdfs://d0e003ash1013.mgmt.symcpe.net/user/test_user1/oozie-oozi/0002981-140710172352455-oozie-oozi-W/SDKAction--map-reduce does not exist.]
org.apache.oozie.action.ActionExecutorException: JA008: File hdfs://d0e003ash1013.mgmt.symcpe.net/user/test_user1/oozie-oozi/0002981-140710172352455-oozie-oozi-W/SDKAction--map-reduce does not exist.
        at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:412)
        at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:396)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1163)
        at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:177)
        at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:56)
        at org.apache.oozie.command.XCommand.call(XCommand.java:280)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.FileNotFoundException: File hdfs://d0e003ash1013.mgmt.symcpe.net/user/test_user1/oozie-oozi/0002981-140710172352455-oozie-oozi-W/SDKAction--map-reduce does not exist.
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:654)
        at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
        at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
        at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:708)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:708)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper$1.run(LauncherMapperHelper.java:277)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper$1.run(LauncherMapperHelper.java:263)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper.getActionData(LauncherMapperHelper.java:263)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1073)
        ... 7 more
2014-07-30 20:22:09,737  WARN MapReduceActionExecutor:542 - USER[test_user1] GROUP[-] TOKEN[] APP[WorkFlowSDKAction] JOB[0002981-140710172352455-oozie-oozi-W] ACTION[0002981-140710172352455-oozie-oozi-W@SDKAction] Exception in check(). Message[File hdfs://d0e003ash1013.mgmt.symcpe.net/user/test_user1/oozie-oozi/0002981-140710172352455-oozie-oozi-W/SDKAction--map-reduce does not exist.]
java.io.FileNotFoundException: File hdfs://d0e003ash1013.mgmt.symcpe.net/user/test_user1/oozie-oozi/0002981-140710172352455-oozie-oozi-W/SDKAction--map-reduce does not exist.
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:654)
        at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
        at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
        at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:708)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:708)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper$1.run(LauncherMapperHelper.java:277)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper$1.run(LauncherMapperHelper.java:263)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper.getActionData(LauncherMapperHelper.java:263)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1073)
        at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:177)
        at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:56)
        at org.apache.oozie.command.XCommand.call(XCommand.java:280)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
2014-07-30 20:22:09,737  WARN ActionCheckXCommand:542 - USER[test_user1] GROUP[-] TOKEN[] APP[WorkFlowSDKAction] JOB[0002981-140710172352455-oozie-oozi-W] ACTION[0002981-140710172352455-oozie-oozi-W@SDKAction] Exception while executing check(). Error Code [JA008], Message[JA008: File hdfs://d0e003ash1013.mgmt.symcpe.net/user/test_user1/oozie-oozi/0002981-140710172352455-oozie-oozi-W/SDKAction--map-reduce does not exist.]
org.apache.oozie.action.ActionExecutorException: JA008: File hdfs://d0e003ash1013.mgmt.symcpe.net/user/test_user1/oozie-oozi/0002981-140710172352455-oozie-oozi-W/SDKAction--map-reduce does not exist.
        at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:412)
        at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:396)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1163)
        at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:177)
        at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:56)
        at org.apache.oozie.command.XCommand.call(XCommand.java:280)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.FileNotFoundException: File hdfs://d0e003ash1013.mgmt.symcpe.net/user/test_user1/oozie-oozi/0002981-140710172352455-oozie-oozi-W/SDKAction--map-reduce does not exist.
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:654)
        at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
        at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
        at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:708)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:708)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper$1.run(LauncherMapperHelper.java:277)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper$1.run(LauncherMapperHelper.java:263)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
        at org.apache.oozie.action.hadoop.LauncherMapperHelper.getActionData(LauncherMapperHelper.java:263)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1073)
        ... 7 more
{quote}
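
The two WARN entries in the log (20:11:09 and 20:22:09) hit the same missing 
path, which suggests the check is simply retried. As a rough illustration of 
the behaviour I would expect instead, here is a minimal, self-contained sketch. 
It is not Oozie code: ActionCheckSketch, Outcome, ActionDataReader, checkOnce 
and poll are all hypothetical names, and the assumption is only that a missing 
action directory could be treated as a permanent failure while other I/O errors 
stay retryable.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

public class ActionCheckSketch {

    /** Hypothetical result of one status check; not an Oozie type. */
    enum Outcome { OK, RETRY, FAILED }

    /** Hypothetical stand-in for reading the action data from the action directory. */
    interface ActionDataReader {
        void read() throws IOException;
    }

    /**
     * Classifies a single check. A deleted action directory will not come
     * back by itself, so it is reported as FAILED immediately; any other
     * I/O error is assumed to be transient and is reported as RETRY.
     */
    static Outcome checkOnce(ActionDataReader reader) {
        try {
            reader.read();
            return Outcome.OK;
        } catch (FileNotFoundException e) {
            return Outcome.FAILED;
        } catch (IOException e) {
            return Outcome.RETRY;
        }
    }

    /** Repeats the check until it stops asking for a retry or maxChecks is reached. */
    static int poll(ActionDataReader reader, int maxChecks) {
        int checks = 0;
        while (checks < maxChecks) {
            checks++;
            if (checkOnce(reader) != Outcome.RETRY) {
                break;
            }
        }
        return checks;
    }

    public static void main(String[] args) {
        ActionDataReader missingDir = () -> {
            throw new FileNotFoundException("action directory does not exist");
        };
        ActionDataReader flakyNetwork = () -> {
            throw new IOException("connection reset");
        };
        System.out.println(poll(missingDir, 5));   // prints 1: fails on the first check
        System.out.println(poll(flakyNetwork, 5)); // prints 5: retried up to the cap
    }
}
```

With this kind of classification the workflow would end up failed or killed 
after the first check that finds the directory gone, rather than staying in 
RUNNING.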



--
This message was sent by Atlassian JIRA
(v6.2#6252)