Hi Ooziers,

Starting with release 5.15.0, AWS EMR ships Oozie 5.0.0, an upgrade from Oozie 4.3.
Unfortunately, we found that one nice feature is broken for us on Oozie 5.0.0.

On Oozie 4.3, we keep our Oozie applications in an S3 bucket that serves as our release repository, and in the Oozie application properties file we simply set:

    appBaseDir=${s3.app.bucket}/oozieJobs/${appName}

The Oozie 4.3 runtime loads all the application code from S3 while still using the Oozie sharelib from HDFS, and the whole application workflow works perfectly.

After the upgrade to Oozie 5.0.0 in EMR 5.15.0, we can no longer use S3 as our application repository. The same application works fine if it is stored in HDFS, but if it is stored in S3 we get the following error:

Caused by: org.apache.oozie.workflow.WorkflowException: E0712: Could not create lib paths list for application [s3://bucket-name/oozieJobs/ourAppName/workflow/workflow.xml], Wrong FS: hdfs://ip-172-31-72-175.ec2.internal:8020/user/oozie/share/lib, expected: s3://bucket-name
    at org.apache.oozie.service.WorkflowAppService.createProtoActionConf(WorkflowAppService.java:258)
    at org.apache.oozie.command.wf.SubmitXCommand.execute(SubmitXCommand.java:168)
    ... 36 more
Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://ip-172-31-72-175.ec2.internal:8020/user/oozie/share/lib, expected: s3://bucket-name
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:669)
    at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:487)
    at com.amazon.ws.emr.hadoop.fs.staging.DefaultStagingMechanism.isStagingDirectoryPath(DefaultStagingMechanism.java:38)
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:740)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1440)
    at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.exists(EmrFileSystem.java:347)
    at org.apache.oozie.service.WorkflowAppService.getLibFiles(WorkflowAppService.java:301)
    at org.apache.oozie.service.WorkflowAppService.createProtoActionConf(WorkflowAppService.java:202)
    ... 37 more

It looks like when the application path points to S3 via appBaseDir=${s3.app.bucket}/oozieJobs/${appName}, Oozie 5.0 complains that it can no longer load the sharelib from the HDFS URI, even though all the sharelib files are indeed stored in the correct HDFS location, the one named in the error message.

Starting from this error message, I found the following commit in Oozie 5.0 (OOZIE-2944, "Shell action example does not work with Oozie on Yarn on h…"):

https://github.com/apache/oozie/commit/5998c18fde1da769e91e3ef1bcca484723730c76#diff-d4e9af2c1e2ddeae544be6182b948109

Since the error comes from the FileSystem used in core/src/main/java/org/apache/oozie/service/WorkflowAppService.java, which this commit touches, I think the commit may be the cause.

In 5.0.0, line 202 uses the "fs" that comes from line 177, which is built from the "conf" created on line 169, as follows:

https://github.com/apache/oozie/blob/branch-5.0/core/src/main/java/org/apache/oozie/service/WorkflowAppService.java#L166

    URI uri = new URI(jobConf.get(OozieClient.APP_PATH));
    Configuration conf = has.createConfiguration(uri.getAuthority());

But in 4.3.0, at https://github.com/apache/oozie/blob/branch-4.3/core/src/main/java/org/apache/oozie/service/WorkflowAppService.java#L167 the same code reads:

    URI uri = new URI(jobConf.get(OozieClient.APP_PATH));
    Configuration conf = has.createJobConf(uri.getAuthority());

I am not 100% sure, but this code does produce the FileSystem that eventually complains "Wrong FS" in my case, and the commit changes the call that creates "conf" from createJobConf to createConfiguration.

So my questions are:

1. Do you think the change above is causing my issue?
2. If so, I assume there was a good reason for that commit, but is there also a solution for my use case?

Thanks
Yong
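
P.S. To make my reading of the error concrete, here is a minimal sketch of the kind of scheme/authority check that I believe produces the "Wrong FS" message. It is a simplified stand-in written against plain java.net.URI, not Hadoop's actual FileSystem.checkPath code: default ports, canonicalization and the default-FS fallback are left out. The class name WrongFsSketch and the host namenode.example are hypothetical, chosen only for illustration.

```java
import java.net.URI;

// Simplified stand-in for the scheme/authority check behind "Wrong FS".
// NOT the real Hadoop implementation: default ports, canonicalization
// and the default-FS fallback are intentionally omitted.
class WrongFsSketch {

    // Throws if 'path' does not belong to the file system identified by 'fsUri'.
    static void checkPath(URI fsUri, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            return; // scheme-less paths are treated as relative to fsUri
        }
        boolean sameScheme = scheme.equalsIgnoreCase(fsUri.getScheme());
        String authority = path.getAuthority();
        boolean sameAuthority = authority == null
                || authority.equalsIgnoreCase(fsUri.getAuthority());
        if (!(sameScheme && sameAuthority)) {
            throw new IllegalArgumentException(
                    "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        // File system derived from the application path (S3 in our case).
        URI appFs = URI.create("s3://bucket-name");
        // Sharelib location on HDFS (hypothetical host name).
        URI shareLib = URI.create("hdfs://namenode.example:8020/user/oozie/share/lib");

        // A path on the same file system as the application: accepted.
        checkPath(appFs, URI.create("s3://bucket-name/oozieJobs/ourAppName/lib"));

        // The HDFS sharelib checked against the S3 file system: rejected.
        try {
            checkPath(appFs, shareLib);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If my understanding is right, the problem is only that the sharelib path is checked against the file system object built for the application path, which would explain why the same application works when everything lives in HDFS.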