Hi, oozier:

Starting with AWS EMR 5.15.0, EMR ships Oozie 5.0.0, upgraded from Oozie 4.3.

Unfortunately, we found that one feature we rely on is broken for us on Oozie 5.0.0.

On Oozie 4.3, we keep our Oozie applications in an S3 bucket that serves as our 
release repository, and in the Oozie application properties file we simply set 
the following:

appBaseDir=${s3.app.bucket}/oozieJobs/${appName}

The Oozie 4.3 runtime then loads all the application code from S3 while still 
using the Oozie sharelib from HDFS, and the whole application workflow works 
perfectly.

Since EMR 5.15.0 upgraded to Oozie 5.0.0, we can no longer use S3 as our 
application repository. The same application WORKS fine if it is stored in 
HDFS, but if it is stored in S3 we get the following error message:

Caused by: org.apache.oozie.workflow.WorkflowException: E0712: Could not create lib paths list for application [s3://bucket-name/oozieJobs/ourAppName/workflow/workflow.xml], Wrong FS: hdfs://ip-172-31-72-175.ec2.internal:8020/user/oozie/share/lib, expected: s3://bucket-name
        at org.apache.oozie.service.WorkflowAppService.createProtoActionConf(WorkflowAppService.java:258)
        at org.apache.oozie.command.wf.SubmitXCommand.execute(SubmitXCommand.java:168)
        ... 36 more
Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://ip-172-31-72-175.ec2.internal:8020/user/oozie/share/lib, expected: s3://bucket-name
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:669)
        at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:487)
        at com.amazon.ws.emr.hadoop.fs.staging.DefaultStagingMechanism.isStagingDirectoryPath(DefaultStagingMechanism.java:38)
        at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:740)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1440)
        at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.exists(EmrFileSystem.java:347)
        at org.apache.oozie.service.WorkflowAppService.getLibFiles(WorkflowAppService.java:301)
        at org.apache.oozie.service.WorkflowAppService.createProtoActionConf(WorkflowAppService.java:202)
        ... 37 more

It looks like, when we configure the application path in S3 via 
appBaseDir=${s3.app.bucket}/oozieJobs/${appName}, Oozie 5.0 complains that it 
can no longer load the sharelib from the HDFS URI, even though all the sharelib 
files are indeed stored in the correct HDFS location, as given in the error 
message.
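
As an illustration of the failure mode, here is a minimal sketch in plain Java of the kind of scheme/authority check that produces a "Wrong FS" message. It is a simplified model and NOT Hadoop's actual FileSystem.checkPath code; the class name WrongFsSketch, the helper names, and the host namenode.example are made up for this example.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Objects;

public class WrongFsSketch {

    /** Lower-cases a URI component, passing null through unchanged. */
    static String lower(String s) {
        return s == null ? null : s.toLowerCase(Locale.ROOT);
    }

    /** True when the path's scheme and authority match the file system's URI. */
    static boolean belongsTo(URI fsUri, URI path) {
        // A path without a scheme is treated as relative to this file system.
        if (path.getScheme() == null) {
            return true;
        }
        return path.getScheme().equalsIgnoreCase(fsUri.getScheme())
                && Objects.equals(lower(fsUri.getAuthority()), lower(path.getAuthority()));
    }

    /** Simplified stand-in for a path check: rejects paths owned by another file system. */
    static void checkPath(URI fsUri, URI path) {
        if (!belongsTo(fsUri, path)) {
            throw new IllegalArgumentException("Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        // File system derived from the application path (the S3 bucket).
        URI appFs = URI.create("s3://bucket-name");
        URI appLib = URI.create("s3://bucket-name/oozieJobs/ourAppName/workflow/lib");
        // Hypothetical sharelib location on HDFS.
        URI shareLib = URI.create("hdfs://namenode.example:8020/user/oozie/share/lib");

        checkPath(appFs, appLib); // same scheme and authority: passes

        try {
            checkPath(appFs, shareLib); // different scheme: rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, the HDFS sharelib path is rejected only because it is checked against the file system object built for the S3 application path. Checking each path against the file system that matches its own scheme and authority would avoid the rejection.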

Following up on this error message, I found the following commit (OOZIE-2944) in 
Oozie 5.0:
https://github.com/apache/oozie/commit/5998c18fde1da769e91e3ef1bcca484723730c76#diff-d4e9af2c1e2ddeae544be6182b948109

Since the error comes from the FileSystem used in 
core/src/main/java/org/apache/oozie/service/WorkflowAppService.java, I think 
the above commit MAY be causing it.

In 5.0.0, line 202 uses the "fs" that is created on line 177 from the "conf" 
created on line 169, as follows: 
https://github.com/apache/oozie/blob/branch-5.0/core/src/main/java/org/apache/oozie/service/WorkflowAppService.java#L166

    URI uri = new URI(jobConf.get(OozieClient.APP_PATH));
    Configuration conf = has.createConfiguration(uri.getAuthority());


But in 4.3.0, the same code reads as follows: 
https://github.com/apache/oozie/blob/branch-4.3/core/src/main/java/org/apache/oozie/service/WorkflowAppService.java#L167


    URI uri = new URI(jobConf.get(OozieClient.APP_PATH));
    Configuration conf = has.createJobConf(uri.getAuthority());


I am NOT 100% sure, but the code above does produce the FileSystem that 
eventually complains "Wrong FS" in my case, and the commit above changes how 
the "conf" is created, from createJobConf to createConfiguration.

So my question is: do you think the above change is causing my issue? If so, I 
assume there was a good reason for that commit, but is there also a solution 
for my use case?

Thanks

Yong
