Here you go!

- Add oozie.service.HadoopAccessorService.supported.filesystems as * in oozie-site.xml
- Include hadoop-aws-2.8.3.jar
- Rebuild Oozie with -Dhttpclient.version=4.5.5 -Dhttpcore.version=4.4.9
- Set jetty_opts with proxy values

On Sat, May 19, 2018 at 2:17 AM, Peter Cseh <gezap...@cloudera.com> wrote:

    Wow, great work!
    Can you please summarize the required steps? This would be useful for
    others, so we should probably add it to our documentation.
    Thanks in advance!
    Peter

On Fri, May 18, 2018 at 11:33 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    I got this fixed by setting jetty_opts with proxy values.

    Thanks Peter!

On Thu, May 17, 2018 at 4:05 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    OK, I fixed this by adding the AWS keys in Oozie, but now I'm getting the
    error below. I have tried setting the proxy in core-site.xml, but no luck.

    2018-05-17 15:39:20,602 ERROR CoordInputLogicEvaluatorPhaseOne:517 -
    SERVER[localhost] USER[-] GROUP[-] TOKEN[-] APP[-]
    JOB[0000000-180517144113498-oozie-xjt0-C]
    ACTION[0000000-180517144113498-oozie-xjt0-C@2]
    org.apache.oozie.service.HadoopAccessorException: E0902: Exception
    occurred: [doesBucketExist on cmsegmentation-qa:
    com.amazonaws.SdkClientException: Unable to execute HTTP request:
    Connect to mybucket.s3.amazonaws.com:443
    [mybucket.s3.amazonaws.com/52.216.165.155] failed: connect timed out]
        at org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:630)
        at org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:594)
        at org.apache.oozie.dependency.FSURIHandler.getFileSystem(FSURIHandler.java:184)

On Thu, May 17, 2018 at 2:53 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    OK, I got past this error by rebuilding Oozie with
    -Dhttpclient.version=4.5.5 -Dhttpcore.version=4.4.9, but now I'm getting
    this error:

    ACTION[0000000-180517144113498-oozie-xjt0-C@1]
    org.apache.oozie.service.HadoopAccessorException: E0902: Exception
    occurred: [doesBucketExist on cmsegmentation-qa:
    com.amazonaws.AmazonClientException: No AWS Credentials provided by
    BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider
    SharedInstanceProfileCredentialsProvider :
    com.amazonaws.SdkClientException: Unable to load credentials from
    service endpoint]

On Thu, May 17, 2018 at 12:24 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    Peter,

    Also, when I submit a job with the new httpclient jar, I get:

    Error: IO_ERROR : java.io.IOException: Error while connecting Oozie
    server. No of retries = 1. Exception = Could not authenticate,
    Authentication failed, status: 500, message: Server Error

On Thu, May 17, 2018 at 12:14 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    OK, I have tried this. It appears that s3a support requires httpclient
    4.4.x, while Oozie is bundled with httpclient 4.3.6. When httpclient is
    upgraded, the ext UI stops loading.

On Thu, May 17, 2018 at 10:28 AM, Peter Cseh <gezap...@cloudera.com> wrote:

    Purna,

    Based on
    https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#S3
    you should go for s3a. You'll have to include the AWS SDK as well, if I
    see it correctly:
    https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#S3A
    Also, the property names are slightly different, so you'll have to
    change the example I've given.

On Thu, May 17, 2018 at 4:16 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    Peter,

    I'm using the latest Oozie 5.0.0 and I have tried the changes below, but
    no luck.

    Is this for s3 or s3a?
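[Editorial note: Peter's remark that "the property names are slightly different" for s3a refers to the S3A connector's own credential keys. A minimal sketch of the s3a equivalents of the fs.s3.* properties discussed in this thread, assuming the S3A connector from hadoop-aws 2.8.x; the values are placeholders.]

```xml
<!-- Sketch: s3a equivalents of fs.s3.awsAccessKeyId / fs.s3.awsSecretAccessKey.
     Placeholder values; assumes the S3A connector shipped in hadoop-aws 2.8.x. -->
<configuration>
  <property>
    <name>fs.s3a.access.key</name>
    <value>[YOURKEYID]</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>[YOURKEY]</value>
  </property>
</configuration>
```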
    I'm using s3, but if this is for s3a, do you know which jar I need to
    include? I mean the hadoop-aws jar, or any other jar if required.
    hadoop-aws-2.8.3.jar is what I'm using.

On Wed, May 16, 2018 at 5:19 PM, Peter Cseh <gezap...@cloudera.com> wrote:

    Ok, I've found it.

    If you are using 4.3.0 or newer, this is the part which checks for
    dependencies:
    https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/command/coord/CoordCommandUtils.java#L914-L926
    It passes the coordinator action's configuration and even does
    impersonation to check for the dependencies:
    https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/coord/input/logic/CoordInputLogicEvaluatorPhaseOne.java#L159

    Have you tried the following in the coordinator xml?

    <action>
        <workflow>
            <app-path>hdfs://bar:9000/usr/joe/logsprocessor-wf</app-path>
            <configuration>
                <property>
                    <name>fs.s3.awsAccessKeyId</name>
                    <value>[YOURKEYID]</value>
                </property>
                <property>
                    <name>fs.s3.awsSecretAccessKey</name>
                    <value>[YOURKEY]</value>
                </property>
            </configuration>
        </workflow>
    </action>

    Based on the source, this should be able to poll s3 periodically.

On Wed, May 16, 2018 at 10:57 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    I have tried with the coordinator's configuration too, but no luck ☹️

On Wed, May 16, 2018 at 3:54 PM, Peter Cseh <gezap...@cloudera.com> wrote:

    Great progress there, purna! :)

    Have you tried adding these properties to the coordinator's
    configuration? We usually use the action config to build up the
    connection to the distributed file system. I'm not sure we're using
    these when polling the dependencies for coordinators, but I'm excited
    about you trying to make it work!

    I'll get back with a - hopefully - more helpful answer soon; I have to
    check the code in more depth first.
    gp

On Wed, May 16, 2018 at 9:45 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    Peter,

    I got rid of this error by adding hadoop-aws-2.8.3.jar and
    jets3t-0.9.4.jar, but I'm getting the error below now:

    java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access
    Key must be specified by setting the fs.s3.awsAccessKeyId and
    fs.s3.awsSecretAccessKey properties (respectively)

    I have tried adding the AWS access and secret keys in oozie-site.xml,
    hadoop core-site.xml, and hadoop-config.xml.

On Wed, May 16, 2018 at 2:30 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    I have tried this, with s3 added instead of *:

    <property>
        <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
        <value>hdfs,hftp,webhdfs,s3</value>
    </property>

    I'm getting the error below:

    java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
    org.apache.hadoop.fs.s3a.S3AFileSystem not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2369)
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2793)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2810)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2849)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2831)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
        at org.apache.oozie.service.HadoopAccessorService$5.run(HadoopAccessorService.java:625)
        at org.apache.oozie.service.HadoopAccessorService$5.run(HadoopAccessorService.java:623)

On Wed, May 16, 2018 at 2:19 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    This is what is in the logs:

    2018-05-16 14:06:13,500 INFO URIHandlerService:520 - SERVER[localhost]
    Loaded urihandlers [org.apache.oozie.dependency.FSURIHandler]
    2018-05-16 14:06:13,501 INFO URIHandlerService:520 - SERVER[localhost]
    Loaded default urihandler org.apache.oozie.dependency.FSURIHandler

On Wed, May 16, 2018 at 12:27 PM, Peter Cseh <gezap...@cloudera.com> wrote:

    That's strange; this exception should not happen in that case.
    Can you check the server logs for messages like this?
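[Editorial note: the ClassNotFoundException above means org.apache.hadoop.fs.s3a.S3AFileSystem is not on the Oozie server's classpath; the class ships in hadoop-aws, which in turn needs the matching AWS SDK jar. A hedged core-site.xml sketch of the explicit scheme-to-class mapping, assuming those jars have been placed on the server classpath.]

```xml
<!-- Sketch: explicit s3a scheme mapping in core-site.xml. This only helps if
     hadoop-aws-2.8.3.jar (and its matching aws-java-sdk jar) are on the Oozie
     server classpath; otherwise the same ClassNotFoundException recurs. -->
<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
```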
    LOG.info("Loaded urihandlers {0}", Arrays.toString(classes));
    LOG.info("Loaded default urihandler {0}", defaultHandler.getClass().getName());

    Thanks

On Wed, May 16, 2018 at 5:47 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    This is what I already have in my oozie-site.xml:

    <property>
        <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
        <value>*</value>
    </property>

On Wed, May 16, 2018 at 11:37 AM, Peter Cseh <gezap...@cloudera.com> wrote:

    You'll have to configure
    oozie.service.HadoopAccessorService.supported.filesystems properly. Its
    default is hdfs,hftp,webhdfs; it enlists the different filesystems
    supported for federation. If the wildcard "*" is specified, then ALL
    file schemes will be allowed.

    For testing purposes it's ok to put * in there in oozie-site.xml.

On Wed, May 16, 2018 at 5:29 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    Peter,

    I have tried to specify the dataset with a URI starting with s3://,
    s3a:// and s3n://, and I am getting an exception that fails the job:

    org.apache.oozie.dependency.URIHandlerException: E0904: Scheme [s3] not
    supported in uri [s3://mybucket/input.data]
        at org.apache.oozie.service.URIHandlerService.getURIHandler(URIHandlerService.java:185)
        at org.apache.oozie.service.URIHandlerService.getURIHandler(URIHandlerService.java:168)
        at org.apache.oozie.service.URIHandlerService.getURIHandler(URIHandlerService.java:160)
        at org.apache.oozie.command.coord.CoordCommandUtils.createEarlyURIs(CoordCommandUtils.java:465)
        at org.apache.oozie.command.coord.CoordCommandUtils.separateResolvedAndUnresolved(CoordCommandUtils.java:404)
        at org.apache.oozie.command.coord.CoordCommandUtils.materializeInputDataEvents(CoordCommandUtils.java:731)
        at org.apache.oozie.command.coord.CoordCommandUtils.materializeOneInstance(CoordCommandUtils.java:546)
        at org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand.materializeActions(CoordMaterializeTransitionXCommand.java:492)
        at org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand.materialize(CoordMaterializeTransitionXCommand.java:362)
        at org.apache.oozie.command.MaterializeTransitionXCommand.execute(MaterializeTransitionXCommand.java:73)
        at org.apache.oozie.command.MaterializeTransitionXCommand.execute(MaterializeTransitionXCommand.java:29)
        at org.apache.oozie.command.XCommand.call(XCommand.java:290)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:181)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

    Is S3 support specific to the CDH distribution, or should it work in
    Apache Oozie as well? I'm not using CDH yet.

On Wed, May 16, 2018 at 10:28 AM, Peter Cseh <gezap...@cloudera.com> wrote:

    I think it should be possible for Oozie to poll S3. Check out this
    description of how to make it work in jobs; something similar should
    work on the server side as well:
    https://www.cloudera.com/documentation/enterprise/5-9-x/topics/admin_oozie_s3.html

On Tue, May 15, 2018 at 4:43 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    Thanks Andras.

    I would also like to know whether Oozie supports AWS S3 as input events,
    to poll for a dependency file before kicking off a Spark action.

    For example: I don't want to kick off a Spark action until a file has
    arrived at a given AWS S3 location.

On Tue, May 15, 2018 at 10:17 AM, Andras Piros <andras.pi...@cloudera.com> wrote:

    Hi,

    Oozie needs HDFS to store workflow, coordinator, or bundle definitions,
    as well as sharelib files, in a safe, distributed and scalable way.
    Oozie needs YARN to run almost all of its actions, the Spark action
    being no exception.

    At the moment it's not feasible to install Oozie without those Hadoop
    components. How to install Oozie, please find here:
    https://oozie.apache.org/docs/5.0.0/AG_Install.html

    Regards,

    Andras

On Tue, May 15, 2018 at 4:11 PM, purna pradeep <purna2prad...@gmail.com> wrote:

    Hi,

    I would like to know if I can use the Spark action in Oozie without
    having a Hadoop cluster.
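[Editorial note: the "poll S3 for a dependency file before kicking off a Spark action" requirement discussed in this thread maps to a coordinator input dataset. A hypothetical sketch of such a dataset, assuming the s3a scheme has been enabled via oozie.service.HadoopAccessorService.supported.filesystems; bucket, dates and done-flag are placeholders.]

```xml
<!-- Hypothetical coordinator dataset polling an S3 path as an input
     dependency. All names and dates are placeholders. -->
<datasets>
  <dataset name="input" frequency="${coord:days(1)}"
           initial-instance="2018-05-01T00:00Z" timezone="UTC">
    <uri-template>s3a://mybucket/input/${YEAR}${MONTH}${DAY}</uri-template>
    <done-flag>input.data</done-flag>
  </dataset>
</datasets>
```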
    I want to use Oozie to schedule Spark jobs on a Kubernetes cluster.

    I'm a beginner in Oozie.

    Thanks

--
Peter Cseh | Software Engineer
cloudera.com <https://www.cloudera.com>
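[Editorial note: the "set jetty_opts with proxy values" step from the summary at the top of the thread can be sketched as below. This assumes Oozie 5's embedded Jetty picks up the standard JVM proxy system properties and that jetty_opts is exported from oozie-env.sh; the proxy host and port are placeholders.]

```shell
# Sketch for oozie-env.sh: pass standard JVM proxy properties to the Oozie
# server JVM so the S3 client can reach s3.amazonaws.com through the proxy.
# proxy.example.com:8080 is a placeholder.
export jetty_opts="${jetty_opts} -Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=8080 -Dhttps.proxyHost=proxy.example.com -Dhttps.proxyPort=8080"
```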