Here's some related info:
https://docs.hortonworks.com/HDPDocuments/HDCloudAWS/HDCloudAWS-1.8.0/bk_hdcloud-aws/content/s3-trouble/index.html
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md

On Wed, May 16, 2018, 3:45 PM purna pradeep <purna2prad...@gmail.com> wrote:

> Peter,
>
> I got rid of this error by adding
> hadoop-aws-2.8.3.jar and jets3t-0.9.4.jar
>
> But I’m getting below error now
>
> java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key
> must be specified by setting the fs.s3.awsAccessKeyId and
> fs.s3.awsSecretAccessKey properties (respectively)
>
> I have tried adding AWS access, secret keys in
> oozie-site.xml and hadoop core-site.xml, and hadoop-config.xml
>
> On Wed, May 16, 2018 at 2:30 PM purna pradeep <purna2prad...@gmail.com> wrote:
>
> > I have tried this, just added s3 instead of *
> >
> > <property>
> >   <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
> >   <value>hdfs,hftp,webhdfs,s3</value>
> > </property>
> >
> > Getting below error
> >
> > java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
> >     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2369)
> >     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2793)
> >     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2810)
> >     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
> >     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2849)
> >     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2831)
> >     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
> >     at org.apache.oozie.service.HadoopAccessorService$5.run(HadoopAccessorService.java:625)
> >     at org.apache.oozie.service.HadoopAccessorService$5.run(HadoopAccessorService.java:623
> >
> > On Wed, May 16, 2018 at 2:19 PM purna pradeep <purna2prad...@gmail.com> wrote:
> >
> >> This is what is in the logs
> >>
> >> 2018-05-16 14:06:13,500 INFO URIHandlerService:520 - SERVER[localhost] Loaded urihandlers [org.apache.oozie.dependency.FSURIHandler]
> >>
> >> 2018-05-16 14:06:13,501 INFO URIHandlerService:520 - SERVER[localhost] Loaded default urihandler org.apache.oozie.dependency.FSURIHandler
> >>
> >> On Wed, May 16, 2018 at 12:27 PM Peter Cseh <gezap...@cloudera.com> wrote:
> >>
> >>> That's strange, this exception should not happen in that case.
> >>> Can you check the server logs for messages like this?
> >>> LOG.info("Loaded urihandlers {0}", Arrays.toString(classes));
> >>> LOG.info("Loaded default urihandler {0}", defaultHandler.getClass().getName());
> >>> Thanks
> >>>
> >>> On Wed, May 16, 2018 at 5:47 PM, purna pradeep <purna2prad...@gmail.com> wrote:
> >>>
> >>>> This is what I already have in my oozie-site.xml
> >>>>
> >>>> <property>
> >>>>   <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
> >>>>   <value>*</value>
> >>>> </property>
> >>>>
> >>>> On Wed, May 16, 2018 at 11:37 AM Peter Cseh <gezap...@cloudera.com> wrote:
> >>>>
> >>>>> You'll have to configure this property properly:
> >>>>>
> >>>>> Name: oozie.service.HadoopAccessorService.supported.filesystems
> >>>>> Value: hdfs,hftp,webhdfs
> >>>>> Description: Enlist the different filesystems supported for federation.
> >>>>> If wildcard "*" is specified, then ALL file schemes will be allowed.
> >>>>>
> >>>>> For testing purposes it's ok to put * in there in oozie-site.xml
> >>>>>
> >>>>> On Wed, May 16, 2018 at 5:29 PM, purna pradeep <purna2prad...@gmail.com> wrote:
> >>>>>
> >>>>> > Peter,
> >>>>> >
> >>>>> > I have tried to specify dataset with uri starting with s3://, s3a:// and
> >>>>> > s3n:// and I am getting exception
> >>>>> >
> >>>>> > Exception occurred:E0904: Scheme [s3] not supported in uri
> >>>>> > [s3://mybucket/input.data] Making the job failed
> >>>>> >
> >>>>> > org.apache.oozie.dependency.URIHandlerException: E0904: Scheme [s3] not supported in uri [s3:// mybucket /input.data]
> >>>>> >     at org.apache.oozie.service.URIHandlerService.getURIHandler(URIHandlerService.java:185)
> >>>>> >     at org.apache.oozie.service.URIHandlerService.getURIHandler(URIHandlerService.java:168)
> >>>>> >     at org.apache.oozie.service.URIHandlerService.getURIHandler(URIHandlerService.java:160)
> >>>>> >     at org.apache.oozie.command.coord.CoordCommandUtils.createEarlyURIs(CoordCommandUtils.java:465)
> >>>>> >     at org.apache.oozie.command.coord.CoordCommandUtils.separateResolvedAndUnresolved(CoordCommandUtils.java:404)
> >>>>> >     at org.apache.oozie.command.coord.CoordCommandUtils.materializeInputDataEvents(CoordCommandUtils.java:731)
> >>>>> >     at org.apache.oozie.command.coord.CoordCommandUtils.materializeOneInstance(CoordCommandUtils.java:546)
> >>>>> >     at org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand.materializeActions(CoordMaterializeTransitionXCommand.java:492)
> >>>>> >     at org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand.materialize(CoordMaterializeTransitionXCommand.java:362)
> >>>>> >     at org.apache.oozie.command.MaterializeTransitionXCommand.execute(MaterializeTransitionXCommand.java:73)
> >>>>> >     at org.apache.oozie.command.MaterializeTransitionXCommand.execute(MaterializeTransitionXCommand.java:29)
> >>>>> >     at org.apache.oozie.command.XCommand.call(XCommand.java:290)
> >>>>> >     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>>>> >     at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:181)
> >>>>> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> >>>>> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> >>>>> >     at java.lang.Thread.run(Thread.java:748)
> >>>>> >
> >>>>> > Is S3 support specific to CDH distribution or should it work in Apache
> >>>>> > Oozie as well? I’m not using CDH yet so
> >>>>> >
> >>>>> > On Wed, May 16, 2018 at 10:28 AM Peter Cseh <gezap...@cloudera.com> wrote:
> >>>>> >
> >>>>> > > I think it should be possible for Oozie to poll S3.
> >>>>> > > Check out this
> >>>>> > > <https://www.cloudera.com/documentation/enterprise/5-9-x/topics/admin_oozie_s3.html>
> >>>>> > > description on how to make it work in jobs; something similar should work
> >>>>> > > on the server side as well.
> >>>>> > >
> >>>>> > > On Tue, May 15, 2018 at 4:43 PM, purna pradeep <purna2prad...@gmail.com> wrote:
> >>>>> > >
> >>>>> > > > Thanks Andras,
> >>>>> > > >
> >>>>> > > > I would also like to know if Oozie supports AWS S3 as input events to
> >>>>> > > > poll for a dependency file before kicking off a Spark action.
> >>>>> > > >
> >>>>> > > > For example: I don’t want to kick off a Spark action until a file has
> >>>>> > > > arrived at a given AWS S3 location.
> >>>>> > > >
> >>>>> > > > On Tue, May 15, 2018 at 10:17 AM Andras Piros <andras.pi...@cloudera.com> wrote:
> >>>>> > > >
> >>>>> > > > > Hi,
> >>>>> > > > >
> >>>>> > > > > Oozie needs HDFS to store workflow, coordinator, or bundle definitions, as
> >>>>> > > > > well as sharelib files in a safe, distributed and scalable way. Oozie needs
> >>>>> > > > > YARN to run almost all of its actions, Spark action being no exception.
> >>>>> > > > >
> >>>>> > > > > At the moment it's not feasible to install Oozie without those Hadoop
> >>>>> > > > > components. For how to install Oozie, please *find here
> >>>>> > > > > <https://oozie.apache.org/docs/5.0.0/AG_Install.html>*.
> >>>>> > > > >
> >>>>> > > > > Regards,
> >>>>> > > > >
> >>>>> > > > > Andras
> >>>>> > > > >
> >>>>> > > > > On Tue, May 15, 2018 at 4:11 PM, purna pradeep <purna2prad...@gmail.com> wrote:
> >>>>> > > > >
> >>>>> > > > > > Hi,
> >>>>> > > > > >
> >>>>> > > > > > Would like to know if I can use sparkaction in oozie without having Hadoop
> >>>>> > > > > > cluster?
> >>>>> > > > > >
> >>>>> > > > > > I want to use oozie to schedule spark jobs on Kubernetes cluster
> >>>>> > > > > >
> >>>>> > > > > > I’m a beginner in oozie
> >>>>> > > > > >
> >>>>> > > > > > Thanks
> >>>>> > >
> >>>>> > > --
> >>>>> > > *Peter Cseh *| Software Engineer
> >>>>> > > cloudera.com <https://www.cloudera.com>
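
For reference, the settings discussed in this thread can be sketched roughly as follows. This is only an illustration under assumptions, not a confirmed fix: it assumes the s3a connector from hadoop-aws is on the Oozie server classpath, that datasets use s3a:// URIs, and that the Oozie server actually reads the core-site.xml where the credentials are placed. The key values are placeholders. For plain s3:// URIs, the error message quoted above names fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey instead.

```xml
<!-- oozie-site.xml: allow the s3a scheme in addition to the values quoted in the thread -->
<property>
  <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
  <value>hdfs,hftp,webhdfs,s3a</value>
</property>

<!-- core-site.xml as seen by the Oozie server (assumption): s3a credentials, placeholder values -->
<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_ACCESS_KEY_ID</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_ACCESS_KEY</value>
</property>
```

The Oozie server would need a restart after such changes. Which configuration directory the server loads is installation-specific and is not established by the thread.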