Here you go!

   - Add oozie.service.HadoopAccessorService.supported.filesystems with value * in
   oozie-site.xml
   - Include hadoop-aws-2.8.3.jar
   - Rebuild Oozie with -Dhttpclient.version=4.5.5 -Dhttpcore.version=4.4.9
   - Set jetty_opts with proxy values
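For reference, the first step is just a property in oozie-site.xml; the rebuild and proxy steps are build/environment changes rather than config, so they are only sketched in the comment below (the Maven invocation and the proxy host/port are placeholders, not values from this thread):

```
<!-- oozie-site.xml: allow all filesystem schemes -->
<property>
  <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
  <value>*</value>
</property>

<!-- The rebuild and proxy steps are not config; roughly:
       mvn clean package -DskipTests -Dhttpclient.version=4.5.5 -Dhttpcore.version=4.4.9
     and, before starting the server, point it at your proxy, e.g.
       export jetty_opts="-Dhttp.proxyHost=<proxy-host> -Dhttp.proxyPort=<port>
                          -Dhttps.proxyHost=<proxy-host> -Dhttps.proxyPort=<port>" -->
```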



On Sat, May 19, 2018 at 2:17 AM Peter Cseh <gezap...@cloudera.com> wrote:

> Wow, great work!
> Can you please summarize the required steps? This would be useful for
> others so we probably should add it to our documentation.
> Thanks in advance!
> Peter
>
> On Fri, May 18, 2018 at 11:33 PM, purna pradeep <purna2prad...@gmail.com>
> wrote:
>
>> I got this fixed by setting jetty_opts with proxy values.
>>
>> Thanks Peter!!
>>
>> On Thu, May 17, 2018 at 4:05 PM purna pradeep <purna2prad...@gmail.com>
>> wrote:
>>
>>> Ok, I fixed this by adding the AWS keys in Oozie.
>>>
>>> But I'm getting the below error.
>>>
>>> I have tried setting the proxy in core-site.xml, but no luck.
>>>
>>>
>>> 2018-05-17 15:39:20,602 ERROR CoordInputLogicEvaluatorPhaseOne:517 -
>>> SERVER[localhost] USER[-] GROUP[-] TOKEN[-] APP[-]
>>> JOB[0000000-180517144113498-oozie-xjt0-C] ACTION[0000000-180517144113498-oozie-xjt0-C@2]
>>> org.apache.oozie.service.HadoopAccessorException:
>>> E0902: Exception occurred: [doesBucketExist on cmsegmentation-qa:
>>> com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect
>>> to mybucket.s3.amazonaws.com:443
>>> [mybucket.s3.amazonaws.com/52.216.165.155] failed: connect timed out]
>>>
>>> org.apache.oozie.service.HadoopAccessorException: E0902: Exception
>>> occurred: [doesBucketExist on cmsegmentation-qa:
>>> com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect
>>> to mybucket.s3.amazonaws.com:443
>>> [mybucket.s3.amazonaws.com/52.216.165.155] failed: connect timed out]
>>>
>>>                 at org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:630)
>>>                 at org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:594)
>>>                 at org.apache.oozie.dependency.FSURIHandler.getFileSystem(FSURIHandler.java:184)
>>>
>>>
>>>
>>>
>>> On Thu, May 17, 2018 at 2:53 PM purna pradeep <purna2prad...@gmail.com>
>>> wrote:
>>>
>>>> Ok, I got past this error
>>>>
>>>> by rebuilding Oozie with -Dhttpclient.version=4.5.5
>>>> -Dhttpcore.version=4.4.9.
>>>>
>>>> Now I'm getting this error:
>>>>
>>>>
>>>>
>>>> ACTION[0000000-180517144113498-oozie-xjt0-C@1]
>>>> org.apache.oozie.service.HadoopAccessorException: E0902: Exception
>>>> occurred: [doesBucketExist on mybucket: com.amazonaws.AmazonClientException:
>>>> No AWS Credentials provided by BasicAWSCredentialsProvider
>>>> EnvironmentVariableCredentialsProvider
>>>> SharedInstanceProfileCredentialsProvider :
>>>> com.amazonaws.SdkClientException: Unable to load credentials from service
>>>> endpoint]
>>>>
>>>> org.apache.oozie.service.HadoopAccessorException: E0902: Exception
>>>> occurred: [doesBucketExist on cmsegmentation-qa:
>>>> com.amazonaws.AmazonClientException: No AWS Credentials provided by
>>>> BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider
>>>> SharedInstanceProfileCredentialsProvider :
>>>> com.amazonaws.SdkClientException: Unable to load credentials from service
>>>> endpoint]
>>>>
>>>> On Thu, May 17, 2018 at 12:24 PM purna pradeep <purna2prad...@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> Peter,
>>>>>
>>>>> Also, when I submit a job with the new httpclient jar, I get
>>>>>
>>>>> ```Error: IO_ERROR : java.io.IOException: Error while connecting Oozie
>>>>> server. No of retries = 1. Exception = Could not authenticate,
>>>>> Authentication failed, status: 500, message: Server Error```
>>>>>
>>>>>
>>>>> On Thu, May 17, 2018 at 12:14 PM purna pradeep <
>>>>> purna2prad...@gmail.com> wrote:
>>>>>
>>>>>> Ok I have tried this
>>>>>>
>>>>>> It appears that s3a support requires httpclient 4.4.x and oozie is
>>>>>> bundled with httpclient 4.3.6. When httpclient is upgraded, the ext UI
>>>>>> stops loading.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, May 17, 2018 at 10:28 AM Peter Cseh <gezap...@cloudera.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Purna,
>>>>>>>
>>>>>>> Based on
>>>>>>> https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#S3
>>>>>>> you should try to go for s3a.
>>>>>>> You'll have to include the aws-sdk as well, if I see it correctly:
>>>>>>> https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#S3A
>>>>>>> Also, the property names are slightly different so you'll have to
>>>>>>> change the example I've given.
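For s3a the credential property names would be along these lines (the fs.s3a.* names are from the Hadoop S3A documentation linked above; the key values are placeholders):

```
<property>
  <name>fs.s3a.access.key</name>
  <value>[YOURKEYID]</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>[YOURKEY]</value>
</property>
```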
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, May 17, 2018 at 4:16 PM, purna pradeep <
>>>>>>> purna2prad...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Peter,
>>>>>>>>
>>>>>>>> I’m using the latest Oozie 5.0.0 and I have tried the below changes, but
>>>>>>>> no luck.
>>>>>>>>
>>>>>>>> Is this for s3 or s3a?
>>>>>>>>
>>>>>>>> I’m using s3, but if this is for s3a, do you know which jar I need to
>>>>>>>> include? I mean the hadoop-aws jar, or any other jar if required.
>>>>>>>>
>>>>>>>> hadoop-aws-2.8.3.jar is what I’m using.
>>>>>>>>
>>>>>>>> On Wed, May 16, 2018 at 5:19 PM Peter Cseh <gezap...@cloudera.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Ok, I've found it:
>>>>>>>>>
>>>>>>>>> If you are using 4.3.0 or newer this is the part which checks for
>>>>>>>>> dependencies:
>>>>>>>>>
>>>>>>>>> https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/command/coord/CoordCommandUtils.java#L914-L926
>>>>>>>>> It passes the coordinator action's configuration and even does
>>>>>>>>> impersonation to check for the dependencies:
>>>>>>>>>
>>>>>>>>> https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/coord/input/logic/CoordInputLogicEvaluatorPhaseOne.java#L159
>>>>>>>>>
>>>>>>>>> Have you tried the following in the coordinator xml:
>>>>>>>>>
>>>>>>>>>  <action>
>>>>>>>>>    <workflow>
>>>>>>>>>      <app-path>hdfs://bar:9000/usr/joe/logsprocessor-wf</app-path>
>>>>>>>>>      <configuration>
>>>>>>>>>        <property>
>>>>>>>>>          <name>fs.s3.awsAccessKeyId</name>
>>>>>>>>>          <value>[YOURKEYID]</value>
>>>>>>>>>        </property>
>>>>>>>>>        <property>
>>>>>>>>>          <name>fs.s3.awsSecretAccessKey</name>
>>>>>>>>>          <value>[YOURKEY]</value>
>>>>>>>>>        </property>
>>>>>>>>>      </configuration>
>>>>>>>>>    </workflow>
>>>>>>>>>  </action>
>>>>>>>>>
>>>>>>>>> Based on the source this should be able to poll s3 periodically.
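A coordinator dataset polling an S3 location could then look something like this (the bucket, path, frequency and done-flag here are made-up illustration values, not from this thread):

```
<datasets>
  <dataset name="input" frequency="${coord:days(1)}"
           initial-instance="2018-05-01T00:00Z" timezone="UTC">
    <uri-template>s3://mybucket/input/${YEAR}${MONTH}${DAY}</uri-template>
    <!-- the coordinator waits until this file exists under the resolved URI -->
    <done-flag>input.data</done-flag>
  </dataset>
</datasets>
```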
>>>>>>>>>
>>>>>>>>> On Wed, May 16, 2018 at 10:57 PM, purna pradeep <
>>>>>>>>> purna2prad...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I have tried with coordinator's configuration too but no luck ☹️
>>>>>>>>>>
>>>>>>>>>> On Wed, May 16, 2018 at 3:54 PM Peter Cseh <gezap...@cloudera.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Great progress there purna! :)
>>>>>>>>>>>
>>>>>>>>>>> Have you tried adding these properties to the coordinator's
>>>>>>>>>>> configuration? We usually use the action config to build up the
>>>>>>>>>>> connection to the distributed file system.
>>>>>>>>>>> I'm not sure we're using these when polling the dependencies
>>>>>>>>>>> for coordinators, but I'm excited about you trying to make it
>>>>>>>>>>> work!
>>>>>>>>>>>
>>>>>>>>>>> I'll get back with a - hopefully - more helpful answer soon, I
>>>>>>>>>>> have to check the code in more depth first.
>>>>>>>>>>> gp
>>>>>>>>>>>
>>>>>>>>>>> On Wed, May 16, 2018 at 9:45 PM, purna pradeep <
>>>>>>>>>>> purna2prad...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Peter,
>>>>>>>>>>>>
>>>>>>>>>>>> I got rid of this error by adding
>>>>>>>>>>>> hadoop-aws-2.8.3.jar and jets3t-0.9.4.jar
>>>>>>>>>>>>
>>>>>>>>>>>> But I’m getting below error now
>>>>>>>>>>>>
>>>>>>>>>>>> java.lang.IllegalArgumentException: AWS Access Key ID and
>>>>>>>>>>>> Secret Access Key must be specified by setting the 
>>>>>>>>>>>> fs.s3.awsAccessKeyId and
>>>>>>>>>>>> fs.s3.awsSecretAccessKey properties (respectively)
>>>>>>>>>>>>
>>>>>>>>>>>> I have tried adding the AWS access and secret keys in
>>>>>>>>>>>> oozie-site.xml, hadoop core-site.xml, and hadoop-config.xml.
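For the s3:// scheme, the error message above names the exact properties; in core-site.xml they would look like this (the key values are placeholders):

```
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>[YOURKEYID]</value>
</property>
<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>[YOURKEY]</value>
</property>
```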
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, May 16, 2018 at 2:30 PM purna pradeep <
>>>>>>>>>>>> purna2prad...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have tried this, just added s3 instead of *
>>>>>>>>>>>>>
>>>>>>>>>>>>> <property>
>>>>>>>>>>>>>   <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
>>>>>>>>>>>>>   <value>hdfs,hftp,webhdfs,s3</value>
>>>>>>>>>>>>> </property>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Getting below error
>>>>>>>>>>>>>
>>>>>>>>>>>>> java.lang.RuntimeException: java.lang.ClassNotFoundException:
>>>>>>>>>>>>> Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
>>>>>>>>>>>>>
>>>>>>>>>>>>>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2369)
>>>>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2793)
>>>>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2810)
>>>>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
>>>>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2849)
>>>>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2831)
>>>>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
>>>>>>>>>>>>>     at org.apache.oozie.service.HadoopAccessorService$5.run(HadoopAccessorService.java:625)
>>>>>>>>>>>>>     at org.apache.oozie.service.HadoopAccessorService$5.run(HadoopAccessorService.java:623)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, May 16, 2018 at 2:19 PM purna pradeep <
>>>>>>>>>>>>> purna2prad...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is what is in the logs
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2018-05-16 14:06:13,500  INFO URIHandlerService:520 -
>>>>>>>>>>>>>> SERVER[localhost] Loaded urihandlers
>>>>>>>>>>>>>> [org.apache.oozie.dependency.FSURIHandler]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2018-05-16 14:06:13,501  INFO URIHandlerService:520 -
>>>>>>>>>>>>>> SERVER[localhost] Loaded default urihandler
>>>>>>>>>>>>>> org.apache.oozie.dependency.FSURIHandler
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, May 16, 2018 at 12:27 PM Peter Cseh <
>>>>>>>>>>>>>> gezap...@cloudera.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That's strange, this exception should not happen in that
>>>>>>>>>>>>>>> case.
>>>>>>>>>>>>>>> Can you check the server logs for messages like this?
>>>>>>>>>>>>>>>         LOG.info("Loaded urihandlers {0}",
>>>>>>>>>>>>>>> Arrays.toString(classes));
>>>>>>>>>>>>>>>         LOG.info("Loaded default urihandler {0}",
>>>>>>>>>>>>>>> defaultHandler.getClass().getName());
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, May 16, 2018 at 5:47 PM, purna pradeep <
>>>>>>>>>>>>>>> purna2prad...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is what I already have in my oozie-site.xml
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> <property>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         <value>*</value>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> </property>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, May 16, 2018 at 11:37 AM Peter Cseh <
>>>>>>>>>>>>>>>> gezap...@cloudera.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> You'll have to configure
>>>>>>>>>>>>>>>>> oozie.service.HadoopAccessorService.supported.filesystems
>>>>>>>>>>>>>>>>> properly. Its default value is hdfs,hftp,webhdfs; it enlists
>>>>>>>>>>>>>>>>> the different filesystems supported for federation. If the
>>>>>>>>>>>>>>>>> wildcard "*" is specified, then ALL file schemes will be
>>>>>>>>>>>>>>>>> allowed.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For testing purposes it's ok to put * in there in
>>>>>>>>>>>>>>>>> oozie-site.xml
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, May 16, 2018 at 5:29 PM, purna pradeep <
>>>>>>>>>>>>>>>>> purna2prad...@gmail.com>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> > Peter,
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > I have tried to specify dataset with uri starting with
>>>>>>>>>>>>>>>>> s3://, s3a:// and
>>>>>>>>>>>>>>>>> > s3n:// and I am getting exception
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Exception occurred: E0904: Scheme [s3] not supported in uri
>>>>>>>>>>>>>>>>> > [s3://mybucket/input.data] Making the job failed
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > org.apache.oozie.dependency.URIHandlerException: E0904:
>>>>>>>>>>>>>>>>> > Scheme [s3] not supported in uri [s3://mybucket/input.data]
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.service.URIHandlerService.getURIHandler(URIHandlerService.java:185)
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.service.URIHandlerService.getURIHandler(URIHandlerService.java:168)
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.service.URIHandlerService.getURIHandler(URIHandlerService.java:160)
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.command.coord.CoordCommandUtils.createEarlyURIs(CoordCommandUtils.java:465)
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.command.coord.CoordCommandUtils.separateResolvedAndUnresolved(CoordCommandUtils.java:404)
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.command.coord.CoordCommandUtils.materializeInputDataEvents(CoordCommandUtils.java:731)
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.command.coord.CoordCommandUtils.materializeOneInstance(CoordCommandUtils.java:546)
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand.materializeActions(CoordMaterializeTransitionXCommand.java:492)
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand.materialize(CoordMaterializeTransitionXCommand.java:362)
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.command.MaterializeTransitionXCommand.execute(MaterializeTransitionXCommand.java:73)
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.command.MaterializeTransitionXCommand.execute(MaterializeTransitionXCommand.java:29)
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.command.XCommand.call(XCommand.java:290)
>>>>>>>>>>>>>>>>> >     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>>>>>>>>>> >     at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:181)
>>>>>>>>>>>>>>>>> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>>>>>>>>>>>> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>>>>>>>>>>>> >     at java.lang.Thread.run(Thread.java:748)
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Is S3 support specific to CDH distribution or should it
>>>>>>>>>>>>>>>>> work in Apache
>>>>>>>>>>>>>>>>> > Oozie as well? I’m not using CDH yet so
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > On Wed, May 16, 2018 at 10:28 AM Peter Cseh <
>>>>>>>>>>>>>>>>> gezap...@cloudera.com> wrote:
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > > I think it should be possible for Oozie to poll S3. Check out this
>>>>>>>>>>>>>>>>> > > <https://www.cloudera.com/documentation/enterprise/5-9-x/topics/admin_oozie_s3.html>
>>>>>>>>>>>>>>>>> > > description on how to make it work in jobs; something similar should work
>>>>>>>>>>>>>>>>> > > on the server side as well.
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> > > On Tue, May 15, 2018 at 4:43 PM, purna pradeep <
>>>>>>>>>>>>>>>>> purna2prad...@gmail.com>
>>>>>>>>>>>>>>>>> > > wrote:
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> > > > Thanks Andras,
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > Also I also would like to know if oozie supports Aws
>>>>>>>>>>>>>>>>> S3 as input events
>>>>>>>>>>>>>>>>> > > to
>>>>>>>>>>>>>>>>> > > > poll for a dependency file before kicking off a
>>>>>>>>>>>>>>>>> spark action
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > For example: I don’t want to kick off a spark action
>>>>>>>>>>>>>>>>> until a file is
>>>>>>>>>>>>>>>>> > > > arrived on a given AWS s3 location
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > On Tue, May 15, 2018 at 10:17 AM Andras Piros <
>>>>>>>>>>>>>>>>> > andras.pi...@cloudera.com
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > wrote:
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > > Hi,
>>>>>>>>>>>>>>>>> > > > >
>>>>>>>>>>>>>>>>> > > > > Oozie needs HDFS to store workflow, coordinator,
>>>>>>>>>>>>>>>>> or bundle
>>>>>>>>>>>>>>>>> > definitions,
>>>>>>>>>>>>>>>>> > > > as
>>>>>>>>>>>>>>>>> > > > > well as sharelib files in a safe, distributed and
>>>>>>>>>>>>>>>>> scalable way. Oozie
>>>>>>>>>>>>>>>>> > > > needs
>>>>>>>>>>>>>>>>> > > > > YARN to run almost all of its actions, Spark
>>>>>>>>>>>>>>>>> action being no
>>>>>>>>>>>>>>>>> > exception.
>>>>>>>>>>>>>>>>> > > > >
>>>>>>>>>>>>>>>>> > > > > At the moment it's not feasible to install Oozie
>>>>>>>>>>>>>>>>> without those Hadoop
>>>>>>>>>>>>>>>>> > > > > components. For how to install Oozie, please *see here
>>>>>>>>>>>>>>>>> > > > > <https://oozie.apache.org/docs/5.0.0/AG_Install.html>*.
>>>>>>>>>>>>>>>>> > > > >
>>>>>>>>>>>>>>>>> > > > > Regards,
>>>>>>>>>>>>>>>>> > > > >
>>>>>>>>>>>>>>>>> > > > > Andras
>>>>>>>>>>>>>>>>> > > > >
>>>>>>>>>>>>>>>>> > > > > On Tue, May 15, 2018 at 4:11 PM, purna pradeep <
>>>>>>>>>>>>>>>>> > > purna2prad...@gmail.com>
>>>>>>>>>>>>>>>>> > > > > wrote:
>>>>>>>>>>>>>>>>> > > > >
>>>>>>>>>>>>>>>>> > > > > > Hi,
>>>>>>>>>>>>>>>>> > > > > >
>>>>>>>>>>>>>>>>> > > > > > Would like to know if I can use sparkaction in
>>>>>>>>>>>>>>>>> oozie without having
>>>>>>>>>>>>>>>>> > > > > Hadoop
>>>>>>>>>>>>>>>>> > > > > > cluster?
>>>>>>>>>>>>>>>>> > > > > >
>>>>>>>>>>>>>>>>> > > > > > I want to use oozie to schedule spark jobs on
>>>>>>>>>>>>>>>>> Kubernetes cluster
>>>>>>>>>>>>>>>>> > > > > >
>>>>>>>>>>>>>>>>> > > > > > I’m a beginner in oozie
>>>>>>>>>>>>>>>>> > > > > >
>>>>>>>>>>>>>>>>> > > > > > Thanks
>>>>>>>>>>>>>>>>> > > > > >
>>>>>>>>>>>>>>>>> > > > >
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> > > --
>>>>>>>>>>>>>>>>> > > *Peter Cseh *| Software Engineer
>>>>>>>>>>>>>>>>> > > cloudera.com <https://www.cloudera.com>
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> > > [image: Cloudera] <https://www.cloudera.com/>
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> > > [image: Cloudera on Twitter] <
>>>>>>>>>>>>>>>>> https://twitter.com/cloudera> [image:
>>>>>>>>>>>>>>>>> > > Cloudera on Facebook] <
>>>>>>>>>>>>>>>>> https://www.facebook.com/cloudera> [image:
>>>>>>>>>>>>>>>>> > Cloudera
>>>>>>>>>>>>>>>>> > > on LinkedIn] <
>>>>>>>>>>>>>>>>> https://www.linkedin.com/company/cloudera>
>>>>>>>>>>>>>>>>> > > ------------------------------
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> *Peter Cseh *| Software Engineer
>>>>>>>>>>>>>>>>> cloudera.com <https://www.cloudera.com>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [image: Cloudera] <https://www.cloudera.com/>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [image: Cloudera on Twitter] <https://twitter.com/cloudera>
>>>>>>>>>>>>>>>>> [image:
>>>>>>>>>>>>>>>>> Cloudera on Facebook] <https://www.facebook.com/cloudera>
>>>>>>>>>>>>>>>>> [image: Cloudera
>>>>>>>>>>>>>>>>> on LinkedIn] <https://www.linkedin.com/company/cloudera>
>>>>>>>>>>>>>>>>> ------------------------------
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> *Peter Cseh *| Software Engineer
>>>>>>>>>>>>>>> cloudera.com <https://www.cloudera.com>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [image: Cloudera] <https://www.cloudera.com/>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [image: Cloudera on Twitter] <https://twitter.com/cloudera> 
>>>>>>>>>>>>>>> [image:
>>>>>>>>>>>>>>> Cloudera on Facebook] <https://www.facebook.com/cloudera> 
>>>>>>>>>>>>>>> [image:
>>>>>>>>>>>>>>> Cloudera on LinkedIn]
>>>>>>>>>>>>>>> <https://www.linkedin.com/company/cloudera>
>>>>>>>>>>>>>>> ------------------------------
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> *Peter Cseh *| Software Engineer
>>>>>>>>>>> cloudera.com <https://www.cloudera.com>
>>>>>>>>>>>
>>>>>>>>>>> [image: Cloudera] <https://www.cloudera.com/>
>>>>>>>>>>>
>>>>>>>>>>> [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
>>>>>>>>>>> Cloudera on Facebook] <https://www.facebook.com/cloudera> [image:
>>>>>>>>>>> Cloudera on LinkedIn]
>>>>>>>>>>> <https://www.linkedin.com/company/cloudera>
>>>>>>>>>>> ------------------------------
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> *Peter Cseh *| Software Engineer
>>>>>>>>> cloudera.com <https://www.cloudera.com>
>>>>>>>>>
>>>>>>>>> [image: Cloudera] <https://www.cloudera.com/>
>>>>>>>>>
>>>>>>>>> [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
>>>>>>>>> Cloudera on Facebook] <https://www.facebook.com/cloudera> [image:
>>>>>>>>> Cloudera on LinkedIn] <https://www.linkedin.com/company/cloudera>
>>>>>>>>> ------------------------------
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> *Peter Cseh *| Software Engineer
>>>>>>> cloudera.com <https://www.cloudera.com>
>>>>>>>
>>>>>>> [image: Cloudera] <https://www.cloudera.com/>
>>>>>>>
>>>>>>> [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
>>>>>>> Cloudera on Facebook] <https://www.facebook.com/cloudera> [image:
>>>>>>> Cloudera on LinkedIn] <https://www.linkedin.com/company/cloudera>
>>>>>>> ------------------------------
>>>>>>>
>>>>>>>
>
>
>
>
