Ok, I've found it:

If you are using 4.3.0 or newer, this is the part that checks for the
dependencies:
https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/command/coord/CoordCommandUtils.java#L914-L926
It passes the coordinator action's configuration and even does
impersonation to check for the dependencies:
https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/coord/input/logic/CoordInputLogicEvaluatorPhaseOne.java#L159
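
For reference, the dependency that code polls would come from a dataset
declaration in the coordinator. A hypothetical sketch (the bucket name,
path, and dates are made up for illustration):

```xml
<datasets>
  <!-- One instance per day; Oozie polls until the done-flag file exists -->
  <dataset name="input" frequency="${coord:days(1)}"
           initial-instance="2018-05-01T00:00Z" timezone="UTC">
    <uri-template>s3a://mybucket/input/${YEAR}${MONTH}${DAY}</uri-template>
    <done-flag>_SUCCESS</done-flag>
  </dataset>
</datasets>
```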

Have you tried the following in the coordinator xml:

 <action>
    <workflow>
      <app-path>hdfs://bar:9000/usr/joe/logsprocessor-wf</app-path>
      <configuration>
        <property>
          <name>fs.s3.awsAccessKeyId</name>
          <value>[YOURKEYID]</value>
        </property>
        <property>
          <name>fs.s3.awsSecretAccessKey</name>
          <value>[YOURKEY]</value>
        </property>
      </configuration>
    </workflow>
  </action>

Based on the source, this should allow Oozie to poll S3 periodically.
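
One caveat, based on the stack traces earlier in this thread: the fs.s3.*
names above belong to the old s3:// connector. If the dataset URI uses the
s3a:// scheme (S3AFileSystem), the Hadoop S3A connector expects differently
named properties. A sketch, with the same placeholder values:

```xml
<property>
  <name>fs.s3a.access.key</name>
  <value>[YOURKEYID]</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>[YOURKEY]</value>
</property>
```

For s3n:// URIs the analogous names are fs.s3n.awsAccessKeyId and
fs.s3n.awsSecretAccessKey.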

On Wed, May 16, 2018 at 10:57 PM, purna pradeep <purna2prad...@gmail.com>
wrote:

>
> I have tried with the coordinator's configuration too, but no luck ☹️
>
> On Wed, May 16, 2018 at 3:54 PM Peter Cseh <gezap...@cloudera.com> wrote:
>
>> Great progress there purna! :)
>>
>> Have you tried adding these properties to the coordinator's
>> configuration? We usually use the action config to build up the
>> connection to the distributed file system.
>> I'm not sure we use these when polling the dependencies for
>> coordinators, but I'm excited about you trying to make it work!
>>
>> I'll get back with a - hopefully - more helpful answer soon, I have to
>> check the code in more depth first.
>> gp
>>
>> On Wed, May 16, 2018 at 9:45 PM, purna pradeep <purna2prad...@gmail.com>
>> wrote:
>>
>>> Peter,
>>>
>>> I got rid of this error by adding
>>> hadoop-aws-2.8.3.jar and jets3t-0.9.4.jar
>>>
>>> But I’m getting the below error now:
>>>
>>> java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access
>>> Key must be specified by setting the fs.s3.awsAccessKeyId and
>>> fs.s3.awsSecretAccessKey properties (respectively)
>>>
>>> I have tried adding the AWS access and secret keys in
>>>
>>> oozie-site.xml, hadoop core-site.xml, and hadoop-config.xml.
>>>
>>>
>>>
>>>
>>> On Wed, May 16, 2018 at 2:30 PM purna pradeep <purna2prad...@gmail.com>
>>> wrote:
>>>
>>>>
>>>> I have tried this, just added s3 instead of *
>>>>
>>>> <property>
>>>>     <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
>>>>     <value>hdfs,hftp,webhdfs,s3</value>
>>>> </property>
>>>>
>>>>
>>>> Getting the below error:
>>>>
>>>> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
>>>> org.apache.hadoop.fs.s3a.S3AFileSystem not found
>>>>
>>>>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2369)
>>>>     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2793)
>>>>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2810)
>>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
>>>>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2849)
>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2831)
>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
>>>>     at org.apache.oozie.service.HadoopAccessorService$5.run(HadoopAccessorService.java:625)
>>>>     at org.apache.oozie.service.HadoopAccessorService$5.run(HadoopAccessorService.java:623)
>>>>
>>>>
>>>> On Wed, May 16, 2018 at 2:19 PM purna pradeep <purna2prad...@gmail.com>
>>>> wrote:
>>>>
>>>>> This is what is in the logs
>>>>>
>>>>> 2018-05-16 14:06:13,500  INFO URIHandlerService:520 - SERVER[localhost] Loaded urihandlers [org.apache.oozie.dependency.FSURIHandler]
>>>>>
>>>>> 2018-05-16 14:06:13,501  INFO URIHandlerService:520 - SERVER[localhost] Loaded default urihandler org.apache.oozie.dependency.FSURIHandler
>>>>>
>>>>>
>>>>> On Wed, May 16, 2018 at 12:27 PM Peter Cseh <gezap...@cloudera.com>
>>>>> wrote:
>>>>>
>>>>>> That's strange, this exception should not happen in that case.
>>>>>> Can you check the server logs for messages like this?
>>>>>>         LOG.info("Loaded urihandlers {0}", Arrays.toString(classes));
>>>>>>         LOG.info("Loaded default urihandler {0}",
>>>>>> defaultHandler.getClass().getName());
>>>>>> Thanks
>>>>>>
>>>>>> On Wed, May 16, 2018 at 5:47 PM, purna pradeep <
>>>>>> purna2prad...@gmail.com> wrote:
>>>>>>
>>>>>>> This is what I already have in my oozie-site.xml
>>>>>>>
>>>>>>> <property>
>>>>>>>     <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
>>>>>>>     <value>*</value>
>>>>>>> </property>
>>>>>>>
>>>>>>> On Wed, May 16, 2018 at 11:37 AM Peter Cseh <gezap...@cloudera.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> You'll have to configure
>>>>>>>> oozie.service.HadoopAccessorService.supported.filesystems properly.
>>>>>>>> Its default is hdfs,hftp,webhdfs and it enlists the different
>>>>>>>> filesystems supported for federation. If the wildcard "*" is
>>>>>>>> specified, then ALL file schemes will be allowed.
>>>>>>>>
>>>>>>>> For testing purposes it's OK to put * in there in oozie-site.xml.
>>>>>>>>
>>>>>>>> On Wed, May 16, 2018 at 5:29 PM, purna pradeep <
>>>>>>>> purna2prad...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> > Peter,
>>>>>>>> >
>>>>>>>> > I have tried to specify a dataset with a URI starting with s3://,
>>>>>>>> > s3a://, and s3n://, and I am getting this exception:
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > Exception occurred:E0904: Scheme [s3] not supported in uri
>>>>>>>> > [s3://mybucket/input.data] Making the job failed
>>>>>>>> >
>>>>>>>> > org.apache.oozie.dependency.URIHandlerException: E0904: Scheme [s3]
>>>>>>>> > not supported in uri [s3://mybucket/input.data]
>>>>>>>> >
>>>>>>>> >     at org.apache.oozie.service.URIHandlerService.getURIHandler(URIHandlerService.java:185)
>>>>>>>> >     at org.apache.oozie.service.URIHandlerService.getURIHandler(URIHandlerService.java:168)
>>>>>>>> >     at org.apache.oozie.service.URIHandlerService.getURIHandler(URIHandlerService.java:160)
>>>>>>>> >     at org.apache.oozie.command.coord.CoordCommandUtils.createEarlyURIs(CoordCommandUtils.java:465)
>>>>>>>> >     at org.apache.oozie.command.coord.CoordCommandUtils.separateResolvedAndUnresolved(CoordCommandUtils.java:404)
>>>>>>>> >     at org.apache.oozie.command.coord.CoordCommandUtils.materializeInputDataEvents(CoordCommandUtils.java:731)
>>>>>>>> >     at org.apache.oozie.command.coord.CoordCommandUtils.materializeOneInstance(CoordCommandUtils.java:546)
>>>>>>>> >     at org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand.materializeActions(CoordMaterializeTransitionXCommand.java:492)
>>>>>>>> >     at org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand.materialize(CoordMaterializeTransitionXCommand.java:362)
>>>>>>>> >     at org.apache.oozie.command.MaterializeTransitionXCommand.execute(MaterializeTransitionXCommand.java:73)
>>>>>>>> >     at org.apache.oozie.command.MaterializeTransitionXCommand.execute(MaterializeTransitionXCommand.java:29)
>>>>>>>> >     at org.apache.oozie.command.XCommand.call(XCommand.java:290)
>>>>>>>> >     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>> >     at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:181)
>>>>>>>> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>>> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>>> >     at java.lang.Thread.run(Thread.java:748)
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > Is S3 support specific to the CDH distribution, or should it work
>>>>>>>> > in Apache Oozie as well? I’m not using CDH yet.
>>>>>>>> >
>>>>>>>> > On Wed, May 16, 2018 at 10:28 AM Peter Cseh <
>>>>>>>> gezap...@cloudera.com> wrote:
>>>>>>>> >
>>>>>>>> > > I think it should be possible for Oozie to poll S3. Check out this
>>>>>>>> > > description of how to make it work in jobs:
>>>>>>>> > > https://www.cloudera.com/documentation/enterprise/5-9-x/topics/admin_oozie_s3.html
>>>>>>>> > > Something similar should work on the server side as well.
>>>>>>>> > >
>>>>>>>> > > On Tue, May 15, 2018 at 4:43 PM, purna pradeep <
>>>>>>>> purna2prad...@gmail.com>
>>>>>>>> > > wrote:
>>>>>>>> > >
>>>>>>>> > > > Thanks Andras,
>>>>>>>> > > >
>>>>>>>> > > > I would also like to know if Oozie supports AWS S3 as input
>>>>>>>> > > > events to poll for a dependency file before kicking off a Spark
>>>>>>>> > > > action.
>>>>>>>> > > >
>>>>>>>> > > > For example: I don’t want to kick off a Spark action until a
>>>>>>>> > > > file has arrived at a given AWS S3 location.
>>>>>>>> > > >
>>>>>>>> > > > On Tue, May 15, 2018 at 10:17 AM Andras Piros <
>>>>>>>> > andras.pi...@cloudera.com
>>>>>>>> > > >
>>>>>>>> > > > wrote:
>>>>>>>> > > >
>>>>>>>> > > > > Hi,
>>>>>>>> > > > >
>>>>>>>> > > > > Oozie needs HDFS to store workflow, coordinator, or bundle
>>>>>>>> > definitions,
>>>>>>>> > > > as
>>>>>>>> > > > > well as sharelib files in a safe, distributed and scalable
>>>>>>>> way. Oozie
>>>>>>>> > > > needs
>>>>>>>> > > > > YARN to run almost all of its actions, Spark action being no
>>>>>>>> > exception.
>>>>>>>> > > > >
>>>>>>>> > > > > At the moment it's not feasible to install Oozie without
>>>>>>>> those Hadoop
>>>>>>>> > > > > components. For installation instructions, please *see here
>>>>>>>> > > > > <https://oozie.apache.org/docs/5.0.0/AG_Install.html>*.
>>>>>>>> > > > >
>>>>>>>> > > > > Regards,
>>>>>>>> > > > >
>>>>>>>> > > > > Andras
>>>>>>>> > > > >
>>>>>>>> > > > > On Tue, May 15, 2018 at 4:11 PM, purna pradeep <
>>>>>>>> > > purna2prad...@gmail.com>
>>>>>>>> > > > > wrote:
>>>>>>>> > > > >
>>>>>>>> > > > > > Hi,
>>>>>>>> > > > > >
>>>>>>>> > > > > > I would like to know: can I use a Spark action in Oozie
>>>>>>>> > > > > > without having a Hadoop cluster?
>>>>>>>> > > > > >
>>>>>>>> > > > > > I want to use Oozie to schedule Spark jobs on a Kubernetes
>>>>>>>> > > > > > cluster.
>>>>>>>> > > > > >
>>>>>>>> > > > > > I’m a beginner in Oozie.
>>>>>>>> > > > > >
>>>>>>>> > > > > > Thanks
>>>>>>>> > > > > >
>>>>>>>> > > > >
>>>>>>>> > > >
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> > > --
>>>>>>>> > > *Peter Cseh *| Software Engineer
>>>>>>>> > > cloudera.com <https://www.cloudera.com>
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> >
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>
>>
>>
>>


