Thanks, everyone, for sharing your ideas. Very useful. I appreciate it.

On Fri, Apr 7, 2017 at 10:40 AM, Sam Elamin <hussam.ela...@gmail.com> wrote:

> Definitely agree with Gourav there. I wouldn't want Jenkins to run my
> workflow. Seems to me that you would only be using Jenkins for its
> scheduling capabilities.
>
> Yes, you can run tests, but you wouldn't want it to run your orchestration
> of jobs.
>
> What happens if Jenkins goes down for any reason? How do you have the
> conversation with your stakeholders that your pipeline is not working and
> they don't have data because the build server is going through an upgrade?
>
> However, to be fair, I understand what you are saying, Steve. If someone is
> in a place where they only have access to Jenkins and have to jump through
> hoops to set up or get access to new instances, then engineers will do what
> they always do: find ways to game the system to get their work done.
>
>
>
>
> On Fri, 7 Apr 2017 at 16:17, Gourav Sengupta <gourav.sengu...@gmail.com>
> wrote:
>
>> Hi Steve,
>>
>> Why would you ever do that? You are suggesting the use of a CI tool as a
>> workflow and orchestration engine.
>>
>> Regards,
>> Gourav Sengupta
>>
>> On Fri, Apr 7, 2017 at 4:07 PM, Steve Loughran <ste...@hortonworks.com>
>> wrote:
>>
>>> If you have Jenkins set up for some CI workflow, that can do scheduled
>>> builds and tests. Works well if you can do some build test before even
>>> submitting it to a remote cluster
>>>
>>> On 7 Apr 2017, at 10:15, Sam Elamin <hussam.ela...@gmail.com> wrote:
>>>
>>> Hi Shyla
>>>
>>> You have multiple options really, some of which have already been listed,
>>> but let me try to clarify.
>>>
>>> Assuming you have a spark application in a jar you have a variety of
>>> options
>>>
>>> You have to have an existing spark cluster that is either running on EMR
>>> or somewhere else.
>>>
>>> *Super simple / hacky*
>>> Cron job on EC2 that calls a simple shell script that does a spark-submit
>>> to a Spark cluster, OR creates or adds a step to an EMR cluster
>>>
>>> *More Elegant*
>>> Airflow/Luigi/AWS Data Pipeline (which is just cron in the UI) that
>>> will do the above step but have scheduling, potential backfilling, and
>>> error handling (retries, alerts, etc.)
>>>
>>> AWS are coming out with Glue <https://aws.amazon.com/glue/> soon, which
>>> does some Spark jobs, but I do not think it's available worldwide just yet
>>>
>>> Hope I cleared things up
>>>
>>> Regards
>>> Sam
>>>
>>>
>>> On Fri, Apr 7, 2017 at 6:05 AM, Gourav Sengupta <
>>> gourav.sengu...@gmail.com> wrote:
>>>
>>>> Hi Shyla,
>>>>
>>>> Why would you want to schedule a Spark job on EC2 instead of EMR?
>>>>
>>>> Regards,
>>>> Gourav
>>>>
>>>> On Fri, Apr 7, 2017 at 1:04 AM, shyla deshpande <
>>>> deshpandesh...@gmail.com> wrote:
>>>>
>>>>> I want to run a Spark batch job, maybe hourly, on AWS EC2. What is the
>>>>> easiest way to do this? Thanks
>>>>>
>>>>
>>>>
>>>
>>>
>>
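
For anyone finding this thread later, here is a rough sketch of the "super simple / hacky" option Sam describes: build a spark-submit command and the crontab line that runs it hourly. The jar path, main class, master URL, and log path are all made-up placeholders, and the helper names are mine, not from any library.

```python
import shlex


def build_spark_submit(jar_path, main_class, master_url, app_args=()):
    """Return a spark-submit invocation as an argument list.

    All values passed in are placeholders chosen by the caller; only the
    --class and --master flags are used here.
    """
    return [
        "spark-submit",
        "--class", main_class,
        "--master", master_url,
        jar_path,
        *app_args,
    ]


def hourly_cron_line(command, log_path):
    """Return a crontab entry that runs `command` at minute 0 of every hour."""
    quoted = " ".join(shlex.quote(part) for part in command)
    # Append stdout and stderr to a log file so failures leave a trace.
    return f"0 * * * * {quoted} >> {shlex.quote(log_path)} 2>&1"


if __name__ == "__main__":
    # Hypothetical job details, for illustration only.
    cmd = build_spark_submit(
        "/opt/jobs/batch-job.jar",
        "com.example.BatchJob",
        "spark://spark-master:7077",
    )
    print(hourly_cron_line(cmd, "/var/log/batch-job.log"))
```

The printed line could be added with `crontab -e` on the EC2 instance. Cron itself gives no retries or alerting, which is the gap the "more elegant" tools fill.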

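To make the "error handling (retries, alerts, etc.)" point from Sam's list concrete, below is a hand-rolled sketch of the kind of retry-then-alert behaviour that Airflow or Luigi provide out of the box. It is not how those tools implement it. The function name, the defaults, and using `print` as the alert are assumptions for illustration only.

```python
import subprocess
import time


def run_with_retries(command, max_attempts=3, delay_seconds=60,
                     run=subprocess.run, sleep=time.sleep, alert=print):
    """Run `command` until it exits with code 0 or attempts run out.

    Returns the attempt number that succeeded, or None if every attempt
    failed. `run`, `sleep`, and `alert` are injectable so the logic can be
    exercised without launching a real job.
    """
    for attempt in range(1, max_attempts + 1):
        result = run(command)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # Wait before trying again; a real scheduler might back off.
            sleep(delay_seconds)
    # Every attempt failed: notify someone (placeholder: print the message).
    alert(f"job failed after {max_attempts} attempts: {command}")
    return None
```

Wrapping a spark-submit call in something like this from a cron script covers retries, but backfilling and dependency handling are still better left to a proper scheduler.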