Miguel,

At a previous company we were working towards automating the deployment of
our Oozie workflows through our continuous integration pipeline. A green
build would stage the workflow.xml files and jars to a folder; our
deployment system would then copy them to the servers and run a shell
script that removed the current copies from HDFS and deployed the new ones.
The only issue you might have is finding a good way to archive previous
copies, but in the worst case you can always retrieve them from your
version control system (we kept them in our Java projects).
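
A minimal sketch of the deploy step described above, written in Python
rather than as the original shell script. The paths, the build id, and the
function names are hypothetical. Moving the current copy into an archive
directory instead of deleting it is an assumed variation meant to address
the archiving concern, not what the original script did. The `-mkdir -p`
flag assumes Hadoop 2 or later.

```python
import subprocess
from pathlib import PurePosixPath


def build_deploy_commands(staging_dir, hdfs_app_dir, archive_root, build_id,
                          app_exists=True):
    """Return the `hadoop fs` commands for one deployment, in run order."""
    app = PurePosixPath(hdfs_app_dir)
    archived = PurePosixPath(archive_root) / f"{app.name}-{build_id}"
    commands = []
    if app_exists:
        # Keep the previous copy: move it aside instead of deleting it.
        commands.append(["hadoop", "fs", "-mkdir", "-p", str(archived.parent)])
        commands.append(["hadoop", "fs", "-mv", str(app), str(archived)])
    # Upload the staged workflow.xml files and jars as the new application dir.
    commands.append(["hadoop", "fs", "-put", staging_dir, str(app)])
    return commands


def deploy(staging_dir, hdfs_app_dir, archive_root, build_id):
    """Run the commands in order, stopping at the first failure."""
    # `hadoop fs -test -d` exits 0 when the path is an existing directory.
    exists = subprocess.call(
        ["hadoop", "fs", "-test", "-d", hdfs_app_dir]) == 0
    for cmd in build_deploy_commands(staging_dir, hdfs_app_dir, archive_root,
                                     build_id, app_exists=exists):
        subprocess.check_call(cmd)
```

The CI job would call something like
deploy("/tmp/stage/etl", "/apps/oozie/etl", "/apps/oozie/_archive", "build-42")
after a green build. Keeping the command list separate from the code that
runs it makes a dry run easy to inspect before anything touches HDFS.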

OT:
Not meaning to hijack the thread, but I have a question about something
stated in the paper cited in a previous response:
"Oozie provides a failure management mechanism in which the system will be
able to recover the status of all its running workflows in case of any
unexpected failure. The Oozie recovery management module utilizes the
persistence store to manage checkpoints of the workflow execution
progress. In case of failure, the Oozie server will restart from its latest
checkpoint. Although the current implementation doesn't enable
automatic failover between the primary and secondary servers, the
existing mechanism will make it easy for automatic failover, which
is planned as part of our future work."

Can someone expand on this? In my testing I have not seen any workflow
recovery when we hard-kill a server and fail over. I realize the later
part says that automatic failover is not implemented yet, but the
preceding statement seems ambiguous about what workflow recovery actually
covers.

--
Matt

On Tue, Aug 28, 2012 at 12:17 PM, Miguel Lucero <[email protected]> wrote:

> Thanks, Mohammad! That's a really great paper. I hope that Yahoo might be
> able to share some of the automation and job management on the operations
> side in the future as well...
>
> I think a lot of users are arriving at similar solutions for managing job
> deployments. I've created a set of scripts to allow for easier deployment
> in the same way that Maxime has but it's still manual to some extent. I
> wanted to ask if there was a standard solution, but it just appears to be
> the creation of a set of scripts to parse job parameters and consolidate
> commands. I'd like to work toward extending our automation and have jobs
> deployed automatically without human intervention so I wanted to see what
> approaches others have taken or if the community had something available :-)
>
> Thanks everybody...
>
> Miguel
>
> -----Original Message-----
> From: Mohammad Islam [mailto:[email protected]]
> Sent: Monday, August 27, 2012 2:13 AM
> To: [email protected]; Eduardo Afonso Ferreira
> Subject: Re: Oozie Scaling and Management...
>
> Hi Miguel,
> Sorry for the late reply.
>
> We recently published a paper in an ACM/SIGMOD workshop that discusses
> some scalability issues related to Oozie. A copy can be found at:
> https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxzd2VldHdvcmtzaG9wMjAxMnxneDo1NzRhYjZlNzdmNTM1Yjgw
>
>
> For deployment/management/automation, we don't have much documentation
> with specific data. At Yahoo, Oozie is used extensively and processes
> thousands of jobs per day.
>
> If you have more specific questions, please feel free to ask.
>
> Regards,
> Mohammad
>
>
>
>
> ________________________________
> From: Miguel Lucero <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Thursday, August 23, 2012 4:52 PM
> Subject: Oozie Scaling and Management...
>
> Hi oozie-users,
>
> I wanted to ask if anyone could point me in the direction of any resources
> that might clarify scaling oozie to a large number of applications. I'm
> interested in the deployment and management aspects of larger oozie
> platforms. I haven't been able to find anything that goes beyond surface
> level like "use automation for deployment" etc. I'm trying to understand
> exactly how others are accomplishing that. Automation frameworks? Job
> Templating? The environment I manage is growing quickly, and has very
> distinct characteristics for each of our applications making automation a
> fun challenge so I am curious about how other users are tackling this. How
> have others handled hundreds, or thousands, of oozie application/workflow
> deployments in their environment? I apologize if there is a resource I'm
> missing online for information like this, but if there isn't one, can
> anyone share their insights?
>
> Thanks in advance and I apologize if this isn't the forum I should be
> using for questions like these...
>
> ml
>
> ________________________________
> This message is private and confidential. If you have received it in
> error, please notify the sender and remove it from your system.
>
>
