On Tue, Dec 18, 2012 at 9:44 PM, Adrian Cole <[email protected]> wrote:
> Thanks for bringing this project out into the open. Looks like a
> significant amount of effort, and worthwhile having a hard look. Moreover,
> it is good to see more projects using or extending Whirr, either under or
> above (heh or maybe both in this case!).
>

Thanks Adrian!

> Something like this sounds like it can add persistence and recovering to
> provisioning workflows. That's via activiti, right? I imagine how
> provisionr does resiliency (as well as HA/clustering of provisioning tasks)
> would make an exciting slideshare read. Do keep us up-to-date.
>

Yep, all the persistence is handled by Activiti - there are some rough
edges but it works as advertised. For HA we are planning to have a
multi-master database deployment with multiple job executors on different
machines. All the synchronisation is done through the DB.

>
> WRT whirr:
>
> I suppose whirr currently is possible to run embedded as a library, so
> direct dependency on provisionr isn't going to work provided we wish to
> continue this mode. That said, could be an interesting experiment to look
> at another, more mesos-like, "whirr as a service": provisionr as a tlp or a
> "service" subproject within whirr. interesting discussion regardless.
>

We will find a way later on :)

>
> Congrats and thanks for opening up this project!
>
> -A
>
>
> jclouds-related footnotes for the curious:
>
> Not that this is the forum for it, but you might recall that jclouds 1.6
> alpha is blocking on throttling/request efficiency/resilience
> improvements. Seems by your description that you've simultaneously been
> attacking this, among your other features.
>

Yep. I am looking forward to testing / helping with resilience and request
efficiency for jclouds 1.6.

>
> This has been slow on jclouds for a number of reasons, particularly looking
> for the way to do this without creating a strict service dependency, and
> without more spaghetti, or a bunch of lib deps.
>
> The RAX next gen throttle-thing denial of serviced many of us, and a lot of
> development time, effectively bumping priority. Steve helped raise an
> abstraction of http throttle error, which is now in the 1.5.x codeline.
> Next step is to employ it with a system that shares quality information,
> potentially out to an external service like hystrix if users choose. RAX
> raised the throttle globally, due to collaboration with jclouds, but
> jclouds aren't dropping the ball, are progressing this, and will release a
> library-only solution as a part of 1.6.
>
> WRT resumable workflow, I've looked at several systems that claim the
> ability to perform lightweight workflow or FSM. There's a ton of tech
> available for use, but very few have an embedded mode that doesn't do
> something like start zookeeper, or have ESB or BPM ambitions which tend to
> bloat the dependency tree. FWIW, my personal opinion is pipeline has the
> cleanest syntax, though it suffers from no OSS version as yet. 2 weeks
> ago, I started a conversation with google folks about this, but no news.
> Regardless, I'll keep folks posted as jclouds has for a long time aimed to
> have a resumable workflow aptitude without compromising light deps.
>

That would be great!

>
> http://code.google.com/p/appengine-pipeline/
>
> On Fri, Dec 14, 2012 at 7:34 AM, Andrei Savu <[email protected]>
> wrote:
>
> > Hi guys,
> >
> > It is no secret that at Axemblr we are using Apache Whirr for
> > provisioning and initial basic cluster configuration for Hadoop. As soon
> > as the machines are running we configure Hadoop by leveraging APIs from
> > existing tools like Cloudera Manager or Ambari.
> >
> > All the orchestration needed to make this happen is not trivial if you
> > want the final system to be predictable, robust, restartable and easy to
> > inspect while running.
> >
> > A few months ago we realised that we needed to re-work the machine
> > provisioning layer from Whirr and build a system that has the following
> > features:
> >
> > * should be able to provision 10s or 100s of virtual machines by doing a
> > good job at handling API throttling and by using batch operations as much
> > as possible
> >
> > * all the internal workflows should be persistent and as granular as
> > possible and each step should be idempotent
> >
> > * it should be possible to restart the application server while starting
> > virtual machines with no impact
> >
> > * it should have a modular architecture and provide enough flexibility to
> > be able to work with a large number of public and private clouds just by
> > replacing modules
> >
> > * it should hide all this complexity behind a simple REST API and a
> > simple interactive shell
> >
> > * it should be able to automatically build gold base images and use them
> > to spawn large clusters
> >
> > We've spent some time looking for existing products that do all this and
> > in the end we've decided that it's better to start from scratch and build
> > this system as a new project based on Activiti, Apache Karaf, jclouds and
> > native sdks.
> >
> > The source code is now publicly available at:
> >
> > https://github.com/axemblr/axemblr-provisionr
> >
> > I would really like to know what you think about the work we've done so
> > far. The project will improve a lot over the next couple of weeks /
> > months so I encourage you to stay tuned.
> >
> > We want to bring this project to the Apache Foundation later on. I will
> > give a talk in February at ApacheCon NA on this.
> >
> > Cheers,
> >
> > -- Andrei Savu / axemblr.com
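
The requirements in the original announcement (persistent, granular workflows with idempotent steps, restart with no impact, and graceful handling of API throttling) can be illustrated with a minimal Python sketch. It is not how Provisionr or Activiti implement this: the JSON state file merely stands in for the workflow database, and `ThrottledError`, `run_workflow`, `state_path` and the backoff schedule are hypothetical names and choices made up for the example.

```python
import json
import os
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a cloud API 'rate limit exceeded' reply."""


def run_workflow(steps, state_path, max_attempts=5, sleep=time.sleep):
    """Run (name, action) steps in order, recording each finished step on disk.

    A restarted process that calls this again with the same state_path skips
    the steps already recorded as finished.
    """
    done = []
    if os.path.exists(state_path):
        with open(state_path) as handle:
            done = json.load(handle)

    for name, action in steps:
        if name in done:
            continue  # finished before the restart
        for attempt in range(max_attempts):
            try:
                action()
                break
            except ThrottledError:
                if attempt == max_attempts - 1:
                    raise  # give up after the last allowed attempt
                sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
        done.append(name)
        with open(state_path, "w") as handle:
            json.dump(done, handle)  # persist progress after every step
    return done
```

A crash between `action()` returning and the state file being written causes that one step to run again after the restart, which is why each step has to be idempotent: re-running "create machines" must find the machines that already exist rather than start new ones.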
