+1 !! Waiting eagerly :-)
- Milind

On Jan 11, 2011, at 11:02 AM, Avery Ching wrote:

> Hello,
>
> We have been working on Clusterlib at Yahoo! and would like to contribute it as a subproject to ZooKeeper. Clusterlib was developed as a next-generation platform for creating/coordinating search applications/services (including crawling, processing, indexing, and front end) at Yahoo!. We suspect much of this work will be useful for others trying to build large-scale/distributed applications that would like to coordinate and share the same semantics.
>
> Here is a (relatively) short summary of why Clusterlib was developed:
>
> Large-scale distributed applications are difficult and time-consuming to develop, since a great deal of effort is spent solving the same challenges (consistency, fault tolerance, naming problems, etc.). Additionally, coordinating these applications is typically ad hoc and hard to maintain. Clusterlib fills the gap by providing distributed application developers with an object-oriented data model, an asynchronous event handling system, well-defined consistency semantics, and methods for making coordination easy across cooperating applications. Some example applications might include a search engine, a scalable file system, a large-scale data cache, etc.
>
> Clusterlib is a middleware library for building distributed applications. It was designed to simplify the job of application developers and provides a set of distributed objects that all inherit from the same Notifyable interface. The set of distributed objects includes: Root, Application, Group, DataDistribution, Node, ProcessSlot, PropertyList, and Queue. To give context, each object is described briefly.
>
> • Root is the point-of-entry object at the top of the Clusterlib hierarchy and manages its Applications. There is only one Root per Clusterlib instance.
> • Applications are used as a namespace for managing Groups, Nodes, DataDistributions, Queues, and PropertyLists in a user-defined application. Using the application concept (as opposed to only having groups) makes accessing another Application’s child objects explicit to developers.
> • Groups are a logical association of Clusterlib objects that can be nested. Since large-scale applications often require hundreds or thousands of nodes to operate, there might be a “node” Group that has an “alive” child Group and a “dead” child Group, each populated with its respective set of nodes.
> • DataDistributions balance load and data across a set of objects. DataDistributions provide user-extensible key hashing to variable-sized hash ranges for user flexibility.
> • Nodes typically represent a physical or virtual node in an application. Each has child ProcessSlots that can be used to reserve system resources.
> • ProcessSlots maintain an actual process running locally on the physical machine. Each can also contain other information about the process, such as a PID or port array.
> • PropertyLists may be created and maintained as a child of any Notifyable object. A PropertyList is basically a key-value store that can, for instance, be used to determine how long a timeout should be on a particular server or the number of retries to allow before giving up. PropertyLists are leaves in the Clusterlib hierarchy and cannot have any children.
> • Queues are distributed FIFO queues. They can be used to synchronize threads, pass messages between threads, and for JSON-RPC.
>
> Clusterlib objects are composed in a hierarchy and maintain ACID compliance. Distributed, non-blocking, fault-tolerant locks can be acquired on any Clusterlib object, and asynchronous event handlers can be registered for object-specific changes.
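[The DataDistribution idea quoted above, user-extensible key hashing onto variable-sized hash ranges, can be sketched in a few lines. The class and method names below are purely illustrative; they are not Clusterlib's actual API.]

```python
import bisect
import hashlib

class DataDistribution:
    """Illustrative sketch (not Clusterlib's real API): maps keys onto
    shards via variable-sized hash ranges with a pluggable hash."""

    def __init__(self, range_ends, hash_fn=None):
        # range_ends[i] is the upper bound of shard i's hash range;
        # variable-sized ranges let shards carry unequal load.
        self.range_ends = sorted(range_ends)
        # "User-extensible key hashing": callers may supply their own
        # hash function instead of the default.
        self.hash_fn = hash_fn or self._default_hash

    @staticmethod
    def _default_hash(key):
        digest = hashlib.md5(key.encode()).hexdigest()
        return int(digest, 16) % (2 ** 32)

    def shard_for(self, key):
        h = self.hash_fn(key)
        # First range whose upper bound covers the hash value.
        i = bisect.bisect_left(self.range_ends, h)
        # Wrap around if a custom hash exceeds the last bound.
        return i % len(self.range_ends)
```

A custom `hash_fn` stands in for the user-extensible hashing the email describes; the default hash simply spreads keys uniformly over a 32-bit space.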
> For example, if a ProcessSlot changed, an asynchronous event handler might check whether the process is still running and, if not, try to restart it. There are three types of Clusterlib-defined locks (child, notifyable, and ownership). Clusterlib internally uses a child lock on a parent object to access child objects; users may also use this lock if desired. A notifyable lock is intended as a general-purpose lock on a Notifyable. Finally, ownership locks are intended to express concepts such as “leadership” in a Group or “reservation” of a Node. To allow more parallelism, Clusterlib locks can be acquired in shared or exclusive modes.
>
> Since Clusterlib relies upon ZooKeeper as a fault-tolerant consensus service, it inherits many of ZooKeeper's performance and fault-tolerance properties. As the number of ZooKeeper servers increases, read performance scales up nearly linearly; however, write performance scales inversely due to ZooKeeper's internal atomic broadcast protocol. As long as the number of correctly functioning ZooKeeper servers maintains a quorum, ZooKeeper can continue to operate, and the same is true for Clusterlib applications. The locks and leadership election algorithms in Clusterlib are fault-tolerant to client failure due to the use of ZooKeeper ephemeral nodes.
>
> In addition to being a library, Clusterlib comes with an HTTP server for viewing/manipulating Clusterlib objects and/or ZooKeeper znodes directly. I've linked some PNGs to illustrate this. It is also bundled with an extensible CLI. We have also developed a suite of over 90 unit tests that simulate distributed event orderings using MPI to test for many of those hard-to-find distributed bugs. It has been tested to build on flavors of Red Hat Linux, Ubuntu Linux, and OS X.
>
> We would like to see it as a subproject of ZooKeeper because it's tightly integrated with ZooKeeper. What do folks think about Clusterlib as a subproject of ZooKeeper?
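[The fault-tolerance claim above rests on ZooKeeper's standard ephemeral sequential node recipe: each contender creates an ephemeral sequential znode, the lowest sequence number holds ownership, and when the owner's session expires ZooKeeper deletes its znode, so ownership passes automatically to the next contender. A toy in-memory simulation of that logic, not Clusterlib's or ZooKeeper's real code, looks roughly like this:]

```python
class EphemeralElection:
    """Toy in-memory stand-in for ZooKeeper ephemeral sequential znodes,
    showing why a crashed client cannot keep holding ownership."""

    def __init__(self):
        self._next_seq = 0
        self._nodes = {}  # client_id -> sequence number

    def join(self, client_id):
        # Analogous to create(path, EPHEMERAL | SEQUENTIAL): the node's
        # lifetime is tied to the client's session.
        self._nodes[client_id] = self._next_seq
        self._next_seq += 1

    def session_expired(self, client_id):
        # ZooKeeper deletes ephemeral nodes when the owning session
        # dies, so a crashed leader releases ownership automatically.
        self._nodes.pop(client_id, None)

    def leader(self):
        # The contender with the lowest sequence number owns the lock.
        if not self._nodes:
            return None
        return min(self._nodes, key=self._nodes.get)
```

In the real recipe each contender watches only its immediate predecessor's znode to avoid a herd effect on every change; the simulation above skips watches and just recomputes the minimum.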
>
> Thanks,
>
> Avery
>
> Clusterlib-UI snapshot link
> http://users.eecs.northwestern.edu/~aching/clusterlib-ui.png
>
> ZooKeeper-UI snapshot link
> http://users.eecs.northwestern.edu/~aching/zookeeper-ui.png

---
Milind Bhandarkar
mbhandar...@linkedin.com