On Wed, Jan 12, 2011 at 10:18 AM, Avery Ching <ach...@yahoo-inc.com> wrote:
> Thanks for the suggestions on http://incubator.apache.org/
> The reason why we thought it would be best as ZooKeeper subproject was
> because it is heavily dependent on ZooKeeper.

Subproj is fine if that's the way you want to go, just highlighting
these other possibilities.

> As for libmicrohttpd's LGPL, sorry if it wasn't clearer in the README,
> but we only link to it, we do not include the source code for libmicrohttpd.
> libmicrohttpd is only required if you want to build the Clusterlib HTTP
> server.

Seems to me, though, that the UI is pretty useful, so it would be a good
idea to move to a category A license soonish.

I thought the fact that you detailed the license situation was great,
very helpful. Might be good to break down into sections; core, UI, ...
and be more explicit.

You should also take a look at Apache RAT (Release Audit Tool). It can
scan your code for conformance to Apache license guidelines, look
for prohibited licenses, etc. http://incubator.apache.org/rat/

Patrick

> Avery
> On Jan 12, 2011, at 8:53 AM, Patrick Hunt wrote:
>
> Hi Avery, Clusterlib looks like some great functionality, I don't see
> why we couldn't include it as a subproject (see one caveat I noticed
> below). I'd also like to point out that the Incubator is also a great
> option for the project: http://incubator.apache.org/ Have you
> considered that?
>
> According to the readme on GH a dependency exists on "libmicrohttpd"
> which is LGPL licensed. Unfortunately we (apache projects) cannot
> include LGPL licensed code, see "category X" here
> http://www.apache.org/legal/3party.html This dependency would have to
> be removed prior to adding the subproject.
>
> Regards,
>
> Patrick
>
> On Tue, Jan 11, 2011 at 5:34 PM, Avery Ching <ach...@yahoo-inc.com> wrote:
>
> Sorry for the delay (meetings). I just threw it up on GitHub.
>
> https://github.com/aching/Clusterlib
>
> Enjoy!
>
> Avery
>
> On Jan 11, 2011, at 3:42 PM, Fournier, Camille F. [Tech] wrote:
>
> Is the code somewhere we can look at it right now?
>
> C
>
> -----Original Message-----
> From: Avery Ching [mailto:ach...@yahoo-inc.com]
> Sent: Tuesday, January 11, 2011 2:02 PM
> To: dev@zookeeper.apache.org
> Subject: Discussion - Clusterlib as a subproject for ZooKeeper
>
> Hello,
>
> We have been working on Clusterlib at Yahoo! and would like to contribute it
> as a subproject to ZooKeeper.  Clusterlib was developed as a next-generation
> platform for creating/coordinating search applications/services (including
> crawling, processing, indexing, and front end) at Yahoo!.  We suspect much
> of this work will be useful for others trying to build up
> large-scale/distributed applications that would like to coordinate and share
> the same semantics.
>
> Here is a (relatively) short summary of why Clusterlib was developed:
>
> Large-scale distributed applications are difficult and time-consuming to
> develop since a great deal of effort is spent solving the same challenges
> (consistency, fault-tolerance, naming problems, etc.). Additionally,
> coordinating these applications is typically ad-hoc and hard to maintain.
> Clusterlib fills the gap by providing distributed application developers
> with an object-oriented data model, asynchronous event handling system,
> well-defined consistency semantics, and methods for making coordination easy
> across cooperating applications. Some example applications might include a
> search engine, scalable file system, large-scale data cache, etc.
>
> Clusterlib is a middleware library for building distributed applications. It
> was designed to simplify the job of application developers and provides a
> set of distributed objects that all inherit from the same Notifyable
> interface. The set of distributed objects includes: Root, Application,
> Group, DataDistribution, Node, ProcessSlot, PropertyList, and Queue. In
> order to give context, each object is described briefly.
>
> * Root is a point-of-entry object at the top of the hierarchy in Clusterlib
> and manages its Applications. There is only one Root per Clusterlib
> instance.
>
> * Applications are used as a namespace for managing Groups, Nodes,
> DataDistributions, Queues, and PropertyLists in a user-defined application.
> Using the application concept (as opposed to only having groups) makes
> accessing another Application's child objects explicit to developers.
>
> * Groups are a logical association of Clusterlib objects that can be nested.
> Since large-scale applications often require hundreds or thousands of nodes
> to operate, there might be a "node" Group that has an "alive" child Group and
> a "dead" child Group that are each populated with their respective sets of
> nodes.
>
> * DataDistributions balance load and data across a set of objects.
> DataDistributions provide user-extensible key hashing to variable-sized hash
> ranges for user flexibility.
>
> * Nodes typically represent a physical or virtual node in an application.
> Each Node has child ProcessSlots that can be used to reserve system resources.
>
> * ProcessSlots maintain an actual process running locally on the physical
> machine. A ProcessSlot can also contain other information about the process,
> such as a PID or port array.
>
> * PropertyLists may be created and maintained as a child of any Notifyable
> object. A PropertyList is basically a key-value store that can, for instance,
> be used to determine how long a timeout would be on a particular server or
> the number of retries to allow before giving up. PropertyLists are leaves in
> the Clusterlib hierarchy and cannot have any children.
>
> * Queues are distributed FIFO queues. They can be used to synchronize
> threads, to pass messages between threads, and for JSON-RPC.
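
The object hierarchy described in the list above can be pictured with a small in-memory sketch. This is illustration only and not Clusterlib's actual API: the Python `Notifyable` class, its `add_child` and `path` methods, the path format, and the example names ("search", "host-01", and so on) are all assumptions made for this example. Only the object kinds, the nested "node"/"alive" Groups, and the rule that PropertyLists are leaves come from the description.

```python
class Notifyable:
    """Hypothetical stand-in for a Clusterlib object (not the real API)."""

    def __init__(self, kind, name, parent=None):
        self.kind = kind          # e.g. "Root", "Application", "Group"
        self.name = name
        self.parent = parent
        self.children = {}

    def add_child(self, kind, name):
        # The description states that PropertyLists cannot have children.
        if self.kind == "PropertyList":
            raise ValueError("PropertyLists are leaves and cannot have children")
        child = Notifyable(kind, name, parent=self)
        self.children[(kind, name)] = child
        return child

    def path(self):
        # Walk up to the Root and join the kind:name pairs (assumed format).
        parts = []
        node = self
        while node is not None:
            parts.append(f"{node.kind}:{node.name}")
            node = node.parent
        return "/" + "/".join(reversed(parts))


root = Notifyable("Root", "root")
app = root.add_child("Application", "search")
nodes = app.add_child("Group", "node")
alive = nodes.add_child("Group", "alive")
host = alive.add_child("Node", "host-01")
slot = host.add_child("ProcessSlot", "slot-0")
props = host.add_child("PropertyList", "defaults")

print(slot.path())
```
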
>
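
The DataDistribution bullet above mentions user-extensible key hashing onto variable-sized hash ranges. One way that idea could look is sketched below. It is a guess at the concept, not Clusterlib's implementation: the 32-bit hash space, the SHA-256 based `default_hash`, and the `add_range`/`lookup` method names are all assumptions introduced for this example.

```python
import bisect
import hashlib

HASH_SPACE = 2 ** 32  # assumed size of the hash space


def default_hash(key):
    """Stable default hash; callers may pass their own function instead."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class DataDistribution:
    """Hypothetical sketch: maps keys to targets through hash ranges."""

    def __init__(self, hash_fn=default_hash):
        self.hash_fn = hash_fn
        self.upper_bounds = []  # sorted exclusive upper bound of each range
        self.targets = []       # target that owns the matching range

    def add_range(self, upper_bound, target):
        # A range covers [previous upper bound, upper_bound), so ranges
        # can be made as wide or as narrow as the user likes.
        index = bisect.bisect_left(self.upper_bounds, upper_bound)
        self.upper_bounds.insert(index, upper_bound)
        self.targets.insert(index, target)

    def lookup(self, key):
        value = self.hash_fn(key) % HASH_SPACE
        index = bisect.bisect_right(self.upper_bounds, value)
        if index == len(self.targets):
            raise KeyError(f"no range covers hash value {value}")
        return self.targets[index]


# A quarter of the hash space goes to one target, the rest to another.
dist = DataDistribution()
dist.add_range(HASH_SPACE // 4, "small-node")
dist.add_range(HASH_SPACE, "big-node")
print(dist.lookup("doc-42"))
```

Passing a different `hash_fn` to the constructor is the "user-extensible" part of this sketch; the range boundaries control how much of the key space each target receives.
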
> Clusterlib objects are composed in a hierarchy and maintain ACID compliance.
> Distributed, non-blocking, fault-tolerant locks can be acquired on any
> Clusterlib object and asynchronous event handlers can be registered for
> object-specific changes. For example, if a ProcessSlot changed, an
> asynchronous event handler might check to see if the process is still
> running and if not, try to restart it. There are 3 types of
> Clusterlib-defined locks (child, notifyable, and ownership). Clusterlib
> internally uses a child lock on a parent object to access child objects;
> however, users may also use this lock if desired. A notifyable lock is
> intended as a general-purpose lock on a Notifyable. Finally, ownership locks
> are intended to express concepts such as "leadership" in a Group or
> "reservation" of a Node. In order to allow more parallelism, Clusterlib
> locks can be accessed in shared or exclusive modes.
>
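
The ProcessSlot example in the paragraph above (a handler that notices a stopped process and restarts it) can be sketched as a tiny event registry. Everything here is hypothetical and local to one Python process: the `EventBus` class, its method names, the object path string, and the `running` flag are assumptions for illustration, and a list stands in for actually restarting anything. Events are queued by `notify` and handled later by `dispatch_pending`, which only imitates asynchronous delivery.

```python
from collections import defaultdict, deque


class EventBus:
    """Hypothetical sketch of per-object event handler registration."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.pending = deque()

    def register(self, object_path, handler):
        self.handlers[object_path].append(handler)

    def notify(self, object_path, state):
        # Record the change; handlers run later, not in the caller.
        self.pending.append((object_path, state))

    def dispatch_pending(self):
        while self.pending:
            object_path, state = self.pending.popleft()
            for handler in self.handlers[object_path]:
                handler(object_path, state)


restarted = []


def restart_if_stopped(object_path, state):
    # Stand-in for "check whether the process is still running and,
    # if not, try to restart it".
    if not state.get("running", False):
        restarted.append(object_path)


bus = EventBus()
bus.register("host-01/slot-0", restart_if_stopped)
bus.notify("host-01/slot-0", {"running": False})
bus.notify("host-01/slot-0", {"running": True})
bus.dispatch_pending()
print(restarted)
```
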
> Since Clusterlib relies upon ZooKeeper as a fault-tolerant consensus
> service, it inherits many of its performance and fault-tolerance properties.
> As the number of ZooKeeper servers increases, read performance scales up
> nearly linearly; however, write performance scales inversely due to
> ZooKeeper's internal atomic broadcast protocol. As long as a quorum of
> ZooKeeper servers is functioning correctly, ZooKeeper can
> continue to operate. The same is true for Clusterlib applications. The locks
> and leadership election algorithms in Clusterlib are fault-tolerant to
> client failure due to the use of ZooKeeper ephemeral nodes.
>
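
To make the quorum condition in the paragraph above concrete, here is the arithmetic for a simple majority quorum, which is ZooKeeper's default. The function names are made up for this example; ensembles configured with weighted or hierarchical quorums would not follow this formula.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble must contain at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5, 7):
    print(n, quorum_size(n), tolerated_failures(n))
```

Note that going from 3 to 4 servers raises the quorum from 2 to 3 without tolerating any additional failures, which is one reason odd ensemble sizes are the usual choice.
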
> In addition to being a library, Clusterlib comes with an HTTP server for
> viewing/manipulating Clusterlib objects and/or ZooKeeper znodes directly.
> I've linked some PNGs to illustrate this. It is also bundled with an
> extensible CLI. We have also developed a suite of over 90 unit tests
> that simulate distributed event ordering using MPI to test for many of those
> hard-to-find distributed bugs. It's been tested to build on flavors of
> Red Hat Linux, Ubuntu Linux, and OS X.
>
> We would like to see it as a subproject of ZooKeeper because it's tightly
> integrated with ZooKeeper. What do folks think about Clusterlib as a
> subproject of ZooKeeper?
>
> Thanks,
>
> Avery
>
> Clusterlib-UI snapshot link
>
> http://users.eecs.northwestern.edu/~aching/clusterlib-ui.png
>
> ZooKeeper-UI snapshot link
>
> http://users.eecs.northwestern.edu/~aching/zookeeper-ui.png
>