I'm OK with finding a better nomenclature but you seem to be suggesting a
different, and perhaps better, idea.

In my original thinking I used the following logical statements:

- A cluster has M subclusters
- A subcluster id=i has Ni nodes
- There are a total of sum{i=0..M-1}(Ni) nodes
- A node is part of one and only one subcluster
- An app instance is activated in one and only one subcluster
- A subcluster can have several app instances (there is no one-to-one
relationship between subcluster and apps)
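
To make these statements concrete, here is a rough Python sketch of the model. It is only an illustration, not S4 code: the class name, the method names and the integer node ids are all made up for this example.

```python
class Cluster:
    """Toy model of the statements above (hypothetical names, not the S4 API)."""

    def __init__(self, nodes_per_subcluster):
        # nodes_per_subcluster[i] is Ni, the number of nodes in subcluster i.
        self.subclusters = []         # subclusters[i] = list of node ids in subcluster i
        self.node_to_subcluster = {}  # a node is part of one and only one subcluster
        self.app_to_subcluster = {}   # an app instance is activated in one subcluster
        next_node = 0
        for i, n in enumerate(nodes_per_subcluster):
            nodes = list(range(next_node, next_node + n))
            next_node += n
            self.subclusters.append(nodes)
            for node in nodes:
                self.node_to_subcluster[node] = i

    def total_nodes(self):
        # sum{i=0..M-1}(Ni)
        return sum(len(nodes) for nodes in self.subclusters)

    def activate(self, app_id, subcluster_id):
        # Reject a second activation of the same app instance.
        if app_id in self.app_to_subcluster:
            raise ValueError("app instance is already activated in a subcluster")
        if not 0 <= subcluster_id < len(self.subclusters):
            raise IndexError("unknown subcluster id")
        self.app_to_subcluster[app_id] = subcluster_id

    def apps_in(self, subcluster_id):
        # A subcluster can host several app instances.
        return sorted(a for a, s in self.app_to_subcluster.items()
                      if s == subcluster_id)
```

For example, Cluster([2, 3, 1]) has M=3 subclusters and 6 nodes in total, and two different apps can both be activated in subcluster 1.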

Based on Karthik's comments, it sounds like he is proposing to create a
subcluster for each app instance, that is, the subcluster only exists for a
specific app instance. I hadn't thought about that but it might not be a
bad idea.

In that case a cluster is a collection of subclusters where each subcluster
is created and destroyed when an app is created and destroyed. I kind of
like this. Is there a downside? For example, to run a query to analyze RT
data, we may want to deploy an App. Would the overhead of creating a
subcluster every time be a problem compared to deploying to an existing
logical subcluster?
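
Here is a rough Python sketch of the per-app variant, just to frame the question. Again this is not S4 code: the class, the deploy/undeploy names and the idea of a pool of free nodes are my assumptions about how it could work.

```python
class PerAppCluster:
    """Toy model of one subcluster per app instance (hypothetical, not the S4 API)."""

    def __init__(self, node_ids):
        self.free_nodes = list(node_ids)  # nodes not assigned to any subcluster
        self.subclusters = {}             # app id -> nodes of that app's subcluster

    def deploy(self, app_id, num_nodes):
        # Creating the app creates its subcluster, identified by the app id.
        if app_id in self.subclusters:
            raise ValueError("app is already deployed")
        if num_nodes > len(self.free_nodes):
            raise RuntimeError("not enough free nodes")
        nodes = self.free_nodes[:num_nodes]
        self.free_nodes = self.free_nodes[num_nodes:]
        self.subclusters[app_id] = nodes
        return nodes

    def undeploy(self, app_id):
        # Destroying the app destroys its subcluster and frees the nodes.
        nodes = self.subclusters.pop(app_id)
        self.free_nodes.extend(nodes)
```

In this picture, the overhead I am asking about is whatever real work would sit behind deploy() and undeploy(), which the sketch does not model.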

Thoughts??

-leo

On Tue, Mar 27, 2012 at 3:03 PM, Karthik Kambatla <kkamb...@cs.purdue.edu> wrote:

> Indeed the demo was significant progress, now we can actually start
> thinking of releasing the piper.
>
> The notion of "Cluster" and "Sub-cluster" makes absolute sense, but the
> nomenclature is confusing. Can we name them differently - for instance, S4
> (physical) Cluster and S4 App-Cluster, or something completely different?
> Each sub-cluster (app-cluster) can be identified by the App ID. We can use
> AppID for routing to a sub-cluster (app-cluster).
>
> Thanks
> Karthik
>
> On Tue, Mar 27, 2012 at 1:18 PM, Leo Neumeyer <leoneume...@gmail.com>
> wrote:
>
> > Matthieu,
> >
> > Congrats again for the great demo using S4 across subclusters and the
> > PlayFramework-like command line tool to manage the cluster and deploy
> > apps. We are basically providing a full stack solution so users don't
> > have to spend time getting systems to work together.
> >
> > Here is a summary of today's comments. Please let me know if you agree or
> > if you propose a different approach. Let's make sure we are all in sync.
> >
> >  * We need to be able to extract routing info from messages and
> > deserialize the Event at a higher level where we know the Event Class.
> >
> >  * Definition of "S4 Cluster": a collection of S4 sub-clusters.
> >
> >  * Definition of "S4 Sub-cluster": A collection of nodes (each
> > sub-cluster has its own partitions).
> >
> >  * All nodes in an "S4 Cluster" are symmetric (same code image in ALL
> > nodes across sub-clusters.)
> >
> >  * For total isolation, send event stream to another "S4 Cluster".
> >
> >  * A specific app is deployed to ALL nodes in the "S4 Cluster" but
> > activated on a specific "S4 Sub-cluster". This provides flexibility to
> > assign resources to groups of Apps while maintaining simplicity via local
> > method calls. (Remote invocation is also handled by the platform in the
> > same way and transparently.)
> >
> >  * Each Node has a local collection of Sender instances, one for each
> > sub-cluster in the cluster.
> >
> >  * When sending an Event to an App, we use runtime tables to look up the
> > sub-cluster and the corresponding Sender. Once we know the Sender, we
> > simply use it to send the Event using local method calls. All the
> > complexity is abstracted out.
> >
> >  * Use EventSource to avoid application dependencies. Apps subscribe to
> > data sources. The event source is looked up when an App is loaded using
> > runtime tables.
> >
> > There are additional comments in the Jira.
> >
> > If you are reading this and would like to contribute to the project,
> > please email us! Let us know if you would like to join our weekly Google
> > hangout where we discuss the project.
> >
> > --
> >
> > Leo Neumeyer (@leoneu)
> >
>



-- 

Leo Neumeyer (@leoneu)
