I'm OK with finding a better nomenclature but you seem to be suggesting a different, and perhaps better, idea.
In my original thinking I used the following logical statements:

- A cluster has M subclusters
- A subcluster id=i has Ni nodes
- There are a total of sum{i=0..M-1}(Ni) nodes
- A node is part of one and only one subcluster
- An app instance is activated in one and only one subcluster
- A subcluster can have several app instances (there is no one-to-one
  relationship between subcluster and apps)

Based on Karthik's comments, it sounds like he is proposing to create a
subcluster for each app instance; that is, the subcluster only exists for a
specific app instance. I hadn't thought about that but it might not be a bad
idea. In that case a cluster is a collection of subclusters where each
subcluster is created and destroyed when an app is created and destroyed.

I kind of like this. Is there a downside? For example, to run a query to
analyze RT data, we may want to deploy an App. Would the overhead of creating
a subcluster every time be a problem compared to deploying to an existing
logical subcluster?

Thoughts??

-leo

On Tue, Mar 27, 2012 at 3:03 PM, Karthik Kambatla <kkamb...@cs.purdue.edu> wrote:

> Indeed the demo was significant progress, now we can actually start
> thinking of releasing the piper.
>
> The notion of "Cluster" and "Sub-cluster" makes absolute sense, but the
> nomenclature is confusing. Can we name them differently - for instance, S4
> (physical) Cluster and S4 App-Cluster, or something completely different?
> Each sub-cluster (app-cluster) can be identified by the App ID. We can use
> AppID for routing to a sub-cluster (app-cluster).
>
> Thanks
> Karthik
>
> On Tue, Mar 27, 2012 at 1:18 PM, Leo Neumeyer <leoneume...@gmail.com> wrote:
>
> > Matthieu,
> >
> > Congrats again for the great demo using S4 across subclusters and the
> > PlayFramework-like command line tool to manage the cluster and deploy
> > apps. We are basically providing a full stack solution so users don't
> > have to spend time getting systems to work together.
> >
> > Here is a summary of today's comments. Please let me know if you agree or
> > if you propose a different approach. Let's make sure we are all in sync.
> >
> > * We need to be able to extract routing info from messages and
> > deserialize the Event at a higher level where we know the Event Class.
> >
> > * Definition of "S4 Cluster": a collection of S4 sub-clusters.
> >
> > * Definition of "S4 Sub-cluster": a collection of nodes (each
> > sub-cluster has its own partitions).
> >
> > * All nodes in an "S4 Cluster" are symmetric (same code image in ALL
> > nodes across sub-clusters).
> >
> > * For total isolation, send the event stream to another "S4 Cluster".
> >
> > * A specific app is deployed to ALL nodes in the "S4 Cluster" but
> > activated on a specific "S4 Sub-cluster". This provides flexibility to
> > assign resources to groups of Apps while maintaining simplicity via local
> > method calls. (Remote invocation is also handled by the platform in the
> > same way and transparently.)
> >
> > * Each Node has a local collection of Sender instances, one for each
> > sub-cluster in the cluster.
> >
> > * When sending an Event to an App, we use runtime tables to look up the
> > sub-cluster and the corresponding Sender. Once we know the Sender, we
> > simply use it to send the Event using local method calls. All the
> > complexity is abstracted out.
> >
> > * Use EventSource to avoid application dependencies. Apps subscribe to
> > data sources. The event source is looked up when an App is loaded using
> > runtime tables.
> >
> > There are additional comments in the Jira.
> >
> > If you are reading this and would like to contribute to the project,
> > please email us! Let us know if you would like to join our weekly Google
> > hangout where we discuss the project.
> >
> > --
> >
> > Leo Neumeyer (@leoneu)

--
Leo Neumeyer (@leoneu)
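
To make the model discussed in this thread concrete, here is a minimal Python sketch of the logical statements (M subclusters, each node in exactly one subcluster, each app instance activated in exactly one subcluster) together with the runtime-table lookup from app to Sender. It is only an illustration under stated assumptions, not S4 code: the names `SubCluster`, `Cluster`, `Sender`, `activate` and `send` are hypothetical, and for brevity the Sender collection lives on the `Cluster` object rather than on each node.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SubCluster:
    """A group of nodes; a subcluster may host several app instances."""
    sub_id: int
    nodes: list[str]
    apps: set[str] = field(default_factory=set)


class Sender:
    """Hypothetical stand-in for a per-subcluster sender; it only records events."""

    def __init__(self, sub_id: int) -> None:
        self.sub_id = sub_id
        self.sent = []

    def send(self, app_id: str, event: object) -> None:
        self.sent.append((app_id, event))


class Cluster:
    """A collection of subclusters plus the runtime table app -> subcluster."""

    def __init__(self, subclusters: list[SubCluster]) -> None:
        seen = set()
        for sub in subclusters:
            for node in sub.nodes:
                # A node is part of one and only one subcluster.
                if node in seen:
                    raise ValueError(f"node {node!r} is in more than one subcluster")
                seen.add(node)
        self.subclusters = {sub.sub_id: sub for sub in subclusters}
        # One Sender per subcluster (assumption: kept here instead of per node).
        self.senders = {sub.sub_id: Sender(sub.sub_id) for sub in subclusters}
        self.app_table = {}  # runtime table: app id -> subcluster id

    def total_nodes(self) -> int:
        # sum{i=0..M-1}(Ni)
        return sum(len(sub.nodes) for sub in self.subclusters.values())

    def activate(self, app_id: str, sub_id: int) -> None:
        # An app instance is activated in one and only one subcluster.
        if app_id in self.app_table:
            raise ValueError(f"app {app_id!r} is already activated")
        self.subclusters[sub_id].apps.add(app_id)
        self.app_table[app_id] = sub_id

    def send(self, app_id: str, event: object) -> Sender:
        # Look up the subcluster, then its Sender, then make a local call.
        sender = self.senders[self.app_table[app_id]]
        sender.send(app_id, event)
        return sender
```

With two subclusters of sizes 2 and 1, `total_nodes()` returns 3, and two apps can be activated on the same subcluster. Under the per-app variant Karthik seems to suggest, `activate` would instead create a new subcluster for the app, and a matching deactivate step would destroy it.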