Hi all,

2012/1/30 Marcel Offermans <[email protected]>:
>
> Some of us (including me) see a tenant as a logical entity that
> (possibly) spans multiple nodes. So one can say "Tenant X is deployed
> to nodes A, B, C". This allows applications (eg. Amdatu Big Data) to
> use the tenant.pid as an identifying key when creating a keyspace for
> Tenant X. This strategy is something we will discourage in 1.0 but
> that is besides the point for now.
>
>
> After a voice conference on Skype, Bram, Jan Willem and I just agreed that
> the tenant.pid is globally unique and only of interest to the Amdatu
> Platform and therefore cannot be used as an identifying key of a storage
> mechanism of an application (see the footnote below, or
> click http://www.amdatu.org/confluence/display/Amdatu/Amdatu+Core+Tenant+Design if
> you don't want to scroll down (the yellow box is our summary of this
> discussion)). Applications should use their own configuration to configure
> the name of (in this case) a keyspace.

I'm just gonna go on record here. I sent my reply on this thread after that call and said:

> On Jan 30, 2012, at 12:51 , Bram de Kruijff wrote:
>
> Although the discussion about the exact meaning of tenant.pid is not yet
> finalized entirely it is very important to realize what interpretation
> is being used by whom when they reply. So I propose to explicitly
> mention whether you are talking about local or distributed events.

IMHO it is not finalized, and the more I think about it, the more I tend to disagree. I did agree with the point made by Jan-Willem that applications should not use it as an identifier but should rather use their own configuration space.

My concerns come from two angles:

1) Why should the tenant.pid be globally unique?

The reason for a globally unique tenant.pid seems to be MT-UC10, which you interpret as being able to copy a tenant, including all its data, from disk. In this case you argue that we therefore need an identifiable data directory, that the tenant.pid should be used for that, and thus that it must be globally unique. Some remarks on this:

* The assumption is that the existing tenant.pid must be used for the directory. I do not see why, if we need it at all, it could not be tenant.globallyuniquedirname, which would avoid enforcing the constraint on tenant.pid.
* I can see that in simple hosted cases this is a backup/move strategy. I do not think it will be viable in large distributed (cloud) deployments, where a sysop would have to go and scp this stuff around. In the end I think there should be a service for that (think Android BackupAgent for example).
* There are probably provisioning issues with this strategy anyway. Moving a full data dir, which includes configuration data, from AMA1 to AMA2 will surely mess up the AMS administration.
* As you mentioned, this globally unique tenant.pid must be specified up front. Consider a deployment with 10 "customers" with distributed deployment over 100 nodes.
  Will we expect application managers/sysops to come up with 1000 unique identifiers? I think it is an implementation concern, because the actors think in "customers", "applications", maybe "subsystems", and where they are deployed. (I am using "customer" here where I would normally use tenant.)
* Another issue with the need to come up with unique identifiers may arise in elastic scaling scenarios. Will we ask application managers to specify globally unique identifiers for things that may or may not exist?
* Now if there is a service that handles (selective) tenant data import/export from a node, all of a sudden we do not need the unique dirname to be known. It doesn't even need to be globally unique. It is just a system identifier. It is strategy based and we do not make assumptions on data dir storage strategy.

2) Why should the tenant.pid NOT be globally unique?

In my view Tenant is a logical concept at the application layer. A tenant is a logical (virtual) instance of an application. An application is something an end customer wants, say SalesForce or in our case BlueConic. Its architecture is multi-tenant, meaning that the hosting party can let multiple customers enjoy that application on a single (distributed) deployment. The customer is in no way concerned with how it is deployed, just with QoS. The hosting party is concerned with keeping promises at as little cost as possible.

Amdatu Platform provides a way for a logical application to dynamically scale out/in, meaning that its physical deployment is not limited to one cloud node. It lives in a distributed deployment. IMHO this is what cloud/scale-out means. If you cannot scale out and have to tie an application to one physical node, forget about it. Therefore I see it as an integral part even of multi-tenancy design and claim Tenant to be that logical entity, not an implementation detail. Some notes:

* From an application management perspective one manages tenants. Tenants have configuration, QoS contracts etc.
  At least that is how I think about it, I think that is the common, widely accepted interpretation, and that is how we use the concept in BlueConic. From this perspective it seems illogical to me to use this term for a deployment detail identifying a process on a node.
* Having a unique tenant identifier in a distributed system running subsystems on multiple nodes makes sense. They share (partial) configuration, and it allows questions such as "Where is tenant X deployed", "What are the performance metrics for Tenant X" etc. We could introduce a new concept like "Customer" or whatever, having a 1-X mapping to tenants, but see my previous point.
* Having a Tenant deployed to a Node already provides a unique identifier "TenantX@NodeY", which at a system level can be associated with a globally unique identifier if required for a particular use case.
* The definition and use of the concept Tenant as I describe it here is how we effectively see and use Multi-Tenancy right now at GX in our BlueConic product. I am opposed to changing that without a very good reason.

Bottom line, I do not believe we need a logical globally unique identifier. At least you have not shown me a concrete use case for it that cannot be easily solved in another way. We may need a (generated) system identifier, but we should not hijack and remove a higher level concept.

Best Regards,
Bram

_______________________________________________
Amdatu-developers mailing list
[email protected]
http://lists.amdatu.org/mailman/listinfo/amdatu-developers

