Hi all,

2012/1/30 Marcel Offermans <[email protected]>:
>
> Some of us (including me) see a tenant as a logical entity that
> (possibly) spans multiple nodes. So one can say "Tenant X is deployed
> to nodes A, B, C". This allows applications (eg. Amdatu Big Data) to
> use the tenant.pid as an identifying key when creating a keyspace for
> Tenant X. This strategy is something we will discourage in 1.0 but
> that is beside the point for now.
>
>
> After a voice conference on Skype, Bram, Jan Willem and I just agreed that
> the tenant.pid is globally unique and only of interest to the Amdatu
> Platform and therefore cannot be used as an identifying key of a storage
> mechanism of an application (see the footnote below, or
> click http://www.amdatu.org/confluence/display/Amdatu/Amdatu+Core+Tenant+Design if
> you don't want to scroll down (the yellow box is our summary of this
> discussion)). Applications should use their own configuration to configure
> the name of (in this case) a keyspace.

I'm just gonna go on record here. I sent my reply on this thread after
that call and said:

> On Jan 30, 2012, at 12:51 , Bram de Kruijff wrote:
>
> Although the discussion about the exact meaning of tenant.pid is not yet
> finalized entirely, it is very important to realize what interpretation
> is being used by whom when they reply. So I propose to explicitly
> mention whether you are talking about local or distributed events.

IMHO it is not finalized, and the more I think about it the more I
tend to disagree. I did agree with the point made by Jan-Willem that
applications should not use it as an identifier but should rather use
their own configuration space.

My concerns come from two angles:

1) Why should the tenant.pid be globally unique?

The reason for a globally unique tenant.pid seems to be MT-UC10, which
you interpret as being able to copy a tenant including all its data
from disk. In this case you argue that we therefore need an
identifiable data directory, that the tenant.pid should be used for
that, and thus that it must be globally unique. Some remarks on this:

* The assumption is that the existing tenant.pid must be used for the
directory. I do not see why, if we need it at all, it could not be
tenant.globallyuniquedirname, which would avoid enforcing the
constraint on tenant.pid.

* I can see that in simple hosted cases this is a backup/move
strategy. I do not think it will be viable in large distributed
(cloud) deployments, where a sysop would have to go and scp this stuff
around. In the end I think there should be a service for that (think
Android BackupAgent for example).

* There are probably provisioning issues with this strategy anyway.
Moving a full data dir which includes configuration data from AMA1 to
AMA2 will surely mess up the AMS administration.

* As you mentioned, this globally unique tenant.pid must be specified
up front. Consider a deployment with 10 "customers" distributed over
100 nodes. Will we expect application managers/sysops to come up with
1000 unique identifiers? I think it is an implementation concern,
because the actors think in "customers", "applications", maybe
"subsystems", and where they are deployed. (I am using "customer" here
where I would normally use tenant.)

* Another issue with the need to come up with unique identifiers may
arise in elastic scaling scenarios. Will we ask application managers
to specify globally unique identifiers for things that may or may not
exist?

* Now if there is a service that handles (selective) tenant data
import/export from a node, all of a sudden we do not need the unique
dirname to be known. It doesn't even need to be globally unique. It is
just a system identifier. It is strategy based and we do not make
assumptions about the data dir storage strategy.
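
To make the first remark concrete, here is a minimal sketch in Java of
keeping tenant.pid purely logical while a separate, generated property
names the data directory. This is an illustration only, not Amdatu
API: the class TenantStorageNames and its method are hypothetical, and
using a random UUID as the directory name is just one possible
strategy. Only the property name tenant.globallyuniquedirname comes
from the remark above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper, not part of Amdatu: keeps the logical tenant.pid
// untouched and stores a generated directory name under a separate key.
class TenantStorageNames {
    static final String PID_KEY = "tenant.pid";
    // Property name taken from the discussion; its semantics here are assumed.
    static final String DIR_KEY = "tenant.globallyuniquedirname";

    // Returns a copy of the configuration with a generated directory name
    // added, unless one is already present (e.g. after an import).
    static Map<String, String> withGeneratedDirName(Map<String, String> config) {
        Map<String, String> result = new HashMap<>(config);
        result.putIfAbsent(DIR_KEY, UUID.randomUUID().toString());
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put(PID_KEY, "TenantX");

        // The same logical tenant deployed to two nodes gets two distinct
        // directory names, while tenant.pid stays "TenantX" on both.
        Map<String, String> onNodeA = withGeneratedDirName(config);
        Map<String, String> onNodeB = withGeneratedDirName(config);
        System.out.println(onNodeA);
        System.out.println(onNodeB);
    }
}
```

With something like this, nobody has to invent the identifier up
front, and the uniqueness constraint sits on the storage name rather
than on tenant.pid.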


2) Why should the tenant.pid NOT be globally unique?

In my view Tenant is a logical concept at the application layer. A
tenant is a logical (virtual) instance of an application. An
application is something an end customer wants, say SalesForce or in
our case BlueConic. Its architecture is multi-tenant, meaning that the
hosting party can let multiple customers enjoy that application on a
single (distributed) deployment. The customer is in no way concerned
with how it is deployed, just with QoS. The hosting party is concerned
with keeping promises at as little cost as possible. Amdatu Platform
provides a way for a logical application to dynamically scale out/in,
meaning that its physical deployment is not limited to one cloud
node. It lives in a distributed deployment. IMHO this is what
cloud/scale-out means. If you cannot scale out and have to tie an
application to one physical node, forget about it. Therefore I see it
as an integral part of even multi-tenancy design, and claim Tenant to
be that logical entity, not an implementation detail. Some notes:

* From an application management perspective one manages tenants.
Tenants have configuration, QoS contracts etc. At least that is how I
think about it, I think that is the common, widely accepted
interpretation, and that is how we use the concept in BlueConic. From
this perspective it seems illogical to me to use this term for a
deployment detail identifying a process on a node.

* Having a unique tenant identifier in a distributed system running
subsystems on multiple nodes makes sense. They share (partial)
configuration, and it allows questions such as "Where is tenant X
deployed", "What are the performance metrics for Tenant X" etc. We
could introduce a new concept like "Customer" or whatever, having a
1-X mapping to tenants, but see my previous point.

* Having a Tenant deployed to a Node already provides a unique
identifier "TenantX@NodeY", which at a system level can be associated
with a globally unique identifier if required for a particular use
case.

* The definition and use of the concept Tenant as I describe it here
is how we effectively see and use Multi-Tenancy right now at GX in our
BlueConic product. I am opposed to changing that without a very good
reason.
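
As a rough illustration of the "TenantX@NodeY" note, the sketch below
derives a deployment key from the logical tenant and the node, and
lazily associates it with a generated system identifier. All names
here (TenantDeploymentIds, deploymentKey, systemIdFor) are
hypothetical and not Amdatu API, and a random UUID is only an assumed
choice for the system identifier.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry, not part of Amdatu: the logical tenant id stays
// shared across nodes, and a system id is generated per (tenant, node).
class TenantDeploymentIds {
    private final Map<String, UUID> systemIds = new ConcurrentHashMap<>();

    // "TenantX" deployed to "NodeY" is identified as "TenantX@NodeY".
    static String deploymentKey(String tenantPid, String nodeId) {
        return tenantPid + "@" + nodeId;
    }

    // Generates the system id on first request and returns the same one
    // for later requests with the same tenant and node.
    UUID systemIdFor(String tenantPid, String nodeId) {
        return systemIds.computeIfAbsent(
                deploymentKey(tenantPid, nodeId), key -> UUID.randomUUID());
    }

    public static void main(String[] args) {
        TenantDeploymentIds ids = new TenantDeploymentIds();
        System.out.println(deploymentKey("TenantX", "NodeA"));
        System.out.println(ids.systemIdFor("TenantX", "NodeA"));
        System.out.println(ids.systemIdFor("TenantX", "NodeB"));
    }
}
```

The point of the sketch is that the system identifier is generated
where it is needed, so the logical tenant identifier does not have to
carry the uniqueness constraint.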


Bottom line: I do not believe we need a logical globally unique
identifier. At least you have not shown me a concrete use case for it
that cannot easily be solved in another way. We may need a
(generated) system identifier, but we should not hijack and remove a
higher level concept.


Best Regards,
Bram

_______________________________________________
Amdatu-developers mailing list
[email protected]
http://lists.amdatu.org/mailman/listinfo/amdatu-developers