On Jan 30, 2012, at 12:51 , Bram de Kruijff wrote:

> I'm not sure we are on the same page yet :)
> 
> On Fri, Jan 27, 2012 at 1:49 PM, Marcel Offermans
> <[email protected]> wrote:
>> Hello Ivo,
>> 
>> First of all, I'm not sure what your e-mail app is doing to the quoted text, 
>> but it's coming out all garbled here.
>> 
>> On Jan 27, 2012, at 13:34, Ivo Ladage-van Doorn wrote:
>> 
>>> I thought the WIKI page 
>>> http://www.amdatu.org/confluence/display/Amdatu/Amdatu+Core+Tenant+Use+Cases
>>>  described the use cases that the platform is going to support. If you say 
>>> "it's up to the application to implement it", the platform must by design 
>>> support it and be able to tell me how to implement it. If the answer is 
>>> that that is not the concern of the platform, that's the same as saying you 
>>> do not support these use cases.
>> 
>> The platform implements and therefore supports those use cases.
> 
> Agreed, we implement these use-cases as described. However, there may
> still be some different interpretations of what they mean/cover
> exactly. In particular there has been some confusion on what the
> "scope" of a tenant is.
> 
> Some of us (including me) see a tenant as a logical entity that
> (possibly) spans multiple nodes. So one can say "Tenant X is deployed
> to nodes A, B, C". This allows applications (e.g. Amdatu Big Data) to
> use the tenant.pid as an identifying key when creating a keyspace for
> Tenant X. This strategy is something we will discourage in 1.0, but
> that is beside the point for now.

After a voice conference on Skype, Bram, Jan Willem and I just agreed that the 
tenant.pid is globally unique and only of interest to the Amdatu Platform, and 
therefore cannot be used as an identifying key for an application's storage 
mechanism (see the footnote below, or the yellow box at 
http://www.amdatu.org/confluence/display/Amdatu/Amdatu+Core+Tenant+Design, 
which summarizes this discussion). Applications should instead use their own 
configuration to configure the name of (in this case) a keyspace.
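To illustrate the agreement above, here is a minimal sketch (plain Python, not the Amdatu API; the `cassandra.keyspace` property name is a made-up example) of resolving a keyspace name from the tenant's own configuration rather than from the platform-internal tenant.pid:

```python
# Illustrative sketch: the keyspace name comes from the tenant's own
# configuration, never from tenant.pid, which belongs to the platform.

def keyspace_for(tenant_config: dict) -> str:
    """Resolve the keyspace name from the tenant's configuration."""
    try:
        return tenant_config["cassandra.keyspace"]
    except KeyError:
        raise ValueError("tenant configuration must define 'cassandra.keyspace'")

# The tenant.pid is present but deliberately ignored for storage naming.
config = {"tenant.pid": "org.amdatu.tenant.example", "cassandra.keyspace": "acme_data"}
assert keyspace_for(config) == "acme_data"
```

This keeps the application's storage layout decoupled from the platform's tenant identifiers, so the platform remains free to change how it assigns pids.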

> The relevant part is that in this
> context "tenants get created or destroyed" are distributed events, NOT
> simple OSGi service lifecycle events.

Which would be true if tenants spanned more than a single container, but they 
don't.

> Others see a tenant as a local (to one container) uniquely identified
> virtual application. So one can NOT say "Tenant X is deployed to nodes
> A, B, C". In this context a tenant is always bound to ONE container,
> and "tenants get created or destroyed" can be mapped to simple OSGi
> service lifecycle events.

In this case it is a local event, but...

It still cannot be mapped to service lifecycle events, because services are 
also stopped when the framework is stopped, whereas removing a tenant is 
something different: it is the result of a specific action that depends on how 
tenants are actually created and destroyed (by default we use a managed 
service factory for that).
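The distinction can be sketched as follows (plain Python, not OSGi or the Amdatu managed service factory; class and event names are invented for illustration): a factory owns tenant creation and deletion and emits explicit events for those actions only, while a framework stop tears services down without any "deleted" event.

```python
# Sketch: tenant deletion is an explicit action with its own event,
# distinct from services merely going away on framework shutdown.

class TenantFactory:
    def __init__(self):
        self.tenants = {}
        self.events = []  # recorded (event, pid) pairs

    def create(self, pid):
        self.tenants[pid] = object()
        self.events.append(("created", pid))

    def delete(self, pid):
        del self.tenants[pid]
        self.events.append(("deleted", pid))

    def stop_framework(self):
        # Services disappear, but no tenant was deleted: no event fires.
        self.tenants.clear()

factory = TenantFactory()
factory.create("tenant-x")
factory.stop_framework()
# Mapping tenant deletion to service unregistration would wrongly fire here.
assert ("deleted", "tenant-x") not in factory.events
```

This is why a service-unregistered event alone cannot tell a subscriber whether to purge tenant data.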

> Although the discussion about the exact meaning of tenant.pid is not
> yet entirely finalized, it is very important to realize which
> interpretation is being used by whom when they reply. So I propose to
> explicitly mention whether you are talking about local or distributed
> events.

I'll speak for myself, I was talking about local events (as mentioned above).

>> Jan Willem already replied with a proposal to send out events whenever 
>> tenants get created or destroyed. That will allow you to create and destroy 
>> your keyspace (by subscribing to those events).
> 
> I guess this resulted in use case MT-UC11? Not sure what this entails,
> but I am under the strong impression that Jan-Willem is talking about
> local created and destroyed events, which probably are not much more
> than service events. Also, I am under the impression that Ivo is
> talking about distributed created and destroyed events...
> 
>>> I think the following Use Cases are crucial for using Amdatu in combination 
>>> with Cassandra:
>>> 
>>> Use case 5 (MT-UC5):
>>> As a developer, I want to be able to add/remove tenants on the go.
>>>> Indeed, using Cassandra I want to create a new keyspace and associate it 
>>>> with a new tenant and delete the tenant when it is deleted.
>> 
>> See above.
> 
> OK, this will probably work in a distributed setting using a lazy
> check/create approach. It will probably need to be changed to use a
> configuration instead of the tenant.pid as an identifier, in order to
> decouple it from multi-tenancy.
> 
> 
>>> Use case 9 (MT-UC9):
>>> As a developer, I want to be able to purge all platform-specific data of a 
>>> (former) tenant without influencing other tenants.
>>>> Yes, upon a purge I will delete the keyspace
>> 
>> See above.
> 
> NO, I do not believe Jan-Willem's MT-UC11 covers in any way the
> complexity of a distributed "purge tenant", simply because he doesn't
> see a tenant as something distributed. Please correct me if I am
> wrong!
> 
> So in a local context one could choose to interpret the removal of a
> tenant service, or maybe the uninstall of the bundle, as a "purge
> tenant". But that is something the platform cannot enforce beyond
> deleting the bundle data area.
> 
> In a distributed context you can still apply the same interpretation, but
> what will happen to your clustered Cassandra and application state?
> It's complex and certainly not something the platform can decide
> generically.
> 
>>> Use case 10 (MT-UC10):
>>> As a developer, I want to be able to migrate a tenant from one container to 
>>> another
>>>> In a Cassandra cluster there is no way you can have a different set of 
>>>> keyspaces per node. You could have several clusters though, but a single 
>>>> Cassandra node cannot join multiple clusters. In a Cassandra cluster the 
>>>> data being stored is also not really associated with a physical machine; 
>>>> it is up to Cassandra to decide where to store it (even data in a single 
>>>> keyspace may be distributed over several nodes). So this use case doesn't 
>>>> make sense with Cassandra.
>>> However, if you're talking about data isolation and Cassandra, you should 
>>> be able to define multiple Cassandra clusters, such that data within one 
>>> cluster is 'isolated' from the other clusters. So the use case should be: 
>>> "As a developer, I want to be able to migrate a tenant from one cluster to 
>>> another".
>> 
>> The platform does not have the notion of a cluster.
>> 
>> We know about containers, and (looking ahead at provisioning support in the 
>> platform) we decided to add this use case:
>> 
>> For normal bundles, if we take a bundle and its data area, and install it in 
>> a new container, it will keep its state. We wanted to make sure the 
>> experience is the same for tenant aware bundles.
>> 
>> If moving that data area is enough for Cassandra to migrate data from one 
>> cluster to the other, great. If not, you need to work on a solution yourself.
> 
> No, moving the data area is not enough for any but the most trivial
> use cases. IMHO, backing up the bundle data in general is.
> 
>>> Use case 11 (MT-UC11):
>>> As a developer, I want to implement services that can be notified in case a 
>>> tenant is created or destroyed.
>>>> If with 'destroyed' you mean 'deleted', then this is exactly what I need.
>> 
>> Yes, see above, destroyed is deleted.
> 
> NO, I do not believe Jan-Willem's MT-UC11 covers in any way the
> complexity of a distributed "purge tenant", simply because he doesn't
> see a tenant as something distributed. Please correct me if I am
> wrong.
> 
> Bottom line, the way I see it:
> 
> 1) The core platform does not add any extensions to the existing
> bundle/service lifecycles & (configuration) events.

See above, I do think we need specific events for creation and deletion of a 
tenant, because they do not align with service events, and the fact that they 
are now aligned with configuration events is just an implementation detail.
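As a sketch of what dedicated tenant events could look like (plain Python, not the Amdatu or OSGi EventAdmin API; the topic names are invented for illustration): an application subscribes to explicit creation/deletion topics and manages its keyspace in the handler, independently of service or configuration events.

```python
# Sketch of a publish/subscribe scheme with dedicated tenant topics,
# so applications react to tenant lifecycle rather than service lifecycle.

subscribers = {}

def subscribe(topic, handler):
    """Register a handler for a topic."""
    subscribers.setdefault(topic, []).append(handler)

def post(topic, props):
    """Deliver an event with properties to all handlers on the topic."""
    for handler in subscribers.get(topic, []):
        handler(props)

created_keyspaces = []
# Hypothetical topic name; an application would create its keyspace here.
subscribe("org/amdatu/tenant/CREATED",
          lambda props: created_keyspaces.append(props["keyspace"]))

post("org/amdatu/tenant/CREATED", {"keyspace": "acme_data"})
assert created_keyspaces == ["acme_data"]
```

Because the topics are explicit, a framework restart (which re-registers services) would not re-trigger creation logic.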

> Only bundle
> storage is cleaned upon bundle uninstall. What the semantics of these
> events are with regard to distributed or persisted state is up to the
> application developer.

We need a tenant-specific subfolder of the bundle storage to be deleted when 
the tenant is deleted. If we postpone this until the bundle is uninstalled, 
you can never "garbage collect" tenants, and you end up with bundles that 
leak disk space over time.

> 2) The core platform does not have a runtime notion of distributed
> state. Both current fileinstall and future provisioning bring just
> asynchronous configuration and software updates. The core platform
> does not orchestrate distributed (un)deployments nor does it provide
> any distributed state or events.

Agreed.

> 3) Thus, for now applications that required distributed state
> management must provide it themselves. Service Fabric may bring in
> some infrastructure to ease parts of these tasks (eg pub/sub), but how
> exactly is not yet clear.

Agreed.

> While I was typing this, Jan-Willem has elaborated a little on this
> matter at the wiki [0]. Please have a look at that, because it may help
> us all to get a grip on this rather complex matter.

Yes, it's a good summary of our talks. We probably do want to add that we will 
allow people to specify the pid upfront, to ease management tasks and UC-10.

Greetings, Marcel

_______________________________________________
Amdatu-developers mailing list
[email protected]
http://lists.amdatu.org/mailman/listinfo/amdatu-developers
