[ 
https://issues.apache.org/jira/browse/HDDS-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika reassigned HDDS-15215:
----------------------------------

    Assignee: Ivan Andika

> Multi tenancy in HDDS layer
> ---------------------------
>
>                 Key: HDDS-15215
>                 URL: https://issues.apache.org/jira/browse/HDDS-15215
>             Project: Apache Ozone
>          Issue Type: New Feature
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>
> This is just a loose idea.
> Currently, multi-tenancy in Ozone refers to namespace-level multi-tenancy
> focused on security, and there is no concept of a "tenant" in the HDDS
> layer (although there is a concept of "owner", which is now set to the OM
> service so that different OM services do not share the same container).
> However, in storage system literature, multi-tenancy refers to the ability
> of a storage system to isolate its tenants (each tenant might have different
> use cases and performance characteristics) while they all use the same
> shared resource pool.
> Each tenant can have different performance and availability requirements.
> For example, big data offline processing / AI training requires high
> throughput but is not very latency sensitive, whereas online serving / AI
> inference requires low latency but less throughput / bandwidth than
> offline processing.
> According to the Tectonic paper [1], there are two challenges in sharing
> resources:
> * Tenants must share resources while giving each tenant its fair share (i.e. 
> at least the same resource
> * Tenants should be able to optimize performance as in specialized systems
> Tectonic uses the concept of a TrafficGroup to isolate different kinds of
> traffic.
> The isolation can technically be supported by an HDFS federation-like
> implementation (with or without RBF) in which we divide a single namespace
> into multiple isolated Ozone clusters. However, although this supports
> isolation, the clusters cannot share resource pools, which increases the
> overall cost (CAPEX + OPEX).
> Currently, we have loose isolation logic in containers and pipelines in the
> HDDS layer, but any number of tenants in an OM service can share the same
> container (i.e. there is no relation between namespace and block space),
> which might affect a tenant's performance. In my opinion, we can think more
> deeply about how namespace and block space relate to each other. We can look
> at the old 2015 Ozone presentation ([2]), which envisioned one container
> being in charge of a range of keys (among other ideas, such as zero-copy
> HDFS -> Ozone data migration). The main point is that we can make an Ozone
> container carry more meaning than simply "a collection of blocks".
> We can also extend the Ozone volume to be a unit of tenancy instead of its
> current use case of namespace isolation.
> Related resources:
> [1] https://www.usenix.org/system/files/fast21-pan.pdf
> [2] 
> https://www.slideshare.net/slideshow/ozone-an-object-store-in-hdfs/49578502
> [3] https://kubernetes.io/docs/concepts/security/multi-tenancy/
> [4] https://docs.min.io/aistor/administration/multi-tenancy/
> [5] 
> https://learn.microsoft.com/en-us/azure/architecture/guide/multitenant/approaches/storage-data
> [6] https://www.cloudflare.com/en-gb/learning/cloud/what-is-multitenancy/



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
