[
https://issues.apache.org/jira/browse/HDDS-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317966#comment-17317966
]
István Fajth commented on HDDS-4944:
------------------------------------
Hi [~ppogde],
thank you very much for taking this effort on, for working out the
requirements, and for sharing them with us along with the plans and a lot of
useful information that helps us understand what this is all about.
I see one big point of confusion in the proposal and the requirements. It is
related to cross-tenant access, so let me discuss this first.
In my view, cross-tenant access in an environment with account namespace
isolation is similar to cross-site scripting: a possible administration and
security nightmare. It is also the most complex part of this proposal, so I
would drop it unless there are really good reasons for implementing it.
A few problems that I see:
- according to requirement 2/h/v, tenant admins should be able to see, modify,
and delete their tenant's users' data. This implies possibly unwanted
cross-tenant access for tenant admins in tenants where there is a user who
accesses another tenant's resources
- if a tenant admin can allow access for a user in another tenant, that would
mean that a tenant admin can access another tenant's user object(s)
- if a user from tenant a can have write access to tenant b, then tenant b's
QoS quotas will be affected by a user in tenant a, and tenant a's quotas should
also aggregate the usage of that user in tenant b, which makes enforcement and
calculation of these quotas either way too heavy, or very limited and
exploitable
- it is a problem to figure out how to comfortably isolate the access policies
and tie them to a user when the user exists in one tenant only but the policies
have to be harmonized with general policies in the other tenant, not to mention
groups/roles associated with the user that might exist in just one tenant and
do not have a real meaning in the other
- applying different password policies is really complex, especially if user
lifecycle events apply to tenant users differently tenant by tenant and have
different enforcement rules... (what happens if one tenant uses Kerberos auth
and does not care much about AWS keys, while the other uses AWS-style access
exclusively... who can administer this for the user, where will the user
configure the different access patterns? etc.)
I would like to clarify one question related to super-admins. I believe we
should explicitly declare that this type of admin should not have access to any
tenant data on their own, but should only have the ability to administer
tenants, tenant admins, and QoS limit enforcement on tenants, and to see the
list of tenants, the QoS usage statistics of tenants, and maybe the list of
tenant users. They should not have access to tenants' data/metadata.
Also, I think it would be beneficial to make it clear that we think about
volumes as the abstraction that provides bucket namespace isolation, and that
we do not want to develop another abstraction on top of it, though for ease of
use we might introduce aliases that reflect multi-tenancy terminology better.
There is one problem: 3/c in the features and limitations section, the global
bucket namespace... If we have a well-separated account namespace isolation
that does not get in our way, we can probably use it on top of one volume.
I fully agree with Marton that we should separate the problem into more parts
and discuss each of them separately to figure out the gaps and the possible
implementation approaches with their benefits and drawbacks. Marton talked
about bucket and account namespace isolation; I would like to add to that list:
I would also cut out QoS and the tenant-in-tenant (hierarchy of tenants)
related things as separate parts.
Bucket namespace isolation: I agree, let's use the volumes for that.
Account namespace isolation: I agree with Marton, let's try to think about this
as an add-on that brings in complexity. I think maybe the best is to have
Hadoop-style SIMPLE auth as the default, and I believe even Kerberos auth
should go into a plugin implementation. With this refactor we would already
have 2 authentication styles; we can add SAML, we can add OAuth, we can add any
other authentication model, and we can leave the implementation of account
namespace isolation to that. I imagine authorization should also be managed by
this part, and should be separated behind an interface that we can use, and
provided as pluggable modules. This means that we would be able to fully
separate this concern. It will also have interdependencies with auditing, but I
believe we can define the interactions between these three modules
independently from the rest of the code, and add them as a decoration to our
real internal business logic instead of hardwiring them in. I know this is
really significant work, but simply extracting the authentication is a good
start.
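To make the decoration idea a bit more concrete, here is a minimal Java sketch
of authentication and authorization sitting behind interfaces and wrapping the
core logic. Every name in it (Authenticator, Authorizer, KeyReader,
SimpleAuthenticator, SecuredKeyReader) is hypothetical and made up for
illustration; none of them are existing Ozone or Hadoop APIs, and the
"user" credential key is an assumption too.

```java
import java.util.Map;
import java.util.Optional;

final class PluggableAuthSketch {

  /** Hypothetical plugin interface: maps raw credentials to a principal. */
  public interface Authenticator {
    Optional<String> authenticate(Map<String, String> credentials);
  }

  /** Hypothetical plugin interface: may this principal do this action? */
  public interface Authorizer {
    boolean isAllowed(String principal, String action, String resource);
  }

  /** Stand-in for core business logic that knows nothing about security. */
  public interface KeyReader {
    String readKey(String volume, String bucket, String key);
  }

  /** Default plugin in the spirit of SIMPLE auth: trusts the claimed user. */
  public static final class SimpleAuthenticator implements Authenticator {
    @Override
    public Optional<String> authenticate(Map<String, String> credentials) {
      // "user" is an assumed credential key, chosen only for this sketch.
      return Optional.ofNullable(credentials.get("user"))
          .filter(user -> !user.isEmpty());
    }
  }

  /** Decorator: runs the plugins first, then delegates to the core logic. */
  public static final class SecuredKeyReader {
    private final Authenticator authenticator;
    private final Authorizer authorizer;
    private final KeyReader delegate;

    public SecuredKeyReader(Authenticator authenticator,
                            Authorizer authorizer,
                            KeyReader delegate) {
      this.authenticator = authenticator;
      this.authorizer = authorizer;
      this.delegate = delegate;
    }

    public String readKey(Map<String, String> credentials,
                          String volume, String bucket, String key) {
      String principal = authenticator.authenticate(credentials)
          .orElseThrow(() -> new SecurityException("authentication failed"));
      String resource = volume + "/" + bucket + "/" + key;
      if (!authorizer.isAllowed(principal, "READ", resource)) {
        throw new SecurityException(principal + " may not READ " + resource);
      }
      // The core logic is reached only after both plugins agreed.
      return delegate.readKey(volume, bucket, key);
    }
  }
}
```

In such a layout a Kerberos, SAML or OAuth module would just be another
Authenticator implementation selected at configuration time, and the core
KeyReader would not need to change.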
QoS: I think about this similarly to authentication, authorization or even
auditing... This should be a pluggable thing, and the default system should not
really know much about it, nor should it care about the implementation. I
believe at this stage we mainly talk about quota enforcement, as there isn't
much more we can provide as part of QoS, but if we start to think about
extending this, we should also think about implementing it as a pluggable
component that can decorate the core business logic but does not directly
affect it.
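As a rough illustration of quota enforcement as a decoration, the Java sketch
below wraps a write path with a pluggable quota policy. All names (KeyWriter,
QuotaPolicy, FixedVolumeQuota, QuotaEnforcingWriter) are hypothetical and are
not Ozone APIs, and the single fixed per-volume byte limit is an assumption
chosen to keep the example small.

```java
import java.util.HashMap;
import java.util.Map;

final class QuotaDecoratorSketch {

  /** Stand-in for the core write path, unaware of any quota. */
  public interface KeyWriter {
    void writeKey(String volume, String key, long sizeBytes);
  }

  /** Hypothetical plugin interface for quota decisions. */
  public interface QuotaPolicy {
    /** Records the usage and returns true if the write fits, else false. */
    boolean tryReserve(String volume, long sizeBytes);
  }

  /** Example policy: the same fixed byte limit for every volume. */
  public static final class FixedVolumeQuota implements QuotaPolicy {
    private final long limitBytes;
    private final Map<String, Long> usedBytes = new HashMap<>();

    public FixedVolumeQuota(long limitBytes) {
      this.limitBytes = limitBytes;
    }

    @Override
    public synchronized boolean tryReserve(String volume, long sizeBytes) {
      long current = usedBytes.getOrDefault(volume, 0L);
      // Compare against the remaining space to avoid overflow on addition.
      if (sizeBytes < 0 || sizeBytes > limitBytes - current) {
        return false;
      }
      usedBytes.put(volume, current + sizeBytes);
      return true;
    }
  }

  /** Decorator: asks the policy first, then delegates to the core writer. */
  public static final class QuotaEnforcingWriter implements KeyWriter {
    private final KeyWriter delegate;
    private final QuotaPolicy policy;

    public QuotaEnforcingWriter(KeyWriter delegate, QuotaPolicy policy) {
      this.delegate = delegate;
      this.policy = policy;
    }

    @Override
    public void writeKey(String volume, String key, long sizeBytes) {
      if (!policy.tryReserve(volume, sizeBytes)) {
        throw new IllegalStateException("quota exceeded for volume " + volume);
      }
      delegate.writeKey(volume, key, sizeBytes);
    }
  }
}
```

Because the decorator implements the same KeyWriter interface as the core, a
deployment without QoS would simply use the undecorated writer. This sketch
does not release a reservation if the delegated write fails; a real module
would have to handle that.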
Hierarchy of tenants: as this is extremely low priority, let's not think about
this too much. Authentication-wise this might be a group, and bucket-wise we
may introduce a bucket group, but as this has to do with QoS, authentication,
and authorization, this might be a new pluggable element that can be developed
independently from other parts of the system, if our design is correct.
All in all, besides the feature gaps, I think we have an architectural gap
that we might need to solve to arrive at an implementation that does not
complicate our core logic unless it is necessary, and that extracts good parts
of the current implementation into replaceable/interchangeable/combinable
subsystems.
As I understand it, the proposal suggests introducing tenant as an
abstraction for this major reason:
{quote}
All of the above requirements fall into a well known industry abstraction of
multi-tenancy, where we want to provide resource isolation of all kinds between
different sub organizations. The proposal is to add this same and well known
abstraction/terminology of “multi-tenant” environment for Ozone and fill the
gaps as appropriate.
{quote}
I believe it is a good motivation to introduce an industry-known naming
convention (even if we are thinking about just aliases), but I fear, based on
the docs, that we are looking at these as new abstractions in our codebase.
However, I believe we already have the concepts to support bucket namespace
isolation, and we have the possibility to bring in account namespace isolation,
but we do not have the obligation to create one, just to allow the usage of
one; similarly with authorization and QoS. So for me it is important to think
about account namespace isolation as an add-on, and not as something for which
the core code has a "simple", "limited" implementation. We already have it: it
has one account namespace and SIMPLE authentication. In my view, Kerberos auth
should also just be an add-on replacing the simple authentication, and should
not leave traces in our core logic, just in our configuration and
initialization. (Maybe this is true now, but I am not sure :) )
> Multi-Tenant Support in Ozone
> -----------------------------
>
> Key: HDDS-4944
> URL: https://issues.apache.org/jira/browse/HDDS-4944
> Project: Apache Ozone
> Issue Type: New Feature
> Components: Ozone CLI, Ozone Datanode, Ozone Manager, Ozone Recon,
> S3, SCM, Security
> Affects Versions: 1.2.0
> Reporter: Prashant Pogde
> Assignee: Prashant Pogde
> Priority: Major
> Labels: pull-request-available
> Attachments: Apache-S3-compatible-Multi-Tenant-Ozone-short.pdf.gz,
> Ozone MultiTenant Feature _ Requirements and Abstractions-3.pdf, Ozone,
> Multi-tenancy, S3, Kerberos....pdf, UseCaseAWSCompatibility.pdf,
> UseCaseCephCompatibility.pdf, UseCaseConfigureMultiTenancy.png,
> UseCaseCurrentOzoneS3BackwardCompatibility.pdf, VariousActorsInteractions.png
>
>
> This Jira will be used to track a new feature for Multi-Tenant support in
> Ozone. Initially Multi-Tenant feature would be limited to ozone-users
> accessing Ozone over S3 interface.