[
https://issues.apache.org/jira/browse/HDDS-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17332418#comment-17332418
]
Marton Elek commented on HDDS-4944:
-----------------------------------
Thanks for the answers [~ppogde]. I think you wrote them based on the comment
from [~pifta], but since the topic has come up multiple times, let me add my own comments.
bq. I believe, we did talk about the alternative that Marton brought up
regarding implementing multi-tenancy by running different instances of S3G for
each tenant.
This was just one of my questions, and I am very grateful that we have started
to collect all the pros and cons. (I don't remember doing it thoroughly in our
talks.)
bq. It would further add burden to provide HA for S3G service because, now
every instance needs to have HA.
Can you please explain why? If multi-tenancy were an S3G-side configuration,
S3G would still be a stateless service.
bq. It makes the multi-tenancy feature dependent on deployment.
I agree with this statement, this was definitely the goal.
bq. The onus of tracking which S3G instance serves which specific tenant, lies
with the customer and application.
I agree that it's a limitation, but please note that it's very close to what AWS
has with DNS-style bucket names (which is the default). Using DNS, the URLs
of buckets become https://bucket1.endpoint and https://bucket2.endpoint. There
is no significant difference between this and using
https://bucket1.sales.endpoint and https://bucket1.marketing.endpoint
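To illustrate the argument: resolving the bucket (and, in the per-tenant-endpoint scheme, the tenant) from the Host header is the same string operation in both cases. This is only a sketch; the function name and the endpoint suffix are hypothetical, not Ozone or S3G code.

```python
# Illustration only: how virtual-hosted-style addressing could carry a tenant.
# parse_virtual_host and the "endpoint" suffix are hypothetical names.

def parse_virtual_host(host: str, endpoint: str):
    """Split a Host header into (bucket, tenant-or-None) for a given endpoint."""
    if not host.endswith("." + endpoint):
        return None
    labels = host[: -(len(endpoint) + 1)].split(".")
    if len(labels) == 1:
        return labels[0], None       # https://bucket1.endpoint
    if len(labels) == 2:
        return labels[0], labels[1]  # https://bucket1.sales.endpoint
    return None

print(parse_virtual_host("bucket1.endpoint", "endpoint"))        # ('bucket1', None)
print(parse_virtual_host("bucket1.sales.endpoint", "endpoint"))  # ('bucket1', 'sales')
```

Either way the client sees one hostname per bucket (or per tenant+bucket), which is why the two deployments look so similar from the application's point of view.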
> What we are proposing here, is Multi-tenancy as part of core Ozone-Manager
> module. Please check the recently uploaded picture
Using an S3G-side configuration value instead of a user-specific field for
identifying the default S3 volume was just ONE example. My main concern was the
introduction of another abstraction level on top of volumes. We still don't have
(or I didn't find / understand) a proper pro/con comparison between introducing a new
abstraction level and improving volume features.
bq. In the existing S3 design, access-key-id is tied to kerberos identity. It
doesn't have to be this way only. We are providing an API to Create S3-User as
if they do not have any kerberos identity at all. This is along the lines
As I already mentioned, I am totally in favor of making the S3 secret management
Kerberos-independent. And I have requested multiple times to separate the two
questions (tenancy and Kerberos-free secret management). This suggestion has been
ignored until now.
Please also note that the only dependency of the current solution on Kerberos
is that we use Kerberos auth to create the table entries. After that, it's just a
string which is used as UGI information. It may require only a very small modification
to make it Kerberos-free, and that may be possible without a tenant-level abstraction.
I'm proposing here -- again -- to separate the two questions.
bq. I believe, we should just take the username/access-id as provided by the
authentication system and we need not do another mapping from external to
internal id. There could be a valid use case that a user "bob" can authenticate
him through either kerberos or S3-shared secret. We should leave it to
system-admin as to how they want their user-ids to be unique (or duplicate)
while creating S3 identity or kerberos identity. Unless there is a a use-case
that we can not solve with what we have, we need not complicate this part,
I don't fully understand this section. It would be good to get more information
about this. I think avoiding name collisions is not just the responsibility of the
system admin; we should carefully plan to avoid any security problems based on
name collisions.
bq. Looking at the code, I believe whatever we proposed in the design is doable.
There is more than one way to implement these features (tenancy + Kerberos-free
auth). The concern was not that it's impossible to do it this way; the concern
was that we may have multiple other options which are simpler, more
maintainable, and easier to implement. That's the reason why I suggested
extending the design doc, explaining the available options with pros/cons, and
explaining why we should do the implementation this way.
bq. Different multi-tenant-authorizer plugins can implement it in different
ways.
This is something which is not clear to me. I think neither bucket-namespace
isolation nor account-namespace isolation requires an additional level of
authorization. Based on my understanding, this is some additional feature which
is suggested on top of the isolations. It would be great to understand it
better. I have already asked for more information about the Ranger implementation
(the interface used, the functions of the non-Ranger implementation). Let me know if
this information is available somewhere.
bq. Buckets can be refered by "tenant-name:bucketname" syntax and can be
accessed by anyone who has authorization to access it. There are other ways to
do it too. But for now, we plan to support "tenant-name:bucketname" convention.
Are we talking about S3 bucket names here? A colon is not allowed in a bucket
name (AFAIK). Can we have more information about this mapping? (Is it for S3 or
ofs?)
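To make the concern concrete: AWS S3 bucket naming rules only allow lowercase letters, digits, dots, and hyphens, so a literal "tenant-name:bucketname" can never appear as a bucket name on the S3 wire protocol. A minimal sketch of the rule (simplified; AWS adds further restrictions such as disallowing IP-address-like names, which are omitted here):

```python
import re

# Simplified S3 bucket-name rule: 3-63 characters, lowercase letters, digits,
# dots, and hyphens, starting and ending with a letter or digit.
S3_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

def is_valid_s3_bucket_name(name: str) -> bool:
    return S3_BUCKET_NAME.fullmatch(name) is not None

print(is_valid_s3_bucket_name("bucket1"))        # True
print(is_valid_s3_bucket_name("sales:bucket1"))  # False -- colon not allowed
```

So the tenant:bucket convention would have to live somewhere other than the raw S3 bucket name; which interface it is meant for is exactly the question above.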
*Summary*:
I really appreciate the effort invested in finding out how multi-tenancy can be
supported. I agree that we should provide a solution for multi-tenancy (for me
it means bucket + account namespace isolation). But I have the same opinion as
before:
1. The uploaded design doc is very high-level and doesn't explain the proposed
solution in detail.
2. I suggested separating the Kerberos-free authentication problem from
multi-tenancy.
3. The attached design doc doesn't explain why we need to add a more complex
abstraction level instead of re-using volumes. I would expect it to explain the
possible approaches (e.g., using volumes for both account and bucket namespace
isolation).
*Note*:
I created a video about using Hashicorp Vault as a secret manager:
https://www.youtube.com/watch?v=VlKfvA0TgPk&list=PLCaV-jpCBO8XFUxVFV-zUK76ycgb5Dh-X&index=1&t=6s
While Vault is an external (but simple) component for secret management, it has
dozens of features: multiple types of storage, good integration with multiple
authentication mechanisms, unsealing, etc.
I don't think we could compete with existing secret managers, and I don't think
we should. While our interfaces can be improved in multiple ways, I think we
need a long-term vision and agreement on what complexity level is supposed to be
implemented in Ozone and which complex use cases should be supported by external
products.
For monitoring we use Prometheus instead of implementing our own method.
For ACL handling we have Ranger integration for advanced use cases and simple
ACL handling for everybody else.
Almost all the account-namespace isolation requirements can be fulfilled with
an external secret manager (like Vault).
Therefore, it's an important discussion what type of complexity should be the
responsibility of Ozone and what should not. This is where the mentioned
pro/con comparisons are really missing (for me) from the current design /
available information. That information might have been discussed internally, but
it is unavailable to me (or I just didn't get it).
> Multi-Tenant Support in Ozone
> -----------------------------
>
> Key: HDDS-4944
> URL: https://issues.apache.org/jira/browse/HDDS-4944
> Project: Apache Ozone
> Issue Type: New Feature
> Components: Ozone CLI, Ozone Datanode, Ozone Manager, Ozone Recon,
> S3, SCM, Security
> Affects Versions: 1.2.0
> Reporter: Prashant Pogde
> Assignee: Prashant Pogde
> Priority: Major
> Labels: pull-request-available
> Attachments: Apache-S3-compatible-Multi-Tenant-Ozone-short.pdf.gz,
> Ozone MultiTenant Feature _ Requirements and Abstractions-3.pdf, Ozone,
> Multi-tenancy, S3, Kerberos....pdf, UseCaseAWSCompatibility.pdf,
> UseCaseCephCompatibility.pdf, UseCaseConfigureMultiTenancy.png,
> UseCaseCurrentOzoneS3BackwardCompatibility.pdf,
> VariousActorsInteractions.png, uml_multitenant_interface_design.png
>
>
> This Jira will be used to track a new feature for Multi-Tenant support in
> Ozone. Initially Multi-Tenant feature would be limited to ozone-users
> accessing Ozone over S3 interface.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]