XJDKC commented on code in PR #1506:
URL: https://github.com/apache/polaris/pull/1506#discussion_r2070930235
##########
spec/polaris-management-service.yml:
##########
@@ -938,6 +940,34 @@ components:
format: password
description: Bearer token (input-only)
+ SigV4AuthenticationParameters:
+ type: object
+ description: AWS Signature Version 4 authentication
+ allOf:
+ - $ref: '#/components/schemas/AuthenticationParameters'
+ properties:
+ roleArn:
+ type: string
+ description: The aws IAM role arn assume when signing requests
Review Comment:
Yes, this assumes the use of STS. While SigV4 can technically work with just
a keyID/keySecret pair, that's not how it works in Polaris.
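
For reference, the "just a keyID/keySecret pair" case can be sketched with the standard SigV4 signing-key derivation, which only needs the secret key plus the date, region, and service. This is a minimal stdlib-only illustration, not Polaris code; the function names are made up for the example.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """One HMAC-SHA256 step of the SigV4 key-derivation chain."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str,
                       region: str, service: str) -> bytes:
    """Derive the SigV4 signing key from a secret access key.

    date_stamp is the request date as YYYYMMDD, e.g. "20250101".
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")
```

The derivation is the same for temporary STS credentials; the only difference is that the request also carries the session token in the `X-Amz-Security-Token` header.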
Let me break it down a bit:
Polaris acts as the service provider, and it has its own IAM user with an
AWS credential (key ID and key secret). However, this IAM user is owned by the
Polaris service, not by the Polaris user. Since IAM doesn't allow one AWS
account to grant privileges directly to an IAM user belonging to another AWS
account, Polaris wouldn't be able to access the Polaris user's Glue catalog
that way.
So how does Polaris access a user's Glue catalog? By assuming the IAM role
provided by the Polaris user. This lets the Polaris service temporarily inherit
the permissions tied to that role, essentially gaining the necessary access
without the user having to expose long-lived credentials.
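
To make the flow above concrete, here is a minimal sketch of the role-assumption step. It assumes a boto3-style STS client (anything exposing `assume_role`); the function name, the default session name, and the ARN in the usage note are hypothetical, and this is not Polaris's actual implementation.

```python
from typing import Any, Dict


def assume_user_role(sts_client: Any, role_arn: str,
                     session_name: str = "polaris-session",
                     duration_seconds: int = 900) -> Dict[str, str]:
    """Exchange the service's own identity for short-lived credentials.

    sts_client is assumed to be a boto3 STS client, e.g. boto3.client("sts"),
    created with the service's own long-lived credentials.
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,                  # role supplied by the user
        RoleSessionName=session_name,      # hypothetical session label
        DurationSeconds=duration_seconds,  # 900 s is the STS minimum
    )
    creds = response["Credentials"]
    # Temporary credentials always carry a session token and an expiry.
    return {
        "access_key_id": creds["AccessKeyId"],
        "secret_access_key": creds["SecretAccessKey"],
        "session_token": creds["SessionToken"],
        "expiration": str(creds["Expiration"]),
    }
```

With boto3 installed, something like `assume_user_role(boto3.client("sts"), "arn:aws:iam::123456789012:role/example")` would return credentials usable for signing Glue requests, provided the role's trust policy allows the service's IAM user to assume it.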
Now, why are there S3 implementations that don't use STS?
That's very common on the query engine side. In that case, the query engine
is fully managed by the Polaris user themselves. They can create an IAM user +
access key pair, grant that IAM user privileges to their storage, and configure
their engine to use that credential directly. No need for STS in that scenario.
So here's the key difference:
1. AWS user credentials (key ID/secret) are long-lived and tied to a user.
IAM roles aren't credentials at all; they're just a set of permissions. To access
resources, Polaris assumes a role and obtains short-lived temporary credentials
via STS. **This is much more secure.**
2. In the query engine use case, both the engine and the storage are owned
by the same identity (the Polaris user), so they're free to use long-lived user
credentials if they want.
3. The IAM role assumption model is actually consistent with how Polaris
accesses S3 storage in general. In Polaris S3 storage configs, users provide an
IAM role, Polaris assumes that role, and gets subscoped temporary credentials
via STS to access the storage.
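
The "subscoped" part of point 3 can be sketched as passing an inline session policy to `AssumeRole`, so the temporary credentials are limited to the intersection of the role's permissions and that policy. This is an illustration under assumptions, not Polaris's code: the function names, the session name, and the choice of S3 actions are hypothetical, and `sts_client` is assumed to be a boto3-style STS client.

```python
import json
from typing import Any, Dict


def build_session_policy(bucket: str, prefix: str) -> str:
    """Build an inline session policy limited to one S3 prefix (illustrative)."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            }
        ],
    }
    return json.dumps(policy)


def assume_subscoped(sts_client: Any, role_arn: str,
                     bucket: str, prefix: str) -> Dict[str, Any]:
    """Assume the user's role with credentials narrowed to bucket/prefix."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName="polaris-subscoped",  # hypothetical session label
        Policy=build_session_policy(bucket, prefix),
    )
    return response["Credentials"]
```

A session policy can only narrow what the role already allows; it never grants anything beyond the role's own permissions.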