Sbaia commented on PR #60498: URL: https://github.com/apache/doris/pull/60498#issuecomment-3851713914
Thanks for picking this up and building on #59893, @CalvinKirs. I want to clarify an important distinction about what this PR actually enables vs. what users on EKS/GKE typically need. **What this PR adds:** cross-account assume-role support (`AssumeRoleAwsClientFactory`), where a `role_arn` is explicitly provided in the catalog properties. **What this PR does NOT add:** support for IRSA (IAM Roles for Service Accounts) or, more generally, the AWS default credentials chain without any explicit credential configuration. The validation in `IcebergRestProperties` (lines 128-137) explicitly rejects the case where neither access-key/secret-key nor `role_arn` is provided: ```java boolean hasAccessKeyAndSecret = ...; boolean hasIamRole = ...; return !hasAccessKeyAndSecret && !hasIamRole; // fails validation ``` With IRSA (and future equivalents like GKE Workload Identity or ECS task roles), the pod already has credentials injected via the service account. The user doesn't have a `role_arn` to specify in the catalog DDL — the role is attached to the pod's ServiceAccount via the `eks.amazonaws.com/role-arn` annotation. The correct behavior is to let Iceberg use the AWS SDK's `DefaultCredentialsProvider`, which picks up the IRSA token automatically. This means accepting a catalog definition with **no credential properties at all**. **Regarding the assume-role path itself:** `AssumeRoleAwsClientFactory` uses the AWS SDK's default credentials chain internally to obtain the **base credentials** needed to call `sts:AssumeRole`. So who actually performs the AssumeRole call depends on the environment: - **EKS with IRSA configured:** the pod's ServiceAccount role (via web identity token) is used as the base identity to assume the target role. This requires a trust policy on the target role that allows the SA role to assume it. - **EKS without IRSA:** falls back to the EC2 instance profile (node role via IMDS). The node role needs `sts:AssumeRole` permission on the target role. This is less secure since all pods on the node share the same base identity. - **EC2 (non-Kubernetes):** the instance profile role is used directly. So the assume-role feature works, but it's solving a different use case (cross-account access) than what most EKS users need (direct IRSA credentials without specifying any role_arn). Our original PR #59893 takes a simpler approach: when no explicit credentials are provided, we just don't set any credential properties, allowing Iceberg/AWS SDK to resolve credentials through the default chain. This covers IRSA, instance profiles, environment variables, and any future pod identity mechanism — with zero configuration in the catalog DDL. I'd suggest either: 1. Relaxing the validation to allow no credentials at all (supporting both IRSA and assume-role), or 2. Keeping assume-role as an addition on top of #59893's default-chain support Happy to collaborate on merging the two approaches. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
