fmorg-git commented on code in PR #10338:
URL: https://github.com/apache/ozone/pull/10338#discussion_r3314245657


##########
hadoop-hdds/docs/content/design/oidc-assume-role-with-web-identity.md:
##########
@@ -0,0 +1,528 @@
+---
+title: OIDC AssumeRoleWithWebIdentity for Ozone STS
+summary: Web identity support for Ozone STS using OIDC and Ranger authorization
+date: 2026-05-13
+status: proposed
+---
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# OIDC AssumeRoleWithWebIdentity for Ozone STS
+
+## Status
+
+Proposed staged implementation.
+
+This document narrows the previous broad OIDC direction to an upstream-friendly
+MVP: extend the Ozone STS temporary S3 credential model with an AWS-compatible
+`AssumeRoleWithWebIdentity` action.
+
+## Problem
+
+Secure Ozone S3 deployments currently depend on Kerberos-backed identities for
+S3 credential issuance. Kubernetes workloads commonly already have OIDC tokens
+from Keycloak or another IdP, but do not have an easy Kerberos bootstrap path.
+
+The target is not a Kerberos-free Ozone cluster. The target is a narrow STS
+exchange:
+
+1. Keycloak authenticates the caller and issues a signed OIDC JWT.
+2. Ozone STS validates the JWT locally using JWKS.
+3. Ranger or the configured Ozone authorizer decides whether the identity may
+   assume the requested role.
+4. Ozone STS issues temporary S3 credentials.
+5. S3 Gateway and OM validate those temporary credentials and authorize object
+   operations using the assumed identity and session context.
+
+Ranger remains the authorization source of truth. Keycloak roles and groups are
+identity attributes only.
+
+## Current STS And S3 Security Path
+
+This design is based on the `origin/HDDS-13323-sts` branch at commit
+`37a224b217`, which contains the STS runtime that was missing from earlier
+base branches.
+
+The current STS runtime contains:
+
+- `/sts` HTTP endpoint in `org.apache.hadoop.ozone.s3sts.S3STSEndpoint`;
+- endpoint authentication setup in `S3STSEndpointBase`;
+- AWS STS `AssumeRole` XML response model in `S3AssumeRoleResponseXml`;
+- S3G to OM client path through `ObjectStore`, `ClientProtocol`, `RpcClient`,
+  `OzoneManagerProtocol`, and
+  `OzoneManagerProtocolClientSideTranslatorPB.assumeRole()`;
+- OM request handling in
+  `org.apache.hadoop.ozone.om.request.s3.security.S3AssumeRoleRequest`;
+- OM response handling in `S3AssumeRoleResponse`;
+- session token identifier and secret manager in `STSTokenIdentifier`,
+  `STSTokenSecretManager`, and `STSSecurityUtil`;
+- revoked STS token metadata and cleanup through `S3RevokeSTSTokenRequest`,
+  `S3DeleteRevokedSTSTokensRequest`, and
+  `RevokedSTSTokenCleanupService`;
+- authorization extension points in `AssumeRoleRequest`,
+  `IAccessAuthorizer.generateAssumeRoleSessionPolicy()`, and
+  `RequestContext.sessionPolicy`.
+
+The current S3 request authentication path is:
+
+- S3G parses AWS SigV4 in `AuthorizationFilter`,
+  `SignatureProcessor`, `AuthorizationV4HeaderParser`,
+  `AuthorizationV4QueryParser`, and `StringToSignProducer`.
+- `EndpointBase` creates `S3Auth` from the parsed access key, signature, and
+  string-to-sign, then stores it in the `ClientProtocol` thread-local.
+- `OzoneManagerProtocolClientSideTranslatorPB` copies `S3Auth` into
+  `OMRequest.s3Authentication`.
+- `AWSSignatureProcessor` extracts `x-amz-security-token` for temporary
+  credentials into `SignatureInfo.sessionToken`.
+- `EndpointBase` and `S3STSEndpointBase` propagate the session token into
+  `S3Auth`.
+- `S3SecurityUtil.validateS3Credential()` validates either permanent S3
+  credentials or STS temporary credentials.
+- For STS credentials, `STSSecurityUtil` decodes, validates, and decrypts the
+  session token, then OM validates the SigV4 signature with the temporary
+  secret access key.
+- `OmMetadataReader` attaches STS session policy from the OM thread-local
+  `STSTokenIdentifier` to `RequestContext` for subsequent authorization.
+
+The current permanent S3 credential storage path is:
+
+- `S3SecretManager` and `S3SecretManagerImpl`.
+- `S3SecretValue`.
+- OM metadata S3 secret table via `OmMetadataManagerImpl`.
+- `ozone s3 getsecret`, `setsecret`, and `revokesecret` client paths.
+
+## Dependency On Existing Ozone STS Runtime
+
+`AssumeRoleWithWebIdentity` must be an incremental extension of the existing
+`AssumeRole` runtime in `origin/HDDS-13323-sts`. It must not introduce a second
+STS endpoint, a separate S3 authentication system, or local S3G-only temporary
+credential state.
+
+Existing runtime:
+
+```text
+AssumeRole
+  -> S3 SigV4-authenticated /sts request
+  -> OM S3AssumeRoleRequest
+  -> temporary access key / secret / session token
+  -> STSSecurityUtil validation on later S3 requests
+  -> RequestContext.sessionPolicy
+  -> Ranger or configured authorizer
+```
+
+New runtime:
+
+```text
+AssumeRoleWithWebIdentity
+  -> unauthenticated /sts bootstrap request only for this action
+  -> OM validates Keycloak/OIDC JWT
+  -> OM authorizes role assumption through Ranger or configured authorizer
+  -> existing temporary credential issuer / session token path
+  -> existing STSSecurityUtil validation on later S3 requests
+  -> RequestContext.sessionPolicy and assumed identity
+  -> Ranger or configured authorizer
+```
+
+The OM runtime slice adds the WebIdentity request/protobuf path while preserving
+the existing `AssumeRole` flow. S3G parses and routes
+`Action=AssumeRoleWithWebIdentity`, but OM remains the authoritative validator
+and issuer.
+
+The raw `WebIdentityToken` is accepted only in the external OM RPC request. The
+OM leader validates it in `S3AssumeRoleWithWebIdentityRequest.preExecute()`,
+maps claims into a sanitized identity/session request, authorizes role
+assumption, generates temporary credential material using the existing STS
+helpers, and returns an `UpdateAssumeRoleWithWebIdentityRequest` for Ratis
+replication. The replicated request must not contain the raw JWT.
+
+`validateAndUpdateCache()` consumes only the sanitized update request. It must
+not call Keycloak, refresh JWKS, revalidate JWTs, or otherwise depend on current
+external IdP state during Ratis apply or replay. Credential expiration is
+computed by the leader before replication and stored as
+`credentialExpirationEpochSeconds` so replay does not depend on the apply-time
+clock.
+
+Temporary credentials must not be stored only in S3G memory. S3G can have
+multiple replicas and can restart. The issuing and validation authority must be
+OM-backed, persisted in Ozone metadata, or based on self-contained signed tokens
+whose signing keys are rotation-safe and available to all validating components.
+
+## Endpoint Placement
+
+The existing STS runtime places `/sts` on the S3 Gateway HTTP/HTTPS port.
+WebIdentity follows that placement: S3G exposes the AWS-compatible STS API
+surface, while OM remains authoritative for JWT validation, identity mapping,
+role-assumption authorization, credential issuance, revocation, and later
+temporary credential validation.
+
+Because existing `/sts` `AssumeRole` is protected by the normal S3 SigV4
+`AuthorizationFilter`, `AssumeRoleWithWebIdentity` needs a narrow bootstrap
+exception:
+
+- only for the STS application path;
+- only for `Action=AssumeRoleWithWebIdentity`;
+- only when `ozone.sts.web.identity.enabled=true`;
+- never for normal S3 object APIs;
+- never for existing `AssumeRole` or other STS actions.
+
+This exception must not make S3G a JWT source of truth. S3G may parse and route
+the request, but it must forward the web identity token and request context to
+OM. OM validates the JWT itself and issues the credentials.
+
+## RoleArn Semantics
+
+The current `AssumeRoleRequest` model contains `targetRoleName`, not a full AWS
+IAM role database. No role metadata store or IAM-like role lifecycle was found
+in this tree. `RoleArn` should therefore be treated as the authorization
+resource and request context for Ranger or the configured Ozone authorizer in
+the MVP.
+
+The Web Identity patch must not invent a new IAM role database. If the STS
+runtime already defines role ARN parsing or role-name normalization, Web
+Identity should reuse it. Otherwise, `RoleArn` remains an opaque policy resource
+for the authorizer and for audit/session context.
+
+## New Flow
+
+`AssumeRoleWithWebIdentity` is handled by the Ozone STS endpoint.
+
+Request parameters:
+
+- `Action=AssumeRoleWithWebIdentity`
+- `RoleArn=<role>`
+- `RoleSessionName=<session>`
+- `WebIdentityToken=<OIDC JWT>`
+- `DurationSeconds=<optional>`
+- `Policy=<optional, only if the existing STS AssumeRole session policy path is
+  implemented cleanly>`
+- `ProviderId=<optional compatibility field>`
+
+Flow:
+
+1. The client or workload obtains an OIDC access token from Keycloak.
+2. The client calls Ozone STS with `AssumeRoleWithWebIdentity`.
+3. Ozone STS rejects the request unless `ozone.sts.web.identity.enabled=true`.
+4. S3G validates only the STS request shape that is safe to validate at the
+   edge: action, version, role ARN syntax, role session name, duration bounds,
+   and presence of `WebIdentityToken`.
+5. S3G forwards `RoleArn`, `RoleSessionName`, `WebIdentityToken`,
+   `DurationSeconds`, `ProviderId`, and request context to OM in the external
+   RPC request only.
+6. OM validates the JWT:
+   - token is a signed JWT;
+   - `alg=none` is rejected;
+   - signature validates against the configured JWKS;
+   - `iss` equals `ozone.sts.web.identity.issuer.uri`;
+   - configured audience is present;
+   - `exp`, `nbf`, and `iat` are validated with configured clock skew;
+   - configured username and subject claims are present.
+7. OM maps claims into an Ozone identity:
+   - username;
+   - subject;
+   - issuer;
+   - groups;
+   - roles;
+   - token expiration.
+8. OM builds an assume-role authorization request and calls Ranger or the
+   configured Ozone authorizer before issuing any credential.
+9. OM strips the raw JWT before Ratis replication and submits only sanitized
+   identity/session fields:
+   - role ARN and role session name;
+   - provider id;
+   - effective user;
+   - subject, issuer, audience;
+   - groups and roles;
+   - web identity token expiration;
+   - token fingerprint;
+   - requested/effective duration;
+   - credential expiration;
+   - derived session policy.
+10. If authorized, OM issues temporary S3 credentials:
+   - `Credentials.AccessKeyId`;
+   - `Credentials.SecretAccessKey`;
+   - `Credentials.SessionToken`;
+   - `Credentials.Expiration`;
+   - `SubjectFromWebIdentityToken`;
+   - `AssumedRoleUser`;
+   - `Audience`;
+   - `Provider`.
+11. The client uses those credentials with ordinary AWS SigV4 against S3G.
+12. S3G and OM validate the temporary credential, recover the assumed identity
+   and session policy, and pass them to the authorizer for every S3 operation.
+
+## Configuration
+
+The MVP uses STS-focused configuration keys:
+
+```properties
+ozone.sts.web.identity.enabled=false

Review Comment:
   As mentioned above, should `ozone.sts.*` be `ozone.s3g.sts.*` for consistency with the current STS implementation?
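
   For illustration only, a sketch of how the key quoted above might read under that prefix. The name `ozone.s3g.sts.web.identity.enabled` is hypothetical: it is derived from the `ozone.s3g.sts.*` pattern in this comment and has not been checked against the keys the existing STS implementation actually defines.

   ```properties
   # Hypothetical rename of ozone.sts.web.identity.enabled, assuming the
   # existing STS configuration keys share the ozone.s3g.sts. prefix.
   ozone.s3g.sts.web.identity.enabled=false
   ```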


