paf91 opened a new pull request, #10266:
URL: https://github.com/apache/ozone/pull/10266
## What changes were proposed in this pull request?
This PR adds OIDC/WebIdentity support to Apache Ozone STS by implementing
`AssumeRoleWithWebIdentity` on top of the existing STS runtime.
The feature is disabled by default through
`ozone.sts.web.identity.enabled=false`, so there is no behavior change on
upgrade unless explicitly enabled.
This change allows clients to exchange a Keycloak/OIDC JWT for temporary S3
credentials. The returned credentials can then be used with normal AWS SigV4
requests and `x-amz-security-token`.
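As a rough illustration of the client side of this exchange, the sketch below builds a form-encoded `AssumeRoleWithWebIdentity` request body. Only `Action=AssumeRoleWithWebIdentity` and `WebIdentityToken` come from this description. The `Version`, `RoleArn`, and `RoleSessionName` parameters follow the AWS STS query API and are assumed, not confirmed, to be accepted by the Ozone `/sts` endpoint. The class name and the sample values are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper; not part of this PR. */
final class WebIdentityRequestSketch {

  /** Builds an application/x-www-form-urlencoded body for the STS call. */
  static String formBody(String roleArn, String sessionName, String jwt) {
    // LinkedHashMap keeps the parameters in insertion order.
    Map<String, String> params = new LinkedHashMap<>();
    params.put("Action", "AssumeRoleWithWebIdentity");
    params.put("Version", "2011-06-15");         // AWS STS API version (assumption)
    params.put("RoleArn", roleArn);               // AWS-style name (assumption)
    params.put("RoleSessionName", sessionName);   // AWS-style name (assumption)
    params.put("WebIdentityToken", jwt);
    return params.entrySet().stream()
        .map(e -> enc(e.getKey()) + "=" + enc(e.getValue()))
        .collect(Collectors.joining("&"));
  }

  private static String enc(String value) {
    return URLEncoder.encode(value, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Placeholder role ARN and token; a real caller would pass the JWT from the IdP.
    System.out.println(formBody(
        "arn:aws:iam::123456789012:role/demo", "demo-session", "header.payload.signature"));
  }
}
```

The body would be sent with `POST` to the S3 Gateway `/sts` path. The temporary credentials in the response are then used for ordinary SigV4 signing together with `x-amz-security-token`.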
### Architecture
The change extends the existing Ozone STS temporary credential path instead
of introducing a parallel S3 authentication model:
* Keycloak/OIDC authenticates the caller by issuing a signed JWT.
* Ozone STS validates the JWT using issuer, audience, expiry, and JWKS
signature checks.
* OM, not S3G, is the authoritative JWT validator.
* OM authorizes role assumption through the configured Ozone authorizer /
Ranger path.
* OM issues temporary S3 credentials using the existing STS token
infrastructure.
* Subsequent S3 requests continue to use normal SigV4 plus
`x-amz-security-token`.
* Subsequent S3 authorization continues through the existing session-policy
/ Ranger authorization path.
Keycloak groups and roles are treated only as identity attributes. Ranger or
the configured Ozone authorizer remains the policy decision point.
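The issuer, audience, and time-window checks listed above can be sketched as follows. This is a minimal illustration, not the PR's validator: the real module uses Nimbus JOSE + JWT and also verifies the signature against the JWKS, which this sketch assumes has already happened. The `Claims` holder, the `accept` method, and the way clock skew is applied are all assumptions.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

/** Hypothetical claim checks; signature verification is assumed to be done already. */
final class ClaimCheckSketch {

  /** Hypothetical holder for claims parsed from a verified JWT. */
  static final class Claims {
    final String issuer;
    final List<String> audience;
    final Instant notBefore;  // null when the token carries no nbf claim
    final Instant expiresAt;

    Claims(String issuer, List<String> audience, Instant notBefore, Instant expiresAt) {
      this.issuer = issuer;
      this.audience = audience;
      this.notBefore = notBefore;
      this.expiresAt = expiresAt;
    }
  }

  /** Returns true only if every check passes; any missing required claim rejects. */
  static boolean accept(Claims c, String expectedIssuer, String expectedAudience,
      Instant now, Duration skew) {
    if (c.issuer == null || !c.issuer.equals(expectedIssuer)) {
      return false;
    }
    if (c.audience == null || !c.audience.contains(expectedAudience)) {
      return false;
    }
    // Expired once (now - skew) has reached exp.
    if (c.expiresAt == null || !now.minus(skew).isBefore(c.expiresAt)) {
      return false;
    }
    // Not yet valid while (now + skew) is still before nbf.
    if (c.notBefore != null && now.plus(skew).isBefore(c.notBefore)) {
      return false;
    }
    return true;
  }
}
```

Passing `now` in explicitly keeps the check deterministic in tests. The skew parameter corresponds conceptually to `ozone.sts.web.identity.clock.skew`, though how the PR applies that setting is not shown in this description.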
This PR does **not** replace Kerberos daemon authentication, does **not**
add OFS OIDC login, does **not** add CLI device-code login, does **not** add
daemon-to-daemon OIDC authentication, and does **not** use Keycloak
Authorization Services as the Ozone policy engine.
### Backward compatibility
* Existing `AssumeRole` flow is unchanged.
* Existing permanent S3 secret flow is unchanged.
* Existing S3 SigV4 behavior is unchanged.
* Previously-issued `AssumeRole` tokens remain valid.
* Tokens serialized without `AuthType` deserialize as `ASSUME_ROLE`,
preserving compatibility with previously-issued tokens.
* The feature is disabled by default.
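The token compatibility rule above amounts to defaulting a missing auth type to `ASSUME_ROLE`. A minimal sketch of that idea follows, assuming a string-encoded field for simplicity. The real `STSTokenIdentifier` serialization format is not shown in this description, and the `WEB_IDENTITY` constant name and the `decode` helper are hypothetical.

```java
import java.util.Optional;

/** Hypothetical illustration of a backward-compatible auth-type default. */
final class AuthTypeCompatSketch {

  /** ASSUME_ROLE is named in the PR text; WEB_IDENTITY is an assumed name. */
  enum AuthType { ASSUME_ROLE, WEB_IDENTITY }

  /**
   * Decodes the optional auth-type field of a serialized token.
   * Tokens written before the field existed carry no value and
   * are treated as plain AssumeRole tokens.
   */
  static AuthType decode(Optional<String> serialized) {
    return serialized.map(AuthType::valueOf).orElse(AuthType.ASSUME_ROLE);
  }

  public static void main(String[] args) {
    System.out.println(decode(Optional.empty()));
    System.out.println(decode(Optional.of("WEB_IDENTITY")));
  }
}
```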
### Main implementation points
* Adds an STS-focused OIDC/JWT validation module with JWKS caching and
refresh.
* Adds `ozone.sts.web.identity.*` configuration keys.
* Adds `AssumeRoleWithWebIdentity` request/response models and S3G STS XML
response handling.
* Adds a tightly scoped unauthenticated bootstrap bypass that applies only to
`/sts` requests with `Action=AssumeRoleWithWebIdentity`, and only when the
feature is explicitly enabled.
* Validates and strips raw `WebIdentityToken` in OM `preExecute()` before
the Ratis-applied request is created.
* The Ratis-applied request contains only sanitized identity/session fields,
token expiry metadata, and token fingerprint.
* Extends `STSTokenIdentifier` with a backward-compatible auth type for
WebIdentity-backed credentials.
* Keeps existing `AssumeRole` token compatibility.
* Adds an authorizer hook for WebIdentity role assumption:
`IAccessAuthorizer.generateAssumeRoleWithWebIdentitySessionPolicy(...)`.
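To make the sanitization step above concrete, the sketch below derives a fingerprint from the raw JWT and builds a replicated payload that carries the fingerprint and expiry but not the token itself. The choice of SHA-256, the hex encoding, the field names, and the `sanitize` helper are assumptions for illustration; this description does not state how the PR computes the fingerprint.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of stripping a raw JWT before replication. */
final class TokenFingerprintSketch {

  /** Lower-case hex SHA-256 of the raw token (assumed fingerprint scheme). */
  static String fingerprint(String rawJwt) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-256");
      byte[] digest = md.digest(rawJwt.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // Every Java platform implementation is required to provide SHA-256.
      throw new IllegalStateException(e);
    }
  }

  /** Builds the fields that would be replicated; the raw JWT is deliberately left out. */
  static Map<String, String> sanitize(String subject, long expiryEpochSeconds, String rawJwt) {
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("subject", subject);
    fields.put("tokenExpiry", Long.toString(expiryEpochSeconds));
    fields.put("tokenFingerprint", fingerprint(rawJwt));
    return fields;
  }
}
```

A one-way fingerprint lets operators correlate a session with the token that produced it without the bearer token ever reaching the Ratis log.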
### Security notes
* S3G does not become the source of truth for JWT identity.
* Raw OIDC JWTs are not persisted in the Ratis-applied request, OM metadata,
STS tokens, or logs.
* WebIdentity temporary credentials require the existing STS session token
validation path.
* `SecretAccessKey` and temporary credential material follow the existing
STS credential handling model.
* Operators must still protect OM/Ratis logs and metadata files.
* `alg=none` is rejected.
* Issuer, audience, `exp`, `nbf`, and `iat` are validated.
* JWKS fetch is bounded by connect timeout, read timeout, and size limit.
* Unknown-kid refresh is debounced and fails closed.
* Insecure HTTP issuer/JWKS usage is test-only and emits an OM startup
warning when explicitly enabled.
* Deployments whose configured Ranger/Ozone authorizer does not override the
WebIdentity hook fail closed with `NOT_SUPPORTED_OPERATION`.
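The debounced unknown-`kid` refresh mentioned above can be pictured as a small gate in front of the JWKS fetch. The sketch below is one possible shape, not the PR's implementation: the class name, the method name, and the single-interval policy are assumptions. When the gate refuses a refresh, the caller is expected to reject the token rather than retry, which is the fail-closed behavior.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical gate that limits how often an unknown kid may trigger a JWKS refetch. */
final class KidRefreshDebounceSketch {
  private final Duration minInterval;
  private Instant lastRefresh;  // null until the first refresh attempt

  KidRefreshDebounceSketch(Duration minInterval) {
    this.minInterval = minInterval;
  }

  /**
   * Returns true if a refetch may be attempted now and records the attempt.
   * Returns false while the previous attempt is younger than minInterval.
   */
  synchronized boolean tryAcquireRefresh(Instant now) {
    if (lastRefresh != null && now.isBefore(lastRefresh.plus(minInterval))) {
      return false;  // debounced: caller rejects the token (fail closed)
    }
    lastRefresh = now;
    return true;
  }
}
```

Without such a gate, a client sending tokens with random `kid` values could force a JWKS fetch on every request.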
### Configuration keys
* `ozone.sts.web.identity.enabled`
* `ozone.sts.web.identity.issuer.uri`
* `ozone.sts.web.identity.jwks.uri`
* `ozone.sts.web.identity.audience`
* `ozone.sts.web.identity.username.claim`
* `ozone.sts.web.identity.subject.claim`
* `ozone.sts.web.identity.groups.claim`
* `ozone.sts.web.identity.roles.claim`
* `ozone.sts.web.identity.clock.skew`
* `ozone.sts.web.identity.jwks.refresh.interval`
* `ozone.sts.web.identity.jwks.connect.timeout`
* `ozone.sts.web.identity.jwks.read.timeout`
* `ozone.sts.web.identity.jwks.size.limit`
* `ozone.sts.web.identity.require.https`
* `ozone.sts.web.identity.allow.insecure.http.for.tests`
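For orientation, a minimal `ozone-site.xml` fragment enabling the feature might look like the following. The key names are the ones listed above. Every value is a placeholder: the hostname, realm, and audience are hypothetical, and the remaining keys are left at their defaults, which this description does not list.

```xml
<!-- Sketch only: all values below are placeholders, not recommended settings. -->
<property>
  <name>ozone.sts.web.identity.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ozone.sts.web.identity.issuer.uri</name>
  <value>https://keycloak.example.com/realms/ozone</value>
</property>
<property>
  <name>ozone.sts.web.identity.jwks.uri</name>
  <value>https://keycloak.example.com/realms/ozone/protocol/openid-connect/certs</value>
</property>
<property>
  <name>ozone.sts.web.identity.audience</name>
  <value>ozone-sts</value>
</property>
```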
### Dependency
* Uses Nimbus JOSE + JWT for OIDC/JWT/JWKS validation.
### Design / user docs added
* `hadoop-hdds/docs/content/design/oidc-assume-role-with-web-identity.md`
* `hadoop-hdds/docs/content/security/OzoneSTSWebIdentityKeycloakRanger.md`
## What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-15273
## How was this patch tested?
The patch was tested with focused unit, S3 Gateway, OM, mini-cluster, and
Keycloak Testcontainers coverage.
### Unit and component tests
```bash
mvn -Dmaven.repo.local=/tmp/m2-ozone \
-pl hadoop-ozone/common \
-am \
-DskipITs \
-DskipShade \
-Dtest='TestOidcJwtIdentityProvider,TestAssumeRoleWithWebIdentityRequest,TestAssumeRoleResponseInfo' \
test
```
Result:
```text
Tests run: 36, Failures: 0, Errors: 0, Skipped: 0
```
```bash
mvn -Dmaven.repo.local=/tmp/m2-ozone \
-pl hadoop-ozone/s3gateway \
-am \
-DskipITs \
-DskipShade \
-Dtest='TestS3STSEndpoint,TestS3STSWebIdentityAuthBypassFilter,TestAuthorizationFilter' \
test
```
Result:
```text
Tests run: 44, Failures: 0, Errors: 0, Skipped: 0
```
```bash
mvn -Dmaven.repo.local=/tmp/m2-ozone \
-pl hadoop-ozone/ozone-manager \
-am \
-DskipITs \
-DskipShade \
-Dtest='TestSTSTokenSecretManager,TestSTSSecurityUtil,TestS3AssumeRoleWithWebIdentityRequest,TestS3AssumeRoleRequest,TestS3AssumeRoleResponse,TestSTSTokenIdentifier' \
test
```
Result:
```text
Tests run: 77, Failures: 0, Errors: 0, Skipped: 0
```
### Mini-cluster E2E test
The mini-cluster E2E test verifies:
* generated JWT + local JWKS;
* `AssumeRoleWithWebIdentity`;
* temporary `AccessKeyId`, `SecretAccessKey`, and `SessionToken`;
* real AWS SDK v2 S3 SigV4 request with session token;
* allowed bucket succeeds;
* denied bucket fails;
* wrong/missing session credentials fail.
```bash
mvn -Dmaven.repo.local=/tmp/m2-ozone \
-pl hadoop-ozone/integration-test-s3 \
-am \
-DskipShade \
-Dtest=TestAssumeRoleWithWebIdentityEndToEnd \
test
```
Result:
```text
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
```
### Keycloak Testcontainers integration test
The Keycloak IT starts a real Keycloak container, imports a test realm,
obtains a real Keycloak JWT, exchanges it through Ozone STS, and uses the
returned temporary credentials for S3 operations.
```bash
env DOCKER_HOST=unix:///var/run/docker.sock \
TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE=/var/run/docker.sock \
mvn -Dapi.version=1.44 \
-Dmaven.repo.local=/tmp/m2-ozone \
-pl hadoop-ozone/integration-test-s3 \
-am \
-DskipShade \
-Dtest=TestAssumeRoleWithWebIdentityKeycloakIT \
test
```
Result:
```text
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0
```
The `-Dapi.version=1.44` parameter is used for Docker API compatibility in
the local Docker environment.
### Static check
```bash
git diff --check origin/HDDS-13323-sts..HEAD
```
Result:
```text
No output / clean
```