dimas-b commented on code in PR #3327:
URL: https://github.com/apache/polaris/pull/3327#discussion_r2651206316
##########
polaris-core/src/main/java/org/apache/polaris/core/storage/cache/StorageCredentialCache.java:
##########
@@ -130,10 +146,9 @@ public StorageAccessConfig getOrGenerateSubScopeCreds(
allowedReadLocations,
allowedWriteLocations,
refreshCredentialsEndpoint,
- includePrincipalNameInSubscopedCredential
- ? Optional.of(polarisPrincipal)
- : Optional.empty());
- LOGGER.atDebug().addKeyValue("key", key).log("subscopedCredsCache");
+ includePrincipalInCacheKey ? Optional.of(polarisPrincipal) : Optional.empty(),
+ includeSessionTags ? Optional.of(credentialVendingContext) : Optional.empty());
+ LOGGER.atDebug().addKeyValue("key", key.toSanitizedLogString()).log("subscopedCredsCache");
Review Comment:
Is this debug statement valuable at all? If we want to hide "potentially
sensitive information", that reduces the usefulness of logging the key pretty
much to zero, I guess 🤔
To be clear, I support protecting "potentially sensitive information". I'm
suggesting that this log statement may be removed now... WDYT?
##########
runtime/service/src/main/java/org/apache/polaris/service/catalog/io/StorageAccessConfigProvider.java:
##########
@@ -137,4 +150,56 @@ public StorageAccessConfig getStorageAccessConfig(
}
return accessConfig;
}
+
+ /**
+ * Builds a credential vending context from the table identifier and resolved path. This context
+ * is used to populate session tags in cloud provider credentials for audit/correlation purposes.
+ *
+ * @param tableIdentifier the table identifier containing namespace and table name
+ * @param resolvedPath the resolved entity path containing the catalog entity
+ * @return a credential vending context with catalog, namespace, table, and request ID
+ */
+ private CredentialVendingContext buildCredentialVendingContext(
+ TableIdentifier tableIdentifier, PolarisResolvedPathWrapper resolvedPath) {
+ CredentialVendingContext.Builder builder = CredentialVendingContext.builder();
+
+ // Extract catalog name from the first entity in the resolved path
+ List<PolarisEntity> fullPath = resolvedPath.getRawFullPath();
+ if (fullPath != null && !fullPath.isEmpty()) {
+ builder.catalogName(fullPath.get(0).getName());
+ }
+
+ // Extract namespace from table identifier
+ Namespace namespace = tableIdentifier.namespace();
+ if (namespace != null && namespace.length() > 0) {
+ builder.namespace(String.join(".", namespace.levels()));
+ }
+
+ // Extract table name from table identifier
+ builder.tableName(tableIdentifier.name());
+
+ // Extract request ID from the current request context
+ getRequestId().ifPresent(builder::requestId);
+
+ return builder.build();
+ }
+
+ /**
+ * Extracts the request ID from the current request context.
+ *
+ * <p>Note: we must avoid injecting {@link jakarta.ws.rs.container.ContainerRequestContext} here,
+ * because this may cause some tests to fail, e.g. when running with no active request scope.
+ *
+ * @return the request ID if available, empty otherwise
+ */
+ private Optional<String> getRequestId() {
+ // See org.jboss.resteasy.reactive.server.injection.ContextProducers
+ ResteasyReactiveRequestContext context = CurrentRequestManager.get();
Review Comment:
Is it possible to avoid using `ResteasyReactiveRequestContext` here? Some
access paths needing storage credentials may run outside of the REST API
context (e.g. async tasks).
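One possible shape for this, sketched under assumptions: hide the request ID lookup behind a small supplier interface, so that the REST layer provides one implementation and callers without a request scope (such as async tasks) provide none. `RequestIdSupplier`, `VendingContextFactory` and `describe` are hypothetical names used only for illustration; they are not part of Polaris or RESTEasy Reactive.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical abstraction, not part of the Polaris codebase: the code that
// builds the vending context only sees a supplier of an optional request ID.
interface RequestIdSupplier extends Supplier<Optional<String>> {
  // Used by callers that run outside of any REST request (e.g. async tasks).
  RequestIdSupplier NONE = Optional::empty;
}

// Hypothetical stand-in for the provider class under review.
final class VendingContextFactory {
  private final RequestIdSupplier requestIds;

  VendingContextFactory(RequestIdSupplier requestIds) {
    this.requestIds = requestIds;
  }

  // Returns "table" or "table@requestId", with no framework types involved.
  String describe(String tableName) {
    return requestIds.get().map(id -> tableName + "@" + id).orElse(tableName);
  }
}

final class RequestIdSupplierDemo {
  public static void main(String[] args) {
    // REST layer: the supplier would read the ID from the active request.
    VendingContextFactory rest = new VendingContextFactory(() -> Optional.of("req-123"));
    // Async task: no request scope, so no request ID is attached.
    VendingContextFactory task = new VendingContextFactory(RequestIdSupplier.NONE);
    System.out.println(rest.describe("orders"));
    System.out.println(task.describe("orders"));
  }
}
```

With this split, only the REST-side implementation of the supplier would need to know about the framework's request context.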
##########
polaris-core/src/main/java/org/apache/polaris/core/storage/CredentialVendingContext.java:
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.core.storage;
+
+import java.util.Optional;
+import org.apache.polaris.immutables.PolarisImmutable;
+
+/**
+ * Context information for credential vending operations. This context is used to provide metadata
+ * that can be attached to credentials as session tags (e.g., AWS STS session tags) for audit and
+ * correlation purposes in CloudTrail and similar logging systems.
+ *
+ * <p>When session tags are enabled, this context provides:
+ *
+ * <ul>
+ * <li>{@code catalogName} - The name of the catalog vending credentials
+ * <li>{@code namespace} - The namespace/database being accessed (e.g., "db.schema")
+ * <li>{@code tableName} - The name of the table being accessed
+ * <li>{@code requestId} - A unique request identifier for correlation with catalog audit logs
+ * </ul>
+ *
+ * <p>These values appear in cloud provider audit logs (e.g., AWS CloudTrail), enabling
+ * deterministic correlation between catalog operations and data access events.
+ */
+@PolarisImmutable
+public interface CredentialVendingContext {
Review Comment:
The code related to vended credential generation is in active development
and refactoring now... Cf. #3270 ... So I guess this class may have to change
too at some point. I hope you do not mind.
That said, I think introducing this context in the current codebase makes
sense.
I'd like to propose moving the endpoint config and the principal name here
too. Then, we could have a "reduce" method using `RealmConfig` to trim the
information down to the required subset, so that cache keys are more efficient
in environments where the new features are not required. WDYT?
e.g. (pseudo-code):
```
CredentialVendingContext reduce(RealmConfig config) {
  if (includePrincipalInCacheKey) {...}
}
```
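To make the idea a bit more concrete, here is a minimal self-contained sketch, assuming heavily simplified stand-ins. `VendingFlags` and `VendingContext` are hypothetical types for illustration only; the real `CredentialVendingContext` and `RealmConfig` have different shapes, and the choice of which fields each flag controls is an assumption.

```java
import java.util.Optional;

// Hypothetical stand-in for the feature flags that would be read from the
// realm configuration.
record VendingFlags(boolean includePrincipalInCacheKey, boolean includeSessionTags) {}

// Hypothetical, simplified context with only three optional fields.
record VendingContext(
    Optional<String> principalName,
    Optional<String> catalogName,
    Optional<String> tableName) {

  // Keeps only the fields that the enabled features actually need, so that
  // requests differing only in unused fields map to the same cache key.
  VendingContext reduce(VendingFlags flags) {
    return new VendingContext(
        flags.includePrincipalInCacheKey() ? principalName : Optional.empty(),
        flags.includeSessionTags() ? catalogName : Optional.empty(),
        flags.includeSessionTags() ? tableName : Optional.empty());
  }
}

final class ReduceDemo {
  public static void main(String[] args) {
    VendingFlags allOff = new VendingFlags(false, false);
    VendingContext a =
        new VendingContext(Optional.of("alice"), Optional.of("cat"), Optional.of("t1"));
    VendingContext b =
        new VendingContext(Optional.of("bob"), Optional.of("cat"), Optional.of("t2"));
    // With both features disabled, the two reduced contexts compare as equal.
    System.out.println(a.reduce(allOff).equals(b.reduce(allOff)));
  }
}
```

In this sketch the reduced context, not the full one, would be the part that goes into the cache key.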
##########
polaris-core/src/main/java/org/apache/polaris/core/storage/cache/StorageCredentialCache.java:
##########
@@ -144,7 +159,8 @@ public StorageAccessConfig getOrGenerateSubScopeCreds(
allowedReadLocations,
allowedWriteLocations,
polarisPrincipal,
- refreshCredentialsEndpoint);
+ refreshCredentialsEndpoint,
+ credentialVendingContext);
Review Comment:
For the sake of correctness, we should use the value from the cache key
here: the cache returns previous entries based on key equality, but
`credentialVendingContext` here and in the key may have different values.
This applies to all values, but it's critical for `credentialVendingContext`
and `polarisPrincipal`.
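A minimal sketch of that pattern, assuming a plain `ConcurrentHashMap` in place of the real cache: the loader is a function of the key alone, so it cannot read request values that are absent from the key. `CredsKey` and `CredsCache` are hypothetical names and do not mirror the actual `StorageCredentialCache` API.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical, simplified cache key; the real key in Polaris has more fields.
record CredsKey(String location, Optional<String> principalName) {}

final class CredsCache {
  private final Map<CredsKey, String> entries = new ConcurrentHashMap<>();
  private final Function<CredsKey, String> loader;

  CredsCache(Function<CredsKey, String> loader) {
    this.loader = loader;
  }

  // The loader receives only the key, so a cached value can never depend on
  // request data that is not part of the key's equality.
  String get(CredsKey key) {
    return entries.computeIfAbsent(key, loader);
  }
}

final class CredsCacheDemo {
  public static void main(String[] args) {
    CredsCache cache =
        new CredsCache(
            key ->
                "creds for " + key.location() + " as " + key.principalName().orElse("<any>"));
    System.out.println(cache.get(new CredsKey("s3://bucket/t1", Optional.of("alice"))));
    System.out.println(cache.get(new CredsKey("s3://bucket/t1", Optional.empty())));
  }
}
```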
##########
polaris-core/src/main/java/org/apache/polaris/core/config/FeatureConfiguration.java:
##########
@@ -91,6 +91,28 @@ public static void enforceFeatureEnabledOrThrow(
.defaultValue(false)
.buildFeatureConfiguration();
+ /**
+ * When enabled, includes session tags (catalog, namespace, table, principal, request-id) in AWS
Review Comment:
nit: the javadoc is mostly equivalent to the flag's description... why
bother duplicating that info?