dimas-b commented on code in PR #3327:
URL: https://github.com/apache/polaris/pull/3327#discussion_r2651206316


##########
polaris-core/src/main/java/org/apache/polaris/core/storage/cache/StorageCredentialCache.java:
##########
@@ -130,10 +146,9 @@ public StorageAccessConfig getOrGenerateSubScopeCreds(
             allowedReadLocations,
             allowedWriteLocations,
             refreshCredentialsEndpoint,
-            includePrincipalNameInSubscopedCredential
-                ? Optional.of(polarisPrincipal)
-                : Optional.empty());
-    LOGGER.atDebug().addKeyValue("key", key).log("subscopedCredsCache");
+            includePrincipalInCacheKey ? Optional.of(polarisPrincipal) : Optional.empty(),
+            includeSessionTags ? Optional.of(credentialVendingContext) : Optional.empty());
+    LOGGER.atDebug().addKeyValue("key", key.toSanitizedLogString()).log("subscopedCredsCache");

Review Comment:
   Is this debug statement valuable at all? If we want to hide "potentially 
sensitive information", that reduces the usefulness of logging the key pretty 
much to zero, I guess 🤔 
   
   To be clear, I support protecting "potentially sensitive information". I'm 
suggesting that this log statement may be removed now... WDYT?



##########
runtime/service/src/main/java/org/apache/polaris/service/catalog/io/StorageAccessConfigProvider.java:
##########
@@ -137,4 +150,56 @@ public StorageAccessConfig getStorageAccessConfig(
     }
     return accessConfig;
   }
+
+  /**
+   * Builds a credential vending context from the table identifier and resolved path. This context
+   * is used to populate session tags in cloud provider credentials for audit/correlation purposes.
+   *
+   * @param tableIdentifier the table identifier containing namespace and table name
+   * @param resolvedPath the resolved entity path containing the catalog entity
+   * @return a credential vending context with catalog, namespace, table, and request ID
+   */
+  private CredentialVendingContext buildCredentialVendingContext(
+      TableIdentifier tableIdentifier, PolarisResolvedPathWrapper resolvedPath) {
+    CredentialVendingContext.Builder builder = CredentialVendingContext.builder();
+
+    // Extract catalog name from the first entity in the resolved path
+    List<PolarisEntity> fullPath = resolvedPath.getRawFullPath();
+    if (fullPath != null && !fullPath.isEmpty()) {
+      builder.catalogName(fullPath.get(0).getName());
+    }
+
+    // Extract namespace from table identifier
+    Namespace namespace = tableIdentifier.namespace();
+    if (namespace != null && namespace.length() > 0) {
+      builder.namespace(String.join(".", namespace.levels()));
+    }
+
+    // Extract table name from table identifier
+    builder.tableName(tableIdentifier.name());
+
+    // Extract request ID from the current request context
+    getRequestId().ifPresent(builder::requestId);
+
+    return builder.build();
+  }
+
+  /**
+   * Extracts the request ID from the current request context.
+   *
+   * <p>Note: we must avoid injecting {@link jakarta.ws.rs.container.ContainerRequestContext} here,
+   * because this may cause some tests to fail, e.g. when running with no active request scope.
+   *
+   * @return the request ID if available, empty otherwise
+   */
+  private Optional<String> getRequestId() {
+    // See org.jboss.resteasy.reactive.server.injection.ContextProducers
+    ResteasyReactiveRequestContext context = CurrentRequestManager.get();

Review Comment:
   Is it possible to avoid using `ResteasyReactiveRequestContext` here? Some 
access paths needing storage credentials may run outside of the REST API 
context (e.g. async tasks).
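
One way to decouple the lookup from the REST layer, sketched under assumptions: the component receives the request ID through a plain `Supplier`, so a REST handler and an async task can each provide their own source. `VendingContextSketch` and its `describe` method are hypothetical names used only for illustration; they are not existing Polaris types.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch: this class does not exist in Polaris.
final class VendingContextSketch {
  // The request ID source is injected, so this class has no dependency on
  // any REST framework and also works when no request scope is active.
  private final Supplier<Optional<String>> requestIdSource;

  VendingContextSketch(Supplier<Optional<String>> requestIdSource) {
    this.requestIdSource = requestIdSource;
  }

  // Stand-in for building the real context: attaches the request ID if present.
  String describe(String tableName) {
    return requestIdSource
        .get()
        .map(id -> tableName + " [request " + id + "]")
        .orElse(tableName + " [no request]");
  }
}
```

A REST entry point could pass a supplier backed by its request context, while an async task could pass `Optional::empty` or a task-specific correlation ID.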



##########
polaris-core/src/main/java/org/apache/polaris/core/storage/CredentialVendingContext.java:
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.core.storage;
+
+import java.util.Optional;
+import org.apache.polaris.immutables.PolarisImmutable;
+
+/**
+ * Context information for credential vending operations. This context is used to provide metadata
+ * that can be attached to credentials as session tags (e.g., AWS STS session tags) for audit and
+ * correlation purposes in CloudTrail and similar logging systems.
+ *
+ * <p>When session tags are enabled, this context provides:
+ *
+ * <ul>
+ *   <li>{@code catalogName} - The name of the catalog vending credentials
+ *   <li>{@code namespace} - The namespace/database being accessed (e.g., "db.schema")
+ *   <li>{@code tableName} - The name of the table being accessed
+ *   <li>{@code requestId} - A unique request identifier for correlation with catalog audit logs
+ * </ul>
+ *
+ * <p>These values appear in cloud provider audit logs (e.g., AWS CloudTrail), enabling
+ * deterministic correlation between catalog operations and data access events.
+ */
+@PolarisImmutable
+public interface CredentialVendingContext {

Review Comment:
   The code related to vended credential generation is in active development 
and refactoring now... Cf. #3270 ... So I guess this class may have to change 
too at some point. I hope you do not mind.
   
   That said, I think introducing this context in the current codebase makes 
sense.
   
   I'd like to propose moving the endpoint config and the principal name here 
too. Then, we could have a "reduce" method using `RealmConfig` to trim the 
information down to the required subset, giving more efficient cache 
keys in environments where the new features are not required. WDYT?
   
   e.g. (pseudo-code):
   
   ```
   CredentialVendingContext reduce(RealmConfig config) {
     if (includePrincipalInCacheKey) {...}
   }
   ```
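
The "reduce" idea could be sketched in Java roughly as follows. This is an illustration under assumptions, not the proposed implementation: `Flags` is a hypothetical stand-in for the values that would be read from `RealmConfig`, and `VendingContext` is a simplified, hypothetical version of the context with only three fields.

```java
import java.util.Optional;

// Hypothetical stand-in for the feature flags that would come from RealmConfig.
record Flags(boolean includePrincipalInCacheKey, boolean includeSessionTags) {}

// Hypothetical, simplified context; the real class may carry different fields.
record VendingContext(
    Optional<String> principalName, Optional<String> catalogName, Optional<String> tableName) {

  // Drops every field that the enabled features do not need, so that
  // otherwise-identical requests collapse onto the same cache key.
  VendingContext reduce(Flags flags) {
    return new VendingContext(
        flags.includePrincipalInCacheKey() ? principalName : Optional.empty(),
        flags.includeSessionTags() ? catalogName : Optional.empty(),
        flags.includeSessionTags() ? tableName : Optional.empty());
  }
}
```

Because records compare by value, two reduced contexts are equal whenever they differ only in fields that the flags have switched off, which is what makes the reduced form usable as a compact cache key.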
   
   



##########
polaris-core/src/main/java/org/apache/polaris/core/storage/cache/StorageCredentialCache.java:
##########
@@ -144,7 +159,8 @@ public StorageAccessConfig getOrGenerateSubScopeCreds(
                   allowedReadLocations,
                   allowedWriteLocations,
                   polarisPrincipal,
-                  refreshCredentialsEndpoint);
+                  refreshCredentialsEndpoint,
+                  credentialVendingContext);

Review Comment:
   For the sake of correctness, we should use the value from the cache key 
here. Entries are looked up by key, so a cache hit returns whatever was 
generated for an equal key, while the `credentialVendingContext` passed in here 
may have a different value from the one in the key.
   
   This applies to all values, but it's critical for `credentialVendingContext` 
and `polarisPrincipal`.
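
A minimal sketch of that rule, assuming a plain `ConcurrentHashMap` in place of the real cache: the loader reads only from the key, never from the caller's arguments. `CacheKey` and `CredsCacheSketch` are hypothetical, heavily simplified names, and the returned string merely stands in for generated credentials.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified cache key; the real key has more fields.
record CacheKey(String location, Optional<String> principalName) {}

final class CredsCacheSketch {
  private final Map<CacheKey, String> cache = new ConcurrentHashMap<>();

  String getOrGenerate(String location, String principalName, boolean includePrincipalInKey) {
    Optional<String> keyPrincipal =
        includePrincipalInKey ? Optional.of(principalName) : Optional.empty();
    CacheKey key = new CacheKey(location, keyPrincipal);
    // The loader uses only fields of the key `k`, so a cached entry can never
    // embed a value that differs from the key it is stored under.
    return cache.computeIfAbsent(
        key, k -> "creds(" + k.location() + "," + k.principalName().orElse("-") + ")");
  }
}
```

With the principal excluded from the key, a second caller receives the entry generated for the first caller. Since the loader saw only the key, that shared entry contains nothing specific to the first caller.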



##########
polaris-core/src/main/java/org/apache/polaris/core/config/FeatureConfiguration.java:
##########
@@ -91,6 +91,28 @@ public static void enforceFeatureEnabledOrThrow(
           .defaultValue(false)
           .buildFeatureConfiguration();
 
+  /**
+   * When enabled, includes session tags (catalog, namespace, table, principal, request-id) in AWS

Review Comment:
   nit: the javadoc is mostly equivalent to the flag's description... why 
bother duplicating that info?



