jerryshao opened a new issue, #11170:
URL: https://github.com/apache/gravitino/issues/11170

   ### Version
   main branch
   
   ### Describe what's wrong
   
   Four related defects in the audit log subsystem affect the correctness and 
completeness of audit records.
   
   ---
   
   **1. Audit log file management is broken (no rotation, unreliable flushing)**
   
   `FileAuditWriter` opens a `FileOutputStream` with `append=true` and writes 
forever — no size cap, no time-based roll, no compression, no retention policy. 
Additionally, `tryFlush()` only fires when a new event arrives and the 
configured interval has elapsed, with no background timer. Combined: the log 
grows without bound, and under low traffic events can sit in the OS buffer 
indefinitely before reaching disk.
   
   **2. List event responses are not captured**
   
   `IcebergListTableEvent`, `IcebergListNamespacesEvent`, 
`IcebergListViewEvent`, and native equivalents (`ListMetalakeEvent`, 
`ListCatalogEvent`, `ListSchemaEvent`, `ListTableEvent`) store only the parent 
identifier. The response payload returned to the caller is never saved on the 
event object, so the audit log cannot answer "what was user X shown at time T?" 
— critical when RBAC filtering causes different users to see different results 
for the same query.
   
   **3. Client IP is not captured correctly on either API surface**
   
   On the Gravitino REST surface, `BaseEvent.remoteAddress()` returns the 
hardcoded string `"unknown"` and none of the event subclasses override it. On 
the Iceberg REST surface, `IcebergRequestContext` reads 
`httpRequest.getRemoteHost()` directly, recording the proxy IP rather than the 
real client behind any load balancer. In both cases, forensic correlation 
between Gravitino audit logs and network logs is not possible.
   
   **4. Event payload data is captured but discarded by the formatter**
   
   `SimpleFormatterV2` emits only 7 fixed fields. Rich data already present on 
event objects — `LoadTableResponse` (including vended credentials), 
`CatalogInfo`, request/response pairs — is never serialized. Critically, there 
is no way to tell from an audit row whether a `LoadTable` operation resulted in 
credential vending.
   
   ---
   
   ### Error message and/or stacktrace
   
   **Finding 1** — `FileAuditWriter.tryFlush()`:
   ```java
   private void tryFlush() {
       Instant now = Instant.now();
       if (now.isAfter(nextFlushTime)) {
           nextFlushTime = now.plusSeconds(flushIntervalSecs);
           doFlush();
       }
   }
   ```
   Flush is only triggered on write, never on a background timer.
   
   **Finding 2** — `IcebergListTableEvent` constructor takes no response 
payload:
   ```java
   public IcebergListTableEvent(
       IcebergRequestContext icebergRequestContext, NameIdentifier 
resourceIdentifier) {
       super(icebergRequestContext, resourceIdentifier);
   }
   ```
   
   **Finding 3** — `BaseEvent.remoteAddress()` hardcoded in 
`core/src/main/java/org/apache/gravitino/listener/api/event/BaseEvent.java`:
   ```java
   public String remoteAddress() {
       return "unknown";
   }
   ```
   `IcebergRequestContext`: `this.remoteHostName = 
httpRequest.getRemoteHost();` — no `X-Forwarded-For` / RFC 7239 `Forwarded` 
header consultation.
   
   **Finding 4** — `SimpleAuditLogV2.toString()` serializes only 7 fields; 
`customInfo` and all response-payload fields on event objects are silently 
dropped.
   
   ### How to reproduce
   
   - **Finding 1:** Run the server under low traffic; observe that audit events 
only appear on disk when a subsequent write triggers a flush.
   - **Finding 2:** Fire any list operation (e.g. list tables); confirm the 
audit row contains no indication of which identifiers were returned.
   - **Finding 3:** Route requests through a proxy or load balancer; confirm 
every `eventSource=GRAVITINO_SERVER` audit row shows `remoteAddress=unknown`, 
and Iceberg REST rows show the proxy IP rather than the originating client.
   - **Finding 4:** Fire a `LoadTable` with and without 
`X-Iceberg-Access-Delegation: vended-credentials`; confirm the resulting audit 
rows are identical.
   
   ### Additional context
   
   Suggested fixes:
   1. Replace `FileAuditWriter` with a rotating writer (size + time triggers, 
compression, configurable retention). Flush must run on a background timer 
independent of write activity.
   2. Update list event dispatchers to pass response payloads into event 
constructors and expose a bounded representation (count + identifiers with 
overflow indicator) in the audit row.
   3. Capture client IP from the inbound HTTP request on the Gravitino REST 
surface, and honor `X-Forwarded-For` (RFC 7239) on both surfaces with a 
configurable trusted-proxy list.
   4. Extend the formatter to serialize response-payload fields; at minimum, 
`LoadTable`, `CreateTable`, and `RegisterTable` rows must indicate whether 
credentials were vended and, if so, the role ARN, session name, expiration, and 
granted scope.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to