GitHub user weiqingy edited a discussion: [Feature] Per-Event-Type Configurable Log Levels for Event Log
GitHub issue: https://github.com/apache/flink-agents/issues/541

## Motivation

The event log captures every event flowing through an agent for debugging, auditing, and observability. Today there is no mechanism to control the verbosity of logged events. This makes it impossible to:

- Log some event types at full detail while keeping others concise.
- Reduce log volume in production without losing visibility into critical event types.
- Adjust verbosity for a single event type at job submission time.

This design introduces **per-event-type configurable log levels** so that operators can independently control the verbosity of each event type.

## Log Levels

Three log levels, ordered from least to most verbose:

| Level | Behavior |
|---|---|
| `OFF` | Event is not logged at all. |
| `STANDARD` | Event is logged. Large content fields may be truncated to keep logs concise (see [Truncation Strategy](#truncation-strategy-standard-level)). |
| `VERBOSE` | Event is logged with full detail. Nothing is omitted. |

The default level for all event types is `STANDARD` with truncation active. This means `STANDARD` and `VERBOSE` have distinct behaviors out of the box.

### Why Default to STANDARD with Active Truncation

| Approach | Out-of-the-box Behavior | Backward Compatible? | Semantic Clarity |
|---|---|---|---|
| **STANDARD + active truncation (chosen)** | Events logged with long fields truncated automatically. | No. Existing logs may be truncated after upgrade. | High. STANDARD and VERBOSE are immediately distinct. |
| VERBOSE (no truncation) | All events logged in full, identical to today. | Yes. Zero behavior change. | Medium. Users must opt-in to STANDARD to see benefits. |

We chose active truncation because:

- **Semantic clarity**: `STANDARD` and `VERBOSE` mean different things from day one. No configuration required to see the distinction.
- **Simple opt-in path**: Operators who need full detail for specific event types simply set those types to `VERBOSE`.
- **Practical benefit by default**: AI agent events frequently contain very long LLM responses (10K+ characters) and tool outputs. Truncation keeps logs usable for monitoring without excessive disk usage.
- **Backward compatibility trade-off**: Existing users will see truncated logs after upgrade. This is mitigated by setting `event-log.level: VERBOSE` to restore the previous behavior, or setting specific event types to `VERBOSE` for full detail where needed.

## Configuration

### Config Key Pattern

Per-event-type settings use the pattern:

```
event-log.<EVENT_TYPE>.<property>
```

The event type appears in the middle, and the property name appears at the tail. This structure groups all settings for a given event type together and allows future per-type properties (e.g., routing events to different logger destinations) without restructuring the key namespace. This follows standard hierarchical logger configuration conventions.

**Future extensibility example:**

```yaml
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level: VERBOSE
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.logger: kafka  # future
```

### Event Type Names in Config Keys

Config keys use the **fully-qualified class name** of the event type. This avoids ambiguity when different packages define event classes with the same simple name.

```
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level=VERBOSE
event-log.org.apache.flink.agents.api.InputEvent.level=OFF
```

### Hierarchical Inheritance

Log level resolution follows **hierarchy inheritance**. The dot-separated event type name defines a natural hierarchy. When an event type has no exact config match, the level is inherited from the nearest configured ancestor. The root config key `event-log.level` serves as the global default, so no special `default` keyword is needed.

**Resolution order** (most specific wins):

1. **Exact match**: `event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level`
2. **Parent package**: `event-log.org.apache.flink.agents.api.event.level`
3. **Grandparent package**: `event-log.org.apache.flink.agents.api.level`
4. ... _(walks up the hierarchy)_
5. **Root**: `event-log.level`
6. **Built-in default**: `STANDARD` (if `event-log.level` is not configured)

**Example**: Given these event types:

```
org.apache.flink.agents.api.InputEvent
org.apache.flink.agents.api.OutputEvent
org.apache.flink.agents.api.event.ChatRequestEvent
org.apache.flink.agents.api.event.ToolRequestEvent
```

And this config:

```yaml
event-log.level: STANDARD                                                      # root default
event-log.org.apache.flink.agents.api.event.level: OFF                         # package-level
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level: VERBOSE    # exact type
```

Resolution:

| Event Type | Resolved Level | Why |
|---|---|---|
| `...api.event.ChatRequestEvent` | `VERBOSE` | Exact match |
| `...api.event.ToolRequestEvent` | `OFF` | No exact match → inherits from `...api.event` |
| `...api.InputEvent` | `STANDARD` | No exact match, no `...api` key → inherits from root |
| `...api.OutputEvent` | `STANDARD` | Same as above |

### Complete Config Key Reference

| Config Key | Type | Default | Description |
|---|---|---|---|
| `event-log.level` | String (`OFF` / `STANDARD` / `VERBOSE`) | `STANDARD` | Root default log level for all event types. |
| `event-log.<EVENT_TYPE>.level` | String (`OFF` / `STANDARD` / `VERBOSE`) | _(inherits from parent in hierarchy)_ | Log level for a specific event type or package. |
| `event-log.standard.max-string-length` | Integer | `2000` | Maximum character length for individual string values at `STANDARD` level. Strings exceeding this limit are truncated. `0` disables string truncation. Has no effect at `VERBOSE` level. |
| `event-log.standard.max-array-elements` | Integer | `20` | Maximum number of elements retained in arrays at `STANDARD` level. Arrays exceeding this limit are trimmed. `0` disables array trimming. Has no effect at `VERBOSE` level. |
| `event-log.standard.max-depth` | Integer | `5` | Maximum object nesting depth at `STANDARD` level. Objects nested beyond this depth are collapsed. `0` disables depth collapsing. Has no effect at `VERBOSE` level. |

### Configuration Examples

**Config file:**

```yaml
# Root default: log everything at STANDARD
event-log.level: STANDARD

# Java events: use Java FQCN
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level: VERBOSE
event-log.org.apache.flink.agents.api.event.ChatResponseEvent.level: VERBOSE
event-log.org.apache.flink.agents.api.event.ContextRetrievalRequestEvent.level: OFF
event-log.org.apache.flink.agents.api.event.ContextRetrievalResponseEvent.level: OFF

# Python events: use Python module path (the event type string from PythonEvent)
event-log.flink_agents.api.events.event.OutputEvent.level: VERBOSE
event-log.my_module.MyCustomEvent.level: OFF

# Truncation thresholds (defaults shown; adjust as needed):
# event-log.standard.max-string-length: 2000
# event-log.standard.max-array-elements: 20
# event-log.standard.max-depth: 5
```

The config key uses whatever type string appears in the event log's `eventType` field. For Java events, that's the Java FQCN (e.g., `org.apache.flink.agents.api.event.ChatRequestEvent`). For Python events, that's the Python module path (e.g., `flink_agents.api.events.event.OutputEvent`). The hierarchy inheritance works the same way for both: it walks up the dot-separated segments.

**Known limitations of the current model:**

- **Same logical event requires two config keys**: Java `OutputEvent` and Python `OutputEvent` are the same concept, but they have different type strings (`org.apache.flink.agents.api.OutputEvent` vs `flink_agents.api.events.event.OutputEvent`). There is no single config key that covers both.
- **Package-level config doesn't cross languages**: `event-log.org.apache.flink.agents.api.event.level: OFF` silences all Java events in that package, but equivalent Python events are unaffected.
- **No common ancestor below root**: Java hierarchies start with `org.apache...`, Python with `flink_agents...`. The only shared ancestor is the root `event-log.level`, which is too broad for targeted control.

These limitations are acceptable for the initial release because most jobs today are either pure Java or pure Python. See [Migration to Language-Independent Events](#migration-to-language-independent-events) for how these limitations are resolved when events become language-independent.

**Override at job submission time:**

```bash
# A shared config.yaml defines defaults for all jobs.
# Override just one event type for debugging a specific job run.
# Other per-type levels from config.yaml are preserved because
# each type has its own independent config key.
flink run ... \
  -Devent-log.org.apache.flink.agents.api.event.ChatRequestEvent.level=VERBOSE
```

## Truncation Strategy (STANDARD Level)

At `STANDARD` level, events may be truncated according to three independently configurable per-field thresholds. Each threshold controls one truncation strategy with clear, predictable behavior. Truncation **never** applies at `VERBOSE` level.

### Per-Field Thresholds

The three thresholds are applied while walking the serialized event's object graph:

1. **`max-string-length`** (default: `2000`): Any leaf string value exceeding this length is truncated. This most commonly affects LLM response text, tool call arguments, and tool response bodies.
2. **`max-array-elements`** (default: `20`): Arrays with more elements than this limit are trimmed to retain only the first N elements.
3. **`max-depth`** (default: `5`): Object structures nested beyond this depth are collapsed.

During the walk, array trimming first reduces the element count, string truncation then caps leaf string values within the retained structure, and depth collapsing handles deeply nested objects.
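The walk described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the function name `truncate` and its parameters are hypothetical, the root event object is assumed to be depth 1, only objects (not arrays) are assumed to add depth, and a collapsed object is assumed to keep just its scalar fields. The wrapper keys follow the Truncation Wrapper Format section. The exemption for identifying fields such as `eventType` and `id` is left out for brevity.

```python
from typing import Any


def truncate(value: Any, max_string_length: int = 2000,
             max_array_elements: int = 20, max_depth: int = 5,
             depth: int = 1) -> Any:
    """Return a truncated copy of a JSON-like value. A threshold of 0 is disabled."""
    limits = (max_string_length, max_array_elements, max_depth)

    if isinstance(value, str):
        if max_string_length and len(value) > max_string_length:
            return {"truncatedString": value[:max_string_length],
                    "omittedChars": len(value) - max_string_length}
        return value

    if isinstance(value, list):
        # Trim first, then walk only the retained elements.
        kept = value[:max_array_elements] if max_array_elements else value
        walked = [truncate(v, *limits, depth) for v in kept]
        omitted = len(value) - len(kept)
        if omitted:
            return {"truncatedList": walked, "omittedElements": omitted}
        return walked

    if isinstance(value, dict):
        if max_depth and depth > max_depth:
            # Assumption: a collapsed object keeps only its scalar fields.
            scalars = {k: truncate(v, *limits, depth)
                       for k, v in value.items()
                       if not isinstance(v, (dict, list))}
            omitted = len(value) - len(scalars)
            if omitted:
                return {"truncatedObject": scalars, "omittedFields": omitted}
            return scalars
        return {k: truncate(v, *limits, depth + 1) for k, v in value.items()}

    return value  # numbers, booleans, and None pass through unchanged


# Hypothetical event: 50 messages of 5000 chars each and a 10K-char response.
event = {
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "x" * 5000} for _ in range(50)],
    "response": "y" * 10000,
}
logged = truncate(event)
```

With the default thresholds, `logged["messages"]` becomes a `truncatedList` wrapper holding 20 messages, each long `content` becomes a `truncatedString` wrapper, and `model` is unchanged. Passing `0` for all three thresholds returns a copy equal to the input.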
**Example**: Given a `ChatRequestEvent` with 50 messages and a 10K-char response:

- `max-array-elements: 20` trims `messages` from 50 to 20 elements
- `max-string-length: 2000` truncates the `content` field inside each retained message (e.g., a 5000-char content → 2000 chars), and truncates the 10K-char `response` to 2000 chars
- Short fields like `model`, `role`, `requestId` are untouched

Each threshold has clear, independent behavior; there is no opaque budget allocation across fields.

Setting any threshold to `0` disables that specific truncation strategy. Setting all three to `0` makes `STANDARD` behave identically to `VERBOSE` (except for the metadata label).

### Truncation Wrapper Format

Truncated fields are wrapped in metadata objects to keep the output as valid, parseable JSON. This enables downstream tooling to detect and reason about truncation programmatically.

**Wrapper formats by type:**

| Truncation Type | Wrapper Format |
|---|---|
| String | `{"truncatedString": "retained content...", "omittedChars": N}` |
| Array | `{"truncatedList": [retained elements...], "omittedElements": N}` |
| Object (depth) | `{"truncatedObject": {retained fields...}, "omittedFields": N}` |

This means truncated fields change type (e.g., a `string` field becomes an `object`). Consumers parsing event logs at `STANDARD` level should check for wrapper objects. Consumers needing a stable schema should use `VERBOSE` level, which never truncates.

### What Does NOT Get Truncated

Structural and identifying fields are always preserved in full:

- `eventType`, `id`, `attributes`, `timestamp`
- Top-level scalar fields (model name, request IDs, status flags)

## Event Log Record Schema

This section describes the JSON schema of each record written to the event log file. Two new top-level fields (`logLevel`, `eventType`) are added. Users and downstream tools that parse event log files should be aware of these additions.
Records include top-level `logLevel` and `eventType` fields:

```json
{
  "timestamp": "2024-01-15T10:30:00Z",
  "logLevel": "VERBOSE",
  "eventType": "org.apache.flink.agents.api.event.ChatRequestEvent",
  "event": {
    "eventType": "org.apache.flink.agents.api.event.ChatRequestEvent",
    "id": "...",
    "attributes": {},
    "model": "gpt-4",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant..."},
      {"role": "user", "content": "Analyze this document..."}
    ]
  }
}
```

At `STANDARD` level with truncation applied:

```json
{
  "timestamp": "2024-01-15T10:30:00Z",
  "logLevel": "STANDARD",
  "eventType": "org.apache.flink.agents.api.event.ChatRequestEvent",
  "event": {
    "eventType": "org.apache.flink.agents.api.event.ChatRequestEvent",
    "id": "...",
    "attributes": {},
    "model": "gpt-4",
    "messages": {
      "truncatedList": [
        {"role": "system", "content": "You are a helpful assistant..."},
        {"role": "user", "content": {"truncatedString": "Analyze this doc...", "omittedChars": 1000}},
        {"role": "assistant", "content": {"truncatedString": "Based on my...", "omittedChars": 3000}}
      ],
      "omittedElements": 30
    }
  }
}
```

The `eventType` field is emitted at the top level (alongside `timestamp`) for convenient downstream filtering without needing to parse into the `event` object.

Old records without `logLevel` or top-level `eventType` are deserialized correctly, defaulting to `VERBOSE` (since they were written before log levels existed and contain full untruncated content).

## EventFilter Removal

The existing `EventFilter` mechanism (`EventFilter.java`) is removed as part of this design.
The log level system fully subsumes `EventFilter`'s functionality:

| EventFilter | Log Level Equivalent |
|---|---|
| `ACCEPT_ALL` (default) | `event-log.level: STANDARD` (default) |
| `REJECT_ALL` | `event-log.level: OFF` |
| `byEventType(ChatRequestEvent.class)` | Set `event-log.level: OFF`, then `event-log.<ChatRequestEvent>.level: STANDARD` |

Log levels provide additional capabilities that `EventFilter` cannot:

- **Declarative configuration** via YAML config files and CLI `-D` flags (no Java code required)
- **Runtime override** at job submission time without recompilation
- **Three-level verbosity** (OFF / STANDARD / VERBOSE) instead of binary accept/reject
- **Hierarchy inheritance** for package-level defaults
- **Cross-language support** (config keys work for both Java and Python events)

The only capability lost is custom `accept(Event, EventContext)` logic that filters by event content or context (e.g., filtering by key or attribute values). This is acceptable because:

- This is 0.x, with no backward compatibility guarantees for pre-release APIs
- No known users are implementing custom `EventFilter` logic
- Users should not directly interact with `EventFilter` unless they implement their own event logger, which is unlikely in practice

**Files affected:**

- **Delete**: `api/src/main/java/org/apache/flink/agents/api/EventFilter.java`
- **Modify**: `api/src/main/java/org/apache/flink/agents/api/logger/EventLoggerConfig.java`: remove `eventFilter` field and `Builder.eventFilter()` method
- **Modify**: `runtime/src/main/java/org/apache/flink/agents/runtime/eventlog/FileEventLogger.java`: remove `eventFilter.accept()` check in `append()`
- **Modify**: `runtime/src/test/java/org/apache/flink/agents/runtime/eventlog/FileEventLoggerTest.java`: remove EventFilter-related tests

## Validation

No upfront validation of configured event type names is performed.
If a configured type string doesn't match any event at runtime, it simply has no effect: the hierarchy fallback resolves the level from the nearest ancestor or the root default. This matches how standard logging frameworks like Log4j handle unknown logger names.

**Rationale**: Not all event types are known at initialization time. Users may define custom events that are only instantiated at runtime via `ctx.sendEvent()`. Strict upfront validation would produce false warnings for valid custom event types. The hierarchy fallback is safe: a misconfigured type name silently inherits from its parent, and events still get logged at the appropriate level.

## Observability

When truncation is active, a counter metric `eventLogTruncatedEvents` is incremented each time an event is truncated. This helps operators decide whether to adjust truncation thresholds or switch specific event types to `VERBOSE`.

## Backward Compatibility

- Default log level is `STANDARD` with active truncation. This is a **behavior change** from today: events at `STANDARD` level may be truncated. To restore previous behavior, set `event-log.level: VERBOSE`.
- JSON records without `logLevel` or top-level `eventType` fields deserialize correctly, defaulting to `VERBOSE` (old records contain full untruncated content).
- `EventFilter` is removed (see [EventFilter Removal](#eventfilter-removal)). This is acceptable for 0.x pre-release.
- No existing config keys are renamed or removed.

## Migration to Language-Independent Events

There is ongoing discussion about changing events to language-independent JSON objects to simplify custom event definitions, especially for cross-language use cases where users currently need to define the same event type in both Java and Python.
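Because the same fallback applies to any dot-separated type string (Java FQCN, Python module path, or a future plain type string), it helps to see the lookup written out. The following Python sketch is illustrative only: `resolve_level` and the flat `config` mapping are hypothetical names, not part of the proposed API. It follows the resolution order given under Hierarchical Inheritance and shows why an unknown or misspelled type name needs no validation.

```python
from typing import Mapping


def resolve_level(event_type: str, config: Mapping[str, str],
                  builtin_default: str = "STANDARD") -> str:
    """Resolve the log level for an event type by walking up its dotted name."""
    prefix = event_type
    while prefix:
        level = config.get(f"event-log.{prefix}.level")
        if level is not None:
            return level  # exact match or nearest configured ancestor
        # Drop the last dot-separated segment; yields "" once no dots remain.
        prefix = prefix.rpartition(".")[0]
    # Fall back to the root key, then to the built-in default.
    return config.get("event-log.level", builtin_default)


# The example config from the Hierarchical Inheritance section.
config = {
    "event-log.level": "STANDARD",
    "event-log.org.apache.flink.agents.api.event.level": "OFF",
    "event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level": "VERBOSE",
}

chat = resolve_level("org.apache.flink.agents.api.event.ChatRequestEvent", config)
tool = resolve_level("org.apache.flink.agents.api.event.ToolRequestEvent", config)
typo = resolve_level("org.apache.flink.agents.api.InputEvnt", config)
```

Here `chat` resolves to `VERBOSE`, `tool` to `OFF`, and the misspelled `typo` lookup quietly falls through to the root value `STANDARD`.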
### Current Model (This Design)

Config keys use the event's type string as-is (Java FQCNs for Java events, Python module paths for Python events):

```yaml
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level: VERBOSE  # Java
event-log.flink_agents.api.events.event.OutputEvent.level: VERBOSE           # Python
```

This has known limitations in mixed-language jobs (see [Configuration Examples](#configuration-examples)), but is acceptable for the initial release because most jobs today are either pure Java or pure Python.

### Future Model (Language-Independent Events)

If events become plain JSON with a user-chosen type string (e.g., `"ChatRequestEvent"`, `"OutputEvent"`), the config keys simplify and the cross-language limitations disappear:

```yaml
event-log.ChatRequestEvent.level: VERBOSE    # one key covers both Java and Python
event-log.OutputEvent.level: VERBOSE         # no language-specific namespace
```

### Migration Plan

When language-independent events are adopted:

1. **Event type strings change**: The `eventType` field in log records would change from FQCNs/module paths to plain type strings. Config keys follow automatically since they are based on the `eventType` value.
2. **Deprecation period**: During migration, the system recognizes both old FQCN-style keys and new plain-string keys. If both are configured for the same event, the new key takes precedence. A warning is logged for deprecated FQCN-style keys.
3. **Hierarchy inheritance adapts**: With plain type strings that may not contain dots, hierarchy inheritance becomes less relevant. The root `event-log.level` still serves as the global default. If the community adopts a naming convention with dots (e.g., `chat.request`, `tool.response`), hierarchy inheritance continues to work.

### Design Decision

This design targets the current model (Java FQCNs + Python module paths) for the initial implementation.
The `event-log.<TYPE>.level` config key pattern and hierarchy inheritance mechanism are compatible with both the current and future models; only the type strings that users write in config files would change during migration.

GitHub link: https://github.com/apache/flink-agents/discussions/552
