GitHub user weiqingy edited a discussion: [Feature] Per-Event-Type Configurable 
Log Levels for Event Log

GitHub issue: https://github.com/apache/flink-agents/issues/541

## Motivation

The event log captures every event flowing through an agent for debugging, 
auditing, and observability. Today there is no mechanism to control the 
verbosity of logged events. This makes it impossible to:

- Log some event types at full detail while keeping others concise.
- Reduce log volume in production without losing visibility into critical event 
types.
- Adjust verbosity for a single event type at job submission time.

This design introduces **per-event-type configurable log levels** so that 
operators can independently control the verbosity of each event type.

## Log Levels

There are three log levels, ordered from least to most verbose:

| Level | Behavior |
|---|---|
| `OFF` | Event is not logged at all. |
| `STANDARD` | Event is logged. Large content fields may be truncated to keep logs concise (see [Truncation Strategy](#truncation-strategy-standard-level)). |
| `VERBOSE` | Event is logged with full detail. Nothing is omitted. |

The default level for all event types is `STANDARD` with truncation active. 
This means `STANDARD` and `VERBOSE` have distinct behaviors out of the box.

### Why Default to STANDARD with Active Truncation

| Approach | Out-of-the-box Behavior | Backward Compatible? | Semantic Clarity |
|---|---|---|---|
| **STANDARD + active truncation (chosen)** | Events logged with long fields truncated automatically. | No. Existing logs may be truncated after upgrade. | High. STANDARD and VERBOSE are immediately distinct. |
| VERBOSE (no truncation) | All events logged in full, identical to today. | Yes. Zero behavior change. | Medium. Users must opt in to STANDARD to see benefits. |

We chose active truncation because:

- **Semantic clarity**: `STANDARD` and `VERBOSE` mean different things from day 
one. No configuration required to see the distinction.
- **Simple opt-in path**: Operators who need full detail for specific event 
types simply set those types to `VERBOSE`.
- **Practical benefit by default**: AI agent events frequently contain very 
long LLM responses (10K+ characters) and tool outputs. Truncation keeps logs 
usable for monitoring without excessive disk usage.
- **Backward compatibility trade-off**: Existing users will see truncated logs 
after upgrade. This is mitigated by setting `event-log.level: VERBOSE` to 
restore the previous behavior, or setting specific event types to `VERBOSE` for 
full detail where needed.

## Configuration

### Config Key Pattern

Per-event-type settings use the pattern:

```
event-log.<EVENT_TYPE>.<property>
```

The event type appears in the middle, and the property name appears at the 
tail. This structure groups all settings for a given event type together and 
allows future per-type properties (e.g., routing events to different logger 
destinations) without restructuring the key namespace. This follows standard 
hierarchical logger configuration conventions.

**Future extensibility example:**

```yaml
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level: VERBOSE
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.logger: kafka  # future
```

### Event Type Names in Config Keys

Config keys use the **fully-qualified class name** of the event type. This 
avoids ambiguity when different packages define event classes with the same 
simple name.

```
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level=VERBOSE
event-log.org.apache.flink.agents.api.InputEvent.level=OFF
```

### Hierarchical Inheritance

Log level resolution follows **hierarchy inheritance**. The dot-separated event 
type name defines a natural hierarchy. When an event type has no exact config 
match, the level is inherited from the nearest configured ancestor.

The root config key `event-log.level` serves as the global default — no special 
`default` keyword is needed.

**Resolution order** (most specific wins):

1. **Exact match**: 
`event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level`
2. **Parent package**: `event-log.org.apache.flink.agents.api.event.level`
3. **Grandparent package**: `event-log.org.apache.flink.agents.api.level`
4. ... _(walks up the hierarchy)_
5. **Root**: `event-log.level`
6. **Built-in default**: `STANDARD` (if `event-log.level` is not configured)

**Example**: Given these event types:

```
org.apache.flink.agents.api.InputEvent
org.apache.flink.agents.api.OutputEvent
org.apache.flink.agents.api.event.ChatRequestEvent
org.apache.flink.agents.api.event.ToolRequestEvent
```

And this config:

```yaml
event-log.level: STANDARD                                              # root default
event-log.org.apache.flink.agents.api.event.level: OFF                 # package-level
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level: VERBOSE  # exact type
```

Resolution:

| Event Type | Resolved Level | Why |
|---|---|---|
| `...api.event.ChatRequestEvent` | `VERBOSE` | Exact match |
| `...api.event.ToolRequestEvent` | `OFF` | No exact match → inherits from `...api.event` |
| `...api.InputEvent` | `STANDARD` | No exact match, no `...api` key → inherits from root |
| `...api.OutputEvent` | `STANDARD` | Same as above |
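
The resolution order above can be sketched in a few lines of Python. This is only an illustration of the lookup rule, not the project's implementation: the function names `candidate_keys` and `resolve_level` are hypothetical, and the configuration is assumed to be available as a flat mapping from key to value.

```python
from typing import Mapping

BUILT_IN_DEFAULT = "STANDARD"  # used when no key matches, not even the root
PREFIX = "event-log"


def candidate_keys(event_type: str) -> list[str]:
    """Config keys to probe, most specific first, ending with the root key."""
    segments = event_type.split(".")
    keys = [
        f"{PREFIX}.{'.'.join(segments[:n])}.level"
        for n in range(len(segments), 0, -1)
    ]
    keys.append(f"{PREFIX}.level")
    return keys


def resolve_level(event_type: str, config: Mapping[str, str]) -> str:
    """Return the level of the nearest configured ancestor (hypothetical helper)."""
    for key in candidate_keys(event_type):
        if key in config:
            return config[key]
    return BUILT_IN_DEFAULT


config = {
    "event-log.level": "STANDARD",
    "event-log.org.apache.flink.agents.api.event.level": "OFF",
    "event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level": "VERBOSE",
}
for event_type in (
    "org.apache.flink.agents.api.event.ChatRequestEvent",
    "org.apache.flink.agents.api.event.ToolRequestEvent",
    "org.apache.flink.agents.api.InputEvent",
):
    print(event_type, "->", resolve_level(event_type, config))
```

Because the walk only splits on dots, the same function handles Python type strings such as `flink_agents.api.events.event.OutputEvent` without any language-specific logic.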

### Complete Config Key Reference

| Config Key | Type | Default | Description |
|---|---|---|---|
| `event-log.level` | String (`OFF` / `STANDARD` / `VERBOSE`) | `STANDARD` | Root default log level for all event types. |
| `event-log.<EVENT_TYPE>.level` | String (`OFF` / `STANDARD` / `VERBOSE`) | _(inherits from parent in hierarchy)_ | Log level for a specific event type or package. |
| `event-log.standard.max-string-length` | Integer | `2000` | Maximum character length for individual string values at `STANDARD` level. Strings exceeding this limit are truncated. `0` disables string truncation. Has no effect at `VERBOSE` level. |
| `event-log.standard.max-array-elements` | Integer | `20` | Maximum number of elements retained in arrays at `STANDARD` level. Arrays exceeding this limit are trimmed. `0` disables array trimming. Has no effect at `VERBOSE` level. |
| `event-log.standard.max-depth` | Integer | `5` | Maximum object nesting depth at `STANDARD` level. Objects nested beyond this depth are collapsed. `0` disables depth collapsing. Has no effect at `VERBOSE` level. |

### Configuration Examples

**Config file:**

```yaml
# Root default: log everything at STANDARD
event-log.level: STANDARD

# Java events: use Java FQCN
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level: VERBOSE
event-log.org.apache.flink.agents.api.event.ChatResponseEvent.level: VERBOSE
event-log.org.apache.flink.agents.api.event.ContextRetrievalRequestEvent.level: OFF
event-log.org.apache.flink.agents.api.event.ContextRetrievalResponseEvent.level: OFF

# Python events: use Python module path (the event type string from PythonEvent)
event-log.flink_agents.api.events.event.OutputEvent.level: VERBOSE
event-log.my_module.MyCustomEvent.level: OFF

# Truncation thresholds (defaults shown — adjust as needed):
# event-log.standard.max-string-length: 2000
# event-log.standard.max-array-elements: 20
# event-log.standard.max-depth: 5
```

The config key uses whatever type string appears in the event log's `eventType` 
field. For Java events, that's the Java FQCN (e.g., 
`org.apache.flink.agents.api.event.ChatRequestEvent`). For Python events, 
that's the Python module path (e.g., 
`flink_agents.api.events.event.OutputEvent`). The hierarchy inheritance works 
the same way for both — it walks up the dot-separated segments.

**Known limitations of the current model:**

- **Same logical event requires two config keys**: Java `OutputEvent` and 
Python `OutputEvent` are the same concept, but they have different type strings 
(`org.apache.flink.agents.api.OutputEvent` vs 
`flink_agents.api.events.event.OutputEvent`). There is no single config key 
that covers both.
- **Package-level config doesn't cross languages**: 
`event-log.org.apache.flink.agents.api.event.level: OFF` silences all Java 
events in that package, but equivalent Python events are unaffected.
- **No common ancestor below root**: Java hierarchies start with 
`org.apache...`, Python with `flink_agents...`. The only shared ancestor is the 
root `event-log.level`, which is too broad for targeted control.

These limitations are acceptable for the initial release because most jobs 
today are either pure Java or pure Python. See [Migration to 
Language-Independent Events](#migration-to-language-independent-events) for how 
these limitations are resolved when events become language-independent.

**Override at job submission time:**

```bash
# A shared config.yaml defines defaults for all jobs.
# Override just one event type for debugging a specific job run.
# Other per-type levels from config.yaml are preserved because
# each type has its own independent config key.
flink run ... \
  -Devent-log.org.apache.flink.agents.api.event.ChatRequestEvent.level=VERBOSE
```
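
The reason other per-type levels survive an override is that each type has its own flat key, so a command-line value replaces exactly one entry. The sketch below illustrates that merge semantics only. It is not Flink's option parser, and `parse_dynamic_properties` and `effective_config` are hypothetical names.

```python
def parse_dynamic_properties(args):
    """Collect `-Dkey=value` arguments into a dict (illustrative only)."""
    overrides = {}
    for arg in args:
        if arg.startswith("-D") and "=" in arg:
            key, _, value = arg[2:].partition("=")
            overrides[key] = value
    return overrides


def effective_config(file_config, args):
    """Start from the shared config file, then layer command-line overrides on top."""
    merged = dict(file_config)
    merged.update(parse_dynamic_properties(args))
    return merged


CHAT = "event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level"
TOOL = "event-log.org.apache.flink.agents.api.event.ToolRequestEvent.level"

file_config = {"event-log.level": "STANDARD", CHAT: "OFF", TOOL: "OFF"}
merged = effective_config(file_config, [f"-D{CHAT}=VERBOSE"])

print(merged[CHAT])  # the overridden key
print(merged[TOOL])  # untouched, still the value from the file
```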

## Truncation Strategy (STANDARD Level)

At `STANDARD` level, events may be truncated according to three independently 
configurable per-field thresholds. Each threshold controls one truncation 
strategy with clear, predictable behavior. Truncation **never** applies at 
`VERBOSE` level.

### Per-Field Thresholds

Three thresholds apply while walking the serialized event's object graph:

1. **`max-string-length`** (default: `2000`) — Any leaf string value exceeding 
this length is truncated. This most commonly affects LLM response text, tool 
call arguments, and tool response bodies.
2. **`max-array-elements`** (default: `20`) — Arrays with more elements than 
this limit are trimmed to retain only the first N elements.
3. **`max-depth`** (default: `5`) — Object structures nested beyond this depth 
are collapsed.

The thresholds compose by walking the object graph: array trimming reduces 
element count, then string truncation caps leaf string values within the 
retained structure, and depth collapsing handles deeply nested objects.

**Example**: Given a `ChatRequestEvent` with 50 messages and a 10K-char 
response:

- `max-array-elements: 20` trims `messages` from 50 to 20 elements
- `max-string-length: 2000` truncates the `content` field inside each retained 
message (e.g., a 5000-char content → 2000 chars), and truncates the 10K-char 
`response` to 2000 chars
- Short fields like `model`, `role`, `requestId` are untouched

Each threshold has clear, independent behavior — no opaque budget allocation 
across fields.

Setting any threshold to `0` disables that specific truncation strategy. 
Setting all three to `0` makes `STANDARD` produce the same event content as 
`VERBOSE`; only the record's `logLevel` field still reads `STANDARD`.
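
As a rough illustration of how the string and array thresholds compose, the following Python sketch walks an event that has already been serialized to plain dicts and lists. Several things here are assumptions rather than part of the design: the function names are hypothetical, only the four named structural fields (`eventType`, `id`, `attributes`, `timestamp`) are exempted, and `max-depth` is left out because the text does not specify which fields a collapsed object retains. The wrapper shapes follow the Truncation Wrapper Format section.

```python
from typing import Any

# Fields the design lists as always preserved in full.
PRESERVED_KEYS = {"eventType", "id", "attributes", "timestamp"}


def truncate_value(value: Any, max_str: int, max_elems: int) -> Any:
    """Recursively apply the string and array thresholds; 0 disables a threshold."""
    if isinstance(value, str):
        if 0 < max_str < len(value):
            return {
                "truncatedString": value[:max_str],
                "omittedChars": len(value) - max_str,
            }
        return value
    if isinstance(value, list):
        trim = 0 < max_elems < len(value)
        kept = value[:max_elems] if trim else value
        walked = [truncate_value(v, max_str, max_elems) for v in kept]
        if trim:
            return {"truncatedList": walked, "omittedElements": len(value) - max_elems}
        return walked
    if isinstance(value, dict):
        return {k: truncate_value(v, max_str, max_elems) for k, v in value.items()}
    return value  # numbers, booleans, None pass through unchanged


def truncate_event(event: dict, max_str: int = 2000, max_elems: int = 20) -> dict:
    """Truncate every top-level field except the preserved structural ones."""
    return {
        key: value if key in PRESERVED_KEYS else truncate_value(value, max_str, max_elems)
        for key, value in event.items()
    }


event = {
    "eventType": "org.apache.flink.agents.api.event.ChatRequestEvent",
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "x" * 5000} for _ in range(50)],
    "response": "y" * 10_000,
}
logged = truncate_event(event)
print(logged["messages"]["omittedElements"])                          # 50 - 20
print(logged["messages"]["truncatedList"][0]["content"]["omittedChars"])  # 5000 - 2000
print(logged["response"]["omittedChars"])                             # 10000 - 2000
```

Array trimming happens before the retained elements are walked, so string truncation is only spent on content that will actually be written.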

### Truncation Wrapper Format

Truncated fields are wrapped in metadata objects to keep the output as valid, 
parseable JSON. This enables downstream tooling to detect and reason about 
truncation programmatically.

**Wrapper formats by type:**

| Truncation Type | Wrapper Format |
|---|---|
| String | `{"truncatedString": "retained content...", "omittedChars": N}` |
| Array | `{"truncatedList": [retained elements...], "omittedElements": N}` |
| Object (depth) | `{"truncatedObject": {retained fields...}, "omittedFields": N}` |

This means truncated fields change type (e.g., a `string` field becomes an 
`object`). Consumers parsing event logs at `STANDARD` level should check for 
wrapper objects. Consumers needing a stable schema should use `VERBOSE` level, 
which never truncates.
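
A consumer-side check for wrapper objects might look like the sketch below. It assumes a wrapper is exactly a two-key object with one of the payload/count key pairs from the table above; the helper name `unwrap` is hypothetical. Note the caveat that a genuine event object that happens to have exactly those two keys would be misread as a wrapper.

```python
# Payload key -> count key, taken from the wrapper format table.
WRAPPERS = {
    "truncatedString": "omittedChars",
    "truncatedList": "omittedElements",
    "truncatedObject": "omittedFields",
}


def unwrap(value):
    """Return (retained_value, omitted_count); the count is 0 for unwrapped values."""
    if isinstance(value, dict) and len(value) == 2:
        for payload_key, count_key in WRAPPERS.items():
            if payload_key in value and count_key in value:
                return value[payload_key], value[count_key]
    return value, 0


content, omitted = unwrap({"truncatedString": "Analyze this doc...", "omittedChars": 1000})
print(content, omitted)
print(unwrap("You are a helpful assistant..."))
```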

### What Does NOT Get Truncated

Structural and identifying fields are always preserved in full:

- `eventType`, `id`, `attributes`, `timestamp`
- Top-level scalar fields (model name, request IDs, status flags)

## Event Log Record Schema

This section describes the JSON schema of each record written to the event log 
file. Two new top-level fields (`logLevel`, `eventType`) are added. Users and 
downstream tools that parse event log files should be aware of these additions.

A record logged at `VERBOSE` level looks like this:

```json
{
  "timestamp": "2024-01-15T10:30:00Z",
  "logLevel": "VERBOSE",
  "eventType": "org.apache.flink.agents.api.event.ChatRequestEvent",
  "event": {
    "eventType": "org.apache.flink.agents.api.event.ChatRequestEvent",
    "id": "...",
    "attributes": {},
    "model": "gpt-4",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant..."},
      {"role": "user", "content": "Analyze this document..."}
    ]
  }
}
```

At `STANDARD` level with truncation applied:

```json
{
  "timestamp": "2024-01-15T10:30:00Z",
  "logLevel": "STANDARD",
  "eventType": "org.apache.flink.agents.api.event.ChatRequestEvent",
  "event": {
    "eventType": "org.apache.flink.agents.api.event.ChatRequestEvent",
    "id": "...",
    "attributes": {},
    "model": "gpt-4",
    "messages": {
      "truncatedList": [
        {"role": "system", "content": "You are a helpful assistant..."},
        {"role": "user", "content": {"truncatedString": "Analyze this doc...", "omittedChars": 1000}},
        {"role": "assistant", "content": {"truncatedString": "Based on my...", "omittedChars": 3000}}
      ],
      "omittedElements": 30
    }
  }
}
```

The `eventType` field is emitted at the top level (alongside `timestamp`) for 
convenient downstream filtering without needing to parse into the `event` 
object.

Old records without `logLevel` or top-level `eventType` are deserialized 
correctly, defaulting to `VERBOSE` (since they were written before log levels 
existed and contain full untruncated content).
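
For downstream tools that read the log file directly, the same defaulting rule can be applied when parsing each line. This is a minimal sketch with a hypothetical `read_record` helper. Falling back to the nested `event.eventType` when the top-level field is missing is an assumption; the text above only specifies the `VERBOSE` default for `logLevel`.

```python
import json


def read_record(line: str) -> dict:
    """Parse one event log line, filling in fields that older records lack."""
    record = json.loads(line)
    # Records written before log levels existed contain full, untruncated content.
    record.setdefault("logLevel", "VERBOSE")
    if "eventType" not in record:
        # Assumption: older records still carry the type inside the event object.
        record["eventType"] = record.get("event", {}).get("eventType")
    return record


old_line = (
    '{"timestamp": "2024-01-15T10:30:00Z", '
    '"event": {"eventType": "org.apache.flink.agents.api.InputEvent", "id": "..."}}'
)
record = read_record(old_line)
print(record["logLevel"], record["eventType"])
```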

## EventFilter Removal

The existing `EventFilter` mechanism (`EventFilter.java`) is removed as part of 
this design. The log level system fully subsumes `EventFilter`'s functionality:

| EventFilter | Log Level Equivalent |
|---|---|
| `ACCEPT_ALL` (default) | `event-log.level: STANDARD` (default) |
| `REJECT_ALL` | `event-log.level: OFF` |
| `byEventType(ChatRequestEvent.class)` | Set `event-log.level: OFF`, then `event-log.<ChatRequestEvent>.level: STANDARD` |
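
The last row generalizes to any allow-list: turn the root off, then re-enable each wanted type. A small sketch of that translation follows; `allow_list_config` is a hypothetical helper used only to make the mapping concrete.

```python
def allow_list_config(event_types, level="STANDARD"):
    """Build config entries that log only the given event types."""
    config = {"event-log.level": "OFF"}  # everything is silenced by default
    for event_type in event_types:
        config[f"event-log.{event_type}.level"] = level
    return config


config = allow_list_config(["org.apache.flink.agents.api.event.ChatRequestEvent"])
for key, value in config.items():
    print(f"{key}: {value}")
```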

Log levels provide additional capabilities that `EventFilter` cannot:

- **Declarative configuration** via YAML config files and CLI `-D` flags (no 
Java code required)
- **Runtime override** at job submission time without recompilation
- **Three-level verbosity** (OFF / STANDARD / VERBOSE) instead of binary 
accept/reject
- **Hierarchy inheritance** for package-level defaults
- **Cross-language support** (config keys work for both Java and Python events)

The only capability lost is custom `accept(Event, EventContext)` logic that 
filters by event content or context (e.g., filtering by key or attribute 
values). This is acceptable because:

- This is 0.x — no backward compatibility guarantees for pre-release APIs
- No known users are implementing custom `EventFilter` logic
- Users should not directly interact with `EventFilter` unless they implement 
their own event logger, which is unlikely in practice

**Files affected:**

- **Delete**: `api/src/main/java/org/apache/flink/agents/api/EventFilter.java`
- **Modify**: 
`api/src/main/java/org/apache/flink/agents/api/logger/EventLoggerConfig.java` — 
remove `eventFilter` field and `Builder.eventFilter()` method
- **Modify**: 
`runtime/src/main/java/org/apache/flink/agents/runtime/eventlog/FileEventLogger.java`
 — remove `eventFilter.accept()` check in `append()`
- **Modify**: 
`runtime/src/test/java/org/apache/flink/agents/runtime/eventlog/FileEventLoggerTest.java`
 — remove EventFilter-related tests

## Validation

No upfront validation of configured event type names is performed. If a 
configured type string doesn't match any event at runtime, it simply has no 
effect — the hierarchy fallback resolves the level from the nearest ancestor or 
the root default. This matches how standard logging frameworks like Log4j 
handle unknown logger names.

**Rationale**: Not all event types are known at initialization time. Users may 
define custom events that are only instantiated at runtime via 
`ctx.sendEvent()`. Strict upfront validation would produce false warnings for 
valid custom event types. The hierarchy fallback is safe: a key with a 
misspelled type name never matches, so the intended event type inherits from 
its nearest configured ancestor and is still logged at that level.

## Observability

When truncation is active, a counter metric `eventLogTruncatedEvents` is 
incremented each time an event is truncated. This helps operators decide 
whether to adjust truncation thresholds or switch specific event types to 
`VERBOSE`.
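
One way to picture the counter's semantics is the stand-in below. It does not use any Flink metrics API; the class is hypothetical and assumes the counter advances once per event (not once per truncated field), detected by comparing the logged form with the original.

```python
class TruncationCounter:
    """Plain-Python stand-in for the `eventLogTruncatedEvents` counter."""

    def __init__(self) -> None:
        self.count = 0

    def observe(self, original: dict, logged: dict) -> None:
        # Truncation replaces values with wrapper objects, so any difference
        # between the two forms means at least one threshold fired.
        if logged != original:
            self.count += 1


counter = TruncationCounter()
counter.observe(
    {"text": "a" * 3000},
    {"text": {"truncatedString": "a" * 2000, "omittedChars": 1000}},
)
counter.observe({"text": "short"}, {"text": "short"})
print(counter.count)
```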

## Backward Compatibility

- Default log level is `STANDARD` with active truncation. This is a **behavior 
change** from today — events at `STANDARD` level may be truncated. To restore 
previous behavior, set `event-log.level: VERBOSE`.
- JSON records without `logLevel` or top-level `eventType` fields deserialize 
correctly, defaulting to `VERBOSE` (old records contain full untruncated 
content).
- `EventFilter` is removed (see [EventFilter Removal](#eventfilter-removal)). 
This is acceptable for 0.x pre-release.
- No existing config keys are renamed or removed.

## Migration to Language-Independent Events

There is ongoing discussion about changing events to language-independent JSON 
objects to simplify custom event definitions, especially for cross-language use 
cases where users currently need to define the same event type in both Java and 
Python.

### Current Model (This Design)

Config keys use the event's type string as-is — Java FQCNs for Java events, 
Python module paths for Python events:

```yaml
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level: VERBOSE   # Java
event-log.flink_agents.api.events.event.OutputEvent.level: VERBOSE            # Python
```

This has known limitations in mixed-language jobs (see [Configuration 
Examples](#configuration-examples)), but is acceptable for the initial release 
because most jobs today are either pure Java or pure Python.

### Future Model (Language-Independent Events)

If events become plain JSON with a user-chosen type string (e.g., 
`"ChatRequestEvent"`, `"OutputEvent"`), the config keys simplify and the 
cross-language limitations disappear:

```yaml
event-log.ChatRequestEvent.level: VERBOSE    # one key covers both Java and Python
event-log.OutputEvent.level: VERBOSE         # no language-specific namespace
```

### Migration Plan

When language-independent events are adopted:

1. **Event type strings change**: The `eventType` field in log records would 
change from FQCNs/module paths to plain type strings. Config keys follow 
automatically since they are based on the `eventType` value.
2. **Deprecation period**: During migration, the system recognizes both old 
FQCN-style keys and new plain-string keys. If both are configured for the same 
event, the new key takes precedence. A warning is logged for deprecated 
FQCN-style keys.
3. **Hierarchy inheritance adapts**: With plain type strings that may not 
contain dots, hierarchy inheritance becomes less relevant. The root 
`event-log.level` still serves as the global default. If the community adopts a 
naming convention with dots (e.g., `chat.request`, `tool.response`), hierarchy 
inheritance continues to work.
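
The precedence rule in step 2 could be sketched as follows. How the system maps a new plain type string to its legacy FQCN is not specified here, so this hypothetical `lookup_level` helper simply takes both names as arguments, and returns `None` when neither key is set so that the caller can continue with hierarchy resolution.

```python
import logging

logger = logging.getLogger("event_log_config")


def lookup_level(new_type, legacy_type, config):
    """Prefer the new plain-string key; warn whenever the legacy key is present."""
    new_key = f"event-log.{new_type}.level"
    legacy_key = f"event-log.{legacy_type}.level"
    if legacy_key in config:
        logger.warning("Config key %s is deprecated; use %s instead", legacy_key, new_key)
    if new_key in config:
        return config[new_key]
    return config.get(legacy_key)


config = {
    "event-log.ChatRequestEvent.level": "VERBOSE",
    "event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level": "OFF",
}
level = lookup_level(
    "ChatRequestEvent",
    "org.apache.flink.agents.api.event.ChatRequestEvent",
    config,
)
print(level)
```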

### Design Decision

This design targets the current model (Java FQCNs + Python module paths) for 
the initial implementation. The `event-log.<TYPE>.level` config key pattern and 
hierarchy inheritance mechanism are compatible with both the current and future 
models — only the type strings that users write in config files would change 
during migration.


GitHub link: https://github.com/apache/flink-agents/discussions/552
