hanahmily commented on PR #1144:
URL:
https://github.com/apache/skywalking-banyandb/pull/1144#issuecomment-4565309141
Force-pushed another iteration: **`finalize` and `merge` events are now
independently toggleable via `enabled_events`**, with `[PIPELINE_EVENT_MERGE]`
as the default. The in-merge filter (with per-trace `merge_grace`) is restored
as `PIPELINE_EVENT_MERGE`; the finalization gate (with `finalize_grace`) is
`PIPELINE_EVENT_FINALIZE` and opt-in. Both share the same `tail_sampling`
keep-rules — one configured set, evaluated wherever events fire.
**Proto change (Option 3 from the design discussion):**
```protobuf
enum PipelineEvent {
PIPELINE_EVENT_UNSPECIFIED = 0;
PIPELINE_EVENT_MERGE = 1;
PIPELINE_EVENT_FINALIZE = 2;
}
message TracePipelineConfig {
// ...
TailSampling tail_sampling = 6; // shared keep-rules
repeated PipelineEvent enabled_events = 7; // default = [MERGE]
google.protobuf.Duration merge_grace = 8; // used iff MERGE enabled
google.protobuf.Duration finalize_grace = 9; // used iff FINALIZE
enabled
}
```
Why these defaults: out-of-box behavior is the in-merge filter (with
deferred drops via `merge_grace`) — cheap, runs often, reclaims space
incrementally during routine LSM compaction. Operators who want the heavier
per-segment-after-settle gate explicitly add `PIPELINE_EVENT_FINALIZE` to the
list. Per-stage retention (`StageRule`) is always implicit at migration-out
when a StageRule carries a predicate, and is not in the enum.
**Doc deltas:**
- Restored §5.1 *Compaction Rewrites* and §7.1 *LSM Compaction Merge Phase
Loop* (gated on `PIPELINE_EVENT_MERGE`). Both describe the per-trace
`merge_grace` gate that prevents partial-trace drops during compaction.
- §1.1 / §1.2 reframed: the gate fires at whichever events are enabled;
routine compaction is now governed by the toggle, not "always lossless."
- §2.1 paragraph: three anchored events (merge, finalize, migration-out)
instead of two.
- §2.2 message summary: `TracePipelineConfig` bullet lists `enabled_events`,
`merge_grace`, `finalize_grace`; new `PipelineEvent` bullet describes the enum.
- §2.4 admission: updated to handle the default = `[MERGE]` semantics
(config with `tail_sampling` set is valid without explicit `enabled_events`).
- §3.1 retention timing: bullets now cover all three filter points (merge /
finalize / migration-out).
- §6.1 / §6.2 scenarios: scenario 1 explicitly sets `enabled_events:
[PIPELINE_EVENT_MERGE, PIPELINE_EVENT_FINALIZE]` and includes both graces;
scenario 2 relies on the default to demonstrate the minimal config form.
- §7 data-flow diagram: three pathways again; the conditional gating is
called out per entry.
- §8 step-by-step: three phases (A in-merge, B finalization, C
migration-out) walking the same real trace through each.
`buf lint`/`buf build` and `make generate` both pass. **3 commits, +991/0 vs
`apache/main`** (up ~115 lines from the previous push, mostly the restored §5.1
and §7.1 chapters).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]