hanahmily opened a new pull request, #1144: URL: https://github.com/apache/skywalking-banyandb/pull/1144
## Summary

Adds the design doc and API surface for a **storage-node post-trace pipeline** on BanyanDB data nodes. The pipeline runs asynchronously during storage lifecycle events to (1) **tail-sample** traces at finalization, and (2) apply **per-stage hard retention predicates** (`min_duration` / `keep_errors` / `keep_tag_rules`) as data ages through `Hot → Warm → Cold`.

**No implementation code in this PR: design + proto only.** Marked as draft for review.

**Design philosophy:** every aspect is grounded in BanyanDB's actual storage model (event-time segments, part-count-driven LSM compaction, watermark-driven settling, etc.) rather than a generic streaming-telemetry idea grafted on. Scenario examples use real data sampled from the `skywalking-showcase` BanyanDB cluster.

## What's in this PR (3 commits, 4 files)

- **`docs/design/post-trace-pipeline.md`** (new, 493 lines): the full design, covering architecture, targeting model, uniqueness/conflict policy, admission validation, retention-decision timing & trace completeness, per-stage rising predicates, downstream lifecycle actions, two real-data scenario configs, and data-flow architecture (LSM merge filter hook, pre-migration rewrite, settled-segment finalization).
- **`api/proto/banyandb/pipeline/v1/trace_pipeline.proto`** (new, 187 lines): `TracePipelineConfig`, `StageRule`, `TailSampling`, `TagSamplingRule`, `TagMatcher`, `StringList`, and the `StageEvent` enum (`COMPACTION` / `FINALIZE` / `MIGRATION_OUT`). PGV validation on bounds; `merge_grace` / `finalize_grace` `Duration` fields with engine defaults.
- **`docs/api-reference.md`**: regenerated by `protoc-gen-doc` to include the new `banyandb/pipeline/v1` section.
- **`docs/design/backlog/post-trace-scoring.md`** (new, 113 lines): preserves the previously-considered weighted-scoring model (`S_trace`, score persistence, revision pinning) as deferred design with explicit rationale for the trade-off.

## Key design decisions (each grounded in real codebase facts)

1. **Targeting reuses catalog identifiers** (Group via `metadata`, stage names, schema names); there is no parallel abstract `Pipeline`/`ExecutionTrigger`. Trace-specific rules stay out of catalog-generic `LifecycleStage`.
2. **§8.1 LSM merge filter** integrates at `mergeBlocks` (`banyand/trace/merger.go`) and wraps both `mustWriteRawBlock` and `mustWriteBlock`. Verdicts are per-`trace_id`. Drops are gated on per-trace `merge_grace` because compaction is **part-count-driven** (`getPartsToMerge`), not time-driven, so it routinely runs on the active write window.
3. **§8.3 finalization** uses an **event-time watermark + `finalize_grace`** rather than a "segment sealed" event, because BanyanDB has no seal: `segmentController.create` pre-creates the next segment up to `creationGap` (1h) *before* the current one ends, and event-time write routing means late data keeps landing in old segments.
4. **§8.2 pre-migration rewrite** runs as a standalone reduced-part producer between `generateAllPartData` and `streamPartToTargetShard`; the migration byte-copy stays lossless.
5. **`merge_grace` (per-trace) vs `finalize_grace` (per-segment)**: distinct knobs solving distinct correctness problems (intra-trace span spread vs segment-wide late arrival). Both are `Duration` fields on `TracePipelineConfig` with PGV `gt: 0s` and documented engine defaults (30s / 5m).
6. **Uniqueness policy (§2.3):** at most one active `TracePipelineConfig` per `(Group, Schema, Stage, StageEvent)` tuple; the registry rejects conflicting writes. Multi-pipeline composition is deliberately not supported because drops must be deterministic. Runtime fail-safe: if a registry inconsistency ever exposes >1 active match, the engine logs and **retains** (never a destructive drop on ambiguity).
7. **Empty-predicate semantics (§4.2):** a `StageRule` with zero predicates is a no-op (no drops); with ≥1 predicate, predicates are OR-combined.
8. **Cross-segment traces and post-grace late spans** are honestly named as known limitations (§3.1 caveats), not papered over.

## What's intentionally NOT here

- **No implementation**: this PR is design + proto only. Follow-up PRs will add the merge-filter hook, pre-migration rewrite, and finalization scheduler.
- **No weighted scoring model**: the previously-proposed `S_trace = w_d·log(D/D_th + 1) + w_e·I(err) + Σ w_tag·I(tag)` formula and its per-stage thresholds were considered and **deferred to backlog** after a trade-off analysis. For the workloads we currently target, hard predicates capture intent more clearly than a weighted score with magic-number tuning. The full scoring spec is preserved at `docs/design/backlog/post-trace-scoring.md` for revival if a workload demonstrates the need.
- **No `TracePipelineRegistryService`**: the message is a tracked schema resource (it has `common.v1.Metadata` with `modRevision`), but the registry RPC follows the pattern of the existing `database/v1` registries and can be added when implementation begins.

## Local CI status

`make generate / license-check / check-req / build / license-dep / lint / check` all pass on this branch. `make test*` was not run in this loop; pure design + proto changes have no Go test impact, but reviewers may want to run them in their environment.

## Test plan

- [ ] Review the design doc end-to-end; flag any storage-model assumption that doesn't match current code (`banyand/trace/merger.go`, `banyand/internal/storage/rotation.go`, `banyand/backup/lifecycle/trace_migration_visitor.go`).
- [ ] Review `trace_pipeline.proto` field-by-field; in particular the validation bounds and the `TagMatcher` vs `TagSamplingRule` split.
- [ ] Confirm the registry-side uniqueness + admission rules (§2.3, §2.4) are acceptable as implementation requirements.
- [ ] Confirm the backlog file location (`docs/design/backlog/post-trace-scoring.md`) is the right home for deferred specs.
- [ ] Confirm the two scenario JSONs (§7.1 `sw_trace`, §7.2 `sw_zipkinTrace`) match intended retention behavior.
- [ ] Run `make generate && make build && make lint` locally; codegen produces clean `.pb.go` + `.pb.validate.go` for the new package.
