[I] [Feature] Trace Pipeline Support in BanyanDB [skywalking]

via GitHub Thu, 25 Dec 2025 03:39:15 -0800


wu-sheng opened a new issue, #13634:
URL: https://github.com/apache/skywalking/issues/13634


   ### Search before asking
   
   - [x] I had searched in the 
[issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no 
similar feature requirement.
   
   
   ### Description
   
   ## 🌟 Feature Request: Trace Pipeline Support in BanyanDB (Target: v0.11)
   
   ### **Summary**
   Introduce a **trace pipeline framework** into BanyanDB to enable 
**trace-oriented processing**, **sampling**, and **enrichment** directly within 
the database layer.  
   The pipeline will serve as the foundation for intelligent, context-aware 
trace handling, allowing dynamic updates, analysis, and retention decisions 
while preserving schema and performance guarantees.
   
   ---
   
   ### **Background**
   BanyanDB currently supports efficient ingestion and querying of trace data, 
but processing is span-oriented and lacks capabilities for post-ingestion, 
trace-level operations.  
   
   Modern observability scenarios—such as **tail-based sampling**, 
**cross-service trace enrichment**, or **behavioral trace analysis**—require 
the database to treat a group of spans (sharing the same trace ID) as one 
logical unit.  
   
   Introducing a trace pipeline provides a structured mechanism to:
   - Analyze and modify trace data after ingestion.
   - Aggregate spans consistently, regardless of their reported order or source.
   - Make smarter data retention decisions based on complete trace context.
   
   ---
   
   ### **Goal**
   Deliver a **trace-level pipeline mechanism** that:
   1. Processes collected spans as unified trace entities on a best-effort 
basis (not 100% guaranteed).
   2. Supports runtime **enrichment and analysis** of existing tag values, 
with declaring a trace-level schema (e.g., trace latency) under further 
consideration.
   3. Maintains persistent **file-based processor context** across ingestion 
batches for continuity and scale.  
   4. Enables **tail-based sampling** via a reserved deletion/retention flag.  
   5. Integrates cleanly with **OAP** for schema declaration, dynamic loading, 
and test validation.
   
   ### Use case
   
   This issue includes several key features and use cases, mostly around trace 
post-analysis, e.g., tail sampling.
   
   Most of this was discussed at a high level between @hanahmily and me. No 
details are confirmed yet.
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a pull request to implement this on your own?
   
   - [ ] Yes I am willing to submit a pull request on my own!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: 
[email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
