Thank you for your interest in this project. The streaming export procedure should cover both the server and client sides. The server side refers to the data node of the BanyanDB cluster. It should provide chunked export to reduce memory and CPU overhead. The client side is bydbctl, which calls the streaming export service provided by the data node.

For the format, the native format refers to BanyanDB's column-based file format. You can refer to the backup/restore feature for details. Additionally, a CSV-based plain format should be available. Both native and plain formats should contain two parts: schema and data.

Furthermore, please undertake a simple task from the issue list[1]. While it's not mandatory for your proposal, completing it will help us assess your capabilities for this task. I suggest this particular task[2], as it is suitable for beginners.

1. https://github.com/apache/skywalking/issues?q=is%3Aissue%20state%3Aopen%20label%3Adatabase
2. https://github.com/apache/skywalking/issues/13408

Best regards
Hongtao

On Tue, Feb 3, 2026 at 1:32 PM Tanay Paul <[email protected]> wrote:
>
> Hi Hongtao and SkyWalking Community,
>
> I've been reviewing the BanyanDB codebase to prepare a proposal for the
> Native Data Export/Import Utility.
>
> I see that bydbctl is the natural home for this feature. My current
> thinking for the architecture is:
>
> Streaming over Buffering: Instead of loading full query results into
> memory, the export command (bydbctl data export) should implement a gRPC
> Stream receiver that writes to the file buffer in chunks. This ensures we
> can export GBs of logs with minimal RAM usage.
>
> Format Strategy:
>
> Parquet: Use schema reflection to map the BanyanDB TagFamilies directly to
> Parquet columns for efficient analysis.
>
> Binary: A raw dump of the KV pairs for faster restore operations (Disaster
> Recovery).
>
> I have prototyped a basic Parquet writer in Go to test the schema mapping.
> Before I draft the full proposal, do you have a preference on how we handle
> schema evolution? (e.g., if the imported data has extra tags that the
> current server schema doesn't match).
>
> Best regards,
>
> Tanay Paul
