Hi Hongtao and SkyWalking Community,

I've been reviewing the BanyanDB codebase to prepare a proposal for the Native Data Export/Import Utility.
I see that bydbctl is the natural home for this feature. My current thinking on the architecture:

- Streaming over buffering: instead of loading the full query result into memory, the export command (bydbctl data export) should consume a gRPC server stream and write each chunk straight to a buffered file writer. This keeps memory usage roughly constant even when exporting gigabytes of logs (first sketch below).
- Format strategy:
  - Parquet: use schema reflection to map the BanyanDB TagFamilies directly to Parquet columns for efficient downstream analysis (second sketch below).
  - Binary: a raw dump of the KV pairs for faster restore operations in disaster-recovery scenarios (third sketch below).

I have prototyped a basic Parquet writer in Go to test the schema mapping. Rough sketches of the three pieces follow; all type and function names are illustrative, not final APIs.
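To make the streaming point concrete, here is a minimal sketch of the receive loop, assuming a server-streaming RPC; the Chunk message and chunkStream interface are stand-ins I invented for illustration, not the real BanyanDB proto types:

```go
package export

import (
	"bufio"
	"errors"
	"io"
	"os"
)

// Chunk is a hypothetical stand-in for one streamed query-result message;
// the real BanyanDB proto message will differ.
type Chunk struct {
	Data []byte
}

// chunkStream mimics the Recv() contract of a gRPC server-streaming client.
type chunkStream interface {
	Recv() (*Chunk, error) // returns io.EOF once the server closes the stream
}

// exportToFile drains the stream chunk by chunk into a buffered file writer,
// so memory stays bounded regardless of the total result size.
func exportToFile(stream chunkStream, path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	w := bufio.NewWriterSize(f, 1<<20) // 1 MiB write buffer
	for {
		chunk, err := stream.Recv()
		if errors.Is(err, io.EOF) {
			break // server finished sending
		}
		if err != nil {
			return err
		}
		if _, err := w.Write(chunk.Data); err != nil {
			return err
		}
	}
	return w.Flush()
}
```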
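For the Parquet format, this is the shape of what I tested, shown here with github.com/parquet-go/parquet-go (one option among the Go Parquet libraries). The flatRow struct is a hypothetical flattened view of one data point; a real implementation would build the column schema at runtime from the server's stream/measure schema rather than from a compile-time struct:

```go
package export

import (
	"os"

	"github.com/parquet-go/parquet-go"
)

// flatRow is a hypothetical flattened view of one data point: each tag from
// the TagFamilies becomes a Parquet column.
type flatRow struct {
	Timestamp int64  `parquet:"timestamp"`
	ServiceID string `parquet:"service_id"`
	TraceID   string `parquet:"trace_id"`
	Body      []byte `parquet:"body"`
}

// writeParquet writes a batch of rows; the schema is reflected from flatRow.
func writeParquet(path string, rows []flatRow) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	w := parquet.NewGenericWriter[flatRow](f)
	if _, err := w.Write(rows); err != nil {
		return err
	}
	return w.Close() // Close writes the footer; required for a valid file
}
```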
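For the binary dump, a simple length-prefixed framing (uvarint length followed by the raw bytes, for the key and then the value) would keep restores a straight sequential read. The layout below is illustrative only, not a committed file format:

```go
package export

import (
	"bufio"
	"encoding/binary"
)

// writeKV appends one key/value pair as two length-prefixed records.
func writeKV(w *bufio.Writer, key, value []byte) error {
	var lenBuf [binary.MaxVarintLen64]byte
	for _, field := range [][]byte{key, value} {
		n := binary.PutUvarint(lenBuf[:], uint64(len(field)))
		if _, err := w.Write(lenBuf[:n]); err != nil {
			return err
		}
		if _, err := w.Write(field); err != nil {
			return err
		}
	}
	return nil
}
```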
Before I draft the full proposal, do you have a preference on how we handle schema evolution, e.g., when imported data carries extra tags that the current server schema does not define?

Best regards,
Tanay Paul