Hi Impala Devs, I am planning a production pipeline to ingest data from Kafka into Impala with a high throughput of approximately 500,000 QPS. Given the metadata overhead and file management constraints in Impala, I would like to get your recommendations on the most robust architecture.
My Current Environment: Impala Version: 4.5.0 Storage: Tencent Cloud COS (Object Storage) Table Format: Apache Iceberg
