Hi there, We are testing out writing Kafka to hive table as parquet format. Currently, we have seen user has to choose to create lots of small files in min level folder to gain latency benefits. I recall FF2020 Global folks mentioned implement compaction logic during the checkpointing time. Wonder how that goes? Love collaborate on this topic.
Chen Pinterest