Hi Mehul,

Thanks for your attention. I don't think we need to introduce an extra post-commit hook to manage small files. In the design, all files that belong to the same bucket (in Iceberg, the same partition) are distributed to the same task for writing, so that task can also compact the small files for that partition. As the FIP says, when the IcebergLakeWriter is created in a round of tiering, the writer can scan the manifest to find the files in its bucket; if it finds that compaction is possible, it can compact these files while writing new files. We have similar logic for tiering to Paimon.
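To make the idea concrete, here is a rough sketch of the decision the writer could make after scanning the manifest for its bucket. It is only an illustration: the class, the method, and the two thresholds (smallFileBytes, minFiles) are assumptions of mine and are not taken from the FIP or from the Iceberg API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from Fluss or Iceberg.
final class BucketCompactionSketch {

    /** A data file already committed for one bucket (one Iceberg partition). */
    static final class DataFile {
        final String path;
        final long sizeBytes;

        DataFile(String path, long sizeBytes) {
            this.path = path;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Returns the files worth compacting: those smaller than smallFileBytes,
     * but only when at least minFiles of them exist. Otherwise returns an
     * empty list, meaning the writer just writes new files this round.
     */
    static List<DataFile> planCompaction(
            List<DataFile> bucketFiles, long smallFileBytes, int minFiles) {
        List<DataFile> small = new ArrayList<>();
        for (DataFile f : bucketFiles) {
            if (f.sizeBytes < smallFileBytes) {
                small.add(f);
            }
        }
        return small.size() >= minFiles ? small : new ArrayList<>();
    }
}
```

Because one task owns the bucket, the plan can be computed and executed in that task without coordinating with other writers, which is why a separate post-commit hook should not be necessary.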

Best regards,
Yuxia

----- Original Message -----
From: "Mehul Batra" <[email protected]>
To: "dev" <[email protected]>
Sent: Thursday, July 3, 2025, 5:04:18 PM
Subject: Re: [DISCUSS] FIP-3: Support tiering Fluss data to Iceberg

+1

This will help us to address the missing table format and provide better ecosystem interoperability. Iceberg's growing adoption in the data lakehouse space makes this a valuable addition to Fluss's tiering capabilities.

Are there any plans to integrate the Maintenance services as part of tiering itself as a post-commit hook to manage small files?

Warm regards,
Mehul Batra

On Thu, Jul 3, 2025 at 2:24 PM yuxia <[email protected]> wrote:

> Hi,
>
> Fluss currently supports tiering data to Apache Paimon, enabling
> cost-effective storage management for warm/cold data. However, the lack of
> native Iceberg tiering support limits flexibility and ecosystem integration
> for users who rely on Iceberg’s open table format.
>
> To address this gap, I’d like to propose FIP-3: Support Tiering Fluss Data
> to Iceberg[1] which aims to integrate Iceberg into Fluss’s tiering
> capabilities.
>
> Welcome your feedback and suggestions on this proposal. Looking forward to
> a productive discussion!
>
> [1]:
> https://cwiki.apache.org/confluence/display/FLUSS/FIP-3%3A+Support+tiering+Fluss+data+to+Iceberg
>
> Best regards,
> Yuxia
