Hi Jingsong,

Thank you very much for your feedback!
I’ll reuse `compact` to do incremental clustering, and the images have been updated.

Best,
Lei Li

> On Sep 23, 2025, at 10:40, Jingsong Li <[email protected]> wrote:
>
> Hi Lei,
>
> Thanks for starting this discussion.
>
> 1. incremental-cluster.enabled can be clustering.incremental = true.
> 2. I think we can reuse `compact`. CALL sys.compact is OK.
> 3. Please update your images; I cannot see them.
>
> Best,
> Jingsong
>
> On Tue, Sep 23, 2025 at 10:37 AM Jingsong Li <[email protected]> wrote:
>>
>> Correct link should be:
>>
>> https://cwiki.apache.org/confluence/display/PAIMON/PIP-36%3A+Introduce+Incremental+Clustering+for+Paimon+Append+Table
>>
>> On Fri, Sep 19, 2025 at 5:41 PM lei li <[email protected]> wrote:
>>>
>>> Hi everyone,
>>>
>>>
>>> I'd like to start a discussion about PIP-36: Introduce Incremental
>>> Clustering for Paimon Append Table [1].
>>>
>>>
>>> Paimon currently supports ordering append tables using SFC (Space-Filling
>>> Curve)[2]. The resulting data layout typically delivers better performance
>>> for queries that target clustering keys. However, with the current
>>> SortCompact, even when neither the data nor the clustering keys have
>>> changed, each run still rewrites the entire dataset, which is extremely
>>> costly. To address this, we plan to introduce a more flexible, incremental
>>> clustering mechanism: Incremental Clustering. On each run, it selects only a
>>> specific subset of files to cluster, avoiding a full rewrite. This enables
>>> low-cost, sort-based optimization of the data layout and improves query
>>> performance. In addition, with Incremental Clustering, you can adjust
>>> clustering keys without rewriting existing data; the layout evolves
>>> dynamically as clustering runs and gradually converges to an optimal state,
>>> significantly reducing the decision-making complexity around data layout.
>>>
>>>
>>> Incremental Clustering supports:
>>>
>>> * Support incremental clustering; minimizing write amplification as much
>>> as possible.
>>> * Support small-file compaction; during rewrites, respect
>>> target-file-size.
>>> * Support changing clustering keys; newly ingested data is clustered
>>> according to the latest clustering keys.
>>> * Provide a full mode; when selected, the entire dataset is reclustered.
>>>
>>>
>>> The detailed design and PoC results can be seen in PIP-36[1].
>>>
>>>
>>> Looking forward to your feedback, thanks!
>>>
>>>
>>> [1]
>>> https://cwiki.apache.org/confluence/display/PAIMON/PIP-36%3A+Introduce+Incremental+Clustering+for+Paimon+Append+Table
>>> [2]
>>> https://paimon.apache.org/docs/master/maintenance/dedicated-compaction/#sort-compact
>>>
>>> Best,
>>>
>>> Lei Li
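
For readers skimming the thread, the difference between an incremental run and the full mode described in the original proposal can be sketched roughly as below. This is only an illustration under stated assumptions, not the file-selection strategy from PIP-36: the `DataFile` type, the `clustered_by` field, and the rule "pick only files that have never been clustered" are all hypothetical and are not Paimon APIs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file of an append table."""
    name: str
    size_bytes: int
    # Keys the file was last sorted by; None means newly ingested, unsorted data.
    clustered_by: Optional[Tuple[str, ...]] = None


def pick_files(files: List[DataFile], full: bool = False) -> List[DataFile]:
    """Return the files that one clustering run would rewrite.

    Assumed rule: full mode rewrites everything, while an incremental run
    only picks files that have never been clustered. Files sorted under
    older clustering keys are left untouched in incremental mode.
    """
    if full:
        return list(files)
    return [f for f in files if f.clustered_by is None]


def rewrite_ratio(picked: List[DataFile], files: List[DataFile]) -> float:
    """Fraction of the table's bytes that the run rewrites."""
    total = sum(f.size_bytes for f in files)
    return sum(f.size_bytes for f in picked) / total if total else 0.0


table = [
    DataFile("part-0", 100, clustered_by=("user_id",)),  # already clustered
    DataFile("part-1", 50),                              # newly ingested
    DataFile("part-2", 50),                              # newly ingested
]

incremental = pick_files(table)
everything = pick_files(table, full=True)
print([f.name for f in incremental], rewrite_ratio(incremental, table))
print([f.name for f in everything], rewrite_ratio(everything, table))
```

In this toy table the incremental run touches only the two new files (half of the bytes), while full mode rewrites all three, which is the write-amplification gap the proposal aims to close.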
