Hi Lei,

Thanks for starting this discussion.

1. incremental-cluster.enabled can be clustering.incremental = true.
2. I think we can reuse `compact`. CALL sys.compact is OK.
3. Please update your images; I cannot see them.

Best,
Jingsong

On Tue, Sep 23, 2025 at 10:37 AM Jingsong Li <[email protected]> wrote:
>
> Correct link should be:
>
> https://cwiki.apache.org/confluence/display/PAIMON/PIP-36%3A+Introduce+Incremental+Clustering+for+Paimon+Append+Table
>
> On Fri, Sep 19, 2025 at 5:41 PM lei li <[email protected]> wrote:
> >
> > Hi everyone,
> >
> > I'd like to start a discussion about PIP-36: Introduce Incremental
> > Clustering for Paimon Append Table [1].
> >
> > Paimon currently supports ordering append tables using SFC (Space-Filling
> > Curve) [2]. The resulting data layout typically delivers better performance
> > for queries that target clustering keys. However, with the current
> > SortCompact, even when neither the data nor the clustering keys have
> > changed, each run still rewrites the entire dataset, which is extremely
> > costly. To address this, we plan to introduce a more flexible, incremental
> > clustering mechanism: Incremental Clustering. On each run, it selects only a
> > specific subset of files to cluster, avoiding a full rewrite. This enables
> > low-cost, sort-based optimization of the data layout and improves query
> > performance. In addition, with Incremental Clustering, you can adjust
> > clustering keys without rewriting existing data; the layout evolves
> > dynamically as clustering runs and gradually converges to an optimal state,
> > significantly reducing the decision-making complexity around data layout.
> >
> > Incremental Clustering supports:
> >
> > * Incremental clustering, minimizing write amplification as much as
> >   possible.
> > * Small-file compaction; rewrites respect target-file-size.
> > * Changing clustering keys; newly ingested data is clustered according
> >   to the latest clustering keys.
> > * A full mode; when selected, the entire dataset is reclustered.
> >
> > The detailed design and PoC results can be seen in PIP-36 [1].
> >
> > Looking forward to your feedback, thanks!
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/PAIMON/PIP-36%3A+Introduce+Incremental+Clustering+for+Paimon+Append+Table
> > [2]
> > https://paimon.apache.org/docs/master/maintenance/dedicated-compaction/#sort-compact
> >
> > Best,
> >
> > Lei Li
