There is also a way to expire snapshots without using Spark, through the ExpireSnapshots API:

    table.expireSnapshots().expireOlderThan(timestampInMs).commit();

That is what we used in production for a long time, but it isn’t as good as the action-based one that compares file trees. I’d recommend using the expire_snapshots procedure that Russell pointed to: https://iceberg.apache.org/spark-procedures/#expire_snapshots

On Wed, Jun 23, 2021 at 7:49 AM Russell Spitzer <[email protected]> wrote:

> There are "actions" which contain common table maintenance things.
>
> You are most likely interested in ExpireSnapshots, RewriteDataFiles and
> RemoveOrphanFiles; see
>
> https://iceberg.apache.org/spark-procedures/
>
> On Tue, Jun 22, 2021 at 7:19 PM yong.sunny <[email protected]> wrote:
>
>> Hi Iceberg Dev,
>>
>> Is there any existing mechanism to do GC in Iceberg? Or is there an
>> implementation based on Spark?
>>
>> Thanks and Best regards,
>> Yong

--
Ryan Blue
Tabular
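
For anyone wondering what to pass as timestampInMs in the ExpireSnapshots call above: it is a cutoff in epoch milliseconds. Below is a minimal sketch of computing one with java.time. The SnapshotExpiryCutoff class, the cutoffMillis helper, and the 7-day retention window are assumptions made for illustration and are not part of Iceberg. The Iceberg call itself appears only as a comment, since it needs a Table handle loaded from your own catalog.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper (not part of Iceberg): computes the epoch-millisecond
// cutoff to pass as timestampInMs to expireOlderThan.
class SnapshotExpiryCutoff {

    // Returns the epoch-millis timestamp that lies `retention` before `now`.
    static long cutoffMillis(Instant now, Duration retention) {
        return now.minus(retention).toEpochMilli();
    }

    public static void main(String[] args) {
        // Assumption: a 7-day retention window; choose what your readers need.
        long timestampInMs = cutoffMillis(Instant.now(), Duration.ofDays(7));
        System.out.println("Expire snapshots older than: " + timestampInMs);

        // With an Iceberg Table handle loaded from your catalog, the call
        // from the message above would then be:
        //   table.expireSnapshots().expireOlderThan(timestampInMs).commit();
    }
}
```

Taking `now` as a parameter rather than calling Instant.now() inside the helper keeps the cutoff easy to check with a fixed clock value.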
