There is also a way to expire snapshots without using Spark, through the
ExpireSnapshots API:

table.expireSnapshots().expireOlderThan(timestampInMs).commit();
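
For reference, here is a minimal sketch of one way to compute timestampInMs
for that call. The SnapshotCutoff class, the cutoffMillis helper, and the
seven-day retention window are assumptions made for illustration; none of
them come from Iceberg. The Iceberg call itself appears only as a comment so
the snippet compiles without the library on the classpath.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of Iceberg: computes the cutoff for expireOlderThan.
final class SnapshotCutoff {
    private SnapshotCutoff() {}

    // Returns epoch milliseconds for the instant `retention` before `now`.
    static long cutoffMillis(Instant now, Duration retention) {
        return now.minus(retention).toEpochMilli();
    }

    public static void main(String[] args) {
        // Assumed retention window of 7 days; choose whatever suits the table.
        long timestampInMs = cutoffMillis(Instant.now(), Duration.ofDays(7));
        System.out.println("expire snapshots older than " + timestampInMs);
        // With an org.apache.iceberg.Table already loaded from a catalog:
        // table.expireSnapshots().expireOlderThan(timestampInMs).commit();
    }
}
```

Passing the clock in as a parameter keeps the helper easy to test; the
retention period should be long enough that no running reader or job still
depends on the snapshots being expired.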

That is what we used in production for a long time, but it isn’t as good as
the action-based approach, which compares file trees. I’d recommend using the
expire_snapshots procedure that Russell pointed to:
https://iceberg.apache.org/spark-procedures/#expire_snapshots

On Wed, Jun 23, 2021 at 7:49 AM Russell Spitzer <[email protected]>
wrote:

> There are "actions" that cover common table maintenance tasks.
>
> You are most likely interested in ExpireSnapshots, RewriteDataFiles, and
> RemoveOrphanFiles; see
>
> https://iceberg.apache.org/spark-procedures/
>
> On Tue, Jun 22, 2021 at 7:19 PM yong.sunny <[email protected]> wrote:
>
>> Hi Iceberg Dev,
>>
>> Is there any existing mechanism to do GC in Iceberg? Or is there an
>> implementation based on Spark?
>>
>> Thanks and Best regards,
>> Yong
>>
>>
>>
>>
>

-- 
Ryan Blue
Tabular
