Thanks a lot Ryan, that would be very helpful!

Delta Lake recently added API-level support for such operations (
https://github.com/delta-io/delta/blob/master/src/main/scala/io/delta/tables/DeltaTable.scala).
I think Iceberg's goals at the API level are similar, so maybe
we could take that as a reference.

Besides, manipulating data directly through the Iceberg API is not so
straightforward, so it would be great if we could also have DataFrame
API/SQL support later on.

Best regards
Saisai

On Thu, Aug 8, 2019 at 1:22 AM Ryan Blue <rb...@netflix.com> wrote:

> Hi Saisai,
>
> We are working on adding row-level delete support to Iceberg, where the
> deletes are applied when data is read. We’ve had a few good design
> discussions and have come up with a good way to integrate these into the
> format. Erik has written a good document on it:
> https://docs.google.com/document/d/1FMKh_SQ6xSUUmoCA8LerTkzIxDUN5JbStQp5Hzot4eo/edit#heading=h.p74qmh3a6ets
>
> I’ve also started a milestone to track this work:
> https://github.com/apache/incubator-iceberg/issues?q=is%3Aopen+is%3Aissue+milestone%3A%22Row-level+Delete%22
>
> That’s assuming that you’re talking about row-level deletes. Iceberg
> already supports file-level delete, overwrite, etc.
>
> Iceberg also already supports a vacuum operation using ExpireSnapshots
> <http://iceberg.apache.org/javadoc/master/index.html?org/apache/iceberg/ExpireSnapshots.html>.
> But Spark (and other engines) don’t have a way to call this yet. The same
> goes for MERGE INTO: open source Spark doesn’t support the operation yet.
> We’re also working on building support into Spark as we go.
>
> I hope that helps!
>
> On Wed, Aug 7, 2019 at 4:25 AM Saisai Shao <sai.sai.s...@gmail.com> wrote:
>
>> Hi team,
>>
>> The Delta Lake project recently announced version 0.3.0, which added
>> several new API-level features, such as update, delete, merge, and vacuum.
>> May I ask whether there is any plan to add such features to Iceberg?
>>
>> Thanks
>> Saisai
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>
