Thanks a lot Ryan, that would be very helpful! Delta Lake recently added support for such operations at the API level (https://github.com/delta-io/delta/blob/master/src/main/scala/io/delta/tables/DeltaTable.scala). I think Iceberg's goal at the API level is similar, so maybe we could take that as a reference.
Besides, directly using the Iceberg API to manipulate data is not so straightforward, so it would be great if we could also have DataFrame API/SQL support later on.

Best regards,
Saisai

Ryan Blue <rb...@netflix.com> wrote on Thu, Aug 8, 2019 at 1:22 AM:

> Hi Saisai,
>
> We are working on adding row-level delete support to Iceberg, where the
> deletes are applied when data is read. We’ve had a few good design
> discussions and have come up with a good way to integrate these into the
> format. Erik has written a good document on it:
> https://docs.google.com/document/d/1FMKh_SQ6xSUUmoCA8LerTkzIxDUN5JbStQp5Hzot4eo/edit#heading=h.p74qmh3a6ets
>
> I’ve also started a milestone to track this work:
> https://github.com/apache/incubator-iceberg/issues?q=is%3Aopen+is%3Aissue+milestone%3A%22Row-level+Delete%22
>
> That’s assuming that you’re talking about row-level deletes. Iceberg
> already supports file-level delete, overwrite, etc.
>
> Iceberg also already supports a vacuum operation using ExpireSnapshots
> <http://iceberg.apache.org/javadoc/master/index.html?org/apache/iceberg/ExpireSnapshots.html>.
> But, Spark (and other engines) don’t have a way to call this yet. Same for
> MERGE INTO, open source Spark doesn’t support the operation yet. We’re also
> working on building support into Spark as we go.
>
> I hope that helps!
>
> On Wed, Aug 7, 2019 at 4:25 AM Saisai Shao <sai.sai.s...@gmail.com> wrote:
>
>> Hi team,
>>
>> Delta lake project recently announced version 0.3.0, which added several
>> new features in API level, like update, delete, merge, vacuum, etc. May I
>> ask is there any plan to add such features in Iceberg?
>>
>> Thanks
>> Saisai
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>