I agree that option (a) is what users expect for row-level changes. As I see it, the delete files added in the given snapshot provide the PK of the DELETED entry, while the existing delete files are read together with the data files to find the DELETED value (V1b) and the rest of the columns.
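
To make that reading concrete, here is a rough Python sketch of option (a) applied to the scenario quoted below. It is not the Iceberg API: the names DataFile, EqDelete, applies and changelog are made up for illustration. It assumes that a snapshot can be identified by its sequence number, and that an equality delete applies only to data files with a strictly lower sequence number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    seq: int      # sequence number of the snapshot that added the file
    rows: tuple   # tuple of (pk, value) pairs


@dataclass(frozen=True)
class EqDelete:
    seq: int      # sequence number of the snapshot that added the delete
    pk: str       # primary key value the equality delete matches


def applies(delete, data_file):
    # Assumption: an equality delete only affects older data files.
    return delete.seq > data_file.seq


def changelog(snapshot_seq, data_files, deletes):
    """Emit (pk, value, op) rows for one snapshot under option (a)."""
    existing = [d for d in deletes if d.seq < snapshot_seq]
    added = [d for d in deletes if d.seq == snapshot_seq]
    out = []
    for df in sorted(data_files, key=lambda f: f.seq):
        if df.seq > snapshot_seq:
            continue  # file was written by a later snapshot
        for pk, value in df.rows:
            if df.seq == snapshot_seq:
                out.append((pk, value, "INSERTED"))
                continue
            # Existing deletes decide whether the row was still live
            # before this snapshot; added deletes decide what it removes.
            already_deleted = any(
                applies(d, df) and d.pk == pk for d in existing)
            newly_deleted = any(
                applies(d, df) and d.pk == pk for d in added)
            if newly_deleted and not already_deleted:
                out.append((pk, value, "DELETED"))
    return out


# The files from the scenario in the quoted message.
DF1 = DataFile(1, (("PK1", "V1"), ("PK2", "V2")))
DF2 = DataFile(2, (("PK1", "V1b"),))
DF3 = DataFile(3, (("PK1", "V1c"),))
ED1 = EqDelete(2, "PK1")
ED2 = EqDelete(3, "PK1")

for row in changelog(3, [DF1, DF2, DF3], [ED1, ED2]):
    print(row)
```

For snapshot 3, PK1,V1 in DF1 is skipped because ED1 had already removed it, so only the V1b delete and the V1c insert come out, which is option (a).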

Thanks,
Steve Zhang

> On Aug 20, 2024, at 6:06 PM, Wing Yew Poon <wyp...@cloudera.com.INVALID> wrote:
>
> Hi,
>
> I have a PR open to add changelog support for the case where delete files are
> present (https://github.com/apache/iceberg/pull/10935). I have a question
> about what the changelog should emit in the following scenario:
>
> The table has a schema with a primary key/identifier column PK and additional
> column V.
> In snapshot 1, we write a data file DF1 with rows
> PK1, V1
> PK2, V2
> etc.
> In snapshot 2, we write an equality delete file ED1 with PK=PK1, and new data
> file DF2 with rows
> PK1, V1b
> (possibly other rows)
> In snapshot 3, we write an equality delete file ED2 with PK=PK1, and new data
> file DF3 with rows
> PK1, V1c
> (possibly other rows)
>
> Thus, in snapshot 2 and snapshot 3, we update the row identified by PK1 with
> new values by using an equality delete and writing new data for the row.
> These are the files present in snapshot 3:
> DF1 (sequence number 1)
> DF2 (sequence number 2)
> DF3 (sequence number 3)
> ED1 (sequence number 2)
> ED2 (sequence number 3)
>
> The question I have is what should the changelog emit for snapshot 3?
> For snapshot 1, the changelog should emit a row for each row in DF1 as
> INSERTED.
> For snapshot 2, it should emit a row for PK1, V1 as DELETED; and a row for
> PK1, V1b as INSERTED.
> For snapshot 3, I see two possibilities:
> (a)
> PK1,V1b,DELETED
> PK1,V1c,INSERTED
>
> (b)
> PK1,V1,DELETED
> PK1,V1b,DELETED
> PK1,V1c,INSERTED
>
> The interpretation for (b) is that both ED1 and ED2 apply to DF1, with ED1
> being an existing delete file and ED2 being an added delete file for it. We
> discount ED1 and apply ED2 and get a DELETED row for PK1,V1.
> ED2 also applies to DF2, from which we get a DELETED row for PK1, V1b.
>
> The interpretation for (a) is that ED1 is an existing delete file for DF1 and
> in snapshot 3, the row PK1,V1 already does not exist before the snapshot.
> Thus we do not emit a row for it. (We can think of it as ED1 is already applied
> to DF1, and we only consider any additional rows that get deleted when ED2 is
> applied.)
>
> I lean towards (a), as I think it is more reflective of net changes.
> I am interested to hear what folks think.
>
> Thank you,
> Wing Yew