Hi all,

In Flink SQL, in UPSERT mode, I have observed that if I INSERT a new record 
with a new equality field id, an equality delete file is also created with 
the corresponding entry. For example, I executed the following commands in 
Flink SQL with Apache Iceberg:


CREATE TABLE `hadoop_catalog`.`testdb`.`upsert_test1` (
    `id`   INT UNIQUE COMMENT 'unique id',
    `data` STRING NOT NULL,
    PRIMARY KEY(`id`) NOT ENFORCED
) WITH ('format-version'='2', 'write.upsert.enabled'='true');

Then I inserted a record:


INSERT INTO upsert_test1 VALUES (7, 'new value');

This resulted in two files. The data file contains:

{"id":7,"data":"new value"}

But it also created an equality delete file containing:


{"id":7}

I expected a delete file entry to be created for UPDATE / DELETE but not 
for INSERT, since it might degrade read performance for CDC tables. Is that 
right?
Is it expected that fresh INSERTs also produce equality delete entries? If 
yes, what is the benefit of having an equality delete entry for INSERTs?
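
To make the read-cost concern concrete, here is a toy Python model of how I 
understand equality deletes are applied at read time. It assumes the rule from 
the Iceberg table spec that an equality delete only applies to data files with 
a strictly lower data sequence number. The names DataFile, EqDeleteFile and 
scan are made up for illustration and are not Iceberg APIs.

```python
from dataclasses import dataclass


@dataclass
class DataFile:
    seq: int      # data sequence number of the commit that wrote the file
    rows: list    # rows as dicts, e.g. {"id": 7, "data": "new value"}


@dataclass
class EqDeleteFile:
    seq: int      # data sequence number of the commit that wrote the file
    ids: set      # values of the equality field "id" to delete


def scan(data_files, delete_files):
    """Return live rows under the assumed rule: a delete file applies to a
    data file only if the delete's sequence number is strictly greater."""
    live = []
    for df in data_files:
        applicable = [d for d in delete_files if d.seq > df.seq]
        for row in df.rows:
            if not any(row["id"] in d.ids for d in applicable):
                live.append(row)
    return live


# The INSERT above: data file and equality delete written in the same commit.
data_files = [DataFile(seq=1, rows=[{"id": 7, "data": "new value"}])]
delete_files = [EqDeleteFile(seq=1, ids={7})]
after_insert = scan(data_files, delete_files)

# A hypothetical later upsert of the same key in a second commit.
data_files.append(DataFile(seq=2, rows=[{"id": 7, "data": "updated"}]))
delete_files.append(EqDeleteFile(seq=2, ids={7}))
after_upsert = scan(data_files, delete_files)
```

In this model the freshly inserted row is still readable, because the delete 
file from the same commit does not apply to it. However, each delete file 
still has to be checked against every older data file, which is the overhead 
I am asking about.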


Regards,
Aditya


