[DISCUSS] Changes for row-level deletes

2020-05-05 Thread Ryan Blue
Hi, everyone, I know several people that are planning to attend the sync tomorrow are interested in the row-level delete work, so I wanted to share some of the progress and my current thinking ahead of time. The codebase now supports a new version number, 2. Tables must be manually upgraded to ve

Re: [DISCUSS] Changes for row-level deletes

2020-05-05 Thread OpenInx
The two-phase approach sounds good to me. The precondition is that we have a limited number of delete files, so that memory can hold all of them. We will have the compaction service to reduce the number of delete files, so this does not seem to be a problem.

Re: [DISCUSS] Changes for row-level deletes

2020-05-05 Thread OpenInx
Besides, I'd like to share some work from my Flink team; I hope it will be helpful for you. We have customers who want to try Flink+Iceberg to build their business data lake. The classic scenarios are: 1. streaming click events into Iceberg and analyzing them with other OLAP engines; 2. streaming CDC even

Re: [DISCUSS] Changes for row-level deletes

2020-05-05 Thread Miao Wang
Hi Ryan, “Tables must be manually upgraded to version 2 in order to use any of the metadata changes we are making” If I understand correctly, for an existing Iceberg table in v1, we have to run some CLI/script to rewrite the metadata. “Next, we've added sequence numbers and the proposed inheritance s