Sorry, I didn't address the suggestion to add a Flink branch as well. The work needed for the Flink sink is to remove parts that are specific to Netflix, so I'm not sure what the rationale for a branch would be. Is there a reason why this can't be done in master, but requires a shared branch? If multiple people want to contribute, why not contribute to the same PR?
A shared PR branch makes the most sense to me for this because it is regularly tested against master.

On Mon, Mar 30, 2020 at 2:48 PM Ryan Blue <[email protected]> wrote:

> I think we may eventually want a branch, but I think it is too early
> to create one now.
>
> Branches are expensive. They require maintenance to stay in sync with
> master, usually copying changes from master into the branch with updates.
> Updating the changes to master for the branch is more difficult because it
> is usually not the original contributor or reviewer porting them. And it is
> better to catch problems between changes in master and the branch early.
>
> I'm not against branches, but I don't want to create them unless they are
> valuable. In this case, I don't see the value. We plan to add v2 in
> parallel so you can still write v1 tables for compatibility, and most of
> the work that needs to be done -- like creating readers and writers for
> diff formats -- can be done in master.
>
> rb
>
> On Mon, Mar 30, 2020 at 9:00 AM Gautam <[email protected]> wrote:
>
>> Thanks for bringing this up, OpenInx. That's a great idea: to open a
>> separate branch for row-level deletes.
>>
>> I would like to help support/contribute/review this as well. If there are
>> sub-tasks you guys have identified that can be added to
>> https://github.com/apache/incubator-iceberg/milestone/4 we can start
>> taking those up too.
>>
>> thanks for the good work,
>> - Gautam.
>>
>> On Mon, Mar 30, 2020 at 8:39 AM Junjie Chen <[email protected]>
>> wrote:
>>
>>> +1 to create the branch. Some row-level delete subtasks must be based on
>>> the sequence number as well as end to end tests.
>>>
>>> On Fri, Mar 27, 2020 at 4:42 PM OpenInx <[email protected]> wrote:
>>>
>>>> Dear Dev:
>>>>
>>>> On Tuesday, we had a sync meeting and discussed the following:
>>>> 1. cut the 0.8.0 release;
>>>> 2. flink connector;
>>>> 3. iceberg row-level delete;
>>>> 4. Map-Reduce Formats and Hive support.
>>>>
>>>> We'll release version 0.8.0 around April 15, and the following 0.9.0
>>>> will be released in the next few months. On the other hand, Ryan,
>>>> Junjie Chen and I have done three PoC versions for the row-level
>>>> deletes. We had a full discussion[4] and started to do the relevant
>>>> code design. We're sure that the feature will introduce some
>>>> incompatible specification changes, such as the sequence_number
>>>> spec[1] and the file_type spec[2]; the sortedOrder feature also seems
>>>> to be a breaking change [3].
>>>>
>>>> To avoid affecting the release of version 0.8.0 and to push the
>>>> row-delete feature early, I suggest opening a new branch for the
>>>> row-delete feature, named branch-1. Once the row-delete feature is
>>>> stable, we could release 1.0.0. Or we can just open a row-delete
>>>> feature branch and, once the work is done, merge it back to the
>>>> master branch and continue to release the 0.9.0 version.
>>>>
>>>> I guess the flink connector devs are facing the same problem?
>>>>
>>>> What do you think about this?
>>>>
>>>> Thank you.
>>>>
>>>> [1]. https://github.com/apache/incubator-iceberg/pull/588
>>>> [2]. https://github.com/apache/incubator-iceberg/issues/824
>>>> [3]. https://github.com/apache/incubator-iceberg/issues/317
>>>> [4]. https://docs.google.com/document/d/1CPFun2uG-eXdJggqKcPsTdNa2wPMpAdw8loeP-0fm_M/edit?usp=sharing
>>>>
>>>
>>> --
>>> Best Regards
>>>
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix

--
Ryan Blue
Software Engineer
Netflix
