Hi Ryan,

Thanks for your input.

I think the Flink Connector API is relatively stable now, compared to the
previous versions.
We have verified the latest Iceberg connector with the upcoming 1.16
release, and it works well.
I think API stability is something for the future and we should have some
workflow or mechanism
to guarantee this from an external connector side.

We will come up with a proposal about the API compatibility guarantee
workflow/mechanism and
a best practice + PoC for multi-version support. We are willing to join the
Iceberg community to
improve/refactor the connector and deliver a better experience of the
connector for users.

How about holding off on the vote until the discussion has reached a
conclusion?

Best,
Jark

On Tue, 25 Oct 2022 at 03:55, Ryan Blue <b...@tabular.io> wrote:

> I don't think we want to talk about the Flink community accepting the
> Iceberg connector just yet. The goal of Abid's exploration is to see
> what it would look like as an external connector. We'd need to decide
> in the Iceberg community if that's something that we'd want to do long
> term. If it were me, I'd probably say wait until the connector APIs
> are stable and there is a best practice for releasing.
>
> Ryan
>
> On Mon, Oct 24, 2022 at 11:16 AM Martijn Visser
> <martijnvis...@apache.org> wrote:
> >
> > Hi all,
> >
> > There are many valid points raised in this discussion thread, but I
> > think we should not mix up different topics. From my perspective, there
> > are two things going on:
> >
> > 1. This thread is about the Flink community accepting the Iceberg
> connector, with various maintainers from Iceberg volunteering to help with
> the maintenance of the connector itself.
> > 2. Also included in this thread are discussions about the
> > externalization of connectors from Flink. There have been recent
> > discussions on this [1], there is engineering activity happening on that
> > topic, and it is a big focus point for the next couple of weeks/months.
> > As for differing opinions, I actually don't see those on the mailing
> > list, because after the discussions the votes are passing.
> >
> > Best regards,
> >
> > Martijn
> >
> > [1]
> https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development
> >
> > On Fri, Oct 21, 2022 at 3:01 AM Jark Wu <imj...@gmail.com> wrote:
> >>
> >> Hi Abid and all,
> >>
> >> I added the Iceberg dev community for a wider discussion.
> >>
> >> I agree with Yuxia and have the same concern as Steven Wu.
> >>
> >> There were long discussions around the externalizing connector and many
> >> different opinions.
> >> If I remember correctly[1][2], the conclusion was to externalize
> >> ElasticSearch first as an example,
> >> and see how it works and what we can standardize (e.g., docs, releases,
> >> versions, CI).
> >> Once everything works well, we can externalize other connectors.
> >>
> >> However, from what I see, currently, the externalized ElasticSearch
> >> connector
> >> is still at an early stage without releasing any versions.
> >> It looks like we still don't have a mature workflow.
> >> It's also not clear to me how much the maintenance burden has increased.
> >> Is this a scalable way to support dozens of connectors?
> >> Does the community have enough resources/committers to merge the PRs?
> >> How much does it affect contributors when the code is not in the main
> >> repo?
> >>
> >> IMO, the Iceberg connector is a very important connector for the Flink
> >> ecosystem.
> >> It's a mature connector and many users like it! I hope it can have a
> better
> >> future.
> >> However, the externalizing workflow is still evolving and under
> >> verification.
> >> It might not be the best place for popular connectors at the current
> point
> >> in time.
> >>
> >> As for the reasons Abid mentioned for moving the Iceberg connector:
> >> 1) API stability to reduce multiple version maintenance.
> >> 2) Flink experts to help maintain the connector.
> >>
> >> I don't think the move helps much with the API issues, because the
> >> connector would still be in a separate repo.
> >> On the contrary, the connector would have to struggle with additional API
> >> issues from the Iceberg project.
> >> Besides, the connector may need to maintain 6 more versions (3x3 vs 3),
> >> which is unmaintainable.
> >> Actually, Flink API is becoming stable in recent versions. We have also
> >> verified the latest Iceberg
> >> connector on the upcoming 1.16 release, and it works well. Flink
> community
> >> also proposed FLIPs[3][4]
> >> for API stability guarantees. On the other side, I also don't like the
> >> version matrix modules/branches.
> >> We use a shim layer to support different versions of Hive for
> >> flink-connector-hive with only 1 module
> >> for different hive versions. We have similar practices in
> >> flink-cdc-connectors[5] and end-to-end tests
> >> to guarantee compatibility with different Flink versions[6]. The
> >> maintenance burden has been acceptable to us so far.
> >>
> >> In short, I think we have ways to solve API issues and Flink API is
> >> becoming stable.
> >> For the Flink experts, Yuxia is the component owner of
> >> flink-connector-hive. He has plenty
> >> of knowledge of cross-version compatibility. He is willing to join the
> >> Iceberg community to
> >> help improve the version problem and maintain the connector. What do you
> >> think about it?
> >>
> >> Best,
> >> Jark
> >> Ververica (Alibaba)
> >>
> >> [1] https://lists.apache.org/thread/8k1xonqt7hn0xldbky1cxfx3fzh6sj7h
> >> [2] https://lists.apache.org/thread/9mzxnl4948ddq07f980mmzoz0c9stnlb
> >> [3]:
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-196%3A+Source+API+stability+guarantees
> >> [4]:
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> >> [5] https://github.com/ververica/flink-cdc-connectors/
> >> [6]
> >>
> https://github.com/ververica/flink-cdc-connectors/blob/master/flink-cdc-e2e-tests/src/test/java/com/ververica/cdc/connectors/tests/utils/FlinkContainerTestEnvironment.java#L124
> >>
> >> On Thu, 20 Oct 2022 at 22:41, Jing Ge <j...@ververica.com> wrote:
> >>
> >> > I agree with Steven Wu that those points are applicable to every
> >> > externalized connector. So those were actually concerns about
> >> > externalizing connector development; that was already discussed, and
> >> > consensus has already been reached to do it.
> >> >
> >> > Speaking of the 3x3 concern, I think the concept[1] proposed by
> >> > Chesnay and voted on at [2] could help you.
> >> >
> >> > [1]
> >> >
> >> >
> https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development
> >> > [2] https://lists.apache.org/thread/7qr8jc053y8xpygcwbhlqq4r7c7fj1p3
> >> >
> >> > Best regards,
> >> > Jing
> >> >
> >> > On Thu, Oct 20, 2022 at 3:46 PM Steven Wu <stevenz...@gmail.com>
> wrote:
> >> >
> >> > > Yuxia, those are valid points. But they are applicable to every
> connector
> >> > > (not just Iceberg).
> >> > >
> >> > > I also had a similar concern expressed in the discussion thread of
> >> > > "Externalized connector release details&workflow". My main concern
> is the
> >> > > multiplication factor of two upstream projects (Flink &
> storage/Iceberg).
> >> > > If we limit both to two versions, it will be 2x2, which might still
> >> > > be OK. But if we need to do 3x3, that will probably be too many to
> >> > > manage.
> >> > >
> >> > > On Thu, Oct 20, 2022 at 5:27 AM yuxia <luoyu...@alumni.sjtu.edu.cn>
> >> > wrote:
> >> > >
> >> > > > Hi, abmo, Abid!
> >> > > > Thank you for driving this.
> >> > > >
> >> > > > As Iceberg is more and more popular and is an important
> >> > > > upstream/downstream system for Flink, I believe the Flink community
> >> > > > has paid much attention to Iceberg and hopes to be closer to the
> >> > > > Iceberg community. No matter whether it's moved under the Flink
> >> > > > umbrella or not, I believe Flink experts are glad to give feedback
> >> > > > to Iceberg and take part in the development of the Iceberg Flink
> >> > > > connector.
> >> > > >
> >> > > >
> >> > > > Personally, as a Flink contributor and the main maintainer of the
> >> > > > Hive Flink connector, I'm really glad to take part in the Iceberg
> >> > > > community for the maintenance and future development of the Iceberg
> >> > > > Flink connector. I think I can provide some views from the Flink
> >> > > > side and bring some feedback from the Iceberg community to the Flink
> >> > > > community.
> >> > > >
> >> > > > But I have some concerns about moving the connector from the
> >> > > > Iceberg repository to a separate connector under the Flink umbrella:
> >> > > >
> >> > > > 1: If Iceberg develops new features, the Iceberg Flink connector
> >> > > > has to wait for Iceberg to be released before starting the
> >> > > > development and release work that makes use of those features. Users
> >> > > > may need to wait much longer before enjoying the new Iceberg
> >> > > > features through Flink.
> >> > > >
> >> > > > 2: If we move it to a separate repository, I'm afraid it will lose
> >> > > > attention from both the Flink and Iceberg sides, which would
> >> > > > definitely harm both communities. What's more, whenever Flink or
> >> > > > Iceberg releases a version, we need to update the version in the
> >> > > > separate repository, which I think is tedious and easily forgotten.
> >> > > >
> >> > > > Sorry for raising a different voice in this discussion, but I
> >> > > > think it deserves further discussion on the dev mailing list; at
> >> > > > least it will help to get Flink developers' attention on Iceberg.
> >> > > >
> >> > > > Best regards,
> >> > > > Yuxia
> >> > > >
> >> > > > ----- Original Message -----
> >> > > > From: "abmo work" <abmo.w...@icloud.com.INVALID>
> >> > > > To: "dev" <d...@flink.apache.org>
> >> > > > Sent: Thursday, October 20, 2022, 6:33:40 AM
> >> > > > Subject: Re: [Discuss]- Donate Iceberg Flink Connector
> >> > > >
> >> > > > Hi Martijn,
> >> > > >
> >> > > > I created a FLIP for this, it's FLIP 267: Iceberg Connector <
> >> > > >
> >> > >
> >> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP+267:+Iceberg+Connector
> >> > > > >
> >> > > > Please let me know if anything else is needed. My email on
> confluence
> >> > is
> >> > > > abmo.w...@icloud.com.
> >> > > >
> >> > > > As 1.0 was released today, from the Iceberg perspective we need to
> >> > > > figure out which versions of Flink we will support, and the release
> >> > > > timeline for when the connector will be built and released from the
> >> > > > new repo rather than from Iceberg.
> >> > > >
> >> > > > Thanks
> >> > > > Abid
> >> > > >
> >> > > > > On Oct 19, 2022, at 12:43 PM, Martijn Visser <
> >> > martijnvis...@apache.org
> >> > > >
> >> > > > wrote:
> >> > > > >
> >> > > > > Hi Abid,
> >> > > > >
> >> > > > > We should have a FLIP as this would be a code contribution. If
> you
> >> > > > provide
> >> > > > > your Confluence user name, we can grant you access to create
> one.
> >> > > > >
> >> > > > > Is there also something from an Iceberg point of view needed to
> agree
> >> > > > with
> >> > > > > the code contribution?
> >> > > > >
> >> > > > > Best regards,
> >> > > > >
> >> > > > > Martijn
> >> > > > >
> >> > > > > On Wed, Oct 19, 2022 at 19:11, <abmo.w...@icloud.com.invalid> wrote:
> >> > > > >
> >> > > > >> Thanks Martijn!
> >> > > > >>
> >> > > > >> Thanks for all the support and positive responses. I will
> start a
> >> > vote
> >> > > > >> thread and send it out to the dev list.
> >> > > > >>
> >> > > > >> Also, we need help with creation of a new repo for the Iceberg
> >> > > > Connector.
> >> > > > >>
> >> > > > >> Can someone help with the creation of a repo? Please let me
> know if
> >> > I
> >> > > > need
> >> > > > >> to create an issue or FLIP for that.
> >> > > > >> Following similar naming for other connectors, I propose
> >> > > > >> https://github.com/apache/flink-connector-iceberg (doesn’t
> exist)
> >> > > > >>
> >> > > > >> Thanks
> >> > > > >> Abid
> >> > > > >>
> >> > > > >> On 2022/10/19 08:41:02 Martijn Visser wrote:
> >> > > > >>> Hi all,
> >> > > > >>>
> >> > > > >>> Thanks for the info and also thanks Peter and Steven for
> offering
> >> > to
> >> > > > >>> volunteer. I think that's a great idea and a necessity.
> >> > > > >>>
> >> > > > >>> Overall +1 given the current ideas to make this contribution
> >> > happen.
> >> > > > >>>
> >> > > > >>> BTW congrats on reaching Iceberg 1.0, a great accomplishment
> :)
> >> > > > >>>
> >> > > > >>> Thanks,
> >> > > > >>>
> >> > > > >>> Martijn
> >> > > > >>>
> >> > > > >>> On Tue, Oct 18, 2022 at 12:31 AM Steven Wu <st...@gmail.com>
> >> > wrote:
> >> > > > >>>
> >> > > > >>>> I was one of the maintainers for the Flink Iceberg connector
> in
> >> > > > Iceberg
> >> > > > >>>> repo. I can volunteer as one of the initial maintainers if we
> >> > decide
> >> > > > to
> >> > > > >>>> move forward.
> >> > > > >>>>
> >> > > > >>>> On Mon, Oct 17, 2022 at 3:26 PM <ab...@icloud.com.invalid>
> wrote:
> >> > > > >>>>
> >> > > > >>>>> Hi Martijn,
> >> > > > >>>>>
> >> > > > >>>>> Yes, it is considered a connector in Flink terms.
> >> > > > >>>>>
> >> > > > >>>>> We wanted to join the Flink connector externalization
> effort so
> >> > > that
> >> > > > >> we
> >> > > > >>>>> can bring the Iceberg connector closer to the Flink
> community. We
> >> > > are
> >> > > > >>>>> hoping any issues with the APIs for Iceberg connector will
> >> > surface
> >> > > > >> sooner
> >> > > > >>>>> and get more attention from the Flink community when the
> >> > connector
> >> > > is
> >> > > > >>>>> within the Flink umbrella rather than in the Iceberg repo. We
> >> > > > >>>>> also hope to get better feedback from Flink experts on
> >> > > > >>>>> whether something should be added to a connector or to
> >> > > > >>>>> Flink itself.
> >> > > > >>>>>
> >> > > > >>>>> Thanks everyone for all your responses! Looking forward to
> the
> >> > next
> >> > > > >>>> steps.
> >> > > > >>>>>
> >> > > > >>>>> Thanks
> >> > > > >>>>> Abid
> >> > > > >>>>>
> >> > > > >>>>> On 2022/10/14 03:37:09 Jark Wu wrote:
> >> > > > >>>>>> Thanks, Abid, for the discussion,
> >> > > > >>>>>>
> >> > > > >>>>>> I'm also fine with maintaining it under the Flink project.
> >> > > > >>>>>> But I'm also interested in the response to Martijn's
> question.
> >> > > > >>>>>>
> >> > > > >>>>>> Besides, once the code is moved to the Flink project, are
> there
> >> > > any
> >> > > > >>>>> initial
> >> > > > >>>>>> maintainers for the connector we can find?
> >> > > > >>>>>> In addition, do we still maintain documentation under
> Iceberg
> >> > > > >>>>>> https://iceberg.apache.org/docs/latest/flink/ ?
> >> > > > >>>>>>
> >> > > > >>>>>> Best,
> >> > > > >>>>>> Jark
> >> > > > >>>>>>
> >> > > > >>>>>>
> >> > > > >>>>>> On Thu, 13 Oct 2022 at 17:52, yuxia <
> lu...@alumni.sjtu.edu.cn>
> >> > > > >> wrote:
> >> > > > >>>>>>
> >> > > > >>>>>>> +1. Thanks for driving it. Hope I can find some chances
> to take
> >> > > > >> part
> >> > > > >>>> in
> >> > > > >>>>>>> the future development of Iceberg Flink Connector.
> >> > > > >>>>>>>
> >> > > > >>>>>>> Best regards,
> >> > > > >>>>>>> Yuxia
> >> > > > >>>>>>>
> >> > > > >>>>>>> ----- Original Message -----
> >> > > > >>>>>>> From: "Zheng Yu Chen" <ja...@gmail.com>
> >> > > > >>>>>>> To: "dev" <de...@flink.apache.org>
> >> > > > >>>>>>> Sent: Thursday, October 13, 2022, 11:26:29 AM
> >> > > > >>>>>>> Subject: Re: [Discuss]- Donate Iceberg Flink Connector
> >> > > > >>>>>>>
> >> > > > >>>>>>> +1, thanks for driving it
> >> > > > >>>>>>>
> >> > > > >>>>>>> On Mon, Oct 10, 2022 at 09:22, Abid Mohammed <ab...@icloud.com.invalid> wrote:
> >> > > > >>>>>>>
> >> > > > >>>>>>>> Hi,
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> I would like to start a discussion about contributing
> Iceberg
> >> > > > >> Flink
> >> > > > >>>>>>>> Connector to Flink.
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> I created a doc <
> >> > > > >>>>>>>>
> >> > > > >>>>>>>
> >> > > > >>>>>
> >> > > > >>>>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> https://docs.google.com/document/d/1WC8xkPiVdwtsKL2VSPAUgzm9EjrPs8ZRjEtcwv93ISI/edit?usp=sharing
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> with all the details following the Flink Connector
> template as
> >> > > > >> I
> >> > > > >>>>> don’t
> >> > > > >>>>>>> have
> >> > > > >>>>>>>> permissions to create a FLIP yet.
> >> > > > >>>>>>>> High level details are captured below:
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Motivation:
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> This FLIP aims to contribute the existing Apache Iceberg
> Flink
> >> > > > >>>>> Connector
> >> > > > >>>>>>>> to Flink.
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Apache Iceberg is an open table format for huge analytic
> >> > > > >> datasets.
> >> > > > >>>>>>> Iceberg
> >> > > > >>>>>>>> adds tables to compute engines including Spark, Trino,
> >> > > > >> PrestoDB,
> >> > > > >>>>> Flink,
> >> > > > >>>>>>>> Hive and Impala using a high-performance table format
> that
> >> > > > >> works
> >> > > > >>>> just
> >> > > > >>>>>>> like
> >> > > > >>>>>>>> a SQL table.
> >> > > > >>>>>>>> Iceberg avoids unpleasant surprises. Schema evolution
> works
> >> > and
> >> > > > >>>> won’t
> >> > > > >>>>>>>> inadvertently un-delete data. Users don’t need to know
> about
> >> > > > >>>>> partitioning
> >> > > > >>>>>>>> to get fast queries. Iceberg was designed to solve
> correctness
> >> > > > >>>>> problems
> >> > > > >>>>>>> in
> >> > > > >>>>>>>> eventually-consistent cloud object stores.
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Iceberg supports both Flink’s DataStream API and Table
> API.
> >> > > > >> Based
> >> > > > >>>> on
> >> > > > >>>>> the
> >> > > > >>>>>>>> guideline of the Flink community, only the latest 2 minor
> >> > > > >> versions
> >> > > > >>>>> are
> >> > > > >>>>>>>> actively maintained. See the Multi-Engine
> Support#apache-flink
> >> > > > >> for
> >> > > > >>>>>>> further
> >> > > > >>>>>>>> details.
> >> > > > >>>>>>>>
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Iceberg connector supports:
> >> > > > >>>>>>>>
> >> > > > >>>>>>>>        • Source: detailed Source design <
> >> > > > >>>>>>>>
> >> > > > >>>>>>>
> >> > > > >>>>>
> >> > > > >>>>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> https://docs.google.com/document/d/1q6xaBxUPFwYsW9aXWxYUh7die6O7rDeAPFQcTAMQ0GM/edit#
> >> > > > >>>>>>>> ,
> >> > > > >>>>>>>> based on FLIP-27
> >> > > > >>>>>>>>        • Sink: detailed Sink design and interfaces used <
> >> > > > >>>>>>>>
> >> > > > >>>>>>>
> >> > > > >>>>>
> >> > > > >>>>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> https://docs.google.com/document/d/1O-dPaFct59wUWQECXEEYIkl9_MOoG3zTbC2V-fZRwrg/edit#
> >> > > > >>>>>>>>>
> >> > > > >>>>>>>>        • Usable in both DataStream and Table API/SQL
> >> > > > >>>>>>>>        • DataStream read/append/overwrite
> >> > > > >>>>>>>>        • SQL create/alter/drop table, select, insert
> into,
> >> > > > >> insert
> >> > > > >>>>>>>> overwrite
> >> > > > >>>>>>>>        • Streaming or batch read in Java API
> >> > > > >>>>>>>>        • Support for Flink’s Python API
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> See Iceberg Flink  <
> >> > > > >>>>> https://iceberg.apache.org/docs/latest/flink/#flink
> >> > > > >>>>>>>> for
> >> > > > >>>>>>>> detailed usage instructions.
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Looking forward to the discussion!
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Thanks
> >> > > > >>>>>>>> Abid
> >> > > > >>>>>>>
> >> > > > >>>>>>
> >> > > > >>>>
> >> > > > >>>
> >> > > > >
> >> > > > > --
> >> > > > > Martijn
> >> > > > > https://twitter.com/MartijnVisser82
> >> > > > > https://github.com/MartijnVisser
> >> > > >
> >> > >
> >> >
>
>
>
> --
> Ryan Blue
> Tabular
>
