Hello everyone, I've drafted a FLIP that describes the current design of the Pulsar connector:
https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit# Please take a look and let me know what you think. Thanks, Yijie On Sat, Sep 14, 2019 at 12:08 AM Rong Rong <walter...@gmail.com> wrote: > > Hi All, > > Sorry for joining the discussion late and thanks Yijie & Sijie for driving > the discussion. > I also think the Pulsar connector would be a very valuable addition to > Flink. I can also help out a bit on the review side :-) > > Regarding the timeline, I also share concerns with Becket on the > relationship between the new Pulsar connector and FLIP-27. > There's also another discussion just started by Stephan on dropping Kafka > 9/10 support on next Flink release [1]. Although the situation is somewhat > different, and Kafka 9/10 connector has been in Flink for almost 3-4 years, > based on the discussion I am not sure if a major version release is a > requirement for removing old connector supports. > > I think there shouldn't be a blocker if we agree the old connector will be > removed once FLIP-27 based Pulsar connector is there. As Stephan stated, it > is easier to contribute the source sooner and adjust it later. > We should also ensure we clearly communicate the message: for example, > putting an experimental flag on the pre-FLIP27 connector page of the > website, documentations, etc. Any other thoughts? > > -- > Rong > > [1] > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Drop-older-versions-of-Kafka-Connectors-0-9-0-10-for-Flink-1-10-td29916.html > > > On Fri, Sep 13, 2019 at 8:15 AM Becket Qin <becket....@gmail.com> wrote: > > > Technically speaking, removing the old connector code is a backwards > > incompatible change which requires a major version bump, i.e. Flink 2.x. > > Given that we don't have a clear plan on when to have the next major > > version release, it seems unclear how long the old connector code will be > > there if we check it in right now. Or will we remove the old connector > > without a major version bump? In any case, it sounds not quite user > > friendly to the those who might use the old Pulsar connector. I am not sure > > if it is worth these potential problems in order to have the Pulsar source > > connector checked in one or two months earlier. > > > > Thanks, > > > > Jiangjie (Becket) Qin > > > > On Thu, Sep 12, 2019 at 3:52 PM Stephan Ewen <se...@apache.org> wrote: > > > > > Agreed, if we check in the old code, we should make it clear that it will > > > be removed as soon as the FLIP-27 based version of the connector is > > there. > > > We should not commit to maintaining the old version, that would be indeed > > > too much overhead. > > > > > > On Thu, Sep 12, 2019 at 3:30 AM Becket Qin <becket....@gmail.com> wrote: > > > > > > > Hi Stephan, > > > > > > > > Thanks for the volunteering to help. > > > > > > > > Yes, the overhead would just be review capacity. In fact, I am not > > > worrying > > > > too much about the review capacity. That is just a one time cost. My > > > > concern is mainly about the long term burden. Assume we have new source > > > > interface ready in 1.10 with newly added Pulsar connectors in old > > > > interface. Later on if we migrate Pulsar to new source interface, the > > old > > > > Pulsar connector might be deprecated almost immediately after checked > > in, > > > > but we may still have to maintain two code bases. For the existing > > > > connectors, we have to do that anyways. But it would be good to avoid > > > > introducing a new connector with the same problem. > > > > > > > > Thanks, > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > On Tue, Sep 10, 2019 at 6:51 PM Stephan Ewen <se...@apache.org> wrote: > > > > > > > > > Hi all! > > > > > > > > > > Nice to see this lively discussion about the Pulsar connector. > > > > > Some thoughts on the open questions: > > > > > > > > > > ## Contribute to Flink or maintain as a community package > > > > > > > > > > Looks like the discussion is more going towards contribution. I think > > > > that > > > > > is good, especially if we think that we want to build a similarly > > deep > > > > > integration with Pulsar as we have for example with Kafka. The > > > connector > > > > > already looks like a more thorough connector than many others we have > > > in > > > > > the repository. > > > > > > > > > > With either a repo split, or the new build system, I hope that the > > > build > > > > > overhead is not a problem. > > > > > > > > > > ## Committer Support > > > > > > > > > > Becket offered some help already, I can also help a bit. I hope that > > > > > between us, we can cover this. > > > > > > > > > > ## Contribute now, or wait for FLIP-27 > > > > > > > > > > As Becket said, FLIP-27 is actually making some PoC-ing progress, but > > > > will > > > > > take 2 more months, I would estimate, before it is fully available. > > > > > > > > > > If we want to be on the safe side with the contribution, we should > > > > > contribute the source sooner and adjust it later. That would also > > help > > > us > > > > > in case things get crazy towards the 1.10 feature freeze and it would > > > be > > > > > hard to find time to review the new changes. > > > > > What would be the overhead of contributing now? Given that the code > > is > > > > > already there, it looks like it would be only review capacity, right? > > > > > > > > > > Best, > > > > > Stephan > > > > > > > > > > On Tue, Sep 10, 2019 at 11:04 AM Yijie Shen < > > henry.yijies...@gmail.com > > > > > > > > > wrote: > > > > > > > > > > > Hi everyone! > > > > > > > > > > > > Thanks for your attention and the promotion of this work. > > > > > > > > > > > > We will prepare a FLIP as soon as possible for more specific > > > > discussions. > > > > > > > > > > > > For FLIP-27, it seems that we have not reached a consensus. > > > Therefore, > > > > > > I will explain all the functionalities of the existing connector in > > > > > > the FLIP (including Source, Sink, and Catalog) to continue our > > > > > > discussions in FLIP. > > > > > > > > > > > > Thanks for your kind help. > > > > > > > > > > > > Best, > > > > > > Yijie > > > > > > > > > > > > On Tue, Sep 10, 2019 at 9:57 AM Becket Qin <becket....@gmail.com> > > > > wrote: > > > > > > > > > > > > > > Hi Sijie, > > > > > > > > > > > > > > If we agree that the goal is to have Pulsar connector in 1.10, > > how > > > > > about > > > > > > we > > > > > > > do the following: > > > > > > > > > > > > > > 0. Start a FLIP to add Pulsar connector to Flink main repo as it > > > is a > > > > > new > > > > > > > public interface to Flink main repo. > > > > > > > 1. Start to review the Pulsar sink right away as there is no > > change > > > > to > > > > > > the > > > > > > > sink interface so far. > > > > > > > 2. Wait a little bit on FLIP-27. Flink 1.10 is going to be code > > > > freeze > > > > > in > > > > > > > late Nov and let's say we give a month to the development and > > > review > > > > of > > > > > > > Pulsar connector, we need to have FLIP-27 by late Oct. There are > > > > still > > > > > 7 > > > > > > > weeks. Personally I think it is doable. If FLIP-27 is not ready > > by > > > > late > > > > > > > Oct, we can review and check in Pulsar connector with the > > existing > > > > > source > > > > > > > interface. This means we will have Pulsar connector in Flink > > 1.10, > > > > > either > > > > > > > with or without FLIP-27. > > > > > > > > > > > > > > Because we are going to have Pulsar sink and source checked in > > > > > > separately, > > > > > > > it might make sense to have two FLIPs, one for Pulsar sink and > > > > another > > > > > > for > > > > > > > Pulsar source. And we can start the work on Pulsar sink right > > away. > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > > > On Mon, Sep 9, 2019 at 4:13 PM Sijie Guo <guosi...@gmail.com> > > > wrote: > > > > > > > > > > > > > > > Thank you Bowen and Becket. > > > > > > > > > > > > > > > > What's the take from Flink community? Shall we wait for FLIP-27 > > > or > > > > > > shall we > > > > > > > > proceed to next steps? And what the next steps are? :-) > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Sijie > > > > > > > > > > > > > > > > On Thu, Sep 5, 2019 at 2:43 PM Bowen Li <bowenl...@gmail.com> > > > > wrote: > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > I think having a Pulsar connector in Flink can be a good > > mutual > > > > > > benefit > > > > > > > > to > > > > > > > > > both communities. > > > > > > > > > > > > > > > > > > Another perspective is that Pulsar connector is the 1st > > > streaming > > > > > > > > connector > > > > > > > > > that integrates with Flink's metadata management system and > > > > Catalog > > > > > > APIs. > > > > > > > > > It'll be cool to see how the integration turns out and > > whether > > > we > > > > > > need to > > > > > > > > > improve Flink Catalog stack, which are currently in Beta, to > > > > cater > > > > > to > > > > > > > > > streaming source/sink. Thus I'm in favor of merging Pulsar > > > > > connector > > > > > > into > > > > > > > > > Flink 1.10. > > > > > > > > > > > > > > > > > > I'd suggest to submit smaller sized PRs, e.g. maybe one for > > > basic > > > > > > > > > source/sink functionalities and another for schema and > > catalog > > > > > > > > integration, > > > > > > > > > just to make them easier to review. > > > > > > > > > > > > > > > > > > It doesn't seem to hurt to wait for FLIP-27. But I don't > > think > > > > > > FLIP-27 > > > > > > > > > should be a blocker in cases where it cannot make its way > > into > > > > 1.10 > > > > > > or > > > > > > > > > doesn't leave reasonable amount of time for committers to > > > review > > > > or > > > > > > for > > > > > > > > > Pulsar connector to fully adapt to new interfaces. > > > > > > > > > > > > > > > > > > Bowen > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Sep 5, 2019 at 3:21 AM Becket Qin < > > > becket....@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Hi Till, > > > > > > > > > > > > > > > > > > > > You are right. It all depends on when the new source > > > interface > > > > is > > > > > > going > > > > > > > > > to > > > > > > > > > > be ready. Personally I think it would be there in about a > > > month > > > > > or > > > > > > so. > > > > > > > > > But > > > > > > > > > > I could be too optimistic. It would also be good to hear > > what > > > > do > > > > > > > > Aljoscha > > > > > > > > > > and Stephan think as they are also involved in FLIP-27. > > > > > > > > > > > > > > > > > > > > In general I think we should have Pulsar connector in Flink > > > > 1.10, > > > > > > > > > > preferably with the new source interface. We can also check > > > it > > > > in > > > > > > right > > > > > > > > > now > > > > > > > > > > with old source interface, but I suspect few users will use > > > it > > > > > > before > > > > > > > > the > > > > > > > > > > next official release. Therefore, it seems reasonable to > > > wait a > > > > > > little > > > > > > > > > bit > > > > > > > > > > to see whether we can jump to the new source interface. As > > > long > > > > > as > > > > > > we > > > > > > > > > make > > > > > > > > > > sure Flink 1.10 has it, waiting a little bit doesn't seem > > to > > > > hurt > > > > > > much. > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > > > > > > > > > On Thu, Sep 5, 2019 at 3:59 PM Till Rohrmann < > > > > > trohrm...@apache.org > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > I'm wondering what the problem would be if we committed > > the > > > > > > Pulsar > > > > > > > > > > > connector before the new source interface is ready. If I > > > > > > understood > > > > > > > > it > > > > > > > > > > > correctly, then we need to support the old source > > interface > > > > > > anyway > > > > > > > > for > > > > > > > > > > the > > > > > > > > > > > existing connectors. By checking it in early I could see > > > the > > > > > > benefit > > > > > > > > > that > > > > > > > > > > > our users could start using the connector earlier. > > > Moreover, > > > > it > > > > > > would > > > > > > > > > > > prevent that the Pulsar integration is being delayed in > > > case > > > > > > that the > > > > > > > > > > > source interface should be delayed. The only downside I > > see > > > > is > > > > > > the > > > > > > > > > extra > > > > > > > > > > > review effort and potential fixes which might be > > irrelevant > > > > for > > > > > > the > > > > > > > > new > > > > > > > > > > > source interface implementation. I guess it mainly > > depends > > > on > > > > > how > > > > > > > > > certain > > > > > > > > > > > we are when the new source interface will be ready. > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > On Thu, Sep 5, 2019 at 8:56 AM Becket Qin < > > > > > becket....@gmail.com> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > Hi Sijie and Yijie, > > > > > > > > > > > > > > > > > > > > > > > > Thanks for sharing your thoughts. > > > > > > > > > > > > > > > > > > > > > > > > Just want to have some update on FLIP-27. Although the > > > FLIP > > > > > > wiki > > > > > > > > and > > > > > > > > > > > > discussion thread has been quiet for some time, a few > > > > > > committer / > > > > > > > > > > > > contributors in Flink community were actually > > prototyping > > > > the > > > > > > > > entire > > > > > > > > > > > thing. > > > > > > > > > > > > We have made some good progress there but want to > > update > > > > the > > > > > > FLIP > > > > > > > > > wiki > > > > > > > > > > > > after the entire thing is verified to work in case > > there > > > > are > > > > > > some > > > > > > > > > last > > > > > > > > > > > > minute surprise in the implementation. I don't have an > > > > exact > > > > > > ETA > > > > > > > > yet, > > > > > > > > > > > but I > > > > > > > > > > > > guess it is going to be within a month or so. > > > > > > > > > > > > > > > > > > > > > > > > I am happy to review the current Flink Pulsar connector > > > and > > > > > > see if > > > > > > > > it > > > > > > > > > > > would > > > > > > > > > > > > fit in FLIP-27. It would be good to avoid the case that > > > we > > > > > > checked > > > > > > > > in > > > > > > > > > > the > > > > > > > > > > > > Pulsar connector with some review efforts and shortly > > > after > > > > > > that > > > > > > > > the > > > > > > > > > > new > > > > > > > > > > > > Source interface is ready. > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Sep 5, 2019 at 8:39 AM Yijie Shen < > > > > > > > > henry.yijies...@gmail.com > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for all the feedback and suggestions! > > > > > > > > > > > > > > > > > > > > > > > > > > As Sijie said, the goal of the connector has always > > > been > > > > to > > > > > > > > provide > > > > > > > > > > > > > users with the latest features of both systems as > > soon > > > as > > > > > > > > possible. > > > > > > > > > > We > > > > > > > > > > > > > propose to contribute the connector to Flink and hope > > > to > > > > > get > > > > > > more > > > > > > > > > > > > > suggestions and feedback from Flink experts to ensure > > > the > > > > > > high > > > > > > > > > > quality > > > > > > > > > > > > > of the connector. > > > > > > > > > > > > > > > > > > > > > > > > > > For FLIP-27, we noticed its existence at the > > beginning > > > of > > > > > > > > reworking > > > > > > > > > > > > > the connector implementation based on Flink 1.9; we > > > also > > > > > > wanted > > > > > > > > to > > > > > > > > > > > > > build a connector that supports both batch and stream > > > > > > computing > > > > > > > > > based > > > > > > > > > > > > > on it. > > > > > > > > > > > > > However, it has been inactive for some time, so we > > > > decided > > > > > to > > > > > > > > > provide > > > > > > > > > > > > > a connector with most of the new features, such as > > the > > > > new > > > > > > type > > > > > > > > > > system > > > > > > > > > > > > > and the new catalog API first. We will pay attention > > to > > > > the > > > > > > > > > progress > > > > > > > > > > > > > of FLIP-27 continually and incorporate it with the > > > > > connector > > > > > > as > > > > > > > > > soon > > > > > > > > > > > > > as possible. > > > > > > > > > > > > > > > > > > > > > > > > > > Regarding the test status of the connector, we are > > > > > following > > > > > > the > > > > > > > > > > other > > > > > > > > > > > > > connectors' test in Flink repository and aimed to > > > provide > > > > > > > > > throughout > > > > > > > > > > > > > tests as we could. We are also happy to hear > > > suggestions > > > > > and > > > > > > > > > > > > > supervision from the Flink community to improve the > > > > > > stability and > > > > > > > > > > > > > performance of the connector continuously. > > > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > > Yijie > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Sep 5, 2019 at 5:59 AM Sijie Guo < > > > > > guosi...@gmail.com > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks everyone for the comments and feedback. > > > > > > > > > > > > > > > > > > > > > > > > > > > > It seems to me that the main question here is > > about - > > > > > "how > > > > > > can > > > > > > > > > the > > > > > > > > > > > > Flink > > > > > > > > > > > > > > community maintain the connector?". > > > > > > > > > > > > > > > > > > > > > > > > > > > > Here are two thoughts from myself. > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1) I think how and where to host this integration > > is > > > > kind > > > > > > of > > > > > > > > less > > > > > > > > > > > > > important > > > > > > > > > > > > > > here. I believe there can be many ways to achieve > > it. > > > > > > > > > > > > > > As part of the contribution, what we are looking > > for > > > > here > > > > > > is > > > > > > > > how > > > > > > > > > > > these > > > > > > > > > > > > > two > > > > > > > > > > > > > > communities can build the collaboration > > relationship > > > on > > > > > > > > > developing > > > > > > > > > > > > > > the integration between Pulsar and Flink. Even we > > can > > > > try > > > > > > our > > > > > > > > > best > > > > > > > > > > to > > > > > > > > > > > > > catch > > > > > > > > > > > > > > up all the updates in Flink community. We are still > > > > > > > > > > > > > > facing the fact that we have less experiences in > > > Flink > > > > > than > > > > > > > > folks > > > > > > > > > > in > > > > > > > > > > > > > Flink > > > > > > > > > > > > > > community. In order to make sure we maintain and > > > > deliver > > > > > > > > > > > > > > a high-quality pulsar-flink integration to the > > users > > > > who > > > > > > use > > > > > > > > both > > > > > > > > > > > > > > technologies, we need some help from the experts > > from > > > > > Flink > > > > > > > > > > > community. > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2) We have been following FLIP-27 for a while. > > > > Originally > > > > > > we > > > > > > > > were > > > > > > > > > > > > > thinking > > > > > > > > > > > > > > of contributing the connectors back after > > integrating > > > > > with > > > > > > the > > > > > > > > > > > > > > new API introduced in FLIP-27. But we decided to > > > > initiate > > > > > > the > > > > > > > > > > > > > conversation > > > > > > > > > > > > > > as early as possible. Because we believe there are > > > more > > > > > > > > benefits > > > > > > > > > > > doing > > > > > > > > > > > > > > it now rather than later. As part of contribution, > > it > > > > can > > > > > > help > > > > > > > > > > Flink > > > > > > > > > > > > > > community understand more about Pulsar and the > > > > potential > > > > > > > > > > integration > > > > > > > > > > > > > points. > > > > > > > > > > > > > > Also we can also help Flink community verify the > > new > > > > > > connector > > > > > > > > > API > > > > > > > > > > as > > > > > > > > > > > > > well > > > > > > > > > > > > > > as other new API (e.g. catalog API). > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > Sijie > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 4, 2019 at 5:24 AM Becket Qin < > > > > > > > > becket....@gmail.com> > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Yijie, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the interest in contributing the > > Pulsar > > > > > > connector. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > In general, I think having Pulsar connector with > > > > strong > > > > > > > > support > > > > > > > > > > is > > > > > > > > > > > a > > > > > > > > > > > > > > > valuable addition to Flink. So I am happy the > > > > shepherd > > > > > > this > > > > > > > > > > effort. > > > > > > > > > > > > > > > Meanwhile, I would also like to provide some > > > context > > > > > and > > > > > > > > recent > > > > > > > > > > > > > efforts on > > > > > > > > > > > > > > > the Flink connectors ecosystem. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The current way Flink maintains its connector has > > > hit > > > > > the > > > > > > > > > > > scalability > > > > > > > > > > > > > bar. > > > > > > > > > > > > > > > With more and more connectors coming into Flink > > > repo, > > > > > we > > > > > > are > > > > > > > > > > > facing a > > > > > > > > > > > > > few > > > > > > > > > > > > > > > problems such as long build and testing time. To > > > > > address > > > > > > this > > > > > > > > > > > > problem, > > > > > > > > > > > > > we > > > > > > > > > > > > > > > have attempted to do the following: > > > > > > > > > > > > > > > 1. Split out the connectors into a separate > > > > repository. > > > > > > This > > > > > > > > is > > > > > > > > > > > > > temporarily > > > > > > > > > > > > > > > on hold due to potential solution to shorten the > > > > build > > > > > > time. > > > > > > > > > > > > > > > 2. Encourage the connectors to stay as ecosystem > > > > > project > > > > > > > > while > > > > > > > > > > > Flink > > > > > > > > > > > > > tries > > > > > > > > > > > > > > > to provide good support for functionality and > > > > > > compatibility > > > > > > > > > > tests. > > > > > > > > > > > > > Robert > > > > > > > > > > > > > > > has driven to create a Flink Ecosystem project > > > > website > > > > > > and it > > > > > > > > > is > > > > > > > > > > > > going > > > > > > > > > > > > > > > through some final approval process. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Given the above efforts, it would be great to > > first > > > > see > > > > > > if we > > > > > > > > > can > > > > > > > > > > > > have > > > > > > > > > > > > > > > Pulsar connector as an ecosystem project with > > great > > > > > > support. > > > > > > > > It > > > > > > > > > > > would > > > > > > > > > > > > > be > > > > > > > > > > > > > > > good to hear how the Flink Pulsar connector is > > > tested > > > > > > > > currently > > > > > > > > > > to > > > > > > > > > > > > see > > > > > > > > > > > > > if > > > > > > > > > > > > > > > we can learn something to maintain it as an > > > ecosystem > > > > > > project > > > > > > > > > > with > > > > > > > > > > > > good > > > > > > > > > > > > > > > quality and test coverage. If the quality as an > > > > > ecosystem > > > > > > > > > project > > > > > > > > > > > is > > > > > > > > > > > > > hard > > > > > > > > > > > > > > > to guarantee, we may as well adopt it into the > > main > > > > > repo. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > BTW, another ongoing effort is FLIP-27 where we > > are > > > > > > making > > > > > > > > > > changes > > > > > > > > > > > to > > > > > > > > > > > > > the > > > > > > > > > > > > > > > Flink source connector architecture and > > interface. > > > > This > > > > > > > > change > > > > > > > > > > will > > > > > > > > > > > > > likely > > > > > > > > > > > > > > > land in 1.10. Therefore timing wise, if we are > > > going > > > > to > > > > > > have > > > > > > > > > the > > > > > > > > > > > > Pulsar > > > > > > > > > > > > > > > connector in main repo, I am wondering if we > > should > > > > > hold > > > > > > a > > > > > > > > > little > > > > > > > > > > > bit > > > > > > > > > > > > > and > > > > > > > > > > > > > > > let the Pulsar connector adapt to the new > > interface > > > > to > > > > > > avoid > > > > > > > > > > > shortly > > > > > > > > > > > > > > > deprecated work? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 4, 2019 at 4:32 PM Chesnay Schepler < > > > > > > > > > > > ches...@apache.org> > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I'm quite worried that we may end up repeating > > > > > history. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > There were already 2 attempts at contributing a > > > > > pulsar > > > > > > > > > > connector, > > > > > > > > > > > > > both > > > > > > > > > > > > > > > > of which failed because no committer was > > getting > > > > > > involved, > > > > > > > > > > > despite > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > contributor opening a dedicated discussion > > thread > > > > > > about the > > > > > > > > > > > > > contribution > > > > > > > > > > > > > > > > beforehand and getting several +1's from > > > > committers. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We should really make sure that if we > > > > welcome/approve > > > > > > such > > > > > > > > a > > > > > > > > > > > > > > > > contribution it will actually get the attention > > > it > > > > > > > > deserves. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > As such, I'm inclined to recommend maintaining > > > the > > > > > > > > connector > > > > > > > > > > > > outside > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > Flink. We could link to it from the > > documentation > > > > to > > > > > > give > > > > > > > > it > > > > > > > > > > more > > > > > > > > > > > > > > > exposure. > > > > > > > > > > > > > > > > With the upcoming page for sharing artifacts > > > among > > > > > the > > > > > > > > > > community > > > > > > > > > > > > > (what's > > > > > > > > > > > > > > > > the state of that anyway?), this may be a > > better > > > > > > option. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 04/09/2019 10:16, Till Rohrmann wrote: > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > thanks a lot for starting this discussion > > > Yijie. > > > > I > > > > > > think > > > > > > > > > the > > > > > > > > > > > > Pulsar > > > > > > > > > > > > > > > > > connector would be a very valuable addition > > > since > > > > > > Pulsar > > > > > > > > > > > becomes > > > > > > > > > > > > > more > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > more popular and it would further expand > > > Flink's > > > > > > > > > > > > interoperability. > > > > > > > > > > > > > Also > > > > > > > > > > > > > > > > > from a project perspective it makes sense for > > > me > > > > to > > > > > > place > > > > > > > > > the > > > > > > > > > > > > > connector > > > > > > > > > > > > > > > > in > > > > > > > > > > > > > > > > > the downstream project. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > My main concern/question is how can the Flink > > > > > > community > > > > > > > > > > > maintain > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > connector? We have seen in the past that > > > > connectors > > > > > > are > > > > > > > > > some > > > > > > > > > > of > > > > > > > > > > > > the > > > > > > > > > > > > > > > most > > > > > > > > > > > > > > > > > actively developed components because they > > need > > > > to > > > > > be > > > > > > > > kept > > > > > > > > > in > > > > > > > > > > > > sync > > > > > > > > > > > > > with > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > external system and with Flink. Given that > > the > > > > > Pulsar > > > > > > > > > > community > > > > > > > > > > > > is > > > > > > > > > > > > > > > > willing > > > > > > > > > > > > > > > > > to help with maintaining, improving and > > > evolving > > > > > the > > > > > > > > > > connector, > > > > > > > > > > > > I'm > > > > > > > > > > > > > > > > > optimistic that we can achieve this. Hence, > > +1 > > > > for > > > > > > > > > > contributing > > > > > > > > > > > > it > > > > > > > > > > > > > back > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > Flink. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 4, 2019 at 2:03 AM Sijie Guo < > > > > > > > > > guosi...@gmail.com > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> Hi Yun, > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> Since I was the main driver behind > > FLINK-9641 > > > > and > > > > > > > > > > FLINK-9168, > > > > > > > > > > > > let > > > > > > > > > > > > > me > > > > > > > > > > > > > > > > try to > > > > > > > > > > > > > > > > >> add more context on this. > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> FLINK-9641 and FLINK-9168 was created for > > > > bringing > > > > > > > > Pulsar > > > > > > > > > as > > > > > > > > > > > > > source > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > >> sink for Flink. The integration was done > > with > > > > > Flink > > > > > > > > 1.6.0. > > > > > > > > > > We > > > > > > > > > > > > > sent out > > > > > > > > > > > > > > > > pull > > > > > > > > > > > > > > > > >> requests about a year ago and we ended up > > > > > > maintaining > > > > > > > > > those > > > > > > > > > > > > > connectors > > > > > > > > > > > > > > > > in > > > > > > > > > > > > > > > > >> Pulsar for Pulsar users to use Flink to > > > process > > > > > > event > > > > > > > > > > streams > > > > > > > > > > > in > > > > > > > > > > > > > > > Pulsar. > > > > > > > > > > > > > > > > >> (See > > > > > > > > > > > > > https://github.com/apache/pulsar/tree/master/pulsar-flink > > > > > > > > > > > > ). > > > > > > > > > > > > > The > > > > > > > > > > > > > > > > Flink > > > > > > > > > > > > > > > > >> 1.6 integration is pretty simple and there > > is > > > no > > > > > > schema > > > > > > > > > > > > > > > considerations. > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> In the past year, we have made a lot of > > > changes > > > > in > > > > > > > > Pulsar > > > > > > > > > > and > > > > > > > > > > > > > brought > > > > > > > > > > > > > > > > >> Pulsar schema as the first-class citizen in > > > > > Pulsar. > > > > > > We > > > > > > > > > also > > > > > > > > > > > > > integrated > > > > > > > > > > > > > > > > with > > > > > > > > > > > > > > > > >> other computing engines for processing > > Pulsar > > > > > event > > > > > > > > > streams > > > > > > > > > > > with > > > > > > > > > > > > > > > Pulsar > > > > > > > > > > > > > > > > >> schema. > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> It led us to rethink how to integrate with > > > Flink > > > > > in > > > > > > the > > > > > > > > > best > > > > > > > > > > > > way. > > > > > > > > > > > > > Then > > > > > > > > > > > > > > > > we > > > > > > > > > > > > > > > > >> reimplement the pulsar-flink connectors from > > > the > > > > > > ground > > > > > > > > up > > > > > > > > > > > with > > > > > > > > > > > > > schema > > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > >> bring table API and catalog API as the > > > > first-class > > > > > > > > citizen > > > > > > > > > > in > > > > > > > > > > > > the > > > > > > > > > > > > > > > > >> integration. With that being said, in the > > new > > > > > > > > pulsar-flink > > > > > > > > > > > > > > > > implementation, > > > > > > > > > > > > > > > > >> you can register pulsar as a flink catalog > > and > > > > > > query / > > > > > > > > > > process > > > > > > > > > > > > the > > > > > > > > > > > > > > > event > > > > > > > > > > > > > > > > >> streams using Flink SQL. > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> This is an example about how to use Pulsar > > as > > > a > > > > > > Flink > > > > > > > > > > catalog: > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/streamnative/pulsar-flink/blob/3eeddec5625fc7dddc3f8a3ec69f72e1614ca9c9/README.md#use-pulsar-catalog > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> Yijie has also written a blog post > > explaining > > > > why > > > > > we > > > > > > > > > > > > re-implement > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > flink > > > > > > > > > > > > > > > > >> connector with Flink 1.9 and what are the > > > > changes > > > > > we > > > > > > > > made > > > > > > > > > in > > > > > > > > > > > the > > > > > > > > > > > > > new > > > > > > > > > > > > > > > > >> connector: > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://medium.com/streamnative/use-apache-pulsar-as-streaming-table-with-8-lines-of-code-39033a93947f > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> We believe Pulsar is not just a simple data > > > sink > > > > > or > > > > > > > > source > > > > > > > > > > for > > > > > > > > > > > > > Flink. > > > > > > > > > > > > > > > It > > > > > > > > > > > > > > > > >> actually can be a fully integrated streaming > > > > data > > > > > > > > storage > > > > > > > > > > for > > > > > > > > > > > > > Flink in > > > > > > > > > > > > > > > > many > > > > > > > > > > > > > > > > >> areas (sink, source, schema/catalog and > > > state). > > > > > The > > > > > > > > > > > combination > > > > > > > > > > > > of > > > > > > > > > > > > > > > Flink > > > > > > > > > > > > > > > > >> and Pulsar can create a great streaming > > > > warehouse > > > > > > > > > > architecture > > > > > > > > > > > > for > > > > > > > > > > > > > > > > >> streaming-first, unified data processing. > > > Since > > > > we > > > > > > are > > > > > > > > > > talking > > > > > > > > > > > > to > > > > > > > > > > > > > > > > >> contribute Pulsar integration to Flink here, > > > we > > > > > are > > > > > > also > > > > > > > > > > > > > dedicated to > > > > > > > > > > > > > > > > >> maintain, improve and evolve the integration > > > > with > > > > > > Flink > > > > > > > > to > > > > > > > > > > > help > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > users > > > > > > > > > > > > > > > > >> who use both Flink and Pulsar. > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> Hope this give you a bit more background > > about > > > > the > > > > > > > > pulsar > > > > > > > > > > > flink > > > > > > > > > > > > > > > > >> integration. Let me know what are your > > > thoughts. > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> Thanks, > > > > > > > > > > > > > > > > >> Sijie > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> On Tue, Sep 3, 2019 at 11:54 AM Yun Tang < > > > > > > > > > myas...@live.com> > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >>> Hi Yijie > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >>> I can see that Pulsar becomes more and more > > > > > popular > > > > > > > > > > recently > > > > > > > > > > > > and > > > > > > > > > > > > > very > > > > > > > > > > > > > > > > >> glad > > > > > > > > > > > > > > > > >>> to see more people willing to contribute to > > > > Flink > > > > > > > > > > ecosystem. > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >>> Before any further discussion, would you > > > please > > > > > > give > > > > > > > > some > > > > > > > > > > > > > explanation > > > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > >>> the relationship between this thread to > > > current > > > > > > > > existing > > > > > > > > > > > JIRAs > > > > > > > > > > > > of > > > > > > > > > > > > > > > > pulsar > > > > > > > > > > > > > > > > >>> source [1] and sink [2] connector? Will the > > > > > > > > contribution > > > > > > > > > > > > contains > > > > > > > > > > > > > > > part > > > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > >>> those PRs or totally different > > > implementation? > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >>> [1] > > > > > > https://issues.apache.org/jira/browse/FLINK-9641 > > > > > > > > > > > > > > > > >>> [2] > > > > > > https://issues.apache.org/jira/browse/FLINK-9168 > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >>> Best > > > > > > > > > > > > > > > > >>> Yun Tang > > > > > > > > > > > > > > > > >>> ________________________________ > > > > > > > > > > > > > > > > >>> From: Yijie Shen < > > henry.yijies...@gmail.com> > > > > > > > > > > > > > > > > >>> Sent: Tuesday, September 3, 2019 13:57 > > > > > > > > > > > > > > > > >>> To: dev@flink.apache.org < > > > dev@flink.apache.org > > > > > > > > > > > > > > > > > > > > > >>> Subject: [DISCUSS] Contribute Pulsar Flink > > > > > > connector > > > > > > > > back > > > > > > > > > > to > > > > > > > > > > > > > Flink > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >>> Dear Flink Community! > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >>> I would like to open the discussion of > > > > > contributing > > > > > > > > > Pulsar > > > > > > > > > > > > Flink > > > > > > > > > > > > > > > > >>> connector [0] back to Flink. > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >>> ## A brief introduction to Apache Pulsar > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >>> Apache Pulsar[1] is a multi-tenant, > > > > > > high-performance > > > > > > > > > > > > distributed > > > > > > > > > > > > > > > > >>> pub-sub messaging system. Pulsar includes > > > > > multiple > > > > > > > > > features > > > > > > > > > > > > such > > > > > > > > > > > > > as > > > > > > > > > > > > > > > > >>> native support for multiple clusters in a > > > > Pulsar > > > > > > > > > instance, > > > > > > > > > > > with > > > > > > > > > > > > > > > > >>> seamless geo-replication of messages across > > > > > > clusters, > > > > > > > > > very > > > > > > > > > > > low > > > > > > > > > > > > > > > publish > > > > > > > > > > > > > > > > >>> and end-to-end latency, seamless > > scalability > > > to > > > > > > over a > > > > > > > > > > > million > > > > > > > > > > > > > > > topics, > > > > > > > > > > > > > > > > >>> and guaranteed message delivery with > > > persistent > > > > > > message > > > > > > > > > > > storage > > > > > > > > > > > > > > > > >>> provided by Apache BookKeeper. Nowadays, > > > Pulsar > > > > > has > > > > > > > > been > > > > > > > > > > > > adopted > > > > > > > > > > > > > by > > > > > > > > > > > > > > > > >>> more and more companies[2]. > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >>> ## The status of Pulsar Flink connector > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >>> The Pulsar Flink connector we are planning > > to > > > > > > > > contribute > > > > > > > > > is > > > > > > > > > > > > built > > > > > > > > > > > > > > > upon > > > > > > > > > > > > > > > > >>> Flink 1.9.0 and Pulsar 2.4.0. The main > > > features > > > > > > are: > > > > > > > > > > > > > > > > >>> - Pulsar as a streaming source with > > > > exactly-once > > > > > > > > > guarantee. > > > > > > > > > > > > > > > > >>> - Sink streaming results to Pulsar with > > > > > > at-least-once > > > > > > > > > > > > semantics. > > > > > > > > > > > > > (We > > > > > > > > > > > > > > > > >>> would update this to exactly-once as well > > > when > > > > > > Pulsar > > > > > > > > > gets > > > > > > > > > > > all > > > > > > > > > > > > > > > > >>> transaction features ready in its 2.5.0 > > > > version) > > > > > > > > > > > > > > > > >>> - Build upon Flink new Table API Type > > system > > > > > > > > > (FLIP-37[3]), > > > > > > > > > > > and > > > > > > > > > > > > > can > > > > > > > > > > > > > > > > >>> automatically (de)serialize messages with > > the > > > > > help > > > > > > of > > > > > > > > > > Pulsar > > > > > > > > > > > > > schema. > > > > > > > > > > > > > > > > >>> - Integrate with Flink new Catalog API > > > > > > (FLIP-30[4]), > > > > > > > > > which > > > > > > > > > > > > > enables > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > >>> use of Pulsar topics as tables in Table API > > > as > > > > > > well as > > > > > > > > > SQL > > > > > > > > > > > > > client. > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >>> ## Reference > > > > > > > > > > > > > > > > >>> [0] > > > > https://github.com/streamnative/pulsar-flink > > > > > > > > > > > > > > > > >>> [1] https://pulsar.apache.org/ > > > > > > > > > > > > > > > > >>> [2] > > https://pulsar.apache.org/en/powered-by/ > > > > > > > > > > > > > > > > >>> [3] > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > > > > > > > > > > > > > > > > >>> [4] > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > >>> Best, > > > > > > > > > > > > > > > > >>> Yijie Shen > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >