Well, that's a bit off-topic. As for deprecating SourceFunction as the FLIP-27 series of work moves ahead: +1 from my side. It's a significant step towards the unification of batch and streaming :)
Best,
tison.

tison <wander4...@gmail.com> wrote on Mon, Jun 6, 2022 at 21:54:

> The starting point of the version bump and removal question is that downstream projects may have a tough time adapting to new interfaces while Flink stays in the 1.x series, so users may expect it to be an easy task. From my experience, it's really challenging to maintain compatibility between multiple versions of Flink when significant changes are made within the same 1.x version series - users may not be aware that it's almost a major version bump.
>
> Best,
> tison.
>
>
> tison <wander4...@gmail.com> wrote on Mon, Jun 6, 2022 at 21:51:
>
>> One question from my side:
>>
>> As SourceFunction is a @Public interface, we cannot remove it before doing a major version bump (Flink 2.0).
>>
>> Of course it's not a blocker to make such a deprecation and let the new interface step in. My question is whether we have a plan to finally remove the deprecated interfaces, or whether we postpone it until there is a clear plan for Flink 2.0?
>>
>> Best,
>> tison.
>>
>>
>> David Anderson <dander...@apache.org> wrote on Mon, Jun 6, 2022 at 21:35:
>>
>>> >
>>> > David, can you elaborate why you need watermark generation in the source for your data generators?
>>>
>>>
>>> The training exercises should strive to provide examples of best practices. If the exercises and their solutions use
>>>
>>> env.fromSource(source, WatermarkStrategy.noWatermarks(), "name-of-source")
>>>     .map(...)
>>>     .assignTimestampsAndWatermarks(...)
>>>
>>> this will help establish this anti-pattern as the normal way of doing things.
>>>
>>> Most new Flink users are using a KafkaSource with a noWatermarks strategy and a SimpleStringSchema, followed by a map that does the real deserialization, followed by the real watermarking -- because they aren't seeing examples that teach how these interfaces are meant to be used.
>>>
>>> When we redo the sources used in training exercises, I want to avoid these pitfalls.
>>>
>>> David
>>>
>>> On Mon, Jun 6, 2022 at 9:12 AM Konstantin Knauf <kna...@apache.org> wrote:
>>>
>>> > Hi everyone,
>>> >
>>> > very interesting thread. The proposal for deprecation seems to have sparked a very important discussion. Do we know what users struggle with specifically?
>>> >
>>> > Speaking for myself, when I upgraded flink-faker to the new Source API, an unbounded version of the NumberSequenceSource would have been all I needed, but that's just the data generator use case. I think that one could be solved quite easily. David, can you elaborate why you need watermark generation in the source for your data generators?
>>> >
>>> > Cheers,
>>> >
>>> > Konstantin
>>> >
>>> >
>>> > On Sun, Jun 5, 2022 at 17:48, Piotr Nowojski <pnowoj...@apache.org> wrote:
>>> >
>>> > > Also +1 to what David has written. But it doesn't mean we should be waiting indefinitely to deprecate SourceFunction.
>>> > >
>>> > > Best,
>>> > > Piotrek
>>> > >
>>> > > On Sun, Jun 5, 2022 at 16:46, Jark Wu <imj...@gmail.com> wrote:
>>> > >
>>> > > > +1 to David's point.
>>> > > >
>>> > > > Usually, when we deprecate some interfaces, we should point users to the recommended alternatives.
>>> > > > However, implementing the new Source interface for some simple scenarios is too challenging and complex.
>>> > > > We also found it isn't easy to push the internal connector to upgrade to the new Source because
>>> > > > "FLIP-27 are hard to understand, while SourceFunction is easy".
>>> > > >
>>> > > > +1 to make implementing a simple Source easier before deprecating SourceFunction.
>>> > > >
>>> > > > Best,
>>> > > > Jark
>>> > > >
>>> > > >
>>> > > > On Sun, 5 Jun 2022 at 07:29, Jingsong Lee <lzljs3620...@apache.org> wrote:
>>> > > >
>>> > > > > +1 to David and Ingo.
>>> > > > >
>>> > > > > Before we deprecate and remove SourceFunction, we should have some easier APIs to wrap the new Source; the cost of writing a new Source is too high now.
>>> > > > >
>>> > > > >
>>> > > > > Ingo Bürk <airbla...@apache.org> wrote on Sun, Jun 5, 2022 at 05:32:
>>> > > > >
>>> > > > > > I +1 everything David said. The new Source API raised the complexity significantly. It's great to have such a rich, powerful API that can do everything, but in the process we lost the ability to onboard people to the APIs.
>>> > > > > >
>>> > > > > >
>>> > > > > > Best
>>> > > > > > Ingo
>>> > > > > >
>>> > > > > > On 04.06.22 21:21, David Anderson wrote:
>>> > > > > > > I'm in favor of this, but I think we need to make it easier to implement data generators and test sources. As things stand in 1.15, unless you can be satisfied with using a NumberSequenceSource followed by a map, things get quite complicated. I looked into reworking the data generators used in the training exercises, and got discouraged by the amount of work involved. (The sources used in the training want to be unbounded, and need watermarking in the sources, which means that using NumberSequenceSource isn't an option.)
>>> > > > > > >
>>> > > > > > > I think the proposed deprecation will be better received if it can be accompanied by something that makes implementing a simple Source easier than it is now.
>>> > > > > > > People are continuing to implement new SourceFunctions because the interfaces defined by FLIP-27 are hard to understand, while SourceFunction is easy. Alex, I believe you were looking into implementing an easier-to-use building block that could be used in situations like this. Can we get something like that in place first?
>>> > > > > > >
>>> > > > > > > David
>>> > > > > > >
>>> > > > > > > On Fri, Jun 3, 2022 at 4:52 PM Jing Ge <j...@ververica.com> wrote:
>>> > > > > > >
>>> > > > > > >> Hi,
>>> > > > > > >>
>>> > > > > > >> Thanks Alex for driving this!
>>> > > > > > >>
>>> > > > > > >> +1. To give Flink developers, especially connector developers, a clear signal that the new Source API is recommended according to FLIP-27, we should mark them as deprecated.
>>> > > > > > >>
>>> > > > > > >> There are some open questions to discuss:
>>> > > > > > >>
>>> > > > > > >> 1. Do we need to mark all subinterfaces/subclasses as deprecated? e.g. FromElementsFunction, etc. There are many. What are the replacements?
>>> > > > > > >> 2. Do we need to mark all subclasses that have a replacement as deprecated? e.g. ExternallyInducedSource, whose replacement class, if I am not mistaken, ExternallyInducedSourceReader, is @Experimental.
>>> > > > > > >> 3. Do we need to mark all related test utility classes as deprecated?
>>> > > > > > >>
>>> > > > > > >> I think it might make sense to create an umbrella ticket to cover all of these with the following process:
>>> > > > > > >>
>>> > > > > > >> 1. Mark SourceFunction as deprecated asap.
>>> > > > > > >> 2. Mark subinterfaces and subclasses as deprecated if there are graduated replacements. A good example is KafkaSource, which replaced KafkaConsumer, which has been marked as deprecated.
>>> > > > > > >> 3. Do not mark subinterfaces and subclasses as deprecated if the replacement classes are still experimental; check if it is time to graduate them. After graduation, go to step 2. It might take a while for graduation.
>>> > > > > > >> 4. Do not mark subinterfaces and subclasses as deprecated if the replacement classes are experimental and are too young to graduate. We have to wait. But in this case we could create new tickets under the umbrella ticket.
>>> > > > > > >> 5. Do not mark subinterfaces and subclasses as deprecated if there is no replacement at all. We have to create new tickets and wait until the new implementation has been done and graduated. It will take a longer time, roughly 1.5 years.
>>> > > > > > >> 6. For test classes, we could follow the same rule. But I think for some cases, we could consider doing the replacement directly without going through the deprecation phase.
>>> > > > > > >>
>>> > > > > > >> When we look back on all of these, we can see it is a big epic (even bigger than an epic). It needs someone to drive it and stay focused on it continuously, with support from the community, and push development towards the new Source API of FLIP-27.
>>> > > > > > >>
>>> > > > > > >> If we could have consensus for this, Alex and I could create the umbrella ticket to kick it off.
>>> > > > > > >>
>>> > > > > > >> Best regards,
>>> > > > > > >> Jing
>>> > > > > > >>
>>> > > > > > >>
>>> > > > > > >> On Fri, Jun 3, 2022 at 3:54 PM Alexander Fedulov <alexan...@ververica.com> wrote:
>>> > > > > > >>
>>> > > > > > >>> Hi everyone,
>>> > > > > > >>>
>>> > > > > > >>> I would like to start the discussion about marking SourceFunction-based interfaces as deprecated. With the FLIP-27 APIs becoming the new standard, the old ones have to be eventually phased out. Although this state is well known within the community and no new connectors based on the old interfaces can be accepted into the project, the footprint of SourceFunction in the user code still keeps growing (primarily for data generators and test utilities). I believe it is best to mark SourceFunction as deprecated as soon as possible. What do you think?
>>> > > > > > >>>
>>> > > > > > >>> Best,
>>> > > > > > >>> Alexander Fedulov
>>> > > > > > >>>
>>> >
>>> > --
>>> > https://twitter.com/snntrable
>>> > https://github.com/knaufk
>>> >
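
For reference, the deprecation process Jing outlines in the quoted thread (steps 2 to 5) can be read as a simple decision rule on the state of the FLIP-27 replacement. The following is only a rough sketch of that reading: the class, enum, and method names are hypothetical, invented for illustration, and are not part of Flink or of any agreed-upon plan.

```java
/**
 * Hypothetical sketch of the deprecation process discussed in this thread.
 * None of these names exist in Flink; they only illustrate the decision rule.
 */
class DeprecationPlan {

    /** Assumed states of the replacement for a SourceFunction-based class. */
    enum ReplacementStatus {
        GRADUATED,            // a stable (graduated) replacement exists
        EXPERIMENTAL_MATURE,  // replacement is experimental, possibly ready to graduate
        EXPERIMENTAL_YOUNG,   // replacement is experimental and too young to graduate
        NONE                  // no replacement exists yet
    }

    /** Assumed outcomes, one per step of the process. */
    enum Action {
        DEPRECATE_NOW,               // step 2
        GRADUATE_REPLACEMENT_FIRST,  // step 3, then go back to step 2
        WAIT_AND_TRACK_TICKET,       // step 4
        IMPLEMENT_REPLACEMENT_FIRST  // step 5
    }

    /** Maps the state of the replacement to the action suggested in the thread. */
    static Action decide(ReplacementStatus status) {
        switch (status) {
            case GRADUATED:
                return Action.DEPRECATE_NOW;
            case EXPERIMENTAL_MATURE:
                return Action.GRADUATE_REPLACEMENT_FIRST;
            case EXPERIMENTAL_YOUNG:
                return Action.WAIT_AND_TRACK_TICKET;
            case NONE:
                return Action.IMPLEMENT_REPLACEMENT_FIRST;
            default:
                throw new IllegalArgumentException("Unknown status: " + status);
        }
    }

    public static void main(String[] args) {
        // Print the suggested action for every replacement state.
        for (ReplacementStatus status : ReplacementStatus.values()) {
            System.out.println(status + " -> " + decide(status));
        }
    }
}
```

Under this reading, a subclass is only marked as deprecated once its replacement has graduated; every other state leads to work on the replacement first, which matches the concern raised above that users need a usable alternative before the old interface is deprecated.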