Hi Dong,

Thanks for the heads-up. It's good to know about this issue before removing
FlinkKafkaConsumer. I will check FLIP-208 and work on it. Thanks!

Best regards,
Jing


On Thu, Nov 17, 2022 at 4:34 PM Dong Lin <lindon...@gmail.com> wrote:

> Hi Jing,
>
> Thanks for opening the discussion. I am not sure we are ready to remove
> FlinkKafkaConsumer. The reason is that existing users of FlinkKafkaConsumer
> who rely on KafkaDeserializationSchema::isEndOfStream() currently have no
> migration path to KafkaSource.
>
> This issue was explained in FLIP-208:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-208%3A+Add+RecordEvaluator+to+dynamically+stop+source+based+on+de-serialized+records
> The design was pretty much ready. I didn't start the voting thread because
> Fabian said he wanted more time to explore alternative solutions. My
> priorities changed recently and I don't plan to get this FLIP done for 1.17.
> It would be great if someone could address this issue so that we can move
> forward with removing FlinkKafkaConsumer.
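>
> To make the gap concrete, here is a rough sketch (assuming a simple string
> topic, an illustrative "END" marker record, and the usual imports) of how
> such users express the stop condition today with the legacy consumer:
>
>     public class StopOnMarkerSchema implements KafkaDeserializationSchema<String> {
>         @Override
>         public boolean isEndOfStream(String nextElement) {
>             // Stop reading a partition once the marker record shows up.
>             return "END".equals(nextElement);
>         }
>
>         @Override
>         public String deserialize(ConsumerRecord<byte[], byte[]> record) {
>             return new String(record.value(), StandardCharsets.UTF_8);
>         }
>
>         @Override
>         public TypeInformation<String> getProducedType() {
>             return Types.STRING;
>         }
>     }
>
> With FLIP-208, the same condition would move into a RecordEvaluator passed
> to the KafkaSource builder. The names below follow the current FLIP draft
> and may still change before the FLIP is voted on:
>
>     KafkaSource<String> source = KafkaSource.<String>builder()
>         .setTopics("input-topic")
>         .setValueOnlyDeserializer(new SimpleStringSchema())
>         // Proposed in FLIP-208; not available in any released version yet.
>         .setEofRecordEvaluator(record -> "END".equals(record))
>         .build();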
>
> Thanks,
> Dong
>
>
>
>
>
> On Fri, Nov 11, 2022 at 8:53 PM Jing Ge <j...@ververica.com.invalid>
> wrote:
>
> > Hi all,
> >
> > Thank you all for the informative feedback. I see there is a need to
> > improve the documentation regarding the migration from FlinkKafkaConsumer
> > to KafkaSource. I've filed a ticket [1] and linked it to [2]. This
> > shouldn't block the removal of FlinkKafkaConsumer.
> >
> > Given there is some ongoing SinkV2 work, I will start a vote limited to
> > removing FlinkKafkaConsumer and graduating the related APIs. As a
> > follow-up, I will sync with Yun Gao before the 1.17 code freeze to check
> > whether we can start a second vote to remove FlinkKafkaProducer with 1.17.
> >
> > Best regards,
> > Jing
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-29999
> > [2] https://issues.apache.org/jira/browse/FLINK-28302
> >
> >
> > On Wed, Nov 2, 2022 at 11:39 AM Martijn Visser <martijnvis...@apache.org>
> > wrote:
> >
> > > Hi David,
> > >
> > > I believe that for the DataStream API this is indeed documented [1], but
> > > it might be missed given that there is a lot of documentation and you
> > > need to know that your problem is related to idleness. For the Table API
> > > I think this is never mentioned, so it should definitely at least be
> > > documented there.
> > >
> > > Thanks,
> > >
> > > Martijn
> > >
> > > [1]
> > > https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/datastream/kafka/#idleness
> > >
> > > On Wed, Nov 2, 2022 at 11:28 AM David Anderson <dander...@apache.org>
> > > wrote:
> > >
> > > > >
> > > > > For the partition idleness problem could you elaborate more about
> > > > > it? I assume both FlinkKafkaConsumer and KafkaSource need a
> > > > > WatermarkStrategy to decide whether to mark the partition as idle.
> > > >
> > > >
> > > > As a matter of fact, no, that's not the case -- which is why I
> > > > mentioned it.
> > > >
> > > > The FlinkKafkaConsumer automatically treats all initially empty (or
> > > > non-existent) partitions as idle, while the KafkaSource only does this
> > > > if the WatermarkStrategy specifies that idleness handling is desired
> > > > by configuring withIdleness. This can be a source of confusion for
> > > > folks upgrading to the new connector. It most often shows up in
> > > > situations where the number of Kafka partitions is less than the
> > > > parallelism of the connector, which is a rather common occurrence in
> > > > development and testing environments.
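> > > >
> > > > For example (a minimal sketch, assuming the usual imports, an existing
> > > > StreamExecutionEnvironment "env", and arbitrary topic/group/duration
> > > > values), this is what opting in to idleness handling looks like with
> > > > the new connector:
> > > >
> > > >     KafkaSource<String> source = KafkaSource.<String>builder()
> > > >         .setBootstrapServers("localhost:9092")
> > > >         .setTopics("input-topic")
> > > >         .setGroupId("my-group")
> > > >         .setValueOnlyDeserializer(new SimpleStringSchema())
> > > >         .build();
> > > >
> > > >     // Without withIdleness, an initially empty partition never emits a
> > > >     // watermark and can hold back the watermark of the whole pipeline.
> > > >     WatermarkStrategy<String> strategy = WatermarkStrategy
> > > >         .<String>forBoundedOutOfOrderness(Duration.ofSeconds(5))
> > > >         .withIdleness(Duration.ofMinutes(1));
> > > >
> > > >     env.fromSource(source, strategy, "Kafka Source");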
> > > >
> > > > I believe this change in behavior was made deliberately, so as to
> > > > create a more consistent experience across all FLIP-27 connectors.
> > > > This isn't something that needs to be fixed, but it does need to be
> > > > communicated more clearly. Unfortunately, the whole idleness mechanism
> > > > remained significantly broken until 1.16 (considering the impact of
> > > > [1] and [2]), further complicating the situation. Because of
> > > > FLINK-28975 [2], users with partitions that are initially empty may
> > > > have problems with versions before 1.15.3 (still unreleased) and
> > > > 1.16.0. See [3] for an example of this confusion.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-18934 (idleness didn't
> > > > work with connected streams)
> > > > [2] https://issues.apache.org/jira/browse/FLINK-28975 (idle streams
> > > > could never become active again)
> > > > [3]
> > > > https://stackoverflow.com/questions/70096166/parallelism-in-flink-kafka-source-causes-nothing-to-execute/70101290#70101290
> > > >
> > > > Best,
> > > > David
> > > >
> > > > On Wed, Nov 2, 2022 at 5:26 AM Qingsheng Ren <re...@apache.org> wrote:
> > > >
> > > > > Thanks Jing for starting the discussion.
> > > > >
> > > > > +1 for removing FlinkKafkaConsumer, as KafkaSource has evolved over
> > > > > many release cycles and should be stable enough. I have some concerns
> > > > > about the new Kafka sink based on sink v2, as sink v2 still has some
> > > > > ongoing work in 1.17 (maybe Yun Gao could provide some input). Also,
> > > > > we found some issues in KafkaSink related to the internal mechanism
> > > > > of sink v2, like FLINK-29492.
> > > > >
> > > > > @David
> > > > > About the ability of DeserializationSchema#isEndOfStream, FLIP-208 is
> > > > > trying to complete this piece of the puzzle, and Hang Ruan (
> > > > > ruanhang1...@gmail.com) plans to work on it in 1.17. For the partition
> > > > > idleness problem, could you elaborate more on it? I assume both
> > > > > FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy to decide
> > > > > whether to mark the partition as idle.
> > > > >
> > > > > Best,
> > > > > Qingsheng
> > > > > Ververica (Alibaba)
> > > > >
> > > > > On Thu, Oct 27, 2022 at 8:06 PM Jing Ge <j...@ververica.com> wrote:
> > > > >
> > > > > > Hi Dev,
> > > > > >
> > > > > > I'd like to start a discussion about removing FlinkKafkaConsumer and
> > > > > > FlinkKafkaProducer in 1.17.
> > > > > >
> > > > > > It was originally announced that they would be removed with Flink 1.15,
> > > > > > after Flink 1.14 had been released [1]. This was then postponed to the
> > > > > > release after 1.15, which meant removing them with Flink 1.16, but the
> > > > > > documentation was not updated accordingly [2]. I have created PRs to fix
> > > > > > that. Since the 1.16 release branch is under code freeze, it makes sense
> > > > > > to, first, update the documentation to say that FlinkKafkaConsumer will be
> > > > > > removed with Flink 1.17 [3][4], and second, start the discussion about
> > > > > > removing them on the current master branch, i.e. for the coming 1.17
> > > > > > release. I'm all ears and looking forward to your feedback. Thanks!
> > > > > >
> > > > > > Best regards,
> > > > > > Jing
> > > > > >
> > > > > >
> > > > > > [1]
> > > > > > https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/kafka/#kafka-sourcefunction
> > > > > > [2]
> > > > > > https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kafka/#kafka-sourcefunction
> > > > > > [3] https://github.com/apache/flink/pull/21172
> > > > > > [4] https://github.com/apache/flink/pull/21171
> > > > > >
> > > > >
> > > >
> > >
> >
>
