KafkaSource

2021-05-17 Thread Alexey Trenikhun
Hello, Is new KafkaSource/KafkaSourceBuilder ready to be used ? If so, is KafkaSource state compatible with legacy FlinkKafkaConsumer, for example if I replace FlinkKafkaConsumer by KafkaSource, will offsets continue from what we had in FlinkKafkaConsumer ? Thanks, Alexey
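
For later readers of this thread: the KafkaSource does not read the legacy consumer's operator state, but if the old job committed its offsets back to Kafka (the default when checkpointing was enabled), the new source can resume from the same consumer group. A minimal sketch, with broker, topic, and group id as placeholders:

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;

public class ContinueFromCommittedOffsets {
    static KafkaSource<String> source() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")   // placeholder
                .setTopics("events")                 // placeholder
                .setGroupId("my-job")                // the group.id the FlinkKafkaConsumer committed under
                // resume from that group's committed offsets; fall back to earliest for a fresh group
                .setStartingOffsets(OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
    }
}
```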

KafkaSource Problem

2021-03-08 Thread Bobby Richard
I'm receiving the following exception when trying to use a KafkaSource from the new DataSource API. Exception in thread "main" java.lang.NullPointerException at org.apache.flink.connector.kafka.source.reader.deserializer.ValueDeserializerWrapper.getProducedType(ValueDeserializer
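
The NullPointerException here was tracked as FLINK-21691 (linked later in this thread). As a hedged sketch of a setup that sidesteps the value-deserializer wrapper entirely, the builder can be given a Flink DeserializationSchema directly; SimpleStringSchema and the connection details below are assumptions, not part of the original report.

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

public class ValueOnlyDeserializerSketch {
    // Builds a KafkaSource whose values are decoded by a Flink DeserializationSchema,
    // which knows its produced type up front instead of deriving it from a Kafka Deserializer class.
    static KafkaSource<String> stringSource() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")   // placeholder
                .setTopics("input")                      // placeholder
                .setGroupId("npe-repro")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
    }
}
```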

KafkaSource metrics

2021-05-24 Thread 陳樺威
Hello, Our team tries to test reactive mode and replace FlinkKafkaConsumer with the new KafkaSource. But we can’t find the KafkaSource metrics list. Does anyone have any idea? In our case, we want to know the Kafka consume delay and consume rate. Thanks, Oscar

Re: KafkaSource

2021-05-28 Thread Matthias Pohl
On Tue, May 18, 2021 at 2:21 AM Alexey Trenikhun wrote: > Hello, > > Is new KafkaSource/KafkaSourceBuilder ready to be used ? If so, is KafkaSource > state compatible with legacy FlinkKafkaConsumer, for example if I replace > FlinkKafkaConsumer > by KafkaSource, will offsets

Re: KafkaSource Problem

2021-03-09 Thread Till Rohrmann
receiving the following exception when trying to use a KafkaSource > from the new DataSource API. > > Exception in thread "main" java.lang.NullPointerException > at > org.apache.flink.connector.kafka.source.reader.deserializer.ValueDeserializerWrapper.getProducedType(Value

Re: KafkaSource Problem

2021-03-09 Thread Bobby Richard
I'm receiving the following exception when trying to use a KafkaSource >> from the new DataSource API. >> >> Exception in thread "main" java.lang.NullPointerException >> at >> org.apache.flink.connector.kafka.source.read

Re: KafkaSource Problem

2021-03-10 Thread Till Rohrmann
>> >> [1] https://issues.apache.org/jira/browse/FLINK-21691 >> >> Cheers, >> Till >> >> On Mon, Mar 8, 2021 at 3:35 PM Bobby Richard >> wrote: >> >>> I'm receiving the following exception when trying to u

Re: KafkaSource metrics

2021-05-24 Thread Ardhani Narasimha
mode makes any difference. On Mon, May 24, 2021 at 7:44 PM 陳樺威 wrote: > Hello, > > Our team tries to test reactive mode and replace FlinkKafkaConsumer with > the new KafkaSource. > But we can’t find the KafkaSource metrics list. Does anyone have any idea? > In our case, we want

Re: KafkaSource metrics

2021-05-24 Thread 陳樺威
Hi Ardhani, Thanks for your kindly reply. Our team use your provided metrics before, but the metrics disappear after migrate to new KafkaSource. We initialize KafkaSource in following code. ``` val consumer: KafkaSource[T] = KafkaSource.builder() .setProperties(properties) .setTopics(topic

Re: KafkaSource metrics

2021-05-25 Thread Qingsheng Ren
Email: renqs...@gmail.com On May 25, 2021, 2:35 PM +0800, 陳樺威 , wrote: > Hi Ardhani, > > Thanks for your kindly reply. > > Our team use your provided metrics before, but the metrics disappear after > migrate to new KafkaSource. > > We initialize KafkaSource in following

Re: KafkaSource metrics

2021-05-25 Thread Alexey Trenikhun
Looks like when KafkaSource is used instead of FlinkKafkaConsumer, metrics listed below are not available. Bug? Work in progress? Thanks, Alexey From: Ardhani Narasimha Sent: Monday, May 24, 2021 9:08 AM To: 陳樺威 Cc: user Subject: Re: KafkaSource metrics Use

Re: KafkaSource metrics

2021-05-26 Thread Alexey Trenikhun
Found https://issues.apache.org/jira/browse/FLINK-22766 From: Alexey Trenikhun Sent: Tuesday, May 25, 2021 3:25 PM To: Ardhani Narasimha ; 陳樺威 ; Flink User Mail List Subject: Re: KafkaSource metrics Looks like when KafkaSource is used instead of

KafkaSource vs FlinkKafkaConsumer010

2022-01-31 Thread HG
Hello all I am confused. What is the difference between KafkaSource as defined in : https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/kafka/ and FlinkKafkaConsumer010 as defined in https://nightlies.apache.org/flink/flink-docs-release- 1.2/api/java/org/apache

KafkaSource consumer group

2023-03-30 Thread Roberts, Ben (Senior Developer) via user
Hi, Is there a way to run multiple flink jobs with the same Kafka group.id and have them join the same consumer group? It seems that setting the group.id using KafkaSource.builder().set_group_id() does not have the effect of creating an actual consumer group in Kafka. Running the same flink jo

FlinkKafkaConsumer -> KafkaSource State Migration

2021-10-26 Thread Mason Chen
Hi all, I read these instructions for migrating to the KafkaSource: https://nightlies.apache.org/flink/flink-docs-release-1.14/release-notes/flink-1.14/#deprecate-flinkkafkaconsumer . Do we need to employ any uid/allowNonRestoredState tricks if our Flink job is also stateful outside of the

Re: KafkaSource vs FlinkKafkaConsumer010

2022-01-31 Thread Francesco Guardiani
The latter link you posted refers to a very old flink release. You shold use the first link, which refers to latest release FG On Tue, Feb 1, 2022 at 8:20 AM HG wrote: > Hello all > > I am confused. > What is the difference between KafkaSource as defined in : > https://night

Re: KafkaSource vs FlinkKafkaConsumer010

2022-02-01 Thread HG
Hello Francesco Perhaps I copied the wrong link of 1.2. But there is also https://nightlies.apache.org/flink/flink-docs-release-1.4/api/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer010.html It seems there are 2 ways to use Kafka KafkaSource source = KafkaSource.builder
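
To make the two styles concrete, a hedged side-by-side sketch, using the universal FlinkKafkaConsumer for the old API (the 010-specific class is gone in recent releases); topic, servers, and schema are placeholders. In a real job only one of the two would be kept.

```java
import java.util.Properties;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class TwoWaysToReadKafka {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Old style: legacy consumer added as a plain SourceFunction.
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092");
        props.setProperty("group.id", "legacy-job");
        env.addSource(new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props)).print();

        // New style: unified FLIP-27 KafkaSource attached with fromSource.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")
                .setTopics("events")
                .setGroupId("new-job")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source").print();

        env.execute("kafka-api-comparison");
    }
}
```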

Re: KafkaSource vs FlinkKafkaConsumer010

2022-02-01 Thread Francesco Guardiani
I think the FlinkKakfaConsumer010 you're talking about is the old source api. You should use only KafkaSource now, as they use the new source infrastructure. On Tue, Feb 1, 2022 at 9:02 AM HG wrote: > Hello Francesco > Perhaps I copied the wrong link of 1.2. > But there i

Re: KafkaSource vs FlinkKafkaConsumer010

2022-02-01 Thread David Anderson
in Flink 1.14 in favor of KafkaSource, which implements the unified batch/streaming interface defined in FLIP-27. Regards, David On Tue, Feb 1, 2022 at 9:21 AM Francesco Guardiani wrote: > I think the FlinkKakfaConsumer010 you're talking about is the old source > api. You shou

Flink 1.12.1 and KafkaSource

2022-02-02 Thread Marco Villalobos
According to the Flink 1.12 documentation ( https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/connectors/kafka.html), it states to use FlinkKafkaSource when consuming from Kafka. However, I noticed that the newer API uses KafkaSource, which uses KafkaSourceBuilder and

Re: KafkaSource consumer group

2023-03-30 Thread Tzu-Li (Gordon) Tai
Hi Robert, This is a design choice. Flink's KafkaSource doesn't rely on consumer groups for assigning partitions / rebalancing / offset tracking. It manually assigns whatever partitions are in the specified topic across its consumer instances, and rebalances only when the Flink job / Ka
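
In code form, a sketch of what this design choice means in practice (names are placeholders): the group.id on the builder is only used when offsets are committed back to Kafka, e.g. for lag monitoring, while partition assignment and offset tracking stay inside Flink, so two jobs with the same group.id will each read every partition rather than splitting them.

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

public class GroupIdSemanticsSketch {
    static KafkaSource<String> source() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")   // placeholder
                .setTopics("events")                 // placeholder
                // Used only when committing offsets back to Kafka (visible to lag tooling);
                // it does NOT enroll the readers in a Kafka consumer group, so partitions are
                // assigned by Flink's split enumerator, not by Kafka's group coordinator.
                .setGroupId("my-flink-job")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
    }
}
```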

Re: KafkaSource consumer group

2023-03-30 Thread Tzu-Li (Gordon) Tai
Sorry, I meant to say "Hi Ben" :-) On Thu, Mar 30, 2023 at 9:52 AM Tzu-Li (Gordon) Tai wrote: > Hi Robert, > > This is a design choice. Flink's KafkaSource doesn't rely on consumer > groups for assigning partitions / rebalancing / offset tracking. It > manua

Handling Bounded Sources with KafkaSource

2021-03-12 Thread Rion Williams
Hi all, I've been using the KafkaSource API as opposed to the classic consumer and things have been going well. I configured my source such that it could be used in either a streaming or bounded mode, with the bounded approach specifically aimed at improving testing (unit/integration).
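
A sketch of the kind of dual setup described here, assuming string payloads and a locally reachable broker: the same builder, with setBounded added for the integration-test variant so the job finishes once it has read up to the offsets that were the latest when it started.

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.KafkaSourceBuilder;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

public class BoundedOrStreamingSource {
    private static KafkaSourceBuilder<String> baseBuilder() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")   // placeholder
                .setTopics("test-topic")                 // placeholder
                .setGroupId("it-tests")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema());
    }

    /** Production: unbounded streaming source. */
    static KafkaSource<String> streaming() {
        return baseBuilder().build();
    }

    /** Tests: stop at the latest offsets seen at startup, then let the tasks finish. */
    static KafkaSource<String> bounded() {
        return baseBuilder().setBounded(OffsetsInitializer.latest()).build();
    }
}
```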

RE: FlinkKafkaConsumer -> KafkaSource State Migration

2021-10-27 Thread Schwalbe Matthias
I would also be interested on instructions/discussion on how to state-migrate from pre-unified sources/sinks to unified ones (Kafka) 😊 Thias From: Mason Chen Sent: Mittwoch, 27. Oktober 2021 01:52 To: user Subject: FlinkKafkaConsumer -> KafkaSource State Migration Hi all, I read th

Re: FlinkKafkaConsumer -> KafkaSource State Migration

2021-10-27 Thread Fabian Paul
Hi, Sorry for the late reply but most of use were involved in the Flink Forward conference. The upgrade strategies for the Kafka sink and source are pretty similar. Source and sink do not rely on state migration but leveraging Kafka as source of truth. When running with FlinkKafkaConsumer Maso
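
The concrete steps that usually accompany this answer, per the migration notes referenced earlier in the thread (all names below are placeholders): stop the old job with a savepoint so the consumer commits its offsets, switch to a KafkaSource that starts from committed offsets, give the new source a different uid, and restore with allowNonRestoredState so the old consumer's operator state can be dropped. A sketch:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;

public class FlinkKafkaConsumerToKafkaSource {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")
                .setTopics("events")
                .setGroupId("my-job")   // same group.id as before, so the committed offsets are found
                .setStartingOffsets(OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
                // deliberately NOT the uid the old FlinkKafkaConsumer operator had
                .uid("kafka-source-v2")
                .print();

        // Restore with something like:
        //   flink run -s <savepoint-path> --allowNonRestoredState ...
        // so the state of the removed FlinkKafkaConsumer operator is ignored.
        env.execute("kafka-source-migration");
    }
}
```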

RE: FlinkKafkaConsumer -> KafkaSource State Migration

2021-11-01 Thread Schwalbe Matthias
Thanks Fabian, That was the information I was missing. (Late reply ... same here, FlinkForward 😊 ) Thias -Original Message- From: Fabian Paul Sent: Donnerstag, 28. Oktober 2021 08:38 To: Schwalbe Matthias Cc: Mason Chen ; user Subject: Re: FlinkKafkaConsumer -> KafkaSou

Re: Flink 1.12.1 and KafkaSource

2022-02-07 Thread Chesnay Schepler
documentation (https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/connectors/kafka.html), it states to use FlinkKafkaSource when consuming from Kafka. However, I noticed that the newer API uses KafkaSource, which uses KafkaSourceBuilder and OffsetsInitializer. Although I am on the Flink

RE: Re: KafkaSource consumer group

2023-03-31 Thread Roberts, Ben (Senior Developer) via user
guess we’ll need to just run it in a single cluster and be aware of the risks if we lost that cluster. Thanks, Ben On 2023/03/30 16:52:31 "Tzu-Li (Gordon) Tai" wrote: > Hi Robert, > > This is a design choice. Flink's KafkaSource doesn't rely on consumer >

Re: Re: KafkaSource consumer group

2023-03-31 Thread Andrew Otto
Otherwise I guess we’ll need to just run it in > a single cluster and be aware of the risks if we lost that cluster. > > Thanks, > Ben > > On 2023/03/30 16:52:31 "Tzu-Li (Gordon) Tai" wrote: > > Hi Robert, > > > > This is a design choice. Flin

Re: Handling Bounded Sources with KafkaSource

2021-03-13 Thread Rion Williams
would be streaming and unbounded, however I’d love to have a reliable integration test or a pattern that I could use to guarantee the processing of a finite set of data via a KafkaSource (I.e. send finite records to Kafka, read from topic, process all records, apply assertion after processing

Re: Handling Bounded Sources with KafkaSource

2021-03-14 Thread Maciej Obuchowski
that I could use to guarantee the processing of a finite set of > data via a KafkaSource (I.e. send finite records to Kafka, read from topic, > process all records, apply assertion after processing). > > Any ideas/recommendations/workarounds would be greatly welcome and I’d be > happy to share

unpredictable behaviour on KafkaSource deserialisation error

2022-02-09 Thread Frank Dekervel
Hello, When trying to reproduce a bug, we made a DeserialisationSchema that throws an exception when a malformed message comes in. Then, we sent a malformed message together with a number of well formed messages to see what happens. valsource= KafkaSource.builder[OurMessage]() .setValueOnlyDe
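
For readers who want the opposite behaviour (keep the job running instead of failing on a malformed message), one common pattern is to catch inside the deserializer and simply emit nothing. A sketch under assumptions: the class name is hypothetical and the UTF-8 decode stands in for real parsing logic.

```java
import java.nio.charset.StandardCharsets;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;
import org.apache.flink.util.Collector;
import org.apache.kafka.clients.consumer.ConsumerRecord;

/** Drops malformed records instead of failing the job. */
public class SkippingDeserializer implements KafkaRecordDeserializationSchema<String> {

    @Override
    public void deserialize(ConsumerRecord<byte[], byte[]> record, Collector<String> out) {
        try {
            // replace with the real parsing logic; here we just decode UTF-8
            out.collect(new String(record.value(), StandardCharsets.UTF_8));
        } catch (Exception e) {
            // log or side-output the bad record if needed; emitting nothing skips it
        }
    }

    @Override
    public TypeInformation<String> getProducedType() {
        return Types.STRING;
    }
}
```

It is wired in with .setDeserializer(new SkippingDeserializer()) on the KafkaSource builder.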

Problem with KafkaSource and watermark idleness

2022-08-09 Thread Yan Shen
Hi, I am using a org.apache.flink.connector.kafka.source.KafkaSource with a watermark strategy like this: WatermarkStrategy.forMonotonousTimestamps().withIdleness(Duration.ofSeconds(10)) I noticed that after a short while all the partitions seem to be marked as idle even though there are message

Flink KafkaSource still referencing deleted topic

2022-10-04 Thread Robert Cullen
We've changed the KafkaSource to ingest from a new topic but the old name is still being referenced: 2022-10-04 07:03:41org.apache.flink.util.FlinkException: Global failure triggered by OperatorCoordinator for 'Source: Grokfailures' (operator feca28aff5a3958840bee985e

Moving from flinkkafkaconsumer to kafkasource issues

2023-04-20 Thread naga sudhakar
Hi Team, Greetings of the day.. we are on flink 1.16.1 version and using flinkkafkaconsumer today. When I replaced it with kafkasource, it's failing with not able to connect with kafka jaas configuration. Error says Kafka client entry not found in /tmp/jass config file. We are passing the
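
One thing that often differs between the two connectors' setups is how the JAAS configuration reaches the Kafka client. A hedged sketch (mechanism, credentials, broker, and topic are placeholders) of passing the SASL settings directly as client properties on the builder, so no JAAS file or java.security.auth.login.config system property is required:

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

public class SaslKafkaSourceSketch {
    static KafkaSource<String> source() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9093")   // placeholder
                .setTopics("secure-topic")           // placeholder
                .setGroupId("secure-job")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                // Standard Kafka client security settings, passed straight to the consumer:
                .setProperty("security.protocol", "SASL_PLAINTEXT")
                .setProperty("sasl.mechanism", "PLAIN")
                .setProperty("sasl.jaas.config",
                        "org.apache.kafka.common.security.plain.PlainLoginModule required "
                                + "username=\"user\" password=\"secret\";")
                .build();
    }
}
```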

Flink KafkaSource failure on empty partitions

2023-09-06 Thread David Clutter
I am using Flink 1.13.1 on AWS EMR and I seem to have hit this bug: https://issues.apache.org/jira/browse/FLINK-27041. My job will fail when there are empty partitions. I see it is fixed in a newer version of Flink but I cannot update Flink version at this time. Suggestions on a workaround? I a

Issues about removed topics with KafkaSource

2023-10-31 Thread Emily Li via user
Hey We have a flinkapp which is subscribing to multiple topics, we recently upgraded our application from 1.13 to 1.15, which we started to use KafkaSource instead of FlinkKafkaConsumer (deprecated). But we noticed some weird issue with KafkaSource after the upgrade, we are setting the topics

Re: unpredictable behaviour on KafkaSource deserialisation error

2022-02-14 Thread Niklas Semmler
Hi Frank, This sounds like an interesting issue. Can you share a minimal working example? Best regards, Niklas > On 9. Feb 2022, at 23:11, Frank Dekervel wrote: > > Hello, > > When trying to reproduce a bug, we made a DeserialisationSchema that throws > an exception when a malformed message

Re: Problem with KafkaSource and watermark idleness

2022-08-13 Thread Yan Shen
Hi all, After examining the source code further, I am quite sure org.apache.flink.api.common.eventtime.WatermarksWithIdleness does not work with FLIP-27 sources. In org.apache.flink.streaming.api.operators.SourceOperator, there are separate instances of WatermarksWithIdleness created for each spl

Re: Problem with KafkaSource and watermark idleness

2022-08-14 Thread David Anderson
Although I'm not very familiar with the design of the code involved, I also looked at the code, and I'm inclined to agree with you that this is a bug. Please do raise an issue. I'm wondering how you noticed this. I was thinking about how to write a failing test, and I'm wondering if this has some

Re: Problem with KafkaSource and watermark idleness

2022-08-14 Thread Yan Shen
Thanks David, I am working on a flink datastream job that does a temporal join of two kafka topics based on watermarks. The problem was quite obvious when I enabled idleness and data flowed through much faster with different results even though the topics were not idle. Regards. On Mon, Aug 15,

Re: Problem with KafkaSource and watermark idleness

2022-08-15 Thread David Anderson
Yan, I've created https://issues.apache.org/jira/browse/FLINK-28975 to track this. Regards, David On Sun, Aug 14, 2022 at 6:38 PM Yan Shen wrote: > Thanks David, > > I am working on a flink datastream job that does a temporal join of two > kafka topics based on watermarks. The problem was quite

KafkaSource watermarkLag metrics per topic per partition

2022-09-08 Thread Alexey Trenikhun
Hello, Is there way to configure Flink to expose watermarLag metric per topic per partition? I think it could be useful to detect data skew between partitions Thanks, Alexey

Re: Flink KafkaSource still referencing deleted topic

2022-10-04 Thread Martijn Visser
Hi Robert, Based on https://stackoverflow.com/questions/72870074/apache-flink-restoring-state-from-checkpoint-with-changes-kafka-topic I think you'll need to change the UID for your KafkaSource and restart your job with allowNonRestoredState enabled. Best regards, Martijn On Tue, Oct 4,

Re: Flink KafkaSource still referencing deleted topic

2022-10-04 Thread Mason Chen
Hi Martjin, I notice that this question comes up quite often. Would this be a good addition to the KafkaSource documentation? I'd be happy to contribute to the documentation. Best, Mason On Tue, Oct 4, 2022 at 11:23 AM Martijn Visser wrote: > Hi Robert, > > Ba

Re: Flink KafkaSource still referencing deleted topic

2022-10-04 Thread Martijn Visser
Hi Mason, Definitely! Feel free to open a PR and ping me for a review. Cheers, Martijn On Tue, Oct 4, 2022 at 3:51 PM Mason Chen wrote: > Hi Martjin, > > I notice that this question comes up quite often. Would this be a good > addition to the KafkaSource documentation? I&#

Re: Flink KafkaSource still referencing deleted topic

2022-10-12 Thread Robert Cullen
Feel free to open a PR and ping me for a review. > > Cheers, Martijn > > On Tue, Oct 4, 2022 at 3:51 PM Mason Chen wrote: > >> Hi Martjin, >> >> I notice that this question comes up quite often. Would this be a good >> addition to the KafkaSource documentatio

Re: Flink KafkaSource still referencing deleted topic

2022-10-12 Thread Hang Ruan
Cheers, Martijn >> >> On Tue, Oct 4, 2022 at 3:51 PM Mason Chen wrote: >> >>> Hi Martjin, >>> >>> I notice that this question comes up quite often. Would this be a good >>> addition to the KafkaSource documentation? I'd be happy to contribu

Accessing kafka message key from a KafkaSource

2022-12-07 Thread Noel OConnor
Hi, I'm using a kafka source to read in messages from kafka into a datastream. However I can't seem to access the key of the kafka message in the datastream. Is this even possible ? cheers Noel

KafkaSource and Event Time in Message Payload

2022-12-08 Thread Niklas Wilcke
Hi Flink Community, I have a few questions regarding the new KafkaSource and event time, which I wasn't able to answer myself via checking the docs, but please point me to the right pages in case I missed something. I'm not entirely whether my knowledge entirely holds for the new K

Re: Moving from flinkkafkaconsumer to kafkasource issues

2023-04-20 Thread Shammon FY
> When I replaced it with kafkasource, it's failing with not able to connect > with kafka jaas configuration. Error says Kafka client entry not found in > /tmp/jass config file. We are passing the flink runtime arg for the > security.auth.login conf details. Same was working g w

Re: Moving from flinkkafkaconsumer to kafkasource issues

2023-04-21 Thread Martijn Visser
it may be useful for > positioning the issue > > Best, > Shammon FY > > > On Fri, Apr 21, 2023 at 12:56 AM naga sudhakar > wrote: > >> Hi Team, >> Greetings of the day.. >> we are on flink 1.16.1 version and using flinkkafkaconsumer today. >> When

Re: Issues about removed topics with KafkaSource

2023-11-01 Thread Martijn Visser
viewpage.action?pageId=217389320 On Wed, Nov 1, 2023 at 7:29 AM Emily Li via user wrote: > > Hey > > We have a flinkapp which is subscribing to multiple topics, we recently > upgraded our application from 1.13 to 1.15, which we started to use > KafkaSource instead of FlinkKafka

Re: Issues about removed topics with KafkaSource

2023-11-01 Thread Emily Li via user
ng this feature? For our current situation, we are subscribing to hundreds of topics, and we add/remove topics quite often (every few days probably), adding topics seems to be okay at the moment, but with the current KafkaSource design, if removing a topic means we need to change the kafka soure

Re: Issues about removed topics with KafkaSource

2023-11-02 Thread Hector Rios
's any plan in releasing > this feature? > > For our current situation, we are subscribing to hundreds of topics, and > we add/remove topics quite often (every few days probably), adding topics > seems to be okay at the moment, but with the current KafkaSource design, if > re

KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread Lars Skjærven
Using KafkaSource builder with a job parallelism larger than the number of kafka partitions, the job is unable to checkpoint. With a job parallelism of 4, 3 of the tasks are marked as FINISHED for the kafka topic with one partition. For this reason checkpointing seems to be disabled. When using

How to refresh topics to ingest with KafkaSource?

2021-10-13 Thread Preston Price
The KafkaSource, and KafkaSourceBuilder appear to prevent users from providing their own KafkaSubscriber. Am I overlooking something? In my case I have an external system that controls which topics we should be ingesting, and it can change over time. I need to add, and remove topics as we refresh
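
If the externally controlled topic set can be expressed as a naming pattern (an assumption, not something stated in the question), a hedged sketch of pattern subscription combined with periodic partition discovery; otherwise a custom KafkaSubscriber, which the builder did not expose at the time of this thread, would indeed be needed.

```java
import java.util.regex.Pattern;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

public class TopicPatternSourceSketch {
    static KafkaSource<String> source() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")                // placeholder
                .setTopicPattern(Pattern.compile("ingest\\..*"))  // hypothetical naming convention
                .setGroupId("ingest-job")
                .setStartingOffsets(OffsetsInitializer.earliest())
                // With periodic discovery enabled, new partitions (and new topics matching
                // the pattern) are picked up while the job is running.
                .setProperty("partition.discovery.interval.ms", "60000")
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
    }
}
```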

Hybrid Source with Parquet Files from GCS + KafkaSource

2021-12-09 Thread Meghajit Mazumdar
data from a locally saved Parquet File, and a KafkaSource consuming events from a remote Kafka broker. I was wondering if instead of using a local Parquet file, whether it is possible to directly stream the file from a GCS bucket and construct a File Source out of it at runtime ? The Parquet Files

Re: Accessing kafka message key from a KafkaSource

2022-12-07 Thread Yaroslav Tkachenko
Hi Noel, It's definitely possible. You need to implement a custom KafkaRecordDeserializationSchema: its "deserialize" method gives you a ConsumerRecord as an argument so that you can extract Kafka message key, headers, timestamp, etc. Then pass that when you create a
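
A minimal sketch of such a custom KafkaRecordDeserializationSchema, assuming string keys and values (the class name and the Tuple2 output type are illustrative choices, not part of the original answer):

```java
import java.nio.charset.StandardCharsets;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;
import org.apache.flink.util.Collector;
import org.apache.kafka.clients.consumer.ConsumerRecord;

/** Emits (key, value) pairs so the Kafka message key is available downstream. */
public class KeyValueRecordDeserializer
        implements KafkaRecordDeserializationSchema<Tuple2<String, String>> {

    @Override
    public void deserialize(ConsumerRecord<byte[], byte[]> record,
                            Collector<Tuple2<String, String>> out) {
        String key = record.key() == null ? null : new String(record.key(), StandardCharsets.UTF_8);
        String value = record.value() == null ? null : new String(record.value(), StandardCharsets.UTF_8);
        out.collect(Tuple2.of(key, value));
    }

    @Override
    public TypeInformation<Tuple2<String, String>> getProducedType() {
        return Types.TUPLE(Types.STRING, Types.STRING);
    }
}
```

Wired in with .setDeserializer(new KeyValueRecordDeserializer()) instead of setValueOnlyDeserializer(...), the key then travels with each element as field f0.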

Re: Accessing kafka message key from a KafkaSource

2022-12-08 Thread Noel OConnor
argument so that you can extract Kafka message key, > headers, timestamp, etc. > > Then pass that when you create a KafkaSource via "setDeserializer" method. > > On Wed, Dec 7, 2022 at 6:14 AM Noel OConnor wrote: >> >> Hi, >> I'm using a kafka source to re

Re: KafkaSource and Event Time in Message Payload

2022-12-13 Thread Martijn Visser
27+Sources [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-217%3A+Support+watermark+alignment+of+source+splits On Thu, Dec 8, 2022 at 6:21 PM Niklas Wilcke wrote: > Hi Flink Community, > > I have a few questions regarding the new KafkaSource and event time, which > I wasn'

Re: KafkaSource and Event Time in Message Payload

2022-12-15 Thread Niklas Wilcke
nce/display/FLINK/FLIP-217%3A+Support+watermark+alignment+of+source+splits > > On Thu, Dec 8, 2022 at 6:21 PM Niklas Wilcke <mailto:niklas.wil...@uniberg.com>> wrote: >> Hi Flink Community, >> >> I have a few questions regarding the new KafkaSource and event time,

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread Matthias Pohl
/#execution-checkpointing-checkpoints-after-tasks-finish-enabled On Wed, Sep 15, 2021 at 11:26 AM Lars Skjærven wrote: > Using KafkaSource builder with a job parallelism larger than the number of > kafka partitions, the job is unable to checkpoint. > > With a job parallelism of 4, 3 of t
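
For reference, a sketch of turning on the option Matthias links to (available from Flink 1.14), so checkpointing continues even though the subtasks that received no partition have reached FINISHED. The key is spelled out in code here, but it can equally be set in flink-conf.yaml.

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointsAfterFinishSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Allow checkpoints after some source subtasks are FINISHED
        // (e.g. parallelism 4 reading a 1-partition topic).
        conf.setBoolean("execution.checkpointing.checkpoints-after-tasks-finish.enabled", true);

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        env.enableCheckpointing(30_000);
        // ... build the KafkaSource pipeline as usual ...
        env.execute("checkpoints-after-tasks-finish-sketch");
    }
}
```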

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread Lars Skjærven
Sep 15, 2021 at 11:26 AM Lars Skjærven wrote: > >> Using KafkaSource builder with a job parallelism larger than the number >> of kafka partitions, the job is unable to checkpoint. >> >> With a job parallelism of 4, 3 of the tasks are marked as FINISHED for >> th

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread David Morávek
execution-checkpointing-checkpoints-after-tasks-finish-enabled >> >> On Wed, Sep 15, 2021 at 11:26 AM Lars Skjærven wrote: >> >>> Using KafkaSource builder with a job parallelism larger than the number >>> of kafka partitions, the job is unable to checkpoint. >

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-16 Thread Fabian Paul
Hi all, The problem you are seeing Lars is somewhat intended behaviour, unfortunately. With the batch/stream unification every Kafka partition is treated as kind of workload assignment. If one subtask receives a signal that there is no workload anymore it goes into the FINISHED state. As alread

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-16 Thread Lars Skjærven
Thanks for the feedback. > May I ask why you have less partitions than the parallelism? I would be happy to learn more about your use-case to better understand the > motivation. The use case is that topic A, contains just a few messages with product metadata that rarely gets updated, while topic

Re: How to refresh topics to ingest with KafkaSource?

2021-10-13 Thread Caizhi Weng
: > The KafkaSource, and KafkaSourceBuilder appear to prevent users from > providing their own KafkaSubscriber. Am I overlooking something? > > In my case I have an external system that controls which topics we should > be ingesting, and it can change over time. I need to add, and rem

Re: How to refresh topics to ingest with KafkaSource?

2021-10-14 Thread Preston Price
I suppose you want to read from different topics every now and then? Does > the topic-pattern option [1] in Table API Kafka connector meet your needs? > > [1] > https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/kafka/#topic-pattern > > Preston Price 于20

Re: How to refresh topics to ingest with KafkaSource?

2021-10-14 Thread Denis Nutiu
want to read from different topics every now and then? Does >> the topic-pattern option [1] in Table API Kafka connector meet your needs? >> >> [1] >> https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/kafka/#topic-pattern >> >> Preston

Re: How to refresh topics to ingest with KafkaSource?

2021-10-14 Thread Denis Nutiu
>>> >>> I suppose you want to read from different topics every now and then? >>> Does the topic-pattern option [1] in Table API Kafka connector meet your >>> needs? >>> >>> [1] >>> https://ci.apache.org/projects/flink/flink-docs-master/do

Re: How to refresh topics to ingest with KafkaSource?

2021-10-14 Thread Preston Price
;>> >>> On Wed, Oct 13, 2021 at 8:51 PM Caizhi Weng >>> wrote: >>> >>>> Hi! >>>> >>>> I suppose you want to read from different topics every now and then? >>>> Does the topic-pattern option [1] in Table API Kafka connec

Re: How to refresh topics to ingest with KafkaSource?

2021-10-14 Thread Prasanna kumar
My understanding is it would not pick up new >>>> topics that match the pattern after the job starts. >>>> >>>> On Wed, Oct 13, 2021 at 8:51 PM Caizhi Weng >>>> wrote: >>>> >>>>> Hi! >>>>> &

Re: How to refresh topics to ingest with KafkaSource?

2021-10-18 Thread Arvid Heise
>>> topics that match the pattern after the job starts. >>>>> >>>>> >>>>> On Wed, Oct 13, 2021 at 8:51 PM Caizhi Weng >>>>> wrote: >>>>> >>>>>> Hi! >>>>>> >>>>>> I suppose yo

Re: How to refresh topics to ingest with KafkaSource?

2021-10-26 Thread Mason Chen
Oct 13, 2021 at 8:51 PM Caizhi Weng <mailto:tsreape...@gmail.com>> wrote: > Hi! > > I suppose you want to read from different topics every now and then? Does the > topic-pattern option [1] in Table API Kafka connector meet your needs? > > [1] > https://ci.apache.org/

Re: How to refresh topics to ingest with KafkaSource?

2021-10-30 Thread Arvid Heise
at would >>>>>> fit the state of all potential topics. Additionally the documentation you >>>>>> linked seems to suggest that the regular expression is evaluated only >>>>>> once >>>>>> "when the job starts run

Re: How to refresh topics to ingest with KafkaSource?

2021-11-02 Thread Mason Chen
starts running". My understanding is it would not pick up new topics that >> match the pattern after the job starts. >> >> >> On Wed, Oct 13, 2021 at 8:51 PM Caizhi Weng > <mailto:tsreape...@gmail.com>> wrote: >> Hi! >> >> I suppose you

Re: How to refresh topics to ingest with KafkaSource?

2021-11-03 Thread Martijn Visser
> system, so there's no way to craft a single regular expression that >>>>>>> would >>>>>>> fit the state of all potential topics. Additionally the documentation >>>>>>> you >>>>>>> linked seems to suggest

behavior change with idle partitions and the new KafkaSource?

2021-11-22 Thread David Anderson
I've seen a few questions recently from folks migrating from FlinkKafkaConsumer to KafkaSource that make me suspect that something has changed. In FlinkKafkaConsumerBase we have this code which sets a source subtask to idle if all of its partitions are empty when the subtask s
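
For anyone hitting this after migrating: the legacy consumer idled empty partitions on its own, while with the KafkaSource idleness has to be declared on the watermark strategy attached to the source. A minimal sketch; the out-of-orderness bound and idleness timeout are placeholder values.

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

public class IdleAwareWatermarks {
    // Declares idleness explicitly, which the new KafkaSource does not do on its own,
    // so subtasks that own only empty partitions stop holding back the job's watermark.
    static WatermarkStrategy<String> strategy() {
        return WatermarkStrategy.<String>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                .withIdleness(Duration.ofMinutes(1));
    }
    // Used as: env.fromSource(kafkaSource, strategy(), "kafka-source")
}
```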

Re: Hybrid Source with Parquet Files from GCS + KafkaSource

2021-12-10 Thread Roman Khachatryan
reads data from a locally saved Parquet File, and a KafkaSource consuming > events from a remote Kafka broker. > > I was wondering if instead of using a local Parquet file, whether it is > possible to directly stream the file from a GCS bucket and construct a File > Source out of it

Re: Hybrid Source with Parquet Files from GCS + KafkaSource

2021-12-14 Thread Meghajit Mazumdar
> We have a requirement as follows: > > > > We want to stream events from 2 sources: Parquet files stored in a GCS > Bucket, and a Kafka topic. > > With the release of Hybrid Source in Flink 1.14, we were able to > construct a Hybrid Source which produces events from two sources:

Re: Hybrid Source with Parquet Files from GCS + KafkaSource

2021-12-15 Thread Fabian Paul
events from 2 sources: Parquet files stored in a GCS >> > Bucket, and a Kafka topic. >> > With the release of Hybrid Source in Flink 1.14, we were able to construct >> > a Hybrid Source which produces events from two sources: a FileSource which >> > reads data from

Need help on implementing custom`snapshotState` in KafkaSource & KafkaSourceReader

2022-02-12 Thread santosh joshi
We are migrating to KafkaSource <https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/KafkaSource.java> from FlinkKafkaConsumer <https://github.com/apache/flink/blob/master/flink-connectors/flink-connec

Re: behavior change with idle partitions and the new KafkaSource?

2021-11-22 Thread Arvid Heise
Hi David, yes that's intentionally [1] as it could lead to correctness issues and it was inconsistently used across sources. Yes it should be documented. For now I'd put it in the KafkaSource docs because I'm not sure in which release notes it would fit best. In which release

Re: behavior change with idle partitions and the new KafkaSource?

2021-11-22 Thread David Anderson
Thanks, Arvid. Can you clarify how the KafkaSource currently behaves in the situation where it starts with fewer partitions than subtasks? Is that the case described in FLIP-180 as case 1: "Static assignment + too few splits"? The implementation described there (emit MAX_WATERMARK) sh

Re: behavior change with idle partitions and the new KafkaSource?

2021-11-24 Thread Arvid Heise
wrote: > Thanks, Arvid. > > Can you clarify how the KafkaSource currently behaves in the situation > where it starts with fewer partitions than subtasks? Is that the case > described in FLIP-180 as case 1: "Static assignment + too few splits"? The > implementation de

Re: Need help on implementing custom`snapshotState` in KafkaSource & KafkaSourceReader

2022-02-14 Thread Niklas Semmler
Hi Santosh, It’s best to avoid cross-posting. Let’s keep the discussion to SO. Best regards, Niklas > On 12. Feb 2022, at 16:39, santosh joshi wrote: > > We are migrating to KafkaSource from FlinkKafkaConsumer. We have disabled > auto commit of offset and instead committing them

New KafkaSource API : Change in default behavior regarding starting offset

2022-06-14 Thread bastien dine
from: https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/kafka/#starting-offset Before in old FlinkKafkaConsumer it was from committed offset (i.e : setStartFromGroupOffsets() method) which match with this behaviour in new KafkaSource : : OffsetsInitializer. commit
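
To make the difference explicit, a sketch of the two initializers involved: the new builder defaults to earliest when nothing is set, so a drop-in replacement that should keep the old group-offset semantics has to ask for committed offsets explicitly, with a reset strategy for the first run when no offsets exist yet. The method names below are illustrative.

```java
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;

public class StartingOffsetChoices {
    // What you get when no starting offset is configured on the new builder.
    static OffsetsInitializer newDefault() {
        return OffsetsInitializer.earliest();
    }

    // Closer to the old FlinkKafkaConsumer default (group offsets), with an explicit
    // fallback for the first run when the group has not committed anything yet.
    static OffsetsInitializer groupOffsetsLike() {
        return OffsetsInitializer.committedOffsets(OffsetResetStrategy.LATEST);
    }
}
```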

Re: New KafkaSource API : Change in default behavior regarding starting offset

2022-06-14 Thread Jing Ge
Hi Bastien, Thanks for asking. I didn't find any call of setStartFromGroupOffsets() within Flink in the master branch. Could you please point out the code that committed offset is used as default? W.r.t. the new KafkaSource, if OffsetsInitializer.committedOffsets() is used, an exception wi

Re: New KafkaSource API : Change in default behavior regarding starting offset

2022-06-15 Thread bastien dine
setStartFromGroupOffsets() within > Flink in the master branch. Could you please point out the code that > committed offset is used as default? > > W.r.t. the new KafkaSource, if OffsetsInitializer.committedOffsets() > is used, an exception will be thrown at runtime in case there is no > committed off

Re: New KafkaSource API : Change in default behavior regarding starting offset

2022-06-15 Thread Martijn Visser
Flink in the master branch. Could you please point out the code that >> committed offset is used as default? >> >> W.r.t. the new KafkaSource, if OffsetsInitializer.committedOffsets() >> is used, an exception will be thrown at runtime in case there is no >> committed offset,

Re: New KafkaSource API : Change in default behavior regarding starting offset

2022-06-15 Thread bastien dine
e for sure .committedOffsets - we have it by default in our custom KafkaSource builder to be sure we do not read all the previous data (earliest) What bother me is just this change in starting offset default behavior from FlinkKafkaConsumer to KafkaSource (this can lead to mistake) In fact it happe

Re: New KafkaSource API: Change in default behavior regarding starting offset

2022-06-18 Thread liangzai
How do I unsubscribe from this mailing list? Replied Message | From | bastien dine | | Date | 06/15/2022 17:50 | | To | Martijn Visser | | Cc | Jing Ge, user | | Subject | Re: New KafkaSource API : Change in default behavior regarding starting offset | Hello Martijn, Thanks for the link to the release note

Re: New KafkaSource API: Change in default behavior regarding starting offset

2022-06-20 Thread Shengkai Fang
15/2022 17:50 > To Martijn Visser > Cc Jing Ge , > user > Subject Re: New KafkaSource API : Change in default behavior regarding > starting offset > Hello Martijn, > > Thanks for the link to the release note, especially : > "When resuming f

Reading KafkaSource state from a savepoint using the State Processor API

2023-05-23 Thread Charles Tan
Hi everyone, I have a few questions about reading KafkaSource state using the State Processor API. I have a simple Flink application which reads from a Kafka topic then produces to a different topic. After running the Flink job and stopping it with a savepoint, I then write a few more records to

How do I configure commit offsets on checkpoint in new KafkaSource api?

2021-11-19 Thread Marco Villalobos
The FlinkKafkaConsumer that will be deprecated has the method "setCommitOffsetsOnCheckpoints(boolan)" method. However, that functionality is not the new KafkaSource class. How is this behavior / functionality configured in the new API? -Marco A. Villalobos

KafkaSource, consumer assignment, and Kafka ‘stretch’ clusters for multi-DC streaming apps

2022-10-05 Thread Andrew Otto
Hi all, *tl;dr: Would it be possible to make a KafkaSource that uses Kafka's built in consumer assignment for Flink tasks?* At the Wikimedia Foundation we are evaluating <https://phabricator.wikimedia.org/T307944> whether we can use a Kafka 'stretch' cluster to simplify

Re: Reading KafkaSource state from a savepoint using the State Processor API

2023-05-24 Thread Hang Ruan
`KafkaPartitionSplitSerializer`. Best, Hang Charles Tan 于2023年5月24日周三 06:27写道: > Hi everyone, > > I have a few questions about reading KafkaSource state using the State > Processor API. I have a simple Flink application which reads from a Kafka > topic then produces to a different topic. After runnin

Re: How do I configure commit offsets on checkpoint in new KafkaSource api?

2021-11-19 Thread Mason Chen
Hi Marco, > https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/datastream/kafka/#additional-properties In the new KafkaSource, you can configure it in your properties. You can take a look at `KafkaSourceOptions#COMMIT_OFFSETS_ON_CHECKPOINT` for the specific config, which
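
Spelled out as a sketch (the property key is the one backing KafkaSourceOptions#COMMIT_OFFSETS_ON_CHECKPOINT; broker, topic, and group id are placeholders): committing on checkpoint is the default when checkpointing is enabled and a group id is set, and the property lets you switch it off or make the intent explicit.

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

public class CommitOnCheckpointSketch {
    static KafkaSource<String> source() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")   // placeholder
                .setTopics("events")                 // placeholder
                .setGroupId("my-job")                // required for offsets to be committed at all
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                // Backs KafkaSourceOptions.COMMIT_OFFSETS_ON_CHECKPOINT; defaults to true,
                // and only takes effect when checkpointing is enabled on the job.
                .setProperty("commit.offsets.on.checkpoint", "true")
                .build();
    }
}
```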

Re: KafkaSource, consumer assignment, and Kafka ‘stretch’ clusters for multi-DC streaming apps

2022-10-05 Thread Andrew Otto
: > Hi all, > > *tl;dr: Would it be possible to make a KafkaSource that uses Kafka's built > in consumer assignment for Flink tasks?* > > At the Wikimedia Foundation we are evaluating > <https://phabricator.wikimedia.org/T307944> whether we can use a Kafka > 
