Re: Does the Kafka source perform retractions on Key?

2021-03-01 Thread Arvid Heise
Hi Rex, yes you can go directly into Flink since 1.11.0 [1], but afaik only through Table API/SQL currently (which you seem to be using anyways most of the time). I'd recommend using 1.11.1+ (some bugfixes) or even 1.12.0+ (many new useful features [2]). You can also check the main doc [3]. If yo

Re: Flink’s Kubernetes HA services - NOT working

2021-03-01 Thread Till Rohrmann
Hmm, this is strange. From the logs it looks as if certain communications between components don't arrive at the receiver's end. I think we have to further dig into the problem. In order to further narrow it down, could you try to start the cluster with using pod IPs instead of K8s services for in

Debugging long Flink checkpoint durations

2021-03-01 Thread Dan Hill
Hi. Are there good ways to debug long Flink checkpoint durations? I'm running a backfill job that runs ~10 days of data and then starts checkpointing failing. Since I only see the last 10 checkpoints in the jobmaster UI, I don't see when it starts. I looked through the text logs and didn't see

Allocating tasks to specific TaskManagers

2021-03-01 Thread 황혜조
Hi, I am looking for a way to allocate each created subTask to a specific TaskManager. Is there any way to force assigning tasks to specific taskManagers? Thank you Best regards, Hyejo Hwang

Re: Does the Kafka source perform retractions on Key?

2021-03-01 Thread Rex Fenley
Thanks Arvid, I think my confusion lies in misinterpreting the meaning of CDC. We basically don't want CDC, we just use it to get data into a compacted Kafka topic where we hold the current state of the world to consume from multiple consumers. You have described pretty thoroughly where we want to

Re: [Statefun] Exception occurs during function chaining / Async function

2021-03-01 Thread Seth Wiesman
Hi Le, I believe the issue is the bounded source[1]. Stateful Functions only supports unbounded inputs. Additionally, you can remove all the `synchronized` blocks from your code; statefun handles all synchronization for you. Seth [1] https://gist.github.com/flint-stone/cbc60f2d41507fdf33507ba99

Re: Handling Data Separation / Watermarking from Kafka in Flink

2021-03-01 Thread Rion Williams
Hey David et all, I had one follow up question for this as I've been putting together some integration/unit tests to verify that things are working as expected with finite datasets (e.g. a text file with several hundred records that are serialized, injected into Kafka, and processed through the pi

Timeout Exception When Producing/Consuming Messages to Hundreds of Topics

2021-03-01 Thread Claude M
Hello, I'm trying to run an experiment w/ two flink jobs: - A producer producing messages to hundreds of topics - A consumer consuming the messages from all the topics After the job runs after a few minutes, it will fail w/ following error: Caused by: org.apache.kafka.common.errors.Timeou

State Schema Evolution within SQL API

2021-03-01 Thread Jan Oelschlegel
Hi at all, i would like to know how far a state schema evolution is possible by using SQL API of Flink. Which query changes can I do without disrupting the schema of my savepoint? In the documentation is, only for the DataStream API , written what are the do's and don'ts regarding a safe sch

timeWindow()s and queryable state

2021-03-01 Thread Ron Crocker
Hi all - I’m trying to keep some state around for a little while after a window fires to use as queryable state. I am intending on using something like: .keyBy() .timeWindow(Time.minutes(1)).allowedLateness(Time.minutes(90)) .aggregate(…) .keyBy() .asQueryableState(...) My intent is to keep tha

Scaling Higher than 10k Nodes

2021-03-01 Thread Joey Tran
Hi, I was looking at Apache Beam/Flink for some of our data processing needs, but when reading about the resource managers (YARN/mesos/Kubernetes), it seems like they all top out at around 10k nodes. What are recommended solutions for scaling higher than this? Thanks in advance, Joey

Re: Flink application kept restarting

2021-03-01 Thread Matthias Pohl
Another question is: The timeout of 48 hours sounds strange. There should have been some other system noticing the connection problem earlier assuming that you have a reasonably low heartbeat interval configured. Matthias On Mon, Mar 1, 2021 at 1:22 PM Matthias Pohl wrote: > Thanks for providin

Re: Flink application kept restarting

2021-03-01 Thread Matthias Pohl
Thanks for providing this information, Rainie. Are other issues documented in the logs besides the TimeoutException in the JM logs which you already shared? For now, it looks like that there was a connection problem between the TaskManager and the JobManager that caused the shutdown of the operator

Re: Suspected classloader leak in Flink 1.11.1

2021-03-01 Thread Kezhu Wang
Hi Tamir, > The histogram has been taken from Task Manager using jcmd tool. >From that histogram, I guest there is no classloader leaking. > A simple batch job with single operation . The memory bumps to ~600MB (after single execution). once the job is finished the memory never freed. It could

how to propagate watermarks across multiple jobs

2021-03-01 Thread yidan zhao
I have a job which includes about 50+ tasks. I want to split it to multiple jobs, and the data is transferred through Kafka, but how about watermark? Is anyone have do something similar and solved this problem? Here I give an example: The original job: kafkaStream1(src-topic) => xxxProcess => xxx

Re: Flink’s Kubernetes HA services - NOT working

2021-03-01 Thread Till Rohrmann
Hi Omer, thanks for the logs. Could you tell us a bit more about the concrete setup of your Flink K8s cluster? It looks to me as if the ResourceManager cannot talk to the JobMaster which tries to register at the RM. Also some JobMasters don't seem to reach the ResourceManager. Could it be that you

AW: Kafka SQL Connector: dropping events if more partitions then source tasks

2021-03-01 Thread Jan Oelschlegel
Hi Shengkai, thanks for this hint. I solves the issue having more consumer tasks than kafka partitions. But the case with dropping events while having less consumer tasks than kafka partitions is still there. I think it will be okay in version 1.12 [1] [1] https://issues.apache.org/jira/browse

Re: Does the Kafka source perform retractions on Key?

2021-03-01 Thread Arvid Heise
> We are rereading the topics, at any time we might want a completely different materialized view for a different web service for some new application feature. Other jobs / new jobs need to read all the up-to-date rows from the databases. > I still don't see how this is the case if everything just

Re: Issues running multiple Jobs using the same JAR

2021-03-01 Thread Morgan Geldenhuys
That solved it, thank you very much Kezhu :) On 28.02.21 16:12, Kezhu Wang wrote: Hi Morgan, You could check FLINK-11654, from its description, I think it is the problem you encountered. > We run multiple jobs on a cluster which write a lot to the same Kafka topic from identically named sin

Re: Kubernetes HA - attempting to restore from wrong (non-existing) savepoint

2021-03-01 Thread Alexey Trenikhun
Hi Yang, Unfortunately I didn't save log. Trying to reproduce again, but now hitting different error - about incompatible version of ImmutableMapSerializer, which is strange, since while serialVersionUID indeed changed, however this serializer is only registered but not used, (there is no state