Re: Flink Kafka Connector Source Parallelism

2020-05-27 Thread Chen, Mason
I think I may have just answered my own question. There’s only one Kafka partition, so the maximum parallelism is one and it doesn’t really make sense to make another kafka consumer under the same group id. What threw me off is that there’s a 2nd subtask for the kafka source created even though

Re: Webinar: Unlocking the Power of Apache Beam with Apache Flink

2020-05-27 Thread Marta Paes Moreira
Thanks for sharing, Aizhamal - it was a great webinar! Marta On Wed, 27 May 2020 at 23:17, Aizhamal Nurmamat kyzy wrote: > Thank you all for attending today's session! Here is the YT recording: > https://www.youtube.com/watch?v=ZCV9aRDd30U > And link to the slides: > https://github.com/aijamaln

Flink Kafka Connector Source Parallelism

2020-05-27 Thread Chen, Mason
Hi all, I’m currently trying to understand Flink’s Kafka Connector and how parallelism affects it. So, I am running the flink playground click count job and the parallelism is set to 2 by default. However, I don’t see the 2nd subtask of the Kafka Connector sending any records: https://imgur.c

Re: Apache Flink - Question about application restart

2020-05-27 Thread Zhu Zhu
Hi M, Sorry I missed your message. JobID will not change for a generated JobGraph. However, a new JobGraph will be generated each time a job is submitted. So that multiple submissions will have multiple JobGraphs. This is because different submissions are considered as different jobs, as Till ment

Re: Flink Dashboard UI Tasks hard limit

2020-05-27 Thread Xintong Song
Ah, I guess I had misunderstood what your mean. Below 18000 tasks, the Flink Job is able to start up. > Even though I increased the number of slots, it still works when 312 slots > are being used. > When you say "it still works", I thought that you increased the parallelism the job was sill execut

Re: [DISCUSS] FLINK-17989 - java.lang.NoClassDefFoundError org.apache.flink.fs.azure.common.hadoop.HadoopRecoverableWriter

2020-05-27 Thread Guowei Ma
Hi, I think the StreamingFileSink could not support Azure currently. You could find more detailed info from here[1]. [1] https://issues.apache.org/jira/browse/FLINK-17444 Best, Guowei Israel Ekpo 于2020年5月28日周四 上午6:04写道: > You can assign the task to me and I will like to collaborate with someon

Executing a controllable benchmark in Flink

2020-05-27 Thread Felipe Gutierrez
Hi, I am trying to benchmark a stream application in Flink. So, I am using the source Function that reads events from the NYC Taxi Rides (http://training.ververica.com/trainingData/nycTaxiRides.gz) and I control the emission with System.nanoTime(). I am not using Thread.sleep because Java does not

Re: [DISCUSS] FLINK-17989 - java.lang.NoClassDefFoundError org.apache.flink.fs.azure.common.hadoop.HadoopRecoverableWriter

2020-05-27 Thread Israel Ekpo
You can assign the task to me and I will like to collaborate with someone to fix it. On Wed, May 27, 2020 at 5:52 PM Israel Ekpo wrote: > Some users are running into issues when using Azure Blob Storage for the > StreamFileSink > > https://issues.apache.org/jira/browse/FLINK-17989 > > The issue

[DISCUSS] FLINK-17989 - java.lang.NoClassDefFoundError org.apache.flink.fs.azure.common.hadoop.HadoopRecoverableWriter

2020-05-27 Thread Israel Ekpo
Some users are running into issues when using Azure Blob Storage for the StreamFileSink https://issues.apache.org/jira/browse/FLINK-17989 The issue is because certain packages are relocated in the POM file and some classes are dropped in the final shaded jar I have attempted to comment out the r

Installing Ververica, unable to write to file system

2020-05-27 Thread Corrigan, Charlie
Hello, I’m trying to install Ververica (community edition for a simple poc deploy) via helm using these directions, but the pod is failing with the following error: ``` org.springframework.context.ApplicationContextException: Unable to st

Re: Webinar: Unlocking the Power of Apache Beam with Apache Flink

2020-05-27 Thread Aizhamal Nurmamat kyzy
Thank you all for attending today's session! Here is the YT recording: https://www.youtube.com/watch?v=ZCV9aRDd30U And link to the slides: https://github.com/aijamalnk/beam-learning-month/blob/master/Unlocking%20the%20Power%20of%20Apache%20Beam%20with%20Apache%20Flink.pdf On Tue, May 26, 2020 at 8

Re: Stateful functions Harness

2020-05-27 Thread Boris Lublinsky
Thanks Seth Will take a look. > On May 27, 2020, at 3:15 PM, Seth Wiesman wrote: > > Hi Boris, > > Example usage of flink sources and sink is available in the documentation[1]. > > [1] > https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/io-module/flink-connectors.html >

Re: Stateful functions Harness

2020-05-27 Thread Seth Wiesman
Hi Boris, Example usage of flink sources and sink is available in the documentation[1]. [1] https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/io-module/flink-connectors.html On Wed, May 27, 2020 at 1:08 PM Boris Lublinsky < boris.lublin...@lightbend.com> wrote: > Thats not ex

RE: History Server Not Showing Any Jobs - File Not Found?

2020-05-27 Thread Hailu, Andreas
Hi Chesney, apologies for not getting back to you sooner here. So I did what you suggested - I downloaded a few files from my jobmanager.archive.fs.dir HDFS directory to a locally available directory named /local/scratch/hailua_p2epdlsuat/historyserver/archived/. I then changed my historyserver

Re: Tumbling windows - increasing checkpoint size over time

2020-05-27 Thread Wissman, Matt
Hello Till & Guowei, Thanks for the replies! Here is a snippet of the window function: SingleOutputStreamOperator aggregatedStream = dataStream .keyBy(idKeySelector()) .window(TumblingProcessingTimeWindows.of(seconds(15))) .apply(new Aggreg

Re: Stateful functions Harness

2020-05-27 Thread Boris Lublinsky
Thats not exactly the usage question that I am asking When I am writing IO module I have to write Ingress and Egress spec. You have an example for Kafka, which looks like def getIngressSpec: IngressSpec[GreetRequest] = KafkaIngressBuilder.forIdentifier(GREETING_INGRESS_ID) .withKafkaAddress(

Re: Stateful functions Harness

2020-05-27 Thread Boris Lublinsky
Ok, so the bug in the examples is an absence of resources. Having classes in the classpath is not sufficient Modules.java is using ServiceLoader, which is setting private static final String PREFIX = "META-INF/services/" So all the modules have to be listed in the resource files > On May 27, 2

Age old stop vs cancel debate

2020-05-27 Thread Senthil Kumar
We are on flink 1.9.0 I have a custom SourceFunction, where I rely on isRunning set to false via the cancel() function to exit out of the run loop. My run loop essentially gets the data from S3, and then simply sleeps (Thread.sleep) for a specified time interval. When a job gets cancelled, the

Re: [EXTERNAL] Re: Memory growth from TimeWindows

2020-05-27 Thread Slotterback, Chris
Aljoscha, Maybe “lazy” isn’t the right term haha it’s my interpretation that during mark and sweep of the default GC, memory from older windows wasn’t being fully marked for collection. Since switching to G1, collection seems to be much more aggressive, and whenever the young generation memory e

Re: Memory issue in Flink 1.10

2020-05-27 Thread Andrey Zagrebin
Hi Steve, RocksDB does not contribute to the JVM direct memory. RocksDB off-heap memory consumption is part of managed memory [1]. You got `OutOfMemoryError: Direct buffer memory` which is related to the JVM direct memory, also off-heap but managed by JVM. The JVM direct memory limit depends on t

Re: Stateful functions Harness

2020-05-27 Thread Tzu-Li (Gordon) Tai
On Thu, May 28, 2020, 12:19 AM Boris Lublinsky < boris.lublin...@lightbend.com> wrote: > I think I figured this out. > The project seems to be missing > > resources > > /META-INF >

Re: Stateful functions Harness

2020-05-27 Thread Boris Lublinsky
I think I figured this out. The project seems to be missing resources /META-INF

Re: Stateful functions Harness

2020-05-27 Thread Tzu-Li (Gordon) Tai
Hi, The example is working fine on my side (also using IntelliJ). This could most likely be a problem with your project setup in the IDE, where the classpath isn't setup correctly. What do you see when you right click on the statefun-flink-harness-example directory (in the IDE) --> Open Module Se

Re: Flink Dashboard UI Tasks hard limit

2020-05-27 Thread Vijay Balakrishnan
Thanks so much, Xintong for guiding me through this. I looked at the Flink logs to see the errors. I had to change taskmanager.network.memory.max: 4gb and akka.ask.timeout: 240s to increase the number of tasks. Now, I am able to increase the number of Tasks/ aka Task vertices. taskmanager.network.

Re: [EXTERNAL] Re: Memory growth from TimeWindows

2020-05-27 Thread Mitch Lloyd
Chris, What version of Flink are you using? I also have an issue with slow but continual memory growth in a windowing function but it seems like the taskmanager.sh script I'm using already has the -XX+UseG1GC flag set: https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/bin/t

Re: Stateful functions Harness

2020-05-27 Thread Boris Lublinsky
The project https://github.com/apache/flink-statefun/tree/release-2.0/statefun-examples/statefun-flink-harness-example Does not work in Intellij. The problem is that when running in Inte

Memory issue in Flink 1.10

2020-05-27 Thread Steven Nelson
We recently migrated to Flink 1.10, but are experiencing some issues with memory. Our cluster is: 1) Running inside of Kubernetes 2) Running in HA mode 3) Checkpointing/Savepointing to an HDFS cluster inside of Kubernetes 4) Using RocksDB for checkpointing 5) Running on m5d.4xlarge EC2 instances w

Re: Collecting operators real output cardinalities as json files

2020-05-27 Thread Francesco Ventura
Thank you very much for your explanation. I will keep it in mind. Best, Francesco > Il giorno 27 mag 2020, alle ore 15:43, Piotr Nowojski > ha scritto: > > Hi Francesco, > > As long as you do not set update interval of metric reporter to some very low > value, there should be no visible per

Re: Apache Flink - Question about application restart

2020-05-27 Thread Till Rohrmann
Hi, if you submit the same job multiple times, then it will get every time a different JobID assigned. For Flink, different job submissions are considered to be different jobs. Once a job has been submitted, it will keep the same JobID which is important in order to retrieve the checkpoints associ

Re: Tumbling windows - increasing checkpoint size over time

2020-05-27 Thread Till Rohrmann
Hi Matt, could you give us a bit more information about the windows you are using? They are tumbling windows. What's the size of the windows? Do you allow lateness of events? What's your checkpoint interval? Are you using event time? If yes, how is the watermark generated? You said that the numb

Re: multiple sources

2020-05-27 Thread Till Rohrmann
Hi Aissa, Flink supports to read from multiple sources in one job. You have to call multiple times `StreamExecutionEnvironment.addSource()` with the respective `SourceFunction`. Flink does not come with a ready-made MongoDB connector. However, there is a project which tried to implement a MongoDB

Re: Collecting operators real output cardinalities as json files

2020-05-27 Thread Piotr Nowojski
Hi Francesco, As long as you do not set update interval of metric reporter to some very low value, there should be no visible performance degradation. Maybe worth keeping in mind is that if you jobs are bounded (they are working on bounded input and they finish/complete at some point of time),

Re: Singal task backpressure problem with Credit-based Flow Control

2020-05-27 Thread Piotr Nowojski
Hi Weihua, Good to hear that you have found the problem. Let us know if you find some other problems after all. Piotrek > On 27 May 2020, at 14:18, Weihua Hu wrote: > > Hi Piotrek, > > Thanks for your suggestions, I found some network issues which seems to be > the cause of back pressure. >

Re: Singal task backpressure problem with Credit-based Flow Control

2020-05-27 Thread Weihua Hu
Hi Piotrek, Thanks for your suggestions, I found some network issues which seems to be the cause of back pressure. Best Weihua Hu > 2020年5月26日 02:54,Piotr Nowojski 写道: > > Hi Weihua, > > > After dumping the memory and analyzing it, I found: > > Sink (121)'s RemoteInputChannel.unannouncedCred

Re: In consistent Check point API response

2020-05-27 Thread Vijay Bhaskar
Created JIRA for it: https://issues.apache.org/jira/browse/FLINK-17966 Regards Bhaskar On Wed, May 27, 2020 at 1:28 PM Vijay Bhaskar wrote: > Thanks Yun. In that case it would be good to give the reference of that > documentation in the Flink Rest API: > https://ci.apache.org/projects/flink/

Re: unsubcribe

2020-05-27 Thread Leonard Xu
Hi, Please send mail to user-unsubscr...@flink.apache.org to unsubscribe the mails from user@flink.apache.org , refer [1] Best, Leonard [1] https://flink.apache.org/community.html#how-to-subscribe-to-a-mailing-list

unsubcribe

2020-05-27 Thread 王洪达

Re: Collecting operators real output cardinalities as json files

2020-05-27 Thread Francesco Ventura
Hi Piotrek, Thank you for you replay and for your suggestions. Just another doubt. Does the usage of metrics reporter and custom metrics will affect the performances of the running jobs in term of execution time? Since I need the information about the exact netRunTime of each job maybe using the

multiple sources

2020-05-27 Thread Aissa Elaffani
Hello everyone, I hope you all doing well.I am reading from a Kafka topic some real-time messages produced by some sensors, and in order to do some aggregations, I need to enrich the stream with other data that are stocked in a mongoDB. So, I want to know if it is possible to work with two sources

Re: In consistent Check point API response

2020-05-27 Thread Vijay Bhaskar
Thanks Yun. In that case it would be good to give the reference of that documentation in the Flink Rest API: https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html while explaining about the checkpoints. Tomorrow any one want to use REST API, they will get easy reference o

Re: ClusterClientFactory selection

2020-05-27 Thread Kostas Kloudas
Hi Singh, The only thing to add to what Yang said is that the "execution.target" configuration option (in the config file) is also used for the same purpose from the execution environments. Cheers, Kostas On Wed, May 27, 2020 at 4:49 AM Yang Wang wrote: > > Hi M Singh, > > The Flink CLI picks u