Re: Flink Performance Issue

2021-09-27 Thread Arvid Heise
Hi Kamaal, I did a quick test with a local Kafka in docker. With parallelism 1, I can process 20k messages of size 4KB in about 1 min. So if you use parallelism of 15, I'd expect it to take it below 10s even with bigger data skew. What I recommend you to do is to start from scratch and just work

Re: Flink Performance Issue

2021-09-27 Thread Mohammed Kamaal
Hi Robert, I have removed all the business logic (keyBy and window) operator code and just had a source and sink to test it. The throughput is 20K messages in 2 minutes. It is a simple read from source (kafka topic) and write to sink (kafka topic). Don't you think 2 minutes is also not a better

Re: Flink Performance Issue

2021-09-22 Thread Robert Metzger
Hi Kamaal, I would first suggest understanding the performance bottleneck, before applying any optimizations. Idea 1: Are your CPUs fully utilized? if yes, good, then scaling up will probably help If not, then there's another inefficiency Idea 2: How fast can you get the data into your job, with

Re: Flink Performance Issue

2021-09-22 Thread Mohammed Kamaal
Hi Arvid, The throughput has decreased further after I removed all the rebalance(). The performance has decreased from 14 minutes for 20K messages to 20 minutes for 20K messages. Below are the tasks that the flink application is performing. I am using keyBy and Window operation. Do you think a

Re: Flink Performance Issue

2021-09-06 Thread Arvid Heise
Hi Mohammed, something is definitely wrong in your setup. You can safely say that you can process 1k records per second and core with Kafka and light processing, so you shouldn't even need to go distributed in your case. Do you perform any heavy computation? What is your flatMap doing? Are you em

Re: Flink Performance Issue

2021-09-02 Thread Mohammed Kamaal
Hi Fabian, Just an update, Problem 2:- Caused by: org.apache.kafka.common.errors.NetworkException It is resolved. It was because we exceeded the number of allowed partitions for the kafka cluster (AWS MSK cluster). Have deleted unused topics and partitions to resolve the issue.

Re: Flink Performance Issue

2021-08-24 Thread Fabian Paul
Hi Mohammed, 200records should definitely be doable. The first you can do is remove the print out Sink because they are increasing the load on your cluster due to the additional IO operation and secondly preventing Flink from fusing operators. I am interested to see the updated job graph after

Re: Flink Performance Issue

2021-08-24 Thread Fabian Paul
Hi Mohammed, Without diving too much into your business logic a thing which catches my eye is the partitiong you are using. In general all calls to`keyBy`or `rebalance` are very expensive because all the data is shuffled across down- stream tasks. Flink tries to fuse operators with the same keyG

Flink Performance Issue

2021-08-24 Thread Mohammed Kamaal
Hi, Apologize for the big message, to explain the issue in detail. We have a Flink (version 1.8) application running on AWS Kinesis Analytics. The application has a source which is a kafka topic with 15 partitions (AWS Managed Streaming Kafka) and the sink is again a kafka topic with 15 partiti