Unequal distribtion of Kafka partitions with respect to topic when reading from multiple topics using KafkaSource API with Flink 1.14.3

2023-01-18 Thread Kathula, Sandeep
Hi, I am using KafkaSource API read from 6 topics within Kafka. Flink version - 1.14.3. Each and every kafka topic my Flink pipeline reads from is having a different load but same number of partitions (lets say 30). For example partition 0 of topic 1 and partition 0 of topic 2 have different

Beam with Flink runner - Issues when writing to S3 in Parquet Format

2021-09-14 Thread Kathula, Sandeep
Hi, We have a simple Beam application which reads from Kafka, converts to parquet and write to S3 with Flink runner (Beam 2.29 and Flink 1.12). We have a fixed window of 5 minutes after conversion to PCollection and then writing to S3. We have around 320 columns in our data. Our intention is

Re: Unable to read state Witten by Beam application with Flink runner using Flink's State Processor API

2021-08-03 Thread Kathula, Sandeep
together). Can you please share an example code of how you're reading the state? Also can please you try this with latest Beam / Flink versions (the ones you're using are no longer supported)? Best, D. On Tue, Jul 27, 2021 at 5:46 PM Kathula, Sandeep wrote

Unable to read state Witten by Beam application with Flink runner using Flink's State Processor API

2021-07-27 Thread Kathula, Sandeep
Hi, We have a simple Beam application like a work count running with Flink runner (Beam 2.26 and Flink 1.9). We are using Beam’s value state. I am trying to read the state from savepoint using Flink's State Processor API but getting a NullPointerException. Converted the whole code into Pur

S3 Checkpointing taking long time with stateful operations

2020-06-18 Thread Kathula, Sandeep
Hi, We are running a stateful application in Flink with RocksDB as backend and set incremental state to true with checkpoints written to S3. * 10 task managers each with 2 task slots * Checkpoint interval 3 minutes * Checkpointing mode – At-least once processing After running app for

Flink not restoring from checkpoint when job manager fails even with HA through zookeeper

2020-06-06 Thread Kathula, Sandeep
Hi, We are running Flink 1.9 in K8S. We used https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/jobmanager_high_availability.html to set high availability. We have a single master. We set max number of retries for a task to 2. After task fails twice and then the job manager