Flink fails at stop operation after migration from 1.14 to 1.15.1

2022-08-04 Thread Vararu, Vadim
Hi all, Trying to migrate a job from Flink 1.14 to 1.15.1. The job itself consumes from a Kinesis stream and writes to S3. The stop operation works well on 1.14; however, on 1.15.1 it fails (both with and without savepoint). The job fails with different exceptions when there is data flowing through…
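
For context, a minimal sketch of the kind of Kinesis-to-S3 job described above, assuming the DataStream API with FlinkKinesisConsumer and a row-format FileSink; the stream name, region, bucket path and checkpoint interval are placeholders, not the actual job:

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringEncoder;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.file.sink.FileSink;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer;
    import org.apache.flink.streaming.connectors.kinesis.config.ConsumerConfigConstants;

    public class KinesisToS3Job {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.enableCheckpointing(60_000); // illustrative checkpoint interval

            Properties kinesisProps = new Properties();
            kinesisProps.setProperty(ConsumerConfigConstants.AWS_REGION, "eu-west-1"); // placeholder region

            // Consume from Kinesis and write each record as a line to S3.
            env.addSource(new FlinkKinesisConsumer<>("input-stream", new SimpleStringSchema(), kinesisProps))
               .sinkTo(FileSink
                       .forRowFormat(new Path("s3://my-bucket/output"), new SimpleStringEncoder<String>("UTF-8"))
                       .build());

            env.execute("kinesis-to-s3");
        }
    }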

Global window in batch mode

2022-09-29 Thread Vararu, Vadim
Hi all, I need to configure a keyed global window that would trigger a reduce function for all the events in each key group before the processing finishes and the job closes. I have something similar for the real-time (streaming) version of the job, configured with a processing time gap: .keyBy…
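
A minimal sketch of one way to get a single reduced result per key over a bounded input: rather than a GlobalWindow with a custom trigger, this relies on the documented BATCH-execution-mode behaviour that keyed "rolling" reduces emit only the final value per key. The tuple data and key selector are placeholders:

    import org.apache.flink.api.common.RuntimeExecutionMode;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class BatchPerKeyReduce {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // On a bounded input, BATCH mode makes keyed "rolling" reduces emit only the
            // final value per key instead of one update per incoming record.
            env.setRuntimeMode(RuntimeExecutionMode.BATCH);

            env.fromElements(Tuple2.of("a", 1L), Tuple2.of("a", 2L), Tuple2.of("b", 5L))
               .keyBy(t -> t.f0)
               .reduce((t1, t2) -> Tuple2.of(t1.f0, t1.f1 + t2.f1))
               .print();

            env.execute("per-key-reduce-in-batch-mode");
        }
    }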

Old s3 files referenced in sink's state after migration from 1.14 to 1.15

2022-10-05 Thread Vararu, Vadim
Hi all, We have some jobs that write Parquet files to S3, bucketing by processing time in a structure like /year/month/day/hour. On the 13th of September we migrated our Flink runtime from 1.14.5 to 1.15.2, and now some jobs are crashing at checkpointing because they are unable to find some S3…

S3 Parquet file rolling on event not working because PartFileInfo.getSize() does not increase

2023-05-10 Thread Vararu, Vadim
Hi all, Trying to have an S3 Parquet bulk writer with a file roll policy based on a size limit plus checkpoints. For that I’ve extended CheckpointRollingPolicy and overridden shouldRollOnEvent to return true if the part size is greater than the limit. The problem is that the part size that I…
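
For reference, a minimal sketch of the kind of policy described above, assuming Flink’s CheckpointRollingPolicy base class and PartFileInfo.getSize(); the class name and size-limit field are illustrative:

    import java.io.IOException;

    import org.apache.flink.streaming.api.functions.sink.filesystem.PartFileInfo;
    import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.CheckpointRollingPolicy;

    /** Rolls part files on every checkpoint and additionally once they exceed a size limit. */
    public class SizeLimitedCheckpointRollingPolicy<IN, BucketID> extends CheckpointRollingPolicy<IN, BucketID> {

        private final long maxPartSizeBytes;

        public SizeLimitedCheckpointRollingPolicy(long maxPartSizeBytes) {
            this.maxPartSizeBytes = maxPartSizeBytes;
        }

        @Override
        public boolean shouldRollOnEvent(PartFileInfo<BucketID> partFileState, IN element) throws IOException {
            // Expected to grow as records are written; the thread above reports it staying
            // flat for bulk (Parquet) part files.
            return partFileState.getSize() >= maxPartSizeBytes;
        }

        @Override
        public boolean shouldRollOnProcessingTime(PartFileInfo<BucketID> partFileState, long currentTime) {
            return false;
        }
    }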

Data duplicated with s3 file sink

2024-04-03 Thread Vararu, Vadim
…nfig.builder().withPartPrefix(String.format("part-%s", UUID.randomUUID())).withPartSuffix(String.format("%s.parquet", compressionCodecName.getExtension())).build()).build(); Thanks, Vararu Vadim.
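
The truncated snippet above appears to configure part-file naming for a bulk Parquet sink. For context only, a hedged sketch of how such a sink is typically assembled with FileSink; the writer factory, record type, suffix and path used here are assumptions, not the original code:

    import java.util.UUID;

    import org.apache.flink.connector.file.sink.FileSink;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.formats.parquet.avro.AvroParquetWriters;
    import org.apache.flink.streaming.api.functions.sink.filesystem.OutputFileConfig;
    import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.OnCheckpointRollingPolicy;

    public class ParquetSinkFactory {

        /** Builds a bulk Parquet FileSink with randomized part-file prefixes, as in the snippet above. */
        public static FileSink<MyRecord> build(String s3Path) {
            OutputFileConfig fileConfig = OutputFileConfig.builder()
                    .withPartPrefix(String.format("part-%s", UUID.randomUUID()))
                    .withPartSuffix(".snappy.parquet") // the original derives this from compressionCodecName
                    .build();

            return FileSink
                    .forBulkFormat(new Path(s3Path), AvroParquetWriters.forReflectRecord(MyRecord.class))
                    .withOutputFileConfig(fileConfig)
                    .withRollingPolicy(OnCheckpointRollingPolicy.build())
                    .build();
        }

        /** Placeholder record type. */
        public static class MyRecord {
            public String id;
            public long value;
        }
    }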

Data duplication at stop with savepoint

2024-04-15 Thread Vararu, Vadim
Hi community, I need your help to understand whether there is a misconfiguration or it’s a Flink bug. I’ve drawn a diagram for better understanding, but here is the problem in a few steps: …

Kinesis connector writes wrong sequence number at stop with savepoint

2024-04-15 Thread Vararu, Vadim
I’ve been investigating a data duplication issue in a Kinesis -> Flink -> Kafka exactly-once setup. I found out that at stop with savepoint the following things happen:
* The Kafka transaction is committed, with the last processed events being written
* The Kinesis sequence number is written in the…
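
For context, a hedged sketch of the Kafka side of such an exactly-once setup using the KafkaSink builder; the broker address, topic and transactional-id prefix are placeholders, and the Kinesis source and the savepoint behaviour under discussion are not shown:

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.base.DeliveryGuarantee;
    import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
    import org.apache.flink.connector.kafka.sink.KafkaSink;

    public class ExactlyOnceKafkaSinkFactory {

        /** Transactional Kafka sink; commits are tied to completed checkpoints/savepoints. */
        public static KafkaSink<String> build() {
            return KafkaSink.<String>builder()
                    .setBootstrapServers("broker-1:9092") // placeholder
                    .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                            .setTopic("output-topic")     // placeholder
                            .setValueSerializationSchema(new SimpleStringSchema())
                            .build())
                    .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                    // Required for EXACTLY_ONCE so transactional ids do not clash across restarts.
                    .setTransactionalIdPrefix("my-job-tx") // placeholder
                    .build();
        }
    }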

Proper way to modify log4j config file for kubernetes-session

2024-05-13 Thread Vararu, Vadim
Hi, Trying to configure loggers in the log4j-console.properties file (which is mounted from the host where kubernetes-session.sh is invoked and referenced by the TM processes via -Dlog4j.configurationFile). Is there a proper (documented) way to do that, meaning to append to/modify the log4j config…
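
As a point of reference, Flink’s default log4j-console.properties is in the Log4j 2 properties format and sets a monitorInterval, so the file is re-read periodically and edits to the mounted file can take effect without restarting the pods. A short illustrative example of the kind of entries one might append (logger name and level are placeholders):

    # Log4j 2 properties format, as used by Flink's log4j-console.properties.
    # With monitorInterval set, Log4j re-reads this file periodically, so changes
    # to the mounted file are picked up at runtime.
    monitorInterval = 30

    # Example of appending a custom logger (name and level are illustrative).
    logger.kinesis.name = org.apache.flink.streaming.connectors.kinesis
    logger.kinesis.level = DEBUG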

Re: Proper way to modify log4j config file for kubernetes-session

2024-05-14 Thread Vararu, Vadim
Yes, the dynamic log level modification worked great for me. Thanks a lot, Vadim
From: Biao Geng Date: Tuesday, 14 May 2024 at 10:07 To: Vararu, Vadim Cc: user@flink.apache.org Subject: Re: Proper way to modify log4j config file for kubernetes-session
Hi Vararu, Does this document meet your…

Flink kinesis connector 4.3.0 release estimated date

2024-05-22 Thread Vararu, Vadim
Hi guys, Any idea when the 4.3.0 Kinesis connector is estimated to be released? Cheers, Vadim.

Re: Flink kinesis connector 4.3.0 release estimated date

2024-05-23 Thread Vararu, Vadim
That’s great news. Thanks.
From: Leonard Xu Date: Thursday, 23 May 2024 at 04:42 To: Vararu, Vadim Cc: user, Danny Cranmer Subject: Re: Flink kinesis connector 4.3.0 release estimated date
Hey, Vararu The kinesis connector 4.3.0 release is under vote phase and we hope to finalize the…