State Memory Leak Issue in Flink CEP v1.17: Discussion & Proposed Solution

2023-08-20 Thread Puneet Duggal
Hi, We have encountered a state memory leak issue while working with Flink CEP (Complex Event Processing) in version 1.17. Based on the code, the issue appears to persist across all versions of the CEP library. I've already raised a JIRA issue detailing the problem we've faced. You can fi

Query Regarding Optimisation of Timer Management in Flink CEP (Version 1.17)

2023-07-03 Thread Puneet Duggal
Hi, I am currently working with Flink CEP version 1.17, and I am in the process of load testing for potential memory leaks related to checkpoint data. While analyzing the CepOperator code, I have come across a particular pattern regarding timer registration and event processing that I believe c

Job Cancellation Failing

2023-02-19 Thread Puneet Duggal
Flink Cluster Context: Flink Version - 1.15; Deployment Mode - Session; Number of Job Managers - 3 (HA); Number of Task Managers - 1. Cancellation of the job fails due to the following: org.apache.flink.runtime.rest.NotFoundException: Job 1cb2185d4d72c8c6f0a3a549d7de4ef0 not found at org.apache.flink.runti

Rocksdb Incremental checkpoint

2022-12-19 Thread Puneet Duggal
Hi, After going through the following article regarding RocksDB incremental checkpointing (https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html), my understanding was that at each checkpoint, Flink only checkpoints newly created SSTables, whereas the others it can reference from

Re: Flink CEP Incremental Checkpoint Issue

2022-11-02 Thread Puneet Duggal
Hi Yun Tang, Thank you for the response. Yes, I went through some articles that explain the RocksDB incremental checkpointing mechanism, and it makes sense w.r.t. the metrics that I am seeing. Regards, Puneet > On 22-Oct-2022, at 1:26 PM, Yun Tang wrote: > > compaction

Re: Flink CEP Incremental Checkpoint Issue

2022-10-17 Thread Puneet Duggal
Apologies for the calculation mistake: 120*6*2KB = 1440KB = 1.4MB > On 18-Oct-2022, at 1:35 AM, Puneet Duggal wrote: > > Hi, > > I am working on a use case which uses Flink CEP for pattern detection. > > Flink Version - 1.12.1 > Deployment Mode - Session Mode (H

Re: Job Manager getting restarted while restarting task manager

2022-10-13 Thread Puneet Duggal
t make the JobManager restart. You can provide the whole log as an attachment to investigate. On Wed, 12 Oct 2022 at 6:01 PM, Puneet Duggal <puneetduggal1...@gmail.com> wrote: Hi Xintong Song, Thanks for your immediate reply. Yes, I do restart task manager via kill command and then flink resta

Re: Job Manager getting restarted while restarting task manager

2022-10-12 Thread Puneet Duggal
l ' command, or a > kubernetes pod removal / eviction, etc. You may want to check where the > signal came from. > > Best, > Xintong > > > On Wed, Oct 12, 2022 at 6:26 AM Puneet Duggal <puneetduggal1...@gmail.com> wrote: > Hi, > > I am faci

Job Manager getting restarted while restarting task manager

2022-10-11 Thread Puneet Duggal
Hi, I am facing an issue where restarting a task manager after some configuration changes causes the leader job manager to restart as well, even though the task manager itself restarts successfully with the updated configuration. Pasting the leader job manager logs here 2022-10-11

Re: Task Manager shutdown causing jobs to fail

2022-03-07 Thread Puneet Duggal
che.org/flink/flink-docs-release-1.14/docs/ops/state/task_failure_recovery/ > > <https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/ops/state/task_failure_recovery/> > for more details. > > On Fri, Mar 4, 2022 at 2:50 AM Puneet Duggal <mailto:puneetdugga

Task Manager shutdown causing jobs to fail

2022-03-03 Thread Puneet Duggal
Hi, Currently in production, I have an HA session mode Flink cluster with 3 job managers and multiple task managers with more than enough free task slots. But I have seen multiple times that whenever a task manager goes down (e.g. due to a heartbeat issue), so do all the jobs running on it, even wh

Re: AWS Kinesis Flink vs K8s

2022-03-03 Thread Puneet Duggal
y. > > Hope this helps, > Jeremy > > > > > From: Danny Cranmer <dannycran...@apache.org> > Date: Wednesday, February 16, 2022 at 9:27 AM > To: Puneet Duggal <puneetduggal1...@gmail.com>, "Ber, Jeremy" <mailto:jd...@a

AWS Kinesis Flink vs K8s

2022-02-16 Thread Puneet Duggal
Hi, I just wanted to ask the community about the pros and cons of deploying Flink using AWS Kinesis vs. K8s application mode. Currently we deploy the Flink cluster in HA session standalone mode and are planning to switch to application deployment mode. Regards, Puneet

Re: Uploading jar to s3 for persistence

2022-01-10 Thread Puneet Duggal
it will solve is that when the job manager goes down or is restarted, currently it deletes the dir containing all the jars. Using persistent storage will also help in this case. Best, Puneet Duggal > On 10-Jan-2022, at 9:40 PM, Puneet Duggal wrote: > > Hi Piotr, > > Thank you for your imm

Re: Uploading jar to s3 for persistence

2022-01-10 Thread Puneet Duggal
/flink/flink-docs-master/docs/deployment/filesystems/s3/ It is mentioned that we can use S3 in all locations where Flink expects a FileSystem URI (including high availability and the RocksDB state backend). Regards, Puneet Duggal > On 10-Jan-2022, at 7:13 PM, Piotr Nowojski wrote: > >

Uploading jar to s3 for persistence

2022-01-09 Thread Puneet Duggal
Hi, Currently I am working with a Flink HA cluster with 3 job managers and 3 ZooKeeper nodes. I am also persisting my checkpoints to S3 and have therefore already configured the required flink-s3 jars during Flink job manager and task manager process startup. Now I have configured a variable web.upload.dir

Uploading jar on multiple flink job managers

2021-12-24 Thread Puneet Duggal
Hi, We are currently using Flink 1.12 in HA mode in production. There are 3 job managers (1 leader and 2 standby). When I upload a jar to one of the job managers, it is not reflected on the other job managers. Is there any way to achieve a behaviour where uploading a jar to

log4j2 upgrade requirement

2021-12-22 Thread Puneet Duggal
code. But my uber jar uses log4j version 2.17.0. So my question is whether I am missing any situation that would require upgrading the log4j version of the cluster as well, or whether upgrading the log4j version of my jar alone should suffice. Thanks, Puneet Duggal

Job Listener not working as expected

2021-12-07 Thread Puneet Duggal
onJobSubmitted and onJobExecuted are being called simultaneously on submitting a real time data streaming job. Currently, jobs are being deployed in session mode. Thanks, Puneet Duggal

Re: Duplicate Calls to Cep Filter

2021-11-06 Thread Puneet Duggal
Hi Dawid, Thank you for the clarification. If possible, could you please mention an example use case that illustrates the complexity of building the NFA and why the IGNORE action needs to be evaluated independently of the TAKE action? Thanks & Regards, Puneet Duggal On Mon, 1 Nov 2021 at 3:28 PM, D

Re: RE: Duplicate Calls to Cep Filter

2021-10-28 Thread Puneet Duggal
ting for the > following events, thus > the condition might be eval for two times. > > Best, > Yun > > > --Original Mail -- > *Sender:*Schwalbe Matthias > *Send Date:*Wed Oct 27 17:55:18 2021 > *Recipients:*Puneet Duggal , user < > user@flink.apache.o

Duplicate Calls to Cep Filter

2021-10-27 Thread Puneet Duggal
=[a], middle=[b]} As you can see, on ingestion of *b* the middle pattern is called twice. Any ideas about this behaviour? Thanks and regards, Puneet Duggal

Re: Latency Tracking in Apache Flink

2021-10-20 Thread Puneet Duggal
Hi, Yes, it is a simple ETL job, and I thought of using the start_time, end_time concept, but I just wanted to know if Flink or any third-party monitoring tools such as Datadog provide out-of-the-box functionality to report latency. Thanks and regards, Puneet Duggal > On 21-Oct-2021, at 8

Latency Tracking in Apache Flink

2021-10-20 Thread Puneet Duggal
Hi, Is there any way to track the latency / time taken to process each event? I read about latency markers, but they are not very useful, as they skip the time taken by each operator. Thanks, Puneet

Job Manager went down on cancelling job with savepoint

2021-09-24 Thread Puneet Duggal
Hi, While cancelling a job with a savepoint, the job was cancelled successfully, but immediately afterwards the job manager went down. I am not able to deduce anything from the given stack trace. Any help is appreciated. 2021-09-24 11:50:44,182 INFO org.apache.flink.runtime.checkpoint

Re: JVM Metaspace capacity planning

2021-09-15 Thread Puneet Duggal
oading-for-user-code> > > Best, > Guowei > > > On Mon, Sep 13, 2021 at 10:06 PM Puneet Duggal <puneetduggal1...@gmail.com> wrote: > Hi, > > Thank you for quick reply. So in my case i am using Datastream Apis. Each job > is a real time processin

Re: JVM Metaspace capacity planning

2021-09-13 Thread Puneet Duggal
> occur. There is already a JIRA ticket for this [1]. > > I don't know much about the behavior of class loaders, so I'll wait for > others to apply in this aspect. > > [1] https://issues.apache.org/jira/browse/FLINK-15024 > <https://issues.apache.org/jira/browse/FL

JVM Metaspace capacity planning

2021-09-13 Thread Puneet Duggal
Hi, After going through multiple resources, I got the basic idea that JVM Metaspace is used by the Flink class loader to load class metadata, which is used to create objects in the heap. Also, this is a one-time activity, since all the objects of a single class require a single class metadata object in JVM Metaspac

Re: Documentation for deep diving into flink (data-streaming) job restart process

2021-09-12 Thread Puneet Duggal
Hi Robert, Any solution / alternate approach to above issue would be appreciated as going live with new jobs will be unreliable w.r.t task manager going down. On Fri, Sep 10, 2021 at 1:17 PM Puneet Duggal wrote: > Hi Robert, > > Thanks for taking out time to go through the logs. >

Re: Documentation for deep diving into flink (data-streaming) job restart process

2021-09-10 Thread Puneet Duggal
> > Why are you restarting a TaskManager? > How are you deploying Flink? > > On Fri, Sep 10, 2021 at 12:46 AM Puneet Duggal <puneetduggal1...@gmail.com> wrote: > Hi, > > Please find attached logfile regarding job not getting restarted on another > task manag

Re: Documentation for deep diving into flink (data-streaming) job restart process

2021-09-09 Thread Puneet Duggal
e us with the JobManager logs of this incident? Jobs should > not disappear, they should restart on other Task Managers. > > On Wed, Sep 8, 2021 at 3:06 PM Puneet Duggal > wrote: > >> Hi, >> >> So for past 2-3 days i have been looking for documentation which >>

Documentation for deep diving into flink (data-streaming) job restart process

2021-09-08 Thread Puneet Duggal
Hi, For the past 2-3 days I have been looking for documentation that elaborates on how Flink takes care of restarting a data streaming job. I know all the restart and failover strategies, but wanted to know how the different components (Job Manager, Task Manager etc) play a role while restarting the