Hi,
We have encountered a state memory leak issue while working with Flink CEP
(Complex Event Processing) in version 1.17. Based on the code, it seems like
the issue is present across all versions of the CEP library.
I've already raised a JIRA issue detailing the problem we've faced. You can
fi
Hi,
I am currently working with Flink CEP version 1.17, and I am in the process of
load testing for potential memory leaks related to checkpoint data. While
analyzing the CepOperator code, I have come across a particular pattern
regarding timer registration and event processing that I believe c
Flink Cluster Context:
Flink Version - 1.15
Deployment Mode - Session
Number of Job Managers - 3 (HA)
Number of Task Managers - 1
Cancellation of the job fails due to the following:
org.apache.flink.runtime.rest.NotFoundException: Job
1cb2185d4d72c8c6f0a3a549d7de4ef0 not found
at
org.apache.flink.runti
Hi,
After going through the following article regarding RocksDB incremental
checkpointing
(https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html),
my understanding was that at each checkpoint, Flink only checkpoints newly
created SSTables, whereas the others it can reference from
Hi Yun Tang,
Thank you for the response. Yes, I went through some articles that explain
the RocksDB incremental checkpointing mechanism, and it makes sense w.r.t. the
metrics that I am seeing.
Regards,
Puneet
> On 22-Oct-2022, at 1:26 PM, Yun Tang wrote:
>
> compaction
Apologies for the calculation mistake:
120*6*2KB = 1440KB ≈ 1.4MB
> On 18-Oct-2022, at 1:35 AM, Puneet Duggal wrote:
>
> Hi,
>
> I am working on a use case which uses Flink CEP for pattern detection.
>
> Flink Version - 1.12.1
> Deployment Mode - Session Mode (H
t make the JobManager restart. You can provide the whole log as an attachment to investigate.
On Wed, 12 Oct 2022 at 6:01 PM, Puneet Duggal <puneetduggal1...@gmail.com> wrote:
Hi Xintong Song,
Thanks for your immediate reply. Yes, I do restart task manager via kill command and then flink resta
l ' command, or a
> kubernetes pod removal / eviction, etc. You may want to check where the
> signal came from.
>
> Best,
> Xintong
>
>
> On Wed, Oct 12, 2022 at 6:26 AM Puneet Duggal <puneetduggal1...@gmail.com> wrote:
> Hi,
>
> I am faci
Hi,
I am facing an issue where restarting a task manager after some configuration
changes causes the leader job manager to restart as well, even though the task
manager itself restarts successfully with the updated configuration. Pasting
the leader job manager logs here:
2022-10-11
che.org/flink/flink-docs-release-1.14/docs/ops/state/task_failure_recovery/
>
> <https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/ops/state/task_failure_recovery/>
> for more details.
>
> On Fri, Mar 4, 2022 at 2:50 AM Puneet Duggal <mailto:puneetdugga
Hi,
Currently in production, I have an HA session-mode Flink cluster with 3 job
managers and multiple task managers with more than enough free task slots. But
I have seen multiple times that whenever a task manager goes down (e.g. due to
a heartbeat issue), so do all the jobs running on it, even wh
y.
>
> Hope this helps,
> Jeremy
>
>
>
>
> From: Danny Cranmer <dannycran...@apache.org>
> Date: Wednesday, February 16, 2022 at 9:27 AM
> To: Puneet Duggal <puneetduggal1...@gmail.com>, "Ber, Jeremy" <jd...@a
Hi,
Just wanted to ask the community about the pros and cons of deploying Flink
using AWS Kinesis vs. using K8s application mode. Currently we are deploying
the Flink cluster in HA session standalone mode and are planning to switch to
application deployment mode.
Regards,
Puneet
it will solve is that when the job manager goes down or is restarted, it
currently deletes the directory containing all the jars. Using persistent
storage will also help in this case.
Best,
Puneet Duggal
> On 10-Jan-2022, at 9:40 PM, Puneet Duggal wrote:
>
> Hi Piotr,
>
> Thank you for your imm
/flink/flink-docs-master/docs/deployment/filesystems/s3/
It is mentioned that we can use S3 in all locations where Flink expects a
FileSystem URI (including high availability and the RocksDB state backend).
Regards,
Puneet Duggal
> On 10-Jan-2022, at 7:13 PM, Piotr Nowojski wrote:
>
>
Hi,
Currently I am working with a Flink HA cluster with 3 job managers and 3
ZooKeeper nodes. I am also persisting my checkpoints to S3 and have therefore
already configured the required flink-s3 jars during Flink job manager and
task manager process startup. Now I have configured a variable
web.upload.dir
Hi,
So currently we are using Flink 1.12 in HA mode in production. There are 3 job
managers (1 leader and 2 standby). When I upload a jar to one of the job
managers, it is somehow not reflected on the other job managers. Is there any
way to achieve a behaviour where uploading a jar to
code. But my uber jar uses log4j version 2.17.0.
So my question is whether there is any situation that I am missing because of
which I should upgrade the log4j version of the cluster as well, or whether
just upgrading the log4j version of my jar should suffice.
Thanks,
Puneet Duggal
onJobSubmitted and
onJobExecuted are being called simultaneously on submitting a real time data
streaming job. Currently, jobs are being deployed in session mode.
Thanks,
Puneet Duggal
Hi Dawid,
Thank you for the clarification. If possible, could you please mention an
example use case that illustrates the complexity of building the NFA and why
the IGNORE action needs to be evaluated independently of the TAKE action?
Thanks & Regards,
Puneet Duggal
On Mon, 1 Nov 2021 at 3:28 PM, D
ting for the
> following events, thus
> the condition might be evaluated twice.
>
> Best,
> Yun
>
>
> --Original Mail --
> *Sender:*Schwalbe Matthias
> *Send Date:*Wed Oct 27 17:55:18 2021
> *Recipients:*Puneet Duggal , user <
> user@flink.apache.o
=[a], middle=[b]}
As you can see, on ingestion of *b* the middle pattern is getting called
twice. Any ideas about this behaviour?
Thanks and regards,
Puneet Duggal
Hi,
Yes, it is a simple ETL job and I thought of using the start_time, end_time
concept, but just wanted to know if Flink or any third-party monitoring tools
like Datadog provide out-of-the-box functionality to report latency.
Thanks and regards,
Puneet Duggal
> On 21-Oct-2021, at 8
Hi,
Is there any way to track the latency / time taken to process each event? I
read about latency markers, but they are not very useful as they skip the time
taken by each operator.
Thanks,
Puneet
Hi,
So while cancelling one job with a savepoint, even though the job got cancelled
successfully, the job manager somehow went down immediately after that. I am
not able to deduce anything from the given stack trace. Any help is appreciated.
2021-09-24 11:50:44,182 INFO
org.apache.flink.runtime.checkpoint
oading-for-user-code>
>
> Best,
> Guowei
>
>
> On Mon, Sep 13, 2021 at 10:06 PM Puneet Duggal <puneetduggal1...@gmail.com> wrote:
> Hi,
>
> Thank you for the quick reply. So in my case I am using the DataStream APIs. Each job
> is a real time processin
> occur. There is already a JIRA ticket for this [1].
>
> I don't know much about the behavior of class loaders, so I'll wait for
> others to apply in this aspect.
>
> [1] https://issues.apache.org/jira/browse/FLINK-15024
> <https://issues.apache.org/jira/browse/FL
Hi,
So on going through multiple resources, I got the basic idea that JVM Metaspace
is used by the Flink class loader to load class metadata, which is used to
create objects in the heap. Also, this is a one-time activity since all the
objects of a single class require a single class metadata object in JVM Metaspac
Hi Robert,
Any solution / alternate approach to the above issue would be appreciated, as
going live with new jobs will be unreliable w.r.t. task managers going down.
On Fri, Sep 10, 2021 at 1:17 PM Puneet Duggal
wrote:
> Hi Robert,
>
> Thanks for taking out time to go through the logs.
>
>
> Why are you restarting a TaskManager?
> How are you deploying Flink?
>
> On Fri, Sep 10, 2021 at 12:46 AM Puneet Duggal <puneetduggal1...@gmail.com> wrote:
> Hi,
>
> Please find attached logfile regarding job not getting restarted on another
> task manag
e us with the JobManager logs of this incident? Jobs should
> not disappear, they should restart on other Task Managers.
>
> On Wed, Sep 8, 2021 at 3:06 PM Puneet Duggal
> wrote:
>
>> Hi,
>>
>> So for past 2-3 days i have been looking for documentation which
>>
Hi,
So for the past 2-3 days I have been looking for documentation that explains
how Flink takes care of restarting a data streaming job. I know all the restart
and failover strategies, but wanted to know how the different components (Job
Manager, Task Manager, etc.) play a role while restarting the
32 matches