Re: How to list operators and see UID

2024-04-03 Thread Asimansu Bera
t;: 0, "CANCELING": 0, "FAILED": 0, "SCHEDULED": 0, "DEPLOYING": 0, "FINISHED": 0, "CANCELED": 0, "RECONCILING": 0 }, "metrics": { "read-bytes": 237863,

Re: GCS FileSink Read Timeouts

2024-04-02 Thread Asimansu Bera
Hello Dylan, I'm not an expert. There are many configuration settings(tuning) which could be setup via flink configuration. Pls refer to the second link below - specifically retry options. https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/filesystems/gcs/ https://github.

Re: How to handle tuple keys with null values

2024-04-02 Thread Asimansu Bera
Hello Sachin, The same issue had been reported in the past and JIRA was closed without resolution. https://issues.apache.org/jira/browse/FLINK-4823 I do see this is as a data quality issue. You need to understand what you would like to do with the null value. Either way, better to filter out the

Re: Flink pipeline throughput

2024-04-01 Thread Asimansu Bera
su and Xuyang. > > I am using Flink 1.17.0 > > Regards, > Kartik > > On Mon, Apr 1, 2024, 5:13 AM Asimansu Bera > wrote: > >> Hello Karthik, >> >> You may check the execution-buffer-timeout-interval parameter. This value >> is an important on

Re: Flink pipeline throughput

2024-03-31 Thread Asimansu Bera
Hello Karthik, You may check the execution-buffer-timeout-interval parameter. This value is an important one for your case. I had a similar issue experienced in the past. https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/config/#execution-buffer-timeout-interval For your

Re: Understanding checkpoint/savepoint storage requirements

2024-03-27 Thread Asimansu Bera
To add more details to it so that it will be clear why access to persistent object stores for all JVM processes are required for a job graph of Flink for consistent recovery. *JoB Manager:* Flink's JobManager writes critical metadata during checkpoints for fault tolerance: - Job Configuration:

Re: Understanding RocksDBStateBackend in Flink on Yarn on AWS EMR

2024-03-21 Thread Asimansu Bera
Hello Sachin, Typically, Cloud VMs are ephemeral, meaning that if the EMR cluster goes down or VMs are required to be shut down for security updates or due to faults, new VMs will be added to the cluster. As a result, any data stored in the local file system, such as file://tmp, would be lost. To

Re: End-to-end lag spikes when closing a large number of panes

2024-03-20 Thread Asimansu Bera
Hello Caio, Based on the pseudocode, there is no keyed function present. Hence, the window will not be processed parallely . Please check again and respond back. val windowDataStream = inputDataStream .window(TumblingEventTimeWindows of 1 hour) .trigger(custom trigger) .aggregat

Re: Question around manually setting Flink jobId

2024-03-14 Thread Asimansu Bera
Hello Venkat, There are few ways to get the JobID from the client side. JobID is alpha numeric as 9eec4d17246b5ff965a43082818a3336. When you submit the job using flink command line client , Job is returned as Job has been submitted with JobID 9eec4d17246b5ff965a43082818a3336 1. using below comma

Re: Task Manager getting killed while executing sql queries.

2024-02-15 Thread Asimansu Bera
Hello Kanchi, It's recommended to submit a separate request or issue for the problem you're encountering, as the data pipeline is distinct from the one Neha raised. This will help ensure that each issue can be addressed individually and efficiently. Hello Neha, Not sure about the issue you are en

Re: Impact of RocksDB backend on the Java heap

2024-02-15 Thread Asimansu Bera
t; > The memory RocksDB manages is outside the JVM, yes, but the mentioned > subsets must be bridged to the JVM somehow so that the data can be exposed > to the functions running inside Flink, no? > > Regards, > Alexis. > > > On Thu, 15 Feb 2024, 14:06 Asimansu Bera, w

Re: Impact of RocksDB backend on the Java heap

2024-02-15 Thread Asimansu Bera
Hello Alexis, RocksDB resides off-heap and outside of JVM. The small subset of data ends up on the off-heap in the memory. For more details, check the following link: https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/memory/mem_setup_tm/#managed-memory I hope this addres