Re: Flink operator job restart

2023-08-10 Thread Shammon FY
Hi Ethan, You can restart a job from a specified checkpoint directory, as described in [1]. More commonly, though, jobs are restarted from a savepoint; see [2] for more information. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/checkpoints/#resuming-from-a-retained-checkpoint [2]

In Flink, is there a way to merge two streams in stateful manner

2023-08-10 Thread Muazim Wani
Hi Team, I have a use case where I have two streams and want to join them in a stateful manner, e.g. the data of stream 1 should be emitted before that of stream 2. I tried to store the data in ListState in a KeyedProcessFunction, but I am not able to access the state outside processElement(). Is there any way I could

Re: In Flink, is there a way to merge two streams in stateful manner

2023-08-10 Thread Ken Krugler
Hi Muazim, In Flink, a stream of data (unless bounded) is assumed to never end. So in your example below, this means stream 2 would NEVER be emitted, because stream 1 would never end (there is no time at which you know for sure that stream 1 is done). And this in turn means stream 2 would be b

Questions related to Autoscaler

2023-08-10 Thread Hou, Lijuan via user
Hi Ron, Thanks for the reply! > 1 - It seems that for a Flink job deployed with the Flink operator, the > only way to get autoscaling is to enable the Autoscaler feature, and > KEDA won’t work, right? What does KEDA mean? -> KEDA is a Kubernetes-based Event Driven Autoscaler. I found

Re: In Flink, is there a way to merge two streams in stateful manner

2023-08-10 Thread Muazim Wani
Thanks for the response! I have a specific use case where I am writing to a text file sink. I have a bounded stream of header data and need to merge it with another bounded stream. While writing the data to a text file, the header data should be written before the original data (from another bounded
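The ordering requirement described above (all header records first, then the data records, both inputs bounded) can be illustrated with a minimal plain-Java sketch. This is not Flink API: the class `HeaderFirstMerge` and the method `mergeHeaderFirst` are hypothetical names, and the sketch assumes a single thread, so it says nothing about ordering across parallel subtasks.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class HeaderFirstMerge {

    // Hypothetical helper: drains the bounded header input completely,
    // then the bounded data input. This only terminates because both
    // inputs are bounded; an unbounded header input would never let
    // the data input start.
    static List<String> mergeHeaderFirst(Iterator<String> header, Iterator<String> data) {
        List<String> out = new ArrayList<>();
        while (header.hasNext()) {
            out.add(header.next());
        }
        while (data.hasNext()) {
            out.add(data.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = mergeHeaderFirst(
                List.of("id,name").iterator(),
                List.of("1,alice", "2,bob").iterator());
        // Header line is printed first, followed by the data lines.
        lines.forEach(System.out::println);
    }
}
```

In a real job the same "finish input 1, then read input 2" behaviour would have to come from the framework, since records from two independent streams otherwise arrive in no guaranteed relative order.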

Re: In Flink, is there a way to merge two streams in stateful manner

2023-08-10 Thread Ken Krugler
As (almost) always, the devil is in the details. You haven’t said, but I’m assuming you’re writing out multiple files, each with a different schema, as otherwise you could just leverage the existing Flink support for CSV. So then you could combine the header/footer streams (adding a flag for he

Local process and Docker have different behavior

2023-08-10 Thread Daniel Henneberger
Dear Apache Flink community, When I run Flink locally in my test cases on my Mac, I observe different behavior compared to running it in my Docker-backed build instance or using the official Docker Compose image. The processes complete as expected when I run them in-process, but not always when I use

Re: Questions related to Autoscaler

2023-08-10 Thread Chen Zhanghao
Q1: If you use the operator to submit a standalone-mode job with reactive mode enabled, KEDA should still work. Q2: As for Flink versions, 1.17 is recommended, but 1.15 is also okay if you backport the necessary changes listed in "Autoscaler | Apache Flink Kubernetes Operator"

Re: Flink operator job restart

2023-08-10 Thread Chen Zhanghao
Hi Ethan, You can refer to the K8s operator doc on how to do a stateful job upgrade: Job Management | Apache Flink Kubernetes Operator.

Re: In Flink, is there a way to merge two streams in stateful manner

2023-08-10 Thread Hang Ruan
Hi, Muazim. I think the Hybrid Source [1] may be helpful for your case. Best, Hang Ken Krugler wrote on 2023-08-11 (Fri) 04:18: > As (almost) always, the devil is in the details. > > You haven’t said, but I’m assuming you’re writing out multiple files, each > with a different schema, as otherwise you cou

Re: In Flink, is there a way to merge two streams in stateful manner

2023-08-10 Thread Hang Ruan
PS: I forgot the link: Hybrid Source [1] [1] https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/connectors/datastream/hybridsource/ Hang Ruan wrote on 2023-08-11 (Fri) 10:14: > Hi, Muazim. > > I think the Hybrid Source [1] may be helpful for your case. > > Best, > Hang > > Ken Krugler wrote on 2023-08-1

Global/Shared objects

2023-08-10 Thread Kamal Mittal via user
Hello, Is it possible to create global/shared objects, such as static fields, that are shared among the slots in a TaskManager? Is it OK to create such objects in Flink? Regards, Kamal

Re: In Flink, is there a way to merge two streams in stateful manner

2023-08-10 Thread Muazim Wani
Thank you so much for taking the time to provide me with such a detailed response. Your assistance has been incredibly helpful in clarifying my understanding! Let me provide you with the exact scenario, as I think there might be some misunderstanding. All the streams are bounded and parallelism is s

Re: Global/Shared objects

2023-08-10 Thread Hang Ruan
Hi, Kamal. Each TaskManager is a JVM process and each task slot is a thread of the TaskManager. For more information, see [1]. Static fields can be shared among subtasks in the same TaskManager. If the subtasks are running in different TaskManagers, they cannot share static fields. Best,
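
The point about static fields can be illustrated with a minimal plain-Java sketch. It uses no Flink API: ordinary threads stand in for subtasks running in the same TaskManager JVM, and `SharedStaticDemo`, `SHARED_COUNTER`, and `runWithThreads` are hypothetical names chosen for illustration. It also shows why such shared state needs to be thread-safe.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedStaticDemo {

    // A static field exists once per loaded class in the JVM, so every
    // thread in this process sees the same counter. AtomicInteger keeps
    // concurrent increments from being lost.
    static final AtomicInteger SHARED_COUNTER = new AtomicInteger();

    // Hypothetical helper: starts `threads` workers (stand-ins for
    // subtasks in one JVM) that all increment the same static counter,
    // waits for them, and returns the final value.
    static int runWithThreads(int threads, int incrementsPerThread) {
        SHARED_COUNTER.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    SHARED_COUNTER.incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return SHARED_COUNTER.get();
    }

    public static void main(String[] args) {
        // All four threads updated the same static field.
        System.out.println(runWithThreads(4, 1000)); // prints 4000
    }
}
```

A second JVM process running the same class would have its own, independent counter, which mirrors the statement that subtasks in different TaskManagers cannot share static fields.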