Re: Deploy Flink on YARN or Kubernetes.

2022-12-20 Thread Biao Geng
Hi Ruibin, I think it may be hard to say which provider is always more recommended than the other. The answer to your question depends heavily on your team's technical stack, your platform and your expectations for the new cluster. I *cannot* give you any advice but I just want to share some observation

RE: [EXTERNAL]Re: aws glue connector

2022-12-20 Thread Katz, David L via user
Hi Danny- Thanks for your response! At first glance I’m not sure; I will drill in and look deeper. For clarification, a few points I probably should have expressed initially: 1. I’m running Flink as a Kinesis Data Application (AWS serverless application) 2. The reason I want to use strea

Re: aws glue connector

2022-12-20 Thread Danny Cranmer
Hello David, There is a FLIP [1] to add native Glue Catalog support and we already have Glue Schema Registry format plugins [2][3]; however, these are DataStream API only. Do you intend to use just the Glue schema features or to leverage other features? Would either of the things I mentioned

Re: Deploy Flink on YARN or Kubernetes.

2022-12-20 Thread Márton Balassi
Hi Ruibin, Given that you are starting fresh I would recommend going with Kubernetes and specifically checking out the Flink Kubernetes Operator. [1] I worked with YARN for years before I transitioned to Kubernetes a year ago and I am pleased that we made the jump. To address your point on a v

Re: Windowing query with group by produces update stream

2022-12-20 Thread Theodor Wübker
I actually managed to fix this already :) For those wondering, I grouped by both window start and end first. That did it! > On 19. Dec 2022, at 15:43, Theodor Wübker wrote: > > Hey everyone, > > I would like to run a Windowing-SQL query with a group-by clause on a Kafka > topic and write th

Re: Understanding pipelined regions

2022-12-20 Thread Gen Luo
Hi Raihan, As the description of PipelinedRegion says, a pipelined region is a set of vertices connected via pipelined data exchanges. For example, take a job with a DAG A->B in which both tasks have two subtasks. If the edge between A and B is a forward edge, there are two pipelined regions: (A1
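
The definition quoted above (a set of vertices connected via pipelined data exchanges) can be illustrated with a small stand-alone sketch. This is not Flink's implementation: the function name `pipelined_regions`, the `"forward"` / `"all-to-all"` pattern labels, and the subtask naming (`A1`, `B2`, ...) are hypothetical, and every edge is assumed to be a pipelined exchange. Under those assumptions, regions are simply the connected components of the subtask graph.

```python
from collections import defaultdict


def pipelined_regions(parallelism, edges):
    """Group subtasks into regions (connected components over pipelined edges).

    parallelism: dict mapping a vertex name to its number of subtasks.
    edges: list of (upstream, downstream, pattern) tuples, where pattern is
           "forward" or "all-to-all" (hypothetical labels for this sketch).
    Assumption: every edge is pipelined; forward edges connect subtasks
    with the same index, so both sides should have equal parallelism.
    """
    # Union-find structure with one entry per subtask, e.g. "A1", "A2".
    parent = {
        f"{vertex}{i}": f"{vertex}{i}"
        for vertex, count in parallelism.items()
        for i in range(1, count + 1)
    }

    def find(node):
        # Follow parent links to the root, compressing the path on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        parent[find(a)] = find(b)

    for upstream, downstream, pattern in edges:
        for i in range(1, parallelism[upstream] + 1):
            for j in range(1, parallelism[downstream] + 1):
                # all-to-all links every pair; forward links equal indices only.
                if pattern == "all-to-all" or i == j:
                    union(f"{upstream}{i}", f"{downstream}{j}")

    groups = defaultdict(set)
    for subtask in parent:
        groups[find(subtask)].add(subtask)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    tasks = {"A": 2, "B": 2}
    print(pipelined_regions(tasks, [("A", "B", "forward")]))
    print(pipelined_regions(tasks, [("A", "B", "all-to-all")]))
```

In this sketch a forward edge keeps each pair of same-index subtasks in its own region, while an all-to-all pipelined edge merges all four subtasks into a single region.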

RE: Understanding pipelined regions

2022-12-20 Thread Schwalbe Matthias
Hi Sunny, Welcome to Flink 😊. The next thing for you to consider is to set up checkpointing [1], which allows a failing job to pick up from where it stopped. Sincere greetings from the supposed close-by Zurich 😊 Thias [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastre