Re: How to process mini-batch events in Flink with Datastream API

2023-02-10 Thread Leon Xu
Thanks Austin. Will take a look at the AsyncIO. Looks like a pretty cool feature. On Fri, Feb 10, 2023 at 1:31 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote: > It's been a while, but I think I've done something similar before with > Async I/O [1] and batching records with a window. >

Could savepoints contain in-flight data?

2023-02-10 Thread Alexis Sarda-Espinosa
Hello, One feature of unaligned checkpoints is that the checkpoint barriers can overtake in-flight data, so the buffers are persisted as part of the state. The documentation for savepoints doesn't mention anything explicitly, so just to be sure, will savepoints always wait for in-flight data to

Re: How to process mini-batch events in Flink with Datastream API

2023-02-10 Thread Austin Cawley-Edwards
It's been a while, but I think I've done something similar before with Async I/O [1] and batching records with a window. This was years ago, so no idea if this was/is good practice, but essentially it was: -> Window by batch size (with a timeout trigger to maintain some SLA) -> Process function
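
The "window by batch size, with a timeout" idea in the message above can be sketched in plain Python. This is not Flink code and not the poster's implementation: the `MicroBatcher` class, its parameters, and the explicit `now` argument are hypothetical names chosen for illustration. In a Flink job, a window with a count-or-timeout trigger, or keyed state plus timers, would play this role, and the `flush` callback stands in for the batched external call.

```python
from typing import Callable, Generic, List, Optional, TypeVar

T = TypeVar("T")


class MicroBatcher(Generic[T]):
    """Buffer events and emit a batch on size or age, whichever comes first."""

    def __init__(self, max_size: int, max_wait_s: float,
                 flush: Callable[[List[T]], None]) -> None:
        self.max_size = max_size      # emit as soon as this many events are buffered
        self.max_wait_s = max_wait_s  # ...or once the oldest buffered event is this old
        self.flush = flush            # hypothetical stand-in for one batched service call
        self._buffer: List[T] = []
        self._first_seen: Optional[float] = None

    def add(self, event: T, now: float) -> None:
        """Buffer one event; emit immediately if the batch is full."""
        if not self._buffer:
            self._first_seen = now    # the timeout is measured from the first event
        self._buffer.append(event)
        if len(self._buffer) >= self.max_size:
            self._emit()

    def on_timer(self, now: float) -> None:
        """Call periodically; emits a partial batch once it has waited long enough."""
        if self._buffer and now - self._first_seen >= self.max_wait_s:
            self._emit()

    def _emit(self) -> None:
        batch, self._buffer = self._buffer, []
        self._first_seen = None
        self.flush(batch)


# Usage: three events fill a batch; the fourth is flushed by the timeout.
batches: List[List[int]] = []
batcher = MicroBatcher(max_size=3, max_wait_s=5.0, flush=batches.append)
for i, t in enumerate([0.0, 1.0, 2.0, 3.0]):
    batcher.add(i, now=t)
batcher.on_timer(now=9.0)
```

Passing the clock in as `now` keeps the sketch deterministic; the timeout is what bounds how long a partial batch can wait, which is the SLA concern mentioned in the message.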

Re: How to process mini-batch events in Flink with Datastream API

2023-02-10 Thread Leon Xu
I wonder if windows will be the solution when it comes to the DataStream API. On Fri, Feb 10, 2023 at 12:07 PM Leon Xu wrote: > Hi Flink Users, > > We wanted to use Flink to run a decoration pipeline, where we would like > to make calls to some external service to fetch data and alter the event in

How to process mini-batch events in Flink with Datastream API

2023-02-10 Thread Leon Xu
Hi Flink Users, We wanted to use Flink to run a decoration pipeline, where we would like to make calls to some external service to fetch data and alter the event in the Flink pipeline. Since an external service call is involved, we want to batch the calls to reduce the load on the

Pyflink Side Output Question and/or suggested documentation change

2023-02-10 Thread Andrew Otto
Question about side outputs and OutputTags in pyflink. The docs say we are supposed to yield output_tag, value. The docs then say: > For retrieving the side output stream you use getSideOutput(OutputTag) on the

How to deploy kubernetes flink operator on GKE

2023-02-10 Thread P Singh
Hi Team, I have tried many ways to deploy kubernetes flink operators on GKE link link2 link3

apache/flink docker images arm64

2023-02-10 Thread Roberts, Ben (Senior Developer) via user
Hi, Would it be possible for the arm64/v8 architecture images to be published to dockerhub apache/flink:1.16 and 1.16.1 please? I’m aware that the official docker flink image is now published in the arm64 arch, but that image doesn’t include a JDK, so it’d be super helpful to have the