Thanks, Austin. I'll take a look at Async I/O. It looks like a pretty cool feature.
On Fri, Feb 10, 2023 at 1:31 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:

> It's been a while, but I think I've done something similar before with
> Async I/O [1] and batching records with a window.
>
> This was years ago, so no idea if this was/is good practice, but
> essentially it was:
>
> -> Window by batch size (with a timeout trigger to maintain some SLA)
> -> Process function that just collects all records in the window
> -> Send the entire batch to the AsyncFunction
>
> This approach definitely has some downsides: you don't get to take
> advantage of some of the nice per-record things Async I/O gives you
> (ordering, retries, etc.), but it does greatly reduce the load on external
> services.
>
> Hope that helps,
> Austin
>
> [1]:
> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/operators/asyncio/
>
> On Fri, Feb 10, 2023 at 3:22 PM Leon Xu <l...@attentivemobile.com> wrote:
>
>> I wonder if windows will be the solution when it comes to the DataStream API.
>>
>> On Fri, Feb 10, 2023 at 12:07 PM Leon Xu <l...@attentivemobile.com> wrote:
>>
>>> Hi Flink Users,
>>>
>>> We want to use Flink to run a decoration pipeline, where we make
>>> calls to an external service to fetch data and alter the event in
>>> the Flink pipeline.
>>>
>>> Since there's an external service call involved, we want to batch
>>> calls to reduce the load on the external service (batching
>>> multiple Flink events and making just one external service call).
>>>
>>> It looks like mini-batch might be something we can leverage to achieve
>>> that, but that feature seems to exist only in the Table API. We are using
>>> the DataStream API and are wondering if there's any solution/workaround for
>>> this?
>>>
>>>
>>> Thanks
>>> Leon
>>>
>>
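
For anyone finding this thread later, here is a rough sketch of the batching idea Austin describes: buffer records, flush when the batch is full or a timeout expires, and make one asynchronous call per batch. It is plain Java and deliberately does not use any Flink API. In a real job the buffering would be done by the window and its trigger, and the call would live inside the AsyncFunction. The names `MicroBatcher`, `batchCall`, `maxBatchSize` and `maxWaitMillis` are made up for illustration, and the sketch assumes the external service returns exactly one output per input, in the same order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical helper (not a Flink class): one async external call per batch. */
class MicroBatcher<IN, OUT> implements AutoCloseable {
    private final int maxBatchSize;
    private final long maxWaitMillis;
    // Assumed shape of the external client: a batch in, a same-sized, same-ordered batch out.
    private final Function<List<IN>, CompletableFuture<List<OUT>>> batchCall;
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    private List<IN> buffer = new ArrayList<>();
    private List<CompletableFuture<OUT>> pending = new ArrayList<>();
    private ScheduledFuture<?> timeout;

    MicroBatcher(int maxBatchSize, long maxWaitMillis,
                 Function<List<IN>, CompletableFuture<List<OUT>>> batchCall) {
        this.maxBatchSize = maxBatchSize;
        this.maxWaitMillis = maxWaitMillis;
        this.batchCall = batchCall;
    }

    /** Buffers one record; the returned future completes once its batch's call returns. */
    synchronized CompletableFuture<OUT> add(IN record) {
        CompletableFuture<OUT> result = new CompletableFuture<>();
        buffer.add(record);
        pending.add(result);
        if (buffer.size() == 1) {
            // First record of a new batch: start the timer that bounds latency (the "SLA").
            Runnable onTimeout = this::flush;
            timeout = timer.schedule(onTimeout, maxWaitMillis, TimeUnit.MILLISECONDS);
        }
        if (buffer.size() >= maxBatchSize) {
            flush(); // Batch is full: send it without waiting for the timer.
        }
        return result;
    }

    /** Sends everything buffered so far as a single external call. */
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        if (timeout != null) {
            timeout.cancel(false);
        }
        List<IN> batch = buffer;
        List<CompletableFuture<OUT>> waiters = pending;
        buffer = new ArrayList<>();
        pending = new ArrayList<>();
        batchCall.apply(batch).whenComplete((outputs, error) -> {
            // Fan the batch response (or failure) back out to the per-record futures.
            for (int i = 0; i < waiters.size(); i++) {
                if (error != null) {
                    waiters.get(i).completeExceptionally(error);
                } else {
                    waiters.get(i).complete(outputs.get(i));
                }
            }
        });
    }

    /** Flushes any partial batch and stops the timer thread. */
    @Override
    public void close() {
        flush();
        timer.shutdownNow();
    }
}
```

With `maxBatchSize` set to 3, three calls to `add` result in a single invocation of `batchCall`. A partial batch goes out when the timer fires or when `close()` is called. As Austin notes, the trade-off is that retries and ordering are handled per batch rather than per record.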