It's been a while, but I think I've done something similar before with
Async I/O [1] and batching records with a window.

This was years ago, so no idea if this was/is good practice, but
essentially it was:

-> Window by batch size (with a timeout trigger to maintain some SLA)
-> Process function that just collects all records in the window
-> Send the entire batch to the AsyncFunction
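
A rough sketch of the three steps above, written in plain Java rather than
against the Flink API so that it runs on its own. The names
CountOrTimeoutBatcher and FakeEnrichmentClient are hypothetical stand-ins. In
a real job the buffering would be done by the window and its trigger, and the
batch call would live inside an AsyncFunction that completes its ResultFuture
with the enriched records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not a Flink class: buffers records until either
// maxSize records are collected or maxWaitMillis has passed since the first
// buffered record, mirroring a count window with a timeout trigger.
class CountOrTimeoutBatcher<T> {
    private final int maxSize;
    private final long maxWaitMillis;
    private final List<T> buffer = new ArrayList<>();
    private long firstArrivalMillis;

    CountOrTimeoutBatcher(int maxSize, long maxWaitMillis) {
        this.maxSize = maxSize;
        this.maxWaitMillis = maxWaitMillis;
    }

    // Adds a record; returns the batch if the size limit was reached,
    // otherwise an empty list.
    List<T> add(T record, long nowMillis) {
        if (buffer.isEmpty()) {
            firstArrivalMillis = nowMillis;
        }
        buffer.add(record);
        return buffer.size() >= maxSize ? drain() : new ArrayList<>();
    }

    // Meant to be called from a timer; returns a partial batch once the
    // timeout has elapsed, otherwise an empty list.
    List<T> onTimer(long nowMillis) {
        boolean expired = !buffer.isEmpty()
                && nowMillis - firstArrivalMillis >= maxWaitMillis;
        return expired ? drain() : new ArrayList<>();
    }

    private List<T> drain() {
        List<T> batch = new ArrayList<>(buffer);
        buffer.clear();
        return batch;
    }
}

// Stand-in for the external service: one asynchronous call per batch.
class FakeEnrichmentClient {
    final AtomicInteger calls = new AtomicInteger();

    CompletableFuture<List<String>> enrichBatch(List<String> batch) {
        calls.incrementAndGet();
        return CompletableFuture.supplyAsync(() -> {
            List<String> enriched = new ArrayList<>();
            for (String event : batch) {
                enriched.add(event + ":enriched");
            }
            return enriched;
        });
    }
}
```

The point of the sketch is that the external service sees one call per batch
rather than one per record, while the timer path bounds how long a partial
batch can wait.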

This approach definitely has some downsides: you don't get to take
advantage of some of the nice per-record things Async I/O gives you
(ordering, retries, etc.), but it does greatly reduce the load on external
services.

Hope that helps,
Austin

[1]:
https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/operators/asyncio/

On Fri, Feb 10, 2023 at 3:22 PM Leon Xu <l...@attentivemobile.com> wrote:

> I wonder if windows will be the solution when it comes to datastream API.
>
> On Fri, Feb 10, 2023 at 12:07 PM Leon Xu <l...@attentivemobile.com> wrote:
>
>> Hi Flink Users,
>>
>> We wanted to use Flink to run a decoration pipeline, where we would like
>> to make calls to some external service to fetch data and alter the event in
>> the Flink pipeline.
>>
>> Since there's an external service call involved, we want to batch the calls
>> to reduce the load on the external service (batching multiple Flink events
>> and making just one external service call).
>>
>> It looks like mini-batch might be something we can leverage to achieve
>> that, but that feature seems to exist only in the Table API. We are using
>> the DataStream API and are wondering if there's any solution/workaround for
>> this?
>>
>>
>> Thanks
>> Leon
>>
>
