Thanks, Austin. I'll take a look at Async I/O. It looks like a pretty cool
feature.

On Fri, Feb 10, 2023 at 1:31 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

> It's been a while, but I think I've done something similar before with
> Async I/O [1] and batching records with a window.
>
> This was years ago, so no idea if this was/is good practice, but
> essentially it was:
>
> -> Window by batch size (with a timeout trigger to maintain some SLA)
> -> Process function that just collects all records in the window
> -> Send the entire batch to the AsyncFunction
>
> This approach definitely has some downsides: you don't get to take
> advantage of some of the nice per-record things Async I/O gives you
> (ordering, retries, etc.), but it does greatly reduce the load on external
> services.
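
The three steps quoted above (buffer up to a batch size, flush early on a timeout, hand the whole batch to one asynchronous call) can be sketched without any Flink dependency. This is only a plain-Java illustration of the idea, not Flink code and not necessarily how Austin implemented it: `MicroBatcher`, `batchCall`, and `downstream` are hypothetical names, and in a real job the buffering would presumably be done by the window and its trigger, with the batch call living in the AsyncFunction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical helper for illustration only; it is not a Flink class.
final class MicroBatcher<T, R> {
    private final int maxBatchSize;
    private final long maxWaitMillis;
    // Assumed to start one non-blocking call to the external service per batch.
    private final Function<List<T>, CompletableFuture<R>> batchCall;
    private final Consumer<R> downstream;
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    private List<T> buffer = new ArrayList<>();
    private ScheduledFuture<?> timeout;
    private long generation; // identifies the batch a timer was started for

    MicroBatcher(int maxBatchSize, long maxWaitMillis,
                 Function<List<T>, CompletableFuture<R>> batchCall,
                 Consumer<R> downstream) {
        this.maxBatchSize = maxBatchSize;
        this.maxWaitMillis = maxWaitMillis;
        this.batchCall = batchCall;
        this.downstream = downstream;
    }

    // Buffer one record; flush when the batch is full, otherwise make sure
    // a timer is running so a partial batch is not held past the SLA.
    synchronized void add(T record) {
        buffer.add(record);
        if (buffer.size() >= maxBatchSize) {
            flush();
        } else if (buffer.size() == 1) {
            final long startedFor = generation;
            timeout = timer.schedule(() -> flushIfStillCurrent(startedFor),
                    maxWaitMillis, TimeUnit.MILLISECONDS);
        }
    }

    // Ignore a timer that fires after its batch was already flushed by size.
    private synchronized void flushIfStillCurrent(long startedFor) {
        if (startedFor == generation) {
            flush();
        }
    }

    // Send everything buffered so far as a single batch call.
    private synchronized void flush() {
        if (timeout != null) {
            timeout.cancel(false);
            timeout = null;
        }
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = buffer;
        buffer = new ArrayList<>();
        generation++;
        batchCall.apply(batch).thenAccept(downstream);
    }

    // Flush the remaining records and stop the timer thread.
    synchronized void close() {
        flush();
        timer.shutdownNow();
    }
}
```

With `maxBatchSize` set to 3, adding four records and then calling `close()` results in two calls to `batchCall`: one with three records and one with the leftover record. Ordering and retries would still have to be handled per batch rather than per record, which is the trade-off mentioned above.
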
>
> Hope that helps,
> Austin
>
> [1]:
> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/operators/asyncio/
>
> On Fri, Feb 10, 2023 at 3:22 PM Leon Xu <l...@attentivemobile.com> wrote:
>
>> I wonder if windows would be the solution when it comes to the DataStream API.
>>
>> On Fri, Feb 10, 2023 at 12:07 PM Leon Xu <l...@attentivemobile.com> wrote:
>>
>>> Hi Flink Users,
>>>
>>> We want to use Flink to run a decoration pipeline, where we call an
>>> external service to fetch data and use it to enrich each event in the
>>> Flink pipeline.
>>>
>>> Since an external service call is involved, we want to batch the calls
>>> to reduce the load on the external service (that is, group multiple
>>> Flink events and make a single external service call for them).
>>>
>>> It looks like mini-batch might be something we can leverage to achieve
>>> that, but that feature seems to exist only in the Table API. We are using
>>> the DataStream API and are wondering if there is any solution or
>>> workaround for this.
>>>
>>>
>>> Thanks
>>> Leon
>>>
>>
