Yes. You have explained my requirements exactly as they are. My operator will talk to multiple databases and a couple of web services to enrich incoming input streams. I cannot think of a way to use the async IO operator. So I thought maybe convert these 7-10 calls into async calls and chain the Futures together. I believe I have to block once in the end of the KeyedBroadcastProcessFunction but if there's a way to avoid that also while also ensuring ordered processing of events, then do let me know.
On Fri, Apr 29, 2022 at 7:35 AM Guowei Ma <guowei....@gmail.com> wrote: > Hi Vishal > > I want to understand your needs first. Your requirements are: After a > stateful operator receives a notification, it needs to traverse all the > data stored in the operator state, communicate with an external system > during the traversal process (maybe similar to join?). In order to improve > the efficiency of this behavior, you want to take an asynchronous > approach. That is, if you modify the state of different keys, do not block > each other due to external communication. > If I understand correctly, according to the existing function of > KeyedBroadcastProcessFunction, it is really impossible. > As for whether there are other solutions, it may depend on specific > scenarios, such as what kind of external system. So could you describe in > detail what scenario has this requirement, and what are the external > systems it depends on? > > Best, > Guowei > > > On Fri, Apr 29, 2022 at 12:42 AM Vishal Surana <vis...@moengage.com> > wrote: > >> Hello, >> My application has a stateful operator which leverages RocksDB to store a >> large amount of state. It, along with other operators receive configuration >> as a broadcast stream (KeyedBroadcastProcessFunction). The operator depends >> upon another input stream that triggers some communication with external >> services whose results are then combined to yield the state that gets >> stored in RocksDB. >> >> In order to make the application more efficient, I am going to switch to >> asynchronous IO but as the result is ultimately going to be a (Scala) >> Future, I will have to block once to get the result. I was hoping to >> leverage the Async IO operator but that apparently doesn't support RocksDB >> based state storage. Am I correct in saying >> that KeyedBroadcastProcessFunction is the only option I have? If so, then I >> want to understand how registering a future's callbacks (via onComplete) >> works with a synchronous operator such as KeyedBroadcastProcessFunction. >> Will the thread executing the function simply relinquish control to some >> other subtask while the results of the external services are being awaited? >> Will the callback eventually be triggered automatically or will I have to >> explicitly block on the result future like so: Await.result(f, timeout). >> >> -- >> Regards, >> Vishal >> > -- Regards, Vishal