subject:"How to scale a streaming Flink pipeline without abusing parallelism for long computation tasks\?"

Re: How to scale a streaming Flink pipeline without abusing parallelism for long computation tasks?

2020-04-23 Thread Arvid Heise

Hi Elkhan, Theo's advice is spot-on, you should use asyncIO with AsyncFunction. AsyncIO is not performing any task asynchronously by itself though, so you should either use the async API of the library if existant or manage your own thread pool. The thread pool should be as large as your desired

Re: How to scale a streaming Flink pipeline without abusing parallelism for long computation tasks?

2020-04-16 Thread Theo Diefenthal

Hi, I think you could utilize AsyncIO in your case with just using a local thread pool [1]. Best regards Theo [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/asyncio.html Von: "Elkhan Dadashov" An: "user" Gesendet: Donnerstag, 16. April 2020

How to scale a streaming Flink pipeline without abusing parallelism for long computation tasks?

2020-04-16 Thread Elkhan Dadashov

Hi Flink users, I have a basic Flnk pipeline, doing flatmap. inside flatmap, I get the input, path it to the client library to compute some result. That library execution takes around 30 seconds to 2 minutes (depending on the input ) for producing the output from the given input ( it is