Hi Mingliang,
You can increase the parallelism of the Python SDK Harness via the pipeline
option
--experiments worker_threads=<N>
Note that the workers are Python threads, which are subject to the Global
Interpreter Lock (GIL), so CPU-bound Python code does not run in parallel
across cores. We currently do not use real processes, e.g. via
multiprocessing.
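To illustrate the GIL point, here is a minimal standalone sketch. It is not Beam code and does not reflect how the SDK harness is implemented; the helper names cpu_bound and run_with are made up for this example. It runs the same CPU-bound tasks once on a thread pool and once on a process pool and prints the wall-clock time of each.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_bound(n):
    """Hypothetical workload: a pure-Python loop that holds the GIL while it runs."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_with(executor_cls, tasks, workers):
    """Run cpu_bound over all tasks on the given executor type.

    Results are returned in the same order as the inputs.
    """
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(cpu_bound, tasks))


if __name__ == "__main__":
    # Four identical CPU-bound tasks, four workers in each pool.
    tasks = [2_000_000] * 4
    for executor_cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        start = time.perf_counter()
        run_with(executor_cls, tasks, workers=4)
        elapsed = time.perf_counter() - start
        print(f"{executor_cls.__name__}: {elapsed:.2f}s")
```

On a multi-core machine with a standard CPython build, the thread pool usually takes about as long as running the tasks one after another, while the process pool can spread them over several cores. Exact timings depend on the machine, so none are given here.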
There is
Hi all,
I’m currently tuning the performance of the Python SDK with the Flink runner. I
found that multithreading in the Python SDK worker limits CPU usage to about
one core at most. To my understanding, all the task slots on one TaskManager
share one SDK process, which means the low CPU usage of the Python SDK