Re: Python sdk performance

2019-06-10 Thread Maximilian Michels
Hi Mingliang, you can increase the parallelism of the Python SDK Harness via the pipeline option --experiments worker_threads= Note that the workers are Python threads, which are constrained by the Global Interpreter Lock. We currently do not use real processes, e.g. via multiprocessing. There is
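To illustrate the point about threads and the Global Interpreter Lock, here is a minimal standalone sketch. It is not Beam code and does not touch the SDK Harness; `burn_cpu` and `run_with` are hypothetical names made up for this example. It runs the same pure-Python CPU-bound loop on a thread pool and on a process pool so the two wall-clock times can be compared. On a multi-core machine with a standard CPython build, the thread pool would be expected to stay near single-core throughput, but the exact numbers depend on the machine.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def burn_cpu(n):
    """Pure-Python CPU-bound loop; it holds the GIL while it runs."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_with(executor_cls, workers, n):
    """Run `workers` copies of burn_cpu on the given executor class.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(burn_cpu, [n] * workers))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # Same workload, two executors: threads share one interpreter lock,
    # processes each get their own interpreter.
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        _, seconds = run_with(cls, workers=4, n=1_000_000)
        print(f"{cls.__name__}: {seconds:.2f}s")
```

Both executors return the same results; only the elapsed time differs. That is why adding more worker threads alone may not raise CPU usage for CPU-bound Python code.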

Python sdk performance

2019-06-08 Thread 青雉(祁明良)
Hi all, I’m currently tuning the performance of the Python SDK with the Flink runner. I found that multithreading in the Python SDK worker limits CPU usage to about one core at most. To my understanding, all the task slots on one TaskManager share one SDK process, which means the low CPU usage of the Python SDK