Hi,

In June 2017, Google introduced server-based shuffle for Dataflow
pipelines, which can yield up to a 5x performance improvement. However, at
the time of the announcement, this feature was only available for the Cloud
Dataflow SDK for Java version 1. What is the status for the Dataflow SDK for
Python? Is it already supported? If not, is there a plan to add it soon?


https://cloud.google.com/blog/big-data/2017/06/introducing-cloud-dataflow-shuffle-for-up-to-5x-performance-improvement-in-data-analytic-pipelines

Thanks!
