Excerpts from Prem Sure's message of 2018-07-04 19:39:29 +0530:

> Hoping the points below help clear some of this up.
> Executors don't have a way to share data among themselves, except for
> sharing accumulators via the driver's support.
> Based on data locality (or the remote nature of the data), tasks/stages
> are defined to perform the work, which may result in a shuffle.
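The accumulator pattern mentioned above can be sketched in plain Python (no Spark required): each task keeps a local copy of the accumulator and only sends its update back to the driver, which owns the merged value. The function and variable names here are illustrative, not Spark API.

```python
# Sketch of the accumulator pattern: executors update a LOCAL copy,
# and only the driver merges the per-task updates.

def run_task(partition):
    """Simulate one task: process a partition, return (results, acc_update)."""
    local_acc = 0
    results = []
    for record in partition:
        if record < 0:
            local_acc += 1          # executor updates its local copy only
        results.append(abs(record))
    return results, local_acc       # the update travels back to the driver

partitions = [[1, -2, 3], [-4, -5], [6]]
driver_acc = 0                      # the driver owns the merged value
all_results = []
for part in partitions:             # in Spark these would run on executors
    res, update = run_task(part)
    all_results.extend(res)
    driver_acc += update            # merging happens on the driver

print(driver_acc)                   # 3 negative records seen in total
```

This is also why Spark accumulator values are only reliably readable on the driver: tasks never see each other's updates.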
If I understand correctly:

* Only shuffle data goes through the driver
* The receivers' data stays node-local until a shuffle occurs

Is that right?

> On Wed, Jul 4, 2018 at 1:56 PM, thomas lavocat <
> thomas.lavo...@univ-grenoble-alpes.fr> wrote:
>
> > Hello,
> >
> > I have a question about Spark dataflow. If I understand correctly, all
> > received data is sent from the executor to the driver of the application
> > prior to task creation.
> >
> > Then the task embedding the data transits from the driver to the executor
> > in order to be processed.
> >
> > As executors cannot exchange data themselves, during a shuffle the data
> > also transits through the driver.
> >
> > Is that correct?
> >
> > Thomas
> >
> > ---------------------------------------------------------------------
> > To unsubscribe e-mail: user-unsubscr...@spark.apache.org
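For what it's worth, shuffle data in Spark does not pass through the driver: reduce-side tasks fetch shuffle blocks directly from the executors that wrote them, and the driver only schedules tasks and collects metadata (and accumulator updates). The executor-to-executor exchange can be sketched in plain Python, using a `reduceByKey`-style sum; the function names and `NUM_REDUCERS` are illustrative, not Spark API.

```python
# Minimal sketch of a shuffle: map tasks bucket records by hash(key),
# reduce tasks fetch their bucket from every map output and merge.
# In Spark the fetch step is executor-to-executor (block manager to
# block manager); the driver is not on the data path.

NUM_REDUCERS = 2

def map_side(partition):
    """Each map task buckets its records by target reducer (shuffle write)."""
    buckets = {r: [] for r in range(NUM_REDUCERS)}
    for key, value in partition:
        buckets[hash(key) % NUM_REDUCERS].append((key, value))
    return buckets

map_partitions = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
shuffle_blocks = [map_side(p) for p in map_partitions]

def reduce_side(reducer_id):
    """Each reduce task fetches its bucket from EVERY map output and merges."""
    merged = {}
    for blocks in shuffle_blocks:           # executor-to-executor fetch
        for key, value in blocks[reducer_id]:
            merged[key] = merged.get(key, 0) + value
    return merged

combined = {}
for r in range(NUM_REDUCERS):               # reduce tasks run in parallel
    combined.update(reduce_side(r))
print(sorted(combined.items()))             # [('a', 4), ('b', 2), ('c', 4)]
```

So receiver/source data stays on the node that holds it until a wide transformation (e.g. `reduceByKey`, `join`) forces this kind of repartitioning by key.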