Re: [Spark Streaming MEMORY_ONLY] Understanding Dataflow

2018-07-05 Thread Thomas Lavocat
Excerpts from Prem Sure's message of 2018-07-04 19:39:29 +0530:
> Hoping the below helps clear some things up.
> Executors don't have control to share data among themselves, except for
> sharing accumulators via the driver's support.
> It's all based on data locality or remote nature; tasks/stages are

Re: [Spark Streaming MEMORY_ONLY] Understanding Dataflow

2018-07-04 Thread Prem Sure
Hoping the below helps clear some things up. Executors don't have control to share data among themselves, except for sharing accumulators via the driver's support. It's all based on data locality or remote nature; tasks/stages are defined to perform, which may result in a shuffle. On Wed, Jul 4, 2018 at

[Spark Streaming MEMORY_ONLY] Understanding Dataflow

2018-07-04 Thread Thomas Lavocat
Hello, I have a question on Spark dataflow. If I understand correctly, all received data is sent from the executor to the driver of the application prior to task creation. Then the task embedding the data transits from the driver to the executor in order to be processed. As executor cannot