
Re: [Spark Streaming MEMORY_ONLY] Understanding Dataflow

2018-07-05 Thread Thomas Lavocat

Excerpts from Prem Sure's message of 2018-07-04 19:39:29 +0530:
> Hoping below would help in clearing some..
> executors dont have control to share the data among themselves except
> sharing accumulators via driver's support.
> Its all based on the data locality or remote nature, tasks/stages are

Re: [Spark Streaming MEMORY_ONLY] Understanding Dataflow

2018-07-04 Thread Prem Sure

Hoping the below helps clear some things up. Executors don't have control to share data among themselves, except for sharing accumulators via the driver's support. It's all based on data locality or remote nature; tasks/stages are defined to perform the work, which may result in a shuffle.
On Wed, Jul 4, 2018 at
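The point about accumulators can be illustrated with a toy model. This is a minimal sketch in plain Python, not Spark code: the `Driver`, `Executor`, and `run_job` names are hypothetical and exist only for illustration, and the round-robin assignment is just a stand-in for locality-aware scheduling. It shows executors that never talk to each other, with each task's partial update merged on the driver.

```python
# Toy model only: plain Python, no Spark API involved.
# Driver, Executor and run_job are hypothetical names for illustration.
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Driver:
    """Holds the merged accumulator value."""
    total: int = 0

    def merge(self, partial: int) -> None:
        # The driver is the only place where partial updates are combined.
        self.total += partial


@dataclass
class Executor:
    """Processes its own partitions and keeps only a local partial sum."""
    name: str
    local: int = 0

    def run_task(self, partition: list[int]) -> int:
        # A task sees only its own partition; it holds no reference
        # to any other executor.
        update = sum(partition)
        self.local += update
        return update  # reported back to the driver with the task result


def run_job(partitions: list[list[int]], n_executors: int = 2):
    driver = Driver()
    executors = [Executor(f"exec-{i}") for i in range(n_executors)]
    for i, part in enumerate(partitions):
        # Round-robin assignment, standing in for real task scheduling.
        ex = executors[i % n_executors]
        driver.merge(ex.run_task(part))
    return driver, executors


driver, executors = run_job([[1, 2], [3, 4], [5]])
print(driver.total, [e.local for e in executors])
```

Each executor ends up knowing only its own partial sum, while the full total exists only on the driver, which is the sense in which sharing happens "via the driver's support".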

[Spark Streaming MEMORY_ONLY] Understanding Dataflow

2018-07-04 Thread Thomas Lavocat

Hello, I have a question on Spark Dataflow. If I understand correctly, all received data is sent from the executor to the driver of the application prior to task creation. Then the task embedding the data transits from the driver to the executor in order to be processed. As executor cannot