Re: How executor Understand which RDDs needed to be persist from the submitted Task

2020-01-09 Thread Jack Kolokasis
Thanks for your help!

Iacovos

On 1/9/20 5:49 PM, Wenchen Fan wrote:
> You can take a look at ShuffleMapTask.runTask. It's not just a function.
>
> On Thu, Jan 9, 2020 at 11:25 PM Jack Kolokasis wrote:
>> Thanks for the help. I read that the driver only sends a function

Re: How executor Understand which RDDs needed to be persist from the submitted Task

2020-01-09 Thread Wenchen Fan
You can take a look at ShuffleMapTask.runTask. It's not just a function.

On Thu, Jan 9, 2020 at 11:25 PM Jack Kolokasis wrote:
> Thanks for the help. I read that the driver only sends a function (task) to
> executors, and the executors apply this function to their local RDD
> partitions.
>
> Iacovos
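
To illustrate the point that a task is "not just a function", here is a minimal toy sketch in plain Python (not Spark code). It assumes the task payload carries a description of the RDD lineage plus a partition index, so the executor can read each RDD's fields, including its storage level, after deserializing. The names `OPS`, `make_task_binary`, and `run_task` are hypothetical and exist only for this illustration; real Spark serializes the RDD object itself, closures included.

```python
import json

# Hypothetical named operations standing in for the closures a real RDD carries.
OPS = {"double": lambda x: x * 2, "inc": lambda x: x + 1}


def make_task_binary(lineage, partition_index):
    """Driver side: bundle the lineage description and the partition index."""
    payload = {"lineage": lineage, "partition": partition_index}
    return json.dumps(payload).encode("utf-8")


def run_task(task_binary, source_partitions):
    """Executor side: deserialize the payload and evaluate one partition.

    Returns the computed data plus the storage level seen on each RDD node,
    to show that this metadata is visible to the executor.
    """
    payload = json.loads(task_binary.decode("utf-8"))
    data = list(source_partitions[payload["partition"]])
    seen_levels = []
    for node in payload["lineage"]:
        data = [OPS[node["op"]](x) for x in data]
        seen_levels.append((node["rdd_id"], node["storage_level"]))
    return data, seen_levels


lineage = [
    {"rdd_id": 1, "op": "double", "storage_level": "NONE"},
    {"rdd_id": 2, "op": "inc", "storage_level": "MEMORY_ONLY"},
]
result, levels = run_task(make_task_binary(lineage, 1), [[1, 2], [3, 4]])
print(result, levels)
```

In this toy model the executor never sees a call to persist(); it only sees the resulting value stored on each node of the lineage.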

Re: How executor Understand which RDDs needed to be persist from the submitted Task

2020-01-09 Thread Jack Kolokasis
Thanks for the help. I read that the driver only sends a function (task) to executors, and the executors apply this function to their local RDD partitions.

Iacovos

On 1/9/20 5:03 PM, Wenchen Fan wrote:
> RDD has a flag `storageLevel` which will be set by calling persist(). RDD will be serialized and

Re: How executor Understand which RDDs needed to be persist from the submitted Task

2020-01-09 Thread Wenchen Fan
RDD has a flag `storageLevel`, which is set by calling persist(). The RDD is serialized and sent to executors for running tasks, so executors just look at RDD.storageLevel and store the output in their block manager when needed.

On Thu, Jan 9, 2020 at 5:53 PM Jack Kolokasis wrote:
> Hello all,
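
The mechanism described in this reply can be sketched with a small toy model in plain Python. This is not Spark's implementation: `ToyRDD`, `ToyExecutor`, and `serialize_task` are hypothetical names, and JSON stands in for Spark's real serializer. The flag check loosely mirrors what `RDD.iterator` does in Spark: if the storage level is not NONE, go through the block manager, otherwise just compute.

```python
import json

NONE = "NONE"
MEMORY_ONLY = "MEMORY_ONLY"


class ToyRDD:
    """Toy stand-in for an RDD: an id, its partitions, and a storage-level flag."""

    def __init__(self, rdd_id, partitions, storage_level=NONE):
        self.rdd_id = rdd_id
        self.partitions = partitions
        self.storage_level = storage_level

    def persist(self, level=MEMORY_ONLY):
        # persist() computes nothing; it only sets the flag on the driver side.
        self.storage_level = level
        return self


def serialize_task(rdd, split):
    """Driver side: the flag travels with the serialized RDD."""
    return json.dumps({
        "rdd_id": rdd.rdd_id,
        "partitions": rdd.partitions,
        "storage_level": rdd.storage_level,
        "split": split,
    })


class ToyExecutor:
    def __init__(self):
        self.block_manager = {}  # (rdd_id, split) -> cached partition data
        self.computations = 0    # how many times a partition was computed

    def _compute(self, rdd, split):
        self.computations += 1
        return [x * 2 for x in rdd.partitions[split]]

    def run_task(self, task_bytes):
        msg = json.loads(task_bytes)
        rdd = ToyRDD(msg["rdd_id"], msg["partitions"], msg["storage_level"])
        split = msg["split"]
        if rdd.storage_level == NONE:
            return self._compute(rdd, split)  # nothing is stored
        block_id = (rdd.rdd_id, split)
        if block_id not in self.block_manager:
            self.block_manager[block_id] = self._compute(rdd, split)
        return self.block_manager[block_id]


executor = ToyExecutor()
task = serialize_task(ToyRDD(7, [[1, 2], [3]]).persist(), 0)
executor.run_task(task)
executor.run_task(task)  # second run is served from the toy block manager
print(executor.computations)
```

The point of the sketch is that the executor never looks for a persist() call in the task; it reads the flag that persist() set on the RDD before it was serialized.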

How executor Understand which RDDs needed to be persist from the submitted Task

2020-01-09 Thread Jack Kolokasis
Hello all,

I want to find out when a task that is sent by the driver to an executor contains a call to persist(). I am trying to read the submitted function that the driver sends to the executor, but I could not find any call to the persist() method. Do you know how the executor understands which RDDs need to