Question on RDD caching

2016-02-04 Thread Vishnu Viswanath
Hello, when we call cache() or persist(MEMORY_ONLY), how does the request flow to the nodes? I am assuming this will happen: 1. The driver knows which nodes hold the partitions for the given RDD (where is this info stored?) 2. It sends a cache request to the node's executor 3. The executor will

Re: Question in rdd caching in memory using persist

2016-01-07 Thread Prem Sure
Are you running standalone - local mode or cluster mode? Executor and driver existence differs based on the setup type. A snapshot of your environment UI would help us tell. On Thu, Jan 7, 2016 at 11:51 AM, wrote: > Hi, > After I called rdd.persist(MEMORY_ONLY_SER), I

RE: Question in rdd caching in memory using persist

2016-01-07 Thread seemanto.barua
I have a standalone cluster. Spark version is 1.3.1. From: Prem Sure [mailto:premsure...@gmail.com] Sent: Thursday, January 07, 2016 12:32 PM To: Barua, Seemanto (US) Cc: spark users <u