Hello,

When we call cache() or persist(MEMORY_ONLY), how does the request flow to
the nodes?
I am assuming the following happens:

1. The driver knows which nodes hold the partitions of the given
RDD (where is this information stored?)
2. It sends a cache request to the executor on each of those nodes
3. Each executor stores its partitions in memory
4. Therefore, each node can have partitions of different RDDs in its cache.
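
To make the assumption concrete, here is a toy sketch in plain Python of the flow in steps 1 to 4 as I picture it. It is only a model of my assumption, not Spark's actual implementation: the Driver and Executor classes, their methods, and the node and RDD names are all hypothetical and do not correspond to any Spark API.

```python
from collections import defaultdict


class Executor:
    """Hypothetical stand-in for the executor running on one node."""

    def __init__(self, name):
        self.name = name
        self.memory = {}  # (rdd_id, partition_index) -> cached records

    def cache_partition(self, rdd_id, index, records):
        # Step 3: the executor keeps the partition in memory.
        self.memory[(rdd_id, index)] = list(records)


class Driver:
    """Hypothetical driver that tracks where each partition lives."""

    def __init__(self):
        # rdd_id -> {partition_index: executor holding that partition}
        self.locations = defaultdict(dict)

    def register(self, rdd_id, index, executor):
        # Step 1: the driver records which executor holds which partition.
        self.locations[rdd_id][index] = executor

    def cache(self, rdd_id, partitions):
        # Step 2: send a cache request to the executor holding each partition.
        for index, executor in self.locations[rdd_id].items():
            executor.cache_partition(rdd_id, index, partitions[index])


driver = Driver()
node_a, node_b = Executor("node-a"), Executor("node-b")

# Two small made-up RDDs with two partitions each.
rdd1 = {0: [1, 2], 1: [3, 4]}
rdd2 = {0: ["x"], 1: ["y"]}

driver.register("rdd1", 0, node_a)
driver.register("rdd1", 1, node_b)
driver.register("rdd2", 0, node_b)
driver.register("rdd2", 1, node_a)

driver.cache("rdd1", rdd1)
driver.cache("rdd2", rdd2)

# Step 4: each node ends up holding partitions of different RDDs.
cached_on_a = sorted(node_a.memory)
cached_on_b = sorted(node_b.memory)
print(cached_on_a)
print(cached_on_b)
```

In this model the driver keeps the partition-to-node mapping itself, which is exactly the part I am unsure about in step 1.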

Can someone please tell me if I am correct?

Thanks and Regards,
Vishnu Viswanath,
