Hi Swapnil,
1. All task scheduling and retries happen from the driver, so you are right
that a lot of communication happens from the driver to the cluster. It all
depends on how you want to set up your Spark application: whether your
application has direct access to the Spark cluster or it's routed through a
Thanks..
On Aug 28, 2015 4:55 AM, Rishitesh Mishra rishi80.mis...@gmail.com
wrote:
Hi Swapnil,
1. All task scheduling and retries happen from the driver, so you are right
that a lot of communication happens from the driver to the cluster. It all
depends on how you want to go about your Spark
Hello
I am new to the Spark world and recently started exploring it in standalone mode.
It would be great if I could get clarification on the doubts below:
1. Driver locality - The documentation mentions that client
deploy-mode is not good if the machine running spark-submit is not co-located
with worker
Hi Swapnil,
Let me try to answer some of the questions. Answers inline. Hope it helps.
On Thursday, August 27, 2015, Swapnil Shinde swapnilushi...@gmail.com
wrote:
Hello
I am new to the Spark world and recently started exploring it in standalone
mode. It would be great if I could get clarification on
Thanks Rishitesh !!
1. I get that the driver doesn't need to be on the master, but there is a lot of
communication between the driver and the cluster. That's why a co-located
gateway was recommended. How much is the impact of the driver not being
co-located with the cluster?
4. How does an HDFS split get assigned to a worker node