The driver needs a consistent connection to the master in standalone mode,
because a whole bunch of client work happens on the driver. Calls like
parallelize send data from the driver to the cluster, and collect sends data
from the cluster back to the driver.
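
Since traffic flows in both directions, it can help to verify reachability
each way before debugging Spark itself. Below is a minimal sketch using only
the Python standard library. The helper name can_connect is made up for
illustration. It only tests whether a TCP connection can be opened; it knows
nothing about Spark. 7077 is the master port mentioned in this thread. The
driver port is random by default, so checking it assumes you have pinned it
(for example with the spark.driver.port property).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it checks plain TCP reachability only.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run it from the driver machine against the master, e.g.
can_connect("<master-host>", 7077), and from a worker machine against the
driver host and the driver port you configured. If the second check fails,
the cluster cannot connect back to the driver.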

If you are looking to avoid that connection, you can look into the embedded
driver model on YARN, where the driver also runs inside the cluster, so
reliability and connectivity are a given.
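
As a rough sketch of what launching with the driver inside the cluster could
look like, the snippet below only assembles a spark-submit command line; it
does not run anything. The function name, jar path, and class name are
placeholders made up for illustration. The flags shown (--master yarn with
--deploy-mode cluster) are the form used by recent Spark releases; older
releases spelled this as --master yarn-cluster, so check the documentation
for your version.

```python
import shlex


def cluster_mode_command(app_jar, main_class, app_args=()):
    """Build a spark-submit argument list that asks YARN to host the driver.

    Hypothetical helper: it only constructs the command, it does not execute it.
    """
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",  # driver runs inside the cluster
        "--class", main_class,
        app_jar,
        *app_args,
    ]


# Placeholder jar and class names; substitute your own application.
cmd = cluster_mode_command("my-app.jar", "com.example.MyApp", ["input.txt"])
print(" ".join(shlex.quote(part) for part in cmd))
```

With the driver placed in the cluster this way, the submitting machine does
not have to stay reachable from the workers for the life of the job.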
-- 
Regards,
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi

On Fri, Sep 12, 2014 at 6:46 PM, Jim Carroll <jimfcarr...@gmail.com>
wrote:

> Hi Akhil,
> Thanks! I guess in short that means the master (or slaves?) connect back to
> the driver. This seems like a really odd way to work given the driver needs
> to already connect to the master on port 7077. I would have thought that if
> the driver could initiate a connection to the master, that would be all
> that's required.
> Can you describe what it is about the architecture that requires the master
> to connect back to the driver even when the driver initiates a connection to
> the master? Just curious.
> Thanks anyway.
> Jim
