On 24 Oct 2016, at 19:34, Masood Krohy <masood.kr...@intact.net> wrote:

Hi everyone,

Is there a way to set the IP address/hostname that the Spark Driver will run on 
when launching a program through spark-submit in yarn-cluster mode (PySpark 
1.6.0)?

I do not see an option for this. If not, is there a way to get this IP address 
after the Spark app has started running (through an API call at the beginning 
of the program, to be used in the rest of the program)? spark-submit prints 
“ApplicationMaster host: 10.0.0.9” in the console, and that address changes on 
every run because of yarn-cluster mode; I am wondering if it can be accessed 
within the program. It does not seem to me that a YARN node label can be used 
to pin the Spark Driver/AM to a node while still allowing the Executors to run 
on all the nodes.



You can grab it off the YARN API itself; there's a REST view as well as a 
fussier RPC-level one. That is assuming you want the web view, which is what 
gets registered.
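
Something like the following would do it from inside the driver program itself 
(a rough sketch, untested: the ResourceManager address 
http://resourcemanager:8088 is a placeholder for your cluster, and it assumes 
Python 2 / urllib2, which matches the PySpark 1.6 era):

  # Ask the YARN ResourceManager REST API where the AM is running;
  # in yarn-cluster mode the driver runs in the same container.
  import json
  import urllib2

  from pyspark import SparkContext

  sc = SparkContext()
  app_id = sc.applicationId  # e.g. "application_1477330000000_0042"

  url = "http://resourcemanager:8088/ws/v1/cluster/apps/%s" % app_id
  app_info = json.load(urllib2.urlopen(url))["app"]

  # amHostHttpAddress is the "host:port" of the AM's web UI
  am_host = app_info["amHostHttpAddress"].split(":")[0]
  print("AM/driver is running on %s" % am_host)

Since the driver and AM are colocated in yarn-cluster mode, reading 
spark.driver.host off the live conf with sc.getConf().get("spark.driver.host") 
may be an even shorter route, though check that on your version.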

If you know the application ID, you can also construct a URL through the YARN 
proxy; any attempt to talk directly to the AM is going to get 302'd back there 
anyway, so that Kerberos credentials can be verified.
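
The proxy URL is just string assembly once you have the application ID (RM 
address again a placeholder):

  # Build the YARN proxy URL for a running app; direct requests to the
  # AM get 302-redirected through this proxy anyway, so it is the
  # stable address to hand out.
  def am_proxy_url(app_id, rm_web="http://resourcemanager:8088"):
      return "%s/proxy/%s/" % (rm_web, app_id)

  print(am_proxy_url("application_1477330000000_0042"))
  # http://resourcemanager:8088/proxy/application_1477330000000_0042/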
