Hi Zhan,

I really appreciate your help; I will try that next.

So on the local machine no Hadoop/Spark installation is needed, only a copy of /etc/hadoop/conf? And does any information about the local machine itself (for example its IP or hostname) need to be set in those conf files? [A short sketch of a client-side check appears at the end of this thread.]
Moreover, do you have any experience submitting a Hadoop/Spark job from a Java program deployed on the gateway node, rather than via the hadoop/spark command line? [A sketch of one way to do this also follows at the end of this thread.]

Thank you very much~
Best Regards,
Zhiliang

On Wednesday, September 23, 2015 11:30 AM, Zhan Zhang <zzh...@hortonworks.com> wrote:

Hi Zhiliang,

I cannot find a specific doc. But as far as I remember, you can log in to one of your cluster machines, find the Hadoop configuration location, for example /etc/hadoop/conf, and copy that directory to your local machine. Typically it contains hdfs-site.xml, yarn-site.xml, etc. In Spark, the former is used to access HDFS, and the latter is used to launch applications on top of YARN. Then in spark-env.sh, add:

    export HADOOP_CONF_DIR=/etc/hadoop/conf

Thanks.
Zhan Zhang

On Sep 22, 2015, at 8:14 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:

Hi Zhan,

Yes, I get it now. I have never deployed the Hadoop configuration locally, and I cannot find a specific doc for it; could you point me to one?

Thank you,
Zhiliang

On Wednesday, September 23, 2015 11:08 AM, Zhan Zhang <zzh...@hortonworks.com> wrote:

There is no difference between running the client inside or outside of the cluster (assuming there is no firewall or network connectivity issue), as long as you have the Hadoop configuration locally. Here is the doc for running on YARN:

http://spark.apache.org/docs/latest/running-on-yarn.html

Thanks.
Zhan Zhang

On Sep 22, 2015, at 7:49 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:

Hi Zhan,

Thanks very much for your comment. I also expected it would be similar to Hadoop job submission, but I was not sure whether the same holds for Spark. Have you ever tried that with Spark? Could you give me the deployment doc for a Hadoop/Spark gateway? This is the first time I have done this, and I cannot find a specific doc for it.

Best Regards,
Zhiliang

On Wednesday, September 23, 2015 10:20 AM, Zhan Zhang <zzh...@hortonworks.com> wrote:

It should be similar to other Hadoop jobs. You need the Hadoop configuration on your client machine, and you point HADOOP_CONF_DIR in Spark at that configuration.

Thanks,
Zhan Zhang

On Sep 22, 2015, at 6:37 PM, Zhiliang Zhu <zchl.j...@yahoo.com.INVALID> wrote:

Dear Experts,

Our Spark jobs run on the cluster via YARN. A job can be submitted from a machine inside the cluster; however, I would like to submit jobs from another machine that does not belong to the cluster. I know that for Hadoop this can be done from a machine with a Hadoop gateway installed, which is used to connect to the cluster. What about Spark -- is it the same as for Hadoop? And where is the instruction doc for installing such a gateway?

Thank you very much~~
Zhiliang
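----

On the open question of submitting from a Java program instead of the spark-submit script: one possible approach is the SparkLauncher API that ships with Spark 1.4+. Below is a minimal, untested sketch; the Spark home, jar path, main class, and memory setting are placeholders, and it assumes the spark-launcher jar is on the program's classpath:

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.spark.launcher.SparkLauncher;

    public class GatewaySubmit {
        public static void main(String[] args) throws Exception {
            // Point the forked spark-submit process at the copied cluster
            // config, so it can find the YARN ResourceManager and HDFS.
            Map<String, String> env = new HashMap<>();
            env.put("HADOOP_CONF_DIR", "/etc/hadoop/conf");

            Process spark = new SparkLauncher(env)
                .setSparkHome("/opt/spark")               // hypothetical Spark install on the gateway
                .setAppResource("/path/to/your-app.jar")  // hypothetical application jar
                .setMainClass("com.example.YourSparkApp") // hypothetical main class
                .setMaster("yarn-cluster")                // driver runs inside the cluster
                .setConf("spark.executor.memory", "2g")
                .launch();                                // forks a spark-submit child process

            // In real use you would also drain spark.getInputStream() and
            // spark.getErrorStream() so the child process cannot block.
            int exitCode = spark.waitFor();
            System.out.println("spark-submit exited with code " + exitCode);
        }
    }

launch() simply forks a spark-submit process, so the same HADOOP_CONF_DIR mechanics Zhan describes above apply unchanged.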
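On the other open question (whether the local machine's IP or hostname must be set in the copied conf files): those files describe the cluster's endpoints (NameNode, ResourceManager), so nothing about the client machine itself normally needs to go into them. A minimal sketch of a client-side sanity check, assuming the hadoop-common and hadoop-hdfs client jars are on the classpath and the configuration was copied to /etc/hadoop/conf:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsConfCheck {
        public static void main(String[] args) throws Exception {
            // Load the cluster's settings from the copied conf directory;
            // the files only describe the cluster, not this machine.
            Configuration conf = new Configuration();
            conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));

            // FileSystem.get() resolves fs.defaultFS from the conf,
            // i.e. the cluster's NameNode address.
            FileSystem fs = FileSystem.get(conf);
            for (FileStatus status : fs.listStatus(new Path("/"))) {
                System.out.println(status.getPath());
            }
        }
    }

If this lists the cluster's HDFS root, the gateway machine's configuration is wired up correctly and spark-submit (or SparkLauncher) should be able to reach YARN the same way.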