Question about running spark on yarn

2014-04-22 Thread Gordon Wang
On this page: http://spark.apache.org/docs/0.9.0/running-on-yarn.html

We have to use the Spark assembly to submit Spark apps to a YARN cluster.
I checked the Spark assembly jar, and it contains some YARN classes that
are added at compile time. Those YARN classes are not what I want.

My question is: is it possible to use other jars to submit a Spark app to
a YARN cluster?
I do not want to use the assembly jar because its YARN classes may shadow
the YARN classes on HADOOP_CLASSPATH. And if the YARN cluster is upgraded,
Spark has to be recompiled against the new version of YARN even when the
YARN APIs are unchanged.
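For reference, one way to verify which YARN classes end up in the assembly
is to list the jar contents (the jar path below is an assumption from a
default build; adjust it for yours):

    jar tf assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-hadoop2.2.0.jar \
        | grep 'org/apache/hadoop/yarn'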

Any help is appreciated! Thanks.

-- 
Regards
Gordon Wang


Re: Question about running spark on yarn

2014-04-22 Thread Sandy Ryza
Hi Gordon,

We recently handled this in SPARK-1064.  As of 1.0.0, you'll be able to
pass -Phadoop-provided to Maven and avoid including Hadoop and its
dependencies in the assembly jar.
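For example, a build command along these lines should work (the -Pyarn
profile and the Hadoop version here are illustrative; adjust for your
cluster):

    mvn -Pyarn -Phadoop-provided -Dhadoop.version=2.2.0 -DskipTests clean package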

-Sandy


On Tue, Apr 22, 2014 at 2:43 AM, Gordon Wang gw...@gopivotal.com wrote:

 On this page: http://spark.apache.org/docs/0.9.0/running-on-yarn.html

 We have to use the Spark assembly to submit Spark apps to a YARN cluster.
 I checked the Spark assembly jar, and it contains some YARN classes that
 are added at compile time. Those YARN classes are not what I want.

 My question is: is it possible to use other jars to submit a Spark app to
 a YARN cluster?
 I do not want to use the assembly jar because its YARN classes may shadow
 the YARN classes on HADOOP_CLASSPATH. And if the YARN cluster is upgraded,
 Spark has to be recompiled against the new version of YARN even when the
 YARN APIs are unchanged.

 Any help is appreciated! Thanks.

 --
 Regards
 Gordon Wang



Re: Question about running spark on yarn

2014-04-22 Thread Gordon Wang
Hi Sandy,

Thanks for your reply !

Does this work for sbt?

I checked the commit; it looks like only the Maven build has this option.
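For a plain sbt application build, the closest analogue I know of is a
provided-scope dependency (an illustrative build.sbt fragment; the artifact
name and version are assumptions, and Spark's own sbt build did not expose
an equivalent option):

    // keep Hadoop off the application assembly; the cluster supplies it at runtime
    libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.2.0" % "provided"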



On Wed, Apr 23, 2014 at 12:38 AM, Sandy Ryza sandy.r...@cloudera.com wrote:

 Hi Gordon,

 We recently handled this in SPARK-1064.  As of 1.0.0, you'll be able to
 pass -Phadoop-provided to Maven and avoid including Hadoop and its
 dependencies in the assembly jar.

 -Sandy


 On Tue, Apr 22, 2014 at 2:43 AM, Gordon Wang gw...@gopivotal.com wrote:

 On this page: http://spark.apache.org/docs/0.9.0/running-on-yarn.html

 We have to use the Spark assembly to submit Spark apps to a YARN cluster.
 I checked the Spark assembly jar, and it contains some YARN classes that
 are added at compile time. Those YARN classes are not what I want.

 My question is: is it possible to use other jars to submit a Spark app to
 a YARN cluster?
 I do not want to use the assembly jar because its YARN classes may shadow
 the YARN classes on HADOOP_CLASSPATH. And if the YARN cluster is upgraded,
 Spark has to be recompiled against the new version of YARN even when the
 YARN APIs are unchanged.

 Any help is appreciated! Thanks.

 --
 Regards
 Gordon Wang





-- 
Regards
Gordon Wang


Re: Question about running spark on yarn

2014-04-22 Thread sandy . ryza
I currently don't have plans to work on that.

-Sandy

 On Apr 22, 2014, at 8:06 PM, Gordon Wang gw...@gopivotal.com wrote:
 
 Thanks, I see. Do you have plans to port this to sbt?
 
 
 On Wed, Apr 23, 2014 at 10:24 AM, Sandy Ryza sandy.r...@cloudera.com wrote:
 Right, it only works for Maven.
 
 
 On Tue, Apr 22, 2014 at 6:23 PM, Gordon Wang gw...@gopivotal.com wrote:
 Hi Sandy,
 
 Thanks for your reply !
 
 Does this work for sbt?

 I checked the commit; it looks like only the Maven build has this option.
 
 
 
 On Wed, Apr 23, 2014 at 12:38 AM, Sandy Ryza sandy.r...@cloudera.com wrote:
 Hi Gordon,
 
 We recently handled this in SPARK-1064.  As of 1.0.0, you'll be able to 
 pass -Phadoop-provided to Maven and avoid including Hadoop and its 
 dependencies in the assembly jar.
 
 -Sandy
 
 
 On Tue, Apr 22, 2014 at 2:43 AM, Gordon Wang gw...@gopivotal.com wrote:
 On this page: http://spark.apache.org/docs/0.9.0/running-on-yarn.html

 We have to use the Spark assembly to submit Spark apps to a YARN cluster.
 I checked the Spark assembly jar, and it contains some YARN classes that
 are added at compile time. Those YARN classes are not what I want.

 My question is: is it possible to use other jars to submit a Spark app to
 a YARN cluster?
 I do not want to use the assembly jar because its YARN classes may shadow
 the YARN classes on HADOOP_CLASSPATH. And if the YARN cluster is upgraded,
 Spark has to be recompiled against the new version of YARN even when the
 YARN APIs are unchanged.

 Any help is appreciated! Thanks.
 
 -- 
 Regards
 Gordon Wang
 
 
 
 -- 
 Regards
 Gordon Wang
 
 
 
 -- 
 Regards
 Gordon Wang