In the end it was easy to connect to both hive and hdfs. I just had to copy
the hive-site.xml from the old spark version and that worked instantly
after unzipping.
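
For the record, it was literally just a copy (the install paths here are from
my own setup, so treat them as placeholders):

cp /opt/spark-2.3/conf/hive-site.xml /opt/spark-2.4/conf/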

Right now I am stuck on connecting to yarn. 
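
As far as I understand, the new spark only needs to find the cluster's yarn
config, so I expect the missing piece is something like this sketch (the
config path and the examples jar name are assumptions for my setup):

export HADOOP_CONF_DIR=/etc/hadoop/conf   # dir containing yarn-site.xml, core-site.xml
/opt/spark-2.4/bin/spark-submit --master yarn --deploy-mode client \
  --class org.apache.spark.examples.SparkPi \
  /opt/spark-2.4/examples/jars/spark-examples_2.11-2.4.0.jar 10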


On Mon, May 20, 2019 at 02:50:44PM -0400, Koert Kuipers wrote:
> we had very few issues with hdfs or hive, but then we use hive only for
> basic reading and writing of tables.
> 
> depending on your vendor you might have to add a few settings to your
> spark-defaults.conf. i remember on hdp you had to set the hdp.version somehow.
> we prefer to build spark with hadoop provided, and then add the hadoop
> classpath to the spark classpath. this works well on cdh, hdp, and also for
> cloud providers.
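> 
> on hdp it was roughly this in spark-defaults.conf, from memory (the version
> string is a placeholder and must match your hdp install exactly):
> 
> spark.driver.extraJavaOptions    -Dhdp.version=2.x.x.x-xxxx
> spark.yarn.am.extraJavaOptions   -Dhdp.version=2.x.x.x-xxxx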
> 
> for example this is a typical build with hive for cdh 5 (which is based on
> hadoop 2.6; change the hadoop version to match your vendor):
> dev/make-distribution.sh --name <yourname> --tgz -Phadoop-2.6 \
>   -Dhadoop.version=2.6.0 -Pyarn -Phadoop-provided -Phive
> add hadoop classpath to the spark classpath in spark-env.sh:
> export SPARK_DIST_CLASSPATH=$(hadoop classpath)
> 
> i think certain vendors support multiple "vendor supported" installs, so you
> could also look into that if you are not comfortable with running your own
> spark build.
> 
> On Mon, May 20, 2019 at 2:24 PM Nicolas Paris <nicolas.pa...@riseup.net> wrote:
> 
>     > correct. note that you only need to install spark on the node you launch it
>     > from. spark doesn't need to be installed on the cluster itself.
> 
>     That sounds reasonably doable for me. My guess is I will have some
>     trouble making that spark version work with both the hive & hdfs installed
>     on the cluster - or maybe it's actually plug-&-play, i don't know.
> 
>     thanks
> 
>     On Mon, May 20, 2019 at 02:16:43PM -0400, Koert Kuipers wrote:
>     > correct. note that you only need to install spark on the node you launch it
>     > from. spark doesn't need to be installed on the cluster itself.
>     >
>     > the shared components between spark jobs on yarn are only really the
>     > spark-shuffle-service in yarn and the spark-history-server. i have found
>     > compatibility for these to be good. it's best if these run the latest version.
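>     >
>     > for reference, the shuffle service is registered with the yarn nodemanagers
>     > in yarn-site.xml, roughly like this sketch (add spark_shuffle to whatever
>     > aux-services list is already there; see the spark running-on-yarn docs):
>     >
>     > <property>
>     >   <name>yarn.nodemanager.aux-services</name>
>     >   <value>mapreduce_shuffle,spark_shuffle</value>
>     > </property>
>     > <property>
>     >   <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
>     >   <value>org.apache.spark.network.yarn.YarnShuffleService</value>
>     > </property>
>     >
>     > jobs then opt in with spark.shuffle.service.enabled=true.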
>     >
>     > On Mon, May 20, 2019 at 2:02 PM Nicolas Paris <nicolas.pa...@riseup.net> wrote:
>     >
>     >     > you will need the spark version you intend to launch with on the
>     >     > machine you launch from and point to the correct spark-submit
>     >
>     >     does this mean to install a second spark version (2.4) on the cluster?
>     >
>     >     thanks
>     >
>     >     On Mon, May 20, 2019 at 01:58:11PM -0400, Koert Kuipers wrote:
>     >     > yarn can happily run multiple spark versions side-by-side
>     >     > you will need the spark version you intend to launch with on the
>     >     > machine you launch from and point to the correct spark-submit
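>     >     >
>     >     > for example, assuming the 2.4 build was unpacked under /opt/spark-2.4
>     >     > (the path, class and jar names are just placeholders):
>     >     >
>     >     > /opt/spark-2.4/bin/spark-submit --master yarn --deploy-mode cluster \
>     >     >   --class com.example.MyApp my-app-assembly.jar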
>     >     >
>     >     > On Mon, May 20, 2019 at 1:50 PM Nicolas Paris <nicolas.pa...@riseup.net> wrote:
>     >     >
>     >     >     Hi
>     >     >
>     >     >     I am wondering whether it's feasible to:
>     >     >     - build a spark application (with sbt/maven) based on spark2.4
>     >     >     - deploy that jar on yarn on a spark2.3 based installation
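>     >     >
>     >     >     to be concrete, the build side would be roughly this line in
>     >     >     build.sbt (a sketch; the 2.4.x version number is just an example):
>     >     >
>     >     >     // spark marked "provided": the application jar must not bundle spark
>     >     >     libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.0" % "provided"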
>     >     >
>     >     >     thanks in advance,
>     >     >
>     >     >
>     >     >     --
>     >     >     nicolas
>     >     >
>     >
>     >     --
>     >     nicolas
>     >
> 
>     --
>     nicolas
> 
> 

-- 
nicolas
