Re: Spark 3.0 with Hadoop 2.6 HDFS/Hive

Ashika Umanga Umagiliya Sun, 19 Jul 2020 22:32:01 -0700

Hello

"spark.yarn.populateHadoopClasspath" is used in YARN mode, correct?
However, our Spark cluster is a standalone cluster that does not use YARN.
We only connect to HDFS/Hive to access data. Computation is done on our
Spark cluster running on K8s (not YARN).
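
For reference, a minimal sketch of how the setting named in the quoted reply
would be passed on the command line. The `--conf` flag is the standard
`spark-submit` way to pass a configuration property; the helper name
`build_spark_submit_command` and the application path `my_app.py` are
hypothetical, and whether the property has any effect outside YARN mode is
exactly the open question above.

```python
import shlex


def build_spark_submit_command(app_path, conf):
    """Build a spark-submit argument list with one --conf flag per setting."""
    args = ["spark-submit"]
    # Sort keys so the generated command is deterministic.
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app_path)
    return args


# Hypothetical application path; the property name comes from the quoted reply.
command = build_spark_submit_command(
    "my_app.py",
    {"spark.yarn.populateHadoopClasspath": "false"},
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(command))
```

The sketch only assembles the command; it does not launch Spark or verify
connectivity to the Hadoop 2.6 cluster.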



On Mon, Jul 20, 2020 at 2:04 PM DB Tsai <dbt...@dbtsai.com> wrote:

> In Spark 3.0, if you use the `with-hadoop` Spark distribution that has
> embedded Hadoop 3.2, you can set
> `spark.yarn.populateHadoopClasspath=false` so that the cluster's Hadoop
> classpath is not populated. In this scenario, Spark will use the Hadoop
> 3.2 client to connect to Hadoop 2.6, which should work fine. In fact,
> we have had a production deployment running this way for a while.
>
> On Sun, Jul 19, 2020 at 8:10 PM Ashika Umanga <ashika.uma...@gmail.com>
> wrote:
> >
> > Greetings,
> >
> > Hadoop 2.6 support has been removed according to this ticket:
> > https://issues.apache.org/jira/browse/SPARK-25016
> >
> > We run our Spark cluster on K8s in standalone mode.
> > We access HDFS/Hive running on a Hadoop 2.6 cluster.
> > We've been using Spark 2.4.5 and are planning to upgrade to Spark 3.0.0.
> > However, we don't have any control over the Hadoop cluster, and it will
> > remain on 2.6.
> >
> > Is Spark 3.0 still compatible with HDFS/Hive running on Hadoop 2.6?
> >
> > Best Regards,
>
>
>
> --
> Sincerely,
>
> DB Tsai
> ----------------------------------------------------------
> Web: https://www.dbtsai.com
> PGP Key ID: 42E5B25A8F7A82C1
>


-- 
Umanga
http://jp.linkedin.com/in/umanga
http://umanga.ifreepages.com
