Is Spark's fair scheduler available on Kubernetes?

2022-04-10 Thread Jason Jun
The official doc, https://spark.apache.org/docs/latest/job-scheduling.html,
doesn't mention whether the fair scheduler works on a Kubernetes cluster.
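
For context, this is the configuration I mean. A minimal sketch, assuming
fair scheduling within a single application is set via SparkConf (the app
name and the pool name "production" are just examples):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("fair-scheduler-check")
  .config("spark.scheduler.mode", "FAIR")  // default mode is FIFO
  .getOrCreate()

// Jobs submitted from this thread go into the named scheduler pool.
spark.sparkContext.setLocalProperty("spark.scheduler.pool", "production")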

Can anyone quickly answer this?

TIA.
Jason


Re: Spark client for Hadoop 2.x

2022-04-10 Thread Dongjoon Hyun
Hi, Amin

In general, the Apache Spark community has received a lot of feedback and has
been moving toward the following:

- Using the latest Hadoop versions to pick up more bug fixes, including CVE
fixes.
- Using Hadoop's shaded clients to minimize dependency issues.

Since neither goal is achievable with Hadoop 2 clients, I believe the
official answer to (1) is `No`. (Especially for your Hadoop 2.7 cluster;
2.7.7 was released in 2018.)
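
To illustrate (a build sketch, not an official statement of the dependency
tree): the Spark 3.2.x artifacts published to Maven Central already pull in
the Hadoop 3 shaded clients transitively, so there is no Hadoop 2 variant to
pin.

// build.sbt (sketch): spark-sql 3.2.1 transitively brings in the Hadoop 3
// shaded clients (hadoop-client-api / hadoop-client-runtime).
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.2.1"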

For the second question, the Apache Spark community has been collaborating
with the Apache Hadoop community so that the latest Apache Hadoop 3 clients
can connect to both old and new Hadoop clusters as well as public cloud
environments. I believe your production jobs should be fine as long as you
are not relying on proprietary (non-Apache Hadoop) features from private
vendors. Please report to the Apache Hadoop community or to us if you hit
any unknown compatibility issues.
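
If it helps, a smoke test along the lines of the one you describe might look
like the sketch below. It simply round-trips a small dataset through the old
cluster; "nn-host:8020" is a placeholder NameNode address, not from your
setup.

import org.apache.spark.sql.SparkSession

object Hadoop2SmokeTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hadoop2-compat-check")
      .getOrCreate()

    // Write a tiny dataset to the Hadoop 2.7 cluster, then read it back.
    val df = spark.range(100).toDF("id")
    df.write.mode("overwrite").parquet("hdfs://nn-host:8020/tmp/compat-check")
    val readBack = spark.read.parquet("hdfs://nn-host:8020/tmp/compat-check")
    assert(readBack.count() == 100)

    spark.stop()
  }
}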

Bests
Dongjoon.


On Fri, Apr 8, 2022 at 9:37 PM Amin Borjian wrote:

>
> From Spark version 3.1.0 onwards, the client artifacts published for Spark
> are built with Hadoop 3 and placed in the Maven repository. Unfortunately,
> we currently use Hadoop 2.7.7 in our infrastructure.
>
> 1) Does Spark have a plan to publish the Spark client dependencies for
> Hadoop 2.x?
>
> 2) Are the new Spark clients capable of connecting to a Hadoop 2.x
> cluster? (In a simple test, Spark client 3.2.1 had no problem with our
> Hadoop 2.7 cluster, but we wanted to know whether there is any guarantee
> from Spark.)
>
> Thank you very much in advance
>
> Amin Borjian