Re: Dropping Apache Spark Hadoop2 Binary Distribution?

Chao Sun Wed, 05 Oct 2022 13:58:51 -0700

+1

> and specifically may allow us to finally move off of the ancient version
> of Guava (?)


I think the Guava issue comes from the Hive 2.3 dependency, not Hadoop.

On Wed, Oct 5, 2022 at 1:55 PM Xinrong Meng <xinrong.apa...@gmail.com>
wrote:

> +1.
>
> On Wed, Oct 5, 2022 at 1:53 PM Xiao Li <lix...@databricks.com.invalid>
> wrote:
>
>> +1.
>>
>> Xiao
>>
>> On Wed, Oct 5, 2022 at 12:49 PM Sean Owen <sro...@gmail.com> wrote:
>>
>>> I'm OK with this. It simplifies maintenance a bit, and specifically may
>>> allow us to finally move off of the ancient version of Guava (?)
>>>
>>> On Mon, Oct 3, 2022 at 10:16 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
>>> wrote:
>>>
>>>> Hi, All.
>>>>
>>>> I'm wondering whether the following Apache Spark Hadoop2 Binary Distribution
>>>> is still used by anyone in the community. If it's not used or not useful,
>>>> we may remove it from the Apache Spark 3.4.0 release.
>>>>
>>>>
>>>> https://downloads.apache.org/spark/spark-3.3.0/spark-3.3.0-bin-hadoop2.tgz
>>>>
>>>> Here is the background of this question.
>>>> Since Apache Spark 2.2.0 (SPARK-19493, SPARK-19550), the Apache
>>>> Spark community has been building and releasing with Java 8 only.
>>>> I believe that user applications also use Java 8+ these days.
>>>> Recently, I received the following message from the Hadoop PMC.
>>>>
>>>>   > "if you really want to claim hadoop 2.x compatibility, then you have to
>>>>   > be building against java 7". Otherwise a lot of people with hadoop 2.x
>>>>   > clusters won't be able to run your code. If your projects are java8+
>>>>   > only, then they are implicitly hadoop 3.1+, no matter what you use
>>>>   > in your build. Hence: no need for branch-2 branches except
>>>>   > to complicate your build/test/release processes [1]
>>>>
>>>> If the Hadoop2 binary distribution is no longer used as of today,
>>>> or is incomplete somewhere because it is built with Java 8, the following
>>>> three existing alternative Hadoop 3 binary distributions could be
>>>> a better official solution for old Hadoop 2 clusters.
>>>>
>>>>     1) Scala 2.12 and without-hadoop distribution
>>>>     2) Scala 2.12 and Hadoop 3 distribution
>>>>     3) Scala 2.13 and Hadoop 3 distribution
>>>>
>>>> In short, is anyone using the Apache Spark 3.3.0 Hadoop2
>>>> binary distribution?
>>>>
>>>> Dongjoon
>>>>
>>>> [1]
>>>> https://issues.apache.org/jira/browse/ORC-1251?focusedCommentId=17608247&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17608247
>>>>
>>>
>>
