I'm also genuinely curious when PyPI users would care about the
bundled Hadoop jars - do we even need two versions? That itself is
extra complexity for end users.
I do think Hadoop 3 is the better choice for the user who doesn't
care, and the better choice long term.
OK, but let's at least move ahead with changing the defaults.
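
For concreteness, here is a rough sketch of how a single PyPI package
could let users pick the Hadoop profile at install time. The
PYSPARK_HADOOP_VERSION variable and the per-profile jar layout below are
illustrative assumptions, not existing pyspark packaging code:

    # Hypothetical setup.py fragment: select which bundled Hadoop jars
    # to ship based on an environment variable read at build/install time.
    import os
    from setuptools import setup

    # Default to Hadoop 3.2; users who still need Hadoop 2.7 can override.
    hadoop_version = os.environ.get("PYSPARK_HADOOP_VERSION", "3.2")
    if hadoop_version not in ("2.7", "3.2"):
        raise RuntimeError(
            "Unsupported PYSPARK_HADOOP_VERSION: " + hadoop_version)

    setup(
        name="pyspark",
        version="3.0.0",
        packages=["pyspark"],
        # Ship only the jar set matching the requested Hadoop profile.
        package_data={
            "pyspark": ["jars/hadoop-" + hadoop_version + "/*.jar"]},
    )

Users who care could then run something like
PYSPARK_HADOOP_VERSION=2.7 pip install --no-binary pyspark pyspark
(the variable only takes effect when pip builds from the sdist), while
everyone else gets the Hadoop 3 default.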

On Wed, Jun 24, 2020 at 12:38 PM Xiao Li <lix...@databricks.com> wrote:
>
> Hi, Dongjoon,
>
> Please do not misinterpret my point. I already clearly said "I do not know 
> how to track the popularity of Hadoop 2 vs Hadoop 3."
>
> Also, let me repeat my opinion: the top priority is to provide two options 
> for the PyPI distribution and let the end users choose the one they need: 
> Hadoop 3.2 or Hadoop 2.7. In general, when we want to make any breaking 
> change, let us follow our protocol documented in 
> https://spark.apache.org/versioning-policy.html.
>
> If you just want to change the Jenkins setup, I am OK with it. If you want 
> to change the default distribution, we need more discussion in the community 
> to reach an agreement.
>
> Thanks,
>
> Xiao
>
