Re: [ANNOUNCE] Apache Spark 3.0.0

2020-06-18 Thread wuyi
Congrats!!






Re: [ANNOUNCE] Apache Spark 3.0.0

2020-06-18 Thread Takeshi Yamamuro
Congrats, all!

Bests,
Takeshi

On Fri, Jun 19, 2020 at 1:16 PM Felix Cheung 
wrote:

> Congrats
>
> --
> *From:* Jungtaek Lim 
> *Sent:* Thursday, June 18, 2020 8:18:54 PM
> *To:* Hyukjin Kwon 
> *Cc:* Mridul Muralidharan ; Reynold Xin <
> r...@databricks.com>; dev ; user <
> u...@spark.apache.org>
> *Subject:* Re: [ANNOUNCE] Apache Spark 3.0.0
>
> Great, thanks all for your efforts on the huge step forward!
>
> On Fri, Jun 19, 2020 at 12:13 PM Hyukjin Kwon  wrote:
>
> Yay!
>
> On Fri, Jun 19, 2020 at 4:46 AM, Mridul Muralidharan wrote:
>
> Great job everyone ! Congratulations :-)
>
> Regards,
> Mridul
>
> On Thu, Jun 18, 2020 at 10:21 AM Reynold Xin  wrote:
>

-- 
---
Takeshi Yamamuro


Re: [ANNOUNCE] Apache Spark 3.0.0

2020-06-18 Thread Gourav Sengupta
CELEBRATIONS!!!

On Thu, Jun 18, 2020 at 6:21 PM Reynold Xin  wrote:



Re: java.lang.ClassNotFoundException for s3a committer

2020-06-18 Thread murat migdisoglu
Hi all,
I've upgraded my test cluster to Spark 3 and changed my committer to
"directory", but I still get this error. The documentation is somewhat
unclear on this point.
Do I need to add a third-party jar to support the new committers?

java.lang.ClassNotFoundException:
org.apache.spark.internal.io.cloud.PathOutputCommitProtocol


On Thu, Jun 18, 2020 at 1:35 AM murat migdisoglu 
wrote:

> Hello all,
> We have a Hadoop cluster (using YARN) with S3 as the filesystem and
> S3Guard enabled.
> We are using Hadoop 3.2.1 with Spark 2.4.5.
>
> When I try to save a dataframe in parquet format, I get the following
> exception:
> java.lang.ClassNotFoundException:
> com.hortonworks.spark.cloud.commit.PathOutputCommitProtocol
>
> My relevant spark configurations are as following:
>
> "hadoop.mapreduce.outputcommitter.factory.scheme.s3a":"org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
> "fs.s3a.committer.name": "magic",
> "fs.s3a.committer.magic.enabled": true,
> "fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
>
> While Spark streaming fails with the exception above, Apache Beam succeeds
> in writing Parquet files.
> What might be the problem?
>
> Thanks in advance
>
>
> --
> "Talkers aren’t good doers. Rest assured that we’re going there to use
> our hands, not our tongues."
> W. Shakespeare
>


-- 
"Talkers aren’t good doers. Rest assured that we’re going there to use our
hands, not our tongues."
W. Shakespeare
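A plausible explanation for the ClassNotFoundException above, offered as an assumption based on the class's package name rather than a confirmed diagnosis: org.apache.spark.internal.io.cloud.PathOutputCommitProtocol lives in Spark's optional spark-hadoop-cloud module, which is not bundled in the default distribution. A minimal PySpark sketch of wiring it up; the Maven coordinates and version are assumptions to adapt (for some Spark versions the module may need to be built from source rather than pulled from Maven Central):

```python
# Sketch: enable the S3A "directory" committer via the optional
# spark-hadoop-cloud module. Artifact coordinates and version below
# are assumptions; match them to your Spark/Hadoop build.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.jars.packages",
            "org.apache.spark:spark-hadoop-cloud_2.12:3.0.0")  # assumed coordinates
    # Pick the committer ("directory", "partitioned", or "magic").
    .config("spark.hadoop.fs.s3a.committer.name", "directory")
    # Route Spark's commit protocol through the cloud committer classes.
    .config("spark.sql.sources.commitProtocolClass",
            "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol")
    .config("spark.sql.parquet.output.committer.class",
            "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter")
    .getOrCreate()
)
```

If the jar is on the classpath but the error persists, checking which module actually ships the class in your distribution would be the next step.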


Re: [ANNOUNCE] Apache Spark 3.0.0

2020-06-18 Thread Felix Cheung
Congrats


From: Jungtaek Lim 
Sent: Thursday, June 18, 2020 8:18:54 PM
To: Hyukjin Kwon 
Cc: Mridul Muralidharan ; Reynold Xin ; 
dev ; user 
Subject: Re: [ANNOUNCE] Apache Spark 3.0.0

Great, thanks all for your efforts on the huge step forward!

On Fri, Jun 19, 2020 at 12:13 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
Yay!

On Fri, Jun 19, 2020 at 4:46 AM Mridul Muralidharan <mri...@gmail.com> wrote:
Great job everyone ! Congratulations :-)

Regards,
Mridul

On Thu, Jun 18, 2020 at 10:21 AM Reynold Xin <r...@databricks.com> wrote:






Re: [ANNOUNCE] Apache Spark 3.0.0

2020-06-18 Thread Jungtaek Lim
Great, thanks all for your efforts on the huge step forward!

On Fri, Jun 19, 2020 at 12:13 PM Hyukjin Kwon  wrote:

> Yay!
>
> On Fri, Jun 19, 2020 at 4:46 AM, Mridul Muralidharan wrote:
>
>> Great job everyone ! Congratulations :-)
>>
>> Regards,
>> Mridul
>>
>> On Thu, Jun 18, 2020 at 10:21 AM Reynold Xin  wrote:
>>


Re: [ANNOUNCE] Apache Spark 3.0.0

2020-06-18 Thread Hyukjin Kwon
Yay!

On Fri, Jun 19, 2020 at 4:46 AM, Mridul Muralidharan wrote:

> Great job everyone ! Congratulations :-)
>
> Regards,
> Mridul
>
> On Thu, Jun 18, 2020 at 10:21 AM Reynold Xin  wrote:
>


Re: Initial Decom PR for Spark 3?

2020-06-18 Thread Stephen Boesch
Second paragraph of the PR lists the design doc.

> There is a design document at
https://docs.google.com/document/d/1xVO1b6KAwdUhjEJBolVPl9C6sLj7oOveErwDSYdT-pE/edit?usp=sharing

On Thu, 18 Jun 2020 at 18:05, Hyukjin Kwon  wrote:

> It looks like this needed an SPIP and a proper design doc to discuss.
>
> On Sun, Feb 9, 2020 at 1:23 AM, Erik Erlandson wrote:
>
>> I'd be willing to pull this in, unless others have concerns post
>> branch-cut.
>>
>> On Tue, Feb 4, 2020 at 2:51 PM Holden Karau  wrote:
>>
>>> Hi Y’all,
>>>
>>> I’ve got a K8s graceful decom PR (
>>> https://github.com/apache/spark/pull/26440
>>>  ) I’d love to try and get in for Spark 3, but I don’t want to push on
>>> it if folks don’t think it’s worth it. I’ve been working on it since 2017
>>> and it was really close in November but then I had the crash and had to
>>> step back for awhile.
>>>
>>> Its effectiveness is behind a feature flag and it’s been outstanding
>>> for a while, so those points are in its favour. It does, however, change
>>> things in core, which is not great.
>>>
>>> Cheers,
>>>
>>> Holden
>>> --
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9  
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>


Re: Initial Decom PR for Spark 3?

2020-06-18 Thread Holden Karau
As a follow-up: while I've backported this in some internal releases, I'm no
longer considering it a candidate for backporting to Spark 3. I should have
updated the thread with that. The design doc is linked in the PR.

On Thu, Jun 18, 2020 at 6:05 PM Hyukjin Kwon  wrote:

> It looks like this needed an SPIP and a proper design doc to discuss.
>
> On Sun, Feb 9, 2020 at 1:23 AM, Erik Erlandson wrote:
>
>> I'd be willing to pull this in, unless others have concerns post
>> branch-cut.
>>
>> On Tue, Feb 4, 2020 at 2:51 PM Holden Karau  wrote:
>>
>>> Hi Y’all,
>>>
>>> I’ve got a K8s graceful decom PR (
>>> https://github.com/apache/spark/pull/26440
>>>  ) I’d love to try and get in for Spark 3, but I don’t want to push on
>>> it if folks don’t think it’s worth it. I’ve been working on it since 2017
>>> and it was really close in November but then I had the crash and had to
>>> step back for awhile.
>>>
>>> Its effectiveness is behind a feature flag and it’s been outstanding
>>> for a while, so those points are in its favour. It does, however, change
>>> things in core, which is not great.
>>>
>>> Cheers,
>>>
>>> Holden
>>> --
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9  
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: Initial Decom PR for Spark 3?

2020-06-18 Thread Hyukjin Kwon
It looks like this needed an SPIP and a proper design doc to discuss.

On Sun, Feb 9, 2020 at 1:23 AM, Erik Erlandson wrote:

> I'd be willing to pull this in, unless others have concerns post
> branch-cut.
>
> On Tue, Feb 4, 2020 at 2:51 PM Holden Karau  wrote:
>
>> Hi Y’all,
>>
>> I’ve got a K8s graceful decom PR (
>> https://github.com/apache/spark/pull/26440
>>  ) I’d love to try and get in for Spark 3, but I don’t want to push on it
>> if folks don’t think it’s worth it. I’ve been working on it since 2017 and
>> it was really close in November but then I had the crash and had to step
>> back for awhile.
>>
>> Its effectiveness is behind a feature flag and it’s been outstanding for
>> a while, so those points are in its favour. It does, however, change things
>> in core, which is not great.
>>
>> Cheers,
>>
>> Holden
>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>


Re: Removing references to slave (and maybe in the future master)

2020-06-18 Thread Russell Spitzer
I really dislike the use of "worker" in the code base since it describes a
process which doesn't actually do work, but I don't think it's in the scope
for this ticket. I would definitely prefer we use "agent" instead of
"worker" (or some other name) and have master switched to something like
"resource manager" or something that actually describes the purpose of the
process.

I realize that touching "master" is going to disrupt just about everything
but these name choices are usually the first thing that trips up new Spark
Users. In my experience, I usually have to spend at least 15-20
minutes explaining that a worker will not actually do work, and the master
won't run their application.

Thanks Holden for doing all the legwork on this!


Re: Removing references to slave (and maybe in the future master)

2020-06-18 Thread Erik Krogen
Thanks a lot for proposing this, Holden.

I'd be curious to know how others feel about also tackling the word
blacklist -- while I think most would agree it is not as egregious as
master/slave, it seems to be an appropriate time to use the momentum to
really make a best effort at removing any trace of language that would
alienate potential community members. There is some discussion of this term
in this blog post, which I also encourage reading:
https://lethargy.org/~jesus/writes/a-guide-to-nomenclature-selection/

On Thu, Jun 18, 2020 at 1:27 PM Holden Karau  wrote:

> So I think using Worker everywhere would be a bit confusing since the
> relationship between worker and blockmanager replica is complex, also in
> the current PR `AgentLost` is not `WorkerLost` because it doesn't
> necessarily mean the worker is lost (there's a flag for if the worker has
> been lost).
>
> On Thu, Jun 18, 2020 at 1:21 PM Matei Zaharia 
> wrote:
>
>> Yup, it would be great to do this. FWIW, I would propose using “worker”
>> everywhere instead unless it already means something in that context, just
>> to have a single word for this (instead of multiple words such as agent,
>> replica, etc), but I haven’t looked into whether that would make anything
>> confusing.
>>
>> On Jun 18, 2020, at 1:14 PM, Holden Karau  wrote:
>>
>> Thank you. I agree being careful with API compatibility is important. I
>> think in situations where the terms are exposed in our API we can introduce
>> alternatives and deprecate the old ones to allow for a smooth migration.
>>
>> On Thu, Jun 18, 2020 at 12:28 PM Reynold Xin  wrote:
>>
>>> Thanks for doing this. I think this is a great thing to do.
>>>
>>> But we gotta be careful with API compatibility.
>>>
>>>
>>> On Thu, Jun 18, 2020 at 11:32 AM, Holden Karau 
>>> wrote:
>>>
 Hi Folks,

 I've started working on cleaning up the Spark code to remove references
 to slave since the word has a lot of negative connotations and we can
 generally replace it with more accurate/descriptive words in our code base.
 The PR is at https://github.com/apache/spark/pull/28864 (I'm a little
 uncertain on the place of where I chose the name "AgentLost" as the
 replacement, suggestions welcome).

 At some point I think we should explore deprecating master as well, but
 that is used very broadly inside of our code and in our APIs, so while it
 is visible to more people changing it would be more work. I think having
 consensus around removing slave though is a good first step.

 Cheers,

 Holden

 --
 Twitter: https://twitter.com/holdenkarau
 Books (Learning Spark, High Performance Spark, etc.):
 https://amzn.to/2MaRAG9  
 YouTube Live Streams: https://www.youtube.com/user/holdenkarau

>>>
>>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>>
>>
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>


Re: Removing references to slave (and maybe in the future master)

2020-06-18 Thread Holden Karau
So I think using Worker everywhere would be a bit confusing since the
relationship between worker and blockmanager replica is complex, also in
the current PR `AgentLost` is not `WorkerLost`, because it doesn't
necessarily mean the worker is lost (there's a flag indicating whether the
worker has been lost).

On Thu, Jun 18, 2020 at 1:21 PM Matei Zaharia 
wrote:

> Yup, it would be great to do this. FWIW, I would propose using “worker”
> everywhere instead unless it already means something in that context, just
> to have a single word for this (instead of multiple words such as agent,
> replica, etc), but I haven’t looked into whether that would make anything
> confusing.
>
> On Jun 18, 2020, at 1:14 PM, Holden Karau  wrote:
>
> Thank you. I agree being careful with API compatibility is important. I
> think in situations where the terms are exposed in our API we can introduce
> alternatives and deprecate the old ones to allow for a smooth migration.
>
> On Thu, Jun 18, 2020 at 12:28 PM Reynold Xin  wrote:
>
>> Thanks for doing this. I think this is a great thing to do.
>>
>> But we gotta be careful with API compatibility.
>>
>>
>> On Thu, Jun 18, 2020 at 11:32 AM, Holden Karau 
>> wrote:
>>
>>> Hi Folks,
>>>
>>> I've started working on cleaning up the Spark code to remove references
>>> to slave since the word has a lot of negative connotations and we can
>>> generally replace it with more accurate/descriptive words in our code base.
>>> The PR is at https://github.com/apache/spark/pull/28864 (I'm a little
>>> uncertain on the place of where I chose the name "AgentLost" as the
>>> replacement, suggestions welcome).
>>>
>>> At some point I think we should explore deprecating master as well, but
>>> that is used very broadly inside of our code and in our APIs, so while it
>>> is visible to more people changing it would be more work. I think having
>>> consensus around removing slave though is a good first step.
>>>
>>> Cheers,
>>>
>>> Holden
>>>
>>> --
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9  
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>
>> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
>
>

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: Removing references to slave (and maybe in the future master)

2020-06-18 Thread Matei Zaharia
Yup, it would be great to do this. FWIW, I would propose using “worker” 
everywhere instead unless it already means something in that context, just to 
have a single word for this (instead of multiple words such as agent, replica, 
etc), but I haven’t looked into whether that would make anything confusing.

> On Jun 18, 2020, at 1:14 PM, Holden Karau  wrote:
> 
> Thank you. I agree being careful with API compatibility is important. I think 
> in situations where the terms are exposed in our API we can introduce 
> alternatives and deprecate the old ones to allow for a smooth migration.
> 
> On Thu, Jun 18, 2020 at 12:28 PM Reynold Xin  > wrote:
> Thanks for doing this. I think this is a great thing to do.
> 
> But we gotta be careful with API compatibility.
> 
> 
> On Thu, Jun 18, 2020 at 11:32 AM, Holden Karau  > wrote:
> Hi Folks,
> 
> I've started working on cleaning up the Spark code to remove references to 
> slave since the word has a lot of negative connotations and we can generally 
> replace it with more accurate/descriptive words in our code base. The PR is 
> at https://github.com/apache/spark/pull/28864 
>  (I'm a little uncertain on the 
> place of where I chose the name "AgentLost" as the replacement, suggestions 
> welcome).
> 
> At some point I think we should explore deprecating master as well, but that 
> is used very broadly inside of our code and in our APIs, so while it is 
> visible to more people changing it would be more work. I think having 
> consensus around removing slave though is a good first step.
> 
> Cheers,
> 
> Holden
> 
> -- 
> Twitter: https://twitter.com/holdenkarau 
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 
>  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau 
> 
> -- 
> Twitter: https://twitter.com/holdenkarau 
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 
>  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau 
> 


Re: Removing references to slave (and maybe in the future master)

2020-06-18 Thread Holden Karau
Thank you. I agree being careful with API compatibility is important. I
think in situations where the terms are exposed in our API we can introduce
alternatives and deprecate the old ones to allow for a smooth migration.
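The introduce-alternative-then-deprecate pattern described above can be sketched as follows. This is an illustrative example only, not Spark's actual API; the class and method names are invented for the illustration:

```python
import warnings


class ClusterEvents:
    """Hypothetical API illustrating a rename with a deprecation bridge."""

    def worker_lost(self, worker_id: str) -> str:
        """Preferred, newly introduced name."""
        return f"handled loss of {worker_id}"

    def slave_lost(self, worker_id: str) -> str:
        """Deprecated alias kept so existing callers keep working."""
        warnings.warn(
            "slave_lost is deprecated; use worker_lost instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        return self.worker_lost(worker_id)
```

Callers of the old name keep working but see a DeprecationWarning, which gives a release or two of migration time before the alias is removed.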

On Thu, Jun 18, 2020 at 12:28 PM Reynold Xin  wrote:

> Thanks for doing this. I think this is a great thing to do.
>
> But we gotta be careful with API compatibility.
>
>
> On Thu, Jun 18, 2020 at 11:32 AM, Holden Karau 
> wrote:
>
>> Hi Folks,
>>
>> I've started working on cleaning up the Spark code to remove references
>> to slave since the word has a lot of negative connotations and we can
>> generally replace it with more accurate/descriptive words in our code base.
>> The PR is at https://github.com/apache/spark/pull/28864 (I'm a little
>> uncertain on the place of where I chose the name "AgentLost" as the
>> replacement, suggestions welcome).
>>
>> At some point I think we should explore deprecating master as well, but
>> that is used very broadly inside of our code and in our APIs, so while it
>> is visible to more people changing it would be more work. I think having
>> consensus around removing slave though is a good first step.
>>
>> Cheers,
>>
>> Holden
>>
>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>
> --
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: [ANNOUNCE] Apache Spark 3.0.0

2020-06-18 Thread Mridul Muralidharan
Great job everyone ! Congratulations :-)

Regards,
Mridul

On Thu, Jun 18, 2020 at 10:21 AM Reynold Xin  wrote:



Re: Removing references to slave (and maybe in the future master)

2020-06-18 Thread Reynold Xin
Thanks for doing this. I think this is a great thing to do.

But we gotta be careful with API compatibility.

On Thu, Jun 18, 2020 at 11:32 AM, Holden Karau <hol...@pigscanfly.ca> wrote:

> Hi Folks,
>
> I've started working on cleaning up the Spark code to remove references to
> slave since the word has a lot of negative connotations and we can
> generally replace it with more accurate/descriptive words in our code
> base. The PR is at https://github.com/apache/spark/pull/28864 (I'm a
> little uncertain on the place of where I chose the name "AgentLost" as the
> replacement, suggestions welcome).
>
> At some point I think we should explore deprecating master as well, but
> that is used very broadly inside of our code and in our APIs, so while it
> is visible to more people changing it would be more work. I think having
> consensus around removing slave though is a good first step.
>
> Cheers,
>
> Holden
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau



Re: [ANNOUNCE] Apache Spark 3.0.0

2020-06-18 Thread Gaetano Fabiano
Congratulations!
Celebrating!

Sent from my iPhone

> On 18 Jun 2020, at 20:38, Gourav Sengupta  wrote:
> 
> 
> CELEBRATIONS!!!
> 
>> On Thu, Jun 18, 2020 at 6:21 PM Reynold Xin  wrote:


Removing references to slave (and maybe in the future master)

2020-06-18 Thread Holden Karau
Hi Folks,

I've started working on cleaning up the Spark code to remove references to
slave since the word has a lot of negative connotations and we can
generally replace it with more accurate/descriptive words in our code base.
The PR is at https://github.com/apache/spark/pull/28864 (I'm a little
uncertain about the name "AgentLost" I chose as the replacement;
suggestions welcome).

At some point I think we should explore deprecating master as well, but
that is used very broadly inside of our code and in our APIs, so while it
is visible to more people changing it would be more work. I think having
consensus around removing slave though is a good first step.

Cheers,

Holden

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


[ANNOUNCE] Apache Spark 3.0.0

2020-06-18 Thread Reynold Xin
Hi all,

Apache Spark 3.0.0 is the first release of the 3.x line. It builds on many of 
the innovations from Spark 2.x, bringing new ideas as well as continuing 
long-term projects that have been in development. This release resolves more 
than 3400 tickets.

We'd like to thank our contributors and users for their contributions and early 
feedback to this release. This release would not have been possible without you.

To download Spark 3.0.0, head over to the download page: 
http://spark.apache.org/downloads.html

To view the release notes: 
https://spark.apache.org/releases/spark-release-3-0-0.html
