Re: ASF board report draft for February 2022

2022-02-08 Thread Mich Talebzadeh
Hi,

I believe it would be beneficial to provide links to the SPIPs mentioned in
the report:

- Two Spark Project Improvement Proposals (SPIPs) were recently accepted by
the community: 1) Support for Customized Kubernetes Schedulers and
2) Storage Partitioned Join for Data Source V2



HTH



On Tue, 8 Feb 2022 at 09:06, Matei Zaharia wrote:

> It’s time to send our quarterly report to the ASF board again this
> Wednesday. I’ve written the following draft for it — let me know if you
> want to add or change anything.
>
> ==
>
> Description:
>
> Apache Spark is a fast and general purpose engine for large-scale data
> processing. It offers high-level APIs in Java, Scala, Python, R and SQL as
> well as a rich set of libraries including stream processing, machine
> learning, and graph analytics.
>
> Issues for the board:
>
> - None
>
> Project status:
>
> - We released Apache Spark 3.2.1, a bug fix release for the 3.2 line, in
> January.
>
> - Two Spark Project Improvement Proposals (SPIPs) were recently accepted
> by the community: Support for Customized Kubernetes Schedulers and Storage
> Partitioned Join for Data Source V2.
>
> - We’ve migrated away from Spark’s original Jenkins CI/CD infrastructure,
> which was graciously hosted by UC Berkeley on their clusters since 2013, to
> GitHub Actions. Thanks to the Berkeley CS department for hosting this for
> so long!
>
> - We added a new committer, Yuanjian Li, in December 2021.
>
> - We added a new PMC member, Maciej Szymkiewicz, in January 2022.
>
> Trademarks:
>
> - No changes since the last report.
>
> Latest releases:
>
> - Spark 3.2.1 was released on January 26, 2022.
> - Spark 3.2.0 was released on October 13, 2021.
> - Spark 3.1.2 was released on June 23rd, 2021.
>
> Committers and PMC:
> - The latest committer was added on Dec 20th, 2021 (Yuanjian Li).
> - The latest PMC member was added on Jan 19th, 2022 (Maciej Szymkiewicz).
>
> ==
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: [VOTE] Spark 3.1.3 RC3

2022-02-08 Thread Holden Karau
Yup. I’ve run into some weirdness with the docs again, though, which I want
to verify before I send the vote email.

On Mon, Feb 7, 2022 at 10:06 PM Wenchen Fan wrote:

> Shall we use the release scripts of branch 3.1 to release 3.1?
>
> On Fri, Feb 4, 2022 at 4:57 AM Holden Karau wrote:
>
>> Good catch Dongjoon :)
>>
>> This release candidate fails, but feel free to keep testing for any other
>> potential blockers.
>>
>> I’ll roll RC4 next week with the older release scripts (but the more
>> modern image since the legacy image didn’t have a good time with the R doc
>> packaging).
>>
>> On Thu, Feb 3, 2022 at 3:53 PM Dongjoon Hyun wrote:
>>
>>> Unfortunately, -1 for 3.1.3 RC3 due to the packaging issue.
>>>
>>> It seems that the master branch release script didn't work properly for
>>> the Hadoop 2 binary distribution, Holden.
>>>
>>> $ curl -s
>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.3-rc3-bin/spark-3.1.3-bin-hadoop2.tgz
>>> | tar tz | grep hadoop-common
>>> spark-3.1.3-bin-hadoop2/jars/hadoop-common-3.2.0.jar
>>>
>>> Apache Spark hasn't dropped the Apache Hadoop 2 based binary distribution yet.
>>>
>>> Dongjoon
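
The curl check above can also be scripted. The following is only a minimal sketch, assuming the distribution keeps its jars under `<dist>/jars/` and names them `hadoop-common-<version>.jar`, as in the listing above. The function names are hypothetical and are not part of Spark's release scripts, and the tarball is built in memory for illustration instead of being downloaded.

```python
import io
import re
import tarfile

# Assumed naming convention: <dist>/jars/hadoop-common-<version>.jar
HADOOP_COMMON_JAR = re.compile(r"/jars/hadoop-common-(\d+(?:\.\d+)*)\.jar$")


def hadoop_common_versions(tgz_file):
    """Return the version of every hadoop-common jar listed in a gzipped tarball."""
    with tarfile.open(fileobj=tgz_file, mode="r:gz") as archive:
        matches = (HADOOP_COMMON_JAR.search(name) for name in archive.getnames())
        return [m.group(1) for m in matches if m]


def bundles_hadoop_major(tgz_file, expected_major):
    """True only if a hadoop-common jar exists and all of them share the expected major version."""
    versions = hadoop_common_versions(tgz_file)
    return bool(versions) and all(
        v.split(".")[0] == str(expected_major) for v in versions
    )


def fake_distribution(jar_name):
    """Build an in-memory .tgz holding one empty file (demonstration only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        archive.addfile(tarfile.TarInfo(name=jar_name))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Mirrors the listing above: a "hadoop2" package that contains a 3.2.0 jar.
    rc3 = fake_distribution("spark-3.1.3-bin-hadoop2/jars/hadoop-common-3.2.0.jar")
    print(bundles_hadoop_major(rc3, expected_major=2))  # False
```

For a real release candidate, the downloaded `.tgz` would be opened in binary mode and passed in place of the in-memory buffer.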
>>>
>>>
>>> On Wed, Feb 2, 2022 at 3:38 PM Mridul Muralidharan wrote:
>>>
 Hi,

   Minor nit: the tag mentioned under [1] looks like a typo - I used
 "v3.1.3-rc3"  for my vote (3.2.1 is mentioned in a couple of places, treat
 them as 3.1.3 instead)

 +1
 Signatures, digests, etc. check out fine.
 Checked out the tag and built/tested with -Pyarn -Pmesos -Pkubernetes

 Regards,
 Mridul

 [1] "The tag to be voted on is v3.2.1-rc1" - the commit hash and git
 url are correct.


 On Wed, Feb 2, 2022 at 9:30 AM Mridul Muralidharan wrote:

>
> Thanks Tom !
> I missed [1] (or probably forgot) the 3.1 part of the discussion given
> it centered around 3.2 ...
>
>
> Regards,
> Mridul
>
> [1] https://www.mail-archive.com/dev@spark.apache.org/msg28484.html
>
> On Wed, Feb 2, 2022 at 8:55 AM Thomas Graves wrote:
>
>> Releasing all the maintenance lines was discussed back at the beginning
>> of December (Dec 6), when we were talking about releasing 3.2.1.
>>
>> Tom
>>
>> On Wed, Feb 2, 2022 at 2:07 AM Mridul Muralidharan wrote:
>> >
>> > Hi Holden,
>> >
>> >   Not that I am against releasing 3.1.3 (given the fixes that have
>> > already gone in), but did we discuss releasing it? I might have missed
>> > the thread ...
>> >
>> > Regards,
>> > Mridul
>> >
>> > On Tue, Feb 1, 2022 at 7:12 PM Holden Karau wrote:
>> >>
>> >> Please vote on releasing the following candidate as Apache Spark
>> >> version 3.1.3.
>> >>
>> >> The vote is open until Feb. 4th at 5 PM PST (1 AM UTC + 1 day) and
>> >> passes if a majority of +1 PMC votes are cast, with a minimum of 3 +1
>> >> votes.
>> >>
>> >> [ ] +1 Release this package as Apache Spark 3.1.3
>> >> [ ] -1 Do not release this package because ...
>> >>
>> >> To learn more about Apache Spark, please see
>> http://spark.apache.org/
>> >>
>> >> There are currently no open issues targeting 3.1.3 in Spark's JIRA
>> https://issues.apache.org/jira/browse
>> >> (try project = SPARK AND "Target Version/s" = "3.1.3" AND status
>> in (Open, Reopened, "In Progress"))
>> >> at https://s.apache.org/n79dw
>> >>
>> >>
>> >>
>> >> The tag to be voted on is v3.2.1-rc1 (commit
>> >> b8c0799a8cef22c56132d94033759c9f82b0cc86):
>> >> https://github.com/apache/spark/tree/v3.1.3-rc3
>> >>
>> >> The release files, including signatures, digests, etc. can be
>> found at:
>> >> https://dist.apache.org/repos/dist/dev/spark/v3.1.3-rc3-bin/
>> >>
>> >> Signatures used for Spark RCs can be found in this file:
>> >> https://dist.apache.org/repos/dist/dev/spark/KEYS
>> >>
>> >> The staging repository for this release can be found at:
>> >> https://repository.apache.org/content/repositories/orgapachespark-1400/
>> >>
>> >> The documentation corresponding to this release can be found at:
>> >> https://dist.apache.org/repos/dist/dev/spark/v3.1.3-rc3-docs/
>> >>
>> >> The list of bug fixes going into 3.1.3 can be found at the
>> following URL:
>> >> https://s.apache.org/x0q9b
>> >>
>> >> This release is using the release script in master as of
>> >> ddc77fb906cb3ce1567d277c2d0850104c89ac25
>> >> The release docker container was rebuilt since the previous
>> >> version didn't have the necessary components to build the R
>> >> documentation.
>> >>
>> >> FAQ
>> >>
>> >>
>> >> =
>> >> How can I help test this release?
>> >> =
>> >>
>> >> If you are a Spark user, you can help us test this release by
>> taking
>> >> an existing Spark 

ASF board report draft for February 2022

2022-02-08 Thread Matei Zaharia
It’s time to send our quarterly report to the ASF board again this Wednesday. 
I’ve written the following draft for it — let me know if you want to add or 
change anything.

==

Description:

Apache Spark is a fast and general purpose engine for large-scale data
processing. It offers high-level APIs in Java, Scala, Python, R and SQL as
well as a rich set of libraries including stream processing, machine learning,
and graph analytics.

Issues for the board:

- None

Project status:

- We released Apache Spark 3.2.1, a bug fix release for the 3.2 line, in 
January.

- Two Spark Project Improvement Proposals (SPIPs) were recently accepted by the 
community: Support for Customized Kubernetes Schedulers and Storage Partitioned 
Join for Data Source V2.

- We’ve migrated away from Spark’s original Jenkins CI/CD infrastructure, which 
was graciously hosted by UC Berkeley on their clusters since 2013, to GitHub 
Actions. Thanks to the Berkeley CS department for hosting this for so long!

- We added a new committer, Yuanjian Li, in December 2021.

- We added a new PMC member, Maciej Szymkiewicz, in January 2022.

Trademarks:

- No changes since the last report.

Latest releases:

- Spark 3.2.1 was released on January 26, 2022.
- Spark 3.2.0 was released on October 13, 2021.
- Spark 3.1.2 was released on June 23rd, 2021.

Committers and PMC:
- The latest committer was added on Dec 20th, 2021 (Yuanjian Li).
- The latest PMC member was added on Jan 19th, 2022 (Maciej Szymkiewicz).

==