Re: Apache Spark 3.1.2 Release?

2021-05-18 Thread Gengliang Wang
Late +1, thank you, Dongjoon!

> On May 19, 2021, at 10:47 AM, Jungtaek Lim wrote:
> 
> Late +1 here as well, thanks for volunteering!
> 
> On Wed, May 19, 2021 at 11:24 AM, 郑瑞峰 <ruife...@foxmail.com> wrote:
> late +1. thanks Dongjoon!
> 
> 
> -- Original Message --
> From: "Dongjoon Hyun" <dongjoon.h...@gmail.com>;
> Sent: Wednesday, May 19, 2021, 1:29 AM
> To: "Wenchen Fan" <cloud0...@gmail.com>;
> Cc: "Xiao Li" <lix...@databricks.com>; "Kent Yao" <yaooq...@gmail.com>;
> "John Zhuge" <jzh...@apache.org>; "Hyukjin Kwon" <gurwls...@gmail.com>;
> "Holden Karau" <hol...@pigscanfly.ca>; "Takeshi Yamamuro" <linguin@gmail.com>;
> "dev" <dev@spark.apache.org>; "Yuming Wang" <wgy...@gmail.com>;
> Subject: Re: Apache Spark 3.1.2 Release?
> 
> Thank you all! I'll start to prepare.
> 
> Bests,
> Dongjoon.
> 
> On Tue, May 18, 2021 at 12:53 AM Wenchen Fan <cloud0...@gmail.com> wrote:
> +1, thanks!
> 
> On Tue, May 18, 2021 at 1:37 PM Xiao Li <lix...@databricks.com> wrote:
> +1 Thanks, Dongjoon!
> 
> Xiao
> 
> 
> 
> On Mon, May 17, 2021 at 8:45 PM Kent Yao <yaooq...@gmail.com> wrote:
> +1. thanks Dongjoon
> 
> Kent Yao 
> @ Data Science Center, Hangzhou Research Institute, NetEase Corp.
> a spark enthusiast
> kyuubi <https://github.com/yaooqinn/kyuubi> is a unified multi-tenant JDBC 
> interface for large-scale data processing and analytics, built on top of 
> Apache Spark <http://spark.apache.org/>.
> spark-authorizer <https://github.com/yaooqinn/spark-authorizer> A Spark SQL 
> extension which provides SQL Standard Authorization for Apache Spark 
> <http://spark.apache.org/>.
> spark-postgres <https://github.com/yaooqinn/spark-postgres> A library for 
> reading data from and transferring data to Postgres / Greenplum with Spark 
> SQL and DataFrames, 10~100x faster.
> itatchi <https://github.com/yaooqinn/spark-func-extras> A library that brings 
> useful functions from various modern database management systems to Apache 
> Spark <http://spark.apache.org/>.
> 
> 
>  
> 
> On 05/18/2021 10:57, John Zhuge <jzh...@apache.org> wrote:
> +1, thanks Dongjoon!
> 
> On Mon, May 17, 2021 at 7:50 PM Yuming Wang <wgy...@gmail.com> wrote:
> +1.
> 
> On Tue, May 18, 2021 at 9:06 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:
> +1 thanks for driving me
> 
> On Tue, 18 May 2021, 09:33 Holden Karau, <hol...@pigscanfly.ca> wrote:
> +1 and thanks for volunteering to be the RM :)
> 
> On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro <linguin@gmail.com> wrote:
> Thank you, Dongjoon~ sgtm, too.
> 
> On Tue, May 18, 2021 at 7:34 AM Cheng Su  wrote:
> +1 for a new release, thanks Dongjoon!
> 
> Cheng Su
> 
> On 5/17/21, 2:44 PM, "Liang-Chi Hsieh" <vii...@gmail.com> wrote:
> 
> +1 sounds good. Thanks Dongjoon for volunteering on this!
> 
> 
> Liang-Chi
> 
> 
> Dongjoon Hyun-2 wrote
> > Hi, All.
> > 
> > Since Apache Spark 3.1.1 tag creation (Feb 21),
> > new 172 patches including 9 correctness patches and 4 K8s patches arrived
> > at branch-3.1.
> > 
> > Shall we make a new release, Apache Spark 3.1.2, as the second release at
> > 3.1 line?
> > I'd like to volunteer for the release manager for Apache Spark 3.1.2.
> > I'm thinking about starting the first RC next week.
> > 
> > $ git log --oneline v3.1.1..HEAD | wc -l
> >  172
> > 
> > # Known correctness issues
> > SPARK-34534 New protocol FetchShuffleBlocks in OneForOneBlockFetcher
> > lead to data loss or correctness
> > SPARK-34545 PySpark Python UDF return inconsistent results when
> > applying 2 UDFs with different return type to 2 columns together
> > SPARK-34681 Full outer shuffled hash join when building left side
> > produces wrong result
> > SPARK-34719 fail if the view query has duplicated column names
> > SPARK-34794 Nested higher-order functions broken in DSL
> > SPARK-34829 transform_values return identical values when it's used
> > with udf that returns reference type
> > SPARK-34833 Apply right-padding cor

Re: Apache Spark 3.1.2 Release?

2021-05-18 Thread Jungtaek Lim
Late +1 here as well, thanks for volunteering!

On Wed, May 19, 2021 at 11:24 AM, 郑瑞峰 wrote:

> late +1. thanks Dongjoon!
>
>
> -- Original Message --
> *From:* "Dongjoon Hyun";
> *Sent:* Wednesday, May 19, 2021, 1:29 AM
> *To:* "Wenchen Fan";
> *Cc:* "Xiao Li"; "Kent Yao"; "John Zhuge"; "Hyukjin Kwon"; "Holden Karau";
> "Takeshi Yamamuro"; "dev"; "Yuming Wang";
> *Subject:* Re: Apache Spark 3.1.2 Release?
>
> Thank you all! I'll start to prepare.
>
> Bests,
> Dongjoon.
>
> On Tue, May 18, 2021 at 12:53 AM Wenchen Fan  wrote:
>
>> +1, thanks!
>>
>> On Tue, May 18, 2021 at 1:37 PM Xiao Li  wrote:
>>
>>> +1 Thanks, Dongjoon!
>>>
>>> Xiao
>>>
>>>
>>>
>>> On Mon, May 17, 2021 at 8:45 PM Kent Yao  wrote:
>>>
>>>> +1. thanks Dongjoon
>>>>
>>>> *Kent Yao *
>>>> @ Data Science Center, Hangzhou Research Institute, NetEase Corp.
>>>> *a spark enthusiast*
>>>> *kyuubi <https://github.com/yaooqinn/kyuubi> is a
>>>> unified multi-tenant JDBC interface for large-scale data processing and
>>>> analytics, built on top of Apache Spark <http://spark.apache.org/>.*
>>>> *spark-authorizer <https://github.com/yaooqinn/spark-authorizer> A Spark
>>>> SQL extension which provides SQL Standard Authorization for Apache
>>>> Spark <http://spark.apache.org/>.*
>>>> *spark-postgres <https://github.com/yaooqinn/spark-postgres> A library
>>>> for reading data from and transferring data to Postgres / Greenplum with
>>>> Spark SQL and DataFrames, 10~100x faster.*
>>>> *itatchi <https://github.com/yaooqinn/spark-func-extras> A library that
>>>> brings useful functions from various modern database management systems to
>>>> Apache Spark <http://spark.apache.org/>.*
>>>>
>>>>
>>>>
>>>> On 05/18/2021 10:57,John Zhuge 
>>>> wrote:
>>>>
>>>> +1, thanks Dongjoon!
>>>>
>>>> On Mon, May 17, 2021 at 7:50 PM Yuming Wang  wrote:
>>>>
>>>>> +1.
>>>>>
>>>>> On Tue, May 18, 2021 at 9:06 AM Hyukjin Kwon 
>>>>> wrote:
>>>>>
>>>>>> +1 thanks for driving me
>>>>>>
>>>>>> On Tue, 18 May 2021, 09:33 Holden Karau, 
>>>>>> wrote:
>>>>>>
>>>>>>> +1 and thanks for volunteering to be the RM :)
>>>>>>>
>>>>>>> On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro <
>>>>>>> linguin@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thank you, Dongjoon~ sgtm, too.
>>>>>>>>
>>>>>>>> On Tue, May 18, 2021 at 7:34 AM Cheng Su 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> +1 for a new release, thanks Dongjoon!
>>>>>>>>>
>>>>>>>>> Cheng Su
>>>>>>>>>
>>>>>>>>> On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"  wrote:
>>>>>>>>>
>>>>>>>>> +1 sounds good. Thanks Dongjoon for volunteering on this!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Liang-Chi
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Dongjoon Hyun-2 wrote
>>>>>>>>> > Hi, All.
>>>>>>>>> >
>>>>>>>>> > Since Apache Spark 3.1.1 tag creation (Feb 21),
>>>>>>>>> > new 172 patches including 9 correctness patches and 4 K8s patches arrived
>>>>>>>>> > at branch-3.1.
>>>>>>>>> >
>>>>>>>>> > Shall we make a new release, Apache Spark 3.1.2, as the second release at
>>>>>>>>> > 3.1 line?
>>>>>>>>> > I'd like to volunteer for the release manager for Apache Spark 3.1.2.
>>>>>>>>> > I'm thinking about starting the first RC next week.

Re: Apache Spark 3.1.2 Release?

2021-05-18 Thread 郑瑞峰
late +1. thanks Dongjoon!



-- Original Message --
From: "Dongjoon Hyun"

Re: Apache Spark 3.1.2 Release?

2021-05-18 Thread Dongjoon Hyun
Thank you all! I'll start to prepare.

Bests,
Dongjoon.

On Tue, May 18, 2021 at 12:53 AM Wenchen Fan  wrote:

> +1, thanks!
>
> On Tue, May 18, 2021 at 1:37 PM Xiao Li  wrote:
>
>> +1 Thanks, Dongjoon!
>>
>> Xiao
>>
>>
>>
>> On Mon, May 17, 2021 at 8:45 PM Kent Yao  wrote:
>>
>>> +1. thanks Dongjoon
>>>
>>> *Kent Yao *
>>> @ Data Science Center, Hangzhou Research Institute, NetEase Corp.
>>> *a spark enthusiast*
>>> *kyuubi is a
>>> unified multi-tenant JDBC interface for large-scale data processing and
>>> analytics, built on top of Apache Spark.*
>>> *spark-authorizer A Spark
>>> SQL extension which provides SQL Standard Authorization for Apache
>>> Spark.*
>>> *spark-postgres A library
>>> for reading data from and transferring data to Postgres / Greenplum with
>>> Spark SQL and DataFrames, 10~100x faster.*
>>> *itatchi A library that
>>> brings useful functions from various modern database management systems to
>>> Apache Spark.*
>>>
>>>
>>>
>>> On 05/18/2021 10:57,John Zhuge 
>>> wrote:
>>>
>>> +1, thanks Dongjoon!
>>>
>>> On Mon, May 17, 2021 at 7:50 PM Yuming Wang  wrote:
>>>
 +1.

 On Tue, May 18, 2021 at 9:06 AM Hyukjin Kwon 
 wrote:

> +1 thanks for driving me
>
> On Tue, 18 May 2021, 09:33 Holden Karau,  wrote:
>
>> +1 and thanks for volunteering to be the RM :)
>>
>> On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro <
>> linguin@gmail.com> wrote:
>>
>>> Thank you, Dongjoon~ sgtm, too.
>>>
>>> On Tue, May 18, 2021 at 7:34 AM Cheng Su 
>>> wrote:
>>>
 +1 for a new release, thanks Dongjoon!

 Cheng Su

 On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"  wrote:

 +1 sounds good. Thanks Dongjoon for volunteering on this!


 Liang-Chi


 Dongjoon Hyun-2 wrote
 > Hi, All.
 >
 > Since Apache Spark 3.1.1 tag creation (Feb 21),
 > new 172 patches including 9 correctness patches and 4 K8s
 patches arrived
 > at branch-3.1.
 >
 > Shall we make a new release, Apache Spark 3.1.2, as the
 second release at
 > 3.1 line?
 > I'd like to volunteer for the release manager for Apache
 Spark 3.1.2.
 > I'm thinking about starting the first RC next week.
 >
 > $ git log --oneline v3.1.1..HEAD | wc -l
 >  172
 >
 > # Known correctness issues
 > SPARK-34534 New protocol FetchShuffleBlocks in
 OneForOneBlockFetcher
 > lead to data loss or correctness
 > SPARK-34545 PySpark Python UDF return inconsistent
 results when
 > applying 2 UDFs with different return type to 2 columns
 together
 > SPARK-34681 Full outer shuffled hash join when building
 left side
 > produces wrong result
 > SPARK-34719 fail if the view query has duplicated column
 names
 > SPARK-34794 Nested higher-order functions broken in DSL
 > SPARK-34829 transform_values return identical values when
 it's used
 > with udf that returns reference type
 > SPARK-34833 Apply right-padding correctly for correlated
 subqueries
 > SPARK-35381 Fix lambda variable name issues in nested
 DataFrame
 > functions in R APIs
 > SPARK-35382 Fix lambda variable name issues in nested
 DataFrame
 > functions in Python APIs
 >
 > # Notable K8s patches since K8s GA
 > SPARK-34674 Close SparkContext after the Main method has finished
 > SPARK-34948 Add ownerReference to executor configmap to fix leakages
 > SPARK-34820 add apt-update before gnupg install
 > SPARK-34361 In case of downscaling avoid killing of executors already
 > known by the scheduler backend in the pod allocator
 >
 > Bests,
 > Dongjoon.





 --
 Sent from:
 http://apache-spark-developers-list.1001551.n3.nabble.com/


 -
 To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



>>>
>>> --
>>> ---
>>> Takeshi Yamamuro
>>>
>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, 

Re: Apache Spark 3.1.2 Release?

2021-05-18 Thread Wenchen Fan
+1, thanks!

On Tue, May 18, 2021 at 1:37 PM Xiao Li  wrote:

> +1 Thanks, Dongjoon!
>
> Xiao
>
>
>
> On Mon, May 17, 2021 at 8:45 PM Kent Yao  wrote:
>
>> +1. thanks Dongjoon
>>
>> *Kent Yao *
>> @ Data Science Center, Hangzhou Research Institute, NetEase Corp.
>> *a spark enthusiast*
>> *kyuubi is a
>> unified multi-tenant JDBC interface for large-scale data processing and
>> analytics, built on top of Apache Spark.*
>> *spark-authorizer A Spark
>> SQL extension which provides SQL Standard Authorization for Apache
>> Spark.*
>> *spark-postgres A library
>> for reading data from and transferring data to Postgres / Greenplum with
>> Spark SQL and DataFrames, 10~100x faster.*
>> *itatchi A library that
>> brings useful functions from various modern database management systems to
>> Apache Spark.*
>>
>>
>>
>> On 05/18/2021 10:57,John Zhuge 
>> wrote:
>>
>> +1, thanks Dongjoon!
>>
>> On Mon, May 17, 2021 at 7:50 PM Yuming Wang  wrote:
>>
>>> +1.
>>>
>>> On Tue, May 18, 2021 at 9:06 AM Hyukjin Kwon 
>>> wrote:
>>>
 +1 thanks for driving me

 On Tue, 18 May 2021, 09:33 Holden Karau,  wrote:

> +1 and thanks for volunteering to be the RM :)
>
> On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro <
> linguin@gmail.com> wrote:
>
>> Thank you, Dongjoon~ sgtm, too.
>>
>> On Tue, May 18, 2021 at 7:34 AM Cheng Su 
>> wrote:
>>
>>> +1 for a new release, thanks Dongjoon!
>>>
>>> Cheng Su
>>>
>>> On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"  wrote:
>>>
>>> +1 sounds good. Thanks Dongjoon for volunteering on this!
>>>
>>>
>>> Liang-Chi
>>>
>>>
>>> Dongjoon Hyun-2 wrote
>>> > Hi, All.
>>> >
>>> > Since Apache Spark 3.1.1 tag creation (Feb 21),
>>> > new 172 patches including 9 correctness patches and 4 K8s
>>> patches arrived
>>> > at branch-3.1.
>>> >
>>> > Shall we make a new release, Apache Spark 3.1.2, as the second
>>> release at
>>> > 3.1 line?
>>> > I'd like to volunteer for the release manager for Apache Spark
>>> 3.1.2.
>>> > I'm thinking about starting the first RC next week.
>>> >
>>> > $ git log --oneline v3.1.1..HEAD | wc -l
>>> >  172
>>> >
>>> > # Known correctness issues
>>> > SPARK-34534 New protocol FetchShuffleBlocks in
>>> OneForOneBlockFetcher
>>> > lead to data loss or correctness
>>> > SPARK-34545 PySpark Python UDF return inconsistent results
>>> when
>>> > applying 2 UDFs with different return type to 2 columns
>>> together
>>> > SPARK-34681 Full outer shuffled hash join when building
>>> left side
>>> > produces wrong result
>>> > SPARK-34719 fail if the view query has duplicated column
>>> names
>>> > SPARK-34794 Nested higher-order functions broken in DSL
>>> > SPARK-34829 transform_values return identical values when
>>> it's used
>>> > with udf that returns reference type
>>> > SPARK-34833 Apply right-padding correctly for correlated
>>> subqueries
>>> > SPARK-35381 Fix lambda variable name issues in nested
>>> DataFrame
>>> > functions in R APIs
>>> > SPARK-35382 Fix lambda variable name issues in nested
>>> DataFrame
>>> > functions in Python APIs
>>> >
>>> > # Notable K8s patches since K8s GA
>>> > SPARK-34674 Close SparkContext after the Main method has finished
>>> > SPARK-34948 Add ownerReference to executor configmap to fix leakages
>>> > SPARK-34820 add apt-update before gnupg install
>>> > SPARK-34361 In case of downscaling avoid killing of executors already
>>> > known by the scheduler backend in the pod allocator
>>> >
>>> > Bests,
>>> > Dongjoon.
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Sent from:
>>> http://apache-spark-developers-list.1001551.n3.nabble.com/
>>>
>>>
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>
>>>
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>

>>
>> --
>> John Zhuge
>>
>>
>
> --
>
>


Re: Apache Spark 3.1.2 Release?

2021-05-17 Thread Xiao Li
+1 Thanks, Dongjoon!

Xiao



On Mon, May 17, 2021 at 8:45 PM Kent Yao  wrote:

> +1. thanks Dongjoon
>
> *Kent Yao *
> @ Data Science Center, Hangzhou Research Institute, NetEase Corp.
> *a spark enthusiast*
> *kyuubi is a unified multi-tenant JDBC
> interface for large-scale data processing and analytics, built on top
> of Apache Spark.*
> *spark-authorizer A Spark
> SQL extension which provides SQL Standard Authorization for Apache
> Spark.*
> *spark-postgres A library for
> reading data from and transferring data to Postgres / Greenplum with Spark
> SQL and DataFrames, 10~100x faster.*
> *itatchi A library that
> brings useful functions from various modern database management systems to
> Apache Spark.*
>
>
>
> On 05/18/2021 10:57,John Zhuge 
> wrote:
>
> +1, thanks Dongjoon!
>
> On Mon, May 17, 2021 at 7:50 PM Yuming Wang  wrote:
>
>> +1.
>>
>> On Tue, May 18, 2021 at 9:06 AM Hyukjin Kwon  wrote:
>>
>>> +1 thanks for driving me
>>>
>>> On Tue, 18 May 2021, 09:33 Holden Karau,  wrote:
>>>
 +1 and thanks for volunteering to be the RM :)

 On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro 
 wrote:

> Thank you, Dongjoon~ sgtm, too.
>
> On Tue, May 18, 2021 at 7:34 AM Cheng Su 
> wrote:
>
>> +1 for a new release, thanks Dongjoon!
>>
>> Cheng Su
>>
>> On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"  wrote:
>>
>> +1 sounds good. Thanks Dongjoon for volunteering on this!
>>
>>
>> Liang-Chi
>>
>>
>> Dongjoon Hyun-2 wrote
>> > Hi, All.
>> >
>> > Since Apache Spark 3.1.1 tag creation (Feb 21),
>> > new 172 patches including 9 correctness patches and 4 K8s
>> patches arrived
>> > at branch-3.1.
>> >
>> > Shall we make a new release, Apache Spark 3.1.2, as the second
>> release at
>> > 3.1 line?
>> > I'd like to volunteer for the release manager for Apache Spark
>> 3.1.2.
>> > I'm thinking about starting the first RC next week.
>> >
>> > $ git log --oneline v3.1.1..HEAD | wc -l
>> >  172
>> >
>> > # Known correctness issues
>> > SPARK-34534 New protocol FetchShuffleBlocks in
>> OneForOneBlockFetcher
>> > lead to data loss or correctness
>> > SPARK-34545 PySpark Python UDF return inconsistent results
>> when
>> > applying 2 UDFs with different return type to 2 columns together
>> > SPARK-34681 Full outer shuffled hash join when building
>> left side
>> > produces wrong result
>> > SPARK-34719 fail if the view query has duplicated column
>> names
>> > SPARK-34794 Nested higher-order functions broken in DSL
>> > SPARK-34829 transform_values return identical values when
>> it's used
>> > with udf that returns reference type
>> > SPARK-34833 Apply right-padding correctly for correlated
>> subqueries
>> > SPARK-35381 Fix lambda variable name issues in nested
>> DataFrame
>> > functions in R APIs
>> > SPARK-35382 Fix lambda variable name issues in nested
>> DataFrame
>> > functions in Python APIs
>> >
>> > # Notable K8s patches since K8s GA
>> > SPARK-34674 Close SparkContext after the Main method has finished
>> > SPARK-34948 Add ownerReference to executor configmap to fix leakages
>> > SPARK-34820 add apt-update before gnupg install
>> > SPARK-34361 In case of downscaling avoid killing of executors already
>> > known by the scheduler backend in the pod allocator
>> >
>> > Bests,
>> > Dongjoon.
>>
>>
>>
>>
>>
>> --
>> Sent from:
>> http://apache-spark-developers-list.1001551.n3.nabble.com/
>>
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>>
>
> --
> ---
> Takeshi Yamamuro
>
 --
 Twitter: https://twitter.com/holdenkarau
 Books (Learning Spark, High Performance Spark, etc.):
 https://amzn.to/2MaRAG9  
 YouTube Live Streams: https://www.youtube.com/user/holdenkarau

>>>
>
> --
> John Zhuge
>
>

--


Re: Apache Spark 3.1.2 Release?

2021-05-17 Thread Kent Yao
+1. thanks Dongjoon

Kent Yao
@ Data Science Center, Hangzhou Research Institute, NetEase Corp.
a spark enthusiast
kyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark.
spark-authorizer A Spark SQL extension which provides SQL Standard Authorization for Apache Spark.
spark-postgres A library for reading data from and transferring data to Postgres / Greenplum with Spark SQL and DataFrames, 10~100x faster.
itatchi A library that brings useful functions from various modern database management systems to Apache Spark.

On 05/18/2021 10:57, John Zhuge wrote:
+1, thanks Dongjoon!

On Mon, May 17, 2021 at 7:50 PM Yuming Wang wrote:
+1.

On Tue, May 18, 2021 at 9:06 AM Hyukjin Kwon wrote:
+1 thanks for driving me

On Tue, 18 May 2021, 09:33 Holden Karau, wrote:
+1 and thanks for volunteering to be the RM :)

On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro wrote:
Thank you, Dongjoon~ sgtm, too.

On Tue, May 18, 2021 at 7:34 AM Cheng Su wrote:
+1 for a new release, thanks Dongjoon!

Cheng Su

On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"  wrote:

    +1 sounds good. Thanks Dongjoon for volunteering on this!


    Liang-Chi


    Dongjoon Hyun-2 wrote
    > Hi, All.
    > 
    > Since Apache Spark 3.1.1 tag creation (Feb 21),
    > new 172 patches including 9 correctness patches and 4 K8s patches arrived
    > at branch-3.1.
    > 
    > Shall we make a new release, Apache Spark 3.1.2, as the second release at
    > 3.1 line?
    > I'd like to volunteer for the release manager for Apache Spark 3.1.2.
    > I'm thinking about starting the first RC next week.
    > 
    > $ git log --oneline v3.1.1..HEAD | wc -l
    >      172
    > 
    > # Known correctness issues
    > SPARK-34534     New protocol FetchShuffleBlocks in OneForOneBlockFetcher
    > lead to data loss or correctness
    > SPARK-34545     PySpark Python UDF return inconsistent results when
    > applying 2 UDFs with different return type to 2 columns together
    > SPARK-34681     Full outer shuffled hash join when building left side
    > produces wrong result
    > SPARK-34719     fail if the view query has duplicated column names
    > SPARK-34794     Nested higher-order functions broken in DSL
    > SPARK-34829     transform_values return identical values when it's used
    > with udf that returns reference type
    > SPARK-34833     Apply right-padding correctly for correlated subqueries
    > SPARK-35381     Fix lambda variable name issues in nested DataFrame
    > functions in R APIs
    > SPARK-35382     Fix lambda variable name issues in nested DataFrame
    > functions in Python APIs
    > 
    > # Notable K8s patches since K8s GA
    > SPARK-34674    Close SparkContext after the Main method has finished
    > SPARK-34948    Add ownerReference to executor configmap to fix leakages
    > SPARK-34820    add apt-update before gnupg install
    > SPARK-34361    In case of downscaling avoid killing of executors already
    > known by the scheduler backend in the pod allocator
    > 
    > Bests,
    > Dongjoon.





    --
    Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/ 

    -
    To unsubscribe e-mail: dev-unsubscr...@spark.apache.org


-- 
---
Takeshi Yamamuro

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau

-- 
John Zhuge

Re: Apache Spark 3.1.2 Release?

2021-05-17 Thread Chao Sun
+1. Thanks Dongjoon for doing this!

On Mon, May 17, 2021 at 7:58 PM John Zhuge  wrote:

> +1, thanks Dongjoon!
>
> On Mon, May 17, 2021 at 7:50 PM Yuming Wang  wrote:
>
>> +1.
>>
>> On Tue, May 18, 2021 at 9:06 AM Hyukjin Kwon  wrote:
>>
>>> +1 thanks for driving me
>>>
>>> On Tue, 18 May 2021, 09:33 Holden Karau,  wrote:
>>>
 +1 and thanks for volunteering to be the RM :)

 On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro 
 wrote:

> Thank you, Dongjoon~ sgtm, too.
>
> On Tue, May 18, 2021 at 7:34 AM Cheng Su 
> wrote:
>
>> +1 for a new release, thanks Dongjoon!
>>
>> Cheng Su
>>
>> On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"  wrote:
>>
>> +1 sounds good. Thanks Dongjoon for volunteering on this!
>>
>>
>> Liang-Chi
>>
>>
>> Dongjoon Hyun-2 wrote
>> > Hi, All.
>> >
>> > Since Apache Spark 3.1.1 tag creation (Feb 21),
>> > new 172 patches including 9 correctness patches and 4 K8s
>> patches arrived
>> > at branch-3.1.
>> >
>> > Shall we make a new release, Apache Spark 3.1.2, as the second
>> release at
>> > 3.1 line?
>> > I'd like to volunteer for the release manager for Apache Spark
>> 3.1.2.
>> > I'm thinking about starting the first RC next week.
>> >
>> > $ git log --oneline v3.1.1..HEAD | wc -l
>> >  172
>> >
>> > # Known correctness issues
>> > SPARK-34534 New protocol FetchShuffleBlocks in
>> OneForOneBlockFetcher
>> > lead to data loss or correctness
>> > SPARK-34545 PySpark Python UDF return inconsistent results
>> when
>> > applying 2 UDFs with different return type to 2 columns together
>> > SPARK-34681 Full outer shuffled hash join when building
>> left side
>> > produces wrong result
>> > SPARK-34719 fail if the view query has duplicated column
>> names
>> > SPARK-34794 Nested higher-order functions broken in DSL
>> > SPARK-34829 transform_values return identical values when
>> it's used
>> > with udf that returns reference type
>> > SPARK-34833 Apply right-padding correctly for correlated
>> subqueries
>> > SPARK-35381 Fix lambda variable name issues in nested
>> DataFrame
>> > functions in R APIs
>> > SPARK-35382 Fix lambda variable name issues in nested
>> DataFrame
>> > functions in Python APIs
>> >
>> > # Notable K8s patches since K8s GA
>> > SPARK-34674 Close SparkContext after the Main method has finished
>> > SPARK-34948 Add ownerReference to executor configmap to fix leakages
>> > SPARK-34820 add apt-update before gnupg install
>> > SPARK-34361 In case of downscaling avoid killing of executors already
>> > known by the scheduler backend in the pod allocator
>> >
>> > Bests,
>> > Dongjoon.
>>
>>
>>
>>
>>
>> --
>> Sent from:
>> http://apache-spark-developers-list.1001551.n3.nabble.com/
>>
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>>
>
> --
> ---
> Takeshi Yamamuro
>
 --
 Twitter: https://twitter.com/holdenkarau
 Books (Learning Spark, High Performance Spark, etc.):
 https://amzn.to/2MaRAG9  
 YouTube Live Streams: https://www.youtube.com/user/holdenkarau

>>>
>
> --
> John Zhuge
>


Re: Apache Spark 3.1.2 Release?

2021-05-17 Thread John Zhuge
+1, thanks Dongjoon!

On Mon, May 17, 2021 at 7:50 PM Yuming Wang  wrote:

> +1.
>
> On Tue, May 18, 2021 at 9:06 AM Hyukjin Kwon  wrote:
>
>> +1 thanks for driving me
>>
>> On Tue, 18 May 2021, 09:33 Holden Karau,  wrote:
>>
>>> +1 and thanks for volunteering to be the RM :)
>>>
>>> On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro 
>>> wrote:
>>>
 Thank you, Dongjoon~ sgtm, too.

 On Tue, May 18, 2021 at 7:34 AM Cheng Su 
 wrote:

> +1 for a new release, thanks Dongjoon!
>
> Cheng Su
>
> On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"  wrote:
>
> +1 sounds good. Thanks Dongjoon for volunteering on this!
>
>
> Liang-Chi
>
>
> Dongjoon Hyun-2 wrote
> > Hi, All.
> >
> > Since Apache Spark 3.1.1 tag creation (Feb 21),
> > new 172 patches including 9 correctness patches and 4 K8s
> patches arrived
> > at branch-3.1.
> >
> > Shall we make a new release, Apache Spark 3.1.2, as the second
> release at
> > 3.1 line?
> > I'd like to volunteer for the release manager for Apache Spark
> 3.1.2.
> > I'm thinking about starting the first RC next week.
> >
> > $ git log --oneline v3.1.1..HEAD | wc -l
> >  172
> >
> > # Known correctness issues
> > SPARK-34534 New protocol FetchShuffleBlocks in
> OneForOneBlockFetcher
> > lead to data loss or correctness
> > SPARK-34545 PySpark Python UDF return inconsistent results
> when
> > applying 2 UDFs with different return type to 2 columns together
> > SPARK-34681 Full outer shuffled hash join when building left
> side
> > produces wrong result
> > SPARK-34719 fail if the view query has duplicated column
> names
> > SPARK-34794 Nested higher-order functions broken in DSL
> > SPARK-34829 transform_values return identical values when
> it's used
> > with udf that returns reference type
> > SPARK-34833 Apply right-padding correctly for correlated
> subqueries
> > SPARK-35381 Fix lambda variable name issues in nested
> DataFrame
> > functions in R APIs
> > SPARK-35382 Fix lambda variable name issues in nested
> DataFrame
> > functions in Python APIs
> >
> > # Notable K8s patches since K8s GA
> > SPARK-34674 Close SparkContext after the Main method has finished
> > SPARK-34948 Add ownerReference to executor configmap to fix leakages
> > SPARK-34820 add apt-update before gnupg install
> > SPARK-34361 In case of downscaling avoid killing of executors already
> > known by the scheduler backend in the pod allocator
> >
> > Bests,
> > Dongjoon.
>
>
>
>
>
> --
> Sent from:
> http://apache-spark-developers-list.1001551.n3.nabble.com/
>
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>
>

 --
 ---
 Takeshi Yamamuro

>>> --
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9  
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>

-- 
John Zhuge


Re: Apache Spark 3.1.2 Release?

2021-05-17 Thread Yuming Wang
+1.

On Tue, May 18, 2021 at 9:06 AM Hyukjin Kwon  wrote:

> +1 thanks for driving me
>
> On Tue, 18 May 2021, 09:33 Holden Karau,  wrote:
>
>> +1 and thanks for volunteering to be the RM :)
>>
>> On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro 
>> wrote:
>>
>>> Thank you, Dongjoon~ sgtm, too.
>>>
>>> On Tue, May 18, 2021 at 7:34 AM Cheng Su  wrote:
>>>
 +1 for a new release, thanks Dongjoon!

 Cheng Su

 On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"  wrote:

 +1 sounds good. Thanks Dongjoon for volunteering on this!


 Liang-Chi


 Dongjoon Hyun-2 wrote
 > Hi, All.
 >
 > Since Apache Spark 3.1.1 tag creation (Feb 21),
 > new 172 patches including 9 correctness patches and 4 K8s patches
 arrived
 > at branch-3.1.
 >
 > Shall we make a new release, Apache Spark 3.1.2, as the second
 release at
 > 3.1 line?
 > I'd like to volunteer for the release manager for Apache Spark
 3.1.2.
 > I'm thinking about starting the first RC next week.
 >
 > $ git log --oneline v3.1.1..HEAD | wc -l
 >  172
 >
 > # Known correctness issues
 > SPARK-34534 New protocol FetchShuffleBlocks in
 OneForOneBlockFetcher
 > lead to data loss or correctness
 > SPARK-34545 PySpark Python UDF return inconsistent results
 when
 > applying 2 UDFs with different return type to 2 columns together
 > SPARK-34681 Full outer shuffled hash join when building left
 side
 > produces wrong result
 > SPARK-34719 fail if the view query has duplicated column names
 > SPARK-34794 Nested higher-order functions broken in DSL
 > SPARK-34829 transform_values return identical values when
 it's used
 > with udf that returns reference type
 > SPARK-34833 Apply right-padding correctly for correlated
 subqueries
 > SPARK-35381 Fix lambda variable name issues in nested
 DataFrame
 > functions in R APIs
 > SPARK-35382 Fix lambda variable name issues in nested
 DataFrame
 > functions in Python APIs
 >
 > # Notable K8s patches since K8s GA
 > SPARK-34674 Close SparkContext after the Main method has finished
 > SPARK-34948 Add ownerReference to executor configmap to fix leakages
 > SPARK-34820 add apt-update before gnupg install
 > SPARK-34361 In case of downscaling avoid killing of executors already
 > known by the scheduler backend in the pod allocator
 >
 > Bests,
 > Dongjoon.





 --
 Sent from:
 http://apache-spark-developers-list.1001551.n3.nabble.com/


 -
 To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



>>>
>>> --
>>> ---
>>> Takeshi Yamamuro
>>>
>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>


Re: Apache Spark 3.1.2 Release?

2021-05-17 Thread Hyukjin Kwon
+1 thanks for driving me

On Tue, 18 May 2021, 09:33 Holden Karau,  wrote:

> +1 and thanks for volunteering to be the RM :)
>
> On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro 
> wrote:
>
>> Thank you, Dongjoon~ sgtm, too.
>>
>> On Tue, May 18, 2021 at 7:34 AM Cheng Su  wrote:
>>
>>> +1 for a new release, thanks Dongjoon!
>>>
>>> Cheng Su
>>>
>>> On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"  wrote:
>>>
>>> +1 sounds good. Thanks Dongjoon for volunteering on this!
>>>
>>>
>>> Liang-Chi
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>


Re: Apache Spark 3.1.2 Release?

2021-05-17 Thread Holden Karau
+1 and thanks for volunteering to be the RM :)

On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro 
wrote:

> Thank you, Dongjoon~ sgtm, too.
>
> On Tue, May 18, 2021 at 7:34 AM Cheng Su  wrote:
>
>> +1 for a new release, thanks Dongjoon!
>>
>> Cheng Su
>>
>> On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"  wrote:
>>
>> +1 sounds good. Thanks Dongjoon for volunteering on this!
>>
>>
>> Liang-Chi
>>
>>
>>
>>
>>
>
> --
> ---
> Takeshi Yamamuro
>
-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: Apache Spark 3.1.2 Release?

2021-05-17 Thread Takeshi Yamamuro
Thank you, Dongjoon~ sgtm, too.

On Tue, May 18, 2021 at 7:34 AM Cheng Su  wrote:

> +1 for a new release, thanks Dongjoon!
>
> Cheng Su
>
> On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"  wrote:
>
> +1 sounds good. Thanks Dongjoon for volunteering on this!
>
>
> Liang-Chi
>
>
>
>
>

-- 
---
Takeshi Yamamuro


Re: Apache Spark 3.1.2 Release?

2021-05-17 Thread Cheng Su
+1 for a new release, thanks Dongjoon!

Cheng Su

On 5/17/21, 2:44 PM, "Liang-Chi Hsieh"  wrote:

+1 sounds good. Thanks Dongjoon for volunteering on this!


Liang-Chi






Re: Apache Spark 3.1.2 Release?

2021-05-17 Thread Liang-Chi Hsieh
+1 sounds good. Thanks Dongjoon for volunteering on this!


Liang-Chi







--
Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Apache Spark 3.1.2 Release?

2021-05-17 Thread Dongjoon Hyun
Hi, All.

Since the Apache Spark 3.1.1 tag was created (Feb 21),
172 new patches, including 9 correctness patches and 4 K8s patches, have
arrived at branch-3.1.

Shall we make a new release, Apache Spark 3.1.2, as the second release in
the 3.1 line?
I'd like to volunteer as the release manager for Apache Spark 3.1.2.
I'm thinking about starting the first RC next week.

$ git log --oneline v3.1.1..HEAD | wc -l
 172

# Known correctness issues
SPARK-34534 New protocol FetchShuffleBlocks in OneForOneBlockFetcher
lead to data loss or correctness
SPARK-34545 PySpark Python UDF return inconsistent results when
applying 2 UDFs with different return type to 2 columns together
SPARK-34681 Full outer shuffled hash join when building left side
produces wrong result
SPARK-34719 fail if the view query has duplicated column names
SPARK-34794 Nested higher-order functions broken in DSL
SPARK-34829 transform_values return identical values when it's used
with udf that returns reference type
SPARK-34833 Apply right-padding correctly for correlated subqueries
SPARK-35381 Fix lambda variable name issues in nested DataFrame
functions in R APIs
SPARK-35382 Fix lambda variable name issues in nested DataFrame
functions in Python APIs

# Notable K8s patches since K8s GA
SPARK-34674 Close SparkContext after the Main method has finished
SPARK-34948 Add ownerReference to executor configmap to fix leakages
SPARK-34820 add apt-update before gnupg install
SPARK-34361 In case of downscaling avoid killing of executors already
known by the scheduler backend in the pod allocator

Bests,
Dongjoon.
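
As a quick cross-check of the counts quoted in the message above (9 correctness patches and 4 K8s patches), the issue list can be tallied per section. This is only a minimal sketch: the names `ANNOUNCEMENT`, `ISSUE_ID`, and `issues_by_section` are hypothetical helpers introduced here, not part of Spark or its release tooling, and the text is simply copied from the message.

```python
import re

# Issue list copied from the announcement above.
ANNOUNCEMENT = """\
# Known correctness issues
SPARK-34534 New protocol FetchShuffleBlocks in OneForOneBlockFetcher
lead to data loss or correctness
SPARK-34545 PySpark Python UDF return inconsistent results when
applying 2 UDFs with different return type to 2 columns together
SPARK-34681 Full outer shuffled hash join when building left side
produces wrong result
SPARK-34719 fail if the view query has duplicated column names
SPARK-34794 Nested higher-order functions broken in DSL
SPARK-34829 transform_values return identical values when it's used
with udf that returns reference type
SPARK-34833 Apply right-padding correctly for correlated subqueries
SPARK-35381 Fix lambda variable name issues in nested DataFrame
functions in R APIs
SPARK-35382 Fix lambda variable name issues in nested DataFrame
functions in Python APIs

# Notable K8s patches since K8s GA
SPARK-34674 Close SparkContext after the Main method has finished
SPARK-34948 Add ownerReference to executor configmap to fix leakages
SPARK-34820 add apt-update before gnupg install
SPARK-34361 In case of downscaling avoid killing of executors already
known by the scheduler backend in the pod allocator
"""

# An issue line starts with a JIRA key; the pattern also tolerates a key
# that runs directly into the summary text (e.g. "SPARK-34674Close ...").
ISSUE_ID = re.compile(r"^(SPARK-\d+)")


def issues_by_section(text):
    """Group JIRA keys under the most recent '# ' heading line."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("# "):
            current = line[2:].strip()
            sections[current] = []
            continue
        match = ISSUE_ID.match(line)
        if match and current is not None:
            sections[current].append(match.group(1))
    return sections


sections = issues_by_section(ANNOUNCEMENT)
for name, ids in sections.items():
    print(f"{name}: {len(ids)}")
```

Wrapped continuation lines are skipped because they do not begin with a JIRA key, so each issue is counted once.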