Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-25 Thread John Zhuge
+1 (non-binding)
Validated checksum and signature; ran RAT checks; tried spark-3.1.2-bin-hadoop2.7 with HMS 1.2.
On Tue, May 25, 2021 at 7:24 PM Liang-Chi Hsieh wrote:
> +1 (non-binding)
>
> Binary and docs look good. JIRA tickets look good. Ran simple tasks.
>
> Thank you, Dongjoon!

Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-25 Thread Liang-Chi Hsieh
+1 (non-binding)
Binary and docs look good. JIRA tickets look good. Ran simple tasks.
Thank you, Dongjoon!
Hyukjin Kwon wrote:
> +1
>
> On Wed, May 26, 2021 at 9:00 AM, Cheng Su wrote:
>> +1 (non-binding)
>>
>> Checked the related commits in commit history manually.

Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-25 Thread Hyukjin Kwon
+1
On Wed, May 26, 2021 at 9:00 AM, Cheng Su wrote:
> +1 (non-binding)
>
> Checked the related commits in commit history manually.
>
> Thanks!
> Cheng Su
>
> From: Takeshi Yamamuro
> Date: Tuesday, May 25, 2021 at 4:47 PM
> To: Dongjoon Hyun, dev
> Subject: Re: [VOTE] Release

Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-25 Thread Cheng Su
+1 (non-binding)
Checked the related commits in commit history manually.
Thanks!
Cheng Su
From: Takeshi Yamamuro
Date: Tuesday, May 25, 2021 at 4:47 PM
To: Dongjoon Hyun, dev
Subject: Re: [VOTE] Release Spark 3.1.2 (RC1)
+1 (non-binding)
I ran the tests, checked the related JIRA tickets,

Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-25 Thread Takeshi Yamamuro
+1 (non-binding)
I ran the tests, checked the related JIRA tickets, and compared TPC-DS performance differences between this v3.1.2 candidate and v3.1.1. Everything looks fine.
Thank you, Dongjoon!
On Wed, May 26, 2021 at 2:32 AM Gengliang Wang wrote:
> SGTM. Thanks for the work!
>
> +1

Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-25 Thread Gengliang Wang
SGTM. Thanks for the work!
+1 (non-binding)
On Wed, May 26, 2021 at 1:28 AM Dongjoon Hyun wrote:
> Thank you, Sean and Gengliang.
>
> To Gengliang, it doesn't look that serious to me because it's a doc-only
> issue that can also be mitigated simply by updating `facetFilters` in the
> HTML files after

Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-25 Thread Dongjoon Hyun
Thank you, Sean and Gengliang.
To Gengliang, it doesn't look that serious to me because it's a doc-only issue that can also be mitigated simply by updating `facetFilters` in the HTML files after release.
Bests,
Dongjoon.
On Tue, May 25, 2021 at 9:45 AM Gengliang Wang wrote:
> Hi Dongjoon,
>
> After

Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-25 Thread Gengliang Wang
Hi Dongjoon,
After Spark 3.1.1, we need an extra step for updating the DocSearch version index in the release process. I didn't expect Spark 3.1.2 to come at this time, so I didn't update the release process until yesterday. I think we should

Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-25 Thread Sean Owen
+1, same result as in previous tests.
On Mon, May 24, 2021 at 1:14 AM Dongjoon Hyun wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 3.1.2.
>
> The vote is open until May 27th 1AM (PST) and passes if a majority +1 PMC
> votes are cast, with a minimum of 3 +1

[SPARK-20384][SQL] Support value class in schema of Dataset (third time's a charm)

2021-05-25 Thread Emil Ejbyfeldt
Hi dev,
I am interested in getting support for value classes in Dataset schemas merged, and I am willing to work on it. There are two previous PRs created for this JIRA (SPARK-20384): first https://github.com/apache/spark/pull/22309 and more recently https://github.com/apache/spark/pull/27153

Should AggregationIterator.initializeBuffer be moved down to SortBasedAggregationIterator?

2021-05-25 Thread Jacek Laskowski
Hi,
I just found out that the only purpose of AggregationIterator.initializeBuffer is to keep SortBasedAggregationIterator happy [1]. Shouldn't it be moved down to SortBasedAggregationIterator to make things clear(er)?
[1] https://github.com/apache/spark/search?q=initializeBuffer
Regards,

Re: [Spark Core]: Adding support for size based partition coalescing

2021-05-25 Thread Wenchen Fan
Without AQE, repartition() simply creates 200 partitions (the default value of spark.sql.shuffle.partitions), AFAIK. AQE helps you coalesce the partitions into a reasonable number, by size. Note that you need to tune spark.sql.shuffle.partitions to make sure it's big enough, as AQE cannot increase
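To illustrate the idea of coalescing by size, here is a minimal sketch in plain Python. It is not Spark's actual implementation: the function name `coalesce_by_size`, the sample sizes, and the target of 64 are all hypothetical, chosen only to show the shape of the technique. It greedily merges adjacent partitions until adding the next one would exceed a target size, so the result can only have the same number of partitions or fewer, never more.

```python
from typing import List


def coalesce_by_size(partition_sizes: List[int], target_size: int) -> List[List[int]]:
    """Greedily group adjacent partition indices so each group stays near target_size.

    Simplified illustration only; a partition larger than target_size is kept
    as its own group because this sketch merges partitions and never splits them.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(partition_sizes):
        # Close the current group if adding this partition would exceed the target.
        if current and current_size + size > target_size:
            groups.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups


# Hypothetical per-partition sizes (arbitrary units) after a shuffle.
sizes = [10, 20, 30, 200, 5, 5, 5]
groups = coalesce_by_size(sizes, target_size=64)
print(groups)  # [[0, 1, 2], [3], [4, 5, 6]]
```

In this sketch the oversized partition (index 3) stays oversized, which mirrors the point above: merging alone can reduce the partition count but cannot raise it, so the initial number of shuffle partitions has to be large enough to begin with.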

Re: Bridging gap between Spark UI and Code

2021-05-25 Thread Wenchen Fan
You can see the SQL plan node name in the DAG visualization. Please refer to https://spark.apache.org/docs/latest/web-ui.html for more details. If anything is still unclear, please let us know and we will keep improving the document.
On Tue, May 25, 2021 at 4:41 AM mhawes wrote:
> @Wenchen