Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-02-06 Thread Hyukjin Kwon
Looks like we resolved all standing issues known so far. I will start another RC next Monday PST. On Thu, Feb 4, 2021 at 12:03 AM, Kent Yao wrote: > Sending https://github.com/apache/spark/pull/31460 > > Based on my research so far, when there is an existing > *io.file.buffer.size* in

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-02-03 Thread Kent Yao
Sending https://github.com/apache/spark/pull/31460. Based on my research so far, when there is an existing io.file.buffer.size in hive-site.xml, the hadoopConf finally gets reset by that. In many real-world cases, when interacting with the Hive catalog through Spark SQL,
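As a rough illustration of the override described in this message, the toy model below merges configuration layers in load order, so a value from a later resource replaces an earlier one. It is not Spark's or Hadoop's actual code: the function names, the 65536 starting value, and the sample hive-site.xml content are all assumptions made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical hive-site.xml content, invented for this example.
HIVE_SITE_XML = """
<configuration>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>
</configuration>
"""


def parse_site_xml(text):
    """Return the name/value pairs from a Hadoop-style *-site.xml string."""
    root = ET.fromstring(text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def merge_layers(*layers):
    """Merge dicts in order; a key in a later layer replaces an earlier one."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Assumed value on the Spark side before hive-site.xml is applied.
spark_side = {"io.file.buffer.size": "65536"}
effective = merge_layers(spark_side, parse_site_xml(HIVE_SITE_XML))
print(effective["io.file.buffer.size"])
```

In this model the hive-site.xml value wins simply because it is applied last, which is the kind of reset the message reports.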

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-02-02 Thread Maxim Gekk
Hi All, > Also I am investigating a performance regression in some TPC-DS queries (q88 for instance) that is caused by a recent commit in 3.1 ... I have found that the perf regression is caused by the Hadoop config: io.file.buffer.size = 4096 Before the commit

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-02-02 Thread Hyukjin Kwon
Yeah, agreed. I changed it. Thanks for the heads-up, Tom. On Wed, Feb 3, 2021 at 8:31 AM, Tom Graves wrote: > ok thanks for the update. That is marked as an improvement, if it's a > blocker can we mark it as such and describe why. I searched jiras and > didn't see any critical or blockers open. > > Tom >

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-02-02 Thread Tom Graves
ok thanks for the update. That is marked as an improvement, if it's a blocker can we mark it as such and describe why. I searched jiras and didn't see any critical or blockers open. Tom On Tuesday, February 2, 2021, 05:12:24 PM CST, Hyukjin Kwon wrote: There is one here:

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-02-02 Thread Hyukjin Kwon
There is one here: https://github.com/apache/spark/pull/31440. It looks like several issues are being identified (to confirm that this is an issue in OSS too) and fixed in parallel. There has been a bit of unexpected delay here as several more issues were found. I will try to file and share relevant JIRAs

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-02-02 Thread Tom Graves
Just curious if we have an update on the next RC? Is there a JIRA for the TPC-DS issue? Thanks, Tom On Wednesday, January 27, 2021, 05:46:27 PM CST, Hyukjin Kwon wrote: Just to share the current status, most of the known issues were resolved. Let me know if there are some more. One thing

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-27 Thread Hyukjin Kwon
Just to share the current status, most of the known issues were resolved. Let me know if there are some more. One thing left is a performance regression in TPCDS being investigated. Once this is identified (and fixed if it should be), I will cut another RC right away. I roughly expect to cut

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-26 Thread Terry Kim
Hi, Please check if the following regression should be included: https://github.com/apache/spark/pull/31352 Thanks, Terry On Tue, Jan 26, 2021 at 7:54 AM Holden Karau wrote: > If we're OK waiting for it, I'd like to get > https://github.com/apache/spark/pull/31298 in as well (it's not a >

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-26 Thread Holden Karau
If we're OK waiting for it, I'd like to get https://github.com/apache/spark/pull/31298 in as well (it's not a regression but it is a bug fix). On Tue, Jan 26, 2021 at 6:38 AM Hyukjin Kwon wrote: > It looks like a cool one but it's a pretty big one and affects the plans > considerably ... maybe

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-26 Thread Hyukjin Kwon
It looks like a cool one but it's a pretty big one and affects the plans considerably ... maybe it's best to avoid adding it into 3.1.1, in particular during the RC period, if this isn't a clear regression that affects many users. On Tue, Jan 26, 2021 at 11:23 PM, Peter Toth wrote: > Hey, > > Sorry for

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-26 Thread Peter Toth
Hey, Sorry for chiming in a bit late, but I would like to suggest my PR (https://github.com/apache/spark/pull/28885) for review and inclusion into 3.1.1. Currently, invalid reuse reference nodes appear in many queries, causing performance issues and incorrect explain plans. Now that

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-25 Thread Hyukjin Kwon
Guys, I plan to make an RC as soon as we have no visible issues. I have merged fixes for a few correctness issues. What remains: - https://github.com/apache/spark/pull/31319 waiting for a review (I will review it soon too). - https://github.com/apache/spark/pull/31336 - I know Max's investigating the perf

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-22 Thread Hyukjin Kwon
Sure, thanks guys. I'll start another RC after the fixes. Looks like we're almost there. On Fri, 22 Jan 2021, 17:47 Wenchen Fan, wrote: > BTW, there is a correctness bug being fixed at > https://github.com/apache/spark/pull/30788 . It's not a regression, but > the fix is very simple and it

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-22 Thread Wenchen Fan
BTW, there is a correctness bug being fixed at https://github.com/apache/spark/pull/30788 . It's not a regression, but the fix is very simple and it would be better to start the next RC after merging that fix. On Fri, Jan 22, 2021 at 3:54 PM Maxim Gekk wrote: > Also I am investigating a

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-21 Thread Maxim Gekk
Also I am investigating a performance regression in some TPC-DS queries (q88 for instance) that is caused by a recent commit in 3.1, highly likely in the period from 19th November, 2020 to 18th December, 2020. Maxim Gekk Software Engineer Databricks, Inc. On Fri, Jan 22, 2021 at 10:45 AM

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-21 Thread Wenchen Fan
-1 as I just found a regression in 3.1. A self-join query works well in 3.0 but fails in 3.1. It's being fixed at https://github.com/apache/spark/pull/31287 On Fri, Jan 22, 2021 at 4:34 AM Tom Graves wrote: > +1 > > built from tarball, verified sha and regular CI and tests all pass. > > Tom > >

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-21 Thread Tom Graves
+1 built from tarball, verified sha and regular CI and tests all pass. Tom On Monday, January 18, 2021, 06:06:42 AM CST, Hyukjin Kwon wrote: Please vote on releasing the following candidate as Apache Spark version 3.1.1. The vote is open until January 22nd 4PM PST and passes if a

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-21 Thread Holden Karau
-- Original Message -- >> *From:* "Dongjoon Hyun" >> *Sent:* Wednesday, January 20, 2021, 1:57 PM >> *To:* "Holden Karau" >> *Cc:* "Sean Owen"; "Hyukjin Kwon"; "dev" >> *Subject:* Re: [VOTE] Release Spark 3.1.1 (R

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-20 Thread Mridul Muralidharan
non-binding) > > Thank you, Hyukjin > > Bests, > Ruifeng > > -- Original Message -- > *From:* "Dongjoon Hyun" > *Sent:* Wednesday, January 20, 2021, 1:57 PM > *To:* "Holden Karau" > *Cc:* "Sean Owen"; "Hyukjin Kwon" >

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-20 Thread 郑瑞峰
+1 (non-binding) Thank you, Hyukjin Bests, Ruifeng -- Original Message -- From: "Dongjoon Hyun"

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-20 Thread Terry Kim
+1 (non-binding) (Also ran .NET for Apache Spark E2E tests, which touch many of DataFrame, Function APIs) Thanks, Terry On Wed, Jan 20, 2021 at 6:01 AM Jacek Laskowski wrote: > Hi, > > +1 (non-binding) > > 1. Built locally using AdoptOpenJDK (build 11.0.9+11) with >

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-20 Thread Jacek Laskowski
Hi, +1 (non-binding) 1. Built locally using AdoptOpenJDK (build 11.0.9+11) with -Pyarn,kubernetes,hive-thriftserver,scala-2.12 -DskipTests 2. Ran batch and streaming demos using Spark on Kubernetes (minikube) using spark-shell (client deploy mode) and spark-submit --deploy-mode cluster I

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-19 Thread Dongjoon Hyun
+1 I additionally - Ran JDBC integration test - Ran with AWS EKS 1.16 - Ran unit tests with Python 3.9.1 combination (numpy 1.19.5, pandas 1.2.0, scipy 1.6.0) (PyArrow is not tested because it's not supported in Python 3.9.x. This is documented via SPARK-34162) There is some ongoing work

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-19 Thread Holden Karau
+1, pip installs on Python 3.8 One potential thing we might want to consider if there ends up being another RC is that the error message for installing with Python2 could be clearer. Processing ./pyspark-3.1.1.tar.gz ERROR: Command errored out with exit status 1: command:

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-19 Thread Sean Owen
+1 from me. Same results as in 3.1.0 testing. On Mon, Jan 18, 2021 at 6:06 AM Hyukjin Kwon wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.1.1. > > The vote is open until January 22nd 4PM PST and passes if a majority +1 > PMC votes are cast, with a minimum

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-19 Thread John Zhuge
+1 (non-binding) On Tue, Jan 19, 2021 at 4:08 AM JackyLee wrote: > +1

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-19 Thread JackyLee
+1

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-19 Thread Prashant Sharma
+1 On Tue, Jan 19, 2021 at 4:38 PM Yang,Jie(INF) wrote: > +1 > > > > *From:* Gengliang Wang > *Date:* Tuesday, January 19, 2021, 3:04 PM > *To:* Jungtaek Lim > *Cc:* Yuming Wang, Hyukjin Kwon, > dev > *Subject:* Re: [VOTE] Release Spark 3.1.1 (RC1) > > > > +

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-19 Thread Yang,Jie(INF)
+1 From: Gengliang Wang Date: Tuesday, January 19, 2021, 3:04 PM To: Jungtaek Lim Cc: Yuming Wang, Hyukjin Kwon, dev Subject: Re: [VOTE] Release Spark 3.1.1 (RC1) +1 (non-binding) On Tue, Jan 19, 2021 at 2:05 PM Jungtaek Lim wrote: +1 (non-binding) * ve

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-18 Thread Gengliang Wang
+1 (non-binding) On Tue, Jan 19, 2021 at 2:05 PM Jungtaek Lim wrote: > +1 (non-binding) > > * verified signature and sha for all files (there's a glitch which I'll > describe below) > * built source (DISCLAIMER: didn't run tests) and made custom > distribution, and built a docker image

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-18 Thread Jungtaek Lim
+1 (non-binding) * verified signature and sha for all files (there's a glitch which I'll describe below) * built source (DISCLAIMER: didn't run tests) and made custom distribution, and built a docker image based on the distribution - used profiles: kubernetes, hadoop-3.2, hadoop-cloud * ran
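The checksum half of the verification mentioned in this message could look roughly like the sketch below. It assumes SHA-512 and assumes the published hex digest has already been copied out of whatever checksum file accompanies the artifact; the function names are hypothetical, and signature (GPG) verification is not covered.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare against a published digest, ignoring case and whitespace."""
    normalized = "".join(published_hex.split()).lower()
    return sha512_of(path) == normalized
```

Normalizing whitespace and case is a convenience here, since published digests are sometimes wrapped across lines or written in upper case.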

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-18 Thread Yuming Wang
+1. On Tue, Jan 19, 2021 at 7:54 AM Hyukjin Kwon wrote: > I forgot to say :). I'll start with my +1. > > On Mon, 18 Jan 2021, 21:06 Hyukjin Kwon, wrote: > >> Please vote on releasing the following candidate as Apache Spark version >> 3.1.1. >> >> The vote is open until January 22nd 4PM PST and

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-18 Thread Hyukjin Kwon
I forgot to say :). I'll start with my +1. On Mon, 18 Jan 2021, 21:06 Hyukjin Kwon, wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.1.1. > > The vote is open until January 22nd 4PM PST and passes if a majority +1 > PMC votes are cast, with a minimum of 3 +1

[VOTE] Release Spark 3.1.1 (RC1)

2021-01-18 Thread Hyukjin Kwon
Please vote on releasing the following candidate as Apache Spark version 3.1.1. The vote is open until January 22nd 4PM PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.1.1 [ ] -1 Do not release this package because
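The passing rule stated in this message can be sketched as a small check. This is one reading of the rule, not an official tool: it assumes only PMC (binding) votes are counted, that "majority" means more +1 than -1 votes, and that 0 votes are ignored. The function name and input format are hypothetical.

```python
def vote_passes(binding_votes, minimum_plus_ones=3):
    """binding_votes is an iterable of +1, 0, or -1 cast by PMC members."""
    votes = list(binding_votes)
    plus = sum(1 for v in votes if v == 1)
    minus = sum(1 for v in votes if v == -1)
    # Needs at least the minimum number of +1 votes and more +1 than -1.
    return plus >= minimum_plus_ones and plus > minus


print(vote_passes([1, 1, 1, -1]))
```

Non-binding votes, such as the many "+1 (non-binding)" replies in this thread, would simply be left out of the input under this reading.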