Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-15 Thread Sameer Agarwal
In addition to the issues mentioned above, Wenchen and Xiao have flagged two other regressions (https://issues.apache.org/jira/browse/SPARK-23316 and https://issues.apache.org/jira/browse/SPARK-23388) that were merged after RC3 was cut. Due to these, this vote fails. I'll follow up with an RC4 in

Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-15 Thread mrkm4ntr
I agree that this is not a blocker against RC3. It was not appropriate as a vote for RC3. There is no problem if it is in time for release 2.3.0.

Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-15 Thread Ryan Blue
I agree that SPARK-23413 should be considered a blocker. It isn't unreasonable to run a history server that is used for several versions of Spark. On Thu, Feb 15, 2018 at 7:49 AM, Sean Owen wrote: > SPARK-23381 is probably not a blocker IMHO; it's a nice-to-have to make > some

Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-15 Thread Sean Owen
SPARK-23381 is probably not a blocker IMHO; it's a nice-to-have to make some returned values match an external implementation, for code that hasn't been published yet. However I think it's OK to add to the 2.3.0 release if there's going to be another RC. On Wed, Feb 14, 2018 at 10:49 PM Holden

Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-15 Thread Marcelo Vanzin
Since it seems there are other issues to fix, I raised SPARK-23413 to blocker status to avoid having to change the disk format of history data in a minor release. On Wed, Feb 14, 2018 at 11:06 PM, Nick Pentreath wrote: > -1 for me as we elevated

Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-14 Thread Nick Pentreath
-1 for me as we elevated https://issues.apache.org/jira/browse/SPARK-23377 to a Blocker. It should be fixed before release. On Thu, 15 Feb 2018 at 07:25 Holden Karau wrote: > If this is a blocker in your view then the vote thread is an important > place to mention it.

Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-14 Thread Holden Karau
If this is a blocker in your view then the vote thread is an important place to mention it. I'm not super sure of all the places these methods are used so I'll defer to srowen and folks, but for the ML-related implications in the past we've allowed people to set the hashing function when we've

Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-14 Thread mrkm4ntr
I was advised in the GitHub discussion to post here. I do not know what to do about the discussion being split across two places.

Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-14 Thread Holden Karau
So it's currently tagged as minor and under consideration for 2.4.0. Do you think this priority is incorrect? This doesn't seem like a regression or a correctness issue so normally we wouldn't hold the release. Of course you're free to vote how you choose, just providing some additional context

Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-14 Thread mrkm4ntr
I'm -1 because of this issue. I want to fix the hashing implementation in FeatureHasher before FeatureHasher is released in 2.3.0. https://issues.apache.org/jira/browse/SPARK-23381 https://github.com/apache/spark/pull/20568 I will fix it soon.

Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-13 Thread Sameer Agarwal
The issue with SPARK-23292 is that we currently run the python tests related to pandas and pyarrow with python 3 (which is already installed on all amplab jenkins machines). Since the code path is fully tested, we decided to not mark it as a blocker; I've reworded the title to better indicate

Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-13 Thread Sean Owen
+1 from me. Again, licenses and sigs look fine. I built the source distribution with "-Phive -Phadoop-2.7 -Pyarn -Pkubernetes" and all tests passed. Remaining issues for 2.3.0, none of which are a Blocker: SPARK-22797 Add multiple column support to PySpark Bucketizer SPARK-23083 Adding

Re: [VOTE] Spark 2.3.0 (RC3)

2018-02-12 Thread Sameer Agarwal
I'll start the vote with a +1. As of today, all known release blockers and QA tasks have been resolved, and the jenkins builds are healthy: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/ On 12 February 2018 at 22:30, Sameer Agarwal wrote: >

[VOTE] Spark 2.3.0 (RC3)

2018-02-12 Thread Sameer Agarwal
Now that all known blockers have once again been resolved, please vote on releasing the following candidate as Apache Spark version 2.3.0. The vote is open until Friday February 16, 2018 at 8:00:00 am UTC and passes if a majority of at least 3 PMC +1 votes are cast. [ ] +1 Release this package