Re: [VOTE] Decommissioning SPIP

2020-07-01 Thread Marcelo Vanzin
; is at https://www.apache.org/foundation/voting.html. > > Please vote before July 6th at noon: > > [ ] +1: Accept the proposal as an official SPIP > [ ] +0 > [ ] -1: I don't think this is a good idea because ... > > I will start the voting off with a +1 f

Re: [VOTE] Apache Spark 3.0.0 RC1

2020-04-10 Thread Marcelo Vanzin
sn't fixed? > == > In order to make timely releases, we will typically not hold the > release unless the bug in question is a regression from the previous > release. That being said, if there is something which is a regression > that has not been correctly targeted please ping me or a committer to > help target the issue. > > > Note: I fully expect this RC to fail. > > > > -- Marcelo Vanzin van...@gmail.com "Life's too short to drink cheap beer"

Re: Keytab, Proxy User & Principal

2020-03-12 Thread Marcelo Vanzin
. But frankly this feels more like something better taken care of in Livy (e.g. by using KRB5CCNAME when running spark-submit). -- Marcelo Vanzin van...@gmail.com "Life's too short to drink cheap beer"

Re: Jenkins looks hosed

2019-12-23 Thread Marcelo Vanzin
3, 2019 at 12:23 PM Shane Knapp wrote: > > > > > > checking it now. > > > > > > On Mon, Dec 23, 2019 at 11:27 AM Marcelo Vanzin > > > wrote: > > > > > > > > Just in the off-chance that someone with admin access to the Jenkins > > >

Jenkins looks hosed

2019-12-23 Thread Marcelo Vanzin
Just in the off-chance that someone with admin access to the Jenkins servers is around this week... they seem to be in a pretty unhappy state, I can't even load the UI. FYI in case you're waiting for your PR tests to finish (or even start running). -- Marcelo ---

Re: Do we need to finally update Guava?

2019-12-16 Thread Marcelo Vanzin
Great that Hadoop has done it (which, btw, probably means that Spark won't work with that version of Hadoop yet), but Hive also depends on Guava, and last time I tried, even Hive 3.x did not work with Guava 27. (Newer Hadoop versions also have a new artifact that shades a lot of dependencies, whic

Re: dev/merge_spark_pr.py broken on python 2

2019-11-08 Thread Marcelo Vanzin
d 'fix' the > author name if that's the case, or just use python 3. > > On Fri, Nov 8, 2019 at 12:20 PM Marcelo Vanzin wrote: > > > > Something related to non-ASCII characters. Worked fine with python 3. > > > > git branch -D PR_TOOL_MERGE_PR_26426_MASTE

Re: dev/merge_spark_pr.py broken on python 2

2019-11-08 Thread Marcelo Vanzin
hange was on Oct 1, and should have actually helped it > still work with Python 2: > https://github.com/apache/spark/commit/2ec3265ae76fc1e136e44c240c476ce572b679df#diff-c321b6c82ebb21d8fd225abea9b7b74c > > Hasn't otherwise changed in a while. What's the error? > > On Fr

dev/merge_spark_pr.py broken on python 2

2019-11-08 Thread Marcelo Vanzin
Hey all, Something broke that script when running with python 2. I know we want to deprecate python 2, but in that case, scripts should at least be changed to use "python3" in the shebang line... -- Marcelo - To unsubscribe e-

Re: [VOTE] Release Apache Spark 2.3.4 (RC1)

2019-08-28 Thread Marcelo Vanzin
+1 On Mon, Aug 26, 2019 at 1:28 PM Kazuaki Ishizaki wrote: > > Please vote on releasing the following candidate as Apache Spark version > 2.3.4. > > The vote is open until August 29th 2PM PST and passes if a majority +1 PMC > votes are cast, with > a minimum of 3 +1 votes. > > [ ] +1 Release th

Re: [VOTE] Release Apache Spark 2.3.4 (RC1)

2019-08-28 Thread Marcelo Vanzin
(Ah, and the 2.4 RC has the same issue.) On Wed, Aug 28, 2019 at 2:23 PM Marcelo Vanzin wrote: > > Just noticed something before I started to run some tests. The output > of "spark-submit --version" is a little weird, in that it's missing > information (see end of e-ma

Re: [VOTE] Release Apache Spark 2.3.4 (RC1)

2019-08-28 Thread Marcelo Vanzin
Just noticed something before I started to run some tests. The output of "spark-submit --version" is a little weird, in that it's missing information (see end of e-mail). Personally I don't think a lot of that output is super useful (like "Compiled by" or the repo URL), but the branch and revision

Re: [VOTE] Release Apache Spark 2.4.4 (RC3)

2019-08-28 Thread Marcelo Vanzin
+1 On Tue, Aug 27, 2019 at 4:06 PM Dongjoon Hyun wrote: > > Please vote on releasing the following candidate as Apache Spark version > 2.4.4. > > The vote is open until August 30th 5PM PST and passes if a majority +1 PMC > votes are cast, with a minimum of 3 +1 votes. > > [ ] +1 Release this pa

Re: Spark 2.4.0 tests fail with hadoop-3.1 profile: NoClassDefFoundError org.apache.hadoop.hive.conf.HiveConf

2019-04-05 Thread Marcelo Vanzin
endencies in the classpath, is that correct? > > On Fri, Apr 5, 2019 at 10:57 AM Marcelo Vanzin wrote: >> >> The hadoop-3 profile doesn't really work yet, not even on master. >> That's being worked on still. >> >> On Fri, Apr 5, 2019 at 10:53 AM akirill

Re: Spark 2.4.0 tests fail with hadoop-3.1 profile: NoClassDefFoundError org.apache.hadoop.hive.conf.HiveConf

2019-04-05 Thread Marcelo Vanzin
The hadoop-3 profile doesn't really work yet, not even on master. That's being worked on still. On Fri, Apr 5, 2019 at 10:53 AM akirillov wrote: > > Hi there! I'm trying to run Spark unit tests with the following profiles: > > And 'core' module fails with the following test failing with > NoClass

Re: [VOTE] Release Apache Spark 2.4.1 (RC9)

2019-03-28 Thread Marcelo Vanzin
(Anybody knows what's the deal with all the .invalid e-mail addresses?) Anyway. ASF has voting rules, and some things like releases follow specific rules: https://www.apache.org/foundation/voting.html#ReleaseVotes So, for releases, ultimately, the only votes that "count" towards the final tally a

Re: [discuss] 2.4.1-rcX release, k8s client PRs, build system infrastructure update

2019-03-14 Thread Marcelo Vanzin
upgrade >> to take more than 15-20 mins, following which i will re-enable builds. >> >> On Wed, Mar 13, 2019 at 12:17 PM shane knapp wrote: >>> >>> ok awesome. let's shoot for 3pm PST. >>> >>> On Wed, Mar 13, 2019 at 11:59 AM Marcelo

Re: [discuss] 2.4.1-rcX release, k8s client PRs, build system infrastructure update

2019-03-14 Thread Marcelo Vanzin
r the 2.4.1 PR to launch the k8s integration tests. > >>>>> > >>>>> On Wed, Mar 13, 2019 at 2:55 PM shane knapp wrote: > >>>>>> > >>>>>> okie dokie! the time approacheth! > >>>>>> > >>

Re: Request to disable a bot account, 'Thincrs' in JIRA of Apache Spark

2019-03-13 Thread Marcelo Vanzin
Go for it. I would do it now, instead of waiting, since there's been enough time for them to take action. On Wed, Mar 13, 2019 at 4:32 PM Hyukjin Kwon wrote: > > Looks this bot keeps working. I am going to open a INFRA JIRA to block this > bot in few days. > Please let me know if you guys have a

Re: [discuss] 2.4.1-rcX release, k8s client PRs, build system infrastructure update

2019-03-13 Thread Marcelo Vanzin
Sounds good. On Wed, Mar 13, 2019 at 12:17 PM shane knapp wrote: > > ok awesome. let's shoot for 3pm PST. > > On Wed, Mar 13, 2019 at 11:59 AM Marcelo Vanzin wrote: >> >> On Wed, Mar 13, 2019 at 11:53 AM shane knapp wrote: >> > On Wed, Mar 13, 2019 a

Re: [discuss] 2.4.1-rcX release, k8s client PRs, build system infrastructure update

2019-03-13 Thread Marcelo Vanzin
On Wed, Mar 13, 2019 at 11:53 AM shane knapp wrote: > On Wed, Mar 13, 2019 at 11:49 AM Marcelo Vanzin wrote: >> >> Do the upgraded minikube/k8s versions break the current master client >> version too? >> > yes. Ah, so that part kinda sucks. Let's do this

Re: [discuss] 2.4.1-rcX release, k8s client PRs, build system infrastructure update

2019-03-13 Thread Marcelo Vanzin
Do the upgraded minikube/k8s versions break the current master client version too? I'm not super concerned about 2.4 integration tests being broken for a little bit. It's very uncommon for new PRs to be open against branch-2.4 that would affect k8s. But I really don't want master to break. So if

Re: [VOTE] Release Apache Spark 2.4.1 (RC6)

2019-03-08 Thread Marcelo Vanzin
ing this issue. > > Should we create a new rc7? > > DB Tsai | Siri Open Source Technologies [not a contribution] |  Apple, > Inc > > > On Mar 8, 2019, at 10:54 AM, Marcelo Vanzin > > wrote: > > > > I personally find it a little weird to not have the

Re: [VOTE] Release Apache Spark 2.4.1 (RC6)

2019-03-08 Thread Marcelo Vanzin
I personally find it a little weird to not have the commit in branch-2.4. Not that this would happen, but if the v2.4.1-rc6 tag is overwritten (e.g. accidentally) then you lose the reference to that commit, and then the exact commit from which the rc was generated is lost. On Fri, Mar 8, 2019 at

Re: [VOTE] Release Apache Spark 2.4.1 (RC2)

2019-02-20 Thread Marcelo Vanzin
Just wanted to point out that https://issues.apache.org/jira/browse/SPARK-26859 is not in this RC, and is marked as a correctness bug. (The fix is in the 2.4 branch, just not in rc2.) On Wed, Feb 20, 2019 at 12:07 PM DB Tsai wrote: > > Please vote on releasing the following candidate as Apache Sp

Re: merge script stopped working; Python 2/3 input() issue?

2019-02-15 Thread Marcelo Vanzin
BTW the main script has this that the website script does not: if sys.version < '3': input = raw_input # noqa On Fri, Feb 15, 2019 at 3:55 PM Sean Owen wrote: > > I'm seriously confused on this one. The spark-website merge script > just stopped working for me. It fails on the call to input()
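A minimal sketch of the kind of shim quoted above, for illustration only. It is not the contents of either merge script, and the names `read_line` and `is_python2` are hypothetical. Instead of comparing `sys.version` as a string, it probes for `raw_input` directly and checks the version tuple.

```python
import sys

# Python 2: input() evaluates what the user types, raw_input() returns the text.
# Python 3: raw_input() is gone and input() returns the text.
try:
    read_line = raw_input  # noqa: F821 (only defined on Python 2)
except NameError:
    read_line = input


def is_python2(version_info=sys.version_info):
    """Return True when the given version tuple belongs to Python 2."""
    return version_info[0] < 3
```

A script that calls `read_line("Continue? (y/n): ")` everywhere would behave the same under both interpreters, which appears to be what the one-line alias in the main script achieves.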

Re: merge script stopped working; Python 2/3 input() issue?

2019-02-15 Thread Marcelo Vanzin
You're talking about the spark-website script, right? The main repo's script has been working for me, the website one is broken. I think it was caused by this dude changing raw_input to input recently: commit 8b6e7dceaf5d73de3f92907ceeab8925a2586685 Author: Sean Owen Date: Sat Jan 19 19:02:30

Re: building docker images for GPU

2019-02-12 Thread Marcelo Vanzin
I think I remember someone mentioning a thread about this on the PR discussion, and digging a bit I found this: http://apache-spark-developers-list.1001551.n3.nabble.com/Toward-an-quot-API-quot-for-spark-images-used-by-the-Kubernetes-back-end-td23622.html It started a discussion but I haven't real

Re: [VOTE] Release Apache Spark 2.3.3 (RC2)

2019-02-11 Thread Marcelo Vanzin
+1. Ran our regression tests for YARN and Hive, all look good. On Tue, Feb 5, 2019 at 5:07 PM Takeshi Yamamuro wrote: > > Please vote on releasing the following candidate as Apache Spark version > 2.3.3. > > The vote is open until February 8 6:00PM (PST) and passes if a majority +1 > PMC votes

Re: [VOTE] Release Apache Spark 2.3.3 (RC2)

2019-02-08 Thread Marcelo Vanzin
Hi Takeshi, Since we only really have one +1 binding vote, do you want to extend this vote a bit? I've been stuck on a few things but plan to test this (setting things up now), but it probably won't happen before the deadline. On Tue, Feb 5, 2019 at 5:07 PM Takeshi Yamamuro wrote: > > Please vo

Re: [VOTE] Release Apache Spark 2.3.3 (RC1)

2019-01-23 Thread Marcelo Vanzin
-1 too. I just upgraded https://issues.apache.org/jira/browse/SPARK-26682 to blocker. It's a small fix and we should make it in 2.3.3. On Thu, Jan 17, 2019 at 6:49 PM Takeshi Yamamuro wrote: > > Please vote on releasing the following candidate as Apache Spark version > 2.3.3. > > The vote is op

Re: [DISCUSS] Upgrade built-in Hive to 2.3.4

2019-01-15 Thread Marcelo Vanzin
+1 to that. HIVE-16391 by itself means we're giving up things like Hadoop 3, and we're also putting the burden on the Hive folks to fix a problem that we created. The current PR is basically a Spark-side fix for that bug. It does mean also upgrading Hive (which gives us Hadoop 3, yay!), but I thin

Re: [DISCUSS] Upgrade built-in Hive to 2.3.4

2019-01-15 Thread Marcelo Vanzin
The metastore interactions in Spark are currently based on APIs that are in the Hive exec jar; so that makes it not possible to have Spark work with Hadoop 3 until the exec jar is upgraded. It could be possible to re-implement those interactions based solely on the metastore client Hive publishes;

Re: Spark History UI + Keycloak Integration

2019-01-04 Thread Marcelo Vanzin
On Fri, Jan 4, 2019 at 3:25 AM G, Ajay (Nokia - IN/Bangalore) wrote: ... > Added session handler for all context - > contextHandler.setSessionHandler(new SessionHandler()) ... > Keycloak authentication seems to work, Is this the right approach ? If it is > fine I can submit a PR. I don't reme

Re: Apache Spark git repo moved to gitbox.apache.org

2018-12-10 Thread Marcelo Vanzin
Hmm, it also seems that github comments are being sync'ed to jira. That's gonna get old very quickly, we should probably ask infra to disable that (if we can't do it ourselves). On Mon, Dec 10, 2018 at 9:13 AM Sean Owen wrote: > > Update for committers: now that my user ID is synced, I can > succe

Re: Make Scala 2.12 as default Scala version in Spark 3.0

2018-11-16 Thread Marcelo Vanzin
Now that the switch to 2.12 by default has been made, it might be good to have a serious discussion about dropping 2.11 altogether. Many of the main arguments have already been talked about. But I don't remember anyone mentioning how easy it would be to break the 2.11 build now. For example, the f

Re: which classes/methods are considered as private in Spark?

2018-11-13 Thread Marcelo Vanzin
On Tue, Nov 13, 2018 at 6:26 PM Wenchen Fan wrote: > Recently I updated the MiMa exclusion rules, and found MiMa tracks some > private classes/methods unexpectedly. Could you clarify what you mean here? Mima has some known limitations such as not handling "private[blah]" very well (because that

Re: [ANNOUNCE] Announcing Apache Spark 2.4.0

2018-11-08 Thread Marcelo Vanzin
+user@ >> -- Forwarded message - >> From: Wenchen Fan >> Date: Thu, Nov 8, 2018 at 10:55 PM >> Subject: [ANNOUNCE] Announcing Apache Spark 2.4.0 >> To: Spark dev list >> >> >> Hi all, >> >> Apache Spark 2.4.0 is the fifth release in the 2.x line. This release adds >> Barrier Exe

Re: Test and support only LTS JDK release?

2018-11-06 Thread Marcelo Vanzin
https://www.oracle.com/technetwork/java/javase/eol-135779.html On Tue, Nov 6, 2018 at 2:56 PM Felix Cheung wrote: > > Is there a list of LTS release that I can reference? > > > > From: Ryan Blue > Sent: Tuesday, November 6, 2018 1:28 PM > To: sn...@snazy.de > Cc:

Re: Test and support only LTS JDK release?

2018-11-06 Thread Marcelo Vanzin
+1, that's always been my view. Although, to be fair, and as Sean mentioned, the jump from jdk8 is probably the harder part. After that it's less likely (hopefully?) that we'll run into issues in non-LTS releases. And even if we don't officially support them, trying to keep up with breaking change

Re: [VOTE] SPARK 2.4.0 (RC5)

2018-10-31 Thread Marcelo Vanzin
+1 On Mon, Oct 29, 2018 at 3:22 AM Wenchen Fan wrote: > > Please vote on releasing the following candidate as Apache Spark version > 2.4.0. > > The vote is open until November 1 PST and passes if a majority +1 PMC votes > are cast, with > a minimum of 3 +1 votes. > > [ ] +1 Release this package

Re: Starting to make changes for Spark 3 -- what can we delete?

2018-10-16 Thread Marcelo Vanzin
Might be good to take a look at things marked "@DeveloperApi" and whether they should stay that way. e.g. I was looking at SparkHadoopUtil and I've always wanted to just make it private to Spark. I don't see why apps would need any of those methods. On Tue, Oct 16, 2018 at 10:18 AM Sean Owen wrot

Re: Remove Flume support in 3.0.0?

2018-10-10 Thread Marcelo Vanzin
BTW, although I did not file a bug for that, I think we should also consider getting rid of the kafka-0.8 connector. That would leave only kafka-0.10 as the single remaining dstream connector in Spark, though. (If you ignore kinesis which we can't ship in binary form or something like that?) On We

Re: moving the spark jenkins job builder repo from dbricks --> spark

2018-10-10 Thread Marcelo Vanzin
Thanks for doing this. The more things we have accessible to the project members in general the better! (Now there's that hive fork repo somewhere, but let's not talk about that.) On Wed, Oct 10, 2018 at 9:30 AM shane knapp wrote: >> > * the JJB templates are able to be run by anyone w/jenkins l

Re: [DISCUSS][K8S] Local dependencies with Kubernetes

2018-10-08 Thread Marcelo Vanzin
On Mon, Oct 8, 2018 at 6:36 AM Rob Vesse wrote: > Since connectivity back to the client is a potential stumbling block for > cluster mode I wonder if it would be better to think in reverse i.e. rather > than having the driver pull from the client have the client push to the > driver pod? > > Yo

Re: [DISCUSS][K8S] Local dependencies with Kubernetes

2018-10-05 Thread Marcelo Vanzin
On Fri, Oct 5, 2018 at 7:54 AM Rob Vesse wrote: > Ideally this would all just be handled automatically for users in the way > that all other resource managers do I think you're giving other resource managers too much credit. In cluster mode, only YARN really distributes local dependencies, becau

Re: [VOTE] SPARK 2.4.0 (RC1)

2018-09-17 Thread Marcelo Vanzin
You can log in to https://repository.apache.org and see what's wrong. Just find that staging repo and look at the messages. In your case it seems related to your signature. failureMessage: No public key: Key with id: () was not able to be located on http://gpg-keyserver.de/. Upload your public k

Re: data source api v2 refactoring

2018-09-04 Thread Marcelo Vanzin
Same here, I don't see anything from Wenchen... just replies to him. On Sat, Sep 1, 2018 at 9:31 PM Mridul Muralidharan wrote: > > > Is it only me or are all others getting Wenchen’s mails ? (Obviously Ryan did > :-) ) > I did not see it in the mail thread I received or in archives ... [1] > Won

Re: Nightly Builds in the docs (in spark-nightly/spark-master-bin/latest? Can't seem to find it)

2018-08-31 Thread Marcelo Vanzin
I think there still might be an active job publishing stuff. Here's a pretty recent build from master: https://dist.apache.org/repos/dist/dev/spark/2.4.0-SNAPSHOT-2018_08_31_12_02-32da87d-docs/_site/index.html But it seems only docs are being published, which makes me think it's those builds that

Re: [discuss] replacing SPIP template with Heilmeier's Catechism?

2018-08-31 Thread Marcelo Vanzin
I like the questions (aside maybe from the cost one which perhaps does not matter much here), especially since they encourage explaining things in a more plain language than generally used by specs. But I don't think we can ignore design aspects; it's been my observation that a good portion of SPI

Re: [VOTE] SPIP: Executor Plugin (SPARK-24918)

2018-08-28 Thread Marcelo Vanzin
just about all your code > needs this init; I had understood the use cases to be more like "establish > some local config and init for this particular thing I'm doing for this > legacy system". > > On Tue, Aug 28, 2018 at 11:35 AM Marcelo Vanzin wrote: >> >> +1

Re: [VOTE] SPIP: Executor Plugin (SPARK-24918)

2018-08-28 Thread Marcelo Vanzin
+1 Class init is not enough because there is nowhere for you to force a random class to be initialized. This is basically adding that mechanism, instead of forcing people to add hacks using e.g. mapPartitions which don't even cover all scenarios. On Tue, Aug 28, 2018 at 7:09 AM, Sean Owen wrote:
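The workaround mentioned above can be sketched without Spark. This is a plain-Python illustration under stated assumptions: `ensure_initialized` and `process_partition` are hypothetical names, and the list of lists stands in for the partitions that a function passed to `mapPartitions` would receive.

```python
import threading

_init_lock = threading.Lock()
_initialized = False
init_calls = 0  # counts how often the real setup work ran


def ensure_initialized():
    """Run the one-time setup at most once per process."""
    global _initialized, init_calls
    with _init_lock:
        if not _initialized:
            init_calls += 1  # stand-in for the actual initialization
            _initialized = True


def process_partition(rows):
    """Per-partition function: trigger the setup, then do the real work."""
    ensure_initialized()
    for row in rows:
        yield row * 2


# Stand-in for three partitions handled by the same worker process.
partitions = [[1, 2], [3], []]
results = [list(process_partition(p)) for p in partitions]
```

One likely reading of "don't even cover all scenarios" is that setup done this way only runs in processes that happen to receive a partition from that particular job, so code running elsewhere never sees it.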

Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-24 Thread Marcelo Vanzin
I think this would be useful, but I also share Saisai's and Marco's concern about the extra step when shutting down the application. If that could be minimized this would be a much more interesting feature. e.g. you could upload logs incrementally to HDFS, asynchronously, while the app is running.
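The incremental, asynchronous upload suggested above might look roughly like the sketch below. It is an assumption-laden illustration, not a design from the thread: a second local file stands in for the HDFS destination, and `upload_new_bytes` and `run_uploader` are hypothetical names.

```python
import os
import tempfile
import threading


def upload_new_bytes(src_path, dst_path, offset):
    """Append bytes written to src_path since `offset` to dst_path; return the new offset."""
    with open(src_path, "rb") as src:
        src.seek(offset)
        chunk = src.read()
    if chunk:
        with open(dst_path, "ab") as dst:
            dst.write(chunk)
    return offset + len(chunk)


def run_uploader(src_path, dst_path, stop_event, interval=0.05):
    """Copy new log bytes periodically until asked to stop, then do one last pass."""
    offset = 0
    while not stop_event.is_set():
        offset = upload_new_bytes(src_path, dst_path, offset)
        stop_event.wait(interval)
    # The final pass is small because most bytes were already copied.
    upload_new_bytes(src_path, dst_path, offset)


workdir = tempfile.mkdtemp()
local_log = os.path.join(workdir, "driver.log")
remote_log = os.path.join(workdir, "uploaded.log")  # stand-in for the remote file
open(local_log, "wb").close()

stop = threading.Event()
worker = threading.Thread(target=run_uploader, args=(local_log, remote_log, stop))
worker.start()

with open(local_log, "ab") as log:  # the "application" writing its log
    log.write(b"line 1\n")
    log.flush()
    log.write(b"line 2\n")

stop.set()      # application shutdown
worker.join()

with open(remote_log, "rb") as f:
    uploaded = f.read()
```

Because the background thread keeps the destination nearly up to date, the work left at shutdown is limited to whatever was written since the last pass, which is the concern the message raises about the extra shutdown step.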

Re: [DISCUSS] SparkR support on k8s back-end for Spark 2.4

2018-08-15 Thread Marcelo Vanzin
On Wed, Aug 15, 2018 at 1:35 PM, shane knapp wrote: > in fact, i don't see us getting rid of all of the centos machines until EOY > (see my above comment, re docs, release etc). these are the builds that > will remain on centos for the near future: > https://rise.cs.berkeley.edu/jenkins/label/spa

Re: Cleaning Spark releases from mirrors, and the flakiness of HiveExternalCatalogVersionsSuite

2018-08-13 Thread Marcelo Vanzin
On this topic... when I worked on 2.3.1 and caused this breakage by deleting an old release, I tried to write some code to make this more automatic: https://github.com/vanzin/spark/tree/SPARK-24532 I just found that the code was a little too large and hacky for what it does (find out the latest

[RESULT] [VOTE] Spark 2.1.3 (RC2)

2018-06-29 Thread Marcelo Vanzin
The vote passes. Thanks to all who helped with the release! I'll start publishing everything today, and an announcement will be sent when artifacts have propagated to the mirrors (probably early next week). +1 (* = binding): - Marcelo Vanzin * - Sean Owen * - Felix Cheung * - Tom Graves

Re: [VOTE] Spark 2.1.3 (RC2)

2018-06-28 Thread Marcelo Vanzin
ackport to older branches. On Thu, Jun 28, 2018 at 11:30 AM, Felix Cheung wrote: > If I recall we stop releasing Hadoop 2.3 or 2.4 in newer releases (2.2+?) - > that might be why they are not the release script. > > > ________ > From: Marcelo Vanzin > S

Re: [VOTE] Spark 2.1.3 (RC2)

2018-06-28 Thread Marcelo Vanzin
Alright, uploaded the missing packages. I'll send a PR to update the release scripts just in case... On Thu, Jun 28, 2018 at 10:08 AM, Sean Owen wrote: > If it's easy enough to produce them, I agree you can just add them to the RC > dir. > > On Thu, Jun 28, 2018 at 1

Re: [VOTE] Spark 2.1.3 (RC2)

2018-06-28 Thread Marcelo Vanzin
new RC. On Tue, Jun 26, 2018 at 1:25 PM, Marcelo Vanzin wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.1.3. > > The vote is open until Fri, June 29th @ 9PM UTC (2PM PDT) and passes if a > majority +1 PMC votes are cast, with a minimu

Re: [VOTE] Spark 2.1.3 (RC2)

2018-06-28 Thread Marcelo Vanzin
BTW that would be a great fix in the docs now that we'll have a 2.3.2 being prepared. On Thu, Jun 28, 2018 at 9:17 AM, Felix Cheung wrote: > Exactly... > > ____ > From: Marcelo Vanzin > Sent: Thursday, June 28, 2018 9:16:08 AM > To: Tom Graves

Re: [VOTE] Spark 2.1.3 (RC2)

2018-06-28 Thread Marcelo Vanzin
> > http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Spark-2-1-2-RC2-tt22540.html#a22555 > > Since it isn’t a regression I’d say +1 from me. > > > > From: Tom Graves > Sent: Thursday, June 28, 2018 6:56:16 AM > To: Marc

Re: Time for 2.3.2?

2018-06-28 Thread Marcelo Vanzin
Thu, Jun 28, 2018 at 12:56 PM Saisai Shao >>>>> wrote: >>>>> >>>>>> +1, like mentioned by Marcelo, these issues seems quite severe. >>>>>> >>>>>> I can work on the release if short of hands :). >>>>>> >>

Re: Time for 2.3.2?

2018-06-27 Thread Marcelo Vanzin
+1. SPARK-24589 / SPARK-24552 are kinda nasty and we should get fixes for those out. (Those are what delayed 2.2.2 and 2.1.3 for those watching...) On Wed, Jun 27, 2018 at 7:59 PM, Wenchen Fan wrote: > Hi all, > > Spark 2.3.1 was released just a while ago, but unfortunately we discovered > and f

Re: [VOTE] Spark 2.1.3 (RC2)

2018-06-27 Thread Marcelo Vanzin
lakes: https://amplab.cs.berkeley.edu/jenkins/user/vanzin/my-views/view/Spark/ (Look for the 2.1 branch jobs.) > ____ > From: Marcelo Vanzin > Sent: Wednesday, June 27, 2018 6:55 PM > To: Felix Cheung > Cc: Marcelo Vanzin; Tom Graves; dev > > Sub

Re: [VOTE] Spark 2.1.3 (RC2)

2018-06-27 Thread Marcelo Vanzin
contrasts = NULL, ...) > Argument names in code not in docs: > singular.ok > Mismatches in argument names: > Position: 16 Code: singular.ok Docs: contrasts > Position: 17 Code: contrasts Docs: ... > > > From: Sean Owen > Sent: Wednesday, June

Re: [VOTE] Spark 2.2.2 (RC2)

2018-06-27 Thread Marcelo Vanzin
+1 Checked sigs + ran a bunch of tests on the hadoop-2.7 binary package. On Wed, Jun 27, 2018 at 1:30 PM, Tom Graves wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.2.2. > > The vote is open until Mon, July 2nd @ 9PM UTC (2PM PDT) and passes if a > majority

[VOTE] Spark 2.1.3 (RC2)

2018-06-26 Thread Marcelo Vanzin
Please vote on releasing the following candidate as Apache Spark version 2.1.3. The vote is open until Fri, June 29th @ 9PM UTC (2PM PDT) and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 2.1.3 [ ] -1 Do not release this pack

Re: [VOTE] Spark 2.1.3 (RC2)

2018-06-26 Thread Marcelo Vanzin
Starting with my own +1. On Tue, Jun 26, 2018 at 1:25 PM, Marcelo Vanzin wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.1.3. > > The vote is open until Fri, June 29th @ 9PM UTC (2PM PDT) and passes if a > majority +1 PMC votes are cast, with

Re: Time for 2.1.3

2018-06-19 Thread Marcelo Vanzin
long). On Tue, Jun 12, 2018 at 4:27 PM, Marcelo Vanzin wrote: > Hey all, > > There are some fixes that went into 2.1.3 recently that probably > deserve a release. So as usual, please take a look if there's anything > else you'd like on that release, otherwise I'd l

Re: [ANNOUNCE] Announcing Apache Spark 2.3.1

2018-06-14 Thread Marcelo Vanzin
structured-streaming > Mastering Kafka Streams https://bit.ly/mastering-kafka-streams > Follow me at https://twitter.com/jaceklaskowski > > On Mon, Jun 11, 2018 at 9:47 PM, Marcelo Vanzin wrote: >> >> We are happy to announce the availability of Spark 2.3.1! >> >

Re: Missing HiveConf when starting PySpark from head

2018-06-14 Thread Marcelo Vanzin
Yes, my bad. The code in session.py needs to also catch TypeError like before. On Thu, Jun 14, 2018 at 11:03 AM, Li Jin wrote: > Sounds good. Thanks all for the quick reply. > > https://issues.apache.org/jira/browse/SPARK-24563 > > > On Thu, Jun 14, 2018 at 12:19 PM, Xiao Li wrote: >> >> Thanks

Time for 2.1.3

2018-06-12 Thread Marcelo Vanzin
Hey all, There are some fixes that went into 2.1.3 recently that probably deserve a release. So as usual, please take a look if there's anything else you'd like on that release, otherwise I'd like to start with the process by early next week. I'll go through jira to see what's the status of thing

[ANNOUNCE] Announcing Apache Spark 2.3.1

2018-06-11 Thread Marcelo Vanzin
We are happy to announce the availability of Spark 2.3.1! Apache Spark 2.3.1 is a maintenance release, based on the branch-2.3 maintenance branch of Spark. We strongly recommend all 2.3.x users to upgrade to this stable release. To download Spark 2.3.1, head over to the download page: http://spar

[VOTE] [RESULT] Spark 2.3.1 (RC4)

2018-06-08 Thread Marcelo Vanzin
The vote passes. Thanks to all who helped with the release! I'll follow up later with a release announcement once everything is published. +1 (* = binding): - Marcelo Vanzin * - Reynold Xin * - Sean Owen * - Denny Lee - Dongjoon Hyun - Ricardo Almeida - Hyukjin Kwon - John Zhuge - Mark Ha

Re: Scala 2.12 support

2018-06-07 Thread Marcelo Vanzin
But DB's shell output is on the most recent 2.11, not 2.12, right? On Thu, Jun 7, 2018 at 5:54 PM, Holden Karau wrote: > I agree that's a little odd, could we not add the backspace terminal > character? Regardless even if not, I don't think that should be a blocker > for 2.12 support especially si

Re: Time for 2.2.2 release

2018-06-07 Thread Marcelo Vanzin
Took a look at our branch and most of the stuff that is not already in 2.2 are flaky test fixes, so +1. On Wed, Jun 6, 2018 at 7:54 AM, Tom Graves wrote: > Hello all, > > I think its time for another 2.2 release. > I took a look at Jira and I don't see anything explicitly targeted for 2.2.2 > tha

Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-02 Thread Marcelo Vanzin
3 didn’t work for me > either (even building with -Phadoop-2.7). I guess I’ve been relying on an > unsupported pattern and will need to figure something else out going forward > in order to use s3a://. > > > On Fri, Jun 1, 2018 at 9:09 PM Marcelo Vanzin wrote: >> >>

Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Marcelo Vanzin
project) and figure > out what I need to change (as due diligence for Flintrock’s users). > > Nick > > > On Fri, Jun 1, 2018 at 8:21 PM Marcelo Vanzin wrote: >> >> Using the hadoop-aws package is probably going to be a little more >> complicated than that.

Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Marcelo Vanzin
local-m2-cache: tried > > file:/home/ec2-user/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar > > I’d guess I’m probably using the wrong version of hadoop-aws, but I called > make-distribution.sh with -Phadoop-2.8 so I’m not sure what else to try. > >

Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Marcelo Vanzin
Starting with my own +1 (binding). On Fri, Jun 1, 2018 at 3:28 PM, Marcelo Vanzin wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.3.1. > > Given that I expect at least a few people to be busy with Spark Summit next > week, I'm taking th

[VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Marcelo Vanzin
Please vote on releasing the following candidate as Apache Spark version 2.3.1. Given that I expect at least a few people to be busy with Spark Summit next week, I'm taking the liberty of setting an extended voting period. The vote will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PD

Re: [VOTE] Spark 2.3.1 (RC3)

2018-06-01 Thread Marcelo Vanzin
gain. On Fri, Jun 1, 2018 at 1:20 PM, Xiao Li wrote: > Sorry, I need to say -1 > > This morning, just found a regression in 2.3.1 and reverted > https://github.com/apache/spark/pull/21443 > > Xiao > > 2018-06-01 13:09 GMT-07:00 Marcelo Vanzin : >> >> Pleas

[VOTE] Spark 2.3.1 (RC3)

2018-06-01 Thread Marcelo Vanzin
Please vote on releasing the following candidate as Apache Spark version 2.3.1. Given that I expect at least a few people to be busy with Spark Summit next week, I'm taking the liberty of setting an extended voting period. The vote will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PD

Re: [VOTE] Spark 2.3.1 (RC2)

2018-05-25 Thread Marcelo Vanzin
ise we end up creating throwaway RCs that are just overhead. On Tue, May 22, 2018 at 12:45 PM, Marcelo Vanzin wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.3.1. > > The vote is open until Friday, May 25, at 20:00 UTC and passes if > at leas

Re: [VOTE] Spark 2.3.1 (RC2)

2018-05-23 Thread Marcelo Vanzin
n >> > discuss if we should do a new release for 2.0, 2.1, 2.2 later. >> > >> > Thanks, >> > Wenchen >> > >> > On Wed, May 23, 2018 at 9:54 PM, Sean Owen < >> >> > srowen@ >> >> > > wrote: >> > >

Re: [VOTE] Spark 2.3.1 (RC2)

2018-05-22 Thread Marcelo Vanzin
Starting with my own +1. Did the same testing as RC1. On Tue, May 22, 2018 at 12:45 PM, Marcelo Vanzin wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.3.1. > > The vote is open until Friday, May 25, at 20:00 UTC and passes if > at least 3 +

[VOTE] Spark 2.3.1 (RC2)

2018-05-22 Thread Marcelo Vanzin
Please vote on releasing the following candidate as Apache Spark version 2.3.1. The vote is open until Friday, May 25, at 20:00 UTC and passes if at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 2.3.1 [ ] -1 Do not release this package because ... To learn more about

Re: [VOTE] Spark 2.3.1 (RC1)

2018-05-21 Thread Marcelo Vanzin
FYI the fix for the blocker has just been committed. I'll prepare RC2 tomorrow morning assuming jenkins is reasonably happy with the current state of the branch. On Fri, May 18, 2018 at 10:39 AM, Marcelo Vanzin wrote: > Just to give folks an update. > > In case you haven't

Re: Running lint-java during PR builds?

2018-05-21 Thread Marcelo Vanzin
- all of ASF shares one queue. > > At the number of PRs Spark has this could be a big issue. > > > ________ > From: Marcelo Vanzin > Sent: Monday, May 21, 2018 9:08:28 AM > To: Hyukjin Kwon > Cc: Dongjoon Hyun; dev > Subject: Re: Running lint-java

Re: Running lint-java during PR builds?

2018-05-21 Thread Marcelo Vanzin
6351319 >>> >>> Actually, I've been monitoring the history here. (It's synced every 30 >>> minutes.) >>> >>> https://travis-ci.org/dongjoon-hyun/spark/builds >>> >>> Could we give a change to this? >>> >>> Be

Re: [VOTE] Spark 2.3.1 (RC1)

2018-05-18 Thread Marcelo Vanzin
; pretty serious. I've marked it a blocker, I think it should go into 2.3.1. > I'll also take a closer look comparing to the behavior of the old listener > bus. > > On Thu, May 17, 2018 at 12:18 PM, Marcelo Vanzin > wrote: >> >> Wenchen reviewed and pushed th

Re: [VOTE] Spark 2.3.1 (RC1)

2018-05-17 Thread Marcelo Vanzin
Wenchen reviewed and pushed that change, so he's the most qualified to make that decision. I plan to cut a new RC tomorrow so hopefully he'll see this by then. On Thu, May 17, 2018 at 10:13 AM, Artem Rudoy wrote: > Can we include https://issues.apache.org/jira/browse/SPARK-22371 as well > please

Re: [VOTE] Spark 2.3.1 (RC1)

2018-05-16 Thread Marcelo Vanzin
n only bugfixes. >> >> 2018-05-16 12:11 GMT+02:00 kant kodali : >>> >>> Can this https://issues.apache.org/jira/browse/SPARK-23406 be part of >>> 2.3.1? >>> >>> On Tue, May 15, 2018 at 2:07 PM, Marcelo Vanzin >>> wrote: >

Re: [VOTE] Spark 2.3.1 (RC1)

2018-05-15 Thread Marcelo Vanzin
e. > > https://issues.apache.org/jira/browse/SPARK-24259 > > Xiao > > > 2018-05-15 14:00 GMT-07:00 Marcelo Vanzin : >> >> Please vote on releasing the following candidate as Apache Spark version >> 2.3.1. >> >> The vote is open until Friday, May 18, at 21:00 UTC

Re: [VOTE] Spark 2.3.1 (RC1)

2018-05-15 Thread Marcelo Vanzin
It's in. That link is only a list of the currently open bugs. On Tue, May 15, 2018 at 2:02 PM, Justin Miller wrote: > Did SPARK-24067 not make it in? I don’t see it in https://s.apache.org/Q3Uo. > > Thanks, > Justin > > On May 15, 2018, at 3:00 PM, Marcelo Vanzin wr

Re: [VOTE] Spark 2.3.1 (RC1)

2018-05-15 Thread Marcelo Vanzin
e RC ready. Still learning the ropes. Also, if you plan on doing this in the future, *do not* do "svn co" on the dist.apache.org repo. The ASF Infra folks will not be very kind to you. I'll update our RM docs later. On Tue, May 15, 2018 at 2:00 PM, Marcelo Vanzin wrote: > Please vot

[VOTE] Spark 2.3.1 (RC1)

2018-05-15 Thread Marcelo Vanzin
Please vote on releasing the following candidate as Apache Spark version 2.3.1. The vote is open until Friday, May 18, at 21:00 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 2.3.1 [ ] -1 Do not release this package because ... To le

Time for 2.3.1?

2018-05-10 Thread Marcelo Vanzin
Hello all, It's been a while since we shipped 2.3.0 and lots of important bug fixes have gone into the branch since then. I took a look at Jira and it seems there's not a lot of things explicitly targeted at 2.3.1 - the only potential blocker (a parquet issue) is being worked on since a new parque

Re: Spark UI Source Code

2018-05-07 Thread Marcelo Vanzin
On Mon, May 7, 2018 at 1:44 AM, Anshi Shrivastava wrote: > I've found a KVStore wrapper which stores all the metrics in a LevelDb > store. This KVStore wrapper is available as a spark-dependency but we cannot > access the metrics directly from spark since they are all private. I'm not sure what i

Re: time for Apache Spark 3.0?

2018-04-05 Thread Marcelo Vanzin
On Thu, Apr 5, 2018 at 10:30 AM, Matei Zaharia wrote: > Sorry, but just to be clear here, this is the 2.12 API issue: > https://issues.apache.org/jira/browse/SPARK-14643, with more details in this > doc: > https://docs.google.com/document/d/1P_wmH3U356f079AYgSsN53HKixuNdxSEvo8nw_tgLgM/edit. > >
