Re: YARN Maven build questions

2014-03-04 Thread Tom Graves
the default ones and at least one of the documented ones fail. Cheers, Lars On Fri, Feb 28, 2014 at 3:05 PM, Tom Graves tgraves...@yahoo.com wrote: what build command are you using?    What do you mean when you say YARN branch? The yarn builds have been working fine for me with maven

cloudera repo down again - mqtt

2014-03-14 Thread Tom Graves
It appears the cloudera repo for the mqtt stuff is down again.  Did someone  ping them the last time?   Can we pick this up from some other repo? [ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.4:process (default) on project spark-examples_2.10: Error

Re: cloudera repo down again - mqtt

2014-03-14 Thread Tom Graves
| London On Fri, Mar 14, 2014 at 7:37 AM, Tom Graves tgraves...@yahoo.com wrote: It appears the cloudera repo for the mqtt stuff is down again. Did someone  ping them the last time? Can we pick this up from some other repo? [ERROR] Failed to execute goal org.apache.maven.plugins:maven

Re: Spark 0.9.1 release

2014-03-20 Thread Tom Graves
Thanks for the heads up, saw that and will make sure that is resolved before pulling into 0.9. Unless I'm missing something, they should just use sc.addJar to distribute the jar rather than relying on SPARK_YARN_APP_JAR. Tom On Thursday, March 20, 2014 3:31 PM, Patrick Wendell

Re: [VOTE] Release Apache Spark 0.9.1 (RC3)

2014-03-31 Thread Tom Graves
I should probably pull this off into another thread, but going forward can we try to not have the release votes end on a weekend? Since we only seem to give 3 days, it makes it really hard for anyone who is offline for the weekend to try it out. Either that or extend the voting for more than

Re: [VOTE] Release Apache Spark 0.9.1 (RC3)

2014-04-02 Thread Tom Graves
Note I'm +1 with the doc changed to tell users to export SPARK_YARN_MODE=true before using spark-shell on yarn. I tested it on both hadoop 0.23 and 2.3 clusters using secure hdfs on linux. Tom On Tuesday, April 1, 2014 1:44 PM, Tom Graves tgraves...@yahoo.com wrote: No one else has reported

Re: [VOTE] Release Apache Spark 0.9.1 (RC3)

2014-04-03 Thread Tom Graves
I put up a pull request with documentation changes  https://github.com/apache/spark/pull/314   Tom On Wednesday, April 2, 2014 8:47 AM, Tom Graves tgraves...@yahoo.com wrote: Note I'm +1 with the doc changed to tell users to export SPARK_YARN_MODE=true before using spark-shell on yarn. I

Re: [VOTE] Release Apache Spark 1.0.0 (rc9)

2014-05-18 Thread Tom Graves
no ideas off hand, I'll take a look tomorrow. Tom On Sunday, May 18, 2014 7:28 PM, Matei Zaharia matei.zaha...@gmail.com wrote: Alright, I’ve opened https://github.com/apache/spark/pull/819 with the Windows fixes. I also found one other likely bug,

Re: [VOTE] Release Apache Spark 1.0.0 (rc9)

2014-05-20 Thread Tom Graves
I assume we will have an rc10 to fix the issues Matei found? Tom On Sunday, May 18, 2014 9:08 PM, Patrick Wendell pwend...@gmail.com wrote: Hey Matei - the issue you found is not related to security. This patch a few days ago broke builds for Hadoop 1 with YARN support enabled. The patch

Re: [VOTE] Release Apache Spark 1.0.0 (RC10)

2014-05-21 Thread Tom Graves
Has anyone tried pyspark on yarn and got it to work?  I was having issues when I built spark on redhat but when I built on my mac it had worked,  but now when I build it on my mac it also doesn't work. Tom On Tuesday, May 20, 2014 3:14 PM, Tathagata Das tathagata.das1...@gmail.com wrote:

Re: [VOTE] Release Apache Spark 1.0.0 (RC10)

2014-05-21 Thread Tom Graves
I don't think Kevin's issue would be with an api change in YarnClientImpl since in both cases he says he is using hadoop 2.3.0.  I'll take a look at his post in the user list. Tom On Wednesday, May 21, 2014 7:01 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote: Hi Kevin, Can you try

Re: [VOTE] Release Apache Spark 1.0.0 (RC11)

2014-05-28 Thread Tom Graves
+1. Tested spark on yarn (cluster mode, client mode, pyspark, spark-shell) on hadoop 0.23 and 2.4.  Tom On Wednesday, May 28, 2014 3:07 PM, Sean McNamara sean.mcnam...@webtrends.com wrote: Pulled down, compiled, and tested examples on OS X and ubuntu. Deployed app we are building on spark

Re: [VOTE] Release Apache Spark 1.0.0 (RC11)

2014-06-04 Thread Tom Graves
Testing... Resending as it appears my message didn't go through last week. Tom On Wednesday, May 28, 2014 4:12 PM, Tom Graves tgraves...@yahoo.com wrote: +1. Tested spark on yarn (cluster mode, client mode, pyspark, spark-shell) on hadoop 0.23 and 2.4.  Tom On Wednesday, May 28, 2014 3

Re: [VOTE] Release Apache Spark 1.0.1 (RC2)

2014-07-07 Thread Tom Graves
+1. Ran some Spark on yarn jobs on a hadoop 2.4 cluster with authentication on. Tom On Friday, July 4, 2014 2:39 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.0.1! The tag to be voted on is v1.0.1-rc1 (commit

Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Tom Graves
+1. Ran spark on yarn on hadoop 0.23 and 2.x. Tom On Wednesday, September 3, 2014 2:25 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):

Re: Spark authenticate enablement

2014-09-15 Thread Tom Graves
Spark authentication does work in standalone mode (at least it did, I haven't tested it in a while). The same shared secret has to be set on all the daemons (master and workers) and then also in the configs of any applications submitted. Since everyone shares the same secret it's by no means
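A rough sketch of the shared-secret setup described in the message above, for illustration only. It assumes the standard Spark properties `spark.authenticate` and `spark.authenticate.secret`; the helper name `render_auth_conf` and the idea of pasting the output into each daemon's and application's config are hypothetical, not something the thread specifies.

```python
import secrets


def render_auth_conf(secret: str) -> str:
    """Return spark-defaults.conf style lines that enable shared-secret auth.

    Hypothetical helper: the same text would have to be placed in the
    config of the master, every worker, and every submitted application.
    """
    lines = [
        "spark.authenticate true",
        f"spark.authenticate.secret {secret}",
    ]
    return "\n".join(lines) + "\n"


# Generate one random secret and reuse it everywhere (64 hex characters).
shared_secret = secrets.token_hex(32)
conf_text = render_auth_conf(shared_secret)

# Print the config with the secret masked so it does not end up in logs.
print(conf_text.replace(shared_secret, "<redacted>"))
```

Because every daemon and application holds the same value, anyone who can read one of these configs can impersonate any other participant, which matches the caveat at the end of the message.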

Re: RFC: Deprecating YARN-alpha API's

2014-09-23 Thread Tom Graves
Any other comments or objections on this? Thanks, Tom On Tuesday, September 9, 2014 4:39 PM, Chester Chen ches...@alpinenow.com wrote: We were using it until recently, we are talking to our customers and see if we can get off it. Chester Alpine Data Labs On Tue, Sep 9, 2014 at

Re: [VOTE] Designating maintainers for some Spark components

2014-11-06 Thread Tom Graves
+1. Tom On Wednesday, November 5, 2014 9:21 PM, Matei Zaharia matei.zaha...@gmail.com wrote: BTW, my own vote is obviously +1 (binding). Matei On Nov 5, 2014, at 5:31 PM, Matei Zaharia matei.zaha...@gmail.com wrote: Hi all, I wanted to share a discussion we've been having on

Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-02 Thread Tom Graves
+1 tested on yarn. Tom On Friday, November 28, 2014 11:18 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.2.0! The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):

Re: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-13 Thread Tom Graves
+1 built and tested on Yarn on Hadoop 2.x cluster. Tom On Saturday, December 13, 2014 12:48 AM, Denny Lee denny.g@gmail.com wrote: +1 Tested on OSX Tested Scala 2.10.3, SparkSQL with Hive 0.12 / Hadoop 2.5, Thrift Server, MLLib SVD On Fri Dec 12 2014 at 8:57:16 PM Mark Hamstra

Re: [VOTE] Release Apache Spark 1.3.0 (RC1)

2015-02-20 Thread Tom Graves
Trying to run pyspark on yarn in client mode with basic wordcount example I see the following error when doing the collect: Error from python worker: /usr/bin/python: No module named sql PYTHONPATH was:

Spark Sql reading hive partitioned tables?

2015-04-13 Thread Tom Graves
Hey, I was trying out spark sql using the HiveContext and doing a select on a partitioned table with lots of partitions (16,000+). It took over 6 minutes before it even started the job. It looks like it was querying the Hive metastore and got a good chunk of data back.  Which I'm guessing is

kryo version?

2015-05-06 Thread Tom Graves
Hey folks, I had a customer ask about updating the version of kryo to get fix: https://github.com/EsotericSoftware/kryo/pull/164 which is in 2.23. Spark currently pulls in chill 0.5.0 which pulls in kryo 2.21. I don't see a newer version of chill that has updated to kryo 2.23. Anyone familiar

Re: [discuss] ending support for Java 6?

2015-05-06 Thread Tom Graves
...@databricks.com wrote: OK I sent an email. On Tue, May 5, 2015 at 2:47 PM, shane knapp skn...@berkeley.edu wrote: +1 to an announce to user and dev.  java6 is so old and sad. On Tue, May 5, 2015 at 2:24 PM, Tom Graves tgraves...@yahoo.com wrote: +1. I haven't seen major objections here so

Re: [discuss] ending support for Java 6?

2015-05-05 Thread Tom Graves
+1. I haven't seen major objections here so I would say send announcement and see if any users have objections Tom On Tuesday, May 5, 2015 5:09 AM, Patrick Wendell pwend...@gmail.com wrote: If there is broad consensus here to drop Java 1.6 in Spark 1.5, should we do an ANNOUNCE to

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Tom Graves
+1. Tested spark on yarn against hadoop 2.6. Tom On Wednesday, April 8, 2015 6:15 AM, Sean Owen so...@cloudera.com wrote: Still a +1 from me; same result (except that now of course the UISeleniumSuite test does not fail) On Wed, Apr 8, 2015 at 1:46 AM, Patrick Wendell

Re: [VOTE] Release Apache Spark 1.4.1

2015-06-26 Thread Tom Graves
So is this open for vote then or are we waiting on other things? Tom On Thursday, June 25, 2015 10:32 AM, Andrew Ash and...@andrewash.com wrote: I would guess that many tickets targeted at 1.4.1 were set that way during the tail end of the 1.4.0 voting process as people realized

Re: [VOTE] Release Apache Spark 1.4.1

2015-06-29 Thread Tom Graves
+1. Tested on yarn on hadoop 2.6 cluster Tom On Monday, June 29, 2015 2:04 AM, Tathagata Das tathagata.das1...@gmail.com wrote: @Ted, could you elaborate more on what was the test command that you ran? What profiles, using SBT or Maven?  TD On Sun, Jun 28, 2015 at 12:21 PM, Patrick

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Tom Graves
While running our regression tests I found https://issues.apache.org/jira/browse/SPARK-11555. It is a break in backwards compatibility but it's using the old spark-class and --num-workers interface which I hope no one is still using. I'm a +0 as it doesn't seem super critical but I hate to

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Tom Graves
t work at > all on YARN unless dynamic allocation is on? the fix is easy, but > sounds like it could be a Blocker. > > On Fri, Nov 6, 2015 at 2:51 PM, Tom Graves <tgraves...@yahoo.com> wrote: >>  While running our regression tests I found >> https://issues.apach

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-12 Thread Tom Graves
I know there are multiple things being talked about here, but  I agree with Patrick here, we vote on the source distribution - src tarball (and of course the tag should match).  Perhaps in principle we vote on all the other specific binary distributions since they are generated from source

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-10 Thread Tom Graves
+1 Tom On Thursday, July 9, 2015 12:55 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.4.1! This release fixes a handful of known issues in Spark 1.4.0, listed here: http://s.apache.org/spark-1.4.1 The tag to

Re: [VOTE] Release Apache Spark 1.5.0 (RC1)

2015-08-25 Thread Tom Graves
Is there a jira to update the sql hive docs? Spark SQL and DataFrames - Spark 1.5.0 Documentation

Re: [VOTE] Release Apache Spark 1.5.0 (RC3)

2015-09-03 Thread Tom Graves
+1. Tested on Yarn with Hadoop 2.6.  A few of the things tested: pyspark, hive integration, aux shuffle handler, history server, basic submit cli behavior, distributed cache behavior, cluster and client mode... Tom On Tuesday, September 1, 2015 3:42 PM, Reynold Xin

Re: [VOTE] Release Apache Spark 1.5.0 (RC1)

2015-08-25 Thread Tom Graves
On Tuesday, August 25, 2015 1:56 PM, Tom Graves tgraves...@yahoo.com.INVALID wrote: Is there a jira to update the sql hive docs? Spark SQL and DataFrames - Spark 1.5.0 Documentation

Re: [VOTE] Release Apache Spark 1.5.1 (RC1)

2015-09-25 Thread Tom Graves
+1. Tested Spark on Yarn on Hadoop 2.6 and 2.7. Tom On Thursday, September 24, 2015 2:34 AM, Reynold Xin wrote: Please vote on releasing the following candidate as Apache Spark version 1.5.1. The vote is open until Sun, Sep 27, 2015 at 10:00 UTC and passes if

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-18 Thread Tom Graves
+1.  Ran some regression tests on Spark on Yarn (hadoop 2.6 and 2.7). Tom On Wednesday, December 16, 2015 3:32 PM, Michael Armbrust wrote: Please vote on releasing the following candidate as Apache Spark version 1.6.0! The vote is open until Saturday, December

Re: A proposal for Spark 2.0

2015-12-22 Thread Tom Graves
Do we have a summary of all the discussions and what is planned for 2.0 then?   Perhaps we should put on the wiki for reference. Tom On Tuesday, December 22, 2015 12:12 AM, Reynold Xin wrote: FYI I updated the master branch's Spark version to 2.0.0-SNAPSHOT.  On

Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-02 Thread Tom Graves
The documentation for the preview release also seems to be missing? Also what happens if we want to do a second preview release? The naming doesn't seem to allow that unless we call it preview 2. Tom On Wednesday, June 1, 2016 6:27 PM, Sean Owen wrote: On Wed, Jun

Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-07 Thread Tom Graves
On Tue, Jun 7, 2016 at 4:01 PM, Tom Graves <tgraves...@yahoo.com> wrote: > I just checked and I don't see the 2.0 preview release at all anymore on > http://spark.apache.org/downloads.html, is it in transition? The only > place I can see it is at > http://spark.apache.org/news/

Re: cutting 1.6.2 rc and 2.0.0 rc this week?

2016-06-16 Thread Tom Graves
+1 Tom On Wednesday, June 15, 2016 2:01 PM, Reynold Xin wrote: It's been a while and we have accumulated quite a few bug fixes in branch-1.6. I'm thinking about cutting 1.6.2 rc this week. Any patches somebody want to get in last minute? On a related note, I'm

Re: [discuss] ending support for Java 7 in Spark 2.0

2016-03-29 Thread Tom Graves
+1. Tom On Tuesday, March 29, 2016 1:17 PM, Reynold Xin wrote: They work. On Tue, Mar 29, 2016 at 10:01 AM, Koert Kuipers wrote: if scala prior to sbt 2.10.4 didn't support java 8, does that mean that 3rd party scala libraries compiled with a

Re: [discuss] ending support for Java 7 in Spark 2.0

2016-03-30 Thread Tom Graves
Steve, those are good points, I had forgotten Hadoop had those issues. We run with jdk 8, hadoop is built for jdk7 compatibility, we are running hadoop 2.7 on our clusters and by the time Spark 2.0 is out I would expect a mix of Hadoop 2.7 and 2.8. We also don't use spnego. I didn't quite

Re: [DISCUSS] Removing or changing maintainer process

2016-05-19 Thread Tom Graves
+1 (binding) Tom On Thursday, May 19, 2016 10:35 AM, Matei Zaharia wrote: Hi folks, Around 1.5 years ago, Spark added a maintainer process for reviewing API and architectural changes

Re: [VOTE] Removing module maintainer process

2016-05-23 Thread Tom Graves
+1 (binding) Tom On Sunday, May 22, 2016 7:34 PM, Matei Zaharia wrote: It looks like the discussion thread on this has only had positive replies, so I'm going to call a VOTE. The proposal is to remove the maintainer process in

Re: [discuss] separate API annotation into two components: InterfaceAudience & InterfaceStability

2016-05-13 Thread Tom Graves
So we definitely need to be careful here. I know you didn't mention it but it was mentioned by others so I would not recommend using LimitedPrivate. I had started a discussion on Hadoop about some of this due to the way Spark needed to use some of the

Re: YARN Shuffle service and its compatibility

2016-04-19 Thread Tom Graves
It would be nice if we could keep this compatible between 1.6 and 2.0 so I'm more for Option B at this point since the change made seems minor and we can change to have shuffle service do internally like Marcelo mentioned. Then let's try to keep compatible, but if there is a forcing function lets

Re: [DISCUSS] Minimize use of MINOR, BUILD, and HOTFIX w/ no JIRA

2016-07-07 Thread Tom Graves
Popping this back up to the dev list again.  I see a bunch of checkins with minor or hotfix.   It seems to me we shouldn't be doing this, but I would like to hear thoughts from others.  I see no reason we can't have a jira for each of those issues, it only takes a few seconds to file one and it

Re: [DISCUSS] Minimize use of MINOR, BUILD, and HOTFIX w/ no JIRA

2016-07-07 Thread Tom Graves
s a JIRA. Also: we have some hot-fixes here that aren't connected to JIRAs. Either they belong with an existing JIRA and aren't tagged correctly, or, again, are patching changes that weren't really trivial enough to skip a JIRA to begin with. On Thu, Jul 7, 2016 at 7:47 PM, Tom Graves <tgrave

Re: [VOTE] Release Apache Spark 2.0.1 (RC4)

2016-09-30 Thread Tom Graves
+1 Tom On Wednesday, September 28, 2016 9:15 PM, Reynold Xin wrote: Please vote on releasing the following candidate as Apache Spark version 2.0.1. The vote is open until Sat, Oct 1, 2016 at 20:00 PDT and passes if a majority of at least 3+1 PMC votes are cast. [

Re: [discuss] separate API annotation into two components: InterfaceAudience & InterfaceStability

2016-08-24 Thread Tom Graves
ping, did this discussion conclude or did we decide what we are doing? Tom On Friday, May 13, 2016 3:19 PM, Michael Armbrust wrote: +1 to the general structure of Reynold's proposal.  I've found what we do currently a little confusing.  In particular, it

Re: [discuss] Spark 2.x release cadence

2016-09-28 Thread Tom Graves
+1 to 4 months. Tom On Tuesday, September 27, 2016 2:07 PM, Reynold Xin wrote: We are 2 months past releasing Spark 2.0.0, an important milestone for the project. Spark 2.0.0 deviated (took 6 months) from the regular release cadence we had for the 1.x line, and we

Re: planning & discussion for larger scheduler changes

2017-03-27 Thread Tom Graves
1) I think this depends on individual case by case jira. I haven't looked in detail at SPARK-14649, which seems much larger, although more the way I think we want to go, while SPARK-13669 seems less risky and easily configurable. 2) I don't know whether it needs an entire rewrite but I think there need

Re: planning & discussion for larger scheduler changes

2017-03-30 Thread Tom Graves
imilar things that could be done in other parts of the scheduler. Tom's comments re: (2) are more about performance improvements rather than readability / testability / debuggability, but also seem important and it does seem useful to have a JIRA tracking those. -Kay On Mon, Mar 27, 2017 at 11:06 A

Re: Spark Improvement Proposals

2017-03-13 Thread Tom Graves
I think a vote here would be good. I think most of the discussion was done by 4 or 5 people and it's a long thread. If nothing else it summarizes everything and gets people's attention to the change. Tom On Thursday, March 9, 2017 10:55 AM, Sean Owen wrote: I think a

Re: Spark Improvement Proposals

2017-03-13 Thread Tom Graves
change, instead of precipitating a meta-vote. However, the text that's on the web site now can certainly be further amended if anyone wants to propose a change from here. On Mon, Mar 13, 2017 at 1:50 PM Tom Graves <tgraves...@yahoo.com> wrote: I think a vote here would be good.

Re: Spark Improvement Proposals

2017-03-13 Thread Tom Graves
en around a long time with no further comment, and I called several times for more input. That's pretty strong lazy consensus of the form we use every day.  On Mon, Mar 13, 2017 at 5:30 PM Tom Graves <tgraves...@yahoo.com> wrote: It seems like if you are adding responsibilities you sho

Re: Spark Improvement Proposals

2017-03-13 Thread Tom Graves
Another thing I think you should send out is when exactly does this take effect. Is it any major new feature without a pull request? Is it anything major starting with the 2.3 release? Tom On Monday, March 13, 2017 1:08 PM, Tom Graves <tgraves...@yahoo.com.INVALID> wrote:

Re: planning & discussion for larger scheduler changes

2017-03-31 Thread Tom Graves
filed [SPARK-20178] Improve Scheduler fetch failures - ASF JIRA Tom On Thursday, March 30, 2017 1:21 PM, Tom Graves <tgraves...@yahoo.com> wrote:

Re: spark pypy support?

2017-08-15 Thread Tom Graves
update that to 2.5+ since we aren't testing with 2.3 anymore? On Mon, Aug 14, 2017 at 3:09 PM, Tom Graves <tgraves...@yahoo.com.invalid> wrote: I tried 5.7 and 2.5.1 so it's probably something in my setup. I'll investigate that more, wanted to make sure it was still supported because

spark pypy support?

2017-08-14 Thread Tom Graves
Anyone know if pypy works with spark? Saw a jira that it was supported back in Spark 1.2 but getting an error when trying and not sure if it's something with my pypy version or just something spark doesn't support. AttributeError: 'builtin-code' object has no attribute 'co_filename' Traceback

Re: spark pypy support?

2017-08-14 Thread Tom Graves
yspark-sql', 'pyspark-streaming'] >> >> Starting test(python2.7): pyspark.mllib.tests >> >> Starting test(pypy): pyspark.sql.tests >> >> Starting test(pypy): pyspark.tests >> >> Starting test(pypy): pyspark.streaming.tests >> >> Finished test(pypy

Re: [VOTE] Apache Spark 2.1.1 (RC4)

2017-04-28 Thread Tom Graves
+1 Tom Graves On Thursday, April 27, 2017 5:37 PM, vaquar khan <vaquar.k...@gmail.com> wrote: +1  Regards, Vaquar khan On Apr 27, 2017 4:11 PM, "Holden Karau" <hol...@pigscanfly.ca> wrote: +1 (non-binding) PySpark packaging issue from the earlier RC seems to ha

Re: [VOTE] [SPIP] SPARK-18085: Better History Server scalability

2017-08-01 Thread Tom Graves
+1.  Tom On Monday, July 31, 2017, 12:28:02 PM CDT, Marcelo Vanzin wrote: Hey all, Following the SPIP process, I'm putting this SPIP up for a vote. It's been open for comments as an SPIP for about 3 weeks now, and had been open without the SPIP label for about 9 months

PR permission to kick Jenkins?

2017-05-05 Thread Tom Graves
Does anyone know how to configure Jenkins to allow committers to tell it to test PRs? I used to have this access but lately it is either not working or only intermittently working. The commands like "ok to test", "test this please", etc. Thanks, Tom

Re: Thoughts on release cadence?

2017-08-24 Thread Tom Graves
I think we need to update the FEATURE section on the version policy page to match. It says feature releases are every 4 months. Tom On Monday, July 31, 2017, 2:23:10 PM CDT, Sean Owen wrote: Done at https://spark.apache.org/versioning-policy.html On Mon, Jul 31, 2017 at

Re: [Vote] SPIP: Continuous Processing Mode for Structured Streaming

2017-11-06 Thread Tom Graves
+1 for the idea and feature, but I think the design is definitely lacking detail on the internal changes needed and how the execution pieces work and the communication.  Are you planning on posting more of those details or were you just planning on discussing in PR? Tom On Wednesday,

Re: Time for 2.1.3

2018-06-15 Thread Tom Graves
+1 for doing a 2.1.3 release. Tom On Wednesday, June 13, 2018, 7:28:26 AM CDT, Marco Gaido wrote: Yes, you're right Herman. Sorry, my bad. Thanks. Marco 2018-06-13 14:01 GMT+02:00 Herman van Hövell tot Westerflier : Isn't this only a problem with Spark 2.3.x? On Wed, Jun 13, 2018 at

Time for 2.2.2 release

2018-06-06 Thread Tom Graves
(by replying here or updating the bug in Jira), otherwise I'm volunteering to prepare the first RC soon-ish (by early next week since Spark Summit is this week). Thanks! Tom Graves

[VOTE] Spark 2.2.2 (RC2)

2018-06-27 Thread Tom Graves
That being said, if there is something which is a regression that has not been correctly targeted please ping me or a committer to help target the issue. -- Tom Graves

Re: [VOTE] Spark 2.2.2 (RC2)

2018-07-02 Thread Tom Graves
On Thu, Jun 28, 2018 at 8:42 AM, Sean Owen wrote: +1 from me too. On Wed, Jun 27, 2018 at 3:31 PM Tom Graves wrote: Please vote on releasing the following candidate as Apache Spark version 2.2.2. The vote is open until Mon, July 2nd @ 9PM UTC (2PM PDT) and passes if a majority +1 PMC vote

[RESULT] [VOTE] Spark 2.2.2 (RC2)

2018-07-02 Thread Tom Graves
The vote passes. Thanks to all who helped with the release! I'll start publishing everything tomorrow, and an announcement will be sent when artifacts have propagated to the mirrors (probably early next week). +1 (* = binding): - Marcelo Vanzin * - Sean Owen * - Tom Graves * - Holden Karau

Re: [VOTE] Spark 2.3.0 (RC2)

2018-02-01 Thread Tom Graves
Testing with spark 2.3 and I see a difference in the sql coalesce talking to hive vs spark 2.2. It seems spark 2.3 ignores the coalesce. Query: spark.sql("SELECT COUNT(DISTINCT(something)) FROM sometable WHERE dt >= '20170301' AND dt <= '20170331' AND something IS NOT

Re: code freeze and branch cut for Apache Spark 2.4

2018-07-30 Thread Tom Graves
Shouldn't this be a discuss thread? I'm also happy to see more release managers and agree the time is getting close, but we should see what features are in progress and see how close things are and propose a date based on that. Cutting a branch too soon just creates more work for committers

Re: code freeze and branch cut for Apache Spark 2.4

2018-08-07 Thread Tom Graves
I would like to get clarification on our avro compatibility story before the release.  anyone interested please look at -  https://issues.apache.org/jira/browse/SPARK-24924 . I probably should have filed a separate jira and can if we don't resolve via discussion there. Tom  On Tuesday,

Re: [DISCUSS] Handling correctness/data loss jiras

2018-08-13 Thread Tom Graves
t do it, if it's to an active release branch (see below). Anything that important has to outweigh most any other concern, like behavior changes. On Mon, Aug 13, 2018 at 11:08 AM Tom Graves wrote: I'm not really sure what you mean by this, this proposal is to introduce a process for this type

Re: code freeze and branch cut for Apache Spark 2.4

2018-08-13 Thread Tom Graves
I agree with Imran, we need to fix SPARK-23243 and any correctness issues for that matter. Tom On Wednesday, August 8, 2018, 9:06:43 AM CDT, Imran Rashid wrote: On Tue, Aug 7, 2018 at 8:39 AM, Wenchen Fan wrote: SPARK-23243: Shuffle+Repartition on an RDD could lead to incorrect

Re: [DISCUSS] Handling correctness/data loss jiras

2018-08-13 Thread Tom Graves
n important question as an aside, one we haven't answered: when does a branch go inactive? I am sure 2.0.x is inactive, de facto, along with all 1.x. I think 2.1.x is inactive too. Should we put any rough guidance in place? a branch is maintained for 12-18 months? On Mon, Aug 13, 2018 at 8:45

[DISCUSS] Handling correctness/data loss jiras

2018-08-13 Thread Tom Graves
se Thanks, Tom Graves

Re: [DISCUSS] Handling correctness/data loss jiras

2018-08-17 Thread Tom Graves
as blocker by default.  There is also a label to mark the jira as having something needing to go into the release-notes. Tom On Tuesday, August 14, 2018, 3:32:27 PM CDT, Imran Rashid wrote: +1 on what we should do. On Mon, Aug 13, 2018 at 3:06 PM, Tom Graves wrote: > I mean, w

Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-20 Thread Tom Graves
fyi, I merged in a couple jira that were critical (and I thought would be good to include in the next release) that if we spin another RC will get included, we should update the jira SPARK-24755 and SPARK-24677, if anyone disagrees we could back those out but I think they would be good to

[ANNOUNCE] Apache Spark 2.2.2

2018-07-10 Thread Tom Graves
We are happy to announce the availability of Spark 2.2.2! Apache Spark 2.2.2 is a maintenance release, based on the branch-2.2 maintenance branch of Spark. We strongly recommend all 2.2.x users to upgrade to this stable release. The release notes are available at 

Re: [VOTE] Spark 2.1.3 (RC2)

2018-06-28 Thread Tom Graves
fix (in time) for 2.1.2? http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Spark-2-1-2-RC2-tt22540.html#a22555 Since it isn’t a regression I’d say +1 from me. From: Tom Graves Sent: Thursday, June 28, 2018 6:56:16 AM To: Marcelo Vanzin; Felix Cheung Cc: dev Subject: Re: [VOTE] Spark

Re: [VOTE] Spark 2.1.3 (RC2)

2018-06-28 Thread Tom Graves
: Yes, this is broken with newer version of R. We check explicitly for warning for the R check which should fail the test run. From: Marcelo Vanzin Sent: Wednesday, June 27, 2018 6:55 PM To: Felix Cheung Cc: Marcelo Vanzin; Tom Graves; dev Subject: Re: [VOTE] Spark 2.1.3 (RC2) Not sure I

Re: What's a blocker?

2018-10-25 Thread Tom Graves
So just to clarify a few things in case people didn't read the entire thread in the PR, the discussion is what is the criteria for a blocker and really my concerns are what people are using as criteria for not marking a jira as a blocker. The only thing we have documented to mark a jira as a

Re: What's a blocker?

2018-10-25 Thread Tom Graves
aybe it's reasonable to draw the "must" vs "should" line between them. On Thu, Oct 25, 2018 at 8:51 AM Tom Graves wrote: So just to clarify a few things in case people didn't read the entire thread in the PR, the discussion is what is the criteria for a blocker and reall

Re: Test and support only LTS JDK release?

2018-11-07 Thread Tom Graves
+1 seems reasonable at this point. Tom On Tuesday, November 6, 2018, 1:24:16 PM CST, DB Tsai wrote: Given Oracle's new 6-month release model, I feel the only realistic option is to only test and support JDK such as JDK 11 LTS and future LTS release. I would like to have a discussion

Re: [Structured Streaming] Kafka group.id is fixed

2018-11-19 Thread Tom Graves
This makes sense to me and was going to propose something similar in order to be able to use the kafka acls more effectively as well, can you file a jira for it? Tom On Friday, November 9, 2018, 2:26:12 AM CST, Anastasios Zouzias wrote: Hi all, I run in the following situation with

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-25 Thread Tom Graves
?  then I will vote +0. On Tue, Mar 5, 2019 at 8:25 AM Tom Graves wrote: So to me most of the questions here are implementation/design questions, I've had this issue in the past with SPIP's where I expected to have more high level design details but was basically told that belongs in the design jira

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-01 Thread Tom Graves
+1 for the SPIP. Tom On Friday, March 1, 2019, 8:14:43 AM CST, Xingbo Jiang wrote: Hi all, I want to call for a vote of SPARK-24615. It improves Spark by making it aware of GPUs exposed by cluster managers, and hence Spark can match GPU resources with user task requests properly. The 

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-05 Thread Tom Graves
So to me most of the questions here are implementation/design questions, I've had this issue in the past with SPIP's where I expected to have more high level design details but was basically told that belongs in the design jira follow on. This makes me think we need to revisit what a SPIP

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-21 Thread Tom Graves
orm, in some release? and (2) is it *possible* to do this in a safe way?  then I will vote +0. On Tue, Mar 5, 2019 at 8:25 AM Tom Graves wrote: So to me most of the questions here are implementation/design questions, I've had this issue in the past with SPIP's where I expected to have more high l

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-21 Thread Tom Graves
to extend the existing resource allocation mechanisms to handle domain-specific resources, but it does feel to me like we should at least be considering doing that deeper redesign. On Thu, Mar 21, 2019 at 7:33 AM Tom Graves wrote: The proposal here is that all your resources are static

Jenkins commands?

2019-02-06 Thread Tom Graves
I'm curious if we have it documented anywhere or if there is a good place to look, what exact commands Spark runs in the pull request builds and the QA builds? Thanks, Tom

Re: Jenkins commands?

2019-02-07 Thread Tom Graves
" \ -Pyarn \ -Phive \ -Phive-thriftserver \ -Pkinesis-asl \ -Pmesos \ --fail-at-end \ test there are some specific RISE/AMPLab variables involved (grep -r AMPLAB spark/*) for the build system, but this should cover it. On Wed, Feb 6, 2019 at 3:55 PM Tom Graves wrote: I'm c

[VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support

2019-04-16 Thread Tom Graves
... Thanks! Tom Graves

Re: [VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support

2019-05-29 Thread Tom Graves
of discussions we have come down to just the public API. If the community thinks a new set of public API is maintainable, I don’t see any problem with that. From: Tom Graves Sent: Sunday, May 26, 2019 8:22:59 AM To: hol...@pigscanfly.ca; Reynold Xin Cc: Bobby Evans; DB Tsai; Dongjoon Hyun; Imran Rashid

Re: [VOTE][SPARK-25299] SPIP: Shuffle Storage API

2019-06-21 Thread Tom Graves
+1 (binding) I haven't looked at the low level api, but like the idea and approach to get it started. Tom On Tuesday, June 18, 2019, 10:40:34 PM CDT, Guo, Chenzhao wrote:

Re: [VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support

2019-04-22 Thread Tom Graves
uld really give concrete ETL cases to prove that it is important for us to do so. On Mon, Apr 22, 2019 at 8:27 AM Tom Graves wrote: Since there is still discussion and Spark Summit is this week, I'm going to extend the vote until Friday the 26th. Tom On Monday, April 22, 2019, 8:44:00 AM CDT, B

Re: [VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support

2019-04-22 Thread Tom Graves
> > Processing Support > > > >  > > > > + (non-binding) > > > > Sent from my iPhone > > > > Pardon the dumb thumb typos :) > > > > > > On Apr 19, 2019, at 10:30 AM, Bryan Cutler wrote: > > > > +1 (non-b
