Re: State of the Build

2015-11-06 Thread Jakob Odersky
Reposting to the list... Thanks for all the feedback everyone, I get a clearer picture of the reasoning and implications now. Koert, according to your post in this thread http://apache-spark-developers-list.1001551.n3.nabble.com/Master-build-fails-tt14895.html#a15023, it is apparently very easy

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Krishna Sankar
+1 (non-binding, of course) (Hope I made it in time. ~T-20 !) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 25:52 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib (iPython 4.0, FYI, notebook install is separate “conda install ipython” and then “conda install

Re: Master build fails ?

2015-11-06 Thread Steve Loughran
> On 5 Nov 2015, at 20:07, Marcelo Vanzin wrote: > > Man that command is slow. Anyway, it seems guava 16 is being brought > transitively by curator 2.6.0 which should have been overridden by the > explicit dependency on curator 2.4.0, but apparently, as Steve > mentioned,

Re: pyspark with pypy not work for spark-1.5.1

2015-11-06 Thread Chang Ya-Hsuan
Hi, I ran ./python/run-tests to test the following modules of spark-1.5.1: ['pyspark-core', 'pyspark-ml', 'pyspark-mllib', 'pyspark-sql', 'pyspark-streaming'] against the following pypy versions: pypy-2.2.1 pypy-2.3 pypy-2.3.1 pypy-2.4.0 pypy-2.5.0 pypy-2.5.1 pypy-2.6.0 pypy-2.6.1 pypy-4.0.0

Re: Master build fails ?

2015-11-06 Thread Steve Loughran
> On 6 Nov 2015, at 17:35, Marcelo Vanzin wrote: > > On Fri, Nov 6, 2015 at 2:21 AM, Steve Loughran wrote: >> Maven's closest-first policy has a different flaw, namely that it's not >> always obvious why a guava 14.0 that is two hops of

Re: Master build fails ?

2015-11-06 Thread Koert Kuipers
if there is no strong preference for one dependencies policy over another, but consistency between the 2 systems is desired, then i believe maven can be made to behave like ivy pretty easily with a setting in the pom On Fri, Nov 6, 2015 at 5:21 AM, Steve Loughran wrote:

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Luc Bourlier
+1 (non binding) Tested the integration with Mesos in the different configurations. Luc Le jeu. 5 nov. 2015 à 21:02, Nicholas Chammas a écrit : > -0 > > The spark-ec2 version is still set to 1.5.1 >

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Tom Graves
While running our regression tests I found https://issues.apache.org/jira/browse/SPARK-11555. It is a break in backwards compatibility but it's using the old spark-class and --num-workers interface which I hope no one is still using. I'm a +0 as it doesn't seem super critical but I hate to

Re: Master build fails ?

2015-11-06 Thread Ted Yu
Since maven is the preferred build vehicle, ivy style dependencies policy would produce surprising results compared to today's behavior. I would suggest staying with current dependencies policy. My two cents. On Fri, Nov 6, 2015 at 6:25 AM, Koert Kuipers wrote: > if there

Re: [BUILD SYSTEM] quick jenkins downtime, november 5th 7am

2015-11-06 Thread shane knapp
this is happening now. On Thu, Nov 5, 2015 at 11:08 AM, shane knapp wrote: > well, i forgot to put this on my calendar and didn't get around to > getting it done this morning. :) > > anyways, i'll be shooting for tomorrow (friday) morning instead. > > shane > > On Mon, Nov

Re: [BUILD SYSTEM] quick jenkins downtime, november 5th 7am

2015-11-06 Thread shane knapp
and we're back! On Fri, Nov 6, 2015 at 7:39 AM, shane knapp wrote: > this is happening now. > > On Thu, Nov 5, 2015 at 11:08 AM, shane knapp wrote: >> well, i forgot to put this on my calendar and didn't get around to >> getting it done this morning.

Re: Looking for the method executors uses to write to HDFS

2015-11-06 Thread Reynold Xin
Are you looking for this? https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRelation.scala#L69 On Wed, Nov 4, 2015 at 5:11 AM, Tóth Zoltán wrote: > Hi, > > I'd like to write a parquet file from the

Re: Master build fails ?

2015-11-06 Thread Marcelo Vanzin
On Fri, Nov 6, 2015 at 2:21 AM, Steve Loughran wrote: > Maven's closest-first policy has a different flaw, namely that it's not always > obvious why a guava 14.0 that is two hops of transitiveness should take > priority over a 16.0 version three hops away. Especially when
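
To make the two policies discussed in this thread concrete, here is a minimal Python sketch of the guava example quoted above. It is an illustration only, not how Maven or Ivy is implemented: the function names are hypothetical, the hop counts are taken from the quoted message, and "latest wins" is an assumption standing in for the Ivy-style behavior mentioned elsewhere in the thread.

```python
from typing import List, Tuple

# Hypothetical candidate: (version string, hops from the root project).
Candidate = Tuple[str, int]


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '14.0' into (14, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def nearest_wins(candidates: List[Candidate]) -> str:
    """Closest-first: pick the candidate with the fewest hops.

    min() returns the first minimal element, so a tie goes to the
    candidate listed first.
    """
    return min(candidates, key=lambda c: c[1])[0]


def latest_wins(candidates: List[Candidate]) -> str:
    """Highest-version-first: ignore depth, pick the newest version."""
    return max(candidates, key=lambda c: version_key(c[0]))[0]


# The case from the thread: guava 14.0 two hops away, 16.0 three hops away.
guava = [("14.0", 2), ("16.0", 3)]

print("nearest wins:", nearest_wins(guava))
print("latest wins: ", latest_wins(guava))
```

The same dependency graph yields different guava versions under the two rules, which is one way two build tools can disagree on the same project.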

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Chester Chen
+1 Tested against CDH5.4.2 with hadoop 2.6.0 version using yesterday's code, built locally. Regression running in Yarn Cluster mode against a few internal ML (logistic regression, linear regression, random forest and statistic summary) as well as MLlib KMeans. All seems to work fine. Chester On Tue,

Re: [BUILD SYSTEM] quick jenkins downtime, november 5th 7am

2015-11-06 Thread Michael Armbrust
I'm noticing several problems with Jenkins since the upgrade. PR comments say: "Build started sha1 is merged." instead of actually printing the hash Also: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45246/console GitHub pull request #9527 of commit

Re: [BUILD SYSTEM] quick jenkins downtime, november 5th 7am

2015-11-06 Thread shane knapp
a pox on the github pull request builder... the update wiped out the github auth creds. :\ On Fri, Nov 6, 2015 at 12:30 PM, shane knapp wrote: > looking in to this now. > > On Fri, Nov 6, 2015 at 12:28 PM, Michael Armbrust > wrote: >> I'm noticing

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Michael Armbrust
+1 On Fri, Nov 6, 2015 at 9:27 AM, Chester Chen wrote: > +1 > Test against CDH5.4.2 with hadoop 2.6.0 version using yesterday's code, > build locally. > > Regression running in Yarn Cluster mode against few internal ML ( logistic > regression, linear regression, random

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Sean Owen
Hm, if I read that right, looks like --num-executors doesn't work at all on YARN unless dynamic allocation is on? the fix is easy, but sounds like it could be a Blocker. On Fri, Nov 6, 2015 at 2:51 PM, Tom Graves wrote: > While running our regression tests I found >

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Marcelo Vanzin
The way I read Tom's report, it just affects a long-deprecated command line option (--num-workers). I wouldn't block the release for it. On Fri, Nov 6, 2015 at 12:10 PM, Sean Owen wrote: > Hm, if I read that right, looks like --num-executors doesn't work at > all on YARN

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Tom Graves
It's either --num-workers or --num-executors when using the spark-class interface directly. If you use spark-submit with --num-executors it ends up setting spark.executor.instances which works around the issue. Tom On Friday, November 6, 2015 2:14 PM, Marcelo Vanzin

GraphX EdgePartition format

2015-11-06 Thread Daniel Margo
I was looking through the GraphX source and noticed that the topology of an EdgePartition is a triplet of source, destination, and data columns -- essentially a COO sparse matrix -- sorted by source, and equipped with an index from each (global) vertex ID to the start of its (local) source
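
For readers unfamiliar with the layout described here, the following is a minimal Python sketch of a COO-style edge list sorted by source, with an index from each source vertex ID to the first position of its edges. It is not GraphX's actual code: the function names, the plain lists, and the dict-based index are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int, str]  # (source ID, destination ID, edge data)
Partition = Tuple[List[int], List[int], List[str], Dict[int, int]]


def build_partition(edges: List[Edge]) -> Partition:
    """Store edges as three parallel columns sorted by source ID, plus an
    index mapping each source ID to the offset of its first edge."""
    ordered = sorted(edges, key=lambda e: e[0])  # stable sort by source
    src = [e[0] for e in ordered]
    dst = [e[1] for e in ordered]
    data = [e[2] for e in ordered]
    index: Dict[int, int] = {}
    for pos, vid in enumerate(src):
        index.setdefault(vid, pos)  # keep only the first offset per source
    return src, dst, data, index


def edges_from(partition: Partition, vid: int) -> List[Tuple[int, str]]:
    """Return (destination, data) pairs for all edges leaving vid by
    jumping to its first offset and scanning while the source matches."""
    src, dst, data, index = partition
    pos = index.get(vid)
    out: List[Tuple[int, str]] = []
    if pos is None:
        return out
    while pos < len(src) and src[pos] == vid:
        out.append((dst[pos], data[pos]))
        pos += 1
    return out


sample = [(3, 1, "a"), (1, 2, "b"), (3, 4, "c"), (1, 5, "d")]
part = build_partition(sample)
print(edges_from(part, 3))
```

Because the columns are sorted by source, lookups by source vertex are a jump plus a short scan, while lookups by destination would still need a full pass over the partition.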

Ready to talk about Spark 2.0?

2015-11-06 Thread Sean Owen
Since branch-1.6 is cut, I was going to make version 1.7.0 in JIRA. However I've had a few side conversations recently about Spark 2.0, and I know I and others have a number of ideas about it already. I'll go ahead and make 1.7.0, but thought I'd ask, how much other interest is there in starting

Re: [BUILD SYSTEM] quick jenkins downtime, november 5th 7am

2015-11-06 Thread shane knapp
alright, i'm downgrading our ghprb plugin back to the last known working version. this will require a jenkins restart, which i will do immediately. sorry about this! :( On Fri, Nov 6, 2015 at 12:35 PM, shane knapp wrote: > a pox on the github pull request builder... the

Re: [BUILD SYSTEM] quick jenkins downtime, november 5th 7am

2015-11-06 Thread shane knapp
i (stupidly) updated the ghprb plugin as the version we're using is really, really old. this re-wrote the config and broke stuff. so, i just downgraded the plugin back to the last known working version, and noticed that some of the fields in the xml are missing. thankfully i have a backup

Re: [BUILD SYSTEM] quick jenkins downtime, november 5th 7am

2015-11-06 Thread shane knapp
ok, i think i've kicked jenkins enough that it's now working again w/o spamming tracebacks. sorry for the interruption... i should have realized that touching the house of cards (aka ghprb plugin) would cause it to fall down no matter what i did. :) shane ps - did i mention the hardware for