Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Burak Yavuz
+1 Tested on Mac OS X

Burak

On Thu, Jun 4, 2015 at 6:35 PM, Calvin Jia <jia.cal...@gmail.com> wrote:
> +1 Tested with input from Tachyon and persist off heap.
>
> On Thu, Jun 4, 2015 at 6:26 PM, Timothy Chen <tnac...@gmail.com> wrote:
>> +1 Been testing cluster mode and client mode with mesos with

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Krishna Sankar
+1 (non-binding, of course)

1. Compiled OSX 10.10 (Yosemite) OK. Total time: 25:42 min. (My brand new shiny MacBookPro12,1, 16GB. Inaugurated the machine with a compile test of 1.4.0-RC4!)
   mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests
2. Tested

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Reynold Xin
Enjoy your new shiny mbp.

On Fri, Jun 5, 2015 at 12:10 AM, Krishna Sankar <ksanka...@gmail.com> wrote:
> +1 (non-binding, of course)
> 1. Compiled OSX 10.10 (Yosemite) OK. Total time: 25:42 min. (My brand new shiny MacBookPro12,1, 16GB. Inaugurated the machine with a compile test of 1.4.0-RC4!)

Re: Regarding Connecting spark to Mesos documentation

2015-06-05 Thread François Garillot
The make-distribution script will indeed take Maven options. If you want to add this to the documentation, one possibility is to supplement the information in this file with a pull request:
https://github.com/apache/spark/blob/master/docs/running-on-mesos.md
You'll also find

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Sean Owen
Everything checks out again, and the tests pass for me on Ubuntu + Java 7 with '-Pyarn -Phadoop-2.6', except that I always get SparkSubmitSuite errors like ...

- success sanity check *** FAILED ***
  java.lang.RuntimeException: [download failed:

Re: PySpark on PyPi

2015-06-05 Thread Josh Rosen
This has been proposed before: https://issues.apache.org/jira/browse/SPARK-1267 There's currently tighter coupling between the Python and Java halves of PySpark than just requiring SPARK_HOME to be set; if we did this, I bet we'd run into tons of issues when users try to run a newer version of
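The coupling Josh describes is why a naive pip package breaks: the Python half must match the Spark/JVM half living under SPARK_HOME. A minimal sketch of a hypothetical shim illustrating the check a pip-installed pyspark would need (the RELEASE-file layout is an assumption based on a 1.4-era binary distribution; none of these names are actual pyspark APIs):

```python
import os


def find_pyspark(spark_home=None, expected_version=None):
    """Locate the pyspark sources shipped inside a Spark distribution.

    Returns the directories to prepend to sys.path. Raises if SPARK_HOME
    is unset/invalid, or if the distribution's version does not match the
    version this shim was built for -- the coupling problem: the Python
    half must match the JVM half.
    """
    spark_home = spark_home or os.environ.get("SPARK_HOME")
    if not spark_home or not os.path.isdir(spark_home):
        raise RuntimeError("SPARK_HOME is not set or does not exist")

    python_dir = os.path.join(spark_home, "python")
    if not os.path.isdir(python_dir):
        raise RuntimeError("no python/ dir under SPARK_HOME: %s" % spark_home)

    # Assumption for this sketch: a binary distribution records its version
    # in a top-level RELEASE file; a real shim would compare pyspark.version.
    if expected_version is not None:
        release = os.path.join(spark_home, "RELEASE")
        if os.path.isfile(release):
            with open(release) as f:
                if expected_version not in f.read():
                    raise RuntimeError(
                        "Spark under SPARK_HOME does not match "
                        "pyspark %s" % expected_version)
    return [python_dir]


# Typical use in the hypothetical shim:
# sys.path[:0] = find_pyspark(expected_version="1.4.0")
```

Even a shim this small shows the failure mode Josh predicts: a newer pip-installed Python half pointed at an older SPARK_HOME fails the version check rather than breaking at runtime.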

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Kousuke Saruta
+1 Built on Mac OS X with -Dhadoop.version=2.4.0 -Pyarn -Phive -Phive-thriftserver. Tested on YARN (cluster/client) on CentOS 7. The WebUI, including the DAG and Timeline views, also works.

On 2015/06/05 15:01, Burak Yavuz wrote:
> +1 Tested on Mac OS X
>
> Burak
>
> On Thu, Jun 4, 2015 at 6:35 PM, Calvin Jia

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Yin Huai
Sean, can you add -Phive -Phive-thriftserver and try those Hive tests?

Thanks,
Yin

On Fri, Jun 5, 2015 at 5:19 AM, Sean Owen <so...@cloudera.com> wrote:
> Everything checks out again, and the tests pass for me on Ubuntu + Java 7 with '-Pyarn -Phadoop-2.6', except that I always get

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Ram Sriharsha
+1, tested with Hadoop 2.6 / YARN on CentOS 6.5 after building with -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver, and ran a few SQL tests and the ML examples.

On Fri, Jun 5, 2015 at 10:55 AM, Hari Shreedharan <hshreedha...@cloudera.com> wrote:
> +1. Build looks good, ran a

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Bobby Chowdary
Not sure if it's a blocker, but there might be a minor issue with HiveContext; there is also a workaround.

Works:

    from pyspark.sql import HiveContext
    sqlContext = HiveContext(sc)
    df = sqlContext.sql("select * from test.test1")

Does not work:

    df = sqlContext.table("test.test1")

Py4JJavaError:

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Sandy Ryza
+1 (non-binding) Built from source and ran some jobs against a pseudo-distributed YARN cluster.

-Sandy

On Fri, Jun 5, 2015 at 11:05 AM, Ram Sriharsha <sriharsha@gmail.com> wrote:
> +1, tested with hadoop 2.6/ yarn on centos 6.5 after building w/ -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Hari Shreedharan
+1. Build looks good, ran a couple of apps on YARN.

Thanks,
Hari

On Fri, Jun 5, 2015 at 10:52 AM, Yin Huai <yh...@databricks.com> wrote:
> Sean, can you add -Phive -Phive-thriftserver and try those Hive tests?
> Thanks,
> Yin
>
> On Fri, Jun 5, 2015 at 5:19 AM, Sean Owen <so...@cloudera.com> wrote:

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Marcelo Vanzin
On Fri, Jun 5, 2015 at 5:19 AM, Sean Owen <so...@cloudera.com> wrote:
> - success sanity check *** FAILED ***
>   java.lang.RuntimeException: [download failed:
>   org.jboss.netty#netty;3.2.2.Final!netty.jar(bundle), download failed:
>   commons-net#commons-net;3.1!commons-net.jar] at

Scheduler question: stages with non-arithmetic numbering

2015-06-05 Thread Mike Hynes
Hi folks,

When I look at the output logs for an iterative Spark program, I see that the stage IDs are not arithmetically numbered---that is, there are gaps between stages, and I might find log information about stages 0, 1, 2, and 5, but not 3 or 4. As an example, the output from the Spark logs below

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Marcelo Vanzin
+1 (non-binding) Ran some of our internal test suite (YARN + standalone) against the hadoop-2.6 and without-hadoop binaries.

On Tue, Jun 2, 2015 at 8:53 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> Please vote on releasing the following candidate as Apache Spark version 1.4.0! The tag to

Re: PySpark on PyPi

2015-06-05 Thread Jey Kottalam
Couldn't we have a pip installable pyspark package that just serves as a shim to an existing Spark installation? Or it could even download the latest Spark binary if SPARK_HOME isn't set during installation. Right now, Spark doesn't play very well with the usual Python ecosystem. For example, why

Re: PySpark on PyPi

2015-06-05 Thread Olivier Girardot
Ok, I get it. Now, what can we do to improve the current situation? Because right now, if I want to set up a CI env for PySpark, I have to:
1. download a pre-built version of pyspark and unzip it somewhere on every agent
2. define the SPARK_HOME env var
3. symlink this distribution's pyspark dir inside
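Steps 2 and 3 of that manual setup can at least be scripted per agent. A minimal sketch, assuming step 1 (download and unzip) is already done and assuming the 1.4-era binary-distribution layout with a `python/pyspark` package inside; the function name is hypothetical, not an existing tool:

```python
import os


def wire_up_pyspark(spark_dist, site_packages):
    """Point SPARK_HOME at an unpacked Spark distribution (step 2) and
    symlink its bundled pyspark package into the environment's
    site-packages (step 3). Step 1 (download/unzip) is assumed done."""
    pyspark_pkg = os.path.join(spark_dist, "python", "pyspark")
    if not os.path.isdir(pyspark_pkg):
        raise RuntimeError("no pyspark package under %s" % spark_dist)

    os.environ["SPARK_HOME"] = spark_dist

    link = os.path.join(site_packages, "pyspark")
    if not os.path.exists(link):
        os.symlink(pyspark_pkg, link)
    return link
```

Caveat: a real setup also needs the bundled py4j zip (under `python/lib/` in the distribution) on PYTHONPATH; that part is omitted here for brevity.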