Re: The latest master branch didn't compile with -Phive?

2015-07-09 Thread Stephen Boesch
Please do a *clean* package and reply back if you still encounter issues. 2015-07-09 7:24 GMT-07:00 Yijie Shen henry.yijies...@gmail.com: Hi, I am using a clean checkout, freshly cloned from the master branch, built with: build/mvn -Phive -Phadoop-2.4 -DskipTests package and it ends in BUILD FAILURE,

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-09 Thread Sean Owen
+1 nonbinding. All previous RC issues appear resolved. All tests pass with the -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver invocation. Signatures et al are OK. On Thu, Jul 9, 2015 at 6:55 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache

Re: The latest master branch didn't compile with -Phive?

2015-07-09 Thread Ted Yu
Looking at https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.4,label=centos/2875/consoleFull : [error] [error] while compiling:

The latest master branch didn't compile with -Phive?

2015-07-09 Thread Yijie Shen
Hi, I am using a clean checkout, freshly cloned from the master branch, built with: build/mvn -Phive -Phadoop-2.4 -DskipTests package and it ends in BUILD FAILURE, due to: [error] while compiling: /Users/yijie/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala [error]

Re: The latest master branch didn't compile with -Phive?

2015-07-09 Thread Ted Yu
From https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.4,label=centos/2875/consoleFull : + build/mvn -DzincPort=3439 -DskipTests -Phadoop-2.4 -Pyarn -Phive -Phive-thriftserver -Pkinesis-asl clean package FYI On Thu, Jul 9, 2015 at 7:51 AM, Sean

Re: spark-ec2 Fails installing ganglia properly in 1.4

2015-07-09 Thread Shivaram Venkataraman
There are a couple of PRs open for it (linked from the JIRA) and I am reviewing https://github.com/mesos/spark-ec2/pull/121 -- Also, the EC2 fixes can be out of band from the release itself, so the fix will make it to 1.4.1 once the above PR is merged. Thanks Shivaram On Thu, Jul 9, 2015 at 9:21 AM,

spark-ec2 Fails installing ganglia properly in 1.4

2015-07-09 Thread Pradeep Bashyal
Hello, The ec2/spark-ec2 script fails to install ganglia properly in 1.4. The issue seems to be that an older version of httpd (2.2) is installed, while /etc/httpd/conf/httpd.conf is written for 2.4:
[ec2-user@ip-172-30-0-123 ~]$ httpd -v
Server version: Apache/2.2.29 (Unix)
Server built: Mar 12 2015 03:50:17
There is already

Re: The latest master branch didn't compile with -Phive?

2015-07-09 Thread Ted Yu
I guess the compilation issue didn't surface in QA run because sbt was used: [info] Building Spark (w/Hive 0.13.1) using SBT with these arguments: -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive-thriftserver -Phive package assembly/assembly streaming-kafka-assembly/assembly

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-09 Thread Mark Hamstra
+1 On Wed, Jul 8, 2015 at 10:55 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.4.1! This release fixes a handful of known issues in Spark 1.4.0, listed here: http://s.apache.org/spark-1.4.1 The tag to be voted on is

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-09 Thread Michael Armbrust
+1 On Thu, Jul 9, 2015 at 10:07 AM, Mark Hamstra m...@clearstorydata.com wrote: +1 On Wed, Jul 8, 2015 at 10:55 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.4.1! This release fixes a handful of known issues in

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-09 Thread Andrew Or
+1 2015-07-09 10:26 GMT-07:00 Michael Armbrust mich...@databricks.com: +1 On Thu, Jul 9, 2015 at 10:07 AM, Mark Hamstra m...@clearstorydata.com wrote: +1 On Wed, Jul 8, 2015 at 10:55 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-09 Thread York, Brennon
+1 (non-binding)
* ran spark-on-YARN MLLib ALS recommendation pipeline (success)
* no regression / performance issues
* ran spark-on-YARN GraphX pipeline (success)
* no regression / performance issues
On 7/8/15, 10:55 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing

Re: callJMethod?

2015-07-09 Thread Vasili I. Galchin
Now I want to look at the PySpark side for comparison. I assume it uses the same mechanism for remote function calls! Maybe that is in the slides. I assume there are multiple JVMs for load balancing and fault tolerance, yes? How can I get a single PDF with all the slides together, rather than a slide show? Vasili On Thu,

PySpark vs R

2015-07-09 Thread Vasili I. Galchin
Hello, Just trying to get up to speed (it has been a week, so pls be patient with me). I have been reading several docs, plus reading PySpark vs R code. I don't see an invariant between the Python and R implementations. Probably I should read the native Scala code, yes? Kind thx, Vasili

Re: Spark ThriftServer encounter java.lang.IllegalArgumentException: Unknown auth type: null Allowed values are: [auth-int, auth-conf, auth]

2015-07-09 Thread gogototo
I think this is a Hive 0.13.1 issue, which was fixed in Hive 0.14: https://issues.apache.org/jira/browse/HIVE-6741 Could you please release an artifact of org.spark-project.hive at version 0.14 or above? Thx very much!

Re: callJMethod?

2015-07-09 Thread Vasili I. Galchin
very nice explanation. Thx On Thu, Jul 9, 2015 at 4:41 PM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote: callJMethod is a private R function that is defined in https://github.com/apache/spark/blob/a0cc3e5aa3fcfd0fce6813c520152657d327aaf2/R/pkg/R/backend.R#L31 callJMethod serializes

Re: databases currently supported by Spark SQL JDBC

2015-07-09 Thread JaeSung Jun
As long as a JDBC driver is provided, any database can be used with the JDBC datasource provider. You can provide the driver class in the options field, as follows: CREATE TEMPORARY TABLE jdbcTable USING org.apache.spark.sql.jdbc OPTIONS( url jdbc:oracle:thin:@myhost:1521:orcl driver
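The statement in the message above is cut off, so the following is only a sketch of how such a statement might be assembled, not the author's full example. It renders the `OPTIONS` clause from a dictionary; the helper name `jdbc_table_ddl` is hypothetical, and the `driver` and `dbtable` values in the usage lines are assumed placeholders rather than values taken from the thread.

```python
def jdbc_table_ddl(table_name, options):
    """Render a CREATE TEMPORARY TABLE statement for the JDBC data source.

    `options` maps option names (for example url, dbtable, driver) to values.
    This helper is illustrative only and is not part of Spark.
    """
    lines = []
    for key, value in options.items():
        text = str(value)
        if '"' in text:
            # Keep the sketch simple: refuse values that would need escaping.
            raise ValueError("option value must not contain a double quote")
        lines.append('  {} "{}"'.format(key, text))
    return (
        "CREATE TEMPORARY TABLE {}\n"
        "USING org.apache.spark.sql.jdbc\n"
        "OPTIONS (\n{}\n)"
    ).format(table_name, ",\n".join(lines))


if __name__ == "__main__":
    print(jdbc_table_ddl("jdbcTable", {
        "url": "jdbc:oracle:thin:@myhost:1521:orcl",  # URL from the message
        "dbtable": "MYSCHEMA.MYTABLE",                # assumed placeholder
        "driver": "oracle.jdbc.OracleDriver",         # assumed driver class
    }))
```

The resulting string would be passed to the SQL context; the driver jar itself still has to be on the classpath, which is the "as long as a JDBC driver is provided" condition in the reply.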

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-09 Thread Patrick Wendell
+1 On Wed, Jul 8, 2015 at 10:55 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.4.1! This release fixes a handful of known issues in Spark 1.4.0, listed here: http://s.apache.org/spark-1.4.1 The tag to be voted on is

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-09 Thread Reynold Xin
+1 On Wed, Jul 8, 2015 at 11:58 PM, Patrick Wendell pwend...@gmail.com wrote: +1 On Wed, Jul 8, 2015 at 10:55 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.4.1! This release fixes a handful of known issues in

Spark and Haskell support

2015-07-09 Thread Vasili I. Galchin
Hello,
1) I have been rereading kind email responses to my Spark queries. Thx.
2) I have also been reading R code:
   1) RDD.R
   2) DataFrame.R
   3) All following API's = https://cwiki.apache.org/confluence/display/SPARK/Spark+Internals
   4) Python ...

Questions about Fault tolerance of Spark

2015-07-09 Thread 牛兆捷
Hi All: We already know that Spark uses lineage to recompute RDDs when a failure occurs. I want to study the performance of this fault-tolerant approach and have some questions about it. 1) Is there any benchmark (or standard failure model) to test the fault tolerance of these kinds of
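As background for the question, the idea of lineage-based recovery can be sketched with a toy model. This is not Spark's implementation: the class `ToyRDD` and its methods are hypothetical, it only models one-to-one (narrow) dependencies, and the `computed` counter is just there to make the recomputation cost visible.

```python
class ToyRDD:
    """A dataset that remembers how it was derived instead of replicating data."""

    def __init__(self, source=None, parent=None, transform=None):
        self.source = source        # list of partitions, only for the root dataset
        self.parent = parent        # lineage: the dataset this one was derived from
        self.transform = transform  # lineage: per-element function applied to parent
        self.cache = {}             # partition index -> materialized data
        self.computed = 0           # number of partition computations performed

    def get(self, index):
        """Return one partition, recomputing it from lineage if it is missing."""
        if index in self.cache:
            return self.cache[index]
        if self.parent is None:
            data = list(self.source[index])
        else:
            data = [self.transform(x) for x in self.parent.get(index)]
        self.computed += 1
        self.cache[index] = data
        return data

    def lose(self, index):
        """Simulate a failure that drops one materialized partition."""
        self.cache.pop(index, None)


def demo():
    base = ToyRDD(source=[[1, 2], [3, 4]])
    doubled = ToyRDD(parent=base, transform=lambda x: x * 2)
    shifted = ToyRDD(parent=doubled, transform=lambda x: x + 1)
    before = [shifted.get(0), shifted.get(1)]
    # Lose partition 1 at two levels of the lineage, then ask for it again.
    shifted.lose(1)
    doubled.lose(1)
    after = shifted.get(1)
    return before, after, (base.computed, doubled.computed, shifted.computed)
```

In this toy, only the lost partition is recomputed and the surviving ancestor partition is reused, which is the kind of cost a benchmark of this approach would presumably want to measure.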

databases currently supported by Spark SQL JDBC

2015-07-09 Thread Niranda Perera
Hi, I'm planning to use the Spark SQL JDBC datasource provider with various RDBMS databases. Which databases are currently supported by the Spark JDBC relation provider? rgds -- Niranda @n1r44 https://twitter.com/N1R44 https://pythagoreanscript.wordpress.com/

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-09 Thread Luciano Resende
+1 (non-binding) mostly looking at the legal aspects of the release. On Wed, Jul 8, 2015 at 10:55 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.4.1! This release fixes a handful of known issues in Spark 1.4.0, listed

jenkins downtime 7/13/15, 7am PDT

2015-07-09 Thread shane knapp
i'll be taking jenkins down for system and jenkins app updates. this should be pretty quick and i'm expecting to have everything back up and building by 9am. i will send a reminder email this weekend, and again when i start the maintenance. if there's any reason for me to delay this, please let

Are These Issues Suitable for our Senior Project?

2015-07-09 Thread emrehan
Hi all, We could contribute a feature to Spark MLlib by May 2016 and make it count as our undergraduate senior project. The following issues seem interesting to us: * https://issues.apache.org/jira/browse/SPARK-2273 – Online

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-09 Thread Holden Karau
+1 - compiled on ubuntu and centos, spark-perf run against yarn in client mode on a small cluster comparing 1.4.0 and 1.4.1 (for core) doesn't have any huge jumps (albeit with a small scaling factor). On Wed, Jul 8, 2015 at 11:58 PM, Patrick Wendell pwend...@gmail.com wrote: +1 On Wed, Jul 8, 2015

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-09 Thread Krishna Sankar
+1
1. Compiled OSX 10.10 (Yosemite) OK Total time: 38:11 min
   mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
2. Tested pyspark, mllib
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression OK
2.3. Decision Tree, Naive Bayes OK
2.4. KMeans OK Center And

Re: Are These Issues Suitable for our Senior Project?

2015-07-09 Thread Feynman Liang
Exciting, thanks for the contribution! I'm currently aware of:
- SPARK-8499 is currently in progress (in a duplicate issue); I updated the JIRA to reflect that.
- SPARK-5992 has a spark package http://spark-packages.org/package/mrsqueeze/spark-hash linked but I'm unclear on whether

Re: The latest master branch didn't compile with -Phive?

2015-07-09 Thread Josh Rosen
Jenkins runs compile-only builds for Maven as an early warning system for this type of issue; you can see from https://amplab.cs.berkeley.edu/jenkins/view/Spark-QA-Compile/ that the Maven compilation is now broken in master. On Thu, Jul 9, 2015 at 8:48 AM, Ted Yu yuzhih...@gmail.com wrote: I

Re: spark-ec2 Fails installing ganglia properly in 1.4

2015-07-09 Thread Pradeep Bashyal
Awesome! Thank you. On Thu, Jul 9, 2015 at 11:33 AM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote: There are a couple of PRs open for it (linked from the JIRA) and I am reviewing https://github.com/mesos/spark-ec2/pull/121 -- Also the EC2 fixes can be out of band from the release

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-09 Thread Burak Yavuz
+1 nonbinding. On Thu, Jul 9, 2015 at 7:38 AM, Sean Owen so...@cloudera.com wrote: +1 nonbinding. All previous RC issues appear resolved. All tests pass with the -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver invocation. Signatures et al are OK. On Thu, Jul 9, 2015 at 6:55 AM, Patrick

Re: Are These Issues Suitable for our Senior Project?

2015-07-09 Thread Joseph Bradley
It would be great to get more contributions! If you're new to contributing, it will be good to start with some small contributions and check out: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark But if those build up to a larger contribution, the top ones I'd pick out are:

Re: Are These Issues Suitable for our Senior Project?

2015-07-09 Thread Xiangrui Meng
Hi Emrehan, Thanks for asking! There are actually many TODOs for MLlib. I would recommend starting with small tasks before picking a topic for your senior project. Please check https://issues.apache.org/jira/browse/SPARK-8445 for the 1.5 roadmap and see whether there are ones you are interested

callJMethod?

2015-07-09 Thread Vasili I. Galchin
Hello, I am reading R code, e.g. RDD.R, DataFrame.R, etc. I see that callJMethod is called repeatedly. Is callJMethod part of the Spark Java API? Thx. Vasili

Re: The latest master branch didn't compile with -Phive?

2015-07-09 Thread Sean Owen
This is an error from scalac and not Spark. I find it happens frequently for me but goes away on a clean build. *shrug* On Thu, Jul 9, 2015 at 3:45 PM, Ted Yu yuzhih...@gmail.com wrote: Looking at

Re: callJMethod?

2015-07-09 Thread Shivaram Venkataraman
callJMethod is a private R function that is defined in https://github.com/apache/spark/blob/a0cc3e5aa3fcfd0fce6813c520152657d327aaf2/R/pkg/R/backend.R#L31 callJMethod serializes the function names, arguments and sends them over a socket to the JVM. This is the socket-based R to JVM bridge
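The general pattern described here (serialize a method name and its arguments, send them over a socket, read back the result) can be sketched in a few lines. This is a toy illustration only: the JSON length-prefixed framing, the `ToyBackend` class, and the `call_method` helper are assumptions made for the example and do not reflect SparkR's actual wire format or the code in backend.R.

```python
import json
import socket
import struct
import threading


def send_msg(sock, payload):
    """Send one message as a 4-byte big-endian length followed by JSON bytes."""
    data = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack(">I", len(data)) + data)


def recv_exact(sock, n):
    """Read exactly n bytes from the socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf += chunk
    return buf


def recv_msg(sock):
    """Read one length-prefixed JSON message."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


class ToyBackend:
    """Stand-in for the remote side: holds objects and invokes methods by name."""

    def __init__(self):
        self.objects = {"list0": [3, 1, 2]}

    def serve_once(self, sock):
        request = recv_msg(sock)
        target = self.objects[request["object_id"]]
        result = getattr(target, request["method"])(*request["args"])
        send_msg(sock, {"result": result})


def call_method(sock, object_id, method, *args):
    """Client side: serialize the call, send it, and wait for the reply."""
    send_msg(sock, {"object_id": object_id, "method": method, "args": list(args)})
    return recv_msg(sock)["result"]


def demo():
    client, server = socket.socketpair()
    worker = threading.Thread(target=ToyBackend().serve_once, args=(server,))
    worker.start()
    try:
        # Ask the backend how many times 1 occurs in its stored list.
        return call_method(client, "list0", "count", 1)
    finally:
        worker.join()
        client.close()
        server.close()
```

The client never holds the remote object itself, only an identifier for it, which is why a bridge of this kind can drive objects living in another process.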