Re: [VOTE] Release Apache Spark 1.6.0 (RC2)

2015-12-14 Thread Krishna Sankar
Guys, The sc.version gives 1.6.0-SNAPSHOT. Need to change to 1.6.0. Can you please verify? Cheers On Sat, Dec 12, 2015 at 9:39 AM, Michael Armbrust wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.6.0! > > The vote is open until

Re: [VOTE] Release Apache Spark 1.6.0 (RC2)

2015-12-14 Thread Ricardo Almeida
+1 (non binding) Tested our workloads on a standalone cluster: - Spark Core - Spark SQL - Spark MLlib - Python API On 12 December 2015 at 18:39, Michael Armbrust wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.6.0! > > The vote is

Dev Environment (again)

2015-12-14 Thread Al Pivonka
I've read through the mail archives and read the different threads. I believe there is a great deal of value in teaching others. I'm a 14-year veteran of Java and would like to contribute to different Spark projects. Here are my dilemmas: 1) How does one quickly get a working environment up and

Re: Maven build against Hadoop 2.4 times out

2015-12-14 Thread Ted Yu
Attached was the tail of test suite output from local run. I got test failure. FYI On Sun, Dec 13, 2015 at 10:03 PM, Yin Huai wrote: > Can you reproduce the problem in your local environment? Our 1.6 hadoop > 2.4 maven build looks pretty good. Since our 1.6 is pretty

BIRCH clustering algorithm

2015-12-14 Thread Dženan Softić
Hi, As part of the project, we are trying to create a parallel implementation of the BIRCH clustering algorithm [1]. We are mostly taking the idea of how to do it from this paper, which used CUDA to make BIRCH parallel [2]. ([2] is a short paper; only section 4 is relevant.) We would like to implement

Re: Problem using User Defined Predicate pushdown with core RDD and parquet - UDP class not found

2015-12-14 Thread chao chu
+spark user mailing list Hi there, I have exactly the same problem as mentioned below. My current workaround is to add the jar containing my UDP to one of the system classpaths (for example, put it under the same path as

Re: [VOTE] Release Apache Spark 1.6.0 (RC2)

2015-12-14 Thread Mark Hamstra
I'm afraid you're correct, Krishna: core/src/main/scala/org/apache/spark/package.scala: val SPARK_VERSION = "1.6.0-SNAPSHOT" docs/_config.yml:SPARK_VERSION: 1.6.0-SNAPSHOT On Mon, Dec 14, 2015 at 6:51 PM, Krishna Sankar wrote: > Guys, >The sc.version gives
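
A check of the kind described above can be scripted. The following is a minimal sketch, not part of the Spark release tooling: `find_snapshot_versions` and `SNAPSHOT_PATTERN` are hypothetical names, and the sample input is just the two lines quoted in the message, passed in as strings rather than read from a checkout.

```python
import re

# Hypothetical pattern: matches an X.Y.Z version that still carries -SNAPSHOT.
SNAPSHOT_PATTERN = re.compile(r"\b(\d+\.\d+\.\d+)-SNAPSHOT\b")


def find_snapshot_versions(files):
    """Return (path, line_number, version) for each line mentioning X.Y.Z-SNAPSHOT.

    `files` maps a path to that file's contents as a string.
    """
    hits = []
    for path, text in files.items():
        for line_number, line in enumerate(text.splitlines(), start=1):
            match = SNAPSHOT_PATTERN.search(line)
            if match:
                hits.append((path, line_number, match.group(1)))
    return hits


# Sample input: the two lines quoted in the message above.
sample = {
    "core/src/main/scala/org/apache/spark/package.scala":
        'val SPARK_VERSION = "1.6.0-SNAPSHOT"',
    "docs/_config.yml": "SPARK_VERSION: 1.6.0-SNAPSHOT",
}
hits = find_snapshot_versions(sample)
for path, line_number, version in hits:
    print(f"{path}:{line_number}: still at {version}-SNAPSHOT")
```

In a real checkout the dictionary would be filled from the files on disk; an empty result would suggest the version strings had been updated for the release.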

Re: Secondary Indexing of RDDs?

2015-12-14 Thread Nitin Goyal
Spark SQL's in-memory cache stores statistics per column, which in turn are used to skip batches (default size 10000) within a partition https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/ColumnStats.scala#L25 Hope this helps Thanks -Nitin On
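
The general idea of skipping batches from per-column statistics can be illustrated with a small sketch. This is not Spark's implementation: `Batch`, `build_batches`, and `scan_equal` are hypothetical names, only minimum and maximum are tracked, and the batch size of 10 is chosen just to keep the example small.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Batch:
    """One batch of a cached column plus its min/max statistics (hypothetical)."""
    values: List[int]
    lower: int
    upper: int


def build_batches(column: List[int], batch_size: int) -> List[Batch]:
    """Split a column into fixed-size batches and record min/max per batch."""
    batches = []
    for start in range(0, len(column), batch_size):
        chunk = column[start:start + batch_size]
        batches.append(Batch(chunk, min(chunk), max(chunk)))
    return batches


def scan_equal(batches: List[Batch], target: int) -> Tuple[List[int], int]:
    """Return the matching values and the number of batches actually scanned."""
    matches: List[int] = []
    scanned = 0
    for batch in batches:
        # The statistics prove the batch cannot contain the target: skip it.
        if target < batch.lower or target > batch.upper:
            continue
        scanned += 1
        matches.extend(v for v in batch.values if v == target)
    return matches, scanned


column = list(range(100))
batches = build_batches(column, batch_size=10)
matches, scanned = scan_equal(batches, 42)
print(matches, scanned, len(batches))
```

With sorted or clustered data most batches are ruled out by their statistics alone; with randomly ordered data the ranges overlap and far fewer batches can be skipped, which is why this acts as a coarse filter rather than a true secondary index.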

Re: [VOTE] Release Apache Spark 1.6.0 (RC2)

2015-12-14 Thread Sean Owen
With Java 7 / Ubuntu 15, and "-Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver", I still see the Docker tests fail every time. Is anyone else seeing them fail (or running them)? The Hive CliSuite also fails (stack trace at the bottom). Same deal -- if people are running this test and it's not

Re: [build system] brief downtime right now

2015-12-14 Thread shane knapp
something is up w/apache. looking. On Mon, Dec 14, 2015 at 11:37 AM, shane knapp wrote: > after killing and restarting jenkins, things seem to be VERY slow. > i'm gonna kick jenkins again and see if that helps. > > > > On Mon, Dec 14, 2015 at 11:26 AM, shane knapp

Re: [build system] brief downtime right now

2015-12-14 Thread shane knapp
...and we're back. we were getting reverse proxy timeouts, which seem to have been caused by jenkins churning and doing a lot of IO. i'll dig in to the logs and see if i can find out what happened. weird. shane On Mon, Dec 14, 2015 at 11:51 AM, shane knapp wrote: >

Re: Maven build against Hadoop 2.4 times out

2015-12-14 Thread shane knapp
++joshrosen This Is Known[tm], and we have a bug open against it: https://issues.apache.org/jira/browse/SPARK-11823 On Mon, Dec 14, 2015 at 7:42 AM, Ted Yu wrote: > Attached was the tail of test suite output from local run. > I got test failure. > > FYI > > On Sun, Dec 13,

SparkML algos limitations question.

2015-12-14 Thread Eugene Morozov
Hello! I'm currently working on a POC and trying to use Random Forest (classification and regression). I also have to check SVM and Multiclass perceptron (other algos are less important at the moment). So far I've discovered that Random Forest has a limitation of maxDepth for trees and just out of

Re: [VOTE] Release Apache Spark 1.6.0 (RC2)

2015-12-14 Thread Andrew Or
+1 Ran PageRank on standalone mode with 4 nodes and noticed a speedup after the specific commits that were in RC2 but not RC1: c247b6a Dec 10 [SPARK-12155][SPARK-12253] Fix executor OOM in unified memory management 05e441e Dec 9 [SPARK-12165][SPARK-12189] Fix bugs in eviction of storage memory

Re: [VOTE] Release Apache Spark 1.6.0 (RC2)

2015-12-14 Thread Kousuke Saruta
+1 (non-binding) Tested some workloads using the basic API and the DataFrame API on my 4-node YARN cluster (1 master and 3 slaves). I also tested the Web UI. (I'm resending this mail just in case because it seems that I failed to send the mail to dev@) On 2015/12/13 2:39, Michael Armbrust wrote:

Re: [build system] brief downtime right now

2015-12-14 Thread shane knapp
ok, we're back up and building. On Mon, Dec 14, 2015 at 10:31 AM, shane knapp wrote: > last week i forgot to downgrade R to 3.1.1, and since there's not much > activity right now, i'm going to take jenkins down and finish up the > ticket. > >

Re: [VOTE] Release Apache Spark 1.6.0 (RC2)

2015-12-14 Thread Michael Armbrust
Here is a fixed version of the docs for 1.6: http://people.apache.org/~pwendell/spark-releases/spark-1.6.0-rc2-docsfixed-docs There still might be some minor rendering issues on the ML page, but people are investigating. On Sat, Dec 12, 2015 at 6:58 PM, Burak Yavuz wrote: >

[build system] brief downtime right now

2015-12-14 Thread shane knapp
last week i forgot to downgrade R to 3.1.1, and since there's not much activity right now, i'm going to take jenkins down and finish up the ticket. https://issues.apache.org/jira/browse/SPARK-11255 we should be back up and running within 30 minutes. thanks! shane

Re: [VOTE] Release Apache Spark 1.6.0 (RC2)

2015-12-14 Thread Reynold Xin
+1 Tested some dataframe operations on my Mac. On Saturday, December 12, 2015, Michael Armbrust wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.6.0! > > The vote is open until Tuesday, December 15, 2015 at 6:00 UTC and passes > if a

Re: [SparkR] Any reason why saveDF's mode is append by default ?

2015-12-14 Thread Jeff Zhang
Thanks Shivaram, created https://issues.apache.org/jira/browse/SPARK-12318 I will work on it. On Mon, Dec 14, 2015 at 4:13 PM, Shivaram Venkataraman < shiva...@eecs.berkeley.edu> wrote: > I think it's just a bug -- I think we originally followed the Python > API (in the original PR [1]) but the

Re: [build system] brief downtime right now

2015-12-14 Thread Yin Huai
Hi Shane, Seems Spark's lint-r started to fail from https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/Spark-Master-SBT/4260/AMPLAB_JENKINS_BUILD_PROFILE=hadoop1.0,label=spark-test/console. Is it related to the upgrade work of R? Thanks, Yin On Mon, Dec 14, 2015 at

Re: [build system] brief downtime right now

2015-12-14 Thread shane knapp
that looks like the lintr checks failed, causing the build to fail. On Mon, Dec 14, 2015 at 3:05 PM, Yin Huai wrote: > Hi Shane, > > Seems Spark's lint-r started to fail from >