spark git commit: [SPARK-13787][ML][PYSPARK] Pyspark feature importances for decision tree and random forest

2016-03-10 Thread mlnick
Repository: spark Updated Branches: refs/heads/master 0b713e045 -> 234f781ae [SPARK-13787][ML][PYSPARK] Pyspark feature importances for decision tree and random forest ## What changes were proposed in this pull request? This patch adds a `featureImportance` property to the Pyspark API for

spark git commit: [SPARK-13512][ML] add example and doc for MaxAbsScaler

2016-03-10 Thread mlnick
Repository: spark Updated Branches: refs/heads/master 6ca990fb3 -> 0b713e045 [SPARK-13512][ML] add example and doc for MaxAbsScaler ## What changes were proposed in this pull request? jira: https://issues.apache.org/jira/browse/SPARK-13512 Add example and doc for ml.feature.MaxAbsScaler. ##

spark git commit: [SPARK-13294][PROJECT INFRA] Remove MiMa's dependency on spark-class / Spark assembly

2016-03-10 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master d18276cb1 -> 6ca990fb3 [SPARK-13294][PROJECT INFRA] Remove MiMa's dependency on spark-class / Spark assembly This patch removes the need to build a full Spark assembly before running the `dev/mima` script. - I modified the `tools`

spark git commit: [SPARK-13672][ML] Add python examples of BisectingKMeans in ML and MLLIB

2016-03-10 Thread mlnick
Repository: spark Updated Branches: refs/heads/master e33bc67c8 -> d18276cb1 [SPARK-13672][ML] Add python examples of BisectingKMeans in ML and MLLIB JIRA: https://issues.apache.org/jira/browse/SPARK-13672 ## What changes were proposed in this pull request? add two python examples of

spark git commit: [MINOR][CORE] Fix a duplicate "and" in a log message.

2016-03-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 74c4e2651 -> e33bc67c8 [MINOR][CORE] Fix a duplicate "and" in a log message. Author: Marcelo Vanzin Closes #11642 from vanzin/spark-conf-typo. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [HOT-FIX] fix compile

2016-03-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 6871cc8f3 -> 74c4e2651 [HOT-FIX] fix compile Fix the compilation failure introduced by https://github.com/apache/spark/pull/11555 because of a merge conflict. Author: Wenchen Fan Closes #11648 from

spark git commit: [SPARK-12718][SPARK-13720][SQL] SQL generation support for window functions

2016-03-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 560489f4e -> 6871cc8f3 [SPARK-12718][SPARK-13720][SQL] SQL generation support for window functions ## What changes were proposed in this pull request? Add SQL generation support for window functions. The idea is simple, just treat

spark git commit: [SPARK-13732][SPARK-13797][SQL] Remove projectList from Window and Eliminate useless Window

2016-03-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 4d535d1f1 -> 560489f4e [SPARK-13732][SPARK-13797][SQL] Remove projectList from Window and Eliminate useless Window What changes were proposed in this pull request? `projectList` is useless. Its value is always the same as the

spark git commit: [SPARK-13389][SPARKR] SparkR support first/last with ignore NAs

2016-03-10 Thread shivaram
Repository: spark Updated Branches: refs/heads/master c3a6269ca -> 4d535d1f1 [SPARK-13389][SPARKR] SparkR support first/last with ignore NAs ## What changes were proposed in this pull request? SparkR support first/last with ignore NAs cc sun-rui felixcheung shivaram ## How was this

spark git commit: [SPARK-13789] Infer additional constraints from attribute equality

2016-03-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 416e71af4 -> c3a6269ca [SPARK-13789] Infer additional constraints from attribute equality ## What changes were proposed in this pull request? This PR adds support for inferring an additional set of data constraints based on attribute
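
The idea named in the commit title can be illustrated with a toy model. The sketch below is not Catalyst code: the tuple encoding of constraints and the function name `infer_from_equalities` are assumptions made purely for illustration. It shows how an equality such as `a = b` lets a constraint on `a` be restated for `b`.

```python
def infer_from_equalities(constraints):
    """Toy constraint inference (hypothetical encoding, not Spark's).

    Each constraint is a tuple:
      ("eq", left_attr, right_attr)   for left_attr = right_attr
      (op, attr, value)               for any other predicate, e.g. (">", "a", 5)
    Returns only the newly inferred constraints.
    """
    equalities = [c for c in constraints if c[0] == "eq"]
    others = [c for c in constraints if c[0] != "eq"]

    inferred = set()
    for _, left, right in equalities:
        for op, attr, value in others:
            # Substitute one side of the equality for the other.
            if attr == left:
                inferred.add((op, right, value))
            elif attr == right:
                inferred.add((op, left, value))

    # Drop anything that was already known.
    return inferred - set(constraints)


known = {("eq", "a", "b"), (">", "a", 5)}
new_constraints = infer_from_equalities(known)
print(new_constraints)
```

This is a single substitution pass; chains of equalities (`a = b`, `b = c`) would need the pass to be repeated until no new constraints appear.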

spark git commit: [SPARK-13327][SPARKR] Added parameter validations for colnames<-

2016-03-10 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.6 078c71466 -> db4795a7e [SPARK-13327][SPARKR] Added parameter validations for colnames<- Author: Oscar D. Lara Yejas Author: Oscar D. Lara Yejas Closes #11220 from

spark git commit: [SPARK-13327][SPARKR] Added parameter validations for colnames<-

2016-03-10 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 88fa86662 -> 416e71af4 [SPARK-13327][SPARKR] Added parameter validations for colnames<- Author: Oscar D. Lara Yejas Author: Oscar D. Lara Yejas Closes #11220 from

spark git commit: [MINOR][DOC] Fix supported hive version in doc

2016-03-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 07ace27cb -> 078c71466 [MINOR][DOC] Fix supported hive version in doc ## What changes were proposed in this pull request? Today, Spark 1.6.1 and the updated docs were released. Unfortunately, there is obsolete Hive version information on

spark git commit: [MINOR][DOC] Fix supported hive version in doc

2016-03-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1d542785b -> 88fa86662 [MINOR][DOC] Fix supported hive version in doc ## What changes were proposed in this pull request? Today, Spark 1.6.1 and the updated docs were released. Unfortunately, there is obsolete Hive version information in the docs:

[2/4] spark git commit: [SPARK-13244][SQL] Migrates DataFrame to Dataset

2016-03-10 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark/blob/1d542785/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects.scala -- diff --git

[3/4] spark git commit: [SPARK-13244][SQL] Migrates DataFrame to Dataset

2016-03-10 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark/blob/1d542785/examples/src/main/java/org/apache/spark/examples/ml/JavaSimpleParamsExample.java -- diff --git

[1/4] spark git commit: [SPARK-13244][SQL] Migrates DataFrame to Dataset

2016-03-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 27fe6bacc -> 1d542785b http://git-wip-us.apache.org/repos/asf/spark/blob/1d542785/sql/core/src/test/java/test/org/apache/spark/sql/JavaDataFrameSuite.java -- diff --git

[4/4] spark git commit: [SPARK-13244][SQL] Migrates DataFrame to Dataset

2016-03-10 Thread yhuai
[SPARK-13244][SQL] Migrates DataFrame to Dataset ## What changes were proposed in this pull request? This PR unifies DataFrame and Dataset by migrating existing DataFrame operations to Dataset and making `DataFrame` a type alias of `Dataset[Row]`. Most Scala code changes are source compatible,

spark git commit: [SPARK-13604][CORE] Sync worker's state after registering with master

2016-03-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 020ff8cd3 -> 27fe6bacc [SPARK-13604][CORE] Sync worker's state after registering with master ## What changes were proposed in this pull request? Listed here are all cases in which the Master cannot talk with a Worker for a while and then the network is

spark git commit: [SPARK-13751] [SQL] generate better code for Filter

2016-03-10 Thread davies
Repository: spark Updated Branches: refs/heads/master 91fed8e9c -> 020ff8cd3 [SPARK-13751] [SQL] generate better code for Filter ## What changes were proposed in this pull request? This PR improves the codegen of Filter by: 1. filtering out rows early if they contain a null value that will

[2/2] spark git commit: [SPARK-13696] Remove BlockStore class & simplify interfaces of mem. & disk stores

2016-03-10 Thread andrewor14
[SPARK-13696] Remove BlockStore class & simplify interfaces of mem. & disk stores Today, both the MemoryStore and DiskStore implement a common `BlockStore` API, but I feel that this API is inappropriate because it abstracts away important distinctions between the behavior of these two stores.

[1/2] spark git commit: [SPARK-13696] Remove BlockStore class & simplify interfaces of mem. & disk stores

2016-03-10 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 3d2b6f56e -> 81d48532d http://git-wip-us.apache.org/repos/asf/spark/blob/81d48532/core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala -- diff --git

spark git commit: [SPARK-13790] Speed up ColumnVector's getDecimal

2016-03-10 Thread davies
Repository: spark Updated Branches: refs/heads/master 19f4ac6dc -> 747d2f538 [SPARK-13790] Speed up ColumnVector's getDecimal ## What changes were proposed in this pull request? We should reuse an object similar to the other non-primitive type getters. For a query that computes averages over

spark git commit: [SPARK-13759][SQL] Add IsNotNull constraints for expressions with an inequality

2016-03-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 235f4ac6f -> 19f4ac6dc [SPARK-13759][SQL] Add IsNotNull constraints for expressions with an inequality ## What changes were proposed in this pull request? This PR adds support for inferring `IsNotNull` constraints from expressions with

svn commit: r1734450 - in /spark: ./ _layouts/ js/ news/_posts/ releases/_posts/ site/ site/docs/ site/docs/1.6.1/ site/docs/1.6.1/api/ site/docs/1.6.1/api/R/ site/docs/1.6.1/api/java/ site/docs/1.6.1

2016-03-10 Thread marmbrus
Author: marmbrus Date: Thu Mar 10 19:28:30 2016 New Revision: 1734450 URL: http://svn.apache.org/viewvc?rev=1734450&view=rev Log: Release Spark 1.6.1 [This commit notification would consist of 933 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]

spark git commit: [SPARK-13727][CORE] SparkConf.contains does not consider deprecated keys

2016-03-10 Thread vanzin
Repository: spark Updated Branches: refs/heads/master d24801ad2 -> 235f4ac6f [SPARK-13727][CORE] SparkConf.contains does not consider deprecated keys The contains() method does not return consistently with get() if the key is deprecated. For example, import org.apache.spark.SparkConf val
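
The commit's own example is cut off above, so what follows is only a toy model of the inconsistency it describes, not Spark's `SparkConf`. The class `ToyConf`, the key names, and the `ALTERNATES` table are all hypothetical. The point it sketches is that `contains()` and `get()` agree once both go through the same deprecated-key resolution.

```python
class ToyConf:
    """Toy key/value config with deprecated-key aliases (not Spark's SparkConf)."""

    # Hypothetical table: current key -> older, deprecated spellings.
    ALTERNATES = {"app.new.timeout": ["app.old.timeout"]}

    def __init__(self):
        self._settings = {}

    def set(self, key, value):
        self._settings[key] = value
        return self

    def _resolve(self, key):
        """Return the stored key that satisfies a lookup for `key`, or None."""
        if key in self._settings:
            return key
        for old_key in self.ALTERNATES.get(key, []):
            if old_key in self._settings:
                return old_key
        return None

    def get(self, key, default=None):
        found = self._resolve(key)
        return self._settings[found] if found is not None else default

    def contains(self, key):
        # Uses the same resolution path as get(), so the two cannot disagree.
        return self._resolve(key) is not None


# Only the deprecated spelling is set; both lookups still succeed.
conf = ToyConf().set("app.old.timeout", "30s")
print(conf.get("app.new.timeout"), conf.contains("app.new.timeout"))
```

A `contains()` that checked only `key in self._settings` would return `False` here while `get()` returned a value, which is the kind of mismatch the commit title refers to.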

svn commit: r12718 - /dev/spark/spark-1.6.1-rc1/ /release/spark/spark-1.6.1/

2016-03-10 Thread marmbrus
Author: marmbrus Date: Thu Mar 10 19:14:45 2016 New Revision: 12718 Log: Release Spark 1.6.1 Added: release/spark/spark-1.6.1/ - copied from r12717, dev/spark/spark-1.6.1-rc1/ Removed: dev/spark/spark-1.6.1-rc1/

svn commit: r12717 - /dev/spark/spark-1.6.1-rc1/

2016-03-10 Thread marmbrus
Author: marmbrus Date: Thu Mar 10 19:10:54 2016 New Revision: 12717 Log: Add spark-1.6.1-rc1 Added: dev/spark/spark-1.6.1-rc1/ dev/spark/spark-1.6.1-rc1/spark-1.6.1-bin-cdh4.tgz (with props) dev/spark/spark-1.6.1-rc1/spark-1.6.1-bin-cdh4.tgz.asc

spark git commit: [SPARK-13636] [SQL] Directly consume UnsafeRow in wholestage codegen plans

2016-03-10 Thread davies
Repository: spark Updated Branches: refs/heads/master 74267beb3 -> d24801ad2 [SPARK-13636] [SQL] Directly consume UnsafeRow in wholestage codegen plans JIRA: https://issues.apache.org/jira/browse/SPARK-13636 ## What changes were proposed in this pull request? As shown in the wholestage

spark git commit: [SPARK-13758][STREAMING][CORE] enhance exception message to avoid misleading

2016-03-10 Thread srowen
Repository: spark Updated Branches: refs/heads/master 927e22eff -> 74267beb3 [SPARK-13758][STREAMING][CORE] enhance exception message to avoid misleading We have a recoverable Spark Streaming job with checkpointing enabled. It executes correctly the first time, but throws the following

spark git commit: [SPARK-13663][CORE] Upgrade Snappy Java to 1.1.2.1

2016-03-10 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 60cb27040 -> 07ace27cb [SPARK-13663][CORE] Upgrade Snappy Java to 1.1.2.1 Update snappy to 1.1.2.1 to pull in a single fix -- the OOM fix we already worked around. Supersedes https://github.com/apache/spark/pull/11524 Jenkins tests.

spark git commit: [SPARK-13663][CORE] Upgrade Snappy Java to 1.1.2.1

2016-03-10 Thread srowen
Repository: spark Updated Branches: refs/heads/master 9fe38aba1 -> 927e22eff [SPARK-13663][CORE] Upgrade Snappy Java to 1.1.2.1 ## What changes were proposed in this pull request? Update snappy to 1.1.2.1 to pull in a single fix -- the OOM fix we already worked around. Supersedes

spark git commit: [SPARK-11108][ML] OneHotEncoder should support other numeric types

2016-03-10 Thread mlnick
Repository: spark Updated Branches: refs/heads/master 9525c563d -> 9fe38aba1 [SPARK-11108][ML] OneHotEncoder should support other numeric types Adding support for other numeric types: * Integer * Short * Long * Float * Decimal Author: sethah Closes #9777 from