spark git commit: [HOTFIX] Fix build break.

2016-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 7c8a399a2 -> 980db2bd4 [HOTFIX] Fix build break. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/980db2bd Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/98

spark git commit: [SPARK-16489][SQL] Guard against variable reuse mistakes in expression code generation

2016-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 d1c992fea -> 7c8a399a2 [SPARK-16489][SQL] Guard against variable reuse mistakes in expression code generation In code generation, it is incorrect for expressions to reuse variable names across different instances of themselves. As an exam

spark git commit: [SPARK-16488] Fix codegen variable namespace collision in pmod and partitionBy

2016-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 9808735e0 -> d1c992fea [SPARK-16488] Fix codegen variable namespace collision in pmod and partitionBy This patch fixes a variable namespace collision bug in pmod and partitionBy. Regression test for one possible occurrence. A more gener

spark git commit: [SPARK-16514][SQL] Fix various regex codegen bugs

2016-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 702178d1f -> 9808735e0 [SPARK-16514][SQL] Fix various regex codegen bugs ## What changes were proposed in this pull request? RegexExtract and RegexReplace currently crash on non-nullable input due to use of a hard-coded local variable na

spark git commit: [SPARK-16514][SQL] Fix various regex codegen bugs

2016-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 4303d292b -> 41df62c59 [SPARK-16514][SQL] Fix various regex codegen bugs ## What changes were proposed in this pull request? RegexExtract and RegexReplace currently crash on non-nullable input due to use of a hard-coded local variable na

spark git commit: [SPARK-16514][SQL] Fix various regex codegen bugs

2016-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master 56bd399a8 -> 1c58fa905 [SPARK-16514][SQL] Fix various regex codegen bugs ## What changes were proposed in this pull request? RegexExtract and RegexReplace currently crash on non-nullable input due to use of a hard-coded local variable name (

spark git commit: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-12 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.0 2f47b3778 -> 4303d292b [SPARK-16284][SQL] Implement reflect SQL function ## What changes were proposed in this pull request? This patch implements reflect SQL function, which can be used to invoke a Java method in SQL. Slightly differe

spark git commit: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-12 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 7f968867f -> 56bd399a8 [SPARK-16284][SQL] Implement reflect SQL function ## What changes were proposed in this pull request? This patch implements reflect SQL function, which can be used to invoke a Java method in SQL. Slightly different f

spark git commit: [SPARK-16119][SQL] Support PURGE option to drop table / partition.

2016-07-12 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 68df47aca -> 7f968867f [SPARK-16119][SQL] Support PURGE option to drop table / partition. This option is used by Hive to directly delete the files instead of moving them to the trash. This is needed in certain configurations where moving th

spark git commit: [SPARK-16414][YARN] Fix bugs for "Can not get user config when calling SparkHadoopUtil.get.conf on yarn cluser mode"

2016-07-12 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-2.0 f41947654 -> 2f47b3778 [SPARK-16414][YARN] Fix bugs for "Can not get user config when calling SparkHadoopUtil.get.conf on yarn cluser mode" ## What changes were proposed in this pull request? The `SparkHadoopUtil` singleton was instan

spark git commit: [SPARK-16405] Add metrics and source for external shuffle service

2016-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master d513c99c1 -> 68df47aca [SPARK-16405] Add metrics and source for external shuffle service ## What changes were proposed in this pull request? Since externalShuffleService is essential for spark, better monitoring for shuffle service is nec

spark git commit: [SPARK-16414][YARN] Fix bugs for "Can not get user config when calling SparkHadoopUtil.get.conf on yarn cluser mode"

2016-07-12 Thread vanzin
Repository: spark Updated Branches: refs/heads/master c377e49e3 -> d513c99c1 [SPARK-16414][YARN] Fix bugs for "Can not get user config when calling SparkHadoopUtil.get.conf on yarn cluser mode" ## What changes were proposed in this pull request? The `SparkHadoopUtil` singleton was instantiat

spark git commit: [SPARK-16489][SQL] Guard against variable reuse mistakes in expression code generation

2016-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 7b63e7d92 -> f41947654 [SPARK-16489][SQL] Guard against variable reuse mistakes in expression code generation In code generation, it is incorrect for expressions to reuse variable names across different instances of themselves. As an exam

spark git commit: [SPARK-16489][SQL] Guard against variable reuse mistakes in expression code generation

2016-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master 5ad68ba5c -> c377e49e3 [SPARK-16489][SQL] Guard against variable reuse mistakes in expression code generation ## What changes were proposed in this pull request? In code generation, it is incorrect for expressions to reuse variable names

spark git commit: [SPARK-15752][SQL] Optimize metadata only query that has an aggregate whose children are deterministic project or filter operators.

2016-07-12 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 6cb75db9a -> 5ad68ba5c [SPARK-15752][SQL] Optimize metadata only query that has an aggregate whose children are deterministic project or filter operators. ## What changes were proposed in this pull request? When a query only uses metadata (ex

spark git commit: [SPARK-16470][ML][OPTIMIZER] Check linear regression training whether actually reach convergence and add warning if not

2016-07-12 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.0 9e0d2e226 -> 7b63e7d92 [SPARK-16470][ML][OPTIMIZER] Check linear regression training whether actually reach convergence and add warning if not ## What changes were proposed in this pull request? In `ml.regression.LinearRegression`, it

spark git commit: [SPARK-16470][ML][OPTIMIZER] Check linear regression training whether actually reach convergence and add warning if not

2016-07-12 Thread srowen
Repository: spark Updated Branches: refs/heads/master 5b28e0258 -> 6cb75db9a [SPARK-16470][ML][OPTIMIZER] Check linear regression training whether actually reach convergence and add warning if not ## What changes were proposed in this pull request? In `ml.regression.LinearRegression`, it use

spark git commit: [SPARK-16189][SQL] Add ExternalRDD logical plan for input with RDD to have a chance to eliminate serialize/deserialize.

2016-07-12 Thread wenchen
Repository: spark Updated Branches: refs/heads/master fc11c509e -> 5b28e0258 [SPARK-16189][SQL] Add ExternalRDD logical plan for input with RDD to have a chance to eliminate serialize/deserialize. ## What changes were proposed in this pull request? Currently the input `RDD` of `Dataset` is a

spark git commit: [MINOR][ML] update comment where is inconsistent with code in ml.regression.LinearRegression

2016-07-12 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.0 689261465 -> 9e0d2e226 [MINOR][ML] update comment where is inconsistent with code in ml.regression.LinearRegression ## What changes were proposed in this pull request? In the `train` method of `ml.regression.LinearRegression` when handlin

spark git commit: [MINOR][ML] update comment where is inconsistent with code in ml.regression.LinearRegression

2016-07-12 Thread srowen
Repository: spark Updated Branches: refs/heads/master c9a676215 -> fc11c509e [MINOR][ML] update comment where is inconsistent with code in ml.regression.LinearRegression ## What changes were proposed in this pull request? In the `train` method of `ml.regression.LinearRegression` when handling si