spark git commit: [SPARK-20685] Fix BatchPythonEvaluation bug in case of single UDF w/ repeated arg.

2017-05-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.1 bdc08ab64 -> 92a71a667 [SPARK-20685] Fix BatchPythonEvaluation bug in case of single UDF w/ repeated arg. ## What changes were proposed in this pull request? There's a latent corner-case bug in PySpark UDF evaluation where executing

spark git commit: [SPARK-20685] Fix BatchPythonEvaluation bug in case of single UDF w/ repeated arg.

2017-05-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 86cef4df5 -> 3eb0ee06a [SPARK-20685] Fix BatchPythonEvaluation bug in case of single UDF w/ repeated arg. ## What changes were proposed in this pull request? There's a latent corner-case bug in PySpark UDF evaluation where executing

spark git commit: [SPARK-20685] Fix BatchPythonEvaluation bug in case of single UDF w/ repeated arg.

2017-05-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/master af8b6cc82 -> 8ddbc431d [SPARK-20685] Fix BatchPythonEvaluation bug in case of single UDF w/ repeated arg. ## What changes were proposed in this pull request? There's a latent corner-case bug in PySpark UDF evaluation where executing a
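For context, here is a minimal PySpark sketch of the scenario named in the title: a single Python UDF whose argument list repeats the same column. The column and UDF names are illustrative and not taken from the patch.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, udf
from pyspark.sql.types import LongType

spark = SparkSession.builder.getOrCreate()
df = spark.range(3)  # one bigint column named "id"

# One UDF, called with the same column passed for both arguments.
add = udf(lambda a, b: a + b, LongType())
df.select(add(col("id"), col("id"))).show()
```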

spark-website git commit: Trigger git sync

2017-05-10 Thread srowen
Repository: spark-website Updated Branches: refs/heads/asf-site 01e0279a0 -> c2c0905b4 Trigger git sync Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/c2c0905b Tree:

spark git commit: [SPARK-20689][PYSPARK] python doctest leaking bucketed table

2017-05-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 5c2c4dcce -> af8b6cc82 [SPARK-20689][PYSPARK] python doctest leaking bucketed table ## What changes were proposed in this pull request? It turns out the pyspark doctest is calling saveAsTable without ever dropping the resulting tables. Since we have
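A hedged sketch of the kind of cleanup this implies, assuming a PySpark build where DataFrameWriter.bucketBy is available; the table name is made up for illustration.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Write a bucketed table the way a doctest example might, then drop it
# explicitly so later tests do not inherit the leftover table.
(spark.range(10).write
    .bucketBy(4, "id")
    .sortBy("id")
    .mode("overwrite")
    .saveAsTable("bucketed_example"))

spark.sql("DROP TABLE IF EXISTS bucketed_example")
```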

spark git commit: [SPARK-19447] Remove remaining references to generated rows metric

2017-05-10 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.2 358516dcb -> 86cef4df5 [SPARK-19447] Remove remaining references to generated rows metric ## What changes were proposed in this pull request? https://github.com/apache/spark/commit/b486ffc86d8ad6c303321dcf8514afee723f61f8 left behind

spark git commit: [SPARK-19447] Remove remaining references to generated rows metric

2017-05-10 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master fcb88f921 -> 5c2c4dcce [SPARK-19447] Remove remaining references to generated rows metric ## What changes were proposed in this pull request? https://github.com/apache/spark/commit/b486ffc86d8ad6c303321dcf8514afee723f61f8 left behind

spark git commit: [MINOR][BUILD] Fix lint-java breaks.

2017-05-10 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.2 5f6029c75 -> 358516dcb [MINOR][BUILD] Fix lint-java breaks. ## What changes were proposed in this pull request? This PR proposes to fix the lint-breaks as below: ``` [ERROR] src/main/java/org/apache/spark/unsafe/Platform.java:[51]

spark git commit: [MINOR][BUILD] Fix lint-java breaks.

2017-05-10 Thread srowen
Repository: spark Updated Branches: refs/heads/master 76e4a5566 -> fcb88f921 [MINOR][BUILD] Fix lint-java breaks. ## What changes were proposed in this pull request? This PR proposes to fix the lint-breaks as below: ``` [ERROR] src/main/java/org/apache/spark/unsafe/Platform.java:[51]

spark git commit: [SPARK-20678][SQL] Ndv for columns not in filter condition should also be updated

2017-05-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.2 0851b6cfb -> 5f6029c75 [SPARK-20678][SQL] Ndv for columns not in filter condition should also be updated ## What changes were proposed in this pull request? In filter estimation, we update column stats for those columns in filter

spark git commit: [SPARK-20678][SQL] Ndv for columns not in filter condition should also be updated

2017-05-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 789bdbe3d -> 76e4a5566 [SPARK-20678][SQL] Ndv for columns not in filter condition should also be updated ## What changes were proposed in this pull request? In filter estimation, we update column stats for those columns in filter
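To illustrate the estimation being adjusted, here is a hedged sketch with a hypothetical table t and columns a and b: once a selective filter on a shrinks the estimated row count, the estimated number of distinct values (ndv) of b should shrink with it rather than keep its pre-filter value.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.conf.set("spark.sql.cbo.enabled", "true")

# Collect column-level statistics so filter estimation has ndv values to use.
spark.sql("ANALYZE TABLE t COMPUTE STATISTICS FOR COLUMNS a, b")

# After the change, the optimizer's estimate also caps the ndv of b
# (not only of a) at the estimated post-filter row count.
spark.sql("SELECT b FROM t WHERE a = 1").explain(True)
```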

spark git commit: [SPARK-20688][SQL] correctly check analysis for scalar sub-queries

2017-05-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.1 69786ea3a -> bdc08ab64 [SPARK-20688][SQL] correctly check analysis for scalar sub-queries In `CheckAnalysis`, we should call `checkAnalysis` for `ScalarSubquery` at the beginning, as later we will call `plan.output` which is invalid

spark git commit: [SPARK-20688][SQL] correctly check analysis for scalar sub-queries

2017-05-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.2 7597a522b -> 0851b6cfb [SPARK-20688][SQL] correctly check analysis for scalar sub-queries ## What changes were proposed in this pull request? In `CheckAnalysis`, we should call `checkAnalysis` for `ScalarSubquery` at the beginning,

spark git commit: [SPARK-20688][SQL] correctly check analysis for scalar sub-queries

2017-05-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/master b512233a4 -> 789bdbe3d [SPARK-20688][SQL] correctly check analysis for scalar sub-queries ## What changes were proposed in this pull request? In `CheckAnalysis`, we should call `checkAnalysis` for `ScalarSubquery` at the beginning, as
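For readers unfamiliar with the term, a scalar subquery is a subquery used as a single value inside an expression, as in the hedged example below (t1 and t2 are hypothetical). Running checkAnalysis on the subquery plan first means an invalid inner query surfaces as an ordinary AnalysisException instead of a later failure on plan.output.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# The parenthesized inner SELECT is a ScalarSubquery: it must resolve to a
# single value that is then used like any other expression in the outer query.
spark.sql("""
    SELECT id,
           (SELECT max(id) FROM t2) AS max_id_in_t2
    FROM t1
""")
```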

spark git commit: [SPARK-20393][WEB UI] Strengthen Spark to prevent XSS vulnerabilities

2017-05-10 Thread srowen
Repository: spark Updated Branches: refs/heads/master a4cbf26bc -> b512233a4 [SPARK-20393][WEB UI] Strengthen Spark to prevent XSS vulnerabilities ## What changes were proposed in this pull request? Add stripXSS and stripXSSMap to Spark Core's UIUtils. Calling these functions at any point
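As a generic illustration of the idea only (this is not the actual UIUtils.stripXSS implementation, just a sketch of sanitizing a user-controlled parameter before it reaches a rendered page):

```python
import html

def strip_xss(value):
    # Escape HTML metacharacters in a user-supplied request parameter so it
    # cannot inject markup or script into the page that echoes it back.
    return html.escape(value, quote=True) if value is not None else None

print(strip_xss('"><script>alert(1)</script>'))
```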

spark git commit: [SPARK-20637][CORE] Remove mention of old RDD classes from comments

2017-05-10 Thread srowen
Repository: spark Updated Branches: refs/heads/master ca4625e0e -> a4cbf26bc [SPARK-20637][CORE] Remove mention of old RDD classes from comments ## What changes were proposed in this pull request? A few comments around the code mention RDD classes that do not exist anymore. I'm not sure of

spark git commit: [SPARK-20630][WEB UI] Fixed column visibility in Executor Tab

2017-05-10 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.2 3ed2f4d51 -> 7597a522b [SPARK-20630][WEB UI] Fixed column visibility in Executor Tab ## What changes were proposed in this pull request? #14617 added new columns to the executor table, causing the visibility checks for the logs and

spark git commit: [SPARK-20630][WEB UI] Fixed column visibility in Executor Tab

2017-05-10 Thread srowen
Repository: spark Updated Branches: refs/heads/master 804949c6b -> ca4625e0e [SPARK-20630][WEB UI] Fixed column visibility in Executor Tab ## What changes were proposed in this pull request? #14617 added new columns to the executor table, causing the visibility checks for the logs and

spark git commit: [SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsistency should use values not Params

2017-05-10 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.0 46659974e -> d86dae8fe [SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsistency should use values not Params ## What changes were proposed in this pull request? - Replace `getParam` calls with `getOrDefault` calls. -

spark git commit: [SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsistency should use values not Params

2017-05-10 Thread yliang
Repository: spark Updated Branches: refs/heads/master 0ef16bd4b -> 804949c6b [SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsistency should use values not Params ## What changes were proposed in this pull request? - Replace `getParam` calls with `getOrDefault` calls. - Fix

spark git commit: [SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsistency should use values not Params

2017-05-10 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.1 8e097890a -> 69786ea3a [SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsistency should use values not Params ## What changes were proposed in this pull request? - Replace `getParam` calls with `getOrDefault` calls. -

spark git commit: [SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsistency should use values not Params

2017-05-10 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.2 ef50a9548 -> 3ed2f4d51 [SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsistency should use values not Params ## What changes were proposed in this pull request? - Replace `getParam` calls with `getOrDefault` calls. -
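The distinction the change relies on, sketched with PySpark's public Params API (the threshold value here is arbitrary): getParam returns the Param descriptor object, while getOrDefault returns the value actually set or its default, which is what a consistency check needs to compare.

```python
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.getOrCreate()
lr = LogisticRegression(threshold=0.5)

p = lr.getParam("threshold")        # a Param descriptor, not a number
v = lr.getOrDefault(lr.threshold)   # the configured value, 0.5

print(type(p).__name__, v)          # Param 0.5
```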

spark git commit: [SPARK-20668][SQL] Modify ScalaUDF to handle nullability.

2017-05-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/master a819dab66 -> 0ef16bd4b [SPARK-20668][SQL] Modify ScalaUDF to handle nullability. ## What changes were proposed in this pull request? When registering Scala UDF, we can know if the udf will return nullable value or not. `ScalaUDF` and

spark git commit: [SPARK-20686][SQL] PropagateEmptyRelation incorrectly handles aggregate without grouping

2017-05-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.1 50f28dfe4 -> 8e097890a [SPARK-20686][SQL] PropagateEmptyRelation incorrectly handles aggregate without grouping The query ``` SELECT 1 FROM (SELECT COUNT(*) WHERE FALSE) t1 ``` should return a single row of output because the

spark git commit: [SPARK-20670][ML] Simplify FPGrowth transform

2017-05-10 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master a90c5cd82 -> a819dab66 [SPARK-20670][ML] Simplify FPGrowth transform ## What changes were proposed in this pull request? jira: https://issues.apache.org/jira/browse/SPARK-20670 As suggested by Sean Owen in
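For context, a small usage sketch of the transform being simplified, with made-up data: FPGrowth mines frequent itemsets and association rules at fit time, and transform then appends a prediction column of consequent items implied by each row's items.

```python
from pyspark.sql import SparkSession
from pyspark.ml.fpm import FPGrowth

spark = SparkSession.builder.getOrCreate()
data = spark.createDataFrame(
    [(0, ["a", "b", "c"]), (1, ["a", "b"]), (2, ["a"])],
    ["id", "items"])

model = FPGrowth(itemsCol="items", minSupport=0.5, minConfidence=0.6).fit(data)
# transform adds a "prediction" column with items implied by matching rules.
model.transform(data).show(truncate=False)
```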

spark git commit: [SPARK-20686][SQL] PropagateEmptyRelation incorrectly handles aggregate without grouping

2017-05-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.2 7b6f3a118 -> ef50a9548 [SPARK-20686][SQL] PropagateEmptyRelation incorrectly handles aggregate without grouping ## What changes were proposed in this pull request? The query ``` SELECT 1 FROM (SELECT COUNT(*) WHERE FALSE) t1 ```

spark git commit: [SPARK-20686][SQL] PropagateEmptyRelation incorrectly handles aggregate without grouping

2017-05-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 3d2131ab4 -> a90c5cd82 [SPARK-20686][SQL] PropagateEmptyRelation incorrectly handles aggregate without grouping ## What changes were proposed in this pull request? The query ``` SELECT 1 FROM (SELECT COUNT(*) WHERE FALSE) t1 ``` should
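The query quoted in the commit message can be run directly; after the fix it returns a single row, because a global aggregate over an empty input still produces exactly one row (with COUNT(*) equal to 0), so the outer projection must not be optimized away to an empty relation.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Expected: one row containing 1, not an empty result.
spark.sql("SELECT 1 FROM (SELECT COUNT(*) WHERE FALSE) t1").show()
```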