spark git commit: [SPARK-18849][ML][SPARKR][DOC] vignettes final check update

2016-12-14 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.1 d399a297d -> 2a8de2e11 [SPARK-18849][ML][SPARKR][DOC] vignettes final check update ## What changes were proposed in this pull request? doc cleanup ## How was this patch tested? ~~vignettes are not building for me. I'm going to kick

spark git commit: [SPARK-18849][ML][SPARKR][DOC] vignettes final check update

2016-12-14 Thread shivaram
Repository: spark Updated Branches: refs/heads/master ec0eae486 -> 7d858bc5c [SPARK-18849][ML][SPARKR][DOC] vignettes final check update ## What changes were proposed in this pull request? doc cleanup ## How was this patch tested? ~~vignettes are not building for me. I'm going to kick off a

spark git commit: [SPARK-18875][SPARKR][DOCS] Fix R API doc generation by adding `DESCRIPTION` file

2016-12-14 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.1 b14fc3918 -> d399a297d [SPARK-18875][SPARKR][DOCS] Fix R API doc generation by adding `DESCRIPTION` file ## What changes were proposed in this pull request? Since Apache Spark 1.4.0, the R API documentation page has had a broken link on

spark git commit: [SPARK-18875][SPARKR][DOCS] Fix R API doc generation by adding `DESCRIPTION` file

2016-12-14 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 669815d44 -> d36ed9e1d [SPARK-18875][SPARKR][DOCS] Fix R API doc generation by adding `DESCRIPTION` file ## What changes were proposed in this pull request? Since Apache Spark 1.4.0, the R API documentation page has had a broken link on

spark git commit: [SPARK-18875][SPARKR][DOCS] Fix R API doc generation by adding `DESCRIPTION` file

2016-12-14 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 5d510c693 -> ec0eae486 [SPARK-18875][SPARKR][DOCS] Fix R API doc generation by adding `DESCRIPTION` file ## What changes were proposed in this pull request? Since Apache Spark 1.4.0, the R API documentation page has had a broken link on `DESCRIPTION

spark git commit: [SPARK-18869][SQL] Add TreeNode.p that returns BaseType

2016-12-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 a32317845 -> 669815d44 [SPARK-18869][SQL] Add TreeNode.p that returns BaseType ## What changes were proposed in this pull request? After the bug fix in SPARK-18854, TreeNode.apply now returns TreeNode[_] rather than a more specific

spark git commit: [SPARK-18869][SQL] Add TreeNode.p that returns BaseType

2016-12-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 cb2c8428d -> b14fc3918 [SPARK-18869][SQL] Add TreeNode.p that returns BaseType ## What changes were proposed in this pull request? After the bug fix in SPARK-18854, TreeNode.apply now returns TreeNode[_] rather than a more specific

spark git commit: [SPARK-18869][SQL] Add TreeNode.p that returns BaseType

2016-12-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master d6f11a12a -> 5d510c693 [SPARK-18869][SQL] Add TreeNode.p that returns BaseType ## What changes were proposed in this pull request? After the bug fix in SPARK-18854, TreeNode.apply now returns TreeNode[_] rather than a more specific type.
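
The snippet below is a minimal, self-contained Scala sketch of the idea described in this change: when a numbered lookup on an F-bounded tree is typed as the erased `TreeNode[_]`, a small helper that casts back to `BaseType` keeps call sites (for example, interactive plan debugging) strongly typed. The class and method bodies here are illustrative assumptions, not Spark's actual `TreeNode` internals.

```scala
// Illustrative F-bounded tree, not Spark's TreeNode.
abstract class SimpleNode[BaseType <: SimpleNode[BaseType]] { self: BaseType =>
  def children: Seq[BaseType]

  /** All nodes in pre-order: this node first, then each child subtree. */
  def allNodes: Seq[BaseType] = self +: children.flatMap(_.allNodes)

  /** Numbered lookup; the static result type is the erased SimpleNode[_]. */
  def apply(number: Int): SimpleNode[_] = allNodes(number)

  /** Same lookup, but cast back to BaseType so call sites keep the specific type. */
  def p(number: Int): BaseType = apply(number).asInstanceOf[BaseType]
}

final case class Op(name: String, children: Seq[Op] = Nil) extends SimpleNode[Op]

object TreeNodeSketch {
  def main(args: Array[String]): Unit = {
    val plan = Op("Project", Seq(Op("Filter", Seq(Op("Scan")))))
    val scan: Op = plan.p(2) // typed as Op rather than SimpleNode[_]
    println(scan.name)       // prints "Scan"
  }
}
```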

spark git commit: [SPARK-18856][SQL] non-empty partitioned table should not report zero size

2016-12-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 0d94201e0 -> cb2c8428d [SPARK-18856][SQL] non-empty partitioned table should not report zero size ## What changes were proposed in this pull request? In `DataSource`, if the table is not analyzed, we will use 0 as the default value

spark git commit: [SPARK-18856][SQL] non-empty partitioned table should not report zero size

2016-12-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8db4d95c0 -> d6f11a12a [SPARK-18856][SQL] non-empty partitioned table should not report zero size ## What changes were proposed in this pull request? In `DataSource`, if the table is not analyzed, we will use 0 as the default value for
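
A hypothetical Scala sketch of the sizing hazard described in this entry: if an unanalyzed table falls back to a size of 0, any size-based planning decision (such as a broadcast threshold) treats it as trivially small, so a conservative large default is safer. All names below are invented for the example; the real change lives in Spark's `DataSource` code, and `spark.sql.defaultSizeInBytes` is mentioned only as the analogous configuration.

```scala
// Hypothetical sketch of the sizing fallback, not Spark's DataSource code.
object SizeEstimationSketch {
  final case class TableStats(sizeInBytes: BigInt)

  // A large, conservative default (Spark exposes a similar internal knob,
  // spark.sql.defaultSizeInBytes). Defaulting to 0 instead would make every
  // unanalyzed table look trivially small to the planner.
  val defaultSizeInBytes: BigInt = BigInt(Long.MaxValue)

  def statsFor(analyzedSizeInBytes: Option[BigInt]): TableStats =
    TableStats(analyzedSizeInBytes.getOrElse(defaultSizeInBytes))

  def canBroadcast(stats: TableStats, thresholdBytes: Long = 10L * 1024 * 1024): Boolean =
    stats.sizeInBytes <= BigInt(thresholdBytes)

  def main(args: Array[String]): Unit = {
    println(canBroadcast(statsFor(None)))                // false: unknown size stays conservative
    println(canBroadcast(statsFor(Some(BigInt(1024)))))  // true: small, analyzed table
  }
}
```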

spark git commit: [SPARK-18703][SQL] Drop Staging Directories and Data Files After each Insertion/CTAS of Hive serde Tables

2016-12-14 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 324388531 -> 8db4d95c0 [SPARK-18703][SQL] Drop Staging Directories and Data Files After each Insertion/CTAS of Hive serde Tables ### What changes were proposed in this pull request? Below are the files/directories generated for three
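
As a rough illustration of the cleanup described here, the hedged sketch below deletes a staging directory through the Hadoop `FileSystem` API once a write has committed. The path layout and helper name are assumptions made for the example; the actual fix is inside Spark's Hive insertion/CTAS code path.

```scala
// Hypothetical cleanup helper using the Hadoop FileSystem API; the staging
// directory name below is purely illustrative.
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object StagingCleanupSketch {
  /** Recursively delete a job's staging directory once the insert/CTAS has committed. */
  def dropStagingDir(stagingDir: Path, conf: Configuration): Unit = {
    val fs: FileSystem = stagingDir.getFileSystem(conf)
    if (fs.exists(stagingDir)) {
      fs.delete(stagingDir, true)        // recursive delete of the temporary output
      fs.cancelDeleteOnExit(stagingDir)  // stop tracking it for the session's lifetime
    }
  }

  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    dropStagingDir(new Path("/tmp/warehouse/tbl/.hive-staging_demo"), conf)
  }
}
```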

spark git commit: [SPARK-18865][SPARKR] SparkR vignettes MLP and LDA updates

2016-12-14 Thread felixcheung
Repository: spark Updated Branches: refs/heads/branch-2.1 280c35af9 -> 0d94201e0 [SPARK-18865][SPARKR] SparkR vignettes MLP and LDA updates ## What changes were proposed in this pull request? While doing the QA work, I found the following issues: 1) `spark.mlp` doesn't include an example;

spark git commit: [SPARK-18865][SPARKR] SparkR vignettes MLP and LDA updates

2016-12-14 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master ffdd1fcd1 -> 324388531 [SPARK-18865][SPARKR] SparkR vignettes MLP and LDA updates ## What changes were proposed in this pull request? While doing the QA work, I found the following issues: 1) `spark.mlp` doesn't include an example; 2)

[1/2] spark git commit: Revert "Revert "[SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for subqueries""

2016-12-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 a5c178bc0 -> a32317845 Revert "Revert "[SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for subqueries"" This reverts commit a5c178bc07092b698ee17894a439deb47699db0f. Project:

[2/2] spark git commit: Fix compilation error

2016-12-14 Thread rxin
Fix compilation error Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a3231784 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a3231784 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a3231784 Branch:

spark git commit: Revert "[SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for subqueries"

2016-12-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 1ff738afc -> a5c178bc0 Revert "[SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for subqueries" This reverts commit 1ff738afc1b11eacb11ac4f37324334a6b6fe41b. Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: [SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for subqueries

2016-12-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 2b18bd4b9 -> 1ff738afc [SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for subqueries This is a bug introduced by subquery handling. numberedTreeString (which uses generateTreeString under the hood) numbers trees

spark git commit: [SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for subqueries

2016-12-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 d0d9c5725 -> 280c35af9 [SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for subqueries ## What changes were proposed in this pull request? This is a bug introduced by subquery handling. numberedTreeString (which uses

spark git commit: [SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for subqueries

2016-12-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master 786274257 -> ffdd1fcd1 [SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for subqueries ## What changes were proposed in this pull request? This is a bug introduced by subquery handling. numberedTreeString (which uses

spark git commit: [SPARK-18795][ML][SPARKR][DOC] Added KSTest section to SparkR vignettes

2016-12-14 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 1ac6567bd -> 786274257 [SPARK-18795][ML][SPARKR][DOC] Added KSTest section to SparkR vignettes ## What changes were proposed in this pull request? Added short section for KSTest. Also added logreg model to list of ML models in vignette.

spark git commit: [SPARK-18795][ML][SPARKR][DOC] Added KSTest section to SparkR vignettes

2016-12-14 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-2.1 c4de90fc7 -> d0d9c5725 [SPARK-18795][ML][SPARKR][DOC] Added KSTest section to SparkR vignettes ## What changes were proposed in this pull request? Added short section for KSTest. Also added logreg model to list of ML models in

spark git commit: [SPARK-18852][SS] StreamingQuery.lastProgress should be null when recentProgress is empty

2016-12-14 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 e8866f9fc -> c4de90fc7 [SPARK-18852][SS] StreamingQuery.lastProgress should be null when recentProgress is empty ## What changes were proposed in this pull request? Right now `StreamingQuery.lastProgress` throws

spark git commit: [SPARK-18852][SS] StreamingQuery.lastProgress should be null when recentProgress is empty

2016-12-14 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5d7994736 -> 1ac6567bd [SPARK-18852][SS] StreamingQuery.lastProgress should be null when recentProgress is empty ## What changes were proposed in this pull request? Right now `StreamingQuery.lastProgress` throws NoSuchElementException
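
A minimal sketch of the contract stated in this fix, using an illustrative tracker class rather than Spark's `StreamingQuery` API: taking `.last` on an empty buffer throws `NoSuchElementException`, whereas `lastOption.orNull` yields `null` when no progress has been recorded yet.

```scala
// Illustrative tracker, not Spark's StreamingQuery API.
import scala.collection.mutable

final case class ProgressEvent(batchId: Long, numInputRows: Long)

final class ProgressTracker {
  private val progressBuffer = mutable.Queue.empty[ProgressEvent]

  def addProgress(p: ProgressEvent): Unit = synchronized { progressBuffer.enqueue(p) }

  def recentProgress: Array[ProgressEvent] = synchronized { progressBuffer.toArray }

  // Before: progressBuffer.last, which throws NoSuchElementException when empty.
  // After: lastOption.orNull, matching the "should be null" contract in the title.
  def lastProgress: ProgressEvent = synchronized { progressBuffer.lastOption.orNull }
}

object LastProgressSketch {
  def main(args: Array[String]): Unit = {
    val tracker = new ProgressTracker
    println(tracker.lastProgress)               // null, instead of an exception
    tracker.addProgress(ProgressEvent(0L, 42L))
    println(tracker.lastProgress)               // ProgressEvent(0,42)
  }
}
```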

spark git commit: [SPARK-18853][SQL] Project (UnaryNode) is way too aggressive in estimating statistics

2016-12-14 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.0 1d5c7f452 -> 2b18bd4b9 [SPARK-18853][SQL] Project (UnaryNode) is way too aggressive in estimating statistics This patch reduces the default element-count estimate for arrays and maps from 100 to 1. The issue with the value 100 is

spark git commit: [SPARK-18853][SQL] Project (UnaryNode) is way too aggressive in estimating statistics

2016-12-14 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.1 af12a21ca -> e8866f9fc [SPARK-18853][SQL] Project (UnaryNode) is way too aggressive in estimating statistics ## What changes were proposed in this pull request? This patch reduces the default element-count estimate for arrays and

spark git commit: [SPARK-18853][SQL] Project (UnaryNode) is way too aggressive in estimating statistics

2016-12-14 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 89ae26dcd -> 5d7994736 [SPARK-18853][SQL] Project (UnaryNode) is way too aggressive in estimating statistics ## What changes were proposed in this pull request? This patch reduces the default element-count estimate for arrays and maps
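
The following hypothetical Scala sketch shows how a default element count feeds a per-row size estimate for array columns: with no statistics available, the assumed count directly scales the estimate, so dropping the default from 100 to 1 changes it by two orders of magnitude for such columns. The type names are invented for the example and are not Spark's internal estimator.

```scala
// Invented types for illustration; not Spark's internal size estimator.
object DefaultSizeSketch {
  sealed trait ColType { def defaultSize: Long }
  case object IntCol  extends ColType { val defaultSize = 4L }
  case object LongCol extends ColType { val defaultSize = 8L }

  /** Array column: assumed element count times the element's default size. */
  final case class ArrayCol(element: ColType, assumedElements: Int = 1) extends ColType {
    val defaultSize: Long = assumedElements.toLong * element.defaultSize
  }

  def estimatedRowSize(schema: Seq[ColType]): Long = schema.map(_.defaultSize).sum

  def main(args: Array[String]): Unit = {
    println(estimatedRowSize(Seq(IntCol, ArrayCol(LongCol))))       // 12 with the new default of 1
    println(estimatedRowSize(Seq(IntCol, ArrayCol(LongCol, 100))))  // 804 with the old default of 100
  }
}
```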

spark git commit: [SPARK-18753][SQL] Keep pushed-down null literal as a filter in Spark-side post-filter for FileFormat datasources

2016-12-14 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.1 16d4bd4a2 -> af12a21ca [SPARK-18753][SQL] Keep pushed-down null literal as a filter in Spark-side post-filter for FileFormat datasources ## What changes were proposed in this pull request? Currently, `FileSourceStrategy` does not

spark git commit: [SPARK-18753][SQL] Keep pushed-down null literal as a filter in Spark-side post-filter for FileFormat datasources

2016-12-14 Thread lian
Repository: spark Updated Branches: refs/heads/master 169b9d73e -> 89ae26dcd [SPARK-18753][SQL] Keep pushed-down null literal as a filter in Spark-side post-filter for FileFormat datasources ## What changes were proposed in this pull request? Currently, `FileSourceStrategy` does not handle
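
A hedged sketch of the split this change is about, with invented predicate types rather than `FileSourceStrategy` itself: predicates that cannot be translated into data source `Filter`s (here, anything compared against a null literal) are kept and re-evaluated as a Spark-side post-filter.

```scala
// Invented predicate types; a sketch of the split, not FileSourceStrategy itself.
object PostFilterSketch {
  sealed trait Pred
  final case class EqualTo(column: String, value: Any) extends Pred
  final case class GreaterThan(column: String, value: Any) extends Pred

  /** Only predicates with a concrete, non-null comparison value go to the source. */
  def translatable(p: Pred): Boolean = p match {
    case EqualTo(_, null)     => false // e.g. `col = null`: keep it Spark-side
    case GreaterThan(_, null) => false
    case _                    => true
  }

  /** Partition into (pushed-down filters, Spark-side post-filters). */
  def split(preds: Seq[Pred]): (Seq[Pred], Seq[Pred]) = preds.partition(translatable)

  def main(args: Array[String]): Unit = {
    val (pushed, postFilters) = split(Seq(EqualTo("a", 1), EqualTo("b", null)))
    println(s"pushed to source:    $pushed")      // List(EqualTo(a,1))
    println(s"kept as post-filter: $postFilters") // List(EqualTo(b,null))
  }
}
```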

spark git commit: [SPARK-18830][TESTS] Fix tests in PipedRDDSuite to pass on Windows

2016-12-14 Thread srowen
Repository: spark Updated Branches: refs/heads/master c6b8eb71a -> 169b9d73e [SPARK-18830][TESTS] Fix tests in PipedRDDSuite to pass on Windows ## What changes were proposed in this pull request? This PR proposes to fix the tests that fail on Windows, as below: ``` [info] - pipe with empty

spark git commit: [SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in classpaths in commands for local-cluster mode to work around the path length limitation on Windows

2016-12-14 Thread srowen
Repository: spark Updated Branches: refs/heads/master ba4aab9b8 -> c6b8eb71a [SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in classpaths in commands for local-cluster mode to work around the path length limitation on Windows ## What changes were proposed in this pull request? Currently,
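
The idea in this title can be illustrated with a few lines of Scala (the actual change is in Spark's launcher and test setup, and the helper name below is made up): de-duplicate classpath entries while preserving first-seen order, which can shorten the generated command enough to stay under Windows' length limits.

```scala
// Illustrative helper; the actual change is in Spark's launcher/test code paths.
import java.io.File

object DedupClasspathSketch {
  /** Remove duplicate classpath entries while preserving first-seen order. */
  def dedup(classpath: String): String =
    classpath.split(File.pathSeparator).filter(_.nonEmpty).distinct.mkString(File.pathSeparator)

  def main(args: Array[String]): Unit = {
    val cp = Seq("/spark/jars/a.jar", "/spark/jars/b.jar", "/spark/jars/a.jar")
      .mkString(File.pathSeparator)
    println(dedup(cp)) // a.jar and b.jar once each; ';'-separated on Windows, ':' elsewhere
  }
}
```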

spark git commit: [SPARK-18730] Post Jenkins test report page instead of the full console output page to GitHub

2016-12-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master ac013ea58 -> ba4aab9b8 [SPARK-18730] Post Jenkins test report page instead of the full console output page to GitHub ## What changes were proposed in this pull request? Currently, the full console output page of a Spark Jenkins PR build

spark git commit: [SPARK-18730] Post Jenkins test report page instead of the full console output page to GitHub

2016-12-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 f999312e7 -> 16d4bd4a2 [SPARK-18730] Post Jenkins test report page instead of the full console output page to GitHub ## What changes were proposed in this pull request? Currently, the full console output page of a Spark Jenkins PR

spark git commit: [SPARK-18846][SCHEDULER] Fix flakiness in SchedulerIntegrationSuite

2016-12-14 Thread irashid
Repository: spark Updated Branches: refs/heads/master cccd64393 -> ac013ea58 [SPARK-18846][SCHEDULER] Fix flakiness in SchedulerIntegrationSuite There is a small race in SchedulerIntegrationSuite. The test assumes that the task-scheduler thread processing the last task will finish before the
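
As a generic illustration only (not the actual SchedulerIntegrationSuite change), the sketch below shows a common way to remove such an ordering assumption in a test: block on an explicit completion signal with a timeout instead of assuming the background thread has already finished.

```scala
// Generic pattern for removing an ordering assumption in a test; illustrative only.
import java.util.concurrent.{CountDownLatch, TimeUnit}

object RaceFreeTestSketch {
  def main(args: Array[String]): Unit = {
    val done = new CountDownLatch(1)
    @volatile var result = 0

    val worker = new Thread(new Runnable {
      override def run(): Unit = {
        result = 42      // stands in for "the scheduler thread processing the last task"
        done.countDown() // signal completion explicitly
      }
    })
    worker.start()

    // Wait for the explicit signal (with a timeout) instead of assuming the
    // worker thread has already finished by the time we assert.
    assert(done.await(10, TimeUnit.SECONDS), "worker did not finish in time")
    assert(result == 42)
  }
}
```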

spark git commit: [SPARK-18814][SQL] CheckAnalysis rejects TPCDS query 32

2016-12-14 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.1 8ef005931 -> f999312e7 [SPARK-18814][SQL] CheckAnalysis rejects TPCDS query 32 ## What changes were proposed in this pull request? Move the checking of GROUP BY column in correlated scalar subquery from CheckAnalysis to Analysis to

spark git commit: [SPARK-18814][SQL] CheckAnalysis rejects TPCDS query 32

2016-12-14 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 3e307b495 -> cccd64393 [SPARK-18814][SQL] CheckAnalysis rejects TPCDS query 32 ## What changes were proposed in this pull request? Move the checking of GROUP BY column in correlated scalar subquery from CheckAnalysis to Analysis to fix a