spark git commit: [SPARK-18721][SS] Fix ForeachSink with watermark + append

2016-12-05 Thread tdas
Repository: spark Updated Branches: refs/heads/master b8c7b8d31 -> 7863c6237 [SPARK-18721][SS] Fix ForeachSink with watermark + append ## What changes were proposed in this pull request? Right now ForeachSink creates a new physical plan, so StreamExecution cannot retrieve metrics and

spark git commit: [SPARK-18672][CORE] Close recordwriter in SparkHadoopMapReduceWriter before committing

2016-12-05 Thread srowen
Repository: spark Updated Branches: refs/heads/master 772ddbeaa -> b8c7b8d31 [SPARK-18672][CORE] Close recordwriter in SparkHadoopMapReduceWriter before committing ## What changes were proposed in this pull request? It seems some APIs such as `PairRDDFunctions.saveAsHadoopDataset()` do not

spark git commit: [SPARK-18572][SQL] Add a method `listPartitionNames` to `ExternalCatalog`

2016-12-05 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.1 d4588165e -> 8ca6a82c1 [SPARK-18572][SQL] Add a method `listPartitionNames` to `ExternalCatalog` (Link to Jira issue: https://issues.apache.org/jira/browse/SPARK-18572) ## What changes were proposed in this pull request? Currently

spark git commit: [SPARK-18722][SS] Move no data rate limit from StreamExecution to ProgressReporter

2016-12-05 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.1 1946854ab -> d4588165e [SPARK-18722][SS] Move no data rate limit from StreamExecution to ProgressReporter ## What changes were proposed in this pull request? Move no data rate limit from StreamExecution to ProgressReporter to make

spark git commit: [SPARK-18722][SS] Move no data rate limit from StreamExecution to ProgressReporter

2016-12-05 Thread tdas
Repository: spark Updated Branches: refs/heads/master 508de38c9 -> 4af142f55 [SPARK-18722][SS] Move no data rate limit from StreamExecution to ProgressReporter ## What changes were proposed in this pull request? Move no data rate limit from StreamExecution to ProgressReporter to make

spark git commit: [SPARK-18555][SQL] DataFrameNaFunctions.fill miss up original values in long integers

2016-12-05 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2398fde45 -> 508de38c9 [SPARK-18555][SQL] DataFrameNaFunctions.fill miss up original values in long integers ## What changes were proposed in this pull request? DataSet.na.fill(0) used on a DataSet which has a long value column, it

spark git commit: [SPARK-18720][SQL][MINOR] Code Refactoring of withColumn

2016-12-05 Thread wenchen
Repository: spark Updated Branches: refs/heads/master bb57bfe97 -> 2398fde45 [SPARK-18720][SQL][MINOR] Code Refactoring of withColumn ### What changes were proposed in this pull request? Our existing withColumn for adding metadata can simply use the existing public withColumn API. ### How

spark git commit: [SPARK-18657][SPARK-18668] Make StreamingQuery.id persists across restart and not auto-generate StreamingQuery.name

2016-12-05 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.1 6c4c33684 -> 1946854ab [SPARK-18657][SPARK-18668] Make StreamingQuery.id persists across restart and not auto-generate StreamingQuery.name Here are the major changes in this PR. - Added the ability to recover `StreamingQuery.id` from

spark git commit: [SPARK-18657][SPARK-18668] Make StreamingQuery.id persists across restart and not auto-generate StreamingQuery.name

2016-12-05 Thread tdas
Repository: spark Updated Branches: refs/heads/master 1b2785c3d -> bb57bfe97 [SPARK-18657][SPARK-18668] Make StreamingQuery.id persists across restart and not auto-generate StreamingQuery.name ## What changes were proposed in this pull request? Here are the major changes in this PR. - Added

spark git commit: [SPARK-18729][SS] Move DataFrame.collect out of synchronized block in MemorySink

2016-12-05 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.1 fecd23d2c -> 6c4c33684 [SPARK-18729][SS] Move DataFrame.collect out of synchronized block in MemorySink ## What changes were proposed in this pull request? Move DataFrame.collect out of synchronized block so that we can query content

spark git commit: [SPARK-18729][SS] Move DataFrame.collect out of synchronized block in MemorySink

2016-12-05 Thread tdas
Repository: spark Updated Branches: refs/heads/master 3ba69b648 -> 1b2785c3d [SPARK-18729][SS] Move DataFrame.collect out of synchronized block in MemorySink ## What changes were proposed in this pull request? Move DataFrame.collect out of synchronized block so that we can query content in

spark git commit: [SPARK-18634][PYSPARK][SQL] Corruption and Correctness issues with exploding Python UDFs

2016-12-05 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.1 c6a4e3d96 -> fecd23d2c [SPARK-18634][PYSPARK][SQL] Corruption and Correctness issues with exploding Python UDFs ## What changes were proposed in this pull request? As reported in the Jira, there are some weird issues with exploding

spark git commit: [SPARK-18634][PYSPARK][SQL] Corruption and Correctness issues with exploding Python UDFs

2016-12-05 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.0 dc61ed406 -> bde1d4133 [SPARK-18634][PYSPARK][SQL] Corruption and Correctness issues with exploding Python UDFs ## What changes were proposed in this pull request? As reported in the Jira, there are some weird issues with exploding

spark git commit: [SPARK-18634][PYSPARK][SQL] Corruption and Correctness issues with exploding Python UDFs

2016-12-05 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 18eaabb71 -> 3ba69b648 [SPARK-18634][PYSPARK][SQL] Corruption and Correctness issues with exploding Python UDFs ## What changes were proposed in this pull request? As reported in the Jira, there are some weird issues with exploding

spark git commit: [SPARK-18694][SS] Add StreamingQuery.explain and exception to Python and fix StreamingQueryException (branch 2.1)

2016-12-05 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 39759ff00 -> c6a4e3d96 [SPARK-18694][SS] Add StreamingQuery.explain and exception to Python and fix StreamingQueryException (branch 2.1) ## What changes were proposed in this pull request? Backport #16125 to branch 2.1. ## How was

spark git commit: [SPARK-18719] Add spark.ui.showConsoleProgress to configuration docs

2016-12-05 Thread davies
Repository: spark Updated Branches: refs/heads/master 5a92dc76a -> 18eaabb71 [SPARK-18719] Add spark.ui.showConsoleProgress to configuration docs This PR adds `spark.ui.showConsoleProgress` to the configuration docs. I tested this PR by building the docs locally and confirming that this

spark git commit: [DOCS][MINOR] Update location of Spark YARN shuffle jar

2016-12-05 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 01a7d33d0 -> 5a92dc76a [DOCS][MINOR] Update location of Spark YARN shuffle jar Looking at the distributions provided on spark.apache.org, I see that the Spark YARN shuffle jar is under `yarn/` and not `lib/`. This change is so minor I'm

spark git commit: [DOCS][MINOR] Update location of Spark YARN shuffle jar

2016-12-05 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-2.1 e23c8cfc8 -> 39759ff00 [DOCS][MINOR] Update location of Spark YARN shuffle jar Looking at the distributions provided on spark.apache.org, I see that the Spark YARN shuffle jar is under `yarn/` and not `lib/`. This change is so minor

spark git commit: [SPARK-18711][SQL] should disable subexpression elimination for LambdaVariable

2016-12-05 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.1 30c074308 -> e23c8cfc8 [SPARK-18711][SQL] should disable subexpression elimination for LambdaVariable ## What changes were proposed in this pull request? This is kind of a long-standing bug, it's hidden until

spark git commit: [SPARK-18711][SQL] should disable subexpression elimination for LambdaVariable

2016-12-05 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 246012859 -> 01a7d33d0 [SPARK-18711][SQL] should disable subexpression elimination for LambdaVariable ## What changes were proposed in this pull request? This is kind of a long-standing bug, it's hidden until

spark git commit: [SPARK-18694][SS] Add StreamingQuery.explain and exception to Python and fix StreamingQueryException

2016-12-05 Thread tdas
Repository: spark Updated Branches: refs/heads/master 410b78986 -> 246012859 [SPARK-18694][SS] Add StreamingQuery.explain and exception to Python and fix StreamingQueryException ## What changes were proposed in this pull request? - Add StreamingQuery.explain and exception to Python. - Fix

spark git commit: [MINOR][DOC] Use SparkR `TRUE` value and add default values for `StructField` in SQL Guide.

2016-12-05 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.1 1821cbead -> afd2321b6 [MINOR][DOC] Use SparkR `TRUE` value and add default values for `StructField` in SQL Guide. ## What changes were proposed in this pull request? In `SQL Programming Guide`, this PR uses `TRUE` instead of `True`

spark git commit: [MINOR][DOC] Use SparkR `TRUE` value and add default values for `StructField` in SQL Guide.

2016-12-05 Thread shivaram
Repository: spark Updated Branches: refs/heads/master eb8dd6813 -> 410b78986 [MINOR][DOC] Use SparkR `TRUE` value and add default values for `StructField` in SQL Guide. ## What changes were proposed in this pull request? In `SQL Programming Guide`, this PR uses `TRUE` instead of `True` in

spark git commit: [SPARK-18279][DOC][ML][SPARKR] Add R examples to ML programming guide.

2016-12-05 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.1 88e07efe8 -> 1821cbead [SPARK-18279][DOC][ML][SPARKR] Add R examples to ML programming guide. ## What changes were proposed in this pull request? Add R examples to ML programming guide for the following algorithms as POC: * spark.glm *

spark git commit: [SPARK-18279][DOC][ML][SPARKR] Add R examples to ML programming guide.

2016-12-05 Thread yliang
Repository: spark Updated Branches: refs/heads/master bdfe7f674 -> eb8dd6813 [SPARK-18279][DOC][ML][SPARKR] Add R examples to ML programming guide. ## What changes were proposed in this pull request? Add R examples to ML programming guide for the following algorithms as POC: * spark.glm *

spark git commit: [SPARK-18625][ML] OneVsRestModel should support setFeaturesCol and setPredictionCol

2016-12-05 Thread yliang
Repository: spark Updated Branches: refs/heads/master e9730b707 -> bdfe7f674 [SPARK-18625][ML] OneVsRestModel should support setFeaturesCol and setPredictionCol ## What changes were proposed in this pull request? add `setFeaturesCol` and `setPredictionCol` for `OneVsRestModel` ## How was

spark git commit: [SPARK-18625][ML] OneVsRestModel should support setFeaturesCol and setPredictionCol

2016-12-05 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.1 c13c2939f -> 88e07efe8 [SPARK-18625][ML] OneVsRestModel should support setFeaturesCol and setPredictionCol ## What changes were proposed in this pull request? add `setFeaturesCol` and `setPredictionCol` for `OneVsRestModel` ## How