spark git commit: [SPARK-21237][SQL] Invalidate stats once table data is changed

2017-06-28 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 25c2edf6f -> 82e24912d [SPARK-21237][SQL] Invalidate stats once table data is changed ## What changes were proposed in this pull request? Invalidate Spark's stats after data-changing commands: - InsertIntoHadoopFsRelationCommand -

spark git commit: [SPARK-21229][SQL] remove QueryPlan.preCanonicalized

2017-06-28 Thread wenchen
Repository: spark Updated Branches: refs/heads/master fc92d25f2 -> 25c2edf6f [SPARK-21229][SQL] remove QueryPlan.preCanonicalized ## What changes were proposed in this pull request? `QueryPlan.preCanonicalized` is only overridden in a few places, and it does introduce an extra concept to

spark git commit: Revert "[SPARK-21094][R] Terminate R's worker processes in the parent of R's daemon to prevent a leak"

2017-06-28 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master db44f5f3e -> fc92d25f2 Revert "[SPARK-21094][R] Terminate R's worker processes in the parent of R's daemon to prevent a leak" This reverts commit 6b3d02285ee0debc73cbcab01b10398a498fbeb8. Project:

spark git commit: [SPARK-21224][R] Specify a schema by using a DDL-formatted string when reading in R

2017-06-28 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 0c8444cf6 -> db44f5f3e [SPARK-21224][R] Specify a schema by using a DDL-formatted string when reading in R ## What changes were proposed in this pull request? This PR proposes to support a DDL-formatted string as schema as below: ```r

spark git commit: [SPARK-14657][SPARKR][ML] RFormula w/o intercept should output reference category when encoding string terms

2017-06-28 Thread yliang
Repository: spark Updated Branches: refs/heads/master 376d90d55 -> 0c8444cf6 [SPARK-14657][SPARKR][ML] RFormula w/o intercept should output reference category when encoding string terms ## What changes were proposed in this pull request? Please see

spark git commit: [SPARK-20889][SPARKR] Grouped documentation for STRING column methods

2017-06-28 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master b72b8521d -> 376d90d55 [SPARK-20889][SPARKR] Grouped documentation for STRING column methods ## What changes were proposed in this pull request? Grouped documentation for string column methods. Author: actuaryzhang

spark git commit: [SPARK-21222] Move elimination of Distinct clause from analyzer to optimizer

2017-06-28 Thread wenchen
Repository: spark Updated Branches: refs/heads/master e68aed70f -> b72b8521d [SPARK-21222] Move elimination of Distinct clause from analyzer to optimizer ## What changes were proposed in this pull request? Move elimination of Distinct clause from analyzer to optimizer Distinct clause is

spark git commit: [SPARK-21216][SS] Hive strategies missed in Structured Streaming IncrementalExecution

2017-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 838effb98 -> e68aed70f [SPARK-21216][SS] Hive strategies missed in Structured Streaming IncrementalExecution ## What changes were proposed in this pull request? If someone creates a HiveSession, the planner in `IncrementalExecution`

spark git commit: Revert "[SPARK-13534][PYSPARK] Using Apache Arrow to increase performance of DataFrame.toPandas"

2017-06-28 Thread wenchen
Repository: spark Updated Branches: refs/heads/master e793bf248 -> 838effb98 Revert "[SPARK-13534][PYSPARK] Using Apache Arrow to increase performance of DataFrame.toPandas" This reverts commit e44697606f429b01808c1a22cb44cb5b89585c5c. Project:

spark git commit: [SPARK-20889][SPARKR] Grouped documentation for MATH column methods

2017-06-28 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 2d686a19e -> e793bf248 [SPARK-20889][SPARKR] Grouped documentation for MATH column methods ## What changes were proposed in this pull request? Grouped documentation for math column methods. Author: actuaryzhang