spark git commit: [SPARK-20024][SQL][TEST-MAVEN] SessionCatalog reset need to set the current database of ExternalCatalog

2017-03-20 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 68d65fae7 -> d2dcd6792 [SPARK-20024][SQL][TEST-MAVEN] SessionCatalog reset need to set the current database of ExternalCatalog ### What changes were proposed in this pull request? SessionCatalog API setCurrentDatabase does not set the

spark git commit: [SPARK-19949][SQL] unify bad record handling in CSV and JSON

2017-03-20 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 21e366aea -> 68d65fae7 [SPARK-19949][SQL] unify bad record handling in CSV and JSON ## What changes were proposed in this pull request? Currently JSON and CSV have exactly the same logic about handling bad records, this PR tries to

spark git commit: [SPARK-19912][SQL] String literals should be escaped for Hive metastore partition pruning

2017-03-20 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 7fa116f8f -> 21e366aea [SPARK-19912][SQL] String literals should be escaped for Hive metastore partition pruning ## What changes were proposed in this pull request? Since current `HiveShim`'s `convertFilters` does not escape the string
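The quoting concern named in this commit title can be illustrated with a small sketch. This is not the code from `HiveShim`: the helper names `quote_string_literal` and `build_partition_filter`, and the policy shown (pick a quote character that does not occur in the value, and refuse values that contain both), are assumptions made purely for illustration.

```python
def quote_string_literal(value: str) -> str:
    """Quote a string for a metastore-style filter (hypothetical helper)."""
    # Prefer double quotes when the value contains none.
    if '"' not in value:
        return f'"{value}"'
    # Otherwise fall back to single quotes if that is safe.
    if "'" not in value:
        return f"'{value}'"
    # Assumption: a value with both quote characters cannot be expressed safely.
    raise ValueError("value contains both quote characters")


def build_partition_filter(column: str, value: str) -> str:
    """Build a `column = literal` filter string (hypothetical helper)."""
    return f"{column} = {quote_string_literal(value)}"
```

Without some such step, a partition value containing a quote character would end the literal early and produce a malformed filter string.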

spark git commit: [SPARK-19912][SQL] String literals should be escaped for Hive metastore partition pruning

2017-03-20 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.1 d205d40ae -> c4c7b1857 [SPARK-19912][SQL] String literals should be escaped for Hive metastore partition pruning ## What changes were proposed in this pull request? Since current `HiveShim`'s `convertFilters` does not escape the

spark git commit: [SPARK-17204][CORE] Fix replicated off heap storage

2017-03-20 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 0ec1db547 -> 7fa116f8f [SPARK-17204][CORE] Fix replicated off heap storage (Jira: https://issues.apache.org/jira/browse/SPARK-17204) ## What changes were proposed in this pull request? There are a couple of bugs in the `BlockManager`

spark git commit: [SPARK-17204][CORE] Fix replicated off heap storage

2017-03-20 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.1 af8bf2183 -> d205d40ae [SPARK-17204][CORE] Fix replicated off heap storage (Jira: https://issues.apache.org/jira/browse/SPARK-17204) ## What changes were proposed in this pull request? There are a couple of bugs in the `BlockManager`

spark git commit: [SPARK-19980][SQL] Add NULL checks in Bean serializer

2017-03-20 Thread wenchen
Repository: spark Updated Branches: refs/heads/master e9c91badc -> 0ec1db547 [SPARK-19980][SQL] Add NULL checks in Bean serializer ## What changes were proposed in this pull request? A Bean serializer in `ExpressionEncoder` could change values when Beans having NULL. A concrete example is

spark git commit: [SPARK-20010][SQL] Sort information is lost after sort merge join

2017-03-20 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 10691d36d -> e9c91badc [SPARK-20010][SQL] Sort information is lost after sort merge join ## What changes were proposed in this pull request? After sort merge join for inner join, now we only keep left key ordering. However, after inner

spark git commit: [SPARK-19573][SQL] Make NaN/null handling consistent in approxQuantile

2017-03-20 Thread lixiao
Repository: spark Updated Branches: refs/heads/master c2d1761a5 -> 10691d36d [SPARK-19573][SQL] Make NaN/null handling consistent in approxQuantile ## What changes were proposed in this pull request? update `StatFunctions.multipleApproxQuantiles` to handle NaN/null ## How was this patch

spark git commit: [SPARK-19906][SS][DOCS] Documentation describing how to write queries to Kafka

2017-03-20 Thread tdas
Repository: spark Updated Branches: refs/heads/master bec6b16c1 -> c2d1761a5 [SPARK-19906][SS][DOCS] Documentation describing how to write queries to Kafka ## What changes were proposed in this pull request? Add documentation that describes how to write streaming and batch queries to Kafka.

spark git commit: [SPARK-19899][ML] Replace featuresCol with itemsCol in ml.fpm.FPGrowth

2017-03-20 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master fc7554599 -> bec6b16c1 [SPARK-19899][ML] Replace featuresCol with itemsCol in ml.fpm.FPGrowth ## What changes were proposed in this pull request? Replaces `featuresCol` `Param` with `itemsCol`. See

spark git commit: [SPARK-19970][SQL] Table owner should be USER instead of PRINCIPAL in kerberized clusters

2017-03-20 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 7ce30e00b -> fc7554599 [SPARK-19970][SQL] Table owner should be USER instead of PRINCIPAL in kerberized clusters ## What changes were proposed in this pull request? In the kerberized hadoop cluster, when Spark creates tables, the owner

spark git commit: [SPARK-19990][SQL][TEST-MAVEN] create a temp file for file in test.jar's resource when run mvn test across different modules

2017-03-20 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 816391159 -> 7ce30e00b [SPARK-19990][SQL][TEST-MAVEN] create a temp file for file in test.jar's resource when run mvn test across different modules ## What changes were proposed in this pull request? After we have merged the

spark git commit: [SPARK-17791][SQL] Join reordering using star schema detection

2017-03-20 Thread wenchen
Repository: spark Updated Branches: refs/heads/master f14f81e90 -> 816391159 [SPARK-17791][SQL] Join reordering using star schema detection ## What changes were proposed in this pull request? Star schema consists of one or more fact tables referencing a number of dimension tables. In

spark git commit: [SPARK-20020][SPARKR][FOLLOWUP] DataFrame checkpoint API fix version tag

2017-03-20 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 965a5abcf -> f14f81e90 [SPARK-20020][SPARKR][FOLLOWUP] DataFrame checkpoint API fix version tag ## What changes were proposed in this pull request? doc only change ## How was this patch tested? manual Author: Felix Cheung

spark git commit: [SPARK-19994][SQL] Wrong outputOrdering for right/full outer smj

2017-03-20 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.0 6ee7d5bf4 -> 3983b3dcd [SPARK-19994][SQL] Wrong outputOrdering for right/full outer smj ## What changes were proposed in this pull request? For right outer join, values of the left key will be filled with nulls if it can't match the
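The null-filling behaviour described here can be shown with a toy merge join. The sketch is plain Python rather than Spark code; the function names and the assumption that both inputs are sorted lists of unique keys are illustrative only.

```python
def right_outer_join(left, right):
    """Merge-join two key-sorted lists of unique keys.

    Every right key is kept; a right key with no match gets None
    in the left position.
    """
    out = []
    i = 0
    for r in right:
        # Advance the left cursor past keys smaller than the current right key.
        while i < len(left) and left[i] < r:
            i += 1
        if i < len(left) and left[i] == r:
            out.append((left[i], r))
        else:
            out.append((None, r))
    return out


def is_sorted_nulls_first(keys):
    """Check ascending order, treating None as smaller than any value."""
    ranked = [(k is not None, k if k is not None else 0) for k in keys]
    return ranked == sorted(ranked)


rows = right_outer_join([1, 3], [1, 2, 3])
left_keys = [l for l, _ in rows]
right_keys = [r for _, r in rows]
```

In this toy run the right keys come out in ascending order, while the left keys have a `None` between two non-null values, so the output cannot be treated as ordered by the left key.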

spark git commit: [SPARK-19994][SQL] Wrong outputOrdering for right/full outer smj

2017-03-20 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.1 b60f69025 -> af8bf2183 [SPARK-19994][SQL] Wrong outputOrdering for right/full outer smj ## What changes were proposed in this pull request? For right outer join, values of the left key will be filled with nulls if it can't match the

spark git commit: [SPARK-19994][SQL] Wrong outputOrdering for right/full outer smj

2017-03-20 Thread wenchen
Repository: spark Updated Branches: refs/heads/master c40597720 -> 965a5abcf [SPARK-19994][SQL] Wrong outputOrdering for right/full outer smj ## What changes were proposed in this pull request? For right outer join, values of the left key will be filled with nulls if it can't match the