[SPARK-16980][SQL] Load only catalog table partition metadata required to
answer a query
(This PR addresses https://issues.apache.org/jira/browse/SPARK-16980.)
## What changes were proposed in this pull request?
In a new Spark session, when a partitioned Hive table is converted to use
Spark's
Repository: spark
Updated Branches:
refs/heads/master 2d96d35dc -> 6ce1b675e
http://git-wip-us.apache.org/repos/asf/spark/blob/6ce1b675/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClient.scala
--
diff --git
Repository: spark
Updated Branches:
refs/heads/master 72adfbf94 -> 2d96d35dc
[SPARK-17946][PYSPARK] Python crossJoin API similar to Scala
## What changes were proposed in this pull request?
Add a crossJoin function to the DataFrame API similar to that in Scala. Joins
with no condition
Repository: spark
Updated Branches:
refs/heads/master f00df40cf -> 72adfbf94
[SPARK-17900][SQL] Graduate a list of Spark SQL APIs to stable
## What changes were proposed in this pull request?
This patch graduates a list of Spark SQL APIs and marks them stable.
The following are marked stable:
Repository: spark
Updated Branches:
refs/heads/master 5aeb7384c -> f00df40cf
[SPARK-11775][PYSPARK][SQL] Allow PySpark to register Java UDF
Currently PySpark can only call built-in Java UDFs; it cannot call custom
Java UDFs. It would be better to allow that. Two benefits:
* Leverage the
Repository: spark
Updated Branches:
refs/heads/master da9aeb0fd -> 5aeb7384c
[SPARK-16063][SQL] Add storageLevel to Dataset
[SPARK-11905](https://issues.apache.org/jira/browse/SPARK-11905) added support
for `persist`/`cache` for `Dataset`. However, there is no user-facing API to
check if a
Repository: spark
Updated Branches:
refs/heads/branch-2.0 d7fa3e324 -> c53b83749
[SPARK-17863][SQL] should not add column into Distinct
## What changes were proposed in this pull request?
We are trying to resolve the attribute in sort by pulling up some column for
grandchild into child, but
Repository: spark
Updated Branches:
refs/heads/master 522dd0d0e -> da9aeb0fd
[SPARK-17863][SQL] should not add column into Distinct
## What changes were proposed in this pull request?
We are trying to resolve the attribute in sort by pulling up some column for
grandchild into child, but
Repository: spark
Updated Branches:
refs/heads/master 7ab86244e -> 522dd0d0e
Revert "[SPARK-17620][SQL] Determine Serde by hive.default.fileformat when
Creating Hive Serde Tables"
This reverts commit 7ab86244e30ca81eb4fa40ea77b4c2b8881cbab2.
Project:
Repository: spark
Updated Branches:
refs/heads/master de1c1ca5c -> 7ab86244e
[SPARK-17620][SQL] Determine Serde by hive.default.fileformat when Creating
Hive Serde Tables
## What changes were proposed in this pull request?
Make sure hive.default.fileformat is used when creating the
Repository: spark
Updated Branches:
refs/heads/master 05800b4b4 -> de1c1ca5c
[SPARK-17941][ML][TEST] Logistic regression tests should use sample weights.
## What changes were proposed in this pull request?
The sample weight testing for logistic regressions is not robust. Logistic
regression
Repository: spark
Updated Branches:
refs/heads/master fa37877af -> 05800b4b4
[TEST] Ignore flaky test in StreamingQueryListenerSuite
## What changes were proposed in this pull request?
Ignoring the flaky test introduced in #15307
Repository: spark
Updated Branches:
refs/heads/branch-1.6 18b173cfc -> 745c5e70f
[SPARK-17884][SQL] To resolve Null pointer exception when casting from empty
string to interval type
## What changes were proposed in this pull request?
This change adds a check in the castToInterval method of Cast
Repository: spark
Updated Branches:
refs/heads/master a0ebcb3a3 -> fa37877af
Typo: form -> from
## What changes were proposed in this pull request?
Minor typo fix
## How was this patch tested?
Existing unit tests on Jenkins
Author: Andrew Ash
Closes #15486 from
Repository: spark
Updated Branches:
refs/heads/master 7486442fe -> a0ebcb3a3
[DOC] Fix typo in sql hive doc
Change is too trivial to file a JIRA.
Author: Dhruve Ashar
Closes #15485 from dhruve/master.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Repository: spark
Updated Branches:
refs/heads/master 28b645b1e -> 7486442fe
[SPARK-17073][SQL][FOLLOWUP] generate column-level statistics
## What changes were proposed in this pull request?
This PR adds some test cases for statistics: case-sensitive column names,
non-ASCII column names,
Repository: spark
Updated Branches:
refs/heads/master a1b136d05 -> c8b612dec
[SPARK-17870][MLLIB][ML] Change statistic to pValue for SelectKBest and
SelectPercentile because of DoF difference
## What changes were proposed in this pull request?
For feature selection method ChiSquareSelector,
Repository: spark
Updated Branches:
refs/heads/master 1db8feab8 -> a1b136d05
[SPARK-14634][ML] Add BisectingKMeansSummary
## What changes were proposed in this pull request?
Add BisectingKMeansSummary
## How was this patch tested?
unit test
Author: Zheng RuiFeng
Repository: spark
Updated Branches:
refs/heads/master 2fb12b0a3 -> 1db8feab8
[SPARK-15402][ML][PYSPARK] PySpark ml.evaluation should support save/load
## What changes were proposed in this pull request?
Since `ml.evaluation` already supports save/load on the Scala side, supporting it
at Python
Repository: spark
Updated Branches:
refs/heads/master 6c29b3de7 -> 2fb12b0a3
[SPARK-17903][SQL] MetastoreRelation should talk to external catalog instead of
hive client
## What changes were proposed in this pull request?
`HiveExternalCatalog` should be the only interface to talk to the hive
Repository: spark
Updated Branches:
refs/heads/master 8543996c3 -> 6c29b3de7
[SPARK-17925][SQL] Break fileSourceInterfaces.scala into multiple pieces
## What changes were proposed in this pull request?
This patch makes a few changes to the file structure of data sources:
- Break