sqlContext
import sqlContext.sql
^
```
yhuai marmbrus
Author: Andrew Or and...@databricks.com
Closes #5997 from andrewor14/sql-shell-crash and squashes the following commits:
61147e6 [Andrew Or] Also expect NoClassDefFoundError
Project: http://git-wip-us.apache.org/repos/asf
Repository: spark
Updated Branches:
refs/heads/branch-1.4 1a3e9e982 - bb5872f2d
[SPARK-7232] [SQL] Add a Substitution batch for spark sql analyzer
Added a new batch named `Substitution` before the `Resolution` batch. The
motivation is that there are cases where we want to do some
Repository: spark
Updated Branches:
refs/heads/master 714db2ef5 - f496bf3c5
[SPARK-7232] [SQL] Add a Substitution batch for spark sql analyzer
Added a new batch named `Substitution` before the `Resolution` batch. The
motivation is that there are cases where we want to do some
[SPARK-6908] [SQL] Use isolated Hive client
This PR switches Spark SQL's Hive support to use the isolated hive client
interface introduced by #5851, instead of directly interacting with the client.
By using this isolated client we can now allow users to dynamically configure
the version of
Repository: spark
Updated Branches:
refs/heads/branch-1.4 2e8a141b5 - 05454fd8a
http://git-wip-us.apache.org/repos/asf/spark/blob/05454fd8/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/ClientWrapper.scala
--
diff
Repository: spark
Updated Branches:
refs/heads/master 22ab70e06 - cd1d4110c
http://git-wip-us.apache.org/repos/asf/spark/blob/cd1d4110/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/ClientWrapper.scala
--
diff --git
to
determine whether it is handling a key-value pair, a key, or a value. It is safe to use
`SparkSqlSerializer2` in more cases.
Author: Yin Huai yh...@databricks.com
Closes #5849 from yhuai/serializer2MoreCases and squashes the following commits:
53a5eaa [Yin Huai] Josh's comments.
487f540 [Yin Huai
to
determine whether it is handling a key-value pair, a key, or a value. It is safe to use
`SparkSqlSerializer2` in more cases.
Author: Yin Huai yh...@databricks.com
Closes #5849 from yhuai/serializer2MoreCases and squashes the following commits:
53a5eaa [Yin Huai] Josh's comments.
487f540 [Yin Huai] Use
sqlContext
import sqlContext.sql
^
```
yhuai marmbrus
Author: Andrew Or and...@databricks.com
Closes #5997 from andrewor14/sql-shell-crash and squashes the following commits:
61147e6 [Andrew Or] Also expect NoClassDefFoundError
(cherry picked from commit
Repository: spark
Updated Branches:
refs/heads/master 845d1d4d0 - 774099670
[HOT-FIX] Move HiveWindowFunctionQuerySuite.scala to hive compatibility dir.
Author: Yin Huai yh...@databricks.com
Closes #5951 from yhuai/fixBuildMaven and squashes the following commits:
fdde183 [Yin Huai] Move
Repository: spark
Updated Branches:
refs/heads/branch-1.4 2163367ea - 14bcb84e8
[HOT-FIX] Move HiveWindowFunctionQuerySuite.scala to hive compatibility dir.
Author: Yin Huai yh...@databricks.com
Closes #5951 from yhuai/fixBuildMaven and squashes the following commits:
fdde183 [Yin Huai
Repository: spark
Updated Branches:
refs/heads/master 150f671c2 - c3eb441f5
[SPARK-6201] [SQL] promote string and do widen types for IN
huangjs
Actually, Spark SQL will first go through the analysis phase, in which we widen
types and promote strings, and then optimization, where constant IN
Repository: spark
Updated Branches:
refs/heads/master 0a901dd3a - cde548388
[SPARK-7375] [SQL] Avoid row copying in exchange when sort.serializeMapOutputs
takes effect
This patch refactors the SQL `Exchange` operator's logic for determining
whether map outputs need to be copied before being
Repository: spark
Updated Branches:
refs/heads/branch-1.4 448ff333f - 21212a27c
[SPARK-7375] [SQL] Avoid row copying in exchange when sort.serializeMapOutputs
takes effect
This patch refactors the SQL `Exchange` operator's logic for determining
whether map outputs need to be copied before
Repository: spark
Updated Branches:
refs/heads/master 4f87e9562 - ed9be06a4
[SPARK-7330] [SQL] avoid NPE at jdbc rdd
Thanks to nadavoosh for pointing this out in #5590
Author: Daoyuan Wang daoyuan.w...@intel.com
Closes #5877 from adrian-wang/jdbcrdd and squashes the following commits:
cc11900
Repository: spark
Updated Branches:
refs/heads/branch-1.4 91ce13109 - 84ee348bc
[SPARK-7330] [SQL] avoid NPE at jdbc rdd
Thanks to nadavoosh for pointing this out in #5590
Author: Daoyuan Wang daoyuan.w...@intel.com
Closes #5877 from adrian-wang/jdbcrdd and squashes the following commits:
cc11900
Repository: spark
Updated Branches:
refs/heads/branch-1.3 cbf232daa - edcd3643a
[SPARK-7330] [SQL] avoid NPE at jdbc rdd
Thanks to nadavoosh for pointing this out in #5590
Author: Daoyuan Wang daoyuan.w...@intel.com
Closes #5877 from adrian-wang/jdbcrdd and squashes the following commits:
cc11900
Repository: spark
Updated Branches:
refs/heads/branch-1.4 e0632ffaf - be66d1924
[SQL] [MINOR] use catalyst type converter in ScalaUdf
It's a follow-up to https://github.com/apache/spark/pull/5154; we can speed up
Scala UDF evaluation by creating the type converter in advance.
Author: Wenchen Fan
Repository: spark
Updated Branches:
refs/heads/master ca4257aec - 2f22424e9
[SQL] [MINOR] use catalyst type converter in ScalaUdf
It's a follow-up to https://github.com/apache/spark/pull/5154; we can speed up
Scala UDF evaluation by creating the type converter in advance.
Author: Wenchen Fan
Repository: spark
Updated Branches:
refs/heads/master 530397ba2 - 9dadf019b
[SPARK-7673] [SQL] WIP: HadoopFsRelation and ParquetRelation2 performance
optimizations
This PR introduces several performance optimizations to `HadoopFsRelation` and
`ParquetRelation2`:
1. Moving `FileStatus`
).explain(true)
```
On master, `explain` takes 40s on my laptop. With this PR, `explain` takes
14s.
Author: Yin Huai yh...@databricks.com
Closes #6252 from yhuai/broadcastHadoopConf and squashes the following commits:
6fa73df [Yin Huai] Address comments of Josh and Andrew.
807fbf9 [Yin Huai] Make
).explain(true)
```
On master, `explain` takes 40s on my laptop. With this PR, `explain` takes
14s.
Author: Yin Huai yh...@databricks.com
Closes #6252 from yhuai/broadcastHadoopConf and squashes the following commits:
6fa73df [Yin Huai] Address comments of Josh and Andrew.
807fbf9 [Yin Huai
Repository: spark
Updated Branches:
refs/heads/branch-1.4 b6182ce89 - 4fd674336
[SPARK-7320] [SQL] Add Cube / Rollup for dataframe
This is a follow up for #6257, which broke the maven test.
Add cube rollup for DataFrame
For example:
```scala
testData.rollup($"a" + $"b", $"b").agg(sum($"a" - $"b"))
```
Repository: spark
Updated Branches:
refs/heads/master 895baf8f7 - 42c592adb
[SPARK-7320] [SQL] Add Cube / Rollup for dataframe
This is a follow up for #6257, which broke the maven test.
Add cube rollup for DataFrame
For example:
```scala
testData.rollup($"a" + $"b", $"b").agg(sum($"a" - $"b"))
```
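For context on what `rollup` and `cube` compute: both expand one grouping into several grouping sets, with rollup producing the column-prefix hierarchy and cube the full power set. A minimal pure-Python sketch of that expansion (the helper names are illustrative, not Spark API):

```python
from itertools import combinations

def rollup_sets(cols):
    # rollup(a, b) groups by (a, b), then (a), then the grand total ().
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]

def cube_sets(cols):
    # cube(a, b) groups by every subset of the columns.
    return [subset
            for r in range(len(cols), -1, -1)
            for subset in combinations(cols, r)]

print(rollup_sets(["a", "b"]))  # [('a', 'b'), ('a',), ()]
print(cube_sets(["a", "b"]))    # [('a', 'b'), ('a',), ('b',), ()]
```

The aggregate in the example above is then computed once per grouping set.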
Repository: spark
Updated Branches:
refs/heads/master bcb47ad77 - 7b7f7b6c6
[SPARK-8020] [SQL] Spark SQL conf in spark-defaults.conf make metadataHive get
constructed too early
https://issues.apache.org/jira/browse/SPARK-8020
Author: Yin Huai yh...@databricks.com
Closes #6571 from yhuai
yhuai/SPARK-8020-1 and squashes the following commits:
0398f5b [Yin Huai] First populate the SQLConf and then construct executionHive
and metadataHive.
(cherry picked from commit 7b7f7b6c6fd903e2ecfc886d29eaa9df58adcfc3)
Signed-off-by: Yin Huai yh...@databricks.com
Project: http://git-wip
Repository: spark
Updated Branches:
refs/heads/master ed5c2dccd - bbdfc0a40
[SPARK-8121] [SQL] Fixes InsertIntoHadoopFsRelation job initialization for
Hadoop 1.x
For Hadoop 1.x, the `TaskAttemptContext` constructor clones the `Configuration`
argument, thus configurations done in
Repository: spark
Updated Branches:
refs/heads/branch-1.4 a3afc2cba - 69197c3e3
[SPARK-8121] [SQL] Fixes InsertIntoHadoopFsRelation job initialization for
Hadoop 1.x (branch 1.4 backport based on
https://github.com/apache/spark/pull/6669)
Project:
Repository: spark
Updated Branches:
refs/heads/master 4f16d3fe2 - 4060526cd
[SPARK-7747] [SQL] [DOCS] spark.sql.planner.externalSort
Add documentation for spark.sql.planner.externalSort
Author: Luca Martinetti l...@luca.io
Closes #6272 from lucamartinetti/docs-externalsort and squashes the
Repository: spark
Updated Branches:
refs/heads/branch-1.4 200c980a1 - 94f65bcce
[SPARK-7747] [SQL] [DOCS] spark.sql.planner.externalSort
Add documentation for spark.sql.planner.externalSort
Author: Luca Martinetti l...@luca.io
Closes #6272 from lucamartinetti/docs-externalsort and squashes
Repository: spark
Updated Branches:
refs/heads/master 6ebe419f3 - eb19d3f75
[SPARK-6964] [SQL] Support Cancellation in the Thrift Server
Support runInBackground in SparkExecuteStatementOperation, and add cancellation
Author: Dong Wang d...@databricks.com
Closes #6207 from
Repository: spark
Updated Branches:
refs/heads/branch-1.4 815e05654 - cbaf59544
[SPARK-8014] [SQL] Avoid premature metadata discovery when writing a
HadoopFsRelation with a save mode other than Append
The current code references the schema of the DataFrame to be written before
checking save
Repository: spark
Updated Branches:
refs/heads/branch-1.4 ee7f365bd - 54a4ea407
[SPARK-7973] [SQL] Increase the timeout of two CliSuite tests.
https://issues.apache.org/jira/browse/SPARK-7973
Author: Yin Huai yh...@databricks.com
Closes #6525 from yhuai/SPARK-7973 and squashes the following
Repository: spark
Updated Branches:
refs/heads/master 28dbde387 - f1646e102
[SPARK-7973] [SQL] Increase the timeout of two CliSuite tests.
https://issues.apache.org/jira/browse/SPARK-7973
Author: Yin Huai yh...@databricks.com
Closes #6525 from yhuai/SPARK-7973 and squashes the following
Repository: spark
Updated Branches:
refs/heads/branch-1.4 b836bac3f - 451c8722a
[SPARK-8406] [SQL] Backports SPARK-8406 and PR #6864 to branch-1.4
Author: Cheng Lian l...@databricks.com
Closes #6932 from liancheng/spark-8406-for-1.4 and squashes the following
commits:
a0168fe [Cheng Lian]
://github.com/liancheng/spark/tree/spark-8513
Some background and a summary of offline discussion with yhuai about this issue
for better understanding:
In 1.4.0, we added `HadoopFsRelation` to abstract partition support of all data
sources that are based on Hadoop `FileSystem` interface
/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
Author: Michael Armbrust mich...@databricks.com
Closes #6914 from yhuai/timeCompareString-1.4 and squashes the following
commits:
9882915 [Michael Armbrust] [SPARK-8420] [SQL] Fix comparison of
timestamps/dates
actions (i.e.
`save/saveAsTable/json/parquet/jdbc`) always override mode. Second, it adds
input argument `partitionBy` to `save/saveAsTable/parquet`.
Author: Yin Huai yh...@databricks.com
Closes #6937 from yhuai/SPARK-8532 and squashes the following commits:
f972d5d [Yin Huai] davies's comment
actions (i.e.
`save/saveAsTable/json/parquet/jdbc`) always override mode. Second, it adds
input argument `partitionBy` to `save/saveAsTable/parquet`.
Author: Yin Huai yh...@databricks.com
Closes #6937 from yhuai/SPARK-8532 and squashes the following commits:
f972d5d [Yin Huai] davies's comment
Repository: spark
Updated Branches:
refs/heads/master 9814b971f - a333a72e0
[SPARK-8420] [SQL] Fix comparison of timestamps/dates with strings
In earlier versions of Spark SQL we cast `TimestampType` and `DateType` to
`StringType` when they were involved in a binary comparison with a
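As background on why comparing dates as strings misbehaves: lexicographic order matches chronological order only when every value is zero-padded in one fixed format. A small standalone Python illustration (not Spark code):

```python
from datetime import date

# String comparison: "7" > "1", so July sorts after November here.
print("2015-7-15" < "2015-11-01")             # False
# Typed comparison gives the chronological answer.
print(date(2015, 7, 15) < date(2015, 11, 1))  # True
```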
Repository: spark
Updated Branches:
refs/heads/branch-1.4 1a6b51078 - 0131142d9
[SPARK-8093] [SQL] Remove empty structs inferred from JSON documents
Author: Nathan Howell nhow...@godaddy.com
Closes #6799 from NathanHowell/spark-8093 and squashes the following commits:
76ac3e8 [Nathan
Repository: spark
Updated Branches:
refs/heads/master 1fa29c2df - 9814b971f
[SPARK-8093] [SQL] Remove empty structs inferred from JSON documents
Author: Nathan Howell nhow...@godaddy.com
Closes #6799 from NathanHowell/spark-8093 and squashes the following commits:
76ac3e8 [Nathan Howell]
Repository: spark
Updated Branches:
refs/heads/branch-1.4 2a7ea31a9 - b836bac3f
[HOTFIX] Hotfix branch-1.4 building by removing avgMetrics in
CrossValidatorSuite
Ref. #6905
ping yhuai
Author: Liang-Chi Hsieh vii...@gmail.com
Closes #6929 from viirya/hot_fix_cv_test and squashes
Repository: spark
Updated Branches:
refs/heads/branch-1.4 0131142d9 - 2510365fa
[HOT-FIX] Fix compilation (caused by 0131142d98b191f6cc112d383aa10582a3ac35bf)
Author: Yin Huai yh...@databricks.com
Closes #6913 from yhuai/branch-1.4-hotfix and squashes the following commits:
7f91fa0 [Yin
Repository: spark
Updated Branches:
refs/heads/branch-1.4 2510365fa - 2248ad8b7
[SPARK-8498] [SQL] Add regression test for SPARK-8470
**Summary of the problem in SPARK-8470.** When using `HiveContext` to create a
data frame of a user case class, Spark throws
Repository: spark
Updated Branches:
refs/heads/master b305e377f - 093c34838
[SPARK-8498] [SQL] Add regression test for SPARK-8470
**Summary of the problem in SPARK-8470.** When using `HiveContext` to create a
data frame of a user case class, Spark throws
Repository: spark
Updated Branches:
refs/heads/master e988adb58 - f9b397f54
[SPARK-8567] [SQL] Add logs to record the progress of HiveSparkSubmitSuite.
Author: Yin Huai yh...@databricks.com
Closes #7009 from yhuai/SPARK-8567 and squashes the following commits:
62fb1f9 [Yin Huai] Add sc.stop
Repository: spark
Updated Branches:
refs/heads/master a458efc66 - 50c3a86f4
[SPARK-6749] [SQL] Make metastore client robust to underlying socket connection
loss
This works around a bug in the underlying RetryingMetaStoreClient (HIVE-10384)
by refreshing the metastore client on thrift
.
(This test suite only fails on Jenkins and doesn't print any logs...)
cc yhuai
Author: Cheng Lian l...@databricks.com
Closes #6978 from liancheng/debug-hive-spark-submit-suite and squashes the
following commits:
b031647 [Cheng Lian] Prints process stdout/stderr instead of logging them
#6966 from yhuai/SPARK-8578-branch-1.4 and squashes the following
commits:
9c3947b [Yin Huai] Do not use a custom output committer when appending data.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7e53ff25
Tree: http://git
to an
existing dir. This change adds the logic to check if we are appending data,
and if so, we use the output committer associated with the file output format.
Author: Yin Huai yh...@databricks.com
Closes #6964 from yhuai/SPARK-8578 and squashes the following commits:
43544c4 [Yin Huai] Do not use
Repository: spark
Updated Branches:
refs/heads/branch-1.4 994abbaeb - d73900a90
[SPARK-7859] [SQL] Collect_set() behavior differences which fails the unit test
under jdk8
To reproduce that:
```
JAVA_HOME=/home/hcheng/Java/jdk1.8.0_45 | build/sbt -Phadoop-2.3 -Phive
'test-only
Context fails
to create in spark shell because of the class loader issue.
Author: Yin Huai yh...@databricks.com
Closes #6459 from yhuai/SPARK-7853 and squashes the following commits:
37ad33e [Yin Huai] Do not use hiveQlTable at all.
47cdb6d [Yin Huai] Move hiveconf.set to the end of setConf
that Hive Context fails
to create in spark shell because of the class loader issue.
Author: Yin Huai yh...@databricks.com
Closes #6459 from yhuai/SPARK-7853 and squashes the following commits:
37ad33e [Yin Huai] Do not use hiveQlTable at all.
47cdb6d [Yin Huai] Move hiveconf.set to the end of setConf
Repository: spark
Updated Branches:
refs/heads/branch-1.4 90525c9ba - a25ce91f9
[SPARK-7847] [SQL] Fixes dynamic partition directory escaping
Please refer to [SPARK-7847] [1] for details.
[1]: https://issues.apache.org/jira/browse/SPARK-7847
Author: Cheng Lian l...@databricks.com
Closes
Repository: spark
Updated Branches:
refs/heads/master ff0ddff46 - 15459db4f
[SPARK-7847] [SQL] Fixes dynamic partition directory escaping
Please refer to [SPARK-7847] [1] for details.
[1]: https://issues.apache.org/jira/browse/SPARK-7847
Author: Cheng Lian l...@databricks.com
Closes #6389
Repository: spark
Updated Branches:
refs/heads/master 6fec1a940 - 8161562ea
[SPARK-7790] [SQL] date and decimal conversion for dynamic partition key
Author: Daoyuan Wang daoyuan.w...@intel.com
Closes #6318 from adrian-wang/dynpart and squashes the following commits:
ad73b61 [Daoyuan Wang]
Repository: spark
Updated Branches:
refs/heads/branch-1.4 d33142fd8 - 89fe93fc3
[SPARK-7684] [SQL] Refactoring MetastoreDataSourcesSuite to workaround
SPARK-7684
As stated in SPARK-7684, currently `TestHive.reset` has some execution order
specific bug, which makes running specific test
Repository: spark
Updated Branches:
refs/heads/branch-1.4 89fe93fc3 - e07b71560
[SPARK-7853] [SQL] Fixes a class loader issue in Spark SQL
This PR is based on PR #6396 authored by chenghao-intel. Essentially, Spark SQL
should use context classloader to load SerDe classes.
yhuai helped
Repository: spark
Updated Branches:
refs/heads/master b97ddff00 - db3fd054f
[SPARK-7853] [SQL] Fixes a class loader issue in Spark SQL
This PR is based on PR #6396 authored by chenghao-intel. Essentially, Spark SQL
should use context classloader to load SerDe classes.
yhuai helped updating
Repository: spark
Updated Branches:
refs/heads/branch-1.4 faadbd4d9 - d0bd68ff8
[SPARK-7868] [SQL] Ignores _temporary directories in HadoopFsRelation
So that potential partial/corrupted data files left by failed tasks/jobs won't
affect normal data scan.
Author: Cheng Lian
Repository: spark
Updated Branches:
refs/heads/master a51b133de - e7b617755
[SPARK-7950] [SQL] Sets spark.sql.hive.version in
HiveThriftServer2.startWithContext()
When starting `HiveThriftServer2` via `startWithContext`, property
`spark.sql.hive.version` isn't set. This causes Simba ODBC
Repository: spark
Updated Branches:
refs/heads/branch-1.4 23bd05fff - caea7a618
[SPARK-7950] [SQL] Sets spark.sql.hive.version in
HiveThriftServer2.startWithContext()
When starting `HiveThriftServer2` via `startWithContext`, property
`spark.sql.hive.version` isn't set. This causes Simba
`; and
3. Renaming the title of the session page from `ThriftServer` to `JDBC/ODBC
Session`.
https://issues.apache.org/jira/browse/SPARK-7907
Author: Yin Huai yh...@databricks.com
Closes #6448 from yhuai/JDBCServer and squashes the following commits:
eadcc3d [Yin Huai] Update test.
9168005 [Yin Huai
. Renaming the title of the session page from `ThriftServer` to `JDBC/ODBC
Session`.
https://issues.apache.org/jira/browse/SPARK-7907
Author: Yin Huai yh...@databricks.com
Closes #6448 from yhuai/JDBCServer and squashes the following commits:
eadcc3d [Yin Huai] Update test.
9168005 [Yin Huai] Use
from yhuai/getBackEvaluatedType and squashes the following commits:
618c2eb [Yin Huai] Add EvaluatedType back.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8c3fc3a6
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree
Repository: spark
Updated Branches:
refs/heads/branch-1.4 de0802499 - f142867ec
[SPARK-8776] Increase the default MaxPermSize
I am increasing the perm gen size to 256m.
https://issues.apache.org/jira/browse/SPARK-8776
Author: Yin Huai yh...@databricks.com
Closes #7196 from yhuai/SPARK-8776
Repository: spark
Updated Branches:
refs/heads/master a59d14f62 - f743c79ab
[SPARK-8776] Increase the default MaxPermSize
I am increasing the perm gen size to 256m.
https://issues.apache.org/jira/browse/SPARK-8776
Author: Yin Huai yh...@databricks.com
Closes #7196 from yhuai/SPARK-8776
`SQLTestUtils` and `ParquetTest` in `src/main`. We should only add
stuff that will be needed by `sql/console` or Python tests (for Python, we need
it in `src/main`, right? davies).
Author: Yin Huai yh...@databricks.com
Closes #6334 from yhuai/SPARK-7805 and squashes the following commits
Repository: spark
Updated Branches:
refs/heads/branch-1.4 947d700ec - 11d998eb7
[SPARK-7845] [BUILD] Bump Hadoop 1 tests to version 1.2.1
https://issues.apache.org/jira/browse/SPARK-7845
Author: Yin Huai yh...@databricks.com
Closes #6384 from yhuai/hadoop1Test and squashes the following
Repository: spark
Updated Branches:
refs/heads/master 3c1a2d049 - bfbc0df72
[SPARK-7845] [BUILD] Bump Hadoop 1 tests to version 1.2.1
https://issues.apache.org/jira/browse/SPARK-7845
Author: Yin Huai yh...@databricks.com
Closes #6384 from yhuai/hadoop1Test and squashes the following commits
Repository: spark
Updated Branches:
refs/heads/master ad0badba1 - efe3bfdf4
[SPARK-7322, SPARK-7836, SPARK-7822][SQL] DataFrame window function related
updates
1. ntile should take an integer as a parameter.
2. Added Python API (based on #6364)
3. Update documentation of various DataFrame
Repository: spark
Updated Branches:
refs/heads/branch-1.4 ea9db50bc - d1515381c
[SPARK-7322, SPARK-7836, SPARK-7822][SQL] DataFrame window function related
updates
1. ntile should take an integer as a parameter.
2. Added Python API (based on #6364)
3. Update documentation of various DataFrame
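For reference, `ntile(n)` assigns each row of an ordered window partition to one of `n` roughly equal buckets, which is why the argument must be an integer. A pure-Python sketch of the standard SQL NTILE bucket assignment (illustrative, not the Spark implementation):

```python
def ntile(n, num_rows):
    # Assign each of num_rows ordered rows to buckets 1..n; the first
    # (num_rows % n) buckets get one extra row, as in SQL NTILE.
    base, extra = divmod(num_rows, n)
    buckets = []
    for b in range(1, n + 1):
        buckets.extend([b] * (base + (1 if b <= extra else 0)))
    return buckets

print(ntile(4, 10))  # [1, 1, 1, 2, 2, 2, 3, 3, 4, 4]
```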
Closes #6366 from yhuai/insert and squashes the following commits:
3d717fb [Yin Huai] Use insertInto to handle the case when the table exists and
Append is used for saveAsTable.
56d2540 [Yin Huai] Add PreWriteCheck to HiveContext's analyzer.
c636e35 [Yin Huai] Remove unnecessary empty lines
...@databricks.com
Closes #6366 from yhuai/insert and squashes the following commits:
3d717fb [Yin Huai] Use insertInto to handle the case when the table exists and
Append is used for saveAsTable.
56d2540 [Yin Huai] Add PreWriteCheck to HiveContext's analyzer.
c636e35 [Yin Huai] Remove unnecessary empty lines
Repository: spark
Updated Branches:
refs/heads/master 13348e21b - 8730fbb47
[SPARK-7749] [SQL] Fixes partition discovery for non-partitioned tables
When no partition columns can be found, we should have an empty
`PartitionSpec`, rather than a `PartitionSpec` with empty partition columns.
Repository: spark
Updated Branches:
refs/heads/branch-1.4 b97a8053a - 70d9839cf
[SPARK-7749] [SQL] Fixes partition discovery for non-partitioned tables
When no partition columns can be found, we should have an empty
`PartitionSpec`, rather than a `PartitionSpec` with empty partition columns.
Repository: spark
Updated Branches:
refs/heads/branch-1.4 33e0e - 96c82515b
[SPARK-7763] [SPARK-7616] [SQL] Persists partition columns into metastore
Author: Yin Huai yh...@databricks.com
Author: Cheng Lian l...@databricks.com
Closes #6285 from liancheng/spark-7763 and squashes the
Repository: spark
Updated Branches:
refs/heads/master 311fab6f1 - 30f3f556f
[SPARK-7763] [SPARK-7616] [SQL] Persists partition columns into metastore
Author: Yin Huai yh...@databricks.com
Author: Cheng Lian l...@databricks.com
Closes #6285 from liancheng/spark-7763 and squashes the following
Repository: spark
Updated Branches:
refs/heads/branch-1.4 c9a80fc40 - ba04b5236
[SPARK-7718] [SQL] Speed up partitioning by avoiding closure cleaning
According to yhuai we spent 6-7 seconds cleaning closures in a partitioning job
that takes 12 seconds. Since we provide these closures
Repository: spark
Updated Branches:
refs/heads/master 6b18cdc1b - 5287eec5a
[SPARK-7718] [SQL] Speed up partitioning by avoiding closure cleaning
According to yhuai we spent 6-7 seconds cleaning closures in a partitioning job
that takes 12 seconds. Since we provide these closures in Spark we
Repository: spark
Updated Branches:
refs/heads/master feb3a9d3f - a25c1ab8f
[SPARK-7565] [SQL] fix MapType in JsonRDD
The keys of a Map in JsonRDD should be converted into UTF8String (also failed
records). Thanks to yhuai and viirya.
Closes #6084
Author: Davies Liu dav...@databricks.com
Closes
Repository: spark
Updated Branches:
refs/heads/branch-1.4 f0e421351 - 3aa618510
[SPARK-7565] [SQL] fix MapType in JsonRDD
The keys of a Map in JsonRDD should be converted into UTF8String (also failed
records). Thanks to yhuai and viirya.
Closes #6084
Author: Davies Liu dav...@databricks.com
Repository: spark
Updated Branches:
refs/heads/master 1ee8eb431 - feb3a9d3f
[SPARK-7320] [SQL] [Minor] Move the testData into beforeAll()
Follow-up of #6340, to avoid the test report going missing when it fails.
Author: Cheng Hao hao.ch...@intel.com
Closes #6312 from chenghao-intel/rollup_minor
Repository: spark
Updated Branches:
refs/heads/branch-1.4 f08c6f319 - f0e421351
[SPARK-7320] [SQL] [Minor] Move the testData into beforeAll()
Follow-up of #6340, to avoid the test report going missing when it fails.
Author: Cheng Hao hao.ch...@intel.com
Closes #6312 from
join
t2 on (t1.x = t2.x) join t3 on (t2.x = t3.x)` will only have three Exchange
operators (when shuffled joins are needed) instead of four.
The code in this PR was authored by yhuai; I'm opening this PR to factor out
this change from #7685, a larger pull request which contains two other
(https://github.com/apache/spark/pull/6780). Also, we need to backport the fix
of `TakeOrderedAndProject` as well (https://github.com/apache/spark/pull/8179).
Author: Wenchen Fan cloud0...@outlook.com
Author: Yin Huai yh...@databricks.com
Closes #8252 from yhuai/backport7289And9949.
Project
Repository: spark
Updated Branches:
refs/heads/branch-1.5 e2c6ef810 - 90245f65c
[SPARK-10005] [SQL] Fixes schema merging for nested structs
In case of schema merging, we only handled first-level fields when converting
Parquet groups to `InternalRow`s. Nested struct fields are not properly
Repository: spark
Updated Branches:
refs/heads/master cf016075a - ae2370e72
[SPARK-10005] [SQL] Fixes schema merging for nested structs
In case of schema merging, we only handled first-level fields when converting
Parquet groups to `InternalRow`s. Nested struct fields are not properly
: Yin Huai yh...@databricks.com
Closes #8346 from yhuai/parquetMinSplit.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e3355090
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e3355090
Diff: http://git-wip-us.apache.org
.
Author: Yin Huai yh...@databricks.com
Closes #8346 from yhuai/parquetMinSplit.
(cherry picked from commit e3355090d4030daffed5efb0959bf1d724c13c13)
Signed-off-by: Yin Huai yh...@databricks.com
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf
Repository: spark
Updated Branches:
refs/heads/branch-1.5 675e22494 - 5be517584
[SPARK-10100] [SQL] Eliminate hash table lookup if there is no grouping key in
aggregation.
This improves performance by ~20-30% in one of my local tests and should fix
the performance regression from 1.4 to
Repository: spark
Updated Branches:
refs/heads/master 43e013542 - b4f4e91c3
[SPARK-10100] [SQL] Eliminate hash table lookup if there is no grouping key in
aggregation.
This improves performance by ~20-30% in one of my local tests and should fix
the performance regression from 1.4 to 1.5
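The idea behind the optimization: when an aggregation has no grouping key, every row maps to the same single group, so a plain accumulator can replace the per-row hash table lookup. A pure-Python sketch of the two paths (illustrative names, not Spark internals):

```python
def sum_with_grouping(rows, key_fn):
    # General path: one hash table lookup per row.
    acc = {}
    for r in rows:
        k = key_fn(r)
        acc[k] = acc.get(k, 0) + r
    return acc

def sum_no_grouping(rows):
    # Specialized path for no grouping key: a single accumulator,
    # no per-row hash lookup.
    total = 0
    for r in rows:
        total += r
    return total

rows = [1, 2, 3, 4]
print(sum_with_grouping(rows, lambda _: ()))  # {(): 10}
print(sum_no_grouping(rows))                  # 10
```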
from yhuai/SPARK-8567-1.4 and squashes the following commits:
0ae2e14 [Yin Huai] [SPARK-8567] [SQL] Add logs to record the progress of
HiveSparkSubmitSuite.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0de1737a
Tree: http
Repository: spark
Updated Branches:
refs/heads/branch-1.4 80d53565a - ffc793a6c
[SPARK-8715] ArrayOutOfBoundsException fixed for DataFrameStatSuite.crosstab
cc yhuai
Author: Burak Yavuz brk...@gmail.com
Closes #7100 from brkyvz/ct-flakiness-fix and squashes the following commits:
abc299a
Repository: spark
Updated Branches:
refs/heads/master f79410c49 - e6c3f7462
[SPARK-8650] [SQL] Use the user-specified app name priority in
SparkSQLCLIDriver or HiveThriftServer2
When running `./bin/spark-sql --name query1.sql`
[Before]
Repository: spark
Updated Branches:
refs/heads/master e78ec1a8f - 3744b7fd4
[SPARK-9422] [SQL] Remove the placeholder attributes used in the aggregation
buffers
https://issues.apache.org/jira/browse/SPARK-9422
Author: Yin Huai yh...@databricks.com
Closes #7737 from yhuai/removePlaceHolder
# from yhuai/SPARK-9466 and squashes the following commits:
e0e3a86 [Yin Huai] Increase the timeout.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/815c8245
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/815c8245
Repository: spark
Updated Branches:
refs/heads/master 0a1d2ca42 - 39ab199a3
[SPARK-8640] [SQL] Enable Processing of Multiple Window Frames in a Single
Window Operator
This PR enables the processing of multiple window frames in a single window
operator. This should improve the performance of
Author: Yin Huai yh...@databricks.com
Closes #7832 from yhuai/SPARK-9233 and squashes the following commits:
4e4e4cc [Yin Huai] style
ca80e07 [Yin Huai] Test window function with codegen.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark