Repository: spark
Updated Branches:
refs/heads/master 34dcf1010 -> 23d982204
[SPARK-9141] [SQL] Remove project collapsing from DataFrame API
Currently we collapse successive projections that are added by `withColumn`.
However, this optimization violates the constraint that adding nodes to a
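The pattern being removed can be sketched as follows (a hedged example; column names are illustrative and not from the patch):

```scala
import org.apache.spark.sql.functions._

// Each withColumn call wraps the current plan in another Project node.
val df = sqlContext.range(5)
  .withColumn("a", col("id") + 1)
  .withColumn("b", col("id") * 2)

// The nested Projects now stay visible in the analyzed plan...
println(df.queryExecution.analyzed)
// ...and are collapsed only later, by the optimizer's project-collapsing rule.
println(df.queryExecution.optimizedPlan)
```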
Repository: spark
Updated Branches:
refs/heads/branch-1.5 eedb996dd -> 125827a4f
[SPARK-9141] [SQL] Remove project collapsing from DataFrame API
Currently we collapse successive projections that are added by `withColumn`.
However, this optimization violates the constraint that adding nodes
Repository: spark
Updated Branches:
refs/heads/master 7a969a696 -> 1f8c364b9
[SPARK-9141] [SQL] [MINOR] Fix comments of PR #7920
This is a follow-up of https://github.com/apache/spark/pull/7920 to fix
comments.
Author: Yin Huai yh...@databricks.com
Closes #7964 from yhuai/SPARK-9141-follow
Repository: spark
Updated Branches:
refs/heads/branch-1.5 03bcf627d -> 19018d542
[SPARK-9141] [SQL] [MINOR] Fix comments of PR #7920
This is a follow-up of https://github.com/apache/spark/pull/7920 to fix
comments.
Author: Yin Huai yh...@databricks.com
Closes #7964 from yhuai/SPARK-9141
Repository: spark
Updated Branches:
refs/heads/master eb5b8f4a6 -> 5f0fb6466
[SPARK-9649] Fix flaky test MasterSuite - randomize ports
```
Error Message
Failed to bind to: /127.0.0.1:7093: Service 'sparkMaster' failed after 16
retries!
Stacktrace
java.net.BindException: Failed to bind
Repository: spark
Updated Branches:
refs/heads/branch-1.5 b8136d7e0 -> 05cbf133d
[SPARK-9649] Fix flaky test MasterSuite - randomize ports
```
Error Message
Failed to bind to: /127.0.0.1:7093: Service 'sparkMaster' failed after 16
retries!
Stacktrace
java.net.BindException: Failed to
Repository: spark
Updated Branches:
refs/heads/master 7bbf02f0b -> 5363ed715
[SPARK-9361] [SQL] Refactor new aggregation code to reduce the times of
checking compatibility
JIRA: https://issues.apache.org/jira/browse/SPARK-9361
Currently, we call `aggregate.Utils.tryConvert` in many places to
Repository: spark
Updated Branches:
refs/heads/master d378396f8 -> dfe347d2c
[SPARK-9785] [SQL] HashPartitioning compatibility should consider expression
ordering
HashPartitioning compatibility is currently defined w.r.t the _set_ of
expressions, but the ordering of those expressions matters
[SPARK-9646] [SQL] Add metrics for all join and aggregate operators
This PR added metrics for all join and aggregate operators. However, I found
the metrics may be confusing in the following two cases:
1. The iterator is not totally consumed and the metric values will be less.
2. Recreating the
Repository: spark
Updated Branches:
refs/heads/branch-1.5 71460b889 -> 767ee1884
http://git-wip-us.apache.org/repos/asf/spark/blob/767ee188/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
--
from yhuai/unsafeEmptyMap and squashes the following commits:
9727abe [Yin Huai] Address Josh's comments.
34b6f76 [Yin Huai] 1. UnsafeKVExternalSorter does not use 0 as the initialSize
to create an UnsafeInMemorySorter if its BytesToBytesMap is empty. 2. Do not
spill a InMemorySorter
Repository: spark
Updated Branches:
refs/heads/master 93085c992 -> 9f94c85ff
[SPARK-9593] [SQL] [HOTFIX] Makes the Hadoop shims loading fix more robust
This is a follow-up of #7929.
We found that Jenkins SBT master build still fails because of the Hadoop shims
loading issue. But the failure
Repository: spark
Updated Branches:
refs/heads/branch-1.5 c39d5d144 -> 11c28a568
[SPARK-9593] [SQL] Fixes Hadoop shims loading
This PR is used to workaround CDH Hadoop versions like 2.0.0-mr1-cdh4.1.1.
Internally, Hive `ShimLoader` tries to load different versions of Hadoop shims
by checking
Repository: spark
Updated Branches:
refs/heads/branch-1.5 2382b483a -> b51159def
[SPARK-9632] [SQL] [HOT-FIX] Fix build.
It seems https://github.com/apache/spark/pull/7955 breaks the build.
Author: Yin Huai yh...@databricks.com
Closes #8001 from yhuai/SPARK-9632-fixBuild and squashes
Repository: spark
Updated Branches:
refs/heads/master 2eca46a17 -> cdd53b762
[SPARK-9632] [SQL] [HOT-FIX] Fix build.
It seems https://github.com/apache/spark/pull/7955 breaks the build.
Author: Yin Huai yh...@databricks.com
Closes #8001 from yhuai/SPARK-9632-fixBuild and squashes the following
Repository: spark
Updated Branches:
refs/heads/branch-1.5 11c28a568 -> cc4c569a8
[SPARK-9593] [SQL] [HOTFIX] Makes the Hadoop shims loading fix more robust
This is a follow-up of #7929.
We found that Jenkins SBT master build still fails because of the Hadoop shims
loading issue. But the
before that so we never caught it. This patch
re-enables the test and adds the code necessary to make it pass.
JoshRosen yhuai
Author: Andrew Or and...@databricks.com
Closes #8015 from andrewor14/SPARK-9674 and squashes the following commits:
225eac2 [Andrew Or] Merge branch 'master
Repository: spark
Updated Branches:
refs/heads/branch-1.5 874b9d855 -> 251d1eef4
[SPARK-6212] [SQL] The EXPLAIN output of CTAS only shows the analyzed plan
JIRA: https://issues.apache.org/jira/browse/SPARK-6212
Author: Yijie Shen henry.yijies...@gmail.com
Closes #7986 from
Repository: spark
Updated Branches:
refs/heads/master 25c363e93 -> 3ca995b78
[SPARK-6212] [SQL] The EXPLAIN output of CTAS only shows the analyzed plan
JIRA: https://issues.apache.org/jira/browse/SPARK-6212
Author: Yijie Shen henry.yijies...@gmail.com
Closes #7986 from yjshen/ctas_explain
Repository: spark
Updated Branches:
refs/heads/branch-1.5 b12f0737f -> 1ce5061bb
[SPARK-8930] [SQL] Throw an AnalysisException with meaningful messages if
DataFrame#explode takes a star in expressions
Author: Yijie Shen henry.yijies...@gmail.com
Closes #8057 from yjshen/explode_star and
Repository: spark
Updated Branches:
refs/heads/master e9c36938b -> 68ccc6e18
[SPARK-8930] [SQL] Throw an AnalysisException with meaningful messages if
DataFrame#explode takes a star in expressions
Author: Yijie Shen henry.yijies...@gmail.com
Closes #8057 from yjshen/explode_star and squashes
Repository: spark
Updated Branches:
refs/heads/master a863348fd -> 23cf5af08
[SPARK-9703] [SQL] Refactor EnsureRequirements to avoid certain unnecessary
shuffles
This pull request refactors the `EnsureRequirements` planning rule in order to
avoid the addition of certain unnecessary shuffles.
Repository: spark
Updated Branches:
refs/heads/branch-1.5 1ce5061bb -> 323d68606
[SPARK-9703] [SQL] Refactor EnsureRequirements to avoid certain unnecessary
shuffles
This pull request refactors the `EnsureRequirements` planning rule in order to
avoid the addition of certain unnecessary
Repository: spark
Updated Branches:
refs/heads/branch-1.5 f75c64b0c -> 94b2f5b32
[SPARK-9743] [SQL] Fixes JSONRelation refreshing
PR #7696 added two `HadoopFsRelation.refresh()` calls ([this] [1], and [this]
[2]) in `DataSourceStrategy` to make test case `InsertSuite.save directly to
the
Repository: spark
Updated Branches:
refs/heads/master 5a5bbc299 -> afa757c98
[SPARK-9849] [SQL] DirectParquetOutputCommitter qualified name should be
backward compatible
DirectParquetOutputCommitter was moved in SPARK-9763. However, users can
explicitly set the class as a config option, so
Repository: spark
Updated Branches:
refs/heads/branch-1.5 b7497e3a2 -> ec7a4b9b0
[SPARK-9849] [SQL] DirectParquetOutputCommitter qualified name should be
backward compatible
DirectParquetOutputCommitter was moved in SPARK-9763. However, users can
explicitly set the class as a config option,
/Spark-Master-SBT/3088/AMPLAB_JENKINS_BUILD_PROFILE=hadoop1.0,label=centos/console
Author: Yin Huai yh...@databricks.com
Closes #7702 from yhuai/SPARK-9385 and squashes the following commits:
146e6ef [Yin Huai] Comment out Python style check because of error shown in
https://amplab.cs.berkeley.edu
...@databricks.com
Closes #7704 from yhuai/SPARK-9385 and squashes the following commits:
0056359 [Yin Huai] Enable PEP8 but disable installing pylint.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dafe8d85
Tree: http://git-wip
Repository: spark
Updated Branches:
refs/heads/master 8ddfa52c2 -> ce89ff477
[SPARK-9386] [SQL] Feature flag for metastore partition pruning
Since we have been seeing a lot of failures related to this new feature, lets
put it behind a flag and turn it off by default.
Author: Michael Armbrust
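The flag can be sketched as follows (a hedged example; the conf key is believed to be the one added by this patch, and the session usage is illustrative):

```scala
// Off by default after this change; opt back in per session:
sqlContext.setConf("spark.sql.hive.metastorePartitionPruning", "true")
```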
Repository: spark
Updated Branches:
refs/heads/master b55a36bc3 -> 76520955f
[SPARK-9082] [SQL] Filter using non-deterministic expressions should not be
pushed down
Author: Wenchen Fan cloud0...@outlook.com
Closes #7446 from cloud-fan/filter and squashes the following commits:
330021e
Repository: spark
Updated Branches:
refs/heads/branch-1.4 ff5e5f228 -> 712e13bba
[SPARK-9254] [BUILD] [HOTFIX] sbt-launch-lib.bash should support HTTP/HTTPS
redirection
Target file(s) can be hosted on CDN nodes. HTTP/HTTPS redirection must be
supported to download these files.
Author: Cheng
Repository: spark
Updated Branches:
refs/heads/master 26ed22aec -> 52ef76de2
[SPARK-9082] [SQL] [FOLLOW-UP] use `partition` in `PushPredicateThroughProject`
a follow up of https://github.com/apache/spark/pull/7446
Author: Wenchen Fan cloud0...@outlook.com
Closes #7607 from cloud-fan/tmp and
Repository: spark
Updated Branches:
refs/heads/master b536d5dc6 -> 43dac2c88
[SPARK-6941] [SQL] Provide a better error message when inserting into an
RDD-based table
JIRA: https://issues.apache.org/jira/browse/SPARK-6941
Author: Yijie Shen henry.yijies...@gmail.com
Closes #7342 from
Repository: spark
Updated Branches:
refs/heads/master fb1d06fc2 -> 4b5cfc988
[SPARK-8800] [SQL] Fix inaccurate precision/scale of Decimal division operation
JIRA: https://issues.apache.org/jira/browse/SPARK-8800
Previously, we turn to Java BigDecimal's divide with specified ROUNDING_MODE to
/31bd30687bc29c0e457c37308d489ae2b6e5b72a
(SPARK-8359)
*
https://github.com/apache/spark/commit/24fda7381171738cbbbacb5965393b660763e562
(SPARK-8677)
*
https://github.com/apache/spark/commit/4b5cfc988f23988c2334882a255d494fc93d252e
(SPARK-8800)
Author: Yin Huai yh...@databricks.com
Closes #7426 from yhuai/SPARK-9060 and squashes
Repository: spark
Updated Branches:
refs/heads/master ba3309684 -> e27212317
[SPARK-8972] [SQL] Incorrect result for rollup
We don't support complex expression keys in rollup/cube, and we don't even
report it when the group-by keys are complex, which will cause very
Repository: spark
Updated Branches:
refs/heads/master 111c05538 -> 3f6d28a5c
[SPARK-9102] [SQL] Improve project collapse with nondeterministic expressions
Currently we will stop project collapse when the lower projection has
nondeterministic expressions. However, this is sometimes overkill; we
Repository: spark
Updated Branches:
refs/heads/master 04c1b49f5 -> a9a0d0ceb
[SPARK-8638] [SQL] Window Function Performance Improvements
## Description
Performance improvements for Spark Window functions. This PR will also serve as
the basis for moving away from Hive UDAFs to Spark UDAFs. See
Repository: spark
Updated Branches:
refs/heads/master a803ac3e0 -> 7a8124534
[SPARK-8638] [SQL] Window Function Performance Improvements - Cleanup
This PR contains a few clean-ups that are a part of SPARK-8638: a few style
issues got fixed, and a few tests were moved.
Git commit message is
Repository: spark
Updated Branches:
refs/heads/master 9ce0c7ad3 -> 662bb9667
[SPARK-10144] [UI] Actually show peak execution memory by default
The peak execution memory metric was introduced in SPARK-8735. That was before
Tungsten was enabled by default, so it assumed that
Repository: spark
Updated Branches:
refs/heads/branch-1.5 43dcf95e4 -> 831f78ee5
[SPARK-10144] [UI] Actually show peak execution memory by default
The peak execution memory metric was introduced in SPARK-8735. That was before
Tungsten was enabled by default, so it assumed that
Repository: spark
Updated Branches:
refs/heads/master 87f82a5fb -> 07ced4342
[SPARK-11253] [SQL] reset all accumulators in physical operators before execute
an action
With this change, our query execution listener can get the metrics correctly.
The UI still looks good after this change.
Repository: spark
Updated Branches:
refs/heads/master 4bb2b3698 -> d4c397a64
[SPARK-11325] [SQL] Alias 'alias' in Scala's DataFrame API
Author: Nong Li
Closes #9286 from nongli/spark-11325.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
Repository: spark
Updated Branches:
refs/heads/master dc3220ce1 -> a150e6c1b
[SPARK-10562] [SQL] support mixed case partitionBy column names for tables
stored in metastore
https://issues.apache.org/jira/browse/SPARK-10562
Author: Wenchen Fan
Closes #9226 from
Repository: spark
Updated Branches:
refs/heads/master d4c397a64 -> 82464fb2e
[SPARK-10947] [SQL] With schema inference from JSON into a Dataframe, add
option to infer all primitive object types as strings
Currently, when a schema is inferred from a JSON file using
sqlContext.read.json, the
Repository: spark
Updated Branches:
refs/heads/branch-1.5 9e3197aaa -> 76d742386
[SPARK-11246] [SQL] Table cache for Parquet broken in 1.5
The root cause is that when spark.sql.hive.convertMetastoreParquet=true by
default, the cached InMemoryRelation of the ParquetRelation can not be looked
Repository: spark
Updated Branches:
refs/heads/branch-1.5 76d742386 -> bb3b3627a
[SPARK-11032] [SQL] correctly handle having
We should not stop resolving having when the having condition is resolved, or
something like `count(1)` will crash.
Author: Wenchen Fan
Closes
Repository: spark
Updated Branches:
refs/heads/master 5e4581250 -> ffed00493
[SPARK-11125] [SQL] Uninformative exception when running spark-sql without
building with -Phive-thriftserver and SPARK_PREPEND_CLASSES is set
This is the exception after this patch. Please help review.
```
[SPARK-11347] [SQL] Support for joinWith in Datasets
This PR adds a new operation `joinWith` to a `Dataset`, which returns a `Tuple`
for each pair where a given `condition` evaluates to true.
```scala
case class ClassData(a: String, b: Int)
val ds1 = Seq(ClassData("a", 1), ClassData("b",
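A hedged completion of the truncated snippet above (sample data and join condition are illustrative, not taken from the patch):

```scala
import sqlContext.implicits._

case class ClassData(a: String, b: Int)

val ds1 = Seq(ClassData("a", 1), ClassData("b", 2)).toDS()
val ds2 = Seq(("a", 10), ("b", 20)).toDS()

// joinWith keeps both sides as typed objects, returning a Dataset of
// pairs rather than a flattened row:
val joined = ds1.joinWith(ds2, $"a" === $"_1")
// joined: Dataset[(ClassData, (String, Int))]
```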
Repository: spark
Updated Branches:
refs/heads/master 3bdbbc6c9 -> 5a5f65905
http://git-wip-us.apache.org/repos/asf/spark/blob/5a5f6590/sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
--
diff --git
Repository: spark
Updated Branches:
refs/heads/master b960a8905 -> d9c603989
[SPARK-10484] [SQL] Optimize the cartesian join with broadcast join for some
cases
In some cases, we can broadcast the smaller relation in cartesian join, which
improve the performance significantly.
Author: Cheng
Repository: spark
Updated Branches:
refs/heads/master f92b7b98e -> 032748bb9
[SPARK-11377] [SQL] withNewChildren should not convert StructType to Seq
This is minor, but I ran into while writing Datasets and while it wasn't needed
for the final solution, it was super confusing so we should
Repository: spark
Updated Branches:
refs/heads/master 032748bb9 -> 5aa052191
[SPARK-11292] [SQL] Python API for text data source
Adds DataFrameReader.text and DataFrameWriter.text.
Author: Reynold Xin
Closes #9259 from rxin/SPARK-11292.
Project:
Repository: spark
Updated Branches:
refs/heads/master 5aa052191 -> 20dfd4674
[SPARK-11363] [SQL] LeftSemiJoin should be LeftSemi in SparkStrategies
JIRA: https://issues.apache.org/jira/browse/SPARK-11363
In SparkStrategies some places use LeftSemiJoin. It should be LeftSemi.
cc
Repository: spark
Updated Branches:
refs/heads/master 4e38defae -> e1a897b65
[SPARK-11274] [SQL] Text data source support for Spark SQL.
This adds API for reading and writing text files, similar to
SparkContext.textFile and RDD.saveAsTextFile.
```
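The new API can be sketched as follows (a hedged example; paths are illustrative):

```scala
import org.apache.spark.sql.functions._

// read.text yields a DataFrame with a single string column named "value".
val lines = sqlContext.read.text("/tmp/input")
lines.filter(col("value").contains("ERROR"))
  .write.text("/tmp/errors")
```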
Repository: spark
Updated Branches:
refs/heads/master e1a897b65 -> 4725cb988
[SPARK-11194] [SQL] Use MutableURLClassLoader for the classLoader in
IsolatedClientLoader.
https://issues.apache.org/jira/browse/SPARK-11194
Author: Yin Huai <yh...@databricks.com>
Closes #9170 from yh
Repository: spark
Updated Branches:
refs/heads/branch-1.6 6e2e84f3e -> 5ccc1eb08
[SPARK-11590][SQL] use native json_tuple in lateral view
Author: Wenchen Fan
Closes #9562 from cloud-fan/json-tuple.
(cherry picked from commit 53600854c270d4c953fe95fbae528740b5cf6603)
Repository: spark
Updated Branches:
refs/heads/master dfcfcbcc0 -> 53600854c
[SPARK-11590][SQL] use native json_tuple in lateral view
Author: Wenchen Fan
Closes #9562 from cloud-fan/json-tuple.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
Repository: spark
Updated Branches:
refs/heads/branch-1.6 7de8abd6f -> 0d637571d
[SPARK-10371][SQL][FOLLOW-UP] fix code style
Author: Wenchen Fan
Closes #9627 from cloud-fan/follow.
(cherry picked from commit 1510c527b4f5ee0953ae42313ef9e16d2f5864c4)
Signed-off-by:
Repository: spark
Updated Branches:
refs/heads/master 1bc41125e -> 1510c527b
[SPARK-10371][SQL][FOLLOW-UP] fix code style
Author: Wenchen Fan
Closes #9627 from cloud-fan/follow.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
nts:
* Fix for a potential bug in distinct child expression and attribute alignment.
* Improved handling of duplicate distinct child expressions.
* Added test for distinct UDAF with multiple children.
cc yhuai
Author: Herman van Hovell <hvanhov...@questtec.nl>
Closes #9566 from hvanhovell/S
Repository: spark
Updated Branches:
refs/heads/branch-1.6 7b3736098 -> 41b2bb1c3
[SPARK-11451][SQL] Support single distinct count on multiple columns.
This PR adds support for multiple columns in a single count distinct aggregate
to the new aggregation path.
cc yhuai
Author: Herman
Repository: spark
Updated Branches:
refs/heads/master 5c4e6d7ec -> 30c8ba71a
[SPARK-11451][SQL] Support single distinct count on multiple columns.
This PR adds support for multiple columns in a single count distinct aggregate
to the new aggregation path.
cc yhuai
Author: Herman van Hov
Repository: spark
Updated Branches:
refs/heads/branch-1.6 fddf0c413 -> 7eaf48eeb
[SPARK-11453][SQL] append data to partitioned table messes up the result
The reason is that:
1. For partitioned hive table, we will move the partitioned columns after data
columns. (e.g. `` partition by
Repository: spark
Updated Branches:
refs/heads/master 97b7080cf -> d8b50f702
[SPARK-11453][SQL] append data to partitioned table messes up the result
The reason is that:
1. For partitioned hive table, we will move the partitioned columns after data
columns. (e.g. `` partition by `a`
Repository: spark
Updated Branches:
refs/heads/master 99693fef0 -> a24477996
[SPARK-11690][PYSPARK] Add pivot to python api
This PR adds pivot to the python api of GroupedData with the same syntax as
Scala/Java.
Author: Andrew Ray
Closes #9653 from
Repository: spark
Updated Branches:
refs/heads/branch-1.6 4a1bcb26d -> 6459a6747
[SPARK-11690][PYSPARK] Add pivot to python api
This PR adds pivot to the python api of GroupedData with the same syntax as
Scala/Java.
Author: Andrew Ray
Closes #9653 from
Repository: spark
Updated Branches:
refs/heads/branch-1.6 a0f9cd77a -> c37ed52ec
[SPARK-11522][SQL] input_file_name() returns "" for external tables
When computing partitions for a non-Parquet relation, `HadoopRDD.compute` is
used, but it does not set the thread-local variable `inputFileName`
ors, and is
not an appropriate solution. This new change does kerberos login during hive
client initialization, which will make credentials ready for the particular
hive client instance.
yhuai Please take a look and let me know. If you are not the right person to
talk to, could you point me to some
Repository: spark
Updated Branches:
refs/heads/branch-1.5 330961bbf -> b767ceeb2
[SPARK-11191][SPARK-11311][SQL] Backports #9664 and #9277 to branch-1.5
The main purpose of this PR is to backport #9664, which depends on #9277.
Author: Cheng Lian
Closes #9671 from
Repository: spark
Updated Branches:
refs/heads/master d22fc1088 -> 64e555110
[SPARK-11672][ML] set active SQLContext in JavaDefaultReadWriteSuite
The same as #9694, but for Java test suite. yhuai
Author: Xiangrui Meng <m...@databricks.com>
Closes #9719 from mengxr/SPARK-11672.4.
Repository: spark
Updated Branches:
refs/heads/branch-1.6 6f98d47f8 -> 07af78221
[SPARK-11672][ML] set active SQLContext in JavaDefaultReadWriteSuite
The same as #9694, but for Java test suite. yhuai
Author: Xiangrui Meng <m...@databricks.com>
Closes #9719 from mengxr/SPAR
Repository: spark
Updated Branches:
refs/heads/branch-1.6 b56aaa9be -> eced2766b
[SPARK-9928][SQL] Removal of LogicalLocalTable
LogicalLocalTable in ExistingRDD.scala is replaced by localRelation in
LocalRelation.scala?
Do you know any reason why we still keep this class?
Author:
Repository: spark
Updated Branches:
refs/heads/master 835a79d78 -> b58765caa
[SPARK-9928][SQL] Removal of LogicalLocalTable
LogicalLocalTable in ExistingRDD.scala is replaced by localRelation in
LocalRelation.scala?
Do you know any reason why we still keep this class?
Author: gatorsmile
Repository: spark
Updated Branches:
refs/heads/master 1a21be15f -> b8ff6888e
[SPARK-8992][SQL] Add pivot to dataframe api
This adds a pivot method to the dataframe api.
Following the lead of cube and rollup this adds a Pivot operator that is
translated into an Aggregate by the analyzer.
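Usage can be sketched as follows (a hedged example; the schema is illustrative):

```scala
import org.apache.spark.sql.functions._

// One output column per distinct value of the pivoted column,
// translated into an Aggregate by the analyzer like cube/rollup:
df.groupBy("year")
  .pivot("course")
  .agg(sum("earnings"))
```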
Repository: spark
Updated Branches:
refs/heads/branch-1.6 4151afbf5 -> 5940fc71d
[SPARK-8992][SQL] Add pivot to dataframe api
This adds a pivot method to the dataframe api.
Following the lead of cube and rollup this adds a Pivot operator that is
translated into an Aggregate by the analyzer.
Repository: spark
Updated Branches:
refs/heads/branch-1.5 6e823b4d7 -> b478ee374
[SPARK-11595][SQL][BRANCH-1.5] Fixes ADD JAR when the input path contains URL
scheme
This PR backports #9569 to branch-1.5.
Author: Cheng Lian
Closes #9570 from
Fix for a potential bug in distinct child expression and attribute alignment.
* Improved handling of duplicate distinct child expressions.
* Added test for distinct UDAF with multiple children.
cc yhuai
Author: Herman van Hovell <hvanhov...@questtec.nl>
Closes #9566 from hvanhovell/SPARK-9241
Repository: spark
Updated Branches:
refs/heads/master 9cf56c96b -> c34c27fe9
[SPARK-9034][SQL] Reflect field names defined in GenericUDTF
Although Hive's GenericUDTF#initialize() defines field names in the returned
schema, the current HiveGenericUDTF drops these names.
We might need to reflect
<yh...@databricks.com>
Closes #9393 from yhuai/udfNondeterministic.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9cf56c96
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9cf56c96
Diff: http://git-wip-us.apache.
Repository: spark
Updated Branches:
refs/heads/master 45029bfde -> e8ec2a7b0
Revert "[SPARK-11236][CORE] Update Tachyon dependency from 0.7.1 -> 0.8.0."
This reverts commit 4f5e60c647d7d6827438721b7fabbc3a57b81023.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
Repository: spark
Updated Branches:
refs/heads/master f8d93edec -> 3e770a64a
[SPARK-9298][SQL] Add pearson correlation aggregation function
JIRA: https://issues.apache.org/jira/browse/SPARK-9298
This patch adds pearson correlation aggregation function based on
`AggregateExpression2`.
m>
Closes #9387 from yhuai/SPARK-11434.
(cherry picked from commit 3c471885dc4f86bea95ab542e0d48d22ae748404)
Signed-off-by: Yin Huai <yh...@databricks.com>
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c9ac
Repository: spark
Updated Branches:
refs/heads/master de289bf27 -> abf5e4285
[SPARK-11504][SQL] API audit for distributeBy and localSort
1. Renamed localSort -> sortWithinPartitions to avoid ambiguity in "local"
2. distributeBy -> repartition to match the existing repartition.
Author:
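After the renames, a hedged sketch (column names illustrative):

```scala
import org.apache.spark.sql.functions.col

df.repartition(col("key"))           // formerly distributeBy
  .sortWithinPartitions(col("ts"))   // formerly localSort
```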
Repository: spark
Updated Branches:
refs/heads/master 411ff6afb -> b6e0a5ae6
[SPARK-11510][SQL] Remove SQL aggregation tests for higher order statistics
We have some aggregate function tests in both DataFrameAggregateSuite and
SQLQuerySuite. The two have almost the same coverage and we
Repository: spark
Updated Branches:
refs/heads/master b2e4b314d -> ebf8b0b48
[SPARK-10978][SQL] Allow data sources to eliminate filters
This PR adds a new method `unhandledFilters` to `BaseRelation`. Data sources
which implement this method properly may avoid the overhead of defensive
<yh...@databricks.com>
Closes #9498 from yhuai/OracleDialect-1.4.
(cherry picked from commit 6c5e9a3a056cc8ee660a2b22a0a5ff17d674b68d)
Signed-off-by: Yin Huai <yh...@databricks.com>
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/a
ant if
expression in the non-distinct aggregation path and adds a multiple distinct
test to the AggregationQuerySuite.
cc yhuai marmbrus
Author: Herman van Hovell <hvanhov...@questtec.nl>
Closes #9541 from hvanhovell/SPARK-9241-followup.
Project: http://git-wip-us.apache.org/repos/asf/s
Repository: spark
Updated Branches:
refs/heads/master d648a4ad5 -> e352de0db
[SPARK-11329] [SQL] Cleanup from spark-11329 fix.
Author: Nong
Closes #9442 from nongli/spark-11483.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
Repository: spark
Updated Branches:
refs/heads/master e0fc9c7e5 -> 3bd6f5d2a
[SPARK-11490][SQL] variance should alias var_samp instead of var_pop.
stddev is an alias for stddev_samp. variance should be consistent with stddev.
Also took the chance to remove internal Stddev and Variance, and
Repository: spark
Updated Branches:
refs/heads/master e352de0db -> 2692bdb7d
[SPARK-11455][SQL] fix case sensitivity of partition by
depend on `caseSensitive` to do column name equality check, instead of just `==`
Author: Wenchen Fan
Closes #9410 from
Repository: spark
Updated Branches:
refs/heads/master 987df4bfc -> de289bf27
[SPARK-10304][SQL] Following up checking valid dir structure for partition
discovery
This patch follows up #8840.
Author: Liang-Chi Hsieh
Closes #9459 from
Repository: spark
Updated Branches:
refs/heads/master 33ae7a35d -> db11ee5e5
[SPARK-11371] Make "mean" an alias for "avg" operator
From Reynold in the thread 'Exception when using some aggregate operators'
(http://search-hadoop.com/m/q3RTt0xFr22nXB4/):
I don't think these are bugs. The
Repository: spark
Updated Branches:
refs/heads/master 2cef1bb0b -> 9cb5c731d
[SPARK-11329][SQL] Support star expansion for structs.
1. Supporting expanding structs in Projections. i.e.
"SELECT s.*" where s is a struct type.
This is fixed by allowing the expand function to handle structs
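Star expansion on a struct can be sketched as follows (a hedged example; the struct literal is illustrative):

```scala
val df = sqlContext.sql("SELECT named_struct('x', 1, 'y', 2) AS s")
// "SELECT s.*" where s is a struct type expands to top-level columns x and y:
df.select("s.*")
```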