spark git commit: [SPARK-25019][BUILD] Fix orc dependency to use the same exclusion rules

2018-08-06 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 51e2b38d9 -> 278984d5a [SPARK-25019][BUILD] Fix orc dependency to use the same exclusion rules ## What changes were proposed in this pull request? During upgrading Apache ORC to 1.5.2

spark git commit: [SPARK-24895] Remove spotbugs plugin

2018-07-24 Thread yhuai
Repository: spark Updated Branches: refs/heads/master d4a277f0c -> fc21f192a [SPARK-24895] Remove spotbugs plugin ## What changes were proposed in this pull request? Spotbugs maven plugin was a recently added plugin before 2.4.0 snapshot artifacts were broken. To ensure it does not affect

svn commit: r25324 - /dev/spark/v2.3.0-rc5-bin/ /release/spark/spark-2.3.0/

2018-02-27 Thread yhuai
Author: yhuai Date: Wed Feb 28 07:25:53 2018 New Revision: 25324 Log: Releasing Apache Spark 2.3.0 Added: release/spark/spark-2.3.0/ - copied from r25323, dev/spark/v2.3.0-rc5-bin/ Removed: dev/spark/v2.3.0-rc5-bin

[3/3] spark-website git commit: Add the news about spark-summit-eu-2017 agenda

2017-08-28 Thread yhuai
Add the news about spark-summit-eu-2017 agenda Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/35eb1471 Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/35eb1471 Diff:

[1/3] spark-website git commit: Add the news about spark-summit-eu-2017 agenda

2017-08-28 Thread yhuai
Repository: spark-website Updated Branches: refs/heads/asf-site cca972e7f -> 35eb14717 http://git-wip-us.apache.org/repos/asf/spark-website/blob/35eb1471/site/releases/spark-release-1-3-0.html -- diff --git

[2/3] spark-website git commit: Add the news about spark-summit-eu-2017 agenda

2017-08-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/35eb1471/site/news/spark-accepted-into-apache-incubator.html -- diff --git a/site/news/spark-accepted-into-apache-incubator.html

spark git commit: [SPARK-21111][TEST][2.2] Fix the test failure of describe.sql

2017-06-15 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.2 76ee41fd7 -> a585c870a [SPARK-21111][TEST][2.2] Fix the test failure of describe.sql ## What changes were proposed in this pull request? Test failed in `describe.sql`. We need to fix the related bug introduced in

spark git commit: Revert "[SPARK-20946][SQL] simplify the config setting logic in SparkSession.getOrCreate"

2017-06-02 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.2 6c628e75e -> b560c975b Revert "[SPARK-20946][SQL] simplify the config setting logic in SparkSession.getOrCreate" This reverts commit e11d90bf8deb553fd41b8837e3856c11486c2503. Project:

spark git commit: Revert "[SPARK-20311][SQL] Support aliases for table value functions"

2017-05-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.2 9e8d23b3a -> d191b962d Revert "[SPARK-20311][SQL] Support aliases for table value functions" This reverts commit 714811d0b5bcb5d47c39782ff74f898d276ecc59. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: Revert "[SPARK-20311][SQL] Support aliases for table value functions"

2017-05-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ac1ab6b9d -> f79aa285c Revert "[SPARK-20311][SQL] Support aliases for table value functions" This reverts commit 714811d0b5bcb5d47c39782ff74f898d276ecc59. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-20661][SPARKR][TEST] SparkR tableNames() test fails

2017-05-08 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.2 23681e9ca -> 4179ffc03 [SPARK-20661][SPARKR][TEST] SparkR tableNames() test fails ## What changes were proposed in this pull request? Cleaning existing temp tables before running tableNames tests ## How was this patch tested? SparkR

spark git commit: [SPARK-20661][SPARKR][TEST] SparkR tableNames() test fails

2017-05-08 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 829cd7b8b -> 2abfee18b [SPARK-20661][SPARKR][TEST] SparkR tableNames() test fails ## What changes were proposed in this pull request? Cleaning existing temp tables before running tableNames tests ## How was this patch tested? SparkR Unit

spark git commit: [SPARK-20217][CORE] Executor should not fail stage if killed task throws non-interrupted exception

2017-04-05 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 4000f128b -> 5142e5d4e [SPARK-20217][CORE] Executor should not fail stage if killed task throws non-interrupted exception ## What changes were proposed in this pull request? If tasks throw non-interrupted exceptions on kill (e.g.

spark git commit: [SPARK-19620][SQL] Fix incorrect exchange coordinator id in the physical plan

2017-03-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/master fcb68e0f5 -> dd9049e04 [SPARK-19620][SQL] Fix incorrect exchange coordinator id in the physical plan ## What changes were proposed in this pull request? When adaptive execution is enabled, an exchange coordinator is used in the Exchange

spark git commit: [SPARK-19816][SQL][TESTS] Fix an issue that DataFrameCallbackSuite doesn't recover the log level

2017-03-03 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.1 da04d45c2 -> 664c9795c [SPARK-19816][SQL][TESTS] Fix an issue that DataFrameCallbackSuite doesn't recover the log level ## What changes were proposed in this pull request? "DataFrameCallbackSuite.execute callback functions when a

spark git commit: [SPARK-19604][TESTS] Log the start of every Python test

2017-02-15 Thread yhuai
lso log the start of a test. So, if a test is hanging, we can tell which test file is running. ## How was this patch tested? This is a change for python tests. Author: Yin Huai <yh...@databricks.com> Closes #16935 from yhuai/SPARK-19604. (cherry picked fr

spark git commit: [SPARK-19295][SQL] IsolatedClientLoader's downloadVersion should log the location of downloaded metastore client jars

2017-01-19 Thread yhuai
ion of those downloaded jars when `spark.sql.hive.metastore.jars` is set to `maven`. ## How was this patch tested? jenkins Author: Yin Huai <yh...@databricks.com> Closes #16649 from yhuai/SPARK-19295. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.a

spark git commit: Update known_translations for contributor names

2017-01-18 Thread yhuai
uai <yh...@databricks.com> Closes #16628 from yhuai/known_translations. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0c923185 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0c923185 Diff: http://git-wip-us.a

spark git commit: [SPARK-18885][SQL] unify CREATE TABLE syntax for data source and hive serde tables

2017-01-05 Thread yhuai
Repository: spark Updated Branches: refs/heads/master f5d18af6a -> cca945b6a [SPARK-18885][SQL] unify CREATE TABLE syntax for data source and hive serde tables ## What changes were proposed in this pull request? Today we have different syntax to create data source or hive serde tables, we

[1/3] spark-website git commit: Spark Summit East (Feb 7-9th, 2017, Boston) agenda posted

2017-01-04 Thread yhuai
Repository: spark-website Updated Branches: refs/heads/asf-site 426a68ba8 -> 46a7a8027 http://git-wip-us.apache.org/repos/asf/spark-website/blob/46a7a802/site/screencasts/1-first-steps-with-spark.html -- diff --git

[3/3] spark-website git commit: Spark Summit East (Feb 7-9th, 2017, Boston) agenda posted

2017-01-04 Thread yhuai
Spark Summit East (Feb 7-9th, 2017, Boston) agenda posted Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/46a7a802 Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/46a7a802 Diff:

[2/3] spark-website git commit: Spark Summit East (Feb 7-9th, 2017, Boston) agenda posted

2017-01-04 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/46a7a802/site/news/spark-summit-2014-videos-posted.html -- diff --git a/site/news/spark-summit-2014-videos-posted.html b/site/news/spark-summit-2014-videos-posted.html

spark git commit: [SPARK-19072][SQL] codegen of Literal should not output boxed value

2017-01-03 Thread yhuai
Repository: spark Updated Branches: refs/heads/master b67b35f76 -> cbd11d235 [SPARK-19072][SQL] codegen of Literal should not output boxed value ## What changes were proposed in this pull request? In https://github.com/apache/spark/pull/16402 we made a mistake that, when double/float is

spark git commit: Update known_translations for contributor names and also fix a small issue in translate-contributors.py

2016-12-29 Thread yhuai
ons to add more contributor name mapping. It also fixes a small issue in translate-contributors.py ## How was this patch tested? manually tested Author: Yin Huai <yh...@databricks.com> Closes #16423 from yhuai/contributors. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: h

spark-website git commit: Fix the list of previous spark summits.

2016-12-29 Thread yhuai
Repository: spark-website Updated Branches: refs/heads/asf-site e10180e67 -> 426a68ba8 Fix the list of previous spark summits. Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/426a68ba Tree:

[2/5] spark-website git commit: Update Spark website for the release of Apache Spark 2.1.0

2016-12-29 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e10180e6/site/releases/spark-release-0-9-1.html -- diff --git a/site/releases/spark-release-0-9-1.html b/site/releases/spark-release-0-9-1.html index 80401c4..5b08a0b

[4/5] spark-website git commit: Update Spark website for the release of Apache Spark 2.1.0

2016-12-29 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e10180e6/site/mailing-lists.html -- diff --git a/site/mailing-lists.html b/site/mailing-lists.html index c113cdd..3e2334f 100644 --- a/site/mailing-lists.html +++

[1/5] spark-website git commit: Update Spark website for the release of Apache Spark 2.1.0

2016-12-29 Thread yhuai
Repository: spark-website Updated Branches: refs/heads/asf-site d2bcf1854 -> e10180e67 http://git-wip-us.apache.org/repos/asf/spark-website/blob/e10180e6/site/releases/spark-release-2-1-0.html -- diff --git

[5/5] spark-website git commit: Update Spark website for the release of Apache Spark 2.1.0

2016-12-29 Thread yhuai
Update Spark website for the release of Apache Spark 2.1.0 Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/e10180e6 Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/e10180e6 Diff:

[3/5] spark-website git commit: Update Spark website for the release of Apache Spark 2.1.0

2016-12-29 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e10180e6/site/news/spark-2.0.0-preview.html -- diff --git a/site/news/spark-2.0.0-preview.html b/site/news/spark-2.0.0-preview.html index 64acf16..f135bf2 100644 ---

spark git commit: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelectCommand

2016-12-28 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 93f35569f -> 7d19b6ab7 [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelectCommand ## What changes were proposed in this pull request? The `CreateDataSourceTableAsSelectCommand` is quite complex now, as it has a lot of work to do if

[06/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/sql-programming-guide.html -- diff --git a/site/docs/2.1.0/sql-programming-guide.html b/site/docs/2.1.0/sql-programming-guide.html index

[19/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/ml-migration-guides.html -- diff --git a/site/docs/2.1.0/ml-migration-guides.html b/site/docs/2.1.0/ml-migration-guides.html index

[01/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
Repository: spark-website Updated Branches: refs/heads/asf-site ecf94f284 -> d2bcf1854 http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/submitting-applications.html -- diff --git

[25/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294 This version is built from the docs source code generated by applying https://github.com/apache/spark/pull/16294 to v2.1.0 (so, other changes in branch 2.1 will not affect the doc). Project:

[11/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/mllib-linear-methods.html -- diff --git a/site/docs/2.1.0/mllib-linear-methods.html b/site/docs/2.1.0/mllib-linear-methods.html index

[23/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/hadoop-provided.html -- diff --git a/site/docs/2.1.0/hadoop-provided.html b/site/docs/2.1.0/hadoop-provided.html index ff7afb7..9d77cf0 100644 ---

[12/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/mllib-feature-extraction.html -- diff --git a/site/docs/2.1.0/mllib-feature-extraction.html b/site/docs/2.1.0/mllib-feature-extraction.html index

[17/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/mllib-clustering.html -- diff --git a/site/docs/2.1.0/mllib-clustering.html b/site/docs/2.1.0/mllib-clustering.html index 9667606..1b50dab 100644

[13/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/mllib-evaluation-metrics.html -- diff --git a/site/docs/2.1.0/mllib-evaluation-metrics.html b/site/docs/2.1.0/mllib-evaluation-metrics.html index

[08/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/quick-start.html -- diff --git a/site/docs/2.1.0/quick-start.html b/site/docs/2.1.0/quick-start.html index 76e67e1..9d5fad7 100644 ---

[14/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/mllib-decision-tree.html -- diff --git a/site/docs/2.1.0/mllib-decision-tree.html b/site/docs/2.1.0/mllib-decision-tree.html index

[20/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/ml-features.html -- diff --git a/site/docs/2.1.0/ml-features.html b/site/docs/2.1.0/ml-features.html index 64463de..a2f102b 100644 ---

[22/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/ml-classification-regression.html -- diff --git a/site/docs/2.1.0/ml-classification-regression.html

[05/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/storage-openstack-swift.html -- diff --git a/site/docs/2.1.0/storage-openstack-swift.html b/site/docs/2.1.0/storage-openstack-swift.html index

[21/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/ml-clustering.html -- diff --git a/site/docs/2.1.0/ml-clustering.html b/site/docs/2.1.0/ml-clustering.html index e225281..df38605 100644 ---

[02/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/structured-streaming-programming-guide.html -- diff --git a/site/docs/2.1.0/structured-streaming-programming-guide.html

[10/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/mllib-pmml-model-export.html -- diff --git a/site/docs/2.1.0/mllib-pmml-model-export.html b/site/docs/2.1.0/mllib-pmml-model-export.html index

[09/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/programming-guide.html -- diff --git a/site/docs/2.1.0/programming-guide.html b/site/docs/2.1.0/programming-guide.html index 12458af..0e06e86

[07/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/sparkr.html -- diff --git a/site/docs/2.1.0/sparkr.html b/site/docs/2.1.0/sparkr.html index 0a1a347..e861a01 100644 ---

[18/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/ml-tuning.html -- diff --git a/site/docs/2.1.0/ml-tuning.html b/site/docs/2.1.0/ml-tuning.html index 0c36a98..2246cc2 100644 ---

[04/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/streaming-programming-guide.html -- diff --git a/site/docs/2.1.0/streaming-programming-guide.html

[15/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/mllib-data-types.html -- diff --git a/site/docs/2.1.0/mllib-data-types.html b/site/docs/2.1.0/mllib-data-types.html index 546d921..f7b5358 100644

[03/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/structured-streaming-kafka-integration.html -- diff --git a/site/docs/2.1.0/structured-streaming-kafka-integration.html

[16/25] spark-website git commit: Update 2.1.0 docs to include https://github.com/apache/spark/pull/16294

2016-12-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d2bcf185/site/docs/2.1.0/mllib-collaborative-filtering.html -- diff --git a/site/docs/2.1.0/mllib-collaborative-filtering.html

[spark] Git Push Summary

2016-12-28 Thread yhuai
Repository: spark Updated Tags: refs/tags/v2.1.0 [created] cd0a08361

spark git commit: Revert "[SPARK-18990][SQL] make DatasetBenchmark fairer for Dataset"

2016-12-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/master a05cc425a -> 2404d8e54 Revert "[SPARK-18990][SQL] make DatasetBenchmark fairer for Dataset" This reverts commit a05cc425a0a7d18570b99883993a04ad175aa071. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-18951] Upgrade com.thoughtworks.paranamer/paranamer to 2.6

2016-12-21 Thread yhuai
mer/paranamer 2.6, I suggest that we upgrade paranamer to 2.6. Author: Yin Huai <yh...@databricks.com> Closes #16359 from yhuai/SPARK-18951. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1a643889 Tree: http://git-wip-us.a

spark git commit: [SPARK-18928][BRANCH-2.0] Check TaskContext.isInterrupted() in FileScanRDD, JDBCRDD & UnsafeSorter

2016-12-20 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 678d91c1d -> 2aae220b5 [SPARK-18928][BRANCH-2.0] Check TaskContext.isInterrupted() in FileScanRDD, JDBCRDD & UnsafeSorter This is a branch-2.0 backport of #16340; the original description follows: ## What changes were proposed in

spark git commit: [SPARK-18761][BRANCH-2.0] Introduce "task reaper" to oversee task killing in executors

2016-12-20 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 1f0c5fa75 -> 678d91c1d [SPARK-18761][BRANCH-2.0] Introduce "task reaper" to oversee task killing in executors Branch-2.0 backport of #16189; original description follows: ## What changes were proposed in this pull request? Spark's

spark git commit: [SPARK-18761][CORE] Introduce "task reaper" to oversee task killing in executors

2016-12-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 5857b9ac2 -> fa829ce21 [SPARK-18761][CORE] Introduce "task reaper" to oversee task killing in executors ## What changes were proposed in this pull request? Spark's current task cancellation / task killing mechanism is "best effort"

spark git commit: [SPARK-18921][SQL] check database existence with Hive.databaseExists instead of getDatabase

2016-12-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.1 fc1b25660 -> c1a26b458 [SPARK-18921][SQL] check database existence with Hive.databaseExists instead of getDatabase ## What changes were proposed in this pull request? It's weird that we use `Hive.getDatabase` to check the existence

spark git commit: [SPARK-18921][SQL] check database existence with Hive.databaseExists instead of getDatabase

2016-12-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 24482858e -> 7a75ee1c9 [SPARK-18921][SQL] check database existence with Hive.databaseExists instead of getDatabase ## What changes were proposed in this pull request? It's weird that we use `Hive.getDatabase` to check the existence of a

spark git commit: [SPARK-13747][CORE] Fix potential ThreadLocal leaks in RPC when using ForkJoinPool

2016-12-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master d53f18cae -> fb3081d3b [SPARK-13747][CORE] Fix potential ThreadLocal leaks in RPC when using ForkJoinPool ## What changes were proposed in this pull request? Some places in SQL may call `RpcEndpointRef.askWithRetry` (e.g.,

spark git commit: [SPARK-18675][SQL] CTAS for hive serde table should work for all hive versions

2016-12-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 096f868b7 -> d53f18cae [SPARK-18675][SQL] CTAS for hive serde table should work for all hive versions ## What changes were proposed in this pull request? Before hive 1.1, when inserting into a table, hive will create the staging

spark git commit: [SPARK-18631][SQL] Changed ExchangeCoordinator re-partitioning to avoid more data skew

2016-11-29 Thread yhuai
Repository: spark Updated Branches: refs/heads/master d57a594b8 -> f8878a4c6 [SPARK-18631][SQL] Changed ExchangeCoordinator re-partitioning to avoid more data skew ## What changes were proposed in this pull request? Re-partitioning logic in ExchangeCoordinator changed so that adding another

spark git commit: [SPARK-18602] Set the version of org.codehaus.janino:commons-compiler to 3.0.0 to match the version of org.codehaus.janino:janino

2016-11-28 Thread yhuai
tps://github.com/apache/spark/blob/branch-2.1/pom.xml#L1759). So, this PR upgrades org.codehaus.janino:commons-compiler to 3.0.0. ## How was this patch tested? jenkins Author: Yin Huai <yh...@databricks.com> Closes #16025 from yhuai/janino-commons-compile. (cherry picked fr

spark git commit: [SPARK-18602] Set the version of org.codehaus.janino:commons-compiler to 3.0.0 to match the version of org.codehaus.janino:janino

2016-11-28 Thread yhuai
tps://github.com/apache/spark/blob/branch-2.1/pom.xml#L1759). So, this PR upgrades org.codehaus.janino:commons-compiler to 3.0.0. ## How was this patch tested? jenkins Author: Yin Huai <yh...@databricks.com> Closes #16025 from yhuai/janino-commons-compile. Project: http://git-wip-us.apache.org/

spark git commit: [SPARK-18360][SQL] default table path of tables in default database should depend on the location of default database

2016-11-17 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.1 978798880 -> fc466be4f [SPARK-18360][SQL] default table path of tables in default database should depend on the location of default database ## What changes were proposed in this pull request? The current semantic of the warehouse

spark git commit: [SPARK-18360][SQL] default table path of tables in default database should depend on the location of default database

2016-11-17 Thread yhuai
Repository: spark Updated Branches: refs/heads/master b0aa1aa1a -> ce13c2672 [SPARK-18360][SQL] default table path of tables in default database should depend on the location of default database ## What changes were proposed in this pull request? The current semantic of the warehouse

spark git commit: [SPARK-18186] Migrate HiveUDAFFunction to TypedImperativeAggregate for partial aggregation support

2016-11-16 Thread yhuai
Repository: spark Updated Branches: refs/heads/master a36a76ac4 -> 2ca8ae9aa [SPARK-18186] Migrate HiveUDAFFunction to TypedImperativeAggregate for partial aggregation support ## What changes were proposed in this pull request? While being evaluated in Spark SQL, Hive UDAFs don't support

spark git commit: [SPARK-18379][SQL] Make the parallelism of parallelPartitionDiscovery configurable.

2016-11-15 Thread yhuai
Repository: spark Updated Branches: refs/heads/master f14ae4900 -> 745ab8bc5 [SPARK-18379][SQL] Make the parallelism of parallelPartitionDiscovery configurable. ## What changes were proposed in this pull request? The largest parallelism in PartitioningAwareFileIndex

spark git commit: [SPARK-18368][SQL] Fix regexp replace when serialized

2016-11-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 c8628e877 -> 6e7310590 [SPARK-18368][SQL] Fix regexp replace when serialized ## What changes were proposed in this pull request? This makes the result value both transient and lazy, so that if the RegExpReplace object is initialized

spark git commit: [SPARK-18368][SQL] Fix regexp replace when serialized

2016-11-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.1 626f6d6d4 -> 80f58510a [SPARK-18368][SQL] Fix regexp replace when serialized ## What changes were proposed in this pull request? This makes the result value both transient and lazy, so that if the RegExpReplace object is initialized

spark git commit: [SPARK-18368][SQL] Fix regexp replace when serialized

2016-11-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 47636618a -> d4028de97 [SPARK-18368][SQL] Fix regexp replace when serialized ## What changes were proposed in this pull request? This makes the result value both transient and lazy, so that if the RegExpReplace object is initialized then

spark git commit: Revert "[SPARK-18368] Fix regexp_replace with task serialization."

2016-11-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 06a13ecca -> 47636618a Revert "[SPARK-18368] Fix regexp_replace with task serialization." This reverts commit b9192bb3ffc319ebee7dbd15c24656795e454749. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-18338][SQL][TEST-MAVEN] Fix test case initialization order under Maven builds

2016-11-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 02c5325b8 -> 205e6d586 [SPARK-18338][SQL][TEST-MAVEN] Fix test case initialization order under Maven builds ## What changes were proposed in this pull request? Test case initialization order under Maven and SBT is different. Maven

spark git commit: [SPARK-18256] Improve the performance of event log replay in HistoryServer

2016-11-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 4cee2ce25 -> 0e3312ee7 [SPARK-18256] Improve the performance of event log replay in HistoryServer ## What changes were proposed in this pull request? This patch significantly improves the performance of event log replay in the

spark git commit: [SPARK-18167] Re-enable the non-flaky parts of SQLQuerySuite

2016-11-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.1 e51978c3d -> 0a303a694 [SPARK-18167] Re-enable the non-flaky parts of SQLQuerySuite ## What changes were proposed in this pull request? It seems the proximate cause of the test failures is that `cast(str as decimal)` in derby will

spark git commit: [SPARK-18167] Re-enable the non-flaky parts of SQLQuerySuite

2016-11-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 550cd56e8 -> 4cee2ce25 [SPARK-18167] Re-enable the non-flaky parts of SQLQuerySuite ## What changes were proposed in this pull request? It seems the proximate cause of the test failures is that `cast(str as decimal)` in derby will raise

spark git commit: [SPARK-17949][SQL] A JVM object based aggregate operator

2016-11-03 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 66a99f4a4 -> 27daf6bcd [SPARK-17949][SQL] A JVM object based aggregate operator ## What changes were proposed in this pull request? This PR adds a new hash-based aggregate operator named `ObjectHashAggregateExec` that supports

spark git commit: [SPARK-17470][SQL] unify path for data source table and locationUri for hive serde table

2016-11-02 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.1 2aff2ea81 -> 5ea2f9e5e [SPARK-17470][SQL] unify path for data source table and locationUri for hive serde table ## What changes were proposed in this pull request? Due to a limitation of hive metastore (table location must be

spark git commit: [SPARK-17470][SQL] unify path for data source table and locationUri for hive serde table

2016-11-02 Thread yhuai
Repository: spark Updated Branches: refs/heads/master fd90541c3 -> 3a1bc6f47 [SPARK-17470][SQL] unify path for data source table and locationUri for hive serde table ## What changes were proposed in this pull request? Due to a limitation of hive metastore (table location must be directory

spark git commit: [SPARK-18167][SQL] Also log all partitions when the SQLQuerySuite test flakes

2016-10-31 Thread yhuai
Repository: spark Updated Branches: refs/heads/master de3f87fa7 -> 6633b97b5 [SPARK-18167][SQL] Also log all partitions when the SQLQuerySuite test flakes ## What changes were proposed in this pull request? One possibility for this test flaking is that we have corrupted the partition schema

spark git commit: [SPARK-17972][SQL] Add Dataset.checkpoint() to truncate large query plans

2016-10-31 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 26b07f190 -> 8bfc3b7aa [SPARK-17972][SQL] Add Dataset.checkpoint() to truncate large query plans ## What changes were proposed in this pull request? ### Problem Iterative ML code may easily create query plans that grow exponentially. We

[1/2] spark git commit: [SPARK-17970][SQL] store partition spec in metastore for data source table

2016-10-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 79fd0cc05 -> ccb115430 http://git-wip-us.apache.org/repos/asf/spark/blob/ccb11543/sql/hive/src/test/scala/org/apache/spark/sql/hive/PartitionProviderCompatibilitySuite.scala

[2/2] spark git commit: [SPARK-17970][SQL] store partition spec in metastore for data source table

2016-10-27 Thread yhuai
[SPARK-17970][SQL] store partition spec in metastore for data source table ## What changes were proposed in this pull request? We should follow hive table and also store partition spec in metastore for data source table. This brings 2 benefits: 1. It's more flexible to manage the table data

spark git commit: [SPARK-18132] Fix checkstyle

2016-10-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 dcf2f090c -> 1a4be51d6 [SPARK-18132] Fix checkstyle This PR fixes checkstyle. Author: Yin Huai <yh...@databricks.com> Closes #15656 from yhuai/fix-format. (cherry picked from commit d3b4831d009905185ad74096ce3ecfa934bc191

spark git commit: [SPARK-18132] Fix checkstyle

2016-10-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/master dd4f088c1 -> d3b4831d0 [SPARK-18132] Fix checkstyle This PR fixes checkstyle. Author: Yin Huai <yh...@databricks.com> Closes #15656 from yhuai/fix-format. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: h

spark git commit: [SPARK-18070][SQL] binary operator should not consider nullability when comparing input types

2016-10-25 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 1c1e847bc -> 7c8d9a557 [SPARK-18070][SQL] binary operator should not consider nullability when comparing input types ## What changes were proposed in this pull request? Binary operator requires its inputs to be of same type, but it

spark git commit: [SPARK-18070][SQL] binary operator should not consider nullability when comparing input types

2016-10-25 Thread yhuai
Repository: spark Updated Branches: refs/heads/master c5fe3dd4f -> a21791e31 [SPARK-18070][SQL] binary operator should not consider nullability when comparing input types ## What changes were proposed in this pull request? Binary operator requires its inputs to be of same type, but it

spark git commit: [SPARK-17926][SQL][STREAMING] Added json for statuses

2016-10-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 78458a7eb -> af2e6e0c9 [SPARK-17926][SQL][STREAMING] Added json for statuses ## What changes were proposed in this pull request? StreamingQueryStatus exposed through StreamingQueryListener often needs to be recorded (similar to

spark git commit: [SPARK-17926][SQL][STREAMING] Added json for statuses

2016-10-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master e371040a0 -> 7a531e305 [SPARK-17926][SQL][STREAMING] Added json for statuses ## What changes were proposed in this pull request? StreamingQueryStatus exposed through StreamingQueryListener often needs to be recorded (similar to

spark git commit: [SPARK-17873][SQL] ALTER TABLE RENAME TO should allow users to specify database in destination table name(but have to be same as source table)

2016-10-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 2629cd746 -> 4329c5cea [SPARK-17873][SQL] ALTER TABLE RENAME TO should allow users to specify database in destination table name(but have to be same as source table) ## What changes were proposed in this pull request? Unlike Hive, in

spark git commit: [SPARK-17863][SQL] should not add column into Distinct

2016-10-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 d7fa3e324 -> c53b83749 [SPARK-17863][SQL] should not add column into Distinct ## What changes were proposed in this pull request? We are trying to resolve the attribute in sort by pulling up some column for grandchild into child, but

spark git commit: [SPARK-17863][SQL] should not add column into Distinct

2016-10-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 522dd0d0e -> da9aeb0fd [SPARK-17863][SQL] should not add column into Distinct ## What changes were proposed in this pull request? We are trying to resolve the attribute in sort by pulling up some column for grandchild into child, but

spark git commit: Revert "[SPARK-17620][SQL] Determine Serde by hive.default.fileformat when Creating Hive Serde Tables"

2016-10-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 7ab86244e -> 522dd0d0e Revert "[SPARK-17620][SQL] Determine Serde by hive.default.fileformat when Creating Hive Serde Tables" This reverts commit 7ab86244e30ca81eb4fa40ea77b4c2b8881cbab2. Project:

spark git commit: [SPARK-17758][SQL] Last returns wrong result in case of empty partition

2016-10-05 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 221b418b1 -> 5fd54b994 [SPARK-17758][SQL] Last returns wrong result in case of empty partition ## What changes were proposed in this pull request? The result of the `Last` function can be wrong when the last partition processed is empty.

spark git commit: [SPARK-17758][SQL] Last returns wrong result in case of empty partition

2016-10-05 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 b8df2e53c -> 3b6463a79 [SPARK-17758][SQL] Last returns wrong result in case of empty partition ## What changes were proposed in this pull request? The result of the `Last` function can be wrong when the last partition processed is

spark git commit: [SPARK-17699] Support for parsing JSON string columns

2016-09-29 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 027dea8f2 -> fe33121a5 [SPARK-17699] Support for parsing JSON string columns Spark SQL has great support for reading text files that contain JSON data. However, in many cases the JSON data is just one column amongst others. This is
