[GitHub] spark issue #20973: [SPARK-20114][ML] spark.ml parity for sequential pattern...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20973 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88873/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20973: [SPARK-20114][ML] spark.ml parity for sequential pattern...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20973 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20973: [SPARK-20114][ML] spark.ml parity for sequential pattern...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20973 **[Test build #88873 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88873/testReport)** for PR 20973 at commit [`d563c8f`](https://github.com/apache/spark/commit/d563c8fab0cb718b511ac78bc38e712a65148d17). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20953: [SPARK-23822][SQL] Improve error message for Parquet sch...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20953 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88871/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20953: [SPARK-23822][SQL] Improve error message for Parquet sch...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20953 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20886: [SPARK-19724][SQL]create a managed table with an existed...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20886 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20953: [SPARK-23822][SQL] Improve error message for Parquet sch...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20953 **[Test build #88871 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88871/testReport)** for PR 20953 at commit [`a06ad5e`](https://github.com/apache/spark/commit/a06ad5e0451c3ff8bf7104512f32161bf66ed696). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20886: [SPARK-19724][SQL]create a managed table with an existed...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20886 **[Test build #88874 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88874/testReport)** for PR 20886 at commit [`2b2973a`](https://github.com/apache/spark/commit/2b2973a9db7a8fa228bfc939604feca4cc2c6a59). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20886: [SPARK-19724][SQL]create a managed table with an existed...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20886 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1944/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20971 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20971 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88870/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20971 **[Test build #88870 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88870/testReport)** for PR 20971 at commit [`36fa1bd`](https://github.com/apache/spark/commit/36fa1bdc847f0b5ffb61284a35f3183751255705). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20611: [SPARK-23425][SQL]Support wildcard in HDFS path f...
Github user sujith71955 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20611#discussion_r179030611

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
    @@ -385,7 +385,9 @@ case class LoadDataCommand(
             val hadoopConf = sparkSession.sessionState.newHadoopConf()
             val srcPath = new Path(hdfsUri)
             val fs = srcPath.getFileSystem(hadoopConf)
    -        if (!fs.exists(srcPath)) {
    +        // Check if the path exists or there are matched paths if it's a path with wildcard.
    +        // For HDFS path, we support wildcard in directory name and file name.
    +        if (null == fs.globStatus(srcPath) || fs.globStatus(srcPath).isEmpty) {
    --- End diff --

    I will update the PR so that we can use the fs.globStatus() API for both local and HDFS file paths. Thanks.
[GitHub] spark pull request #20611: [SPARK-23425][SQL]Support wildcard in HDFS path f...
Github user sujith71955 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20611#discussion_r179030399

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
    @@ -385,7 +385,9 @@ case class LoadDataCommand(
             val hadoopConf = sparkSession.sessionState.newHadoopConf()
             val srcPath = new Path(hdfsUri)
             val fs = srcPath.getFileSystem(hadoopConf)
    -        if (!fs.exists(srcPath)) {
    +        // Check if the path exists or there are matched paths if it's a path with wildcard.
    +        // For HDFS path, we support wildcard in directory name and file name.
    +        if (null == fs.globStatus(srcPath) || fs.globStatus(srcPath).isEmpty) {
    --- End diff --

    @wzhfy @HyukjinKwon @dongjoon-hyun I verified the scenario by updating the code to use the fs.globStatus() API for both local and HDFS paths. For local paths it is working fine.
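The check in the diff above calls `fs.globStatus(srcPath)` twice and has to treat both a `null` result and an empty array as "nothing matched". A minimal sketch of evaluating the result once and folding both cases into a single test is shown below. It is not the code from the PR: `hasMatches` is a hypothetical helper, and plain string arrays stand in for the `Array[FileStatus]` that `globStatus` returns, so the snippet runs without Hadoop on the classpath.

```scala
object GlobCheckSketch {
  // Hypothetical helper: true only when the glob result is non-null and non-empty.
  // Option(...) turns a null array into None, so both "no match" cases collapse.
  def hasMatches[A](statuses: Array[A]): Boolean =
    Option(statuses).exists(_.nonEmpty)

  def main(args: Array[String]): Unit = {
    // Stand-ins for the results of a single fs.globStatus(srcPath) call.
    val missing: Array[String] = null                    // modeled as: nothing returned at all
    val noMatch: Array[String] = Array.empty[String]     // modeled as: pattern matched no paths
    val matched: Array[String] = Array("/data/part-0", "/data/part-1")

    println(hasMatches(missing))
    println(hasMatches(noMatch))
    println(hasMatches(matched))
  }
}
```

In the real command the single call would be bound to a `val` first, and the error branch taken when `hasMatches` is false.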
[GitHub] spark issue #20974: [SPARK-23862][SQL] Spark ExpressionEncoder should suppor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20974 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20974: [SPARK-23862][SQL] Spark ExpressionEncoder should...
GitHub user fangshil opened a pull request:

    https://github.com/apache/spark/pull/20974

    [SPARK-23862][SQL] Spark ExpressionEncoder should support java enum type in scala

    ## What changes were proposed in this pull request?

    In SPARK-21255, Spark upstream added support for creating encoders for Java enum types, but only in the Java API (for enums used within Java Beans). Since a Java enum can come from a third-party Java library, we have use cases that require:
    1. using Java enum types as fields of a Scala case class
    2. using a Java enum as the type T in Dataset[T]

    Spark ExpressionEncoder already supports ser/de of many Java types in ScalaReflection, so we propose adding support for Java enums as well, as a follow-up to SPARK-21255.

    ## How was this patch tested?

    Tested the patch in our production cluster. Added a unit test. Note that:
    1. it is not possible to define a Java enum in Scala directly, since an enum class defined in Scala will lack methods such as valueOf, which are added by the Java compiler
    2. it is not possible to define a test enum Java class and use it in a Scala test, because compiling a single Scala test (-DwildcardSuites=org.apache.spark.sql.DatasetSuite) won't compile the test Java class first

    As a result, I use the Spark SQL public Java enum API (SaveMode.java) in the test.

    Please advise if there is a better way to test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fangshil/spark SPARK-23862

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20974.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20974

commit 90effb21375a2ec0e93426efcaae092ad3f59e26
Author: Fangshi Li
Date:   2018-04-04T04:52:36Z

    SPARK-23862: Spark ExpressionEncoder should support java enum type in scala
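The description above relies on two members that every Java enum has: `name()` and the compiler-generated `valueOf`. The sketch below only illustrates that round trip from Scala, with a Java enum used as a case class field. It is not Spark's encoder code: `Timeout`, `toName` and `fromName` are hypothetical names, and the JDK's `TimeUnit` stands in for a third-party enum.

```scala
import java.util.concurrent.TimeUnit

object EnumRoundTripSketch {
  // A Scala case class with a Java enum field (use case 1 in the description).
  final case class Timeout(label: String, unit: TimeUnit)

  // Hypothetical helper: turn any Java enum constant into its declared name.
  def toName[E <: java.lang.Enum[E]](value: E): String = value.name()

  // Hypothetical helper: rebuild the constant from its name via the static
  // java.lang.Enum.valueOf lookup, which throws if the name is unknown.
  def fromName[E <: java.lang.Enum[E]](cls: Class[E], name: String): E =
    java.lang.Enum.valueOf(cls, name)

  def main(args: Array[String]): Unit = {
    val original = Timeout("poll", TimeUnit.SECONDS)
    val stored = toName(original.unit)
    val restored = original.copy(unit = fromName(classOf[TimeUnit], stored))
    println(s"$stored -> $restored")
  }
}
```

The same helpers work for any enum compiled by javac, which is why the PR's test borrows an existing Java enum rather than defining one in Scala.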
[GitHub] spark issue #20973: [SPARK-20114][ML] spark.ml parity for sequential pattern...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20973 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20973: [SPARK-20114][ML] spark.ml parity for sequential pattern...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20973 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1943/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20973: [SPARK-20114][ML] spark.ml parity for sequential pattern...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20973 **[Test build #88873 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88873/testReport)** for PR 20973 at commit [`d563c8f`](https://github.com/apache/spark/commit/d563c8fab0cb718b511ac78bc38e712a65148d17). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20810: [SPARK-20114][ML] spark.ml parity for sequential ...
Github user WeichenXu123 closed the pull request at: https://github.com/apache/spark/pull/20810 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20810: [SPARK-20114][ML] spark.ml parity for sequential pattern...
Github user WeichenXu123 commented on the issue:

    https://github.com/apache/spark/pull/20810

    Following @jkbradley's suggestion, I created a new PR that only uses a static method.
[GitHub] spark pull request #20973: [SPARK-20114][ML] spark.ml parity for sequential ...
GitHub user WeichenXu123 opened a pull request:

    https://github.com/apache/spark/pull/20973

    [SPARK-20114][ML] spark.ml parity for sequential pattern mining - PrefixSpan

    ## What changes were proposed in this pull request?

    PrefixSpan API for spark.ml. New implementation instead of #20810

    ## How was this patch tested?

    N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/WeichenXu123/spark prefixSpan2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20973.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20973

commit d563c8fab0cb718b511ac78bc38e712a65148d17
Author: WeichenXu
Date:   2018-04-04T04:42:05Z

    init pr
[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20786 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88868/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20786 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20786

    **[Test build #88868 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88868/testReport)** for PR 20786 at commit [`48c17d4`](https://github.com/apache/spark/commit/48c17d4dff6a4e82b86d70f3845e6d524b4807e5).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `sealed trait ClassificationNode extends Node`
      * `sealed trait RegressionNode extends Node`
      * `sealed trait LeafNode extends Node`
      * `sealed trait InternalNode extends Node`
[GitHub] spark issue #20969: [SPARK-23826] [TEST] TestHiveSparkSession should set def...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20969 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1942/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20969: [SPARK-23826] [TEST] TestHiveSparkSession should set def...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20969 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20969: [SPARK-23826] [TEST] TestHiveSparkSession should set def...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20969 **[Test build #88872 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88872/testReport)** for PR 20969 at commit [`f7e0b03`](https://github.com/apache/spark/commit/f7e0b034026691872c905ab4d5d09c381c56b7b0). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20969: [SPARK-23826] [TEST] TestHiveSparkSession should ...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20969#discussion_r179020152

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala ---
    @@ -159,9 +159,10 @@ private[hive] class TestHiveSparkSession(
         private val loadTestTables: Boolean)
       extends SparkSession(sc) with Logging { self =>

    -  // TODO(SPARK-23826): TestHiveSparkSession should set default session the same way as
    -  // TestSparkSession, but doing this the same way breaks many tests in the package. We need
    -  // to investigate and find a different strategy.
    +  // The base spark session does this in getOrCreate(), here we emulate that behavior for tests.
    +  if (SparkSession.getDefaultSession.isEmpty) {
    +    SparkSession.setDefaultSession(this)
    +  }
    --- End diff --

    This is not needed after we merge https://github.com/apache/spark/pull/20927
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20971 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20971 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1941/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20971 **[Test build #88870 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88870/testReport)** for PR 20971 at commit [`36fa1bd`](https://github.com/apache/spark/commit/36fa1bdc847f0b5ffb61284a35f3183751255705). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20953: [SPARK-23822][SQL] Improve error message for Parquet sch...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20953 **[Test build #88871 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88871/testReport)** for PR 20953 at commit [`a06ad5e`](https://github.com/apache/spark/commit/a06ad5e0451c3ff8bf7104512f32161bf66ed696). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20953: [SPARK-23822][SQL] Improve error message for Parquet sch...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20953 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20797: [SPARK-23583][SQL] Invoke should support interpreted exe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20797 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1940/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20971 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20797: [SPARK-23583][SQL] Invoke should support interpreted exe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20797 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20797: [SPARK-23583][SQL] Invoke should support interpreted exe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20797 **[Test build #88869 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88869/testReport)** for PR 20797 at commit [`c568944`](https://github.com/apache/spark/commit/c568944a98ce35c79809283a68ec95454029d0ea). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20971 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20971 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88867/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20971 **[Test build #88867 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88867/testReport)** for PR 20971 at commit [`36fa1bd`](https://github.com/apache/spark/commit/36fa1bdc847f0b5ffb61284a35f3183751255705). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20953: [SPARK-23822][SQL] Improve error message for Parquet sch...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20953 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88866/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20953: [SPARK-23822][SQL] Improve error message for Parquet sch...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20953 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20953: [SPARK-23822][SQL] Improve error message for Parquet sch...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20953 **[Test build #88866 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88866/testReport)** for PR 20953 at commit [`a06ad5e`](https://github.com/apache/spark/commit/a06ad5e0451c3ff8bf7104512f32161bf66ed696). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20972: Fixes misspelling in configuration.md
Github user bradurani closed the pull request at: https://github.com/apache/spark/pull/20972 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20886: [SPARK-19724][SQL]create a managed table with an existed...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20886 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19968: [SPARK-22769][CORE] When driver stopping, there i...
Github user KaiXinXiaoLei closed the pull request at: https://github.com/apache/spark/pull/19968 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20886: [SPARK-19724][SQL]create a managed table with an existed...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20886 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88860/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19968: [SPARK-22769][CORE] When driver stopping, there is error...
Github user KaiXinXiaoLei commented on the issue:

    https://github.com/apache/spark/pull/19968

    I am no longer working on this problem, so I am closing this PR.
[GitHub] spark issue #20886: [SPARK-19724][SQL]create a managed table with an existed...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20886 **[Test build #88860 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88860/testReport)** for PR 20886 at commit [`7a3311c`](https://github.com/apache/spark/commit/7a3311c2cbd3d9f7399abb38bd877bbd23ca836e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20928: [MINOR][DOC] Fix some typos and grammar issues
Github user dsakuma commented on the issue: https://github.com/apache/spark/pull/20928 @HyukjinKwon I've fixed the title format :D --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20640: [SPARK-19755][Mesos] Blacklist is always active f...
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20640#discussion_r179013270

    --- Diff: resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackendSuite.scala ---
    @@ -108,6 +108,28 @@ class MesosCoarseGrainedSchedulerBackendSuite extends SparkFunSuite
         verifyTaskLaunched(driver, "o2")
       }

    +  test("mesos declines offers from blacklisted slave") {
    +    setBackend()
    +
    +    // launches a task on a valid offer on slave s1
    +    val minMem = backend.executorMemory(sc) + 1024
    +    val minCpu = 4
    +    val offer1 = Resources(minMem, minCpu)
    +    offerResources(List(offer1))
    +    verifyTaskLaunched(driver, "o1")
    +
    +    // for any reason executor(aka mesos task) failed on s1
    +    val status = createTaskStatus("0", "s1", TaskState.TASK_FAILED)
    +    backend.statusUpdate(driver, status)
    +    when(taskScheduler.nodeBlacklist()).thenReturn(Set("hosts1"))
    --- End diff --

    just to re-iterate my point above -- in many cases, having an executor fail will *not* lead to `taskScheduler.nodeBlacklist()` changing as you're doing here.
[GitHub] spark pull request #20640: [SPARK-19755][Mesos] Blacklist is always active f...
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20640#discussion_r179012299

    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala ---
    @@ -648,14 +645,8 @@ private[spark] class MesosCoarseGrainedSchedulerBackend(
               totalGpusAcquired -= gpus
               gpusByTaskId -= taskId
             }
    -        // If it was a failure, mark the slave as failed for blacklisting purposes
             if (TaskState.isFailed(state)) {
    -          slave.taskFailures += 1
    -
    -          if (slave.taskFailures >= MAX_SLAVE_FAILURES) {
    -            logInfo(s"Blacklisting Mesos slave $slaveId due to too many failures; " +
    -              "is Spark installed on it?")
    -          }
    +          logError(s"Task $taskId failed on Mesos slave $slaveId.")
    --- End diff --

    minor: I think it would be nice to say "Mesos task $taskId...". Maybe it's obvious for those spending more time with mesos, but I find I'm easily confused by the difference between a mesos task and a spark task.
[GitHub] spark pull request #20640: [SPARK-19755][Mesos] Blacklist is always active f...
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20640#discussion_r179012891

    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala ---
    @@ -571,7 +568,7 @@ private[spark] class MesosCoarseGrainedSchedulerBackend(
           cpus + totalCoresAcquired <= maxCores &&
           mem <= offerMem &&
           numExecutors < executorLimit &&
    -      slaves.get(slaveId).map(_.taskFailures).getOrElse(0) < MAX_SLAVE_FAILURES &&
    +      !scheduler.nodeBlacklist().contains(offerHostname) &&
    --- End diff --

    I just want to make really sure everybody understands the big change in behavior here -- `nodeBlacklist()` currently *only* gets updated based on failures in *spark* tasks. If a mesos task fails to even start -- that is, if a spark executor fails to launch on a node -- `nodeBlacklist` does not get updated. So you could have a node that is misconfigured somehow, and after this change you might end up repeatedly trying to launch executors on it, with the executor failing to even start each time. That is true even if you have blacklisting on.

    This is SPARK-16630 for the non-mesos case. That is being actively worked on now -- however the work there will probably have to be yarn-specific, so there will still be followup work to get the same thing for mesos after that is in.
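The behavior change described in the comment above can be illustrated with a small model of the two offer checks. This is a sketch under stated assumptions, not the backend's code: `acceptsOld` and `acceptsNew` are hypothetical functions that isolate the one condition the diff swaps, and the threshold value `2` is assumed for illustration because the diff does not show the value of `MAX_SLAVE_FAILURES`.

```scala
object OfferFilterSketch {
  // Assumed value for illustration only; the real constant is not shown in the diff.
  val MaxSlaveFailures = 2

  // Old condition: the backend counts failed Mesos tasks (executors) per slave.
  def acceptsOld(taskFailures: Map[String, Int], slaveId: String): Boolean =
    taskFailures.getOrElse(slaveId, 0) < MaxSlaveFailures

  // New condition: defer to the scheduler's node blacklist, keyed by hostname.
  def acceptsNew(nodeBlacklist: Set[String], offerHostname: String): Boolean =
    !nodeBlacklist.contains(offerHostname)

  def main(args: Array[String]): Unit = {
    // Scenario from the comment: executors on slave s1 (host1) fail to launch twice,
    // so no Spark task ever runs there and the node blacklist is never updated.
    val mesosTaskFailures = Map("s1" -> 2)
    val nodeBlacklist = Set.empty[String]

    println(s"old check accepts offer: ${acceptsOld(mesosTaskFailures, "s1")}")
    println(s"new check accepts offer: ${acceptsNew(nodeBlacklist, "host1")}")
  }
}
```

In this scenario the old counter stops accepting offers from the slave, while the blacklist-based check keeps accepting them, which is the repeated-launch risk the comment points out.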
[GitHub] spark issue #20933: [SPARK-23817][SQL]Migrate ORC file format read path to d...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20933 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88859/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20933: [SPARK-23817][SQL]Migrate ORC file format read path to d...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20933 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20933: [SPARK-23817][SQL]Migrate ORC file format read path to d...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20933 **[Test build #88859 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88859/testReport)** for PR 20933 at commit [`ffbf2f8`](https://github.com/apache/spark/commit/ffbf2f88c224fcafce003121695ab91774db0776). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20249: [SPARK-23057][SPARK-19235][SQL] SET LOCATION should chan...
Github user xubo245 commented on the issue: https://github.com/apache/spark/pull/20249 This belongs to the TODO work, @tgravescs.
[GitHub] spark pull request #20886: [SPARK-19724][SQL]create a managed table with an ...
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20886#discussion_r179008019

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -298,15 +299,32 @@ class SessionCatalog(
         makeQualifiedPath(tableDefinition.storage.locationUri.get)
       tableDefinition.copy(
         storage = tableDefinition.storage.copy(locationUri = Some(qualifiedTableLocation)),
-        identifier = TableIdentifier(table, Some(db)))
+        identifier = tableIdentifier)
     } else {
-      tableDefinition.copy(identifier = TableIdentifier(table, Some(db)))
+      tableDefinition.copy(identifier = tableIdentifier)
     }

     requireDbExists(db)
+    if (!ignoreIfExists) {
+      validateTableLocation(newTableDefinition)
+    }
     externalCatalog.createTable(newTableDefinition, ignoreIfExists)
   }

+  def validateTableLocation(table: CatalogTable): Unit = {
+    // SPARK-19724: the default location of a managed table should be non-existent or empty.
+    if (table.tableType == CatalogTableType.MANAGED && !conf.allowNonemptyManagedTableLocation) {
+      val tableLocation =
+        new Path(table.storage.locationUri.getOrElse(defaultTablePath(table.identifier)))
+      val fs = tableLocation.getFileSystem(hadoopConf)
+
+      if (fs.exists(tableLocation) && fs.listStatus(tableLocation).nonEmpty) {
+        throw new AnalysisException(s"Can not create the managed table('${table.identifier}')" +
--- End diff --

`Can not` -> `Not allowed to`
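The check under review (reject a managed table whose location already exists and is non-empty) can be sketched outside Spark. This is a minimal illustration, assuming a local filesystem and `java.nio.file` instead of the Hadoop `FileSystem` used in the diff. `TableLocationSketch`, its method names, the `allowNonEmpty` flag and the exception type are all hypothetical, and the message text after the reviewer's suggested `Not allowed to` prefix is made up, since the original message is cut off in the diff. Unlike `fs.exists && fs.listStatus.nonEmpty`, this sketch treats a plain file at the location as empty.

```scala
import java.nio.file.{Files, Path}

// Hypothetical local-filesystem stand-in for the Hadoop-based check in the diff.
object TableLocationSketch {
  // True when the path is a directory that contains at least one entry.
  def isNonEmptyDir(location: Path): Boolean = {
    if (!Files.isDirectory(location)) {
      false
    } else {
      val entries = Files.list(location)
      try entries.findAny().isPresent
      finally entries.close()
    }
  }

  // Throws when the location is non-empty and the (hypothetical) override flag is off.
  def validateManagedLocation(tableName: String, location: Path, allowNonEmpty: Boolean): Unit = {
    if (!allowNonEmpty && isNonEmptyDir(location)) {
      // Only the "Not allowed to" prefix comes from the review; the rest is illustrative.
      throw new IllegalStateException(
        s"Not allowed to create the managed table ('$tableName'): location $location is not empty")
    }
  }
}
```

A missing or empty directory passes validation. A directory with any entry is rejected unless the flag is set, which matches the intent of the `!conf.allowNonemptyManagedTableLocation` guard in the diff.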
[GitHub] spark pull request #20945: [SPARK-23790][Mesos] fix metastore connection iss...
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20945#discussion_r179007924

--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
@@ -506,6 +506,10 @@ private[spark] class MesosClusterScheduler(
       options ++= Seq("--class", desc.command.mainClass)
     }

+    desc.conf.getOption("spark.mesos.proxyUser").foreach { v =>
+      options ++= Seq("--proxy-user", v)
--- End diff --

> Yes because the assumption was client mode was safe. There is no warning about this

Could probably use something in the documentation - warnings printed to logs are easily ignored.

Still, there are legitimate uses for client mode + proxy user, but I don't think this is one of them.
[GitHub] spark pull request #20886: [SPARK-19724][SQL]create a managed table with an ...
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20886#discussion_r179007898

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1152,6 +1152,13 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)

+  val ALLOW_NONEMPTY_MANAGED_TABLE_LOCATION =
+    buildConf("spark.sql.allowNonemptyManagedTableLocation")
--- End diff --

`spark.sql.allowCreateManagedTableUsingNonemptyLocation`

Also this should be an internal conf
[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20786 **[Test build #88868 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88868/testReport)** for PR 20786 at commit [`48c17d4`](https://github.com/apache/spark/commit/48c17d4dff6a4e82b86d70f3845e6d524b4807e5). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20786 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1939/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20786 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/20786 @jkbradley Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20928: Fix small typo in configuration doc
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20928 @dsakuma, mind if I ask you to change the PR title to `[MINOR][DOC] ...`, just to be consistent with other PRs? It's not a small typo anymore :). Thanks for your effort.
[GitHub] spark issue #20972: Fixes misspelling in configuration.md
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20972 We can close this just for clarification. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20972: Fixes misspelling in configuration.md
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20972 Please do a quick search before opening a PR. There are two duplicate PRs: https://github.com/apache/spark/pull/20948 and https://github.com/apache/spark/pull/20928
[GitHub] spark issue #20972: Fixes misspelling in configuration.md
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20972 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20923: [SPARK-23807][BUILD][WIP] Add Hadoop 3 profile with rele...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20923

Hi @steveloughran, I think you missed this comment. You need to create a deps file under dev/deps and change the related script.

> Also I think we need to create a related spark-deps-hadoop-3.x under dev/deps and make dependency check work for Hadoop 3.
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20971 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20971 **[Test build #88867 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88867/testReport)** for PR 20971 at commit [`36fa1bd`](https://github.com/apache/spark/commit/36fa1bdc847f0b5ffb61284a35f3183751255705). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20971 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1938/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20972: Fixes misspelling in configuration.md
GitHub user bradurani opened a pull request:

    https://github.com/apache/spark/pull/20972

    Fixes misspelling in configuration.md

    ## What changes were proposed in this pull request?

    Fixes a misspelling in configuration.md. Changes `spark-defalut.conf` to `spark-default.conf`

    ## How was this patch tested?

    Viewed the new markdown in Github

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bradurani/spark bu/fix_docs_misspelling

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20972.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20972

commit e346e677cd2b783b4fa39e7bf6a59eee0a40eb1a
Author: Brad Urani
Date: 2018-04-04T00:44:23Z

    Fixes misspelling in configuration.md

    spark-defalut.conf -> spark-default.conf
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20971 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88865/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20971 **[Test build #88865 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88865/testReport)** for PR 20971 at commit [`e429af1`](https://github.com/apache/spark/commit/e429af1e9a5a2f8ed3e90ee215d561c05aeb33b3). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20971 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20971 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1937/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20971 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20971 **[Test build #88865 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88865/testReport)** for PR 20971 at commit [`e429af1`](https://github.com/apache/spark/commit/e429af1e9a5a2f8ed3e90ee215d561c05aeb33b3). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20953: [SPARK-23822][SQL] Improve error message for Parquet sch...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20953 **[Test build #88866 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88866/testReport)** for PR 20953 at commit [`a06ad5e`](https://github.com/apache/spark/commit/a06ad5e0451c3ff8bf7104512f32161bf66ed696). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20953: [SPARK-23822][SQL] Improve error message for Parquet sch...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20953 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20971: [SPARK-23809][SQL][backport] Active SparkSession ...
GitHub user ericl opened a pull request:

    https://github.com/apache/spark/pull/20971

    [SPARK-23809][SQL][backport] Active SparkSession should be set by getOrCreate

    This backports https://github.com/apache/spark/pull/20927 to branch-2.3

    ## What changes were proposed in this pull request?

    Currently, the active spark session is set inconsistently (e.g., in createDataFrame, prior to query execution). Many places in spark also incorrectly query active session when they should be calling activeSession.getOrElse(defaultSession) and so might get None even if a Spark session exists.

    The semantics here can be cleaned up if we also set the active session when the default session is set.

    Related: https://github.com/apache/spark/pull/20926/files

    ## How was this patch tested?

    Unit test, existing test. Note that if https://github.com/apache/spark/pull/20926 merges first we should also update the tests there.

    Author: Eric Liang

    Closes #20927 from ericl/active-session-cleanup.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ericl/spark backport-23809

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20971.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20971

commit f2303dcef61660dabfd08be5568b7da10cf1b117
Author: Eric Liang
Date: 2018-04-04T00:09:12Z

    [SPARK-23809][SQL] Active SparkSession should be set by getOrCreate

    ## What changes were proposed in this pull request?

    Currently, the active spark session is set inconsistently (e.g., in createDataFrame, prior to query execution). Many places in spark also incorrectly query active session when they should be calling activeSession.getOrElse(defaultSession) and so might get None even if a Spark session exists.

    The semantics here can be cleaned up if we also set the active session when the default session is set.

    Related: https://github.com/apache/spark/pull/20926/files

    ## How was this patch tested?

    Unit test, existing test. Note that if https://github.com/apache/spark/pull/20926 merges first we should also update the tests there.

    Author: Eric Liang

    Closes #20927 from ericl/active-session-cleanup.

commit e429af1e9a5a2f8ed3e90ee215d561c05aeb33b3
Author: Eric Liang
Date: 2018-04-04T00:30:50Z

    Tue Apr 3 17:30:50 PDT 2018
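The active-versus-default distinction in this PR description can be modelled in a few lines. What follows is a simplified sketch, not the real `SparkSession` code: `Session`, `SessionRegistrySketch` and both `getOrCreate` variants are hypothetical names, and the registry only assumes a thread-local "active" slot plus a global "default" slot, as the description implies. It shows why a caller that looks only at the active session can see `None` even though a default session exists, and how setting the active session alongside the default removes that case.

```scala
import java.util.concurrent.atomic.AtomicReference

// Simplified stand-in; not the real SparkSession.
final case class Session(name: String)

// Hypothetical registry with a per-thread active slot and a global default slot.
object SessionRegistrySketch {
  private val activeThreadSession = new InheritableThreadLocal[Session]
  private val defaultSession = new AtomicReference[Session]

  def getActiveSession: Option[Session] = Option(activeThreadSession.get)
  def getDefaultSession: Option[Session] = Option(defaultSession.get)

  def clear(): Unit = {
    activeThreadSession.remove()
    defaultSession.set(null)
  }

  // Models the behavior described as inconsistent: only the default is set.
  def getOrCreateDefaultOnly(name: String): Session = synchronized {
    getActiveSession.orElse(getDefaultSession).getOrElse {
      val created = Session(name)
      defaultSession.set(created)
      created
    }
  }

  // Models the proposed semantics: set the active session when the default is set.
  def getOrCreate(name: String): Session = synchronized {
    getActiveSession.orElse(getDefaultSession).getOrElse {
      val created = Session(name)
      defaultSession.set(created)
      activeThreadSession.set(created)
      created
    }
  }
}
```

With the first variant, `getActiveSession` stays empty after creation, so callers must fall back with `getActiveSession.orElse(getDefaultSession)`. With the second, a lookup of the active session alone succeeds on the creating thread.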
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user ericl commented on the issue: https://github.com/apache/spark/pull/20971 @gatorsmile here's the patch --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20953: [SPARK-23822][SQL] Improve error message for Parquet sch...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20953 LGTM pending Jenkins --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20914: [SPARK-23802][SQL] PropagateEmptyRelation can lea...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20914 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20953: [SPARK-23822][SQL] Improve error message for Parquet sch...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20953 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20914: [SPARK-23802][SQL] PropagateEmptyRelation can leave quer...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20914 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20939: [SPARK-23823][SQL] ResolveReferences should preserve tre...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20939 https://github.com/apache/spark/pull/20961 looks good to me. Could you close this PR? Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20961: [SPARK-23823][SQL] Keep origin in transformExpression
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20961 This sounds right to me. cc @hvanhovell @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20961: [SPARK-23823][SQL] Keep origin in transformExpres...
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20961#discussion_r178997765

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala ---
@@ -103,7 +103,7 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]] extends TreeNode[PlanT
     var changed = false

     @inline def transformExpression(e: Expression): Expression = {
-      val newE = f(e)
+      val newE = CurrentOrigin.withOrigin(e.origin) { f(e) }
--- End diff --

Nit: style issue:

```Scala
val newE = CurrentOrigin.withOrigin(e.origin) {
  f(e)
}
```
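Why wrapping `f(e)` in `withOrigin(e.origin)` preserves the origin can be seen in a small model of the pattern. This is a sketch under assumptions, not the Catalyst implementation: `Origin`, `OriginContext`, `Expr` and `TransformSketch` are simplified, hypothetical stand-ins. The model assumes that a node captures the ambient thread-local origin when it is constructed. Restoring the previous origin after the block is a choice made for this sketch.

```scala
// Simplified model of origin tracking; these are not the Catalyst classes.
final case class Origin(line: Option[Int] = None)

object OriginContext {
  private val current = new ThreadLocal[Origin] {
    override def initialValue(): Origin = Origin()
  }

  def get: Origin = current.get()

  // Runs body with the given origin installed, then restores the previous one.
  def withOrigin[A](o: Origin)(body: => A): A = {
    val previous = current.get()
    current.set(o)
    try body
    finally current.set(previous)
  }
}

// Nodes capture the ambient origin at construction time.
final case class Expr(name: String) {
  val origin: Origin = OriginContext.get
}

object TransformSketch {
  // Any node that f builds while rewriting e inherits e's origin.
  def transform(e: Expr)(f: Expr => Expr): Expr =
    OriginContext.withOrigin(e.origin) {
      f(e)
    }
}
```

Without the wrapper, a replacement node created inside `f` would pick up whatever origin happened to be ambient (the empty one in this model) and lose the position information of the expression it replaces.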
[GitHub] spark issue #20909: [SPARK-23776][python][test] Check for needed components/...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20909 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20909: [SPARK-23776][python][test] Check for needed components/...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20909 **[Test build #88863 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88863/testReport)** for PR 20909 at commit [`db14acb`](https://github.com/apache/spark/commit/db14acbb3a90c9da184fc9c909640e07100c38fa). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20909: [SPARK-23776][python][test] Check for needed components/...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20909 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88863/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20910: [SPARK-22839] [K8s] Refactor to unify driver and executo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20910 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20910: [SPARK-22839] [K8s] Refactor to unify driver and executo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20910 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88864/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20910: [SPARK-22839] [K8s] Refactor to unify driver and executo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20910 **[Test build #88864 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88864/testReport)** for PR 20910 at commit [`7d65875`](https://github.com/apache/spark/commit/7d65875266ba94f27d2cc8ec992e8e1cb8f593b5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20910: [SPARK-22839] [K8s] Refactor to unify driver and executo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20910 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1903/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org