[GitHub] [spark] SparkQA commented on pull request #30864: [SPARK-33857][SQL] Unify the default seed of random functions
SparkQA commented on pull request #30864: URL: https://github.com/apache/spark/pull/30864#issuecomment-749995396 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37865/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #30538: [SPARK-33591][SQL] Recognize `null` in partition spec values
SparkQA removed a comment on pull request #30538: URL: https://github.com/apache/spark/pull/30538#issuecomment-749988674 **[Test build #133273 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133273/testReport)** for PR 30538 at commit [`343d15d`](https://github.com/apache/spark/commit/343d15d44c67f2139b9a555dbbaa10f684ddb094).
[GitHub] [spark] SparkQA removed a comment on pull request #30900: [SPARK-33886][SQL] UnresolvedTable should retain SQL text position for DDL commands
SparkQA removed a comment on pull request #30900: URL: https://github.com/apache/spark/pull/30900#issuecomment-749968313 **[Test build #133265 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133265/testReport)** for PR 30900 at commit [`7771bb2`](https://github.com/apache/spark/commit/7771bb2700b66d50bf3140f16142b56b13f95b83).
[GitHub] [spark] imback82 commented on pull request #30900: [SPARK-33886][SQL] UnresolvedTable should retain SQL text position for DDL commands
imback82 commented on pull request #30900: URL: https://github.com/apache/spark/pull/30900#issuecomment-749994396 cc @cloud-fan
[GitHub] [spark] AmplabJenkins commented on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
AmplabJenkins commented on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-749994083 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37864/
[GitHub] [spark] SparkQA commented on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
SparkQA commented on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-749994064 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37864/
[GitHub] [spark] SparkQA commented on pull request #30893: [SPARK-33881][SQL][TESTS] Check null and empty string as partition values in DS v1 and v2 tests
SparkQA commented on pull request #30893: URL: https://github.com/apache/spark/pull/30893#issuecomment-749993904 **[Test build #133279 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133279/testReport)** for PR 30893 at commit [`ee4c365`](https://github.com/apache/spark/commit/ee4c36540f481afc98f90ea3e84e17d9d0033d77).
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30893: [SPARK-33881][SQL][TESTS] Check null and empty string as partition values in DS v1 and v2 tests
dongjoon-hyun commented on a change in pull request #30893: URL: https://github.com/apache/spark/pull/30893#discussion_r547771511

## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/command/ShowPartitionsSuite.scala
## @@ -17,6 +17,31 @@ package org.apache.spark.sql.hive.execution.command

+import org.apache.spark.sql.{Row, SaveMode}
 import org.apache.spark.sql.execution.command.v1

-class ShowPartitionsSuite extends v1.ShowPartitionsSuiteBase with CommandSuiteBase
+class ShowPartitionsSuite extends v1.ShowPartitionsSuiteBase with CommandSuiteBase {
+  test("null and empty string as partition values") {

Review comment: Oh, shall we proceed with that NPE fix PR first, @MaxGekk?
[GitHub] [spark] AmplabJenkins commented on pull request #30900: [SPARK-33886][SQL] UnresolvedTable should retain SQL text position for DDL commands
AmplabJenkins commented on pull request #30900: URL: https://github.com/apache/spark/pull/30900#issuecomment-749993486 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37863/
[GitHub] [spark] SparkQA commented on pull request #30900: [SPARK-33886][SQL] UnresolvedTable should retain SQL text position for DDL commands
SparkQA commented on pull request #30900: URL: https://github.com/apache/spark/pull/30900#issuecomment-749993474 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37863/
[GitHub] [spark] dongjoon-hyun commented on pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on pull request #29966: URL: https://github.com/apache/spark/pull/29966#issuecomment-749992829 Hi, @AngersZh. This PR looks mostly ready, as @maropu wrote. I left a few comments.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547767161

## File path: core/src/test/scala/org/apache/spark/SparkContextSuite.scala
## @@ -445,6 +445,29 @@ class SparkContextSuite extends SparkFunSuite with LocalSparkContext with Eventu
     }
   }

+  test("SPARK-33084: add jar when path contains comma") {

Review comment: Thanks. Yes. Please remove this.
[GitHub] [spark] MaxGekk commented on a change in pull request #30893: [SPARK-33881][SQL][TESTS] Check null and empty string as partition values in DS v1 and v2 tests
MaxGekk commented on a change in pull request #30893: URL: https://github.com/apache/spark/pull/30893#discussion_r547766855

## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/command/ShowPartitionsSuite.scala
## @@ -17,6 +17,31 @@ package org.apache.spark.sql.hive.execution.command

+import org.apache.spark.sql.{Row, SaveMode}
 import org.apache.spark.sql.execution.command.v1

-class ShowPartitionsSuite extends v1.ShowPartitionsSuiteBase with CommandSuiteBase
+class ShowPartitionsSuite extends v1.ShowPartitionsSuiteBase with CommandSuiteBase {
+  test("null and empty string as partition values") {

Review comment: It fails with NPE :-(

```scala
test("null and empty string as partition values") {
  import testImplicits._
  withNamespaceAndTable("ns", "tbl") { t =>
    val df = Seq((0, ""), (1, null)).toDF("a", "part")
    df.write
      .partitionBy("part")
      .format("parquet")
      .mode(SaveMode.Overwrite)
      .saveAsTable(t)
    runShowPartitionsSql(s"SHOW PARTITIONS $t", Row("part=null") :: Nil)
  }
}
```

```
java.lang.NullPointerException was thrown.
java.lang.NullPointerException
	at org.apache.spark.sql.execution.datasources.v2.ShowPartitionsExec.$anonfun$run$3(ShowPartitionsExec.scala:58)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
```

Let me fix this separately since this PR is about tests.
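As context for the stack trace above: the NPE comes from mapping over partition values that can be null. The following is a hypothetical illustration, not the actual Spark fix, and the class and method names are invented. It shows a null-safe rendering of a partition spec that would yield the `part=null` row the test expects, since `String.valueOf` maps a null value to the literal string "null":

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration, not the actual Spark fix: render a partition
// spec as "col=value" pairs while tolerating null values. String.valueOf
// turns a null value into the literal "null", matching the Row("part=null")
// result expected by the test above.
public class PartitionSpecRender {
    public static String render(Map<String, String> spec) {
        return spec.entrySet().stream()
            .map(e -> e.getKey() + "=" + String.valueOf(e.getValue()))
            .collect(Collectors.joining("/"));
    }

    public static void main(String[] args) {
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("part", null);
        System.out.println(render(spec)); // part=null
    }
}
```

The unguarded equivalent (calling a method on `e.getValue()` directly) is exactly the kind of code that throws the NullPointerException shown in the trace.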
[GitHub] [spark] AmplabJenkins commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
AmplabJenkins commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-749991501 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37872/
[GitHub] [spark] Ngone51 commented on pull request #30716: [SPARK-33747][CORE] Avoid calling unregisterMapOutput when the map stage is being rerunning.
Ngone51 commented on pull request #30716: URL: https://github.com/apache/spark/pull/30716#issuecomment-749991166

After taking a look at the updated UT, I think the problem mentioned by @weixiuli is:

1) Task X of Stage B encounters a FetchFailure while fetching the shuffle data of Stage A at executor-0; Stage A unregisters the MapOutput for mapIndex-0.
2) Stage A and Stage B are both marked as failed, and Stage A starts to rerun after a while.
3) Task Y of the rerun Stage A (also running at executor-0) succeeds and registers its MapOutput for the same mapIndex-0.
4) Task Z of Stage B encounters a FetchFailure at executor-0 too, and Stage A unregisters the MapOutput of mapIndex-1 again.

I think what @weixiuli is trying to do here is to avoid unregistering the MapOutput again in step 4. However, I wonder whether this could happen in a real job; or say, it's really a corner case. I think tasks like task Z usually fail early, before the rerun task Y succeeds.
[GitHub] [spark] AmplabJenkins commented on pull request #30538: [SPARK-33591][SQL] Recognize `null` in partition spec values
AmplabJenkins commented on pull request #30538: URL: https://github.com/apache/spark/pull/30538#issuecomment-749991113 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133273/
[GitHub] [spark] SparkQA commented on pull request #30538: [SPARK-33591][SQL] Recognize `null` in partition spec values
SparkQA commented on pull request #30538: URL: https://github.com/apache/spark/pull/30538#issuecomment-749991086 **[Test build #133273 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133273/testReport)** for PR 30538 at commit [`343d15d`](https://github.com/apache/spark/commit/343d15d44c67f2139b9a555dbbaa10f684ddb094).

* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #30763: [SPARK-31801][API][SHUFFLE] Register map output metadata
SparkQA commented on pull request #30763: URL: https://github.com/apache/spark/pull/30763#issuecomment-749990686 **[Test build #133278 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133278/testReport)** for PR 30763 at commit [`b468869`](https://github.com/apache/spark/commit/b46886998e5161484cac5bad17c79fdd7e91d91b).
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547763450

## File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
## @@ -1348,6 +1348,7 @@ private[spark] object SparkSubmitUtils {
       coordinates: String,
       ivySettings: IvySettings,
       exclusions: Seq[String] = Nil,
+      transitive: Boolean = true,

Review comment: What happens if we remove this default value? If it doesn't require many changes, let's remove this default value.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547761656

## File path: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala
## @@ -159,6 +161,13 @@ class SessionResourceLoader(session: SparkSession) extends FunctionResourceLoade
     }
   }

+  protected def resolveJars(path: URI): Seq[String] = {

Review comment: Although this class is `SessionResourceLoader`, `resolveJars` looks more like a utility function. Do we need to add it here as a `protected def`?
[GitHub] [spark] SparkQA commented on pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation
SparkQA commented on pull request #29087: URL: https://github.com/apache/spark/pull/29087#issuecomment-749989168 **[Test build #133277 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133277/testReport)** for PR 29087 at commit [`8df104b`](https://github.com/apache/spark/commit/8df104beb258fe02a0a9f7ad01bf71bf8309a5f8).
[GitHub] [spark] SparkQA commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
SparkQA commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-749988991 **[Test build #133274 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133274/testReport)** for PR 30387 at commit [`4101603`](https://github.com/apache/spark/commit/4101603d92ea1ffe92e4205777c7fdb8492c2969).
[GitHub] [spark] SparkQA commented on pull request #29982: [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql
SparkQA commented on pull request #29982: URL: https://github.com/apache/spark/pull/29982#issuecomment-749988953 **[Test build #133275 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133275/testReport)** for PR 29982 at commit [`5cbd308`](https://github.com/apache/spark/commit/5cbd30850b1403e34e489ba1adbbafc8cf82894a).
[GitHub] [spark] AmplabJenkins commented on pull request #30900: [SPARK-33886][SQL] UnresolvedTable should retain SQL text position for DDL commands
AmplabJenkins commented on pull request #30900: URL: https://github.com/apache/spark/pull/30900#issuecomment-749988933 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133265/
[GitHub] [spark] SparkQA commented on pull request #30900: [SPARK-33886][SQL] UnresolvedTable should retain SQL text position for DDL commands
SparkQA commented on pull request #30900: URL: https://github.com/apache/spark/pull/30900#issuecomment-749988765 **[Test build #133265 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133265/testReport)** for PR 30900 at commit [`7771bb2`](https://github.com/apache/spark/commit/7771bb2700b66d50bf3140f16142b56b13f95b83).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #30538: [SPARK-33591][SQL] Recognize `null` in partition spec values
SparkQA commented on pull request #30538: URL: https://github.com/apache/spark/pull/30538#issuecomment-749988674 **[Test build #133273 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133273/testReport)** for PR 30538 at commit [`343d15d`](https://github.com/apache/spark/commit/343d15d44c67f2139b9a555dbbaa10f684ddb094).
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547759415

## File path: docs/sql-ref-syntax-aux-resource-mgmt-add-jar.md
## @@ -33,15 +33,30 @@ ADD JAR file_name

 * **file_name**

-    The name of the JAR file to be added. It could be either on a local file system or a distributed file system.
+    The name of the JAR file to be added. It could be either on a local file system or a distributed file system or an ivy URL.
+    Apache Ivy is a popular dependency manager focusing on flexibility and simplicity. Now we support two parameter in URL query string:
+      * transitive: whether to download dependent jars related to your ivy URL. It is case-sensitive and only take last one if multiple transitive parameters are specified.
+      * exclude: exclusion list when download ivy URL jar and dependent jars.
+
+    User can write ivy URL such as:
+
+      ivy://group:module:version
+      ivy://group:module:version?transitive=[true|false]
+      ivy://group:module:version?exclude=group:module,group:module
+      ivy://group:module:version?exclude=group:module,group:module&transitive=[true|false]
+
 ### Examples

 ```sql
 ADD JAR /tmp/test.jar;
 ADD JAR "/path/to/some.jar";
 ADD JAR '/some/other.jar';
 ADD JAR "/path with space/abc.jar";
+ADD JAR "ivy://org.apache.hive:hive-contrib:2.3.7";
+ADD JAR "ivy://org.apache.hive:hive-contrib:2.3.7?transitive=false"
 ```

Review comment: Just a question. Without `transitive=true`, does `hive-contrib` work correctly?
[GitHub] [spark] SparkQA commented on pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
SparkQA commented on pull request #29966: URL: https://github.com/apache/spark/pull/29966#issuecomment-749988240 **[Test build #133276 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133276/testReport)** for PR 29966 at commit [`6bd41cd`](https://github.com/apache/spark/commit/6bd41cd85462a6bef0e8be61e9a61c2d628bc8fe).
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547758123

## File path: docs/sql-ref-syntax-aux-resource-mgmt-add-jar.md
## @@ -33,15 +33,30 @@ ADD JAR file_name

 * **file_name**

-    The name of the JAR file to be added. It could be either on a local file system or a distributed file system.
+    The name of the JAR file to be added. It could be either on a local file system or a distributed file system or an ivy URL.
+    Apache Ivy is a popular dependency manager focusing on flexibility and simplicity. Now we support two parameter in URL query string:
+      * transitive: whether to download dependent jars related to your ivy URL. It is case-sensitive and only take last one if multiple transitive parameters are specified.
+      * exclude: exclusion list when download ivy URL jar and dependent jars.
+
+    User can write ivy URL such as:
+
+      ivy://group:module:version
+      ivy://group:module:version?transitive=[true|false]
+      ivy://group:module:version?exclude=group:module,group:module

Review comment: `exclude` seems to be misleading here because `transitive` is false by default. Maybe, shall we revise like the following?

```
ivy://group:module:version
ivy://group:module:version?transitive=[true|false]
ivy://group:module:version?transitive=[true|false]&exclude=group:module,group:module
```
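To make the query-string semantics discussed above concrete, here is a minimal sketch of parsing `transitive` and `exclude` from an `ivy://` URI. This is not Spark's actual implementation; the class and method names are invented. It follows the doc text in letting the last `transitive` occurrence win, and the reviewer's note that `transitive` defaults to false:

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch (not Spark's implementation) of parsing an
// ivy://group:module:version?transitive=...&exclude=... URI.
public class IvyUriSketch {
    // java.net.URI treats "group:module:version" as a registry-based
    // authority, so getAuthority() returns the whole coordinate.
    public static String coordinate(String raw) {
        return URI.create(raw).getAuthority();
    }

    // Per the doc text, only the LAST transitive parameter wins; per the
    // reviewer's comment, transitive defaults to false when absent.
    public static boolean transitive(String raw) {
        boolean result = false;
        for (String kv : query(raw)) {
            String[] parts = kv.split("=", 2);
            if (parts.length == 2 && parts[0].equals("transitive")) {
                result = parts[1].equals("true");
            }
        }
        return result;
    }

    // exclude is a comma-separated list of group:module pairs.
    public static List<String> excludes(String raw) {
        List<String> result = new ArrayList<>();
        for (String kv : query(raw)) {
            String[] parts = kv.split("=", 2);
            if (parts.length == 2 && parts[0].equals("exclude")) {
                result = Arrays.asList(parts[1].split(","));
            }
        }
        return result;
    }

    private static String[] query(String raw) {
        String q = URI.create(raw).getQuery();
        return q == null ? new String[0] : q.split("&");
    }
}
```

For example, `transitive("ivy://a:b:1.0?transitive=false&transitive=true")` returns true (last occurrence wins), while a URI with no query string parses as non-transitive with an empty exclude list.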
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation
AmplabJenkins removed a comment on pull request #29087: URL: https://github.com/apache/spark/pull/29087#issuecomment-749987581
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30899: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
AmplabJenkins removed a comment on pull request #30899: URL: https://github.com/apache/spark/pull/30899#issuecomment-749987586 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133261/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
AmplabJenkins removed a comment on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-749987584
[GitHub] [spark] AmplabJenkins commented on pull request #30902: [SPARK-33888][SQL] Add support for TimeMillis logicalType to TimestampType
AmplabJenkins commented on pull request #30902: URL: https://github.com/apache/spark/pull/30902#issuecomment-749987729 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on pull request #29982: [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql
SparkQA commented on pull request #29982: URL: https://github.com/apache/spark/pull/29982#issuecomment-749987717 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37867/
[GitHub] [spark] AmplabJenkins commented on pull request #30901: [SPARK-33887][SQL] Allow insert overwrite same table with static partition if using dynamic partition overwrite mode
AmplabJenkins commented on pull request #30901: URL: https://github.com/apache/spark/pull/30901#issuecomment-749987753 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #30899: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
AmplabJenkins commented on pull request #30899: URL: https://github.com/apache/spark/pull/30899#issuecomment-749987586 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133261/
[GitHub] [spark] AmplabJenkins commented on pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation
AmplabJenkins commented on pull request #29087: URL: https://github.com/apache/spark/pull/29087#issuecomment-749987581
[GitHub] [spark] AmplabJenkins commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
AmplabJenkins commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-749987584
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547755940

File path: docs/sql-ref-syntax-aux-resource-mgmt-add-jar.md (@@ -33,15 +33,30 @@ ADD JAR file_name)

```
* **file_name**

-   The name of the JAR file to be added. It could be either on a local file system or a distributed file system.
+   The name of the JAR file to be added. It could be either on a local file system or a distributed file system or an ivy URL.
+   Apache Ivy is a popular dependency manager focusing on flexibility and simplicity. Now we support two parameter in URL query string:
+   * transitive: whether to download dependent jars related to your ivy URL. It is case-sensitive and only take last one if multiple transitive parameters are specified.
+   * exclude: exclusion list when download ivy URL jar and dependent jars.
```

Review comment: `when download` -> `during downloading`
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547752608

File path: core/src/main/scala/org/apache/spark/util/DependencyUtils.scala (@@ -25,12 +25,140 @@)

```scala
case class IvyProperties(
    packagesExclusions: String,
    packages: String,
    repositories: String,
    ivyRepoPath: String,
    ivySettingsPath: String)

private[spark] object DependencyUtils extends Logging {

  def getIvyProperties(): IvyProperties = {
    val Seq(packagesExclusions, packages, repositories, ivyRepoPath, ivySettingsPath) = Seq(
      "spark.jars.excludes",
      "spark.jars.packages",
      "spark.jars.repositories",
      "spark.jars.ivy",
      "spark.jars.ivySettings"
    ).map(sys.props.get(_).orNull)
    IvyProperties(packagesExclusions, packages, repositories, ivyRepoPath, ivySettingsPath)
  }

  private def isInvalidQueryString(tokens: Array[String]): Boolean = {
    tokens.length != 2 || StringUtils.isBlank(tokens(0)) || StringUtils.isBlank(tokens(1))
  }

  /**
   * Parse URI query string's parameter value of `transitive` and `exclude`.
   * Other invalid parameters will be ignored.
   *
   * @param uri Ivy URI need to be downloaded.
   * @return Tuple value of parameter `transitive` and `exclude` value.
   *
   * 1. transitive: whether to download dependency jar of Ivy URI, default value is false
   *    and this parameter value is case-sensitive. Invalid value will be treat as false.
   *    Example: Input: exclude=org.mortbay.jetty:jetty&transitive=true
   *             Output: true
   *
   * 2. exclude: comma separated exclusions to apply when resolving transitive dependencies,
   *    consists of `group:module` pairs separated by commas.
   *    Example: Input: exclude=org.mortbay.jetty:jetty,org.eclipse.jetty:jetty-http
   *             Output: [org.mortbay.jetty:jetty,org.eclipse.jetty:jetty-http]
   */
  private def parseQueryParams(uri: URI): (Boolean, String) = {
    val uriQuery = uri.getQuery
    if (uriQuery == null) {
      (false, "")
    } else {
      val mapTokens = uriQuery.split("&").map(_.split("="))
      if (mapTokens.exists(isInvalidQueryString)) {
        throw new IllegalArgumentException(
          s"Invalid query string in Ivy URI ${uri.toString}: $uriQuery")
      }
      val groupedParams = mapTokens.map(kv => (kv(0), kv(1))).groupBy(_._1)

      // Parse transitive parameters (e.g., transitive=true) in an Ivy URI, default value is false
      val transitiveParams = groupedParams.get("transitive")
      if (transitiveParams.map(_.size).getOrElse(0) > 1) {
        logWarning("It's best to specify `transitive` parameter in ivy URI query only once." +
          " If there are multiple `transitive` parameter, we will select the last one")
      }
      val transitive =
        transitiveParams.flatMap(_.takeRight(1).map(_._2 == "true").headOption).getOrElse(false)
```

Review comment: Oh, never mind. I found the test cases.

```
test("SPARK-33084: Add jar support Ivy URI -- test param case sensitive") {
test("SPARK-33084: Add jar support Ivy URI -- test transitive value case sensitive") {
```
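The last-one-wins, case-sensitive handling of `transitive` quoted in the diff above can be sketched in isolation. This is a simplified illustration of the described behavior, not the actual Spark implementation (malformed tokens are silently dropped here, whereas the real code throws `IllegalArgumentException`):

```scala
// Simplified sketch of the `transitive` query-parameter handling discussed
// in the review thread above (assumption: condensed, not Spark code).
object IvyQuerySketch {
  def parseTransitive(query: String): Boolean = {
    // Split "k1=v1&k2=v2" into (key, value) pairs; malformed tokens dropped.
    val params = query.split("&").map(_.split("=", 2)).collect {
      case Array(k, v) => (k, v)
    }
    // Last occurrence wins, and the comparison is case-sensitive:
    // only the exact string "true" enables transitive resolution.
    params.collect { case ("transitive", v) => v }.lastOption.contains("true")
  }
}
```

With this sketch, `transitive=True` and `transitive=true&transitive=false` both resolve to `false`, matching the case-sensitive, last-one-wins semantics in the scaladoc above.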
[GitHub] [spark] saikocat opened a new pull request #30902: [SPARK-33888][SQL] Add support for TimeMillis logicalType to TimestampType
saikocat opened a new pull request #30902: URL: https://github.com/apache/spark/pull/30902

### What changes were proposed in this pull request?

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
[GitHub] [spark] SparkQA commented on pull request #30864: [SPARK-33857][SQL] Unify the default seed of random functions
SparkQA commented on pull request #30864: URL: https://github.com/apache/spark/pull/30864#issuecomment-749985874 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37865/
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
AngersZhuuuu commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547752259

File path: core/src/main/scala/org/apache/spark/util/DependencyUtils.scala (the same `parseQueryParams` diff context quoted above)

```scala
val transitive =
  transitiveParams.flatMap(_.takeRight(1).map(_._2 == "true").headOption).getOrElse(false)
```

Review comment:
> This `true` looks like case sensitive. Shall we do this in a case-insensitive way?

We discussed this before: since Hive is case-sensitive and we always write the Ivy URI in lowercase, we made it case-sensitive.
[GitHub] [spark] SparkQA commented on pull request #30852: [SPARK-33847][SQL] Simplify CaseWhen if elseValue is None
SparkQA commented on pull request #30852: URL: https://github.com/apache/spark/pull/30852#issuecomment-749985416 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37866/
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547750850

File path: core/src/main/scala/org/apache/spark/util/DependencyUtils.scala (the same `parseQueryParams` diff context quoted above)

```scala
val transitive =
  transitiveParams.flatMap(_.takeRight(1).map(_._2 == "true").headOption).getOrElse(false)
```

Review comment: This `true` looks like case sensitive. Shall we do this in a case-insensitive way?
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
AngersZhuuuu commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547749884

File path: core/src/test/scala/org/apache/spark/SparkContextSuite.scala

```scala
test("SPARK-33084: add jar when path contains comma") {
```

Review comment:
> Do we really need to support this `comma`?

This is a UT added earlier; the current code doesn't seem to need it any more. Should I just remove it?
[GitHub] [spark] SparkQA commented on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
SparkQA commented on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-749984196 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37864/
[GitHub] [spark] SparkQA commented on pull request #30900: [SPARK-33886][SQL] UnresolvedTable should retain SQL text position for DDL commands
SparkQA commented on pull request #30900: URL: https://github.com/apache/spark/pull/30900#issuecomment-749984098 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37863/
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547747201

File path: core/src/test/scala/org/apache/spark/SparkContextSuite.scala

```scala
test("SPARK-33084: add jar when path contains comma") {
```

Review comment: Do we really need to support this `comma`?
[GitHub] [spark] beliefer commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
beliefer commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-749983136 retest this please
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547745481

File path: core/src/test/scala/org/apache/spark/SparkContextSuite.scala

```scala
test("SPARK-33084: add jar when path contains comma") {
```

Review comment: For a negative test case, could you be more specific? Otherwise, this test case name can be misleading.

```scala
- test("SPARK-33084: add jar when path contains comma") {
+ test("SPARK-33084: add jar should fail when path contains comma") {
```
[GitHub] [spark] yaooqinn commented on a change in pull request #30888: [SPARK-33877][SQL] SQL reference documents for INSERT w/ a column list
yaooqinn commented on a change in pull request #30888: URL: https://github.com/apache/spark/pull/30888#discussion_r547743509

File path: docs/sql-ref-syntax-dml-insert-into.md (@@ -40,11 +40,20 @@)

```
* **partition_spec**

    An optional parameter that specifies a comma-separated list of key and value pairs
    for partitions.

    **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )`

+ * **column_list**
+
+   An optional parameter that specifies a comma-separated list of columns belonging to the `table_identifier` table.
+
+   **Note:** The current behaviour has some limitations:
+   - All specified columns should exist in the table and not be duplicated from each other. It includes all columns except the static partition columns.
+   - The size of the column list should be exactly the size of the data from `VALUES` clause or query.
+   - The order of the column list is alterable and determines how the data from `VALUES` clause or query to be inserted by position.
```

Review comment: how about removing: `The current behavior has some limitations:`
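The positional semantics the quoted doc note describes (every listed column must exist, no duplicates, list size equals row size, and list order decides which value lands in which column) can be illustrated with a small sketch. `reorderByColumnList` is a hypothetical helper written for this digest, not code from the PR:

```scala
// Hypothetical helper (not from the Spark codebase) showing the positional
// mapping: each VALUES entry is matched to the column at the same position
// in the column list, then the row is emitted in the table's column order.
def reorderByColumnList(
    tableCols: Seq[String],
    columnList: Seq[String],
    row: Seq[Any]): Seq[Any] = {
  require(columnList.distinct.size == columnList.size, "duplicate column in column list")
  require(columnList.forall(tableCols.contains), "unknown column in column list")
  require(columnList.size == row.size, "column list size must match row size")
  val valueByColumn = columnList.zip(row).toMap
  tableCols.map(valueByColumn) // assumes the list covers every table column
}
```

For example, with table columns `(a, b)`, the column list `(b, a)`, and the row `(2, 1)`, the value `1` is inserted into column `a` and `2` into column `b`.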
[GitHub] [spark] SparkQA commented on pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation
SparkQA commented on pull request #29087: URL: https://github.com/apache/spark/pull/29087#issuecomment-74998 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37862/
[GitHub] [spark] SparkQA removed a comment on pull request #30899: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
SparkQA removed a comment on pull request #30899: URL: https://github.com/apache/spark/pull/30899#issuecomment-749928251 **[Test build #133261 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133261/testReport)** for PR 30899 at commit [`03c7aac`](https://github.com/apache/spark/commit/03c7aac25282f2d75ed9bd4b82f45babc308f2d7).
[GitHub] [spark] yaooqinn commented on pull request #30887: [SPARK-33879][SQL] Char Varchar values fails w/ match error as partition columns
yaooqinn commented on pull request #30887: URL: https://github.com/apache/spark/pull/30887#issuecomment-749980684 thanks for merging and review
[GitHub] [spark] SparkQA commented on pull request #30899: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
SparkQA commented on pull request #30899: URL: https://github.com/apache/spark/pull/30899#issuecomment-749980617

**[Test build #133261 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133261/testReport)** for PR 30899 at commit [`03c7aac`](https://github.com/apache/spark/commit/03c7aac25282f2d75ed9bd4b82f45babc308f2d7).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class ContextAwareIterator[IN](iter: Iterator[IN], context: TaskContext) extends Iterator[IN]`
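The `ContextAwareIterator` named in the test output wraps an iterator so that consumption stops once the owning task has ended. As a rough illustration of the idea only (a hypothetical sketch where a plain boolean flag stands in for `TaskContext`; this is not Spark's implementation):

```scala
// Hypothetical sketch: a boolean flag stands in for the task-completion
// state that Spark's ContextAwareIterator reads from TaskContext.
class StopOnDoneIterator[A](underlying: Iterator[A], isDone: () => Boolean)
    extends Iterator[A] {
  // Report no more elements as soon as the "task" is done, so a consumer
  // running after task completion stops pulling from the source iterator.
  override def hasNext: Boolean = !isDone() && underlying.hasNext
  override def next(): A = underlying.next()
}
```

Once the flag flips, `hasNext` returns `false` even if the underlying iterator still has elements, which is the stop-consuming behavior the PR title describes.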
[GitHub] [spark] HyukjinKwon closed pull request #30887: [SPARK-33879][SQL] Char Varchar values fails w/ match error as partition columns
HyukjinKwon closed pull request #30887: URL: https://github.com/apache/spark/pull/30887
[GitHub] [spark] HyukjinKwon commented on pull request #30887: [SPARK-33879][SQL] Char Varchar values fails w/ match error as partition columns
HyukjinKwon commented on pull request #30887: URL: https://github.com/apache/spark/pull/30887#issuecomment-749980442 Merged to master and branch-3.1.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547739630

File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala (@@ -3771,6 +3792,40 @@)

```scala
test("SPARK-33084: Add jar support Ivy URI in SQL -- jar contains udf class") {
  val sumFuncClass = "org.apache.spark.examples.sql.Spark33084"
  val functionName = "test_udf"
  withTempDir { dir => {
```

Review comment: Do we need this ending `{`? I guess we can remove this.

```scala
- withTempDir { dir => {
+ withTempDir { dir =>
```
[GitHub] [spark] SparkQA removed a comment on pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation
SparkQA removed a comment on pull request #29087: URL: https://github.com/apache/spark/pull/29087#issuecomment-749968936 **[Test build #133269 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133269/testReport)** for PR 29087 at commit [`1a4262b`](https://github.com/apache/spark/commit/1a4262b40d711aff02e457661d755d4d84beb9a3).
[GitHub] [spark] turboFei commented on pull request #29982: [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql
turboFei commented on pull request #29982: URL: https://github.com/apache/spark/pull/29982#issuecomment-749978769

> **[Test build #133268 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133268/testReport)** for PR 29982 at commit [`5a752d4`](https://github.com/apache/spark/commit/5a752d4a988a927323f70eab0b2d3e6b4720a3f5).
> * This patch **fails to build**.
> * This patch merges cleanly.
> * This patch adds no public classes.

Jenkins error: `[error] java.io.IOException: No space left on device`
[GitHub] [spark] SparkQA commented on pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation
SparkQA commented on pull request #29087: URL: https://github.com/apache/spark/pull/29087#issuecomment-749978475 **[Test build #133269 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133269/testReport)** for PR 29087 at commit [`1a4262b`](https://github.com/apache/spark/commit/1a4262b40d711aff02e457661d755d4d84beb9a3). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
AngersZh commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547735935 ## File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ## @@ -304,8 +304,8 @@ private[spark] class SparkSubmit extends Logging { // Resolve maven dependencies if there are any and add classpath to jars. Add them to py-files // too for packages that include Python code val resolvedMavenCoordinates = DependencyUtils.resolveMavenDependencies( -args.packagesExclusions, args.packages, args.repositories, args.ivyRepoPath, -args.ivySettingsPath) +true, args.packagesExclusions, args.packages, Review comment: > nit. `true,` -> `transitive = true,`. Done
[GitHub] [spark] wangshuo128 edited a comment on pull request #30901: [SPARK-33887][SQL] Allow insert overwrite same table with static partition if using dynamic partition overwrite mode
wangshuo128 edited a comment on pull request #30901: URL: https://github.com/apache/spark/pull/30901#issuecomment-749975050 AFAIK, when `spark.sql.sources.partitionOverwriteMode` is set to `DYNAMIC`, Spark will replace partitions at runtime instead of deleting the relevant partition paths ahead of time. So we can use `DYNAMIC` overwrite mode when overwriting a static partition. Gentle ping @cloud-fan @viirya, would you help to review this? Thanks.
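The runtime-replacement behavior described above can be sketched in a few lines. This is a hypothetical model, not Spark's implementation: `insert_overwrite` and its dict-of-partitions "table" are invented for illustration, to show why DYNAMIC mode tolerates reading from the partition being overwritten while a static overwrite does not.

```python
# Hypothetical model of partition-overwrite semantics (not Spark's code).
# A "table" is a dict mapping a partition spec to its rows.

def insert_overwrite(table, partition, compute_rows, mode):
    """Overwrite one partition with rows produced by compute_rows(table)."""
    if mode == "STATIC":
        # Static overwrite clears the target partition path before the job
        # runs, so a query that also reads that partition sees nothing.
        table.pop(partition, None)
        rows = compute_rows(table)
    else:  # DYNAMIC
        # Dynamic overwrite computes the result first and swaps the
        # partition in afterwards; the source stays readable meanwhile.
        rows = compute_rows(table)
    table[partition] = rows
    return table
```

Under this model, `INSERT OVERWRITE ... PARTITION(part=1) SELECT i + 1 FROM t WHERE part = 1` yields the incremented rows in DYNAMIC mode but an empty partition in STATIC mode, matching the motivation for the PR.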
[GitHub] [spark] SparkQA removed a comment on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
SparkQA removed a comment on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-749973781 **[Test build #133272 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133272/testReport)** for PR 30387 at commit [`4101603`](https://github.com/apache/spark/commit/4101603d92ea1ffe92e4205777c7fdb8492c2969).
[GitHub] [spark] SparkQA commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
SparkQA commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-749976551 **[Test build #133272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133272/testReport)** for PR 30387 at commit [`4101603`](https://github.com/apache/spark/commit/4101603d92ea1ffe92e4205777c7fdb8492c2969). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547732346 ## File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ## @@ -304,8 +304,8 @@ private[spark] class SparkSubmit extends Logging { // Resolve maven dependencies if there are any and add classpath to jars. Add them to py-files // too for packages that include Python code val resolvedMavenCoordinates = DependencyUtils.resolveMavenDependencies( -args.packagesExclusions, args.packages, args.repositories, args.ivyRepoPath, -args.ivySettingsPath) +true, args.packagesExclusions, args.packages, Review comment: nit. `true,` -> `transitive = true,`.
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
AngersZh commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547732189 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -1959,47 +1959,58 @@ class SparkContext(config: SparkConf) extends Logging { throw new IllegalArgumentException( s"Directory ${path} is not allowed for addJar") } - path + Seq(path) } catch { case NonFatal(e) => logError(s"Failed to add $path to Spark environment", e) -null +Nil } } else { -path +Seq(path) } } if (path == null || path.isEmpty) { logWarning("null or empty path specified as parameter to addJar") } else { - val key = if (path.contains("\\") && Utils.isWindows) { + val (keys, schema) = if (path.contains("\\") && Utils.isWindows) { Review comment: > If we want to use `plural`, it's `schemes`. Here scheme is singular.
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
AngersZh commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547731808 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -1959,47 +1959,58 @@ class SparkContext(config: SparkConf) extends Logging { throw new IllegalArgumentException( s"Directory ${path} is not allowed for addJar") } - path + Seq(path) } catch { case NonFatal(e) => logError(s"Failed to add $path to Spark environment", e) -null +Nil } } else { -path +Seq(path) } } if (path == null || path.isEmpty) { logWarning("null or empty path specified as parameter to addJar") } else { - val key = if (path.contains("\\") && Utils.isWindows) { + val (keys, schema) = if (path.contains("\\") && Utils.isWindows) { // For local paths with backslashes on Windows, URI throws an exception -addLocalJarFile(new File(path)) +(addLocalJarFile(new File(path)), "local") } else { val uri = new Path(path).toUri // SPARK-17650: Make sure this is a valid URL before adding it to the list of dependencies Utils.validateURL(uri) -uri.getScheme match { +val uriSchema = uri.getScheme Review comment: Done ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -1959,47 +1959,58 @@ class SparkContext(config: SparkConf) extends Logging { throw new IllegalArgumentException( s"Directory ${path} is not allowed for addJar") } - path + Seq(path) } catch { case NonFatal(e) => logError(s"Failed to add $path to Spark environment", e) -null +Nil } } else { -path +Seq(path) } } if (path == null || path.isEmpty) { logWarning("null or empty path specified as parameter to addJar") } else { - val key = if (path.contains("\\") && Utils.isWindows) { + val (keys, schema) = if (path.contains("\\") && Utils.isWindows) { Review comment: > `schema` -> `scheme` because `scheme` is different from `schema`. Very embarrassed, updated
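The scheme-based dispatch being reviewed in the `addJar` changes can be illustrated with a small sketch. `classify_jar_path` is a hypothetical helper, not a Spark API; it stands in for the `uri.getScheme` match that decides whether a jar path is a local file, an Ivy coordinate, or a remote URI.

```python
# Hedged sketch of scheme-based jar-path dispatch; classify_jar_path is a
# hypothetical helper invented for illustration, not Spark's code.
from urllib.parse import urlparse

def classify_jar_path(path: str) -> str:
    scheme = urlparse(path).scheme
    if scheme in ("", "file", "local"):
        return "local"   # plain paths and file: URIs are local jars
    if scheme == "ivy":
        return "ivy"     # e.g. ivy://org.apache.spark:spark-core_2.12:3.1.0
    return scheme        # remote schemes such as http, https, hdfs
```

Note "scheme", not "schema": RFC 3986 calls the part before `://` the URI *scheme*, which is the naming nit raised in the review.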
[GitHub] [spark] cloud-fan commented on a change in pull request #30893: [SPARK-33881][SQL][TESTS] Check null and empty string as partition values in DS v1 and v2 tests
cloud-fan commented on a change in pull request #30893: URL: https://github.com/apache/spark/pull/30893#discussion_r547730493 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/command/ShowPartitionsSuite.scala ## @@ -17,6 +17,31 @@ package org.apache.spark.sql.hive.execution.command +import org.apache.spark.sql.{Row, SaveMode} import org.apache.spark.sql.execution.command.v1 -class ShowPartitionsSuite extends v1.ShowPartitionsSuiteBase with CommandSuiteBase +class ShowPartitionsSuite extends v1.ShowPartitionsSuiteBase with CommandSuiteBase { + test("null and empty string as partition values") { Review comment: shall we add the same test for v2?
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547730366 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -1959,47 +1959,58 @@ class SparkContext(config: SparkConf) extends Logging { throw new IllegalArgumentException( s"Directory ${path} is not allowed for addJar") } - path + Seq(path) } catch { case NonFatal(e) => logError(s"Failed to add $path to Spark environment", e) -null +Nil } } else { -path +Seq(path) } } if (path == null || path.isEmpty) { logWarning("null or empty path specified as parameter to addJar") } else { - val key = if (path.contains("\\") && Utils.isWindows) { + val (keys, schema) = if (path.contains("\\") && Utils.isWindows) { Review comment: If we want to use `plural`, it's `schemes`.
[GitHub] [spark] wangshuo128 commented on pull request #30901: [SPARK-33887][SQL] Allow insert overwrite same table with static partition if using dynamic partition overwrite mode
wangshuo128 commented on pull request #30901: URL: https://github.com/apache/spark/pull/30901#issuecomment-749975050 AFAIK, when `spark.sql.sources.partitionOverwriteMode` is set to `DYNAMIC`, Spark will replace partitions at runtime instead of deleting the relevant partition paths ahead of time. So we can use `DYNAMIC` overwrite mode when overwriting a static partition. Gentle ping @cloud-fan @viirya, would you help to review this?
[GitHub] [spark] cloud-fan commented on a change in pull request #30893: [SPARK-33881][SQL][TESTS] Check null and empty string as partition values in DS v1 and v2 tests
cloud-fan commented on a change in pull request #30893: URL: https://github.com/apache/spark/pull/30893#discussion_r547729365

File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/command/ShowPartitionsSuite.scala

```scala
test("null and empty string as partition values") {
  import testImplicits._
  withSQLConf("hive.exec.dynamic.partition.mode" -> "nonstrict") {
    withTable("t") {
      val df = Seq(
        (0, ""),
        (1, null)
      ).toDF("a", "part")
      df.write
        .partitionBy("part")
        .format("hive")
        .mode(SaveMode.Overwrite)
        .saveAsTable("t")

      runShowPartitionsSql(
        "SHOW PARTITIONS t",
        Row("part=__HIVE_DEFAULT_PARTITION__") :: Nil)
      checkAnswer(spark.table("t"),
        Row(0, "__HIVE_DEFAULT_PARTITION__") ::
        Row(1, "__HIVE_DEFAULT_PARTITION__") :: Nil)
```

Review comment: this is inconsistent with the in-memory catalog. Let's fix this later.
[GitHub] [spark] cloud-fan commented on a change in pull request #30893: [SPARK-33881][SQL][TESTS] Check null and empty string as partition values in DS v1 and v2 tests
cloud-fan commented on a change in pull request #30893: URL: https://github.com/apache/spark/pull/30893#discussion_r547728220

File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowPartitionsSuite.scala

```scala
test("null and empty string as partition values") {
  import testImplicits._
  withTable("t") {
    val df = Seq(
      (0, ""),
      (1, null)
    ).toDF("a", "part")
    df.write
      .partitionBy("part")
      .format("parquet")
      .mode(SaveMode.Overwrite)
      .saveAsTable("t")

    runShowPartitionsSql(
      "SHOW PARTITIONS t",
      Row("part=__HIVE_DEFAULT_PARTITION__") :: Nil)
```

Review comment: ah, if the behavior is the same as Hive, we may need to keep it.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547727882 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -1959,47 +1959,58 @@ class SparkContext(config: SparkConf) extends Logging { throw new IllegalArgumentException( s"Directory ${path} is not allowed for addJar") } - path + Seq(path) } catch { case NonFatal(e) => logError(s"Failed to add $path to Spark environment", e) -null +Nil } } else { -path +Seq(path) } } if (path == null || path.isEmpty) { logWarning("null or empty path specified as parameter to addJar") } else { - val key = if (path.contains("\\") && Utils.isWindows) { + val (keys, schema) = if (path.contains("\\") && Utils.isWindows) { // For local paths with backslashes on Windows, URI throws an exception -addLocalJarFile(new File(path)) +(addLocalJarFile(new File(path)), "local") } else { val uri = new Path(path).toUri // SPARK-17650: Make sure this is a valid URL before adding it to the list of dependencies Utils.validateURL(uri) -uri.getScheme match { +val uriSchema = uri.getScheme Review comment: `uriSchema` -> `uriScheme`
[GitHub] [spark] cloud-fan commented on a change in pull request #30893: [SPARK-33881][SQL][TESTS] Check null and empty string as partition values in DS v1 and v2 tests
cloud-fan commented on a change in pull request #30893: URL: https://github.com/apache/spark/pull/30893#discussion_r547727845

File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowPartitionsSuite.scala

```scala
test("null and empty string as partition values") {
  import testImplicits._
  withTable("t") {
    val df = Seq(
      (0, ""),
      (1, null)
    ).toDF("a", "part")
    df.write
      .partitionBy("part")
      .format("parquet")
      .mode(SaveMode.Overwrite)
      .saveAsTable("t")

    runShowPartitionsSql(
      "SHOW PARTITIONS t",
      Row("part=__HIVE_DEFAULT_PARTITION__") :: Nil)
```

Review comment: This is a bug... We shouldn't expose the internal stuff. We should return null as `df.show` did. Let's fix it later.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547727612 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -1959,47 +1959,58 @@ class SparkContext(config: SparkConf) extends Logging { throw new IllegalArgumentException( s"Directory ${path} is not allowed for addJar") } - path + Seq(path) } catch { case NonFatal(e) => logError(s"Failed to add $path to Spark environment", e) -null +Nil } } else { -path +Seq(path) } } if (path == null || path.isEmpty) { logWarning("null or empty path specified as parameter to addJar") } else { - val key = if (path.contains("\\") && Utils.isWindows) { + val (keys, schema) = if (path.contains("\\") && Utils.isWindows) { Review comment: `schema` -> `scheme` because `scheme` is different from `schema`.
[GitHub] [spark] HyukjinKwon commented on pull request #30887: [SPARK-33879][SQL] Char Varchar values fails w/ match error as partition columns
HyukjinKwon commented on pull request #30887: URL: https://github.com/apache/spark/pull/30887#issuecomment-749974030

```
org.scalatest.exceptions.TestFailedException: transform.sql
Expected "...h status 127. Error:[ /bin/bash: some_non_existent_command: command not found]", but got "...h status 127. Error:[]"
Result did not match for query #2
SELECT TRANSFORM(a) USING 'some_non_existent_command' AS (a)
FROM t
```

is being fixed in https://github.com/apache/spark/pull/30896#issuecomment-749954219
[GitHub] [spark] SparkQA commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
SparkQA commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-749973781 **[Test build #133272 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133272/testReport)** for PR 30387 at commit [`4101603`](https://github.com/apache/spark/commit/4101603d92ea1ffe92e4205777c7fdb8492c2969).
[GitHub] [spark] cloud-fan commented on a change in pull request #30893: [SPARK-33881][SQL][TESTS] Check null and empty string as partition values in DS v1 and v2 tests
cloud-fan commented on a change in pull request #30893: URL: https://github.com/apache/spark/pull/30893#discussion_r547726487 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowPartitionsSuite.scala ## @@ -93,4 +93,26 @@ class ShowPartitionsSuite extends ShowPartitionsSuiteBase with CommandSuiteBase assert(sql("SHOW PARTITIONS part_datasrc").count() == 3) } } + + test("null and empty string as partition values") { +import testImplicits._ +withTable("t") { + val df = Seq( +(0, ""), +(1, null) + ).toDF("a", "part") Review comment: +1
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
dongjoon-hyun commented on a change in pull request #29966: URL: https://github.com/apache/spark/pull/29966#discussion_r547726508 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -1959,47 +1959,58 @@ class SparkContext(config: SparkConf) extends Logging { throw new IllegalArgumentException( s"Directory ${path} is not allowed for addJar") } - path + Seq(path) } catch { case NonFatal(e) => logError(s"Failed to add $path to Spark environment", e) -null +Nil } } else { -path +Seq(path) } } if (path == null || path.isEmpty) { logWarning("null or empty path specified as parameter to addJar") } else { - val key = if (path.contains("\\") && Utils.isWindows) { + val (keys, schema) = if (path.contains("\\") && Utils.isWindows) { Review comment: `schema` -> `scheme`
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29982: [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql
AmplabJenkins removed a comment on pull request #29982: URL: https://github.com/apache/spark/pull/29982#issuecomment-749971157 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133268/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30864: [SPARK-33857][SQL] Unify the default seed of random functions
AmplabJenkins removed a comment on pull request #30864: URL: https://github.com/apache/spark/pull/30864#issuecomment-749971573 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133266/
[GitHub] [spark] SparkQA removed a comment on pull request #29982: [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql
SparkQA removed a comment on pull request #29982: URL: https://github.com/apache/spark/pull/29982#issuecomment-749968433 **[Test build #133268 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133268/testReport)** for PR 29982 at commit [`5a752d4`](https://github.com/apache/spark/commit/5a752d4a988a927323f70eab0b2d3e6b4720a3f5).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation
AmplabJenkins removed a comment on pull request #29087: URL: https://github.com/apache/spark/pull/29087#issuecomment-749971532 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37868/
[GitHub] [spark] SparkQA removed a comment on pull request #30864: [SPARK-33857][SQL] Unify the default seed of random functions
SparkQA removed a comment on pull request #30864: URL: https://github.com/apache/spark/pull/30864#issuecomment-749967997 **[Test build #133266 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133266/testReport)** for PR 30864 at commit [`8d39f5a`](https://github.com/apache/spark/commit/8d39f5a264ef8a6bebd1d62b3ff6ae082b6deee7).
[GitHub] [spark] wangshuo128 opened a new pull request #30901: [SPARK-33887][SQL] Allow insert overwrite same table with static partition if using dynamic partition overwrite mode
wangshuo128 opened a new pull request #30901: URL: https://github.com/apache/spark/pull/30901

### What changes were proposed in this pull request?

```sql
CREATE TABLE insertTable(i int, part int) USING PARQUET PARTITIONED BY (part);
INSERT INTO TABLE insertTable PARTITION(part=1) SELECT 1;
SET spark.sql.sources.partitionOverwriteMode=DYNAMIC;
INSERT OVERWRITE TABLE insertTable PARTITION(part=1) SELECT i + 1 FROM insertTable WHERE part = 1;
```

The last statement fails because Spark complains "Cannot overwrite a path that is also being read from."

### Why are the changes needed?

Allow users to insert overwrite the same table with a static partition if `spark.sql.sources.partitionOverwriteMode` is set to `DYNAMIC`.

### Does this PR introduce _any_ user-facing change?

Yes.

### How was this patch tested?

New UT.
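The intent of the change can be sketched as a toy decision function (a hypothetical model, not Spark's actual code): under `STATIC` overwrite mode the target partition path is truncated up front, so a query that also reads from the target table must be rejected; under `DYNAMIC` mode only the partitions produced by the write are replaced at commit time, so the self-read can be allowed.

```python
# Hypothetical sketch (not Spark's real conflict check) of the rule this PR
# relaxes: an INSERT OVERWRITE with a static partition spec that reads from
# its own target table is only unsafe when the overwrite mode is STATIC.

def overwrite_conflicts(read_tables, target_table, overwrite_mode):
    """Return True if the insert-overwrite must be rejected."""
    if target_table not in read_tables:
        # Not a self-read: never a conflict.
        return False
    # STATIC mode truncates the target partition before the read finishes;
    # DYNAMIC mode replaces partitions only at commit, after the read.
    return overwrite_mode == "STATIC"

# The scenario from the PR description:
print(overwrite_conflicts({"insertTable"}, "insertTable", "STATIC"))   # → True (rejected)
print(overwrite_conflicts({"insertTable"}, "insertTable", "DYNAMIC"))  # → False (allowed)
```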
[GitHub] [spark] cloud-fan commented on a change in pull request #30852: [SPARK-33847][SQL] Simplify CaseWhen if elseValue is None
cloud-fan commented on a change in pull request #30852: URL: https://github.com/apache/spark/pull/30852#discussion_r547723985

File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SimplifyConditionalSuite.scala

```diff
@@ -215,4 +215,24 @@ class SimplifyConditionalSuite extends PlanTest with ExpressionEvalHelper with P
       If(GreaterThan(Rand(0), UnresolvedAttribute("a")), FalseLiteral, TrueLiteral),
       LessThanOrEqual(Rand(0), UnresolvedAttribute("a")))
   }
+
+  test("SPARK-33847: Remove the CaseWhen if elseValue is empty and other outputs are null") {
+    assertEquivalent(
+      CaseWhen((GreaterThan('a, 1), Literal.create(null, IntegerType)) :: Nil,
```

Review comment: ditto, `Seq(GreaterThan('a, 1), GreaterThan(Rand(0), 1)).foreach...`
[GitHub] [spark] cloud-fan commented on a change in pull request #30852: [SPARK-33847][SQL] Simplify CaseWhen if elseValue is None
cloud-fan commented on a change in pull request #30852: URL: https://github.com/apache/spark/pull/30852#discussion_r547722720

File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicateSuite.scala

```diff
@@ -380,6 +382,52 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
     testProjection(originalExpr = column, expectedExpr = column)
   }
+
+  test("replace None of elseValue inside CaseWhen if all branches are FalseLiteral") {
+    val allFalseBranches = Seq(
+      (UnresolvedAttribute("i") < Literal(10)) -> FalseLiteral,
+      (UnresolvedAttribute("i") > Literal(40)) -> FalseLiteral)
+    val allFalseCond = CaseWhen(allFalseBranches)
+
+    val nonAllFalseBranches = Seq(
+      (UnresolvedAttribute("i") < Literal(10)) -> FalseLiteral,
+      (UnresolvedAttribute("i") > Literal(40)) -> TrueLiteral)
+    val nonAllFalseCond = CaseWhen(nonAllFalseBranches, FalseLiteral)
+
+    testFilter(allFalseCond, FalseLiteral)
+    testJoin(allFalseCond, FalseLiteral)
+    testDelete(allFalseCond, FalseLiteral)
+    testUpdate(allFalseCond, FalseLiteral)
+
+    testFilter(nonAllFalseCond, nonAllFalseCond)
+    testJoin(nonAllFalseCond, nonAllFalseCond)
+    testDelete(nonAllFalseCond, nonAllFalseCond)
+    testUpdate(nonAllFalseCond, nonAllFalseCond)
+  }
+
+  test("replace None of elseValue inside CaseWhen if all branches are null") {
+    val allFalseBranches = Seq(
```

Review comment: `allNullBranches`
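The simplification these tests exercise can be sketched as a toy model (hypothetical names, not Catalyst's real classes): in a predicate position (Filter, Join, Delete, Update), a `CaseWhen` with no `elseValue` yields null when no branch matches, and null behaves as false there; so if every branch value is `FalseLiteral` too, the whole expression collapses to `FalseLiteral`.

```python
# Toy model (assumption: simplified stand-in for Catalyst expressions) of
# collapsing an else-less CaseWhen in a boolean/predicate context.

FALSE = ("lit", False)
TRUE = ("lit", True)

def simplify_case_when_in_predicate(branches, else_value=None):
    # In predicate position, a missing elseValue acts as null, i.e. false.
    effective_else = FALSE if else_value is None else else_value
    if effective_else == FALSE and all(v == FALSE for _, v in branches):
        return FALSE  # no branch can ever produce true
    return ("case_when", tuple(branches), else_value)

all_false = [(("lt", "i", 10), FALSE), (("gt", "i", 40), FALSE)]
non_all_false = [(("lt", "i", 10), FALSE), (("gt", "i", 40), TRUE)]

print(simplify_case_when_in_predicate(all_false) == FALSE)      # → True (collapsed)
print(simplify_case_when_in_predicate(non_all_false) == FALSE)  # → False (kept as-is)
```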
[GitHub] [spark] AmplabJenkins commented on pull request #30864: [SPARK-33857][SQL] Unify the default seed of random functions
AmplabJenkins commented on pull request #30864: URL: https://github.com/apache/spark/pull/30864#issuecomment-749971573 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133266/
[GitHub] [spark] cloud-fan commented on a change in pull request #30852: [SPARK-33847][SQL] Simplify CaseWhen if elseValue is None
cloud-fan commented on a change in pull request #30852: URL: https://github.com/apache/spark/pull/30852#discussion_r547721686

File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicateSuite.scala

```diff
@@ -222,10 +223,14 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
         Literal(null, IntegerType),
         Literal(3)),
       FalseLiteral)
-    testFilter(originalCond = condition, expectedCond = condition)
-    testJoin(originalCond = condition, expectedCond = condition)
-    testDelete(originalCond = condition, expectedCond = condition)
-    testUpdate(originalCond = condition, expectedCond = condition)
+    val expectedCond = If(
```

Review comment: or we change this test to preserve the original purpose.
[GitHub] [spark] SparkQA commented on pull request #30864: [SPARK-33857][SQL] Unify the default seed of random functions
SparkQA commented on pull request #30864: URL: https://github.com/apache/spark/pull/30864#issuecomment-749971548 **[Test build #133266 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133266/testReport)** for PR 30864 at commit [`8d39f5a`](https://github.com/apache/spark/commit/8d39f5a264ef8a6bebd1d62b3ff6ae082b6deee7).
* This patch **fails MiMa tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation
AmplabJenkins commented on pull request #29087: URL: https://github.com/apache/spark/pull/29087#issuecomment-749971532 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37868/
[GitHub] [spark] AmplabJenkins commented on pull request #29982: [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql
AmplabJenkins commented on pull request #29982: URL: https://github.com/apache/spark/pull/29982#issuecomment-749971157 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133268/
[GitHub] [spark] SparkQA commented on pull request #29982: [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql
SparkQA commented on pull request #29982: URL: https://github.com/apache/spark/pull/29982#issuecomment-749971136 **[Test build #133268 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133268/testReport)** for PR 29982 at commit [`5a752d4`](https://github.com/apache/spark/commit/5a752d4a988a927323f70eab0b2d3e6b4720a3f5).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
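The behavior named in the PR title for #29982 — ignoring a semicolon inside a bracketed comment when spark-sql splits a script into statements — can be illustrated with a small sketch (an assumption-laden toy, not spark-sql's actual splitter; it also tracks nesting depth, since Spark SQL accepts nested bracketed comments):

```python
# Illustrative statement splitter: ';' ends a statement only when we are not
# inside a /* ... */ bracketed comment (depth 0).

def split_statements(script):
    statements, current, depth, i = [], [], 0, 0
    while i < len(script):
        two = script[i:i + 2]
        if two == "/*":
            depth += 1            # enter (possibly nested) bracketed comment
            current.append(two)
            i += 2
        elif two == "*/" and depth > 0:
            depth -= 1            # leave one nesting level
            current.append(two)
            i += 2
        elif script[i] == ";" and depth == 0:
            statements.append("".join(current).strip())
            current = []
            i += 1
        else:
            current.append(script[i])
            i += 1
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements

print(split_statements("SELECT 1 /* a ; b */; SELECT 2"))
# → ['SELECT 1 /* a ; b */', 'SELECT 2']
```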
[GitHub] [spark] cloud-fan commented on a change in pull request #30852: [SPARK-33847][SQL] Simplify CaseWhen if elseValue is None
cloud-fan commented on a change in pull request #30852: URL: https://github.com/apache/spark/pull/30852#discussion_r547719655

File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PushFoldableIntoBranchesSuite.scala

```diff
@@ -258,4 +258,22 @@ class PushFoldableIntoBranchesSuite
       EqualTo(CaseWhen(Seq((a, Literal(1)), (c, Literal(2))), None).cast(StringType), Literal("4")),
       CaseWhen(Seq((a, FalseLiteral), (c, FalseLiteral)), None))
   }
+
+  test("SPARK-33847: Remove the CaseWhen if elseValue is empty and other outputs are null") {
+    assertEquivalent(
```

Review comment: nit: to make the test more readable

```scala
Seq(a, LessThan(Rand(1), Literal(0.5))).foreach { condition =>
  assertEquivalent(EqualTo(CaseWhen(Seq((condition, ...
  assertEquivalent( // with cast
  ...
}
```
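The rewrite behind this test suite can be sketched as a toy helper (a hypothetical model, not Catalyst's `PushFoldableIntoBranches`): comparing a `CaseWhen` against a foldable literal pushes the comparison into every branch value, e.g. `EqualTo(CaseWhen(Seq((a, 1), (c, 2)), None), 4)` becomes `CaseWhen(Seq((a, false), (c, false)), None)`, which downstream rules can simplify further.

```python
# Toy sketch: fold EqualTo(CaseWhen(branches, else), literal) by applying
# the comparison to each branch value (and to the else value, if present).

def push_equal_to_into_branches(branches, else_value, literal):
    new_branches = [(cond, value == literal) for cond, value in branches]
    new_else = None if else_value is None else (else_value == literal)
    return new_branches, new_else

branches, else_value = push_equal_to_into_branches([("a", 1), ("c", 2)], None, 4)
print(branches)    # → [('a', False), ('c', False)]
print(else_value)  # → None
```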
[GitHub] [spark] SparkQA commented on pull request #30864: [SPARK-33857][SQL] Unify the default seed of random functions
SparkQA commented on pull request #30864: URL: https://github.com/apache/spark/pull/30864#issuecomment-749970193 **[Test build #133271 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133271/testReport)** for PR 30864 at commit [`edf0b03`](https://github.com/apache/spark/commit/edf0b03bc59c7aa9d2838391ff197d51a649d6cf).