[GitHub] [spark] gengliangwang commented on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP] Increase it in build/mvn script
gengliangwang commented on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872738426 @dongjoon-hyun @LuciferYang Awesome, hopefully the issue is resolved this time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] xuanyuanking commented on a change in pull request #32933: [SPARK-35785][SS] Cleanup support for RocksDB instance
xuanyuanking commented on a change in pull request #32933: URL: https://github.com/apache/spark/pull/32933#discussion_r662755860
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/RocksDBSuite.scala ##
@@ -207,6 +273,133 @@ class RocksDBSuite extends SparkFunSuite {
     }
   }
+  test("disallow concurrent updates to the same RocksDB instance") {
Review comment: Ah yea, this is the test for rollback. Actually, the original plan was to expose `rollback` and `cleanup` in this PR. It was a mistake in the last PR: I introduced `rollback` without tests.
[GitHub] [spark] LuciferYang commented on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP] Increase it in build/mvn script
LuciferYang commented on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872737709 Yes, the `catalyst` module often has this problem
[GitHub] [spark] dongjoon-hyun edited a comment on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1
dongjoon-hyun edited a comment on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-872736822 Oh, if you are using ORC, please try to bring SPARK-35783. It's irrelevant to this Hadoop topic, but it helps you reduce the traffic. - https://github.com/apache/spark/pull/32923
[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-872736822 Oh, if you are using ORC, please try to bring SPARK-35783. It's irrelevant to Hadoop, but it helps you reduce the traffic. - https://github.com/apache/spark/pull/32923
[GitHub] [spark] cloud-fan commented on a change in pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec
cloud-fan commented on a change in pull request #33140: URL: https://github.com/apache/spark/pull/33140#discussion_r662754669
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ##
@@ -122,7 +123,8 @@ case class AdaptiveSparkPlanExec(
     val origins = inputPlan.collect {
       case s: ShuffleExchangeLike => s.shuffleOrigin
     }
-    val allRules = queryStageOptimizerRules ++ postStageCreationRules
Review comment: I think we need to revisit all the phases in the AQE loop and think about which phases need to accept custom rules for columnar execution.
At the beginning, the input plan comes in and we run the `stage preparation rules` first to get the initial plan, which contains shuffles. Then we create query stages on the leaf shuffles and submit them after running the `stage optimization rules` and `post stage creation rules`.
When a query stage finishes, we start the loop:
1. generate the logical plan with the query stage result
2. re-optimize the logical plan by running `AQEOptimizer`, the planner and the `stage preparation rules`
3. compare the cost and pick the re-optimized plan or the old plan according to the cost
4. create more stages and submit them, then wait for the next query stage to finish
At the end, we need to optimize the final stage by running the `stage optimization rules` and `post stage creation rules`.
It looks to me that we can put the columnar execution custom rules in the `post stage creation rules`?
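The phase ordering in the comment above can be sketched as a toy model. This is not Spark code: the class `AqeLoopSketch`, the method `run`, its `stagesToFinish` parameter and the string labels are all hypothetical, and the sketch only records the order in which the rule batches named in the comment would run. It also assumes that stages created in step 4 go through the same per-stage rule batches as the first ones, which the comment implies but does not state.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the AQE phases described in the review comment; not Spark's implementation. */
public class AqeLoopSketch {

    /**
     * Returns the ordered list of phases for a query in which `stagesToFinish`
     * non-final query stages complete one at a time (a hypothetical input).
     */
    public static List<String> run(int stagesToFinish) {
        List<String> phases = new ArrayList<>();
        // Initial plan: the preparation rules produce a plan that contains shuffles.
        phases.add("preparation rules");
        // Leaf shuffles become query stages, which are optimized and then submitted.
        phases.add("stage optimization rules");
        phases.add("post stage creation rules");
        for (int i = 0; i < stagesToFinish; i++) {
            // 1. Rebuild the logical plan from the finished stage's result.
            phases.add("logical plan with stage result");
            // 2. Re-optimize: AQEOptimizer, planner, then the preparation rules.
            phases.add("AQEOptimizer");
            phases.add("planner");
            phases.add("preparation rules");
            // 3. Keep whichever plan is cheaper (costs are not modeled here).
            phases.add("compare cost");
            // 4. Create and submit more stages (assumed to run the per-stage rules).
            phases.add("stage optimization rules");
            phases.add("post stage creation rules");
        }
        // Final stage: the same per-stage batches run once more. The comment
        // suggests the columnar custom rules could live in the last batch.
        phases.add("stage optimization rules");
        phases.add("post stage creation rules");
        return phases;
    }

    public static void main(String[] args) {
        run(1).forEach(System.out::println);
    }
}
```

In this model the `post stage creation rules` batch is the last thing applied to every stage, including the final one, which is why it looks like a natural single hook for columnar rules.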
[GitHub] [spark] viirya commented on a change in pull request #32933: [SPARK-35785][SS] Cleanup support for RocksDB instance
viirya commented on a change in pull request #32933: URL: https://github.com/apache/spark/pull/32933#discussion_r662754308
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ##
@@ -253,6 +253,13 @@ class RocksDB(
     logInfo(s"Rolled back to $loadedVersion")
   }
+  def cleanup(): Unit = {
Review comment: okay.
[GitHub] [spark] xuanyuanking commented on a change in pull request #32933: [SPARK-35785][SS] Cleanup support for RocksDB instance
xuanyuanking commented on a change in pull request #32933: URL: https://github.com/apache/spark/pull/32933#discussion_r662753878
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ##
@@ -253,6 +253,13 @@ class RocksDB(
     logInfo(s"Rolled back to $loadedVersion")
   }
+  def cleanup(): Unit = {
Review comment: It will be called in `RocksDBStateStoreProvider.doMaintenance`. I'll submit the state store provider PR (the last one) today.
[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre
viirya commented on pull request #29326: URL: https://github.com/apache/spark/pull/29326#issuecomment-872735918
Hmm, from the failed tests below:
org.apache.spark.sql.hive.DataSourceWithHiveMetastoreCatalogSuite
org.apache.spark.sql.hive.HiveExternalCatalogSuite
org.apache.spark.sql.hive.StatisticsSuite
Since Guava 20, `com.google.common.collect.Iterators.emptyIterator()` is not public anymore. But I don't get it, because Hive 2.3.8/2.3.9 shaded Guava. Why would it use the newer Guava upgraded here?
```
java.lang.IllegalAccessError: tried to access method com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator; from class org.apache.hadoop.hive.ql.exec.FetchOperator
	at org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:108)
	at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:541)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
	at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$runHive$1(HiveClientImpl.scala:831)
```
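One way to narrow down a question like this is to check which jar actually supplies each class at runtime. The sketch below is only a generic diagnostic, not something proposed in the thread: the class `ClassOrigin` and its `locate` method are hypothetical names, and the only library calls used are the standard `Class.forName`, `getProtectionDomain()` and `getCodeSource()`. Run with the test classpath, it would print where `Iterators` and `FetchOperator` come from.

```java
import java.security.CodeSource;

/** Diagnostic sketch: report which jar (code source) a class was loaded from. */
public class ClassOrigin {

    /** Returns the code source location of `className`, or a short note if unavailable. */
    public static String locate(String className) {
        try {
            // Load without initializing, so static initializers do not run.
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Bootstrap classes such as java.lang.String have no code source.
            if (source == null || source.getLocation() == null) {
                return "no code source";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Both class names are taken from the stack trace in the comment.
        System.out.println(locate("com.google.common.collect.Iterators"));
        System.out.println(locate("org.apache.hadoop.hive.ql.exec.FetchOperator"));
    }
}
```

If the two classes resolve to different jars, the Hive class is calling into an unrelocated Guava, which would be consistent with the `IllegalAccessError`; the sketch does not by itself explain why.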
[GitHub] [spark] dongjoon-hyun commented on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP] Increase it in build/mvn script
dongjoon-hyun commented on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872735624 Oh! Thank you for sharing that, @LuciferYang . Ya, I saw this on `catalyst` mostly.
[GitHub] [spark] LuciferYang commented on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP] Increase it in build/mvn script
LuciferYang commented on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872735039 @dongjoon-hyun @gengliangwang It seems to work. I have compiled and tested `catalyst` and related modules many times in my compilation environment, and no `StackOverflowError` has been thrown so far.
[GitHub] [spark] SparkQA commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing
SparkQA commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872732859 **[Test build #140562 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140562/testReport)** for PR 33172 at commit [`9b67b51`](https://github.com/apache/spark/commit/9b67b51174151d3211be06478d0faa9669c1cf24).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing
AmplabJenkins removed a comment on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872731983 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45069/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec
AmplabJenkins removed a comment on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-872731982 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45070/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33160: [SPARK-35959][BUILD][test-maven][test-hadoop3.2][test-java11] Add a new Maven profile "no-shaded-hadoop-client" for Hadoop vers
AmplabJenkins removed a comment on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872731980
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
AmplabJenkins removed a comment on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872731984 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140550/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle
AmplabJenkins removed a comment on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-872711527
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
AmplabJenkins removed a comment on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872731981
[GitHub] [spark] AmplabJenkins commented on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle
AmplabJenkins commented on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-872731988 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45073/
[GitHub] [spark] AmplabJenkins commented on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec
AmplabJenkins commented on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-872731982 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45070/
[GitHub] [spark] AmplabJenkins commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
AmplabJenkins commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872731985
[GitHub] [spark] AmplabJenkins commented on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
AmplabJenkins commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872731984 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140550/
[GitHub] [spark] AmplabJenkins commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing
AmplabJenkins commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872731983 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45069/
[GitHub] [spark] AmplabJenkins commented on pull request #33160: [SPARK-35959][BUILD][test-maven][test-hadoop3.2][test-java11] Add a new Maven profile "no-shaded-hadoop-client" for Hadoop versions old
AmplabJenkins commented on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872731980
[GitHub] [spark] arghya18 commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1
arghya18 commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-872730959 @dongjoon-hyun Thanks. I am testing more jobs for further statistics. BTW, I am testing this on ORC.
[GitHub] [spark] SparkQA commented on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle
SparkQA commented on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-872729734 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45073/
[GitHub] [spark] SparkQA removed a comment on pull request #33160: [SPARK-35959][BUILD][test-maven][test-hadoop3.2][test-java11] Add a new Maven profile "no-shaded-hadoop-client" for Hadoop versions o
SparkQA removed a comment on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872708897 **[Test build #140560 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140560/testReport)** for PR 33160 at commit [`b1e0583`](https://github.com/apache/spark/commit/b1e0583c05f67ab2d599568697d7137163cbb5fc).
[GitHub] [spark] SparkQA commented on pull request #33160: [SPARK-35959][BUILD][test-maven][test-hadoop3.2][test-java11] Add a new Maven profile "no-shaded-hadoop-client" for Hadoop versions older tha
SparkQA commented on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872729629 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45072/
[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-872729358 Thank you for sharing, @arghya18 . It's interesting. The read statistic increase is also observed in my environment, but TPCDS 1TB on S3 parquet performance was faster for me. I'll keep tracking HADOOP-17755 together.
[GitHub] [spark] SparkQA commented on pull request #33160: [SPARK-35959][BUILD][test-maven][test-hadoop3.2][test-java11] Add a new Maven profile "no-shaded-hadoop-client" for Hadoop versions older tha
SparkQA commented on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872729293 **[Test build #140560 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140560/testReport)** for PR 33160 at commit [`b1e0583`](https://github.com/apache/spark/commit/b1e0583c05f67ab2d599568697d7137163cbb5fc).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33160: [SPARK-35959][BUILD][test-maven][test-hadoop3.2][test-java11] Add a new Maven profile "no-shaded-hadoop-client" for Hadoop versions older tha
SparkQA commented on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872728701 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45072/
[GitHub] [spark] gengliangwang commented on a change in pull request #33164: [SPARK-35958][CORE] Refactor SparkError.scala to SparkThrowable.java
gengliangwang commented on a change in pull request #33164: URL: https://github.com/apache/spark/pull/33164#discussion_r662746235
## File path: core/src/main/java/org/apache/spark/memory/SparkOutOfMemoryError.java ##
@@ -14,8 +14,11 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
+
Review comment: Nit: Unnecessary change.
[GitHub] [spark] cloud-fan commented on a change in pull request #33164: [SPARK-35958][CORE] Refactor SparkError.scala to SparkThrowable.java
cloud-fan commented on a change in pull request #33164: URL: https://github.com/apache/spark/pull/33164#discussion_r662745025
## File path: core/src/main/resources/error/error-classes.json ##
@@ -15,6 +15,10 @@
     "message" : [ "The second argument of '%s' function needs to be an integer." ],
     "sqlState" : "22023"
   },
+  "UNABLE_TO_ACQUIRE_MEMORY" : {
+    "message" : [ "Unable to acquire %s bytes of memory, got %s" ],
+    "sqlState" : null
Review comment: Can't we just omit the `sqlState` field?
[GitHub] [spark] cloud-fan commented on a change in pull request #33164: [SPARK-35958][CORE] Refactor SparkError.scala to SparkThrowable.java
cloud-fan commented on a change in pull request #33164: URL: https://github.com/apache/spark/pull/33164#discussion_r662744867

## File path: core/src/main/java/org/apache/spark/SparkThrowable.java ##
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark;
+
+/**
+ * Interface mixed into Throwables thrown from Spark.
+ *
+ * - For backwards compatibility, existing throwable types can be thrown with an arbitrary error
+ *   message with no error class. See [[SparkException]].
+ * - To promote standardization, throwables should be thrown with an error class and message
+ *   parameters to construct an error message with SparkThrowableHelper.getMessage(). New throwable
+ *   types should not accept arbitrary error messages. See [[SparkArithmeticException]].
+ */
+public interface SparkThrowable {
+  // Succinct, human-readable, unique, and consistent representation of the error category

Review comment: Shall we use Java `Optional`? Or document the null semantics and say that null means no error class?
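For readers following the review comment above, here is a minimal sketch of the two alternatives it raises. The method name `getErrorClass`, the interface names, the placeholder string, and the sample value `DIVIDE_BY_ZERO` are assumptions made for illustration only; the quoted diff shows just the comment line above the accessor, not its signature.

```java
import java.util.Optional;

// Hypothetical sketch: names below are illustrative, not taken from the PR.
class ErrorClassSketch {

  // Alternative A: keep a nullable return value and document the null semantics.
  interface NullableThrowable {
    /** Returns the error class, or null when the throwable has no error class. */
    String getErrorClass();
  }

  // Alternative B: make the absence explicit in the type with Optional.
  interface OptionalThrowable {
    /** Returns the error class, or Optional.empty() when there is none. */
    Optional<String> getErrorClass();
  }

  // Caller of alternative A must remember the null check.
  static String describeNullable(NullableThrowable t) {
    String errorClass = t.getErrorClass();
    return errorClass == null ? "<no error class>" : errorClass;
  }

  // Caller of alternative B is steered towards handling the empty case.
  static String describeOptional(OptionalThrowable t) {
    return t.getErrorClass().orElse("<no error class>");
  }

  public static void main(String[] args) {
    NullableThrowable legacy = () -> null;
    OptionalThrowable standardized = () -> Optional.of("DIVIDE_BY_ZERO");
    System.out.println(describeNullable(legacy));
    System.out.println(describeOptional(standardized));
  }
}
```

Either form can express "no error class"; the difference is whether callers learn about it from the documentation (A) or from the return type (B).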
[GitHub] [spark] dongjoon-hyun closed pull request #33180: [SPARK-35825][INFRA][FOLLOWUP] Increase it in build/mvn script
dongjoon-hyun closed pull request #33180: URL: https://github.com/apache/spark/pull/33180
[GitHub] [spark] dongjoon-hyun commented on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP][test-maven] Increase it in build/mvn script
dongjoon-hyun commented on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872724889 I'll merge this. Please let us know your result when you have some time, @LuciferYang ~
[GitHub] [spark] SparkQA commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872723745 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45071/
[GitHub] [spark] dongjoon-hyun commented on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP][test-maven] Increase it in build/mvn script
dongjoon-hyun commented on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872723431 Since this is a flaky compilation issue, the above two Maven runs might be insufficient for verification. However, I believe this patch does no harm to the build and only provides consistency for Maven.
[GitHub] [spark] dongjoon-hyun commented on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP][test-maven] Increase it in build/mvn script
dongjoon-hyun commented on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872722372

The above Maven run actually passed compilation of `catalyst` and `sql`, where we frequently see the StackOverflowError. It fails only with the following, which seems to be a flaky test.
```
- driver side SQL metrics *** FAILED ***
  Map(573099 -> "total (min, med, max (stageId: taskId)) 0 ms (0 ms, 0 ms, 0 ms (stage 4.0: task 8))", 573101 -> "2", 573100 -> "1") did not contain key 573169 (SQLAppStatusListenerSuite.scala:590)
```
Another Maven run is still in progress.
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140551/testReport
[GitHub] [spark] SparkQA commented on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec
SparkQA commented on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-872717213 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45070/
[GitHub] [spark] SparkQA removed a comment on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle
SparkQA removed a comment on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-872709101 **[Test build #140561 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140561/testReport)** for PR 32816 at commit [`9c985da`](https://github.com/apache/spark/commit/9c985da9870e107311cef0383a2a40703e4f4f07).
[GitHub] [spark] SparkQA removed a comment on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
SparkQA removed a comment on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872634594 **[Test build #140550 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140550/testReport)** for PR 33093 at commit [`eb83b68`](https://github.com/apache/spark/commit/eb83b684fdc7d62846b3860f90d26ec119c136c5).
[GitHub] [spark] SparkQA commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872715758 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45068/
[GitHub] [spark] SparkQA commented on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
SparkQA commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872714879

**[Test build #140550 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140550/testReport)** for PR 33093 at commit [`eb83b68`](https://github.com/apache/spark/commit/eb83b684fdc7d62846b3860f90d26ec119c136c5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing
SparkQA commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872714477 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45069/
[GitHub] [spark] AmplabJenkins commented on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle
AmplabJenkins commented on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-872711527 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140561/
[GitHub] [spark] SparkQA commented on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle
SparkQA commented on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-872711498

**[Test build #140561 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140561/testReport)** for PR 32816 at commit [`9c985da`](https://github.com/apache/spark/commit/9c985da9870e107311cef0383a2a40703e4f4f07).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class SkewJoinAwareCost(`
  * `case class SkewJoinAwareCostEvaluator(forceOptimizeSkewJoin: Boolean) extends CostEvaluator `
[GitHub] [spark] SparkQA commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872709712 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45071/
[GitHub] [spark] SparkQA commented on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle
SparkQA commented on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-872709101 **[Test build #140561 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140561/testReport)** for PR 32816 at commit [`9c985da`](https://github.com/apache/spark/commit/9c985da9870e107311cef0383a2a40703e4f4f07).
[GitHub] [spark] SparkQA commented on pull request #33160: [SPARK-35959][BUILD][test-maven][test-hadoop3.2][test-java11] Add a new Maven profile "no-shaded-hadoop-client" for Hadoop versions older tha
SparkQA commented on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872708897 **[Test build #140560 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140560/testReport)** for PR 33160 at commit [`b1e0583`](https://github.com/apache/spark/commit/b1e0583c05f67ab2d599568697d7137163cbb5fc).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP][test-maven] Increase it in build/mvn script
AmplabJenkins removed a comment on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872708365 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140552/
[GitHub] [spark] SparkQA removed a comment on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP][test-maven] Increase it in build/mvn script
SparkQA removed a comment on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872636572 **[Test build #140552 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140552/testReport)** for PR 33180 at commit [`4c4fcec`](https://github.com/apache/spark/commit/4c4fcec9d3002c7486930c49b65a57d4ace72288).
[GitHub] [spark] gengliangwang closed pull request #33177: [SPARK-35955][SQL] Check for overflow in Average in ANSI mode
gengliangwang closed pull request #33177: URL: https://github.com/apache/spark/pull/33177
[GitHub] [spark] AmplabJenkins commented on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP][test-maven] Increase it in build/mvn script
AmplabJenkins commented on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872708365 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140552/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
AmplabJenkins removed a comment on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872708148 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140559/
[GitHub] [spark] SparkQA commented on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP][test-maven] Increase it in build/mvn script
SparkQA commented on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872708254

**[Test build #140552 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140552/testReport)** for PR 33180 at commit [`4c4fcec`](https://github.com/apache/spark/commit/4c4fcec9d3002c7486930c49b65a57d4ace72288).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32944: [SPARK-35794][SQL] Allow custom plugin for AQE cost evaluator
AmplabJenkins removed a comment on pull request #32944: URL: https://github.com/apache/spark/pull/32944#issuecomment-872708150 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140547/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33070: [SPARK-35551][SQL] Handle the COUNT bug for lateral subqueries
AmplabJenkins removed a comment on pull request #33070: URL: https://github.com/apache/spark/pull/33070#issuecomment-872708152 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140546/
[GitHub] [spark] AmplabJenkins commented on pull request #33070: [SPARK-35551][SQL] Handle the COUNT bug for lateral subqueries
AmplabJenkins commented on pull request #33070: URL: https://github.com/apache/spark/pull/33070#issuecomment-872708152 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140546/
[GitHub] [spark] AmplabJenkins commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
AmplabJenkins commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872708148 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140559/
[GitHub] [spark] gengliangwang commented on pull request #33177: [SPARK-35955][SQL] Check for overflow in Average in ANSI mode
gengliangwang commented on pull request #33177: URL: https://github.com/apache/spark/pull/33177#issuecomment-872708153 Thanks, merging to master
[GitHub] [spark] AmplabJenkins commented on pull request #32944: [SPARK-35794][SQL] Allow custom plugin for AQE cost evaluator
AmplabJenkins commented on pull request #32944: URL: https://github.com/apache/spark/pull/32944#issuecomment-872708150 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140547/
[GitHub] [spark] SparkQA commented on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec
SparkQA commented on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-872704800 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45070/
[GitHub] [spark] SparkQA commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872704188 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45068/
[GitHub] [spark] SparkQA commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing
SparkQA commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872703815 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45069/
[GitHub] [spark] sunchao commented on pull request #33160: [SPARK-35959][BUILD][test-maven][test-hadoop3.2][test-java11] Add a new Maven profile "no-shaded-hadoop-client" for Hadoop versions older tha
sunchao commented on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872703193 That's unfortunate... Maybe for testing purposes I'll just change the Hadoop version directly in `pom.xml` to work around the sbt + Maven property issue.
[GitHub] [spark] arghya18 edited a comment on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1
arghya18 edited a comment on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-872701489 @dongjoon-hyun @steveloughran I was able to test my use case with Hadoop 3.3.1 and posted the result in [HADOOP-17755](https://issues.apache.org/jira/browse/HADOOP-17755?focusedCommentId=17373213=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17373213). To my surprise, the read is slower (with the same resources and the same config) in Hadoop 3.3.1 than in Hadoop 3.2.0 without the mentioned issue. It is possible I am missing something.
[GitHub] [spark] arghya18 commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1
arghya18 commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-872701489 @dongjoon-hyun I was able to test my use case with Hadoop 3.3.1 and posted the result in [HADOOP-17755](https://issues.apache.org/jira/browse/HADOOP-17755?focusedCommentId=17373213=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17373213). To my surprise, the read is slower (with the same resources and the same config) in Hadoop 3.3.1 than in Hadoop 3.2.0 without the mentioned issue. It is possible I am missing something.
[GitHub] [spark] ulysses-you commented on a change in pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle
ulysses-you commented on a change in pull request #32816: URL: https://github.com/apache/spark/pull/32816#discussion_r662725098

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ##
@@ -252,17 +275,26 @@ case class AdaptiveSparkPlanExec(
   // plans are updated, we can clear the query stage list because at this point the two plans
   // are semantically and physically in sync again.
   val logicalPlan = replaceWithQueryStagesInLogicalPlan(currentLogicalPlan, stagesToReplace)
-  val (newPhysicalPlan, newLogicalPlan) = reOptimize(logicalPlan)
+  val (reOptimizePhysicalPlan, newLogicalPlan) = reOptimize(logicalPlan)
+  val planWithExtraShuffle = rePlanWithExtraShuffle(reOptimizePhysicalPlan)
   val origCost = costEvaluator.evaluateCost(currentPhysicalPlan)
-  val newCost = costEvaluator.evaluateCost(newPhysicalPlan)
-  if (newCost < origCost ||
-      (newCost == origCost && currentPhysicalPlan != newPhysicalPlan)) {
+  val newCost = costEvaluator.evaluateCost(reOptimizePhysicalPlan)
+  val extraShuffleCost = costEvaluator.evaluateCost(planWithExtraShuffle)
+  def updateCurrentPlan(newPhysicalPlan: SparkPlan): Unit = {
     logOnLevel(s"Plan changed from $currentPhysicalPlan to $newPhysicalPlan")
     cleanUpTempTags(newPhysicalPlan)
     currentPhysicalPlan = newPhysicalPlan
     currentLogicalPlan = newLogicalPlan
     stagesToReplace = Seq.empty[QueryStageExec]
   }
+
+  if (extraShuffleCost < newCost ||
+      (extraShuffleCost == newCost && planWithExtraShuffle != reOptimizePhysicalPlan)) {
+    updateCurrentPlan(planWithExtraShuffle)
+  } else if (newCost < origCost ||
+      (newCost == origCost && currentPhysicalPlan != reOptimizePhysicalPlan)) {
+    updateCurrentPlan(reOptimizePhysicalPlan)
+  }

Review comment: @cloud-fan Here we use three costs to find the better plan:
1. the plan with skew join, if skew join optimization is forced;
2. the re-optimized plan, if skew join optimization is not forced and the plan has no extra shuffle;
3. the original plan, if the re-optimized plan has an extra shuffle.
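The three-way comparison in the diff above can be sketched in isolation. This is a minimal illustration, assuming plans are reduced to a name and a numeric cost; `Plan`, `PlanSelection`, and `choose` are hypothetical names for this sketch and are not part of Spark. The branch order mirrors the quoted diff: the extra-shuffle plan is compared against the re-optimized plan first, and the re-optimized plan is compared against the current plan only if the first branch does not fire.

```scala
// Simplified stand-in for a physical plan; this is NOT Spark's SparkPlan.
final case class Plan(name: String, cost: Long)

object PlanSelection {
  // Returns the plan that would become the current plan.
  // Ties are broken in favour of the newer plan when the two plans differ,
  // as in the quoted diff.
  def choose(current: Plan, reOptimized: Plan, withExtraShuffle: Plan): Plan = {
    if (withExtraShuffle.cost < reOptimized.cost ||
        (withExtraShuffle.cost == reOptimized.cost && withExtraShuffle != reOptimized)) {
      withExtraShuffle
    } else if (reOptimized.cost < current.cost ||
        (reOptimized.cost == current.cost && current != reOptimized)) {
      reOptimized
    } else {
      // Neither candidate wins, so the current plan is kept unchanged.
      current
    }
  }
}
```

Note that, as written in the diff, the first branch never looks at the cost of the current plan, so the sketch keeps that behaviour rather than guessing at a different intent.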
[GitHub] [spark] gengliangwang commented on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP][test-maven] Increase it in build/mvn script
gengliangwang commented on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872698339 LGTM. @LuciferYang what is the local test result?
[GitHub] [spark] SparkQA removed a comment on pull request #33070: [SPARK-35551][SQL] Handle the COUNT bug for lateral subqueries
SparkQA removed a comment on pull request #33070: URL: https://github.com/apache/spark/pull/33070#issuecomment-872614787 **[Test build #140546 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140546/testReport)** for PR 33070 at commit [`a87e0df`](https://github.com/apache/spark/commit/a87e0dfb9e35fba4da5348b382550495b557e685).
[GitHub] [spark] SparkQA commented on pull request #33070: [SPARK-35551][SQL] Handle the COUNT bug for lateral subqueries
SparkQA commented on pull request #33070: URL: https://github.com/apache/spark/pull/33070#issuecomment-872697943 **[Test build #140546 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140546/testReport)** for PR 33070 at commit [`a87e0df`](https://github.com/apache/spark/commit/a87e0dfb9e35fba4da5348b382550495b557e685).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA removed a comment on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872694042 **[Test build #140559 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140559/testReport)** for PR 33182 at commit [`16a0791`](https://github.com/apache/spark/commit/16a0791312af88341c6d2d5907e38644f838e113).
[GitHub] [spark] SparkQA commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872695942 **[Test build #140559 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140559/testReport)** for PR 33182 at commit [`16a0791`](https://github.com/apache/spark/commit/16a0791312af88341c6d2d5907e38644f838e113).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #32944: [SPARK-35794][SQL] Allow custom plugin for AQE cost evaluator
SparkQA removed a comment on pull request #32944: URL: https://github.com/apache/spark/pull/32944#issuecomment-872614842 **[Test build #140547 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140547/testReport)** for PR 32944 at commit [`404fe35`](https://github.com/apache/spark/commit/404fe35020f2cad93022d5110da8194822c28b9d).
[GitHub] [spark] linhongliu-db commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
linhongliu-db commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872695771 cc @cloud-fan
[GitHub] [spark] SparkQA commented on pull request #32944: [SPARK-35794][SQL] Allow custom plugin for AQE cost evaluator
SparkQA commented on pull request #32944: URL: https://github.com/apache/spark/pull/32944#issuecomment-872694868 **[Test build #140547 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140547/testReport)** for PR 32944 at commit [`404fe35`](https://github.com/apache/spark/commit/404fe35020f2cad93022d5110da8194822c28b9d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * ` .doc(\"The custom cost evaluator class to be used for adaptive execution. If not being set,\" +`
[GitHub] [spark] AmplabJenkins commented on pull request #32286: [SPARK-35181][CORE] Use zstd for spark.io.compression.codec by default
AmplabJenkins commented on pull request #32286: URL: https://github.com/apache/spark/pull/32286#issuecomment-872694269 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45066/
[GitHub] [spark] SparkQA commented on pull request #32286: [SPARK-35181][CORE] Use zstd for spark.io.compression.codec by default
SparkQA commented on pull request #32286: URL: https://github.com/apache/spark/pull/32286#issuecomment-872694251 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45066/
[GitHub] [spark] SparkQA commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872694042 **[Test build #140559 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140559/testReport)** for PR 33182 at commit [`16a0791`](https://github.com/apache/spark/commit/16a0791312af88341c6d2d5907e38644f838e113).
[GitHub] [spark] gengliangwang closed pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
gengliangwang closed pull request #33093: URL: https://github.com/apache/spark/pull/33093
[GitHub] [spark] gengliangwang commented on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
gengliangwang commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872692395 Thanks, merging to master
[GitHub] [spark] SparkQA commented on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec
SparkQA commented on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-872689929 **[Test build #140558 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140558/testReport)** for PR 33140 at commit [`5dcf102`](https://github.com/apache/spark/commit/5dcf102d533da1916e910f91feafed9f626dad46).
[GitHub] [spark] SparkQA commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-872689863 **[Test build #140556 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140556/testReport)** for PR 33182 at commit [`24b39a9`](https://github.com/apache/spark/commit/24b39a9667365428a3bbd5f2fe9a92face499420).
[GitHub] [spark] SparkQA commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing
SparkQA commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872689889 **[Test build #140557 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140557/testReport)** for PR 33172 at commit [`0d6b0c1`](https://github.com/apache/spark/commit/0d6b0c15e43e5e831a4ffd0063c0d6e20b032f03).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33181: [SPARK-35982][SQL] Allow from_json/to_json for map types where value types are year-month intervals
AmplabJenkins removed a comment on pull request #33181: URL: https://github.com/apache/spark/pull/33181#issuecomment-872689084 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45067/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests
AmplabJenkins removed a comment on pull request #33174: URL: https://github.com/apache/spark/pull/33174#issuecomment-872689081
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP][test-maven] Increase it in build/mvn script
AmplabJenkins removed a comment on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872689083 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140548/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
AmplabJenkins removed a comment on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872689085 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140543/
[GitHub] [spark] AmplabJenkins commented on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
AmplabJenkins commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872689085 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140543/
[GitHub] [spark] AmplabJenkins commented on pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests
AmplabJenkins commented on pull request #33174: URL: https://github.com/apache/spark/pull/33174#issuecomment-872689082
[GitHub] [spark] AmplabJenkins commented on pull request #33180: [SPARK-35825][INFRA][FOLLOWUP][test-maven] Increase it in build/mvn script
AmplabJenkins commented on pull request #33180: URL: https://github.com/apache/spark/pull/33180#issuecomment-872689083 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140548/
[GitHub] [spark] AmplabJenkins commented on pull request #33181: [SPARK-35982][SQL] Allow from_json/to_json for map types where value types are year-month intervals
AmplabJenkins commented on pull request #33181: URL: https://github.com/apache/spark/pull/33181#issuecomment-872689084 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45067/
[GitHub] [spark] SparkQA commented on pull request #33181: [SPARK-35982][SQL] Allow from_json/to_json for map types where value types are year-month intervals
SparkQA commented on pull request #33181: URL: https://github.com/apache/spark/pull/33181#issuecomment-872688574 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45067/
[GitHub] [spark] SparkQA removed a comment on pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests
SparkQA removed a comment on pull request #33174: URL: https://github.com/apache/spark/pull/33174#issuecomment-872652027 **[Test build #140553 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140553/testReport)** for PR 33174 at commit [`78388a9`](https://github.com/apache/spark/commit/78388a945aacf2c8aaa00cef2cc1e510d3646834).
[GitHub] [spark] SparkQA commented on pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests
SparkQA commented on pull request #33174: URL: https://github.com/apache/spark/pull/33174#issuecomment-872686045 **[Test build #140553 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140553/testReport)** for PR 33174 at commit [`78388a9`](https://github.com/apache/spark/commit/78388a945aacf2c8aaa00cef2cc1e510d3646834).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `for _, _class in inspect.getmembers(module, inspect.isclass):`
[GitHub] [spark] linhongliu-db opened a new pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
linhongliu-db opened a new pull request #33182: URL: https://github.com/apache/spark/pull/33182

### What changes were proposed in this pull request?
Add a config `spark.sql.join.forceApplyShuffledHashJoin` to force applying shuffled hash join during join selection.

### Why are the changes needed?
In `SQLQueryTestSuite`, we want to cover three kinds of join (BHJ, SHJ, SMJ) in join.sql. But even if `spark.sql.join.preferSortMergeJoin` is set to `false`, shuffled hash join is still not guaranteed to be selected. Thus, we need another config to force the selection.

### Does this PR introduce _any_ user-facing change?
No, the config is only for testing.

### How was this patch tested?
Newly added tests.
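Why a force flag helps can be illustrated with a toy model of join selection. This is only a sketch under stated assumptions: `JoinInputs`, `JoinConf`, `JoinSelectionSketch`, and the single `buildSideFitsInMemory` condition are hypothetical simplifications, and the order of the checks is an assumption of this sketch rather than something taken from the PR. The two config fields are named after `spark.sql.join.preferSortMergeJoin` and the proposed `spark.sql.join.forceApplyShuffledHashJoin`.

```scala
sealed trait JoinKind
case object BroadcastHashJoin extends JoinKind // BHJ
case object ShuffledHashJoin extends JoinKind  // SHJ
case object SortMergeJoin extends JoinKind     // SMJ

// Hypothetical, simplified inputs; real join selection inspects plan statistics and hints.
final case class JoinInputs(canBroadcast: Boolean, buildSideFitsInMemory: Boolean)

// Toy view of the two configs discussed in the PR description.
final case class JoinConf(preferSortMergeJoin: Boolean, forceApplyShuffledHashJoin: Boolean)

object JoinSelectionSketch {
  def select(in: JoinInputs, conf: JoinConf): JoinKind = {
    if (in.canBroadcast) BroadcastHashJoin
    // The force flag bypasses the size-based condition below.
    else if (conf.forceApplyShuffledHashJoin) ShuffledHashJoin
    // Without the flag, turning off preferSortMergeJoin is necessary but not sufficient.
    else if (!conf.preferSortMergeJoin && in.buildSideFitsInMemory) ShuffledHashJoin
    else SortMergeJoin
  }
}
```

In this model, setting `preferSortMergeJoin` to `false` still yields a sort merge join whenever the size condition fails, which is the gap the PR description points at; the force flag removes that dependency on input sizes.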
[GitHub] [spark] SparkQA removed a comment on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
SparkQA removed a comment on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872593557 **[Test build #140543 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140543/testReport)** for PR 33093 at commit [`b8c70ab`](https://github.com/apache/spark/commit/b8c70abbe5a462e2804f203e6a718e8ac0edd47c).
[GitHub] [spark] SparkQA commented on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
SparkQA commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872685340 **[Test build #140543 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140543/testReport)** for PR 33093 at commit [`b8c70ab`](https://github.com/apache/spark/commit/b8c70abbe5a462e2804f203e6a718e8ac0edd47c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #32286: [SPARK-35181][CORE] Use zstd for spark.io.compression.codec by default
SparkQA commented on pull request #32286: URL: https://github.com/apache/spark/pull/32286#issuecomment-872685080 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45066/