Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#issuecomment-2221242177 Merged. Thanks @andygrove -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya merged PR #613: URL: https://github.com/apache/datafusion-comet/pull/613 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672723713 ## .github/workflows/benchmark.yml: ## @@ -138,7 +138,7 @@ jobs: - name: Run TPC-DS queries (Sort merge join) if: matrix.join == 'sort_merge' run: | - SPARK_HOME=`pwd` SPARK_TPCDS_DATA=`pwd`/tpcds-sf-1 ./mvnw -B -Prelease -Dsuites=org.apache.spark.sql.CometTPCDSQuerySuite test + SPARK_HOME=`pwd` SPARK_TPCDS_DATA=`pwd`/tpcds-sf-1 ./mvnw -e -X -B -Prelease -Dsuites=org.apache.spark.sql.CometTPCDSQuerySuite test Review Comment: ```suggestion SPARK_HOME=`pwd` SPARK_TPCDS_DATA=`pwd`/tpcds-sf-1 ./mvnw -B -Prelease -Dsuites=org.apache.spark.sql.CometTPCDSQuerySuite test ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672724044 ## .github/workflows/benchmark.yml: ## @@ -138,7 +138,7 @@ jobs: - name: Run TPC-DS queries (Sort merge join) if: matrix.join == 'sort_merge' run: | - SPARK_HOME=`pwd` SPARK_TPCDS_DATA=`pwd`/tpcds-sf-1 ./mvnw -B -Prelease -Dsuites=org.apache.spark.sql.CometTPCDSQuerySuite test + SPARK_HOME=`pwd` SPARK_TPCDS_DATA=`pwd`/tpcds-sf-1 ./mvnw -e -X -B -Prelease -Dsuites=org.apache.spark.sql.CometTPCDSQuerySuite test Review Comment: Removing the debugging flags I added. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672521533 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -157,9 +191,11 @@ class CometTPCDSQuerySuite conf.set(CometConf.COMET_EXEC_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_SHUFFLE_ENABLED.key, "true") -conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "20g") +conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "15g") +conf.set(CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key, "true") +conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true") Review Comment: Created #648 for this PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672516941 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -157,9 +191,11 @@ class CometTPCDSQuerySuite conf.set(CometConf.COMET_EXEC_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_SHUFFLE_ENABLED.key, "true") -conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "20g") +conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "15g") +conf.set(CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key, "true") +conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true") Review Comment: Let me create another issue for this PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672515755 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -112,7 +108,9 @@ class CometTPCDSQuerySuite "q69", "q70", "q71", -"q72", +// TODO: unknown failure (seems memory usage over Github action runner) in CI with q72 in +// https://github.com/apache/datafusion-comet/pull/613. +// "q72", Review Comment: > I do think we should still test with the original q72 as a separate exercise though, because if Spark can run it then Comet should be able to as well (with the same memory configuration). Yea. As I mentioned earlier, I will investigate q72 further to see why it requires extra memory in Comet. Just disable it to unblock this PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672513653 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -112,7 +108,9 @@ class CometTPCDSQuerySuite "q69", "q70", "q71", -"q72", +// TODO: unknown failure (seems memory usage over Github action runner) in CI with q72 in +// https://github.com/apache/datafusion-comet/pull/613. +// "q72", Review Comment: > I am +1 on skipping running the official q72 query by default (because it is so ridiculous), especially in CI. However, maybe we should consider running an optimized version where the join order is sensible, which makes it at least 10x faster and uses far less memory. I will file a follow on issue to discuss this. Sounds good to me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672512250 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -112,7 +108,9 @@ class CometTPCDSQuerySuite "q69", "q70", "q71", -"q72", +// TODO: unknown failure (seems memory usage over Github action runner) in CI with q72 in +// https://github.com/apache/datafusion-comet/pull/613. +// "q72", Review Comment: > The purpose of q72 is to test vendors join reordering rules, and that isn't really very relevant to Spark or Comet since Spark queries typically don't have access to statistics. Btw, Spark has the capacity to do join reordering if statistics are available but it relies on enabling CBO features which are disabled by default. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
andygrove commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672494104 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -157,9 +191,11 @@ class CometTPCDSQuerySuite conf.set(CometConf.COMET_EXEC_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_SHUFFLE_ENABLED.key, "true") -conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "20g") +conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "15g") +conf.set(CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key, "true") +conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true") Review Comment: This can be a separate PR but we should not close the issue when we merge this PR -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
andygrove commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672490835 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -112,7 +108,9 @@ class CometTPCDSQuerySuite "q69", "q70", "q71", -"q72", +// TODO: unknown failure (seems memory usage over Github action runner) in CI with q72 in +// https://github.com/apache/datafusion-comet/pull/613. +// "q72", Review Comment: This is the version I have been using locally. Since we are not aiming to run the official TPC-DS benchmarks, but just our derived benchmarks, and also given that we are comparing Spark to Comet for the same queries, I think this would be fine to use by default as long it is well documented in our benchmarking guide. I do think we should still test with the original q72 as a separate exercise though, because if Spark can run it then Comet should be able to as well (with the same memory configuration). ```sql select i_item_desc ,w_warehouse_name ,d1.d_week_seq ,sum(case when p_promo_sk is null then 1 else 0 end) no_promo ,sum(case when p_promo_sk is not null then 1 else 0 end) promo ,count(*) total_cnt from catalog_sales join date_dim d1 on (cs_sold_date_sk = d1.d_date_sk) join customer_demographics on (cs_bill_cdemo_sk = cd_demo_sk) join household_demographics on (cs_bill_hdemo_sk = hd_demo_sk) join item on (i_item_sk = cs_item_sk) join inventory on (cs_item_sk = inv_item_sk) join warehouse on (w_warehouse_sk=inv_warehouse_sk) join date_dim d2 on (inv_date_sk = d2.d_date_sk) join date_dim d3 on (cs_ship_date_sk = d3.d_date_sk) left outer join promotion on (cs_promo_sk=p_promo_sk) left outer join catalog_returns on (cr_item_sk = cs_item_sk and cr_order_number = cs_order_number) where d1.d_week_seq = d2.d_week_seq and inv_quantity_on_hand < cs_quantity and d3.d_date > d1.d_date + 5 and hd_buy_potential = '501-1000' and d1.d_year = 1999 and cd_marital_status = 'S' group by i_item_desc,w_warehouse_name,d1.d_week_seq order by total_cnt desc, i_item_desc, w_warehouse_name, d_week_seq LIMIT 100; ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
andygrove commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672479688 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -112,7 +108,9 @@ class CometTPCDSQuerySuite "q69", "q70", "q71", -"q72", +// TODO: unknown failure (seems memory usage over Github action runner) in CI with q72 in +// https://github.com/apache/datafusion-comet/pull/613. +// "q72", Review Comment: The purpose of q72 is to test vendors join reordering rules, and that isn't really very relevant to Spark or Comet since Spark queries typically don't have access to statistics. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
andygrove commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672476021 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -157,9 +191,11 @@ class CometTPCDSQuerySuite conf.set(CometConf.COMET_EXEC_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_SHUFFLE_ENABLED.key, "true") -conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "20g") +conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "15g") +conf.set(CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key, "true") +conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true") Review Comment: Before we can close https://github.com/apache/datafusion-comet/issues/387 we should either change the default for `COMET_SHUFFLE_ENFORCE_MODE_ENABLED` or remove it completely. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
andygrove commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672471830 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -112,7 +108,9 @@ class CometTPCDSQuerySuite "q69", "q70", "q71", -"q72", +// TODO: unknown failure (seems memory usage over Github action runner) in CI with q72 in +// https://github.com/apache/datafusion-comet/pull/613. +// "q72", Review Comment: I am +1 on skipping running the official q72 query by default (because it is so ridiculous), especially in CI. However, maybe we should consider running an optimized version where the join order is sensible, which makes it at least 10x faster and uses far less memory. I will file a follow on issue to discuss this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672311644 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -141,9 +139,45 @@ class CometTPCDSQuerySuite "q98", "q99") - // TODO: enable the 3 queries after fixing the issues #1358. - override val tpcdsQueries: Seq[String] = -tpcdsAllQueries.filterNot(excludedTpcdsQueries.contains) + val tpcdsAllQueriesV2_7_0: Seq[String] = Seq( +"q5a", +"q6", +"q10a", +"q11", +"q12", +"q14", +"q14a", +"q18a", +"q20", +"q22", +"q22a", +"q24", +"q27a", +"q34", +"q35", +"q35a", +"q36a", +"q47", +"q49", +"q51a", +"q57", +"q64", +"q67a", +"q70a", +// TODO: unknown failure (seems memory usage over Github action runner) in CI with q72-v2.7 +// in https://github.com/apache/datafusion-comet/pull/613. +// "q72", Review Comment: I will investigate the two queries further but they seem not related to the changes here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1672310479 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -141,9 +139,45 @@ class CometTPCDSQuerySuite "q98", "q99") - // TODO: enable the 3 queries after fixing the issues #1358. - override val tpcdsQueries: Seq[String] = -tpcdsAllQueries.filterNot(excludedTpcdsQueries.contains) + val tpcdsAllQueriesV2_7_0: Seq[String] = Seq( +"q5a", +"q6", +"q10a", +"q11", +"q12", +"q14", +"q14a", +"q18a", +"q20", +"q22", +"q22a", +"q24", +"q27a", +"q34", +"q35", +"q35a", +"q36a", +"q47", +"q49", +"q51a", +"q57", +"q64", +"q67a", +"q70a", +// TODO: unknown failure (seems memory usage over Github action runner) in CI with q72-v2.7 +// in https://github.com/apache/datafusion-comet/pull/613. +// "q72", Review Comment: I found a few particular queries (q72, q16) seems to use more memory than others. q72 cannot be run through sort merge join config now in the CI runner due to its resource limit, but I can run it locally. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1671254402 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -141,9 +139,45 @@ class CometTPCDSQuerySuite "q98", "q99") - // TODO: enable the 3 queries after fixing the issues #1358. - override val tpcdsQueries: Seq[String] = -tpcdsAllQueries.filterNot(excludedTpcdsQueries.contains) + val tpcdsAllQueriesV2_7_0: Seq[String] = Seq( +"q5a", +"q6", +"q10a", +"q11", +"q12", +"q14", +"q14a", +"q18a", +"q20", +"q22", +"q22a", +"q24", +"q27a", +"q34", +"q35", +"q35a", +"q36a", +"q47", +"q49", +"q51a", +"q57", +"q64", +"q67a", +"q70a", +// TODO: unknown failure (seems memory usage over Github action runner) in CI with q72-v2.7 +// in https://github.com/apache/datafusion-comet/pull/613. +// "q72", Review Comment: In latest run, I saw `Error: Process completed with exit code 143.`. It seems like the memory usage is larger than the Github action runner. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#issuecomment-2218343486 This only got failures on `CometTPCDSQuerySuite` with sort merge join configs (broadcast and hash join configs are passed). But I don't see any details about the failure in CI logs. Only got: ``` 2024-07-09T15:55:27.4236260Z [ERROR] Failed to execute goal org.scalatest:scalatest-maven-plugin:2.0.2:test (test) on project comet-spark-spark3.4_2.12: There are test failures -> [Help 1] 2024-07-09T15:55:27.4283847Z org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.scalatest:scalatest-maven-plugin:2.0.2:test (test) on project comet-spark-spark3.4_2.12: There are test failures 2024-07-09T15:55:27.4621881Z at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute2 (MojoExecutor.java:333) 2024-07-09T15:55:27.4630194Z at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:316) 2024-07-09T15:55:27.4634523Z at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:212) 2024-07-09T15:55:27.4635530Z at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:174) 2024-07-09T15:55:27.4667105Z at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:75) 2024-07-09T15:55:27.4756898Z at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:162) 2024-07-09T15:55:27.4764057Z at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39) 2024-07-09T15:55:27.4785783Z at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:159) 2024-07-09T15:55:27.4796446Z at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105) 2024-07-09T15:55:27.4807376Z at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:73) 2024-07-09T15:55:27.4814525Z at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:53) 2024-07-09T15:55:27.4817912Z at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:118) 2024-07-09T15:55:27.4821849Z at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:261) 2024-07-09T15:55:27.4828407Z at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:173) 2024-07-09T15:55:27.4829379Z at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:101) 2024-07-09T15:55:27.4832730Z at org.apache.maven.cli.MavenCli.execute (MavenCli.java:906) 2024-07-09T15:55:27.4835266Z at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:283) 2024-07-09T15:55:27.4836081Z at org.apache.maven.cli.MavenCli.main (MavenCli.java:206) 2024-07-09T15:55:27.4839506Z at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method) 2024-07-09T15:55:27.4842763Z at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62) 2024-07-09T15:55:27.4849012Z at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43) 2024-07-09T15:55:27.4861001Z at java.lang.reflect.Method.invoke (Method.java:566) 2024-07-09T15:55:27.4870461Z at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:283) 2024-07-09T15:55:27.4885612Z at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:226) 2024-07-09T15:55:27.4890602Z at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:407) 2024-07-09T15:55:27.4899835Z at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:348) 2024-07-09T15:55:27.4904594Z at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method) 2024-07-09T15:55:27.4911451Z at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62) 2024-07-09T15:55:27.4915859Z at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43) 2024-07-09T15:55:27.4919199Z at java.lang.reflect.Method.invoke (Method.java:566) 2024-07-09T15:55:27.4932920Z at org.apache.maven.wrapper.BootstrapMainStarter.start (BootstrapMainStarter.java:52) 2024-07-09T15:55:27.4936029Z at org.apache.maven.wrapper.WrapperExecutor.execute (WrapperExecutor.java:161) 2024-07-09T15:55:27.4946395Z at org.apache.maven.wrapper.MavenWrapperMain.main (MavenWrapperMain.java:73) 2024-07-09T15:55:27.4963372Z Caused by: org.apache.maven.plugin.MojoFailureException: There are test failures 2024-07-09T15:55:27.4981790Z at org.scalatest.tools.maven.TestMojo.execute (TestMojo.java:109) 2024-07-09T15:55:27.4997308Z at org.apache.maven.plugin.DefaultBuildP
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#issuecomment-2212554468 The OOM issue of some TPCDS queries in CI will be fixed by #639 . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1667405223 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -158,6 +158,11 @@ class CometTPCDSQuerySuite conf.set(CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_SHUFFLE_ENABLED.key, "true") conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "20g") +conf.set(CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key, "true") +conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true") +// Disable `CometTakeOrderedAndProjectExec` because it doesn't produce same output order +// as Spark. +conf.set("spark.comet.exec.takeOrderedAndProjectExec.disabled", "true") Review Comment: After #628, we don't need to disable `CometTakeOrderedAndProjectExec`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1666331623 ## .github/workflows/benchmark.yml: ## @@ -141,7 +141,7 @@ jobs: SPARK_HOME=`pwd` SPARK_TPCDS_DATA=`pwd`/tpcds-sf-1 ./mvnw -B -Prelease -Dsuites=org.apache.spark.sql.CometTPCDSQuerySuite test env: SPARK_TPCDS_JOIN_CONF: | -spark.sql.autoBroadcastJoinThreshold=-1 +spark.sql.autoBroadcastJoinThreshold=-2 # Set to -2 to force Spark to sort the results Review Comment: Spark `TPCDSQueryTestSuite` checks if `spark.sql.autoBroadcastJoinThreshold` is -1 to decide whether to sort returned results. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1666361659 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -158,6 +158,11 @@ class CometTPCDSQuerySuite conf.set(CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_SHUFFLE_ENABLED.key, "true") conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "20g") +conf.set(CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key, "true") +conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true") +// Disable `CometTakeOrderedAndProjectExec` because it doesn't produce same output order +// as Spark. +conf.set("spark.comet.exec.takeOrderedAndProjectExec.disabled", "true") Review Comment: Currently I opened https://github.com/apache/datafusion-comet/pull/628 to enforce sorting for all join types. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1666331623 ## .github/workflows/benchmark.yml: ## @@ -141,7 +141,7 @@ jobs: SPARK_HOME=`pwd` SPARK_TPCDS_DATA=`pwd`/tpcds-sf-1 ./mvnw -B -Prelease -Dsuites=org.apache.spark.sql.CometTPCDSQuerySuite test env: SPARK_TPCDS_JOIN_CONF: | -spark.sql.autoBroadcastJoinThreshold=-1 +spark.sql.autoBroadcastJoinThreshold=-2 # Set to -2 to force Spark to sort the results Review Comment: Spark `TPCDSQueryTestSuite` checks if `spark.sql.autoBroadcastJoinThreshold` is -1 to decide whether to sort returned results. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1666305375 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -158,6 +158,11 @@ class CometTPCDSQuerySuite conf.set(CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_SHUFFLE_ENABLED.key, "true") conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "20g") +conf.set(CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key, "true") +conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true") +// Disable `CometTakeOrderedAndProjectExec` because it doesn't produce same output order +// as Spark. +conf.set("spark.comet.exec.takeOrderedAndProjectExec.disabled", "true") Review Comment: We maybe need to address this difference when comparing Spark and Comet query results. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1666304767 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -158,6 +158,11 @@ class CometTPCDSQuerySuite conf.set(CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_SHUFFLE_ENABLED.key, "true") conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "20g") +conf.set(CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key, "true") +conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true") +// Disable `CometTakeOrderedAndProjectExec` because it doesn't produce same output order +// as Spark. +conf.set("spark.comet.exec.takeOrderedAndProjectExec.disabled", "true") Review Comment: Hmm, I found that the order of returned rows are not related to the sorting columns. For example, if you sort on [a, b], there are some rows with same [a, b] so the order of these rows are irrelevant to the sorting order, i.e., either ``` |a|b|c| |0|1|1| |0|1|2| ``` or ``` |a|b|c| |0|1|2| |0|1|1| ``` are correct results for sorting on [a, b]. In these failed queries, Spark sort produces one and DataFusion sort produces another one. Because we compare the results as string, they are considered incorrect now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
viirya commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1664503124 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -158,6 +158,11 @@ class CometTPCDSQuerySuite conf.set(CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_SHUFFLE_ENABLED.key, "true") conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "20g") +conf.set(CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key, "true") +conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true") +// Disable `CometTakeOrderedAndProjectExec` because it doesn't produce same output order +// as Spark. +conf.set("spark.comet.exec.takeOrderedAndProjectExec.disabled", "true") Review Comment: I think these tests should be deterministic (that's why we can compare it with golden files). I'm not sure why `CometTakeOrderedAndProjectExec` returns out of order results. The results are same, but the orders are different to Spark. I suspect that it is something related to sorting part in `CometTakeOrderedAndProjectExec`. As the sorting is delegated to DataFusion's sort/top k operators, I need to investigate particularly for the failed query (e.g., q6). It is not related to the change here, though. So I will investigate it separately in follow PRs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]
andygrove commented on code in PR #613: URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1664488466 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -158,6 +158,11 @@ class CometTPCDSQuerySuite conf.set(CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key, "true") conf.set(CometConf.COMET_EXEC_SHUFFLE_ENABLED.key, "true") conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "20g") +conf.set(CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key, "true") +conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true") +// Disable `CometTakeOrderedAndProjectExec` because it doesn't produce same output order +// as Spark. +conf.set("spark.comet.exec.takeOrderedAndProjectExec.disabled", "true") Review Comment: Is this a correctness issue in general or is the issue specific to this test because we have queries that are non-deterministic? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org