(incubator-gluten) branch main updated: [GLUTEN-5142][CELEBORN] Remove Incubating of Celeborn from reference (#5143)

2024-03-26 Thread ulyssesyou
This is an automated email from the ASF dual-hosted git repository.

ulyssesyou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git


The following commit(s) were added to refs/heads/main by this push:
 new 3bc5387c0 [GLUTEN-5142][CELEBORN] Remove Incubating of Celeborn from 
reference (#5143)
3bc5387c0 is described below

commit 3bc5387c0e50f3e012f6ffad55dabbb7c52229c9
Author: Nicholas Jiang 
AuthorDate: Wed Mar 27 13:44:40 2024 +0800

[GLUTEN-5142][CELEBORN] Remove Incubating of Celeborn from reference (#5143)
---
 docs/get-started/ClickHouse.md | 8 
 docs/get-started/Velox.md  | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/docs/get-started/ClickHouse.md b/docs/get-started/ClickHouse.md
index ad7183b90..4167af2ee 100644
--- a/docs/get-started/ClickHouse.md
+++ b/docs/get-started/ClickHouse.md
@@ -629,14 +629,14 @@ public read-only account:gluten/hN2xX3uQ4m
 
 ### Celeborn support
 
-Gluten with clickhouse backend has not yet supportted 
[Celeborn](https://github.com/apache/incubator-celeborn) natively as remote 
shuffle service using columar shuffle. However, you can still use Celeborn with 
row shuffle, which means a ColumarBatch will be converted to a row during 
shuffle.
+Gluten with clickhouse backend has not yet supportted 
[Celeborn](https://github.com/apache/celeborn) natively as remote shuffle 
service using columar shuffle. However, you can still use Celeborn with row 
shuffle, which means a ColumarBatch will be converted to a row during shuffle.
 Below introduction is used to enable this feature:
 
-First refer to this URL(https://github.com/apache/incubator-celeborn) to setup 
a celeborn cluster.
+First refer to this URL(https://github.com/apache/celeborn) to setup a 
celeborn cluster.
 
 Then add the Spark Celeborn Client packages to your Spark application's 
classpath(usually add them into `$SPARK_HOME/jars`).
 
-- Celeborn: celeborn-client-spark-3-shaded_2.12-0.3.0-incubating.jar
+- Celeborn: celeborn-client-spark-3-shaded_2.12-[celebornVersion].jar
 
 Currently to use Celeborn following configurations are required in 
`spark-defaults.conf`
 
@@ -666,7 +666,7 @@ spark.sql.adaptive.localShuffleReader.enabled false
 spark.celeborn.storage.hdfs.dir hdfs:///celeborn
 
 # If you want to use dynamic resource allocation,
-# please refer to this URL 
(https://github.com/apache/incubator-celeborn/tree/main/assets/spark-patch) to 
apply the patch into your own Spark.
+# please refer to this URL 
(https://github.com/apache/celeborn/tree/main/assets/spark-patch) to apply the 
patch into your own Spark.
 spark.dynamicAllocation.enabled false
 ```
 
diff --git a/docs/get-started/Velox.md b/docs/get-started/Velox.md
index 7c3d77abc..1fabfc0fe 100644
--- a/docs/get-started/Velox.md
+++ b/docs/get-started/Velox.md
@@ -203,11 +203,11 @@ Currently there are several ways to asscess S3 in Spark. 
Please refer [Velox S3]
 
 ## Celeborn support
 
-Gluten with velox backend supports 
[Celeborn](https://github.com/apache/incubator-celeborn) as remote shuffle 
service. Currently, the supported Celeborn versions are `0.3.x` and `0.4.0`.
+Gluten with velox backend supports 
[Celeborn](https://github.com/apache/celeborn) as remote shuffle service. 
Currently, the supported Celeborn versions are `0.3.x` and `0.4.0`.
 
 Below introduction is used to enable this feature
 
-First refer to this URL(https://github.com/apache/incubator-celeborn) to setup 
a celeborn cluster.
+First refer to this URL(https://github.com/apache/celeborn) to setup a 
celeborn cluster.
 
 When compiling the Gluten Java module, it's required to enable `rss` profile, 
as follows:
 
@@ -217,7 +217,7 @@ mvn clean package -Pbackends-velox -Pspark-3.3 -Prss 
-DskipTests
 
 Then add the Gluten and Spark Celeborn Client packages to your Spark 
application's classpath(usually add them into `$SPARK_HOME/jars`).
 
-- Celeborn: celeborn-client-spark-3-shaded_2.12-0.3.0-incubating.jar
+- Celeborn: celeborn-client-spark-3-shaded_2.12-[celebornVersion].jar
 - Gluten: gluten-velox-bundle-spark3.x_2.12-xx_xx_xx-SNAPSHOT.jar, 
gluten-thirdparty-lib-xx-xx.jar
 
 Currently to use Gluten following configurations are required in 
`spark-defaults.conf`
@@ -248,7 +248,7 @@ spark.sql.adaptive.localShuffleReader.enabled false
 spark.celeborn.storage.hdfs.dir hdfs:///celeborn
 
 # If you want to use dynamic resource allocation,
-# please refer to this URL 
(https://github.com/apache/incubator-celeborn/tree/main/assets/spark-patch) to 
apply the patch into your own Spark.
+# please refer to this URL 
(https://github.com/apache/celeborn/tree/main/assets/spark-patch) to apply the 
patch into your own Spark.
 spark.dynamicAllocation.enabled false
 ```
 


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [I] Remove Incubating of Celeborn from reference [incubator-gluten]

2024-03-26 Thread via GitHub


ulysses-you closed issue #5142: Remove Incubating of Celeborn from reference
URL: https://github.com/apache/incubator-gluten/issues/5142


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





Re: [PR] [GLUTEN-5142][CELEBORN] Remove Incubating of Celeborn from reference [incubator-gluten]

2024-03-26 Thread via GitHub


ulysses-you merged PR #5143:
URL: https://github.com/apache/incubator-gluten/pull/5143





Re: [PR] [VL] Velox patch to avoid installing libunwind-dev no longer works [incubator-gluten]

2024-03-26 Thread via GitHub


PHILO-HE commented on PR #5127:
URL: 
https://github.com/apache/incubator-gluten/pull/5127#issuecomment-2021997304

   Sorry for the late response. Looks good! Thanks!





Re: [PR] [GLUTEN-5142][CELEBORN] Remove Incubating of Celeborn from reference [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5143:
URL: 
https://github.com/apache/incubator-gluten/pull/5143#issuecomment-2021995102

   https://github.com/apache/incubator-gluten/issues/5142





Re: [PR] [GLUTEN-5136][VL] Duplicated output from Spark-to-Velox broadcast relation conversion [incubator-gluten]

2024-03-26 Thread via GitHub


ulysses-you commented on PR #5141:
URL: 
https://github.com/apache/incubator-gluten/pull/5141#issuecomment-2021974259

   I see, thank you for the explanation!





[I] Remove Incubating of Celeborn from reference [incubator-gluten]

2024-03-26 Thread via GitHub


SteNicholas opened a new issue, #5142:
URL: https://github.com/apache/incubator-gluten/issues/5142

   ### Description
   
   The ASF board has approved a resolution to graduate Celeborn into a full Top-Level Project. The "incubating" designation should therefore be removed from references to Celeborn.





Re: [PR] [GLUTEN-4964][CORE]Fallback complex data type in parquet write for Spark32 & Spark33 [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5107:
URL: 
https://github.com/apache/incubator-gluten/pull/5107#issuecomment-2021963550

   Run Gluten Clickhouse CI





Re: [PR] [GLUTEN-5136][VL] Duplicated output from Spark-to-Velox broadcast relation conversion [incubator-gluten]

2024-03-26 Thread via GitHub


zhztheplayer commented on PR #5141:
URL: 
https://github.com/apache/incubator-gluten/pull/5141#issuecomment-2021963407

   > Thank you @zhztheplayer for the quick fix. After this PR, if there is no c2r, does the duplicate keys issue still exist?
   
   After the fix is applied, we should no longer have any related issues with BHJ, barring unknown ones.
   
   The issue this PR fixes only happens when the broadcast exchange falls back but the BHJ does not. That is a corner case for current Gluten, where usually both fall back or neither does. So ideally we shouldn't hit this issue in usual BHJ processing.
   
   





Re: [PR] [GLUTEN-5136][VL] Duplicated output from Spark-to-Velox broadcast relation conversion [incubator-gluten]

2024-03-26 Thread via GitHub


ulysses-you commented on PR #5141:
URL: 
https://github.com/apache/incubator-gluten/pull/5141#issuecomment-2021947563

   Thank you @zhztheplayer for the quick fix. After this PR, if there is no c2r, does the duplicate keys issue still exist?





Re: [PR] [CORE] Basic runnable version of ACBO (Advanced CBO) [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5058:
URL: 
https://github.com/apache/incubator-gluten/pull/5058#issuecomment-2021943260

   Run Gluten Clickhouse CI





Re: [PR] [GLUTEN-5123][INFRA]set up java and maven according to os in build_bundle_package.yml [incubator-gluten]

2024-03-26 Thread via GitHub


zhouyuan commented on PR #5124:
URL: 
https://github.com/apache/incubator-gluten/pull/5124#issuecomment-2021938997

   The feature itself looks good to me
   CC @PHILO-HE 





Re: [PR] [VL] Add uniffle integration [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #3767:
URL: 
https://github.com/apache/incubator-gluten/pull/3767#issuecomment-2021937111

   Run Gluten Clickhouse CI





(incubator-gluten) branch main updated: [GLUTEN-5136][VL] Duplicated output from Spark-to-Velox broadcast relation conversion (#5141)

2024-03-26 Thread hongze
This is an automated email from the ASF dual-hosted git repository.

hongze pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git


The following commit(s) were added to refs/heads/main by this push:
 new e4fe9baec [GLUTEN-5136][VL] Duplicated output from Spark-to-Velox 
broadcast relation conversion (#5141)
e4fe9baec is described below

commit e4fe9baeccde07e2938d5f186151c43591e91720
Author: Hongze Zhang 
AuthorDate: Wed Mar 27 12:54:29 2024 +0800

[GLUTEN-5136][VL] Duplicated output from Spark-to-Velox broadcast relation 
conversion (#5141)
---
 .../apache/spark/sql/execution/BroadcastUtils.scala| 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git 
a/backends-velox/src/main/scala/org/apache/spark/sql/execution/BroadcastUtils.scala
 
b/backends-velox/src/main/scala/org/apache/spark/sql/execution/BroadcastUtils.scala
index a0f28c5ab..ad7694ea2 100644
--- 
a/backends-velox/src/main/scala/org/apache/spark/sql/execution/BroadcastUtils.scala
+++ 
b/backends-velox/src/main/scala/org/apache/spark/sql/execution/BroadcastUtils.scala
@@ -26,7 +26,7 @@ import org.apache.spark.broadcast.Broadcast
 import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.expressions.UnsafeRow
 import org.apache.spark.sql.catalyst.plans.physical.{BroadcastMode, 
BroadcastPartitioning, IdentityBroadcastMode, Partitioning}
-import org.apache.spark.sql.execution.joins.{HashedRelation, 
HashedRelationBroadcastMode}
+import org.apache.spark.sql.execution.joins.{HashedRelation, 
HashedRelationBroadcastMode, LongHashedRelation}
 import org.apache.spark.sql.types.StructType
 import org.apache.spark.sql.vectorized.ColumnarBatch
 import org.apache.spark.util.TaskResources
@@ -96,9 +96,8 @@ object BroadcastUtils {
 // HashedRelation to ColumnarBuildSideRelation.
 val fromBroadcast = from.asInstanceOf[Broadcast[HashedRelation]]
 val fromRelation = fromBroadcast.value.asReadOnlyCopy()
-val keys = fromRelation.keys()
 val toRelation = TaskResources.runUnsafe {
-  val batchItr: Iterator[ColumnarBatch] = fn(keys.flatMap(key => 
fromRelation.get(key)))
+  val batchItr: Iterator[ColumnarBatch] = 
fn(reconstructRows(fromRelation))
   val serialized: Array[Array[Byte]] = serializeStream(batchItr) match 
{
 case ColumnarBatchSerializeResult.EMPTY =>
   Array()
@@ -170,4 +169,17 @@ object BroadcastUtils {
   }
 serializeResult
   }
+
+  private def reconstructRows(relation: HashedRelation): Iterator[InternalRow] 
= {
+// It seems that LongHashedRelation and UnsafeHashedRelation don't follow 
the same
+//  criteria while getting values from them.
+// Should review the internals of this part of code.
+relation match {
+  case relation: LongHashedRelation if relation.keyIsUnique =>
+relation.keys().map(k => relation.getValue(k))
+  case relation: LongHashedRelation if !relation.keyIsUnique =>
+relation.keys().flatMap(k => relation.get(k))
+  case other => other.valuesWithKeyIndex().map(_.getValue)
+}
+  }
 }





Re: [PR] [GLUTEN-5136][VL] Duplicated output from Spark-to-Velox broadcast relation conversion [incubator-gluten]

2024-03-26 Thread via GitHub


zhztheplayer merged PR #5141:
URL: https://github.com/apache/incubator-gluten/pull/5141





Re: [PR] [GLUTEN-5136][VL] Duplicated output from Spark-to-Velox broadcast relation conversion [incubator-gluten]

2024-03-26 Thread via GitHub


zhztheplayer commented on PR #5141:
URL: 
https://github.com/apache/incubator-gluten/pull/5141#issuecomment-2021931432

   cc @ulysses-you 





Re: [PR] [GLUTEN-5123][INFRA]set up java and maven according to os in build_bundle_package.yml [incubator-gluten]

2024-03-26 Thread via GitHub


zhouyuan commented on PR #5124:
URL: 
https://github.com/apache/incubator-gluten/pull/5124#issuecomment-2021930759

   Hi @dcoliversun 
   This patch seems to be trying to generate a package for each OS. The package built on centos7 should work on other platforms, as it uses static packaging via vcpkg.
   Could you please check whether the centos7 package works in your case?
   
   thanks,
   -yuan





Re: [I] [VL] Vanilla Spark broadcast exchange + R2C is slow sometimes [incubator-gluten]

2024-03-26 Thread via GitHub


zhztheplayer commented on issue #5136:
URL: 
https://github.com/apache/incubator-gluten/issues/5136#issuecomment-2021919347

   The major issue I have found is that the `flatMap` approach causes `UnsafeHashedRelation` to produce duplicated rows in my case (TPCDS q14a with the current version of ACBO),
   while the `map` approach causes `LongHashedRelation` to lose rows (TPCDS q2).
   
   The following fix (the same as #5141) works, but I didn't dive deep enough to find the root cause of the inconsistency (maybe related to `keyIsUnique`? I am not sure).
   
   ```scala
   private def reconstructRows(relation: HashedRelation): Iterator[InternalRow] = {
     // It seems that LongHashedRelation and UnsafeHashedRelation don't follow the same
     // criteria while getting values from them.
     // Should review the internals of this part of code.
     relation match {
       case relation: LongHashedRelation if relation.keyIsUnique =>
         relation.keys().map(k => relation.getValue(k))
       case relation: LongHashedRelation if !relation.keyIsUnique =>
         relation.keys().flatMap(k => relation.get(k))
       case other => other.valuesWithKeyIndex().map(_.getValue)
     }
   }
   ```
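
The two symptoms described above (duplicated rows with `flatMap`, lost rows with `map`) can be reproduced in a toy model without Spark. The sketch below is only an illustration under stated assumptions: `ToyRelation`, its `distinctKeys` flag, and the sample rows are hypothetical and are not Spark's `HashedRelation` API. The thread does not confirm the root cause; the model simply assumes that one relation type yields a key once per stored row while another yields each key once.

```scala
object BroadcastRowsSketch {
  final case class Row(key: Long, payload: String)

  // Hypothetical stand-in for a hashed relation; NOT Spark's HashedRelation.
  // `distinctKeys` selects which keys() behavior is being modeled (an assumption).
  final class ToyRelation(rows: Seq[Row], distinctKeys: Boolean) {
    // Yields each key once, or once per stored row, depending on the flag.
    def keys(): Iterator[Long] = {
      val all = rows.map(_.key)
      (if (distinctKeys) all.distinct else all).iterator
    }

    // All rows stored under a key.
    def get(key: Long): Iterator[Row] = rows.iterator.filter(_.key == key)

    // Only the first row stored under a key; throws if the key is absent.
    def getValue(key: Long): Row = rows.find(_.key == key).get

    // Every stored row exactly once, independent of how keys() behaves.
    def values(): Iterator[Row] = rows.iterator
  }

  // Key 1 has two rows, key 2 has one.
  val rows: Seq[Row] = Seq(Row(1L, "a"), Row(1L, "b"), Row(2L, "c"))
  val perRowKeys = new ToyRelation(rows, distinctKeys = false)
  val uniqueKeys = new ToyRelation(rows, distinctKeys = true)

  def viaFlatMap(r: ToyRelation): List[Row] = r.keys().flatMap(k => r.get(k)).toList
  def viaMap(r: ToyRelation): List[Row] = r.keys().map(k => r.getValue(k)).toList
  def viaValues(r: ToyRelation): List[Row] = r.values().toList

  def main(args: Array[String]): Unit = {
    println(s"flatMap over per-row keys: ${viaFlatMap(perRowKeys).size} rows")
    println(s"map over distinct keys:    ${viaMap(uniqueKeys).size} rows")
    println(s"direct value iteration:    ${viaValues(perRowKeys).size} rows")
  }
}
```

With three stored rows, `flatMap` over per-row keys returns five rows (key 1 is expanded twice), `map` over distinct keys returns two rows (the second row of key 1 is dropped), and iterating the values directly returns exactly three. That is the motivation for choosing the reconstruction strategy per relation type rather than using one key-based approach everywhere.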





Re: [I] [VL] Vanilla Spark broadcast exchange + R2C is slow sometimes [incubator-gluten]

2024-03-26 Thread via GitHub


zhztheplayer commented on issue #5136:
URL: 
https://github.com/apache/incubator-gluten/issues/5136#issuecomment-2021916024

   I don't have dedicated UTs for it, so it was incorporated into the other PR.
   
   Still, I can open one for it if you think it's needed: https://github.com/apache/incubator-gluten/pull/5141.
   
   The change has already been tested, so I will proceed to merge once the code style check passes, if that's OK with you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [CORE] Enable from_utc_timestamp Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5140:
URL: 
https://github.com/apache/incubator-gluten/pull/5140#issuecomment-2021909505

   
   
   Thanks for opening a pull request!
   
   Could you open an issue for this pull request on Github Issues?
   
   https://github.com/apache/incubator-gluten/issues
   
   Then could you also rename ***commit message*** and ***pull request title*** 
in the following format?
   
   [GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}
   
   See also:
   
 * [Other pull requests](https://github.com/apache/incubator-gluten/pulls/)
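
The title format requested above can be checked mechanically. The sketch below is a hypothetical helper, not the check the bot actually runs: the regular expression, `TitleFormat`, and `ParsedTitle` are assumptions made for illustration, and the pattern only enforces the `[GLUTEN-<id>][<COMPONENT>]` prefix followed by a non-empty message.

```scala
object TitleFormat {
  // Assumed pattern: "[GLUTEN-<digits>][<UPPERCASE COMPONENT>]" then a message.
  // Whitespace between the prefix and the message is optional.
  private val Pattern = """^\[GLUTEN-(\d+)\]\[([A-Z]+)\]\s*(\S.*)$""".r

  final case class ParsedTitle(issueId: Int, component: String, message: String)

  // Returns the parsed parts, or None when the title lacks the issue prefix.
  def parse(title: String): Option[ParsedTitle] = title match {
    case Pattern(id, component, message) =>
      Some(ParsedTitle(id.toInt, component, message))
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    println(parse("[GLUTEN-5142][CELEBORN] Remove Incubating of Celeborn from reference"))
    println(parse("[CORE] Enable from_utc_timestamp Spark function"))
  }
}
```

A title that carries only a component tag, such as `[CORE] Enable from_utc_timestamp Spark function`, does not match, which is the situation this reminder is raised for.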
   
   





Re: [PR] [GLUTEN-5136][VL] Duplicated output from Spark-to-Velox broadcast relation conversion [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5141:
URL: 
https://github.com/apache/incubator-gluten/pull/5141#issuecomment-2021911803

   https://github.com/apache/incubator-gluten/issues/5136





Re: [PR] [CORE] Enable from_utc_timestamp Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5140:
URL: 
https://github.com/apache/incubator-gluten/pull/5140#issuecomment-2021909730

   Run Gluten Clickhouse CI





[PR] [CORE] Enable from_utc_timestamp Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


acvictor opened a new pull request, #5140:
URL: https://github.com/apache/incubator-gluten/pull/5140

   ## What changes were proposed in this pull request?
   
   Enable from_utc_timestamp Spark function
   
   
   ## How was this patch tested?
   
   Added UT
   
   





Re: [PR] [CORE] Enable to_utc_timestamp Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5139:
URL: 
https://github.com/apache/incubator-gluten/pull/5139#issuecomment-2021897306

   Run Gluten Clickhouse CI





Re: [PR] [CORE] Enable to_utc_timestamp Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


acvictor commented on PR #5139:
URL: 
https://github.com/apache/incubator-gluten/pull/5139#issuecomment-2021894818

   @PHILO-HE can you please review? Thanks!





[PR] [CORE] Enable to_utc_timestamp Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


acvictor opened a new pull request, #5139:
URL: https://github.com/apache/incubator-gluten/pull/5139

   ## What changes were proposed in this pull request?
   
   Enable to_utc_timestamp
   
   ## How was this patch tested?
   Added a UT.
   
   





Re: [PR] [CORE] Enable to_utc_timestamp Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5139:
URL: 
https://github.com/apache/incubator-gluten/pull/5139#issuecomment-2021894673

   Run Gluten Clickhouse CI





Re: [PR] [CORE] Enable to_utc_timestamp Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5139:
URL: 
https://github.com/apache/incubator-gluten/pull/5139#issuecomment-2021894438

   
   
   Thanks for opening a pull request!
   
   Could you open an issue for this pull request on Github Issues?
   
   https://github.com/apache/incubator-gluten/issues
   
   Then could you also rename ***commit message*** and ***pull request title*** 
in the following format?
   
   [GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}
   
   See also:
   
 * [Other pull requests](https://github.com/apache/incubator-gluten/pulls/)
   
   





Re: [PR] [DNM] Velox test [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #4929:
URL: 
https://github.com/apache/incubator-gluten/pull/4929#issuecomment-2021890936

   Run Gluten Clickhouse CI





Re: [I] [VL] Vanilla Spark broadcast exchange + R2C is slow sometimes [incubator-gluten]

2024-03-26 Thread via GitHub


ulysses-you commented on issue #5136:
URL: 
https://github.com/apache/incubator-gluten/issues/5136#issuecomment-2021873724

   Thank you @zhztheplayer. It's a good point: columnar broadcast broadcasts the original binary data, but vanilla Spark broadcasts a hashed relation. So I think this issue is a common case even if there is no r2c.
   
   Is it possible to create a new PR for this issue?





Re: [PR] [VL] Add uniffle integration [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #3767:
URL: 
https://github.com/apache/incubator-gluten/pull/3767#issuecomment-2021856789

   Run Gluten Clickhouse CI





(incubator-gluten) branch main updated: [GLUTEN-5083][CH] Invalid result with mergeTwoPhasesHashBaseAggregateIfNeed enable (#5137)

2024-03-26 Thread zhangzc
This is an automated email from the ASF dual-hosted git repository.

zhangzc pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git


The following commit(s) were added to refs/heads/main by this push:
 new d3e4f2e4d [GLUTEN-5083][CH] Invalid result with 
mergeTwoPhasesHashBaseAggregateIfNeed enable (#5137)
d3e4f2e4d is described below

commit d3e4f2e4dea31b8f19e1cf86772cf34c1688d364
Author: lgbo 
AuthorDate: Wed Mar 27 11:15:26 2024 +0800

[GLUTEN-5083][CH] Invalid result with mergeTwoPhasesHashBaseAggregateIfNeed 
enable (#5137)

[CH] Invalid result with mergeTwoPhasesHashBaseAggregateIfNeed enable
---
 cpp-ch/local-engine/Operator/GraceMergingAggregatedStep.cpp | 4 
 cpp-ch/local-engine/Operator/StreamingAggregatingStep.cpp   | 4 
 cpp-ch/local-engine/Parser/AggregateRelParser.cpp   | 2 +-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/cpp-ch/local-engine/Operator/GraceMergingAggregatedStep.cpp 
b/cpp-ch/local-engine/Operator/GraceMergingAggregatedStep.cpp
index 9294d9719..00d2e3116 100644
--- a/cpp-ch/local-engine/Operator/GraceMergingAggregatedStep.cpp
+++ b/cpp-ch/local-engine/Operator/GraceMergingAggregatedStep.cpp
@@ -67,6 +67,10 @@ GraceMergingAggregatedStep::GraceMergingAggregatedStep(
 
 void GraceMergingAggregatedStep::transformPipeline(DB::QueryPipelineBuilder & 
pipeline, const DB::BuildQueryPipelineSettings &)
 {
+if (params.max_bytes_before_external_group_by)
+{
+throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, 
"max_bytes_before_external_group_by is not supported in 
GraceMergingAggregatedStep");
+}
 auto num_streams = pipeline.getNumStreams();
 auto transform_params = 
std::make_shared(pipeline.getHeader(), params, 
true);
 pipeline.resize(1);
diff --git a/cpp-ch/local-engine/Operator/StreamingAggregatingStep.cpp 
b/cpp-ch/local-engine/Operator/StreamingAggregatingStep.cpp
index ff81ee294..698d353b1 100644
--- a/cpp-ch/local-engine/Operator/StreamingAggregatingStep.cpp
+++ b/cpp-ch/local-engine/Operator/StreamingAggregatingStep.cpp
@@ -286,6 +286,10 @@ StreamingAggregatingStep::StreamingAggregatingStep(
 
 void StreamingAggregatingStep::transformPipeline(DB::QueryPipelineBuilder & 
pipeline, const DB::BuildQueryPipelineSettings &)
 {
+if (params.max_bytes_before_external_group_by)
+{
+throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, 
"max_bytes_before_external_group_by is not supported in 
StreamingAggregatingStep");
+}
 pipeline.dropTotalsAndExtremes();
 auto transform_params = 
std::make_shared(pipeline.getHeader(), params, 
false);
 pipeline.resize(1);
diff --git a/cpp-ch/local-engine/Parser/AggregateRelParser.cpp 
b/cpp-ch/local-engine/Parser/AggregateRelParser.cpp
index 02248d74a..a3ab329f0 100644
--- a/cpp-ch/local-engine/Parser/AggregateRelParser.cpp
+++ b/cpp-ch/local-engine/Parser/AggregateRelParser.cpp
@@ -310,7 +310,7 @@ void AggregateRelParser::addCompleteModeAggregatedStep()
 settings.group_by_overflow_mode,
 settings.group_by_two_level_threshold,
 settings.group_by_two_level_threshold_bytes,
-settings.max_bytes_before_external_group_by,
+0, /*settings.max_bytes_before_external_group_by*/
 settings.empty_result_for_aggregation_by_empty_set,
 getContext()->getTempDataOnDisk(),
 settings.max_threads,





Re: [PR] [GLUTEN-5083][CH] Invalid result with `mergeTwoPhasesHashBaseAggregateIfNeed` enable [incubator-gluten]

2024-03-26 Thread via GitHub


zzcclp merged PR #5137:
URL: https://github.com/apache/incubator-gluten/pull/5137





Re: [I] [CH] Invalid result with `mergeTwoPhasesHashBaseAggregateIfNeed` enable [incubator-gluten]

2024-03-26 Thread via GitHub


zzcclp closed issue #5083: [CH] Invalid result with 
`mergeTwoPhasesHashBaseAggregateIfNeed` enable
URL: https://github.com/apache/incubator-gluten/issues/5083





(incubator-gluten) branch main updated: [CORE] Support JDK17 (#5120)

2024-03-26 Thread yao
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git


The following commit(s) were added to refs/heads/main by this push:
 new 7942701c3 [CORE] Support JDK17 (#5120)
7942701c3 is described below

commit 7942701c3b67c72230f34286f837c3a6f13fd002
Author: Xiduo You 
AuthorDate: Wed Mar 27 11:10:11 2024 +0800

[CORE] Support JDK17 (#5120)

* Support JDK17

* address comment

-

Co-authored-by: Kent Yao 
---
 .github/workflows/velox_docker.yml | 114 ++---
 docs/developers/NewToGluten.md |  12 
 docs/get-started/Velox.md  |  28 -
 pom.xml|  47 ++-
 tools/gluten-it/pom.xml|  23 +++-
 tools/gluten-it/sbin/gluten-it.sh  |  21 ++-
 6 files changed, 152 insertions(+), 93 deletions(-)

diff --git a/.github/workflows/velox_docker.yml 
b/.github/workflows/velox_docker.yml
index f2b73e81d..6329750d2 100644
--- a/.github/workflows/velox_docker.yml
+++ b/.github/workflows/velox_docker.yml
@@ -73,6 +73,17 @@ jobs:
   matrix:
 os: ["ubuntu:20.04", "ubuntu:22.04"]
 spark: ["spark-3.2", "spark-3.3", "spark-3.4", "spark-3.5"]
+java: [ "java-8", "java-17" ]
+# Spark supports JDK17 since 3.3 and later, see 
https://issues.apache.org/jira/browse/SPARK-33772
+exclude:
+  - spark: spark-3.2
+java: java-17
+  - spark: spark-3.4
+java: java-17
+  - spark: spark-3.5
+java: java-17
+  - os: ubuntu:22.04
+java: java-17
 runs-on: ubuntu-20.04
 container: ${{ matrix.os }}
 steps:
@@ -84,69 +95,45 @@ jobs:
   path: ./cpp/build/releases
   - name: Setup java and maven
 run: |
-  apt-get update && \
-  apt-get install -y openjdk-8-jdk maven && \
+  if [ "${{ matrix.java }}" = "java-17" ]; then
+apt-get update && apt-get install -y openjdk-17-jdk maven
+  else
+apt-get update && apt-get install -y openjdk-8-jdk maven
+  fi
   apt remove openjdk-11* -y
-  - name: Build for Spark ${{ matrix.spark }}
-run: |
-  cd $GITHUB_WORKSPACE/ && \
-  mvn clean install -P${{ matrix.spark }} -Pbackends-velox -DskipTests
-  - name: Build and run TPCH/DS ${{ matrix.spark }}
-run: |
-  cd $GITHUB_WORKSPACE/tools/gluten-it && \
-  mvn clean install -P${{ matrix.spark }} \
-  && GLUTEN_IT_JVM_ARGS=-Xmx5G sbin/gluten-it.sh queries-compare \
---local --preset=velox --benchmark-type=h --error-on-memleak 
--off-heap-size=10g -s=1.0 --threads=16 --iterations=1 \
-  && GLUTEN_IT_JVM_ARGS=-Xmx5G sbin/gluten-it.sh queries-compare \
---local --preset=velox --benchmark-type=ds --error-on-memleak 
--off-heap-size=10g -s=1.0 --threads=16 --iterations=1
-
-
-  run-tpc-test-centos7:
-needs: build-native-lib
-strategy:
-  fail-fast: false
-  matrix:
-spark: ["spark-3.2", "spark-3.3", "spark-3.4", "spark-3.5"]
-runs-on: ubuntu-20.04
-container: centos:7
-steps:
-  - uses: actions/checkout@v2
-  - name: Download All Artifacts
-uses: actions/download-artifact@v2
-with:
-  name: velox-native-lib-${{github.sha}}
-  path: ./cpp/build/releases
-  - name: Setup java and maven
-run: |
-  yum update -y && yum install -y java-1.8.0-openjdk-devel wget
-  wget 
https://downloads.apache.org/maven/maven-3/3.8.8/binaries/apache-maven-3.8.8-bin.tar.gz
-  tar -xvf apache-maven-3.8.8-bin.tar.gz
-  mv apache-maven-3.8.8 /usr/lib/maven
-  - name: Build for Spark ${{ matrix.spark }}
+  - name: Build and run TPCH/DS
 run: |
   cd $GITHUB_WORKSPACE/
-  export MAVEN_HOME=/usr/lib/maven
-  export PATH=${PATH}:${MAVEN_HOME}/bin
-  mvn clean install -P${{ matrix.spark }} -Pbackends-velox -DskipTests
-  - name: Build and run TPCH/DS ${{ matrix.spark }}
-run: |
-  cd $GITHUB_WORKSPACE/tools/gluten-it 
-  export MAVEN_HOME=/usr/lib/maven
-  export PATH=${PATH}:${MAVEN_HOME}/bin
-  mvn clean install -P${{ matrix.spark }} \
+  export JAVA_HOME=/usr/lib/jvm/${{ matrix.java }}-openjdk-amd64
+  echo "JAVA_HOME: $JAVA_HOME"
+  mvn clean install -P${{ matrix.spark }} -P${{ matrix.java }} 
-Pbackends-velox -DskipTests
+  cd $GITHUB_WORKSPACE/tools/gluten-it
+  mvn clean install -P${{ matrix.spark }} -P${{ matrix.java }} \
   && GLUTEN_IT_JVM_ARGS=-Xmx5G sbin/gluten-it.sh queries-compare \
 --local --preset=velox --benchmark-type=h --error-on-memleak 
--off-heap-size=10g -s=1.0 --threads=16 --iterations=1 \
   && GLUTEN_IT_JVM_ARGS=-Xmx5G 
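
The `matrix`/`exclude` block in the workflow diff above can be worked through by hand. The following is a minimal sketch, not part of the commit. It assumes GitHub Actions' documented rule that an `exclude` entry removes every combination matching all of the keys it lists. The helper name `expand` is hypothetical; the matrix values are copied from the diff.

```python
from itertools import product

# Matrix values and exclude rules as they appear in the workflow diff.
matrix = {
    "os": ["ubuntu:20.04", "ubuntu:22.04"],
    "spark": ["spark-3.2", "spark-3.3", "spark-3.4", "spark-3.5"],
    "java": ["java-8", "java-17"],
}
exclude = [
    {"spark": "spark-3.2", "java": "java-17"},
    {"spark": "spark-3.4", "java": "java-17"},
    {"spark": "spark-3.5", "java": "java-17"},
    {"os": "ubuntu:22.04", "java": "java-17"},
]


def expand(matrix, exclude):
    """Hypothetical helper: build the cross product, then drop excluded jobs."""
    keys = list(matrix)
    jobs = [dict(zip(keys, values)) for values in product(*(matrix[k] for k in keys))]

    def excluded(job):
        # A rule applies when every key it names matches the job (partial match).
        return any(all(job[k] == v for k, v in rule.items()) for rule in exclude)

    return [job for job in jobs if not excluded(job)]


jobs = expand(matrix, exclude)
java17_jobs = [job for job in jobs if job["java"] == "java-17"]
```

Under these assumptions the 16 raw combinations reduce to 9 jobs. Eight use java-8, and the only java-17 job is ubuntu:20.04 with spark-3.3.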

Re: [PR] [CORE] Support JDK17 [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5120:
URL: 
https://github.com/apache/incubator-gluten/pull/5120#issuecomment-2021843842

   Run Gluten Clickhouse CI





Re: [PR] [CORE] Support JDK17 [incubator-gluten]

2024-03-26 Thread via GitHub


yaooqinn merged PR #5120:
URL: https://github.com/apache/incubator-gluten/pull/5120





Re: [PR] [DNM] Velox test [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #4929:
URL: 
https://github.com/apache/incubator-gluten/pull/4929#issuecomment-2021800771

   Run Gluten Clickhouse CI





[PR] [VL] Daily Update Velox Version (2024_03_27) [incubator-gluten]

2024-03-26 Thread via GitHub


marin-ma opened a new pull request, #5138:
URL: https://github.com/apache/incubator-gluten/pull/5138

   ```
   7fc09667d (upstream/main) Add estimateSerializedSize to 
BatchVectorSerializer (#8712)
   c354c31f1 Reuse result vector in Alpha reader (#9226)
   3fbb4754f Create UnitLoader (#9259)
   6ec8f26d9 Remove logical types in min_by and max_by tests (#8999)
   494b8881b Add support for kurtosis Spark aggregate function (#9233)
   2d832eef4 Delete unused ExpressionFuzzer::generateXxxArgs methods (#9256)
   0618c7f69 Fix integer overflow for window ROWS frame (#8870)
   4f3d32fd5 Clean up FOLLY nullable annotation (#9247)
   ```





Re: [PR] [VL] Daily Update Velox Version (2024_03_27) [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5138:
URL: 
https://github.com/apache/incubator-gluten/pull/5138#issuecomment-2021777264

   
   
   Thanks for opening a pull request!
   
   Could you open an issue for this pull request on GitHub Issues?
   
   https://github.com/apache/incubator-gluten/issues
   
   Then could you also rename ***commit message*** and ***pull request title*** 
in the following format?
   
   [GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}
   
   See also:
   
 * [Other pull requests](https://github.com/apache/incubator-gluten/pulls/)
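
As a rough illustration of the title format requested above, the sketch below parses a title into its parts. The regular expression and the function name `parse_title` are hypothetical and are not taken from Gluten's actual checks. The sketch assumes a numeric issue ID and a letters-only component tag, and it treats the `feat:`/`fix:` prefix as optional because several titles in this archive omit it.

```python
import re

# Hypothetical pattern: "[GLUTEN-<digits>][<COMPONENT>]", then an optional
# "feat:" or "fix:" prefix, then a non-empty message.
TITLE_RE = re.compile(
    r"^\[GLUTEN-(\d+)\]\[([A-Za-z]+)\]\s*(?:(feat|fix):\s*)?(\S.*)$"
)


def parse_title(title):
    """Return (issue_id, component, kind, message), or None if the title does not match."""
    match = TITLE_RE.match(title)
    if match is None:
        return None
    issue_id, component, kind, message = match.groups()
    return int(issue_id), component, kind, message
```

A title without the `[GLUTEN-<id>]` prefix, such as `[VL] Daily Update Velox Version (2024_03_27)`, is rejected by this sketch, which is the situation the bot comment above asks the author to fix.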
   
   





Re: [PR] [GLUTEN-5123][INFRA]set up java and maven according to os in build_bundle_package.yml [incubator-gluten]

2024-03-26 Thread via GitHub


dcoliversun commented on PR #5124:
URL: 
https://github.com/apache/incubator-gluten/pull/5124#issuecomment-2021764604

   @zhouyuan @wangyum please review this PR if you have time.





(incubator-gluten) branch main updated: [Gluten-5018][CH] support minmax/bloomfilter/set skip index (#5019)

2024-03-26 Thread mahongbin
This is an automated email from the ASF dual-hosted git repository.

mahongbin pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git


The following commit(s) were added to refs/heads/main by this push:
 new 972597184 [Gluten-5018][CH] support minmax/bloomfilter/set skip index 
(#5019)
972597184 is described below

commit 972597184e147fcf488fd6cda4b447356d61136d
Author: Hongbin Ma 
AuthorDate: Wed Mar 27 09:33:40 2024 +0800

[Gluten-5018][CH] support minmax/bloomfilter/set skip index (#5019)

* temp, by default all cols minmax index

basically works, dealing with nullable

nullable/not-null ok

remove unnecessary change

fix compile

* add ut

* remove dataschema

* fix spark32 bug
---
 .../source/DeltaMergeTreeFileFormat.scala  |  17 +-
 .../source/DeltaMergeTreeFileFormat.scala  |  17 +-
 .../java/io/glutenproject/metrics/MetricsStep.java |  11 +
 .../backendsapi/clickhouse/CHIteratorApi.scala |   3 +
 .../backendsapi/clickhouse/CHMetricsApi.scala  |   1 +
 .../execution/GlutenMergeTreePartition.scala   |   3 +
 .../metrics/FileSourceScanMetricsUpdater.scala |   2 +
 .../delta/ClickhouseOptimisticTransaction.scala|   7 +-
 .../sql/delta/catalog/ClickHouseTableV2.scala  |  35 ++-
 .../utils/MergeTreePartsPartitionsUtil.scala   |  33 +++
 .../datasources/v1/CHMergeTreeWriterInjects.scala  |  29 ++-
 .../v1/clickhouse/MergeTreeFileFormatWriter.scala  |   9 +
 ...GlutenClickHouseTPCHNotNullSkipIndexSuite.scala | 271 
 ...lutenClickHouseTPCHNullableSkipIndexSuite.scala | 277 +
 .../apache/spark/affinity/MixedAffinitySuite.scala |   3 +
 cpp-ch/local-engine/Common/MergeTreeTool.cpp   |  84 ++-
 cpp-ch/local-engine/Common/MergeTreeTool.h |   3 +
 cpp-ch/local-engine/Parser/MergeTreeRelParser.cpp  |  18 +-
 cpp-ch/local-engine/Parser/RelMetric.cpp   |   3 +
 cpp-ch/local-engine/Parser/TypeParser.cpp  |   4 +-
 cpp-ch/local-engine/Parser/TypeParser.h|  50 ++--
 .../substrait/rel/ExtensionTableBuilder.java   |   6 +
 .../substrait/rel/ExtensionTableNode.java  |  12 +
 .../datasource/GlutenFormatWriterInjects.scala |   4 +-
 24 files changed, 843 insertions(+), 59 deletions(-)

diff --git 
a/backends-clickhouse/src/main/delta-20/org/apache/spark/sql/execution/datasources/v2/clickhouse/source/DeltaMergeTreeFileFormat.scala
 
b/backends-clickhouse/src/main/delta-20/org/apache/spark/sql/execution/datasources/v2/clickhouse/source/DeltaMergeTreeFileFormat.scala
index fef109d35..d4ca321a9 100644
--- 
a/backends-clickhouse/src/main/delta-20/org/apache/spark/sql/execution/datasources/v2/clickhouse/source/DeltaMergeTreeFileFormat.scala
+++ 
b/backends-clickhouse/src/main/delta-20/org/apache/spark/sql/execution/datasources/v2/clickhouse/source/DeltaMergeTreeFileFormat.scala
@@ -17,7 +17,6 @@
 package org.apache.spark.sql.execution.datasources.v2.clickhouse.source
 
 import org.apache.spark.sql.SparkSession
-import org.apache.spark.sql.catalyst.expressions.Attribute
 import org.apache.spark.sql.delta.DeltaParquetFileFormat
 import org.apache.spark.sql.delta.actions.Metadata
 import org.apache.spark.sql.execution.datasources.{OutputWriter, 
OutputWriterFactory}
@@ -31,9 +30,11 @@ class DeltaMergeTreeFileFormat(metadata: Metadata)
 
   protected var database = ""
   protected var tableName = ""
-  protected var dataSchemas = Seq.empty[Attribute]
   protected var orderByKeyOption: Option[Seq[String]] = None
   protected var lowCardKeyOption: Option[Seq[String]] = None
+  protected var minmaxIndexKeyOption: Option[Seq[String]] = None
+  protected var bfIndexKeyOption: Option[Seq[String]] = None
+  protected var setIndexKeyOption: Option[Seq[String]] = None
   protected var primaryKeyOption: Option[Seq[String]] = None
   protected var partitionColumns: Seq[String] = Seq.empty[String]
   protected var clickhouseTableConfigs: Map[String, String] = Map.empty
@@ -42,18 +43,22 @@ class DeltaMergeTreeFileFormat(metadata: Metadata)
   metadata: Metadata,
   database: String,
   tableName: String,
-  schemas: Seq[Attribute],
   orderByKeyOption: Option[Seq[String]],
   lowCardKeyOption: Option[Seq[String]],
+  minmaxIndexKeyOption: Option[Seq[String]],
+  bfIndexKeyOption: Option[Seq[String]],
+  setIndexKeyOption: Option[Seq[String]],
   primaryKeyOption: Option[Seq[String]],
   clickhouseTableConfigs: Map[String, String],
   partitionColumns: Seq[String]) {
 this(metadata)
 this.database = database
 this.tableName = tableName
-this.dataSchemas = schemas
 this.orderByKeyOption = orderByKeyOption
 this.lowCardKeyOption = lowCardKeyOption
+this.minmaxIndexKeyOption = minmaxIndexKeyOption
+this.bfIndexKeyOption = bfIndexKeyOption
+this.setIndexKeyOption = setIndexKeyOption
 

Re: [I] [CH] basically support set/bloomfilter/minmax index for clickhouse tables [incubator-gluten]

2024-03-26 Thread via GitHub


binmahone closed issue #5018: [CH] basically support set/bloomfilter/minmax 
index for clickhouse tables
URL: https://github.com/apache/incubator-gluten/issues/5018





Re: [PR] [CH] Issue 5018 [incubator-gluten]

2024-03-26 Thread via GitHub


binmahone merged PR #5019:
URL: https://github.com/apache/incubator-gluten/pull/5019





(incubator-gluten) branch main updated: [VL] Enable SPARK-10634 timestamp test case (#5090)

2024-03-26 Thread rui
This is an automated email from the ASF dual-hosted git repository.

rui pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git


The following commit(s) were added to refs/heads/main by this push:
 new b962e7cc7 [VL] Enable SPARK-10634 timestamp test case (#5090)
b962e7cc7 is described below

commit b962e7cc74f7a7114770e9a882f10d5eaa59a355
Author: Joey 
AuthorDate: Wed Mar 27 09:32:41 2024 +0800

[VL] Enable SPARK-10634 timestamp test case (#5090)
---
 .../src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala | 2 --
 .../src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala | 2 --
 .../src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala | 2 --
 3 files changed, 6 deletions(-)

diff --git 
a/gluten-ut/spark32/src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala
 
b/gluten-ut/spark32/src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala
index 5f66df1a0..2d92c5ca2 100644
--- 
a/gluten-ut/spark32/src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala
+++ 
b/gluten-ut/spark32/src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala
@@ -857,7 +857,6 @@ class VeloxTestSettings extends BackendTestSettings {
 // decimal failed ut
 .exclude("SPARK-34212 Parquet should read decimals correctly")
 // Timestamp is read as INT96.
-.exclude("SPARK-10634 timestamp written and read as INT64 - truncation")
 .exclude("Migration from INT96 to TIMESTAMP_MICROS timestamp type")
 .exclude("SPARK-10365 timestamp written and read as INT64 - 
TIMESTAMP_MICROS")
 // Rewrite because the filter after datasource is not needed.
@@ -869,7 +868,6 @@ class VeloxTestSettings extends BackendTestSettings {
 // decimal failed ut
 .exclude("SPARK-34212 Parquet should read decimals correctly")
 // Timestamp is read as INT96.
-.exclude("SPARK-10634 timestamp written and read as INT64 - truncation")
 .exclude("Migration from INT96 to TIMESTAMP_MICROS timestamp type")
 .exclude("SPARK-10365 timestamp written and read as INT64 - 
TIMESTAMP_MICROS")
 // Rewrite because the filter after datasource is not needed.
diff --git 
a/gluten-ut/spark33/src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala
 
b/gluten-ut/spark33/src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala
index f2e75f84f..dd14a604b 100644
--- 
a/gluten-ut/spark33/src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala
+++ 
b/gluten-ut/spark33/src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala
@@ -682,7 +682,6 @@ class VeloxTestSettings extends BackendTestSettings {
 // decimal failed ut
 .exclude("SPARK-34212 Parquet should read decimals correctly")
 // Timestamp is read as INT96.
-.exclude("SPARK-10634 timestamp written and read as INT64 - truncation")
 .exclude("Migration from INT96 to TIMESTAMP_MICROS timestamp type")
 .exclude("SPARK-10365 timestamp written and read as INT64 - 
TIMESTAMP_MICROS")
 .exclude("SPARK-36182: read TimestampNTZ as TimestampLTZ")
@@ -698,7 +697,6 @@ class VeloxTestSettings extends BackendTestSettings {
 // decimal failed ut
 .exclude("SPARK-34212 Parquet should read decimals correctly")
 // Timestamp is read as INT96.
-.exclude("SPARK-10634 timestamp written and read as INT64 - truncation")
 .exclude("Migration from INT96 to TIMESTAMP_MICROS timestamp type")
 .exclude("SPARK-10365 timestamp written and read as INT64 - 
TIMESTAMP_MICROS")
 .exclude("SPARK-36182: read TimestampNTZ as TimestampLTZ")
diff --git 
a/gluten-ut/spark34/src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala
 
b/gluten-ut/spark34/src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala
index 1c37e787b..d2555007b 100644
--- 
a/gluten-ut/spark34/src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala
+++ 
b/gluten-ut/spark34/src/test/scala/io/glutenproject/utils/velox/VeloxTestSettings.scala
@@ -668,7 +668,6 @@ class VeloxTestSettings extends BackendTestSettings {
 // decimal failed ut
 .exclude("SPARK-34212 Parquet should read decimals correctly")
 // Timestamp is read as INT96.
-.exclude("SPARK-10634 timestamp written and read as INT64 - truncation")
 .exclude("Migration from INT96 to TIMESTAMP_MICROS timestamp type")
 .exclude("SPARK-10365 timestamp written and read as INT64 - 
TIMESTAMP_MICROS")
 .exclude("SPARK-36182: read TimestampNTZ as TimestampLTZ")
@@ -684,7 +683,6 @@ class VeloxTestSettings extends BackendTestSettings {
 // decimal failed ut
 .exclude("SPARK-34212 Parquet should read decimals correctly")
 // Timestamp is read as INT96.
-.exclude("SPARK-10634 timestamp written and read as INT64 - truncation")
 .exclude("Migration from INT96 to TIMESTAMP_MICROS timestamp type")
 .exclude("SPARK-10365 timestamp written and read as INT64 - 
TIMESTAMP_MICROS")
 

Re: [PR] [VL] Enable SPARK-10634 timestamp test case [incubator-gluten]

2024-03-26 Thread via GitHub


rui-mo merged PR #5090:
URL: https://github.com/apache/incubator-gluten/pull/5090





Re: [PR] [VL] Support YearMonthIntervalType and enable make_ym_interval [incubator-gluten]

2024-03-26 Thread via GitHub


marin-ma commented on PR #4798:
URL: 
https://github.com/apache/incubator-gluten/pull/4798#issuecomment-2021758296

   @zzcclp CH CI passed. Could you help review it? Thanks!





Re: [PR] [DNM] Velox test [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #4929:
URL: 
https://github.com/apache/incubator-gluten/pull/4929#issuecomment-2021752956

   Run Gluten Clickhouse CI





(incubator-gluten) branch main updated: [CORE] Move BackendBuildInfo case class from GlutenPlugin to Backend class file (#5129)

2024-03-26 Thread ulyssesyou
This is an automated email from the ASF dual-hosted git repository.

ulyssesyou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git


The following commit(s) were added to refs/heads/main by this push:
 new 6dc7885f6 [CORE] Move BackendBuildInfo case class from GlutenPlugin to 
Backend class file (#5129)
6dc7885f6 is described below

commit 6dc7885f6c54f4ea0f773e920fb455e09298b3b7
Author: Zhen Wang <643348...@qq.com>
AuthorDate: Wed Mar 27 09:15:13 2024 +0800

[CORE] Move BackendBuildInfo case class from GlutenPlugin to Backend class 
file (#5129)
---
 .../io/glutenproject/backendsapi/clickhouse/CHBackend.scala|  6 +++---
 .../io/glutenproject/backendsapi/velox/VeloxBackend.scala  |  6 +++---
 gluten-core/src/main/scala/io/glutenproject/GlutenPlugin.scala |  6 --
 .../src/main/scala/io/glutenproject/backendsapi/Backend.scala  | 10 +++---
 .../io/glutenproject/backendsapi/BackendsApiManager.scala  |  4 +---
 5 files changed, 14 insertions(+), 18 deletions(-)

diff --git 
a/backends-clickhouse/src/main/scala/io/glutenproject/backendsapi/clickhouse/CHBackend.scala
 
b/backends-clickhouse/src/main/scala/io/glutenproject/backendsapi/clickhouse/CHBackend.scala
index fbcb804a3..a7c5c9980 100644
--- 
a/backends-clickhouse/src/main/scala/io/glutenproject/backendsapi/clickhouse/CHBackend.scala
+++ 
b/backends-clickhouse/src/main/scala/io/glutenproject/backendsapi/clickhouse/CHBackend.scala
@@ -16,7 +16,7 @@
  */
 package io.glutenproject.backendsapi.clickhouse
 
-import io.glutenproject.{CH_BRANCH, CH_COMMIT, GlutenConfig, GlutenPlugin}
+import io.glutenproject.{CH_BRANCH, CH_COMMIT, GlutenConfig}
 import io.glutenproject.backendsapi._
 import io.glutenproject.expression.WindowFunctionsBuilder
 import io.glutenproject.extension.ValidationResult
@@ -41,8 +41,8 @@ import scala.util.control.Breaks.{break, breakable}
 
 class CHBackend extends Backend {
   override def name(): String = CHBackend.BACKEND_NAME
-  override def buildInfo(): GlutenPlugin.BackendBuildInfo =
-GlutenPlugin.BackendBuildInfo("ClickHouse", CH_BRANCH, CH_COMMIT, 
"UNKNOWN")
+  override def buildInfo(): BackendBuildInfo =
+BackendBuildInfo("ClickHouse", CH_BRANCH, CH_COMMIT, "UNKNOWN")
   override def iteratorApi(): IteratorApi = new CHIteratorApi
   override def sparkPlanExecApi(): SparkPlanExecApi = new CHSparkPlanExecApi
   override def transformerApi(): TransformerApi = new CHTransformerApi
diff --git 
a/backends-velox/src/main/scala/io/glutenproject/backendsapi/velox/VeloxBackend.scala
 
b/backends-velox/src/main/scala/io/glutenproject/backendsapi/velox/VeloxBackend.scala
index 0ff2bd0d7..3293abe3e 100644
--- 
a/backends-velox/src/main/scala/io/glutenproject/backendsapi/velox/VeloxBackend.scala
+++ 
b/backends-velox/src/main/scala/io/glutenproject/backendsapi/velox/VeloxBackend.scala
@@ -16,7 +16,7 @@
  */
 package io.glutenproject.backendsapi.velox
 
-import io.glutenproject.{GlutenConfig, GlutenPlugin, VELOX_BRANCH, 
VELOX_REVISION, VELOX_REVISION_TIME}
+import io.glutenproject.{GlutenConfig, VELOX_BRANCH, VELOX_REVISION, 
VELOX_REVISION_TIME}
 import io.glutenproject.backendsapi._
 import io.glutenproject.exception.GlutenNotSupportException
 import io.glutenproject.execution.WriteFilesExecTransformer
@@ -44,8 +44,8 @@ import scala.util.control.Breaks.breakable
 
 class VeloxBackend extends Backend {
   override def name(): String = VeloxBackend.BACKEND_NAME
-  override def buildInfo(): GlutenPlugin.BackendBuildInfo =
-    GlutenPlugin.BackendBuildInfo("Velox", VELOX_BRANCH, VELOX_REVISION, VELOX_REVISION_TIME)
+  override def buildInfo(): BackendBuildInfo =
+    BackendBuildInfo("Velox", VELOX_BRANCH, VELOX_REVISION, VELOX_REVISION_TIME)
   override def iteratorApi(): IteratorApi = new IteratorApiImpl
   override def sparkPlanExecApi(): SparkPlanExecApi = new SparkPlanExecApiImpl
   override def transformerApi(): TransformerApi = new TransformerApiImpl
diff --git a/gluten-core/src/main/scala/io/glutenproject/GlutenPlugin.scala b/gluten-core/src/main/scala/io/glutenproject/GlutenPlugin.scala
index c54b78da9..5fa3083c2 100644
--- a/gluten-core/src/main/scala/io/glutenproject/GlutenPlugin.scala
+++ b/gluten-core/src/main/scala/io/glutenproject/GlutenPlugin.scala
@@ -278,10 +278,4 @@ private[glutenproject] object GlutenPlugin {
   implicit def sparkConfImplicit(conf: SparkConf): SparkConfImplicits = {
     new SparkConfImplicits(conf)
   }
-
-  case class BackendBuildInfo(
-      backend: String,
-      backendBranch: String,
-      backendRevision: String,
-      backendRevisionTime: String)
 }
diff --git a/gluten-core/src/main/scala/io/glutenproject/backendsapi/Backend.scala b/gluten-core/src/main/scala/io/glutenproject/backendsapi/Backend.scala
index 438194a36..09799cdb1 100644
--- a/gluten-core/src/main/scala/io/glutenproject/backendsapi/Backend.scala
+++ b/gluten-core/src/main/scala/io/glutenproject/backendsapi/Backend.scala
@@ -16,12 +16,10 @@
  */
 

(incubator-gluten) branch main updated: [GLUTEN-5133]Modify the prompt information for TakeOrderedAndProjectExecTransformer (#5134)

2024-03-26 Thread ulyssesyou
This is an automated email from the ASF dual-hosted git repository.

ulyssesyou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git


The following commit(s) were added to refs/heads/main by this push:
 new 5b8b96e25 [GLUTEN-5133]Modify the prompt information for TakeOrderedAndProjectExecTransformer (#5134)
5b8b96e25 is described below

commit 5b8b96e2541525544ba1e80c957a2bd8c5c1e95b
Author: guixiaowen <58287738+guixiao...@users.noreply.github.com>
AuthorDate: Wed Mar 27 09:14:48 2024 +0800

[GLUTEN-5133]Modify the prompt information for TakeOrderedAndProjectExecTransformer (#5134)
---
 .../glutenproject/execution/TakeOrderedAndProjectExecTransformer.scala  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gluten-core/src/main/scala/io/glutenproject/execution/TakeOrderedAndProjectExecTransformer.scala b/gluten-core/src/main/scala/io/glutenproject/execution/TakeOrderedAndProjectExecTransformer.scala
index 0f0137b5d..f7b1fe2f4 100644
--- a/gluten-core/src/main/scala/io/glutenproject/execution/TakeOrderedAndProjectExecTransformer.scala
+++ b/gluten-core/src/main/scala/io/glutenproject/execution/TakeOrderedAndProjectExecTransformer.scala
@@ -49,7 +49,7 @@ case class TakeOrderedAndProjectExecTransformer(
     val orderByString = truncatedString(sortOrder, "[", ",", "]", maxFields)
     val outputString = truncatedString(output, "[", ",", "]", maxFields)
 
-    s"TakeOrderedAndProjectExecTransform(limit=$limit, " +
+    s"TakeOrderedAndProjectExecTransformer (limit=$limit, " +
       s"orderBy=$orderByString, output=$outputString)"
   }
 


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [I] [GLUTEN-5133]Modify the prompt information for TakeOrderedAndProjectExecTransformer [incubator-gluten]

2024-03-26 Thread via GitHub


ulysses-you closed issue #5133: [GLUTEN-5133]Modify the prompt information for 
TakeOrderedAndProjectExecTransformer
URL: https://github.com/apache/incubator-gluten/issues/5133


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





Re: [PR] [GLUTEN-5133]Modify the prompt information for TakeOrderedAndProjectE… [incubator-gluten]

2024-03-26 Thread via GitHub


ulysses-you merged PR #5134:
URL: https://github.com/apache/incubator-gluten/pull/5134





Re: [PR] [CORE] Move BackendBuildInfo case class from GlutenPlugin to Backend class file [incubator-gluten]

2024-03-26 Thread via GitHub


ulysses-you merged PR #5129:
URL: https://github.com/apache/incubator-gluten/pull/5129
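
For readers following the refactoring in this PR, here is a minimal, self-contained sketch of its effect. It assumes the case class now sits next to the `Backend` trait, which is what the backends' unqualified `BackendBuildInfo` references suggest. The four fields are copied from the case class removed from `GlutenPlugin`; the trimmed-down `Backend` trait, `DemoBackend`, and its placeholder values are hypothetical stand-ins, not the real Gluten sources.

```scala
// Sketch only: simplified stand-ins for illustration, not the real Gluten code.
object BuildInfoSketch {
  // Same four fields as the case class removed from GlutenPlugin in the diff.
  case class BackendBuildInfo(
      backend: String,
      backendBranch: String,
      backendRevision: String,
      backendRevisionTime: String)

  // Hypothetical, trimmed-down trait; the real Backend trait has many more members.
  trait Backend {
    def name(): String
    def buildInfo(): BackendBuildInfo
  }

  // Hypothetical backend with placeholder values.
  class DemoBackend extends Backend {
    override def name(): String = "demo"
    // No GlutenPlugin prefix is needed once the case class lives beside Backend.
    override def buildInfo(): BackendBuildInfo =
      BackendBuildInfo("Demo", "main", "0000000", "UNKNOWN")
  }

  def main(args: Array[String]): Unit = {
    val info = new DemoBackend().buildInfo()
    println(s"${info.backend} ${info.backendBranch} ${info.backendRevision}")
  }
}
```

With this layout a backend needs only the `backendsapi` import it already has, which matches the removal of `GlutenPlugin` from the backend import lists in the commit.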





Re: [PR] [VL][DNM]Test Q95 post probe spill [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5063:
URL: 
https://github.com/apache/incubator-gluten/pull/5063#issuecomment-2021746292

   Run Gluten Clickhouse CI





Re: [PR] [GLUTEN-5083][CH] Invalid result with `mergeTwoPhasesHashBaseAggregateIfNeed` enable [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5137:
URL: 
https://github.com/apache/incubator-gluten/pull/5137#issuecomment-2021735086

   Run Gluten Clickhouse CI





Re: [PR] [GLUTEN-5083][CH] Invalid result with `mergeTwoPhasesHashBaseAggregateIfNeed` enable [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5137:
URL: 
https://github.com/apache/incubator-gluten/pull/5137#issuecomment-2021734843

   https://github.com/apache/incubator-gluten/issues/5083





Re: [PR] [CORE] Support JDK17 [incubator-gluten]

2024-03-26 Thread via GitHub


ulysses-you commented on code in PR #5120:
URL: https://github.com/apache/incubator-gluten/pull/5120#discussion_r1540297481


##
docs/get-started/Velox.md:
##
@@ -5,28 +5,34 @@ nav_order: 1
 parent: Getting-Started
 ---
 # Supported Version
-| Type  | Version                      |
-|-------|------------------------------|
-| Spark | 3.2.2, 3.3.1                 |
-| OS    | Ubuntu20.04/22.04, Centos7/8 |
-| jdk   | openjdk8                     |
-| scala | 2.12
 
-Spark3.4.0 support is still WIP. TPCH/DS can pass, UT is not yet passed.
+| Type  | Version                         |
+|-------|---------------------------------|
+| Spark | 3.2.2, 3.3.1, 3.4.2, 3.5.1(wip) |
+| OS    | Ubuntu20.04/22.04, Centos7/8    |
+| jdk   | openjdk8/jdk17                  |
+| scala | 2.12                            |
 
-There are pending PRs for jdk11 support.
+**JDK17**

Review Comment:
   I moved it to `NewToGluten.md`






Re: [PR] [CORE] Support JDK17 [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5120:
URL: 
https://github.com/apache/incubator-gluten/pull/5120#issuecomment-2021727351

   Run Gluten Clickhouse CI





[I] [VL] Vanilla Spark broadcast exchange + C2R is slow sometimes [incubator-gluten]

2024-03-26 Thread via GitHub


zhztheplayer opened a new issue, #5136:
URL: https://github.com/apache/incubator-gluten/issues/5136

   ### Backend
   
   VL (Velox)
   
   ### Bug description
   
   This is because the code that converts vanilla Spark's hashed relation to Gluten's sometimes produces duplicated rows.
   
   The fix will be incorporated in https://github.com/apache/incubator-gluten/pull/5058, since it can be tested by the ACBO changes.
   
   ### Spark version
   
   None
   
   ### Spark configurations
   
   _No response_
   
   ### System information
   
   _No response_
   
   ### Relevant logs
   
   _No response_





Re: [PR] [CORE] Basic runnable version of ACBO (Advanced CBO) [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5058:
URL: 
https://github.com/apache/incubator-gluten/pull/5058#issuecomment-2021706928

   Run Gluten Clickhouse CI





(incubator-gluten) branch main updated: [VL] Velox patch to avoid installing libunwind-dev no longer works (#5127)

2024-03-26 Thread hongze
This is an automated email from the ASF dual-hosted git repository.

hongze pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git


The following commit(s) were added to refs/heads/main by this push:
 new 2aa60d0ea [VL] Velox patch to avoid installing libunwind-dev no longer works (#5127)
2aa60d0ea is described below

commit 2aa60d0eae8fdd0f4020842c5233ca8a3197bd5e
Author: Hongze Zhang 
AuthorDate: Wed Mar 27 08:26:33 2024 +0800

[VL] Velox patch to avoid installing libunwind-dev no longer works (#5127)
---
 ep/build-velox/src/get_velox.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ep/build-velox/src/get_velox.sh b/ep/build-velox/src/get_velox.sh
index 767585e91..26e7a9cd0 100755
--- a/ep/build-velox/src/get_velox.sh
+++ b/ep/build-velox/src/get_velox.sh
@@ -86,7 +86,7 @@ function process_setup_ubuntu {
   # need set BUILD_SHARED_LIBS flag for thrift
   sed -i  "/facebook\/fbthrift/{n;s/cmake_install -DBUILD_TESTS=OFF/cmake_install -DBUILD_TESTS=OFF -DBUILD_SHARED_LIBS=OFF/;}" scripts/setup-ubuntu.sh
   # Do not install libunwind which can cause interruption when catching native exception.
-  sed -i 's/sudo --preserve-env apt install -y libunwind-dev && //' scripts/setup-ubuntu.sh
+  sed -i 's/${SUDO} apt install -y libunwind-dev//' scripts/setup-ubuntu.sh
   sed -i '/ccache/a\  *thrift* \\' scripts/setup-ubuntu.sh
   sed -i '/ccache/a\  libiberty-dev \\' scripts/setup-ubuntu.sh
   sed -i '/ccache/a\  libxml2-dev \\' scripts/setup-ubuntu.sh
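
The commit title says the old patch "no longer works". A likely reason, sketched below under stated assumptions, is that the old substitution pattern stopped matching the text in `scripts/setup-ubuntu.sh`, so `sed` silently changed nothing. The two patterns are copied from the diff and behave as literal text here. The `assumedLine` value is a guess at the current shape of the upstream line (the email does not quote it), and `strip` is a hypothetical helper that mimics a literal `s/pattern//` substitution.

```scala
// Sketch only: models the sed substitution as a literal string replacement.
object SedPatchSketch {
  // Patterns copied from the diff.
  val oldPattern = "sudo --preserve-env apt install -y libunwind-dev && "
  val newPattern = "${SUDO} apt install -y libunwind-dev"

  // ASSUMPTION: shape of the current line in scripts/setup-ubuntu.sh.
  val assumedLine = "  ${SUDO} apt install -y libunwind-dev"

  // Hypothetical helper: remove every literal occurrence of the pattern.
  def strip(line: String, pattern: String): String = line.replace(pattern, "")

  def main(args: Array[String]): Unit = {
    // Old pattern: no match, so the line (and the install step) is left untouched.
    println(strip(assumedLine, oldPattern) == assumedLine)
    // New pattern: the install command is removed, leaving only whitespace.
    println(strip(assumedLine, newPattern).trim.isEmpty)
  }
}
```

Because `sed -i` exits successfully even when nothing matches, a pattern that has drifted from the upstream script fails silently, which is consistent with the problem described in the commit title.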





Re: [PR] [VL] Velox patch to avoid installing libunwind-dev no longer works [incubator-gluten]

2024-03-26 Thread via GitHub


zhztheplayer merged PR #5127:
URL: https://github.com/apache/incubator-gluten/pull/5127





Re: [PR] [VL] Velox patch to avoid installing libunwind-dev no longer works [incubator-gluten]

2024-03-26 Thread via GitHub


zhztheplayer commented on PR #5127:
URL: 
https://github.com/apache/incubator-gluten/pull/5127#issuecomment-2021704715

   cc @PHILO-HE 





Re: [PR] [VL][MINOR] Refactor operator/function tests [incubator-gluten]

2024-03-26 Thread via GitHub


GlutenPerfBot commented on PR #5037:
URL: 
https://github.com/apache/incubator-gluten/pull/5037#issuecomment-2021667403

   Performance report for TPCH SF2000 with Velox backend, for reference only

   | query | log/native_master_03_26_2024_time.csv | log/native_master_03_25_2024_2ce826995_time.csv | difference | percentage |
   |-------|---------------------------------------|--------------------------------------------------|------------|------------|
   | q1    | 38.64                                 | 34.62                                            | -4.022     | 89.59%     |
   | q2    | 24.52                                 | 23.57                                            | -0.953     | 96.11%     |
   | q3    | 38.16                                 | 36.77                                            | -1.391     | 96.35%     |
   | q4    | 39.39                                 | 38.42                                            | -0.974     | 97.53%     |
   | q5    | 69.01                                 | 67.14                                            | -1.866     | 97.30%     |
   | q6    | 5.96                                  | 7.58                                             | 1.616      | 127.11%    |
   | q7    | 82.13                                 | 85.17                                            | 3.034      | 103.69%    |
   | q8    | 84.26                                 | 86.22                                            | 1.961      | 102.33%    |
   | q9    | 119.85                                | 121.73                                           | 1.884      | 101.57%    |
   | q10   | 43.07                                 | 44.57                                            | 1.492      | 103.46%    |
   | q11   | 20.06                                 | 20.62                                            | 0.562      | 102.80%    |
   | q12   | 26.89                                 | 27.49                                            | 0.598      | 102.22%    |
   | q13   | 47.06                                 | 46.45                                            | -0.604     | 98.72%     |
   | q14   | 21.39                                 | 17.85                                            | -3.538     | 83.46%     |
   | q15   | 30.98                                 | 29.13                                            | -1.854     | 94.02%     |
   | q16   | 13.81                                 | 15.29                                            | 1.474      | 110.67%    |
   | q17   | 101.16                                | 98.96                                            | -2.201     | 97.82%     |
   | q18   | 141.47                                | 141.61                                           | 0.146      | 100.10%    |
   | q19   | 13.65                                 | 13.63                                            | -0.017     | 99.87%     |
   
 

Re: [PR] [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240327) [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5135:
URL: 
https://github.com/apache/incubator-gluten/pull/5135#issuecomment-2021611455

   Run Gluten Clickhouse CI





[PR] [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240327) [incubator-gluten]

2024-03-26 Thread via GitHub


lwz9103 opened a new pull request, #5135:
URL: https://github.com/apache/incubator-gluten/pull/5135

   Auto commit by gluten daily build, please check the build status and merge 
it if it's green.





Re: [PR] [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240327) [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5135:
URL: 
https://github.com/apache/incubator-gluten/pull/5135#issuecomment-2021611241

   https://github.com/apache/incubator-gluten/issues/1632





Re: [I] Crash when writing an array of struct [incubator-gluten]

2024-03-26 Thread via GitHub


clee704 commented on issue #4964:
URL: 
https://github.com/apache/incubator-gluten/issues/4964#issuecomment-2021599373

   @JkSelf Actually it crashes on Spark 3.4 too.
   
   #
   # A fatal error has been detected by the Java Runtime Environment:
   #
   #  SIGSEGV (0xb) at pc=0x7fa4a369e2c5, pid=915090, tid=915112
   #
   # JRE version: OpenJDK Runtime Environment (11.0.22+7) (build 11.0.22+7-post-Ubuntu-0ubuntu222.04.1)
   # Java VM: OpenJDK 64-Bit Server VM (11.0.22+7-post-Ubuntu-0ubuntu222.04.1, mixed mode, tiered, g1 gc, linux-amd64)
   # Problematic frame:
   # C  [libvelox.so+0x229e2c5]  (anonymous namespace)::makeRowVector(std::vector, std::allocator > > const&)+0xa5
   #
   # Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /ssd/chungmin/repos/spark/core.915090)
   #
   # An error report file with more information is saved as:
   # /ssd/chungmin/repos/spark/hs_err_pid915090.log
   #
   # If you would like to submit a bug report, please visit:
   #   https://bugs.launchpad.net/ubuntu/+source/openjdk-lts
   # The crash happened outside the Java Virtual Machine in native code.
   # See problematic frame for where to report the bug.
   #
   
   Spark 3.4.2
   Gluten 58a459bf487120208a774d7959f7c7db417f490b





Re: [PR] [GLUTEN-5133]Modify the prompt information for TakeOrderedAndProjectE… [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5134:
URL: 
https://github.com/apache/incubator-gluten/pull/5134#issuecomment-2020923742

   Run Gluten Clickhouse CI





Re: [PR] [GLUTEN-5133]Modify the prompt information for TakeOrderedAndProjectE… [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5134:
URL: 
https://github.com/apache/incubator-gluten/pull/5134#issuecomment-2020922735

   https://github.com/apache/incubator-gluten/issues/5133





[PR] [GLUTEN-5133]Modify the prompt information for TakeOrderedAndProjectE… [incubator-gluten]

2024-03-26 Thread via GitHub


guixiaowen opened a new pull request, #5134:
URL: https://github.com/apache/incubator-gluten/pull/5134

   …xecTransformer #5133
   
   ## What changes were proposed in this pull request?
   
   In TakeOrderedAndProjectExecTransformer, the prompt information (the node name printed in the plan) is different from that of other operators.
   
   For example:
   
   spark-sql>explain select a from test.tablea order by a limit 5
   plan
   == Physical Plan ==
   VeloxColumnarToRowExec
   +- TakeOrderedAndProjectExecTransform(limit=5, orderBy=[a#13 ASC NULLS FIRST], output=[a#13])
   +- ^(3) NativeScan hive test.tablea [a#13], HiveTableRelation [test.tablea, org.apache.hadoop.hive.ql.io.orc.OrcSerde, Data Cols: [a#13], Partition Cols: []]
   
   "TakeOrderedAndProjectExecTransform" is changed to "TakeOrderedAndProjectExecTransformer", which makes it consistent with the style of the other prompt information.
   
   (Fixes: \#5133)
   
   
   After this pr:
   
   spark-sql>explain select a from test.tablea order by a limit 5
   plan
   == Physical Plan ==
   VeloxColumnarToRowExec
   +- TakeOrderedAndProjectExecTransformer (limit=5, orderBy=[a#13 ASC NULLS 
FIRST], output=[a#13])
   +- ^(3) NativeScan hive test.tablea [a#13], HiveTableRelation [test.tablea, 
org.apache.hadoop.hive.ql.io.orc.OrcSerde, Data Cols: [a#13], Partition Cols: 
[]]
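
To make the change concrete, here is a small sketch of how the new plan string is assembled. The format string is taken from the commit diff for this PR; `truncated` is a simplified, hypothetical stand-in for Spark's `truncatedString` helper (it only approximates the real truncation behaviour), and `SimpleStringSketch` and `simpleString` are illustrative names.

```scala
// Sketch only: mirrors the string format from the diff with a simplified helper.
object SimpleStringSketch {
  // ASSUMPTION: simplified stand-in for Spark's truncatedString. It keeps at most
  // maxFields entries and appends a note saying how many were dropped.
  def truncated(items: Seq[String], maxFields: Int): String =
    if (items.length > maxFields) {
      val dropped = items.length - maxFields
      (items.take(maxFields) :+ s"... $dropped more fields").mkString("[", ",", "]")
    } else {
      items.mkString("[", ",", "]")
    }

  def simpleString(
      limit: Int,
      sortOrder: Seq[String],
      output: Seq[String],
      maxFields: Int): String = {
    val orderByString = truncated(sortOrder, maxFields)
    val outputString = truncated(output, maxFields)
    // Full class name, followed by a space before the argument list.
    s"TakeOrderedAndProjectExecTransformer (limit=$limit, " +
      s"orderBy=$orderByString, output=$outputString)"
  }

  def main(args: Array[String]): Unit =
    println(simpleString(5, Seq("a#13 ASC NULLS FIRST"), Seq("a#13"), maxFields = 25))
}
```

With the inputs from the example query, this prints the same node line as the "After this pr" plan above.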
   
   ## How was this patch tested?
   
   
   





[I] Modify the prompt information for TakeOrderedAndProjectExecTransformer [incubator-gluten]

2024-03-26 Thread via GitHub


guixiaowen opened a new issue, #5133:
URL: https://github.com/apache/incubator-gluten/issues/5133

   ### Description
   
   In TakeOrderedAndProjectExecTransformer, the prompt information (the node name printed in the plan) is different from that of other operators.
   
   For example:
   
   spark-sql>explain select a from test.tablea  order by a limit 5
   plan
   == Physical Plan ==
   VeloxColumnarToRowExec
   +- TakeOrderedAndProjectExecTransform(limit=5, orderBy=[a#13 ASC NULLS FIRST], output=[a#13])
      +- ^(3) NativeScan hive test.tablea [a#13], HiveTableRelation [`test`.`tablea`, org.apache.hadoop.hive.ql.io.orc.OrcSerde, Data Cols: [a#13], Partition Cols: []]
   
   
   "TakeOrderedAndProjectExecTransform" should be changed to "TakeOrderedAndProjectExecTransformer", which would make it consistent with the style of the other prompt information.
   
   
   
   
   





[I] Modify the prompt information for TakeOrderedAndProjectExecTransformer's simpleString [incubator-gluten]

2024-03-26 Thread via GitHub


guixiaowen opened a new issue, #5132:
URL: https://github.com/apache/incubator-gluten/issues/5132

   ### Description
   
   In TakeOrderedAndProjectExecTransformer, the prompt information (the node name printed in the plan) is different from that of other operators.
   
   For example:
   
   spark-sql>explain select a from test.tablea  order by a limit 5
   plan
   == Physical Plan ==
   VeloxColumnarToRowExec
   +- TakeOrderedAndProjectExecTransform(limit=5, orderBy=[a#13 ASC NULLS FIRST], output=[a#13])
      +- ^(3) NativeScan hive test.tablea [a#13], HiveTableRelation [`test`.`tablea`, org.apache.hadoop.hive.ql.io.orc.OrcSerde, Data Cols: [a#13], Partition Cols: []]
   
   
   "TakeOrderedAndProjectExecTransform" should be changed to "TakeOrderedAndProjectExecTransformer", which would make it consistent with the style of the other prompt information.
   
   
   
   
   





Re: [PR] [VL] Support YearMonthIntervalType and enable make_ym_interval [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #4798:
URL: 
https://github.com/apache/incubator-gluten/pull/4798#issuecomment-2020860684

   Run Gluten Clickhouse CI





Re: [I] [VL] Unsupported spark function list [please leave a comment if you plan to pick some] [incubator-gluten]

2024-03-26 Thread via GitHub


supermem613 commented on issue #4039:
URL: 
https://github.com/apache/incubator-gluten/issues/4039#issuecomment-2020700016

   I'd like to pick up base64 and unbase64, please. 
   
   (FYI, looks like there was a PR above for unbase64, but it seems to have 
been closed without committing ~45-55 days ago, so hopefully I am not 
conflicting with any work).





Re: [PR] [GLUTEN-5123][INFRA]set up java and maven according to os in build_bundle_package.yml [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5124:
URL: 
https://github.com/apache/incubator-gluten/pull/5124#issuecomment-2020688726

   Run Gluten Clickhouse CI





Re: [PR] [CORE] Enable second Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5131:
URL: 
https://github.com/apache/incubator-gluten/pull/5131#issuecomment-2020597809

   Run Gluten Clickhouse CI





Re: [I] [VL] spark.read.csv("/tmp/test.csv") throws Exception [incubator-gluten]

2024-03-26 Thread via GitHub


xumingming commented on issue #5044:
URL: 
https://github.com/apache/incubator-gluten/issues/5044#issuecomment-2020579428

   @PHILO-HE Thanks for the information! I tried with Parquet data (the nation table in TPCH); the details are as follows:
   
   ```
   == Fallback Summary ==
   (4) Project: Not supported to map spark function name to substrait function 
name: toprettystring(n_nationkey#23, Some(Asia/Shanghai)), class name: 
ToPrettyString.
   (5) CollectLimit: Gluten does not touch it or does not support it
   
   == Physical Plan ==
   CollectLimit (5)
   +- Project (4)
  +- VeloxColumnarToRowExec (3)
 +- ^ Scan parquet  (1)
   ```
   
   Is the fallback for `Project` expected?





Re: [PR] [VL][CI] Enable Celeborn tests & Gluten CPP tests [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5114:
URL: 
https://github.com/apache/incubator-gluten/pull/5114#issuecomment-2020420744

   Run Gluten Clickhouse CI





Re: [PR] [GLUTEN-5123][INFRA]set up java and maven according to os in build_bundle_package.yml [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5124:
URL: 
https://github.com/apache/incubator-gluten/pull/5124#issuecomment-2020416623

   Run Gluten Clickhouse CI





Re: [PR] [CORE] Support JDK17 [incubator-gluten]

2024-03-26 Thread via GitHub


PHILO-HE commented on code in PR #5120:
URL: https://github.com/apache/incubator-gluten/pull/5120#discussion_r1539216963


##
.github/workflows/velox_docker.yml:
##
@@ -73,6 +73,17 @@ jobs:
   matrix:
 os: ["ubuntu:20.04", "ubuntu:22.04"]
 spark: ["spark-3.2", "spark-3.3", "spark-3.4", "spark-3.5"]
+java: [ "java-8", "java-17" ]
+# Spark supports JDK17 since 3.3 and later, see https://issues.apache.org/jira/browse/SPARK-33772
+exclude:

Review Comment:
   It looks like `include` cannot make it more concise. Please ignore this comment. Thanks!
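
To illustrate why `exclude` stays compact here, the following is a minimal sketch in shell that enumerates the `os` x `spark` x `java` matrix from the diff above. It assumes the elided `exclude:` entry drops only the `spark-3.2` with `java-17` pairing (based on the SPARK-33772 comment); the function name `list_jobs` is hypothetical and not part of the workflow.

```shell
#!/bin/sh
# Enumerate the job matrix from the workflow diff: os x spark x java.
# Assumption: the elided `exclude:` entry removes java-17 for spark-3.2 only.
list_jobs() {
  for os in ubuntu:20.04 ubuntu:22.04; do
    for spark in spark-3.2 spark-3.3 spark-3.4 spark-3.5; do
      for java in java-8 java-17; do
        # A single partial-match exclude rule covers both os values.
        if [ "$spark" = "spark-3.2" ] && [ "$java" = "java-17" ]; then
          continue
        fi
        echo "$os $spark $java"
      done
    done
  done
}

# Count the jobs that remain after the exclusion.
list_jobs | wc -l
```

Under that assumption, one exclusion rule leaves 14 of the 16 combinations, whereas an `include`-based matrix would likely have to spell out each extra `java-17` combination individually.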






Re: [PR] [CORE] Enable second Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5131:
URL: 
https://github.com/apache/incubator-gluten/pull/5131#issuecomment-2020391322

   Run Gluten Clickhouse CI





Re: [PR] [CORE] Enable second Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5131:
URL: 
https://github.com/apache/incubator-gluten/pull/5131#issuecomment-2020388375

   Run Gluten Clickhouse CI





Re: [PR] [CORE] Enable second Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5131:
URL: 
https://github.com/apache/incubator-gluten/pull/5131#issuecomment-2020371865

   
   
   Thanks for opening a pull request!
   
   Could you open an issue for this pull request on GitHub Issues?
   
   https://github.com/apache/incubator-gluten/issues
   
   Then could you also rename ***commit message*** and ***pull request title*** 
in the following format?
   
   [GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}
   
   See also:
   
 * [Other pull requests](https://github.com/apache/incubator-gluten/pulls/)
   
   





Re: [PR] [CORE] Support JDK17 [incubator-gluten]

2024-03-26 Thread via GitHub


PHILO-HE commented on code in PR #5120:
URL: https://github.com/apache/incubator-gluten/pull/5120#discussion_r1539159436


##
.github/workflows/velox_docker.yml:
##
@@ -84,69 +95,49 @@ jobs:
   path: ./cpp/build/releases
   - name: Setup java and maven
 run: |
-  apt-get update && \
-  apt-get install -y openjdk-8-jdk maven && \
+  if [ "${{ matrix.java }}" = "java-17" ]; then
+apt-get update && apt-get install -y openjdk-17-jdk maven
+  else
+apt-get update && apt-get install -y openjdk-8-jdk maven
+  fi
   apt remove openjdk-11* -y
-  - name: Build for Spark ${{ matrix.spark }}
-run: |
-  cd $GITHUB_WORKSPACE/ && \
-  mvn clean install -P${{ matrix.spark }} -Pbackends-velox -DskipTests
-  - name: Build and run TPCH/DS ${{ matrix.spark }}
-run: |
-  cd $GITHUB_WORKSPACE/tools/gluten-it && \
-  mvn clean install -P${{ matrix.spark }} \
-  && GLUTEN_IT_JVM_ARGS=-Xmx5G sbin/gluten-it.sh queries-compare \
---local --preset=velox --benchmark-type=h --error-on-memleak 
--off-heap-size=10g -s=1.0 --threads=16 --iterations=1 \
-  && GLUTEN_IT_JVM_ARGS=-Xmx5G sbin/gluten-it.sh queries-compare \
---local --preset=velox --benchmark-type=ds --error-on-memleak 
--off-heap-size=10g -s=1.0 --threads=16 --iterations=1
-
-
-  run-tpc-test-centos7:
-needs: build-native-lib
-strategy:
-  fail-fast: false
-  matrix:
-spark: ["spark-3.2", "spark-3.3", "spark-3.4", "spark-3.5"]
-runs-on: ubuntu-20.04
-container: centos:7
-steps:
-  - uses: actions/checkout@v2
-  - name: Download All Artifacts
-uses: actions/download-artifact@v2
-with:
-  name: velox-native-lib-${{github.sha}}
-  path: ./cpp/build/releases
-  - name: Setup java and maven
-run: |
-  yum update -y && yum install -y java-1.8.0-openjdk-devel wget
-  wget 
https://downloads.apache.org/maven/maven-3/3.8.8/binaries/apache-maven-3.8.8-bin.tar.gz
-  tar -xvf apache-maven-3.8.8-bin.tar.gz
-  mv apache-maven-3.8.8 /usr/lib/maven
-  - name: Build for Spark ${{ matrix.spark }}
+  - name: Build and run TPCH/DS
 run: |
   cd $GITHUB_WORKSPACE/
-  export MAVEN_HOME=/usr/lib/maven
-  export PATH=${PATH}:${MAVEN_HOME}/bin
-  mvn clean install -P${{ matrix.spark }} -Pbackends-velox -DskipTests
-  - name: Build and run TPCH/DS ${{ matrix.spark }}
-run: |
-  cd $GITHUB_WORKSPACE/tools/gluten-it 
-  export MAVEN_HOME=/usr/lib/maven
-  export PATH=${PATH}:${MAVEN_HOME}/bin
-  mvn clean install -P${{ matrix.spark }} \
+  if [ "${{ matrix.java }}" = "java-17" ]; then
+export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64

Review Comment:
   Nit:
   export JAVA_HOME=/usr/lib/jvm/${{ matrix.java }}-openjdk-amd64
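
The suggestion above works because the matrix values (`java-8`, `java-17`) already match the prefix of the JDK directory name. A minimal sketch of that substitution in plain shell, assuming the Ubuntu amd64 OpenJDK package layout under `/usr/lib/jvm`; the helper `java_home_for` is hypothetical and only stands in for the `${{ matrix.java }}` expression:

```shell
#!/bin/sh
# Hypothetical helper: stands in for the `${{ matrix.java }}` substitution
# that GitHub Actions performs before the step's script runs.
java_home_for() {
  # $1 is a matrix value such as "java-8" or "java-17".
  printf '/usr/lib/jvm/%s-openjdk-amd64\n' "$1"
}

# Derive JAVA_HOME from the matrix value instead of branching on it.
JAVA_HOME="$(java_home_for java-17)"
export JAVA_HOME
echo "$JAVA_HOME"
```

With this form the `if`/`else` around `JAVA_HOME` is no longer needed, provided the installed JDK packages really use these directory names on the target image.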



##
.github/workflows/velox_docker.yml:
##
@@ -73,6 +73,17 @@ jobs:
   matrix:
 os: ["ubuntu:20.04", "ubuntu:22.04"]
 spark: ["spark-3.2", "spark-3.3", "spark-3.4", "spark-3.5"]
+java: [ "java-8", "java-17" ]
+# Spark supports JDK17 since 3.3 and later, see https://issues.apache.org/jira/browse/SPARK-33772
+exclude:

Review Comment:
   Nit: it may be better to use `include` for simplicity.



##
docs/get-started/Velox.md:
##
@@ -5,28 +5,34 @@ nav_order: 1
 parent: Getting-Started
 ---
 # Supported Version
-| Type  | Version  |
-|---|--|
-| Spark | 3.2.2, 3.3.1 |
-| OS| Ubuntu20.04/22.04, Centos7/8 |
-| jdk   | openjdk8 |
-| scala | 2.12
 
-Spark3.4.0 support is still WIP. TPCH/DS can pass, UT is not yet passed.
+| Type  | Version |
+|---|-|
+| Spark | 3.2.2, 3.3.1, 3.4.2, 3.5.1(wip) |
+| OS| Ubuntu20.04/22.04, Centos7/8|
+| jdk   | openjdk8/jdk17  |
+| scala | 2.12|
 
-There are pending PRs for jdk11 support.
+**JDK17**

Review Comment:
   It may be better to document this part in a common place, as it is not specific to the Velox backend.



##
.github/workflows/velox_docker.yml:
##
@@ -84,69 +95,49 @@ jobs:
   path: ./cpp/build/releases
   - name: Setup java and maven
 run: |
-  apt-get update && \
-  apt-get install -y openjdk-8-jdk maven && \
+  if [ "${{ matrix.java }}" = "java-17" ]; then
+apt-get update && apt-get install -y openjdk-17-jdk maven
+  else
+apt-get update && apt-get install -y openjdk-8-jdk maven
+  fi
   apt remove openjdk-11* -y
-  - name: Build for Spark ${{ matrix.spark }}
-run: |
-  cd $GITHUB_WORKSPACE/ && \
-  mvn clean 

Re: [PR] [CORE] Enable second Spark function [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5131:
URL: 
https://github.com/apache/incubator-gluten/pull/5131#issuecomment-2020372832

   Run Gluten Clickhouse CI





Re: [PR] [VL] Enable SPARK-10634 timestamp test case [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5090:
URL: 
https://github.com/apache/incubator-gluten/pull/5090#issuecomment-2020340206

   Run Gluten Clickhouse CI





Re: [PR] [VL] Support YearMonthIntervalType and enable make_ym_interval [incubator-gluten]

2024-03-26 Thread via GitHub


zzcclp commented on PR #4798:
URL: 
https://github.com/apache/incubator-gluten/pull/4798#issuecomment-2020336897

   It seems there are some `SPARK-36830: Support reading and writing ANSI intervals` tests which are not disabled for Spark 3.3.





Re: [PR] [VL] Enable SPARK-10634 timestamp test case [incubator-gluten]

2024-03-26 Thread via GitHub


rui-mo commented on code in PR #5090:
URL: https://github.com/apache/incubator-gluten/pull/5090#discussion_r1539108319


##
ep/build-velox/src/get_velox.sh:
##
@@ -16,8 +16,8 @@
 
 set -exu
 
-VELOX_REPO=https://github.com/oap-project/velox.git
-VELOX_BRANCH=2024_03_25
+VELOX_REPO=https://github.com/liujiayi771/velox.git
+VELOX_BRANCH=2024_03_25_ts_fix

Review Comment:
   Could you revert this change?






Re: [PR] [VL][MINOR] Refactor operator/function tests [incubator-gluten]

2024-03-26 Thread via GitHub


PHILO-HE merged PR #5037:
URL: https://github.com/apache/incubator-gluten/pull/5037





(incubator-gluten) branch main updated: [VL][MINOR] Refactor operator/function validation tests (#5037)

2024-03-26 Thread philo
This is an automated email from the ASF dual-hosted git repository.

philo pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git


The following commit(s) were added to refs/heads/main by this push:
 new adf0566a2 [VL][MINOR] Refactor operator/function validation tests 
(#5037)
adf0566a2 is described below

commit adf0566a2056276612694bc980f4e6e9028eb7d1
Author: PHILO-HE 
AuthorDate: Tue Mar 26 20:12:28 2024 +0800

[VL][MINOR] Refactor operator/function validation tests (#5037)
---
 ...lutenClickHouseWholeStageTransformerSuite.scala |   1 -
 .../benchmarks/NativeBenchmarkPlanGenerator.scala  |   1 -
 .../benchmarks/ShuffleWriterFuzzerTest.scala   |   1 -
 .../io/glutenproject/execution/FallbackSuite.scala |   1 -
 ...sionSuite.scala => FunctionsValidateTest.scala} |  71 -
 ...te.scala => ScalarFunctionsValidateSuite.scala} | 162 +++--
 .../io/glutenproject/execution/TestOperator.scala  | 135 ++---
 .../execution/VeloxAggregateFunctionsSuite.scala   |   1 -
 .../execution/VeloxColumnarCacheSuite.scala|   1 -
 .../execution/VeloxHashJoinSuite.scala |   1 -
 .../execution/VeloxLiteralSuite.scala  |   1 -
 .../execution/VeloxMetricsSuite.scala  |   1 -
 .../VeloxOrcDataTypeValidationSuite.scala  |   1 -
 .../VeloxParquetDataTypeValidationSuite.scala  |   1 -
 .../glutenproject/execution/VeloxScanSuite.scala   |   1 -
 .../execution/VeloxStringFunctionsSuite.scala  |   1 -
 .../glutenproject/execution/VeloxTPCDSSuite.scala  |   1 -
 .../glutenproject/execution/VeloxTPCHSuite.scala   |   1 -
 .../execution/VeloxWindowExpressionSuite.scala |   1 -
 .../execution/WindowFunctionsValidateSuite.scala   |  35 +
 .../sql/execution/VeloxParquetWriteSuite.scala |   1 -
 .../execution/WholeStageTransformerSuite.scala |   1 -
 .../glutenproject/execution/VeloxDeltaSuite.scala  |   1 -
 .../execution/VeloxIcebergSuite.scala  |   1 -
 24 files changed, 175 insertions(+), 248 deletions(-)

diff --git 
a/backends-clickhouse/src/test/scala/io/glutenproject/execution/GlutenClickHouseWholeStageTransformerSuite.scala
 
b/backends-clickhouse/src/test/scala/io/glutenproject/execution/GlutenClickHouseWholeStageTransformerSuite.scala
index e40f3d0e7..a2de7cf51 100644
--- 
a/backends-clickhouse/src/test/scala/io/glutenproject/execution/GlutenClickHouseWholeStageTransformerSuite.scala
+++ 
b/backends-clickhouse/src/test/scala/io/glutenproject/execution/GlutenClickHouseWholeStageTransformerSuite.scala
@@ -70,7 +70,6 @@ class GlutenClickHouseWholeStageTransformerSuite extends 
WholeStageTransformerSu
   protected val metaStorePathAbsolute = basePath + "/meta"
   protected val hiveMetaStoreDB = metaStorePathAbsolute + "/metastore_db"
 
-  override protected val backend: String = "ch"
   final override protected val resourcePath: String = "" // ch not need this
   override protected val fileFormat: String = "parquet"
 }
diff --git 
a/backends-velox/src/test/scala/io/glutenproject/benchmarks/NativeBenchmarkPlanGenerator.scala
 
b/backends-velox/src/test/scala/io/glutenproject/benchmarks/NativeBenchmarkPlanGenerator.scala
index c9863111a..dafe3af3e 100644
--- 
a/backends-velox/src/test/scala/io/glutenproject/benchmarks/NativeBenchmarkPlanGenerator.scala
+++ 
b/backends-velox/src/test/scala/io/glutenproject/benchmarks/NativeBenchmarkPlanGenerator.scala
@@ -35,7 +35,6 @@ import scala.collection.JavaConverters._
 object GenerateExample extends Tag("io.glutenproject.tags.GenerateExample")
 
 class NativeBenchmarkPlanGenerator extends VeloxWholeStageTransformerSuite {
-  override protected val backend: String = "velox"
   override protected val resourcePath: String = "/tpch-data-parquet-velox"
   override protected val fileFormat: String = "parquet"
   val generatedPlanDir = getClass.getResource("/").getPath + 
"../../../generated-native-benchmark/"
diff --git 
a/backends-velox/src/test/scala/io/glutenproject/benchmarks/ShuffleWriterFuzzerTest.scala
 
b/backends-velox/src/test/scala/io/glutenproject/benchmarks/ShuffleWriterFuzzerTest.scala
index 9d723f04f..7f863de68 100644
--- 
a/backends-velox/src/test/scala/io/glutenproject/benchmarks/ShuffleWriterFuzzerTest.scala
+++ 
b/backends-velox/src/test/scala/io/glutenproject/benchmarks/ShuffleWriterFuzzerTest.scala
@@ -37,7 +37,6 @@ object ShuffleWriterFuzzerTest {
 @FuzzerTest
 @SkipTestTags
 class ShuffleWriterFuzzerTest extends VeloxWholeStageTransformerSuite {
-  override protected val backend: String = "velox"
   override protected val resourcePath: String = "/tpch-data-parquet-velox"
   override protected val fileFormat: String = "parquet"
 
diff --git 
a/backends-velox/src/test/scala/io/glutenproject/execution/FallbackSuite.scala 
b/backends-velox/src/test/scala/io/glutenproject/execution/FallbackSuite.scala
index d9b1b4604..69e5b614c 100644
--- 

Re: [PR] [CORE] Support JDK17 [incubator-gluten]

2024-03-26 Thread via GitHub


ulysses-you commented on PR #5120:
URL: 
https://github.com/apache/incubator-gluten/pull/5120#issuecomment-2020149985

   cc @zhztheplayer @zhouyuan @PHILO-HE, in case you have other comments.





Re: [PR] [GLUTEN-5123][INFRA]set up java and maven according to os in build_bundle_package.yml [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5124:
URL: 
https://github.com/apache/incubator-gluten/pull/5124#issuecomment-2020149678

   Run Gluten Clickhouse CI





Re: [PR] [MINOR] Remove redundant string format [incubator-gluten]

2024-03-26 Thread via GitHub


ulysses-you merged PR #5126:
URL: https://github.com/apache/incubator-gluten/pull/5126





(incubator-gluten) branch main updated: [MINOR] Remove redundant string format (#5126)

2024-03-26 Thread ulyssesyou
This is an automated email from the ASF dual-hosted git repository.

ulyssesyou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git


The following commit(s) were added to refs/heads/main by this push:
 new dba4bcd3c [MINOR] Remove redundant string format (#5126)
dba4bcd3c is described below

commit dba4bcd3c4587f91296cd2387dc089c8c7f4b970
Author: Zhen Wang <643348...@qq.com>
AuthorDate: Tue Mar 26 19:02:41 2024 +0800

[MINOR] Remove redundant string format (#5126)
---
 gluten-core/src/main/scala/io/glutenproject/GlutenPlugin.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gluten-core/src/main/scala/io/glutenproject/GlutenPlugin.scala 
b/gluten-core/src/main/scala/io/glutenproject/GlutenPlugin.scala
index 670c9411d..c54b78da9 100644
--- a/gluten-core/src/main/scala/io/glutenproject/GlutenPlugin.scala
+++ b/gluten-core/src/main/scala/io/glutenproject/GlutenPlugin.scala
@@ -141,7 +141,7 @@ private[glutenproject] class GlutenDriverPlugin extends 
DriverPlugin with Loggin
 } else {
   s"$GLUTEN_SESSION_EXTENSION_NAME"
 }
-conf.set(SPARK_SESSION_EXTS_KEY, String.format("%s", extensions))
+conf.set(SPARK_SESSION_EXTS_KEY, extensions)
 
 // off-heap bytes
 if (!conf.contains(GlutenConfig.GLUTEN_OFFHEAP_SIZE_KEY)) {





Re: [PR] [VL](WIP) Support native UDAF [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5130:
URL: 
https://github.com/apache/incubator-gluten/pull/5130#issuecomment-2020090416

   
   
   Thanks for opening a pull request!
   
   Could you open an issue for this pull request on GitHub Issues?
   
   https://github.com/apache/incubator-gluten/issues
   
   Then could you also rename ***commit message*** and ***pull request title*** 
in the following format?
   
   [GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}
   
   See also:
   
 * [Other pull requests](https://github.com/apache/incubator-gluten/pulls/)
   
   





[PR] [VL](WIP) Support native UDAF [incubator-gluten]

2024-03-26 Thread via GitHub


marin-ma opened a new pull request, #5130:
URL: https://github.com/apache/incubator-gluten/pull/5130

   (no comment)





Re: [PR] [CORE] Move BackendBuildInfo case class from GlutenPlugin to Backend class file [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5129:
URL: 
https://github.com/apache/incubator-gluten/pull/5129#issuecomment-2020078075

   
   
   Thanks for opening a pull request!
   
   Could you open an issue for this pull request on GitHub Issues?
   
   https://github.com/apache/incubator-gluten/issues
   
   Then could you also rename ***commit message*** and ***pull request title*** 
in the following format?
   
   [GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}
   
   See also:
   
 * [Other pull requests](https://github.com/apache/incubator-gluten/pulls/)
   
   





Re: [PR] [CORE] Move BackendBuildInfo case class from GlutenPlugin to Backend class file [incubator-gluten]

2024-03-26 Thread via GitHub


github-actions[bot] commented on PR #5129:
URL: 
https://github.com/apache/incubator-gluten/pull/5129#issuecomment-2020078525

   Run Gluten Clickhouse CI





[PR] [CORE] Move BackendBuildInfo case class from GlutenPlugin to Backend class file [incubator-gluten]

2024-03-26 Thread via GitHub


wForget opened a new pull request, #5129:
URL: https://github.com/apache/incubator-gluten/pull/5129

   ## What changes were proposed in this pull request?
   
   The `BackendBuildInfo` case class seems more appropriate in the `Backend` class file, so I moved it.
   
   ## How was this patch tested?
   
   
   




