Re: [PR] [GLUTEN-3582] Support FLBAType and BOOLEAN [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5962: URL: https://github.com/apache/incubator-gluten/pull/5962#issuecomment-2144914650 https://github.com/apache/incubator-gluten/issues/3582 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] [GLUTEN-3582] Support FLBAType and BOOLEAN [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5962: URL: https://github.com/apache/incubator-gluten/pull/5962#issuecomment-2144915110 Run Gluten Clickhouse CI

[PR] [GLUTEN-3582] Support FLBAType and BOOLEAN [incubator-gluten]

2024-06-03 Thread via GitHub
baibaichen opened a new pull request, #5962: URL: https://github.com/apache/incubator-gluten/pull/5962 ## What changes were proposed in this pull request? Support FLBAType and BOOLEAN (Fixes: #3582) ## How was this patch tested? new UTs

Re: [PR] [GLUTEN-5959] Fix function replace report an error with null value [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5960: URL: https://github.com/apache/incubator-gluten/pull/5960#issuecomment-2144907665 Run Gluten Clickhouse CI

Re: [PR] [VL] Fix shuffle with null type failure [incubator-gluten]

2024-06-03 Thread via GitHub
ulysses-you commented on PR #5961: URL: https://github.com/apache/incubator-gluten/pull/5961#issuecomment-2144842330 cc @marin-ma thank you

[PR] [VL] Fix shuffle with null type failure [incubator-gluten]

2024-06-03 Thread via GitHub
ulysses-you opened a new pull request, #5961: URL: https://github.com/apache/incubator-gluten/pull/5961 ## What changes were proposed in this pull request? The query would fail if the shuffle partition > 1: ``` spark.sql("select c1 , null as c2 from

Re: [PR] [VL] Fix shuffle with null type failure [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5961: URL: https://github.com/apache/incubator-gluten/pull/5961#issuecomment-2144828430 Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues?

Re: [PR] [GLUTEN-5959] Fix function replace report an error with null value [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5960: URL: https://github.com/apache/incubator-gluten/pull/5960#issuecomment-2144826657 Run Gluten Clickhouse CI

Re: [PR] [GLUTEN-5959] Fix function replace report an error with null value [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5960: URL: https://github.com/apache/incubator-gluten/pull/5960#issuecomment-2144826230 https://github.com/apache/incubator-gluten/issues/5959

[I] [CH] The function replace report an error with null value [incubator-gluten]

2024-06-03 Thread via GitHub
loneylee opened a new issue, #5959: URL: https://github.com/apache/incubator-gluten/issues/5959 ### Backend CH (ClickHouse) ### Bug description When running the following SQL: ``` create table tableName(src String, idx String, dest String) using parquet; insert into

Re: [PR] [VL] Gluten-it: Add option --scan-partitions [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5958: URL: https://github.com/apache/incubator-gluten/pull/5958#issuecomment-2144688552 Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues?

[PR] [VL] Gluten-it: Add option --scan-partitions [incubator-gluten]

2024-06-03 Thread via GitHub
zhztheplayer opened a new pull request, #5958: URL: https://github.com/apache/incubator-gluten/pull/5958 Add option `--scan-partitions` and remove `--min-scan-partitions`, which has not been working for some time.

Re: [PR] [VL] Gluten-it: Add option --scan-partitions [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5958: URL: https://github.com/apache/incubator-gluten/pull/5958#issuecomment-2144689098 Run Gluten Clickhouse CI

[I] [CH] Function get_json_object return nothing when the `get_json_object` field also in where conditions [incubator-gluten]

2024-06-03 Thread via GitHub
KevinyhZou opened a new issue, #5957: URL: https://github.com/apache/incubator-gluten/issues/5957 ### Backend CH (ClickHouse) ### Bug description In query `select * from get_json_object(val, '$.game_total_time') where id = ### and gid = ### `, it returns the result

[PR] [GLUTEN-4502][VL] Allow TIMESTAMP & complex types in cast expression [incubator-gluten]

2024-06-03 Thread via GitHub
PHILO-HE opened a new pull request, #4509: URL: https://github.com/apache/incubator-gluten/pull/4509 ## What changes were proposed in this pull request? This is just an attempt to offload cast with such types involved to velox. Maybe, at least, we still need to block nested complex

Re: [I] [Core] JVM 19 new feature: FFI [incubator-gluten]

2024-06-03 Thread via GitHub
zhanglistar commented on issue #5946: URL: https://github.com/apache/incubator-gluten/issues/5946#issuecomment-2144619164 ``` The FFI version we used was a preview in Java 19, and the interface has changed through to Java 22, where it has been finalized. Future work with this prototype

Re: [PR] [CI] Add CMake format check [incubator-gluten]

2024-06-03 Thread via GitHub
PHILO-HE commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144609194 I tried to put code format check for cpp/cpp-ch together and then will remove `ch_code_style.yml`. But it's strange that if we do that (i.e., by removing

Re: [I] [VL] Failed to build cmake-3.29 [incubator-gluten]

2024-06-03 Thread via GitHub
zhanglistar commented on issue #5912: URL: https://github.com/apache/incubator-gluten/issues/5912#issuecomment-2144587889 Seems that `-ldl` is missing.

Re: [PR] [GLUTEN-5668][CH] Support mixed conditions in shuffle hash join [incubator-gluten]

2024-06-03 Thread via GitHub
zhanglistar merged PR #5735: URL: https://github.com/apache/incubator-gluten/pull/5735

Re: [I] [CH] Support non-equal hash join [incubator-gluten]

2024-06-03 Thread via GitHub
zhanglistar closed issue #5668: [CH] Support non-equal hash join URL: https://github.com/apache/incubator-gluten/issues/5668

Re: [PR] [CI] Add CMake format check [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144525661 Run Gluten Clickhouse CI

Re: [PR] [CI] Add CMake format check [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144520159 Run Gluten Clickhouse CI

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144516071 Run Gluten Clickhouse CI

Re: [PR] [VL] Optimize the performance of hash based shuffle by accumulating batches [incubator-gluten]

2024-06-03 Thread via GitHub
XinShuoWang commented on PR #5951: URL: https://github.com/apache/incubator-gluten/pull/5951#issuecomment-2144509002 ``` template arrow::Status splitFixedType(const uint8_t* srcAddr, const std::vector& dstAddrs) { for (auto& pid : partitionUsed_) { auto

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144475971 Run Gluten Clickhouse CI

Re: [PR] [VL] Optimize the performance of hash based shuffle by accumulating batches [incubator-gluten]

2024-06-03 Thread via GitHub
FelixYBW commented on PR #5951: URL: https://github.com/apache/incubator-gluten/pull/5951#issuecomment-219307 Thank you for the improvement. The ideal case for the current split function is that the input batch size should be as large as possible while all columns can still fit into L2

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-217027 Run Gluten Clickhouse CI

Re: [PR] [VL] Upgrade simdjson to 3.9.3 in vcpkg build [incubator-gluten]

2024-06-03 Thread via GitHub
GlutenPerfBot commented on PR #5938: URL: https://github.com/apache/incubator-gluten/pull/5938#issuecomment-214666 = Performance report for TPCH SF2000 with Velox backend, for reference only query

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144438560 Run Gluten Clickhouse CI

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144436270 Run Gluten Clickhouse CI

Re: [PR] [VL] Use conf to control C2R occupied memory [incubator-gluten]

2024-06-03 Thread via GitHub
FelixYBW commented on code in PR #5952: URL: https://github.com/apache/incubator-gluten/pull/5952#discussion_r1623880020 ## cpp/velox/operators/serializer/VeloxColumnarToRowConverter.cc: ## @@ -16,46 +16,64 @@ */ #include "VeloxColumnarToRowConverter.h" +#include

Re: [PR] [VL] Use conf to control C2R occupied memory [incubator-gluten]

2024-06-03 Thread via GitHub
FelixYBW commented on code in PR #5952: URL: https://github.com/apache/incubator-gluten/pull/5952#discussion_r1623879788 ## cpp/velox/operators/serializer/VeloxColumnarToRowConverter.cc: ## @@ -16,46 +16,64 @@ */ #include "VeloxColumnarToRowConverter.h" +#include

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144427846 Run Gluten Clickhouse CI

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144420661 Run Gluten Clickhouse CI

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144418029 Run Gluten Clickhouse CI

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144413958 Run Gluten Clickhouse CI

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144398184 Run Gluten Clickhouse CI

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144395149 Run Gluten Clickhouse CI

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144393256 Run Gluten Clickhouse CI

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144389293 Run Gluten Clickhouse CI

Re: [PR] [VL] Support Row Index Metadata Column [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5351: URL: https://github.com/apache/incubator-gluten/pull/5351#issuecomment-2144386418 Run Gluten Clickhouse CI

Re: [PR] [WIP] Add CMake format checker [incubator-gluten]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #5941: URL: https://github.com/apache/incubator-gluten/pull/5941#issuecomment-2144384931 Run Gluten Clickhouse CI

Re: [PR] [GLUTEN-5414] [VL] Support arrow csv option and schema [incubator-gluten]

2024-06-02 Thread via GitHub
GlutenPerfBot commented on PR #5850: URL: https://github.com/apache/incubator-gluten/pull/5850#issuecomment-2144338257 = Performance report for TPCH SF2000 with Velox backend, for reference only query

Re: [PR] [VL] Daily Update Velox Version (2024_06_02) [incubator-gluten]

2024-06-02 Thread via GitHub
PHILO-HE commented on PR #5950: URL: https://github.com/apache/incubator-gluten/pull/5950#issuecomment-2144289715 Closing as it is covered by 06/03 update.

Re: [PR] [VL] Daily Update Velox Version (2024_06_02) [incubator-gluten]

2024-06-02 Thread via GitHub
PHILO-HE closed pull request #5950: [VL] Daily Update Velox Version (2024_06_02) URL: https://github.com/apache/incubator-gluten/pull/5950

Re: [PR] [VL] Daily Update Velox Version (2024_06_01) [incubator-gluten]

2024-06-02 Thread via GitHub
PHILO-HE commented on PR #5948: URL: https://github.com/apache/incubator-gluten/pull/5948#issuecomment-2144289149 Closing as it is covered by 06/03 update.

Re: [PR] [VL] Daily Update Velox Version (2024_06_01) [incubator-gluten]

2024-06-02 Thread via GitHub
PHILO-HE closed pull request #5948: [VL] Daily Update Velox Version (2024_06_01) URL: https://github.com/apache/incubator-gluten/pull/5948

Re: [PR] [VL] Daily Update Velox Version (2024_06_03) [incubator-gluten]

2024-06-02 Thread via GitHub
PHILO-HE merged PR #5956: URL: https://github.com/apache/incubator-gluten/pull/5956

Re: [PR] [VL] Daily Update Velox Version (2024_06_03) [incubator-gluten]

2024-06-02 Thread via GitHub
GlutenPerfBot commented on PR #5956: URL: https://github.com/apache/incubator-gluten/pull/5956#issuecomment-2144266271 = Performance report for TPCH SF2000 with Velox backend, for reference only query

Re: [PR] [CORE] Use the smaller table to build hashmap in shuffled hash join [incubator-gluten]

2024-06-02 Thread via GitHub
zml1206 commented on PR #5750: URL: https://github.com/apache/incubator-gluten/pull/5750#issuecomment-2144231022 > > > If the custom strategy can be removed by moving the code to ColumnarOverrides (without more workarounds), Personally I will be inclined to do that since it: > > >

Re: [PR] [CH] Fix left and substring with length -1 [incubator-gluten]

2024-06-02 Thread via GitHub
github-actions[bot] commented on PR #5943: URL: https://github.com/apache/incubator-gluten/pull/5943#issuecomment-2144229359 Run Gluten Clickhouse CI

Re: [PR] [VL] Upgrade simdjson to 3.9.3 in vcpkg build [incubator-gluten]

2024-06-02 Thread via GitHub
PHILO-HE merged PR #5938: URL: https://github.com/apache/incubator-gluten/pull/5938

Re: [PR] [VL] Daily Update Velox Version (2024_06_03) [incubator-gluten]

2024-06-02 Thread via GitHub
GlutenPerfBot commented on PR #5956: URL: https://github.com/apache/incubator-gluten/pull/5956#issuecomment-2144199450 = Performance report for TPCDS SF2000 with Velox backend, for reference only query

Re: [PR] [VL] Use conf to control C2R occupied memory [incubator-gluten]

2024-06-02 Thread via GitHub
ulysses-you commented on code in PR #5952: URL: https://github.com/apache/incubator-gluten/pull/5952#discussion_r1623726465 ## cpp/velox/operators/serializer/VeloxColumnarToRowConverter.cc: ## @@ -16,46 +16,64 @@ */ #include "VeloxColumnarToRowConverter.h" +#include

Re: [PR] [VL] Use conf to control C2R occupied memory [incubator-gluten]

2024-06-02 Thread via GitHub
ulysses-you commented on code in PR #5952: URL: https://github.com/apache/incubator-gluten/pull/5952#discussion_r1623707892 ## cpp/velox/operators/serializer/VeloxColumnarToRowConverter.cc: ## @@ -16,46 +16,64 @@ */ #include "VeloxColumnarToRowConverter.h" +#include

Re: [PR] [VL] Upgrade simdjson to 3.9.3 in vcpkg build [incubator-gluten]

2024-06-02 Thread via GitHub
PHILO-HE commented on PR #5938: URL: https://github.com/apache/incubator-gluten/pull/5938#issuecomment-2144159911 This PR is ready to merge. Fixing the above warning only requires a Velox code change.

Re: [PR] [VL] Upgrade simdjson to 3.9.3 in vcpkg build [incubator-gluten]

2024-06-02 Thread via GitHub
PHILO-HE commented on PR #5938: URL: https://github.com/apache/incubator-gluten/pull/5938#issuecomment-2144152289 > @PHILO-HE Can you take this opportunity to do a perf test of json parse? Masha said there is huge perf gain. Let's confirm from Gluten. @FelixYBW, I just did a small

Re: [PR] [GLUTEN-5414] [VL] Support arrow csv option and schema [incubator-gluten]

2024-06-02 Thread via GitHub
zhztheplayer merged PR #5850: URL: https://github.com/apache/incubator-gluten/pull/5850

Re: [I] [CH] Support java/scala timezone id on ch backend [incubator-gluten]

2024-06-02 Thread via GitHub
baibaichen closed issue #5939: [CH] Support java/scala timezone id on ch backend URL: https://github.com/apache/incubator-gluten/issues/5939

Re: [PR] [VL] Use conf to control C2R occupied memory [incubator-gluten]

2024-06-02 Thread via GitHub
ulysses-you commented on code in PR #5952: URL: https://github.com/apache/incubator-gluten/pull/5952#discussion_r1623709956 ## gluten-data/src/main/java/org/apache/gluten/vectorized/NativeColumnarToRowJniWrapper.java: ## @@ -38,8 +38,8 @@ public long handle() { public

Re: [PR] [GLUTEN-5939][CH] Support java timezone id named 'GMT+8' or 'GMT+08:00' [incubator-gluten]

2024-06-02 Thread via GitHub
baibaichen merged PR #5940: URL: https://github.com/apache/incubator-gluten/pull/5940

Re: [PR] [GLUTEN-5320][VL] Reduce driver memory footprint by postpone the creation and serialization of LocalFilesNode [incubator-gluten]

2024-06-02 Thread via GitHub
Yohahaha commented on PR #5321: URL: https://github.com/apache/incubator-gluten/pull/5321#issuecomment-2144142932 @WangGuangxin are you still working on this PR?

Re: [PR] [VL] Daily Update Velox Version (2024_06_03) [incubator-gluten]

2024-06-02 Thread via GitHub
PHILO-HE commented on PR #5956: URL: https://github.com/apache/incubator-gluten/pull/5956#issuecomment-2144141426 /Benchmark Velox TPCDS

Re: [PR] [GLUTEN-5320][VL] Reduce driver memory footprint by postpone the creation and serialization of LocalFilesNode [incubator-gluten]

2024-06-02 Thread via GitHub
github-actions[bot] commented on PR #5321: URL: https://github.com/apache/incubator-gluten/pull/5321#issuecomment-2144136307 This PR was auto-closed because it has been stalled for 10 days with no activity. Please feel free to reopen if it is still valid. Thanks.

Re: [PR] [GLUTEN-5320][VL] Reduce driver memory footprint by postpone the creation and serialization of LocalFilesNode [incubator-gluten]

2024-06-02 Thread via GitHub
github-actions[bot] closed pull request #5321: [GLUTEN-5320][VL] Reduce driver memory footprint by postpone the creation and serialization of LocalFilesNode URL: https://github.com/apache/incubator-gluten/pull/5321

Re: [I] [VL] Include Velox code as git submodule [incubator-gluten]

2024-06-02 Thread via GitHub
ulysses-you commented on issue #5932: URL: https://github.com/apache/incubator-gluten/issues/5932#issuecomment-2144101188 +1 to use git submodule

Re: [PR] [GLUTEN-5668][CH] Support mixed conditions in shuffle hash join [incubator-gluten]

2024-06-02 Thread via GitHub
github-actions[bot] commented on PR #5735: URL: https://github.com/apache/incubator-gluten/pull/5735#issuecomment-2144101057 Run Gluten Clickhouse CI

Re: [PR] [VL] Optimize the performance of hash based shuffle by accumulating batches [incubator-gluten]

2024-06-02 Thread via GitHub
marin-ma commented on PR #5951: URL: https://github.com/apache/incubator-gluten/pull/5951#issuecomment-2144099188 Thanks @XinShuoWang Please fix the code style so that we can run TPCH benchmark :) > I think this PR can also control whether to cache data in combination with memory

Re: [PR] [VL] Optimize the performance of hash based shuffle by accumulating batches [incubator-gluten]

2024-06-02 Thread via GitHub
marin-ma commented on code in PR #5951: URL: https://github.com/apache/incubator-gluten/pull/5951#discussion_r1623688580 ## cpp/velox/shuffle/VeloxHashBasedShuffleWriter.h: ## @@ -249,6 +249,7 @@ class VeloxHashBasedShuffleWriter : public VeloxShuffleWriter { template

Re: [PR] [VL] Optimize the performance of hash based shuffle by accumulating batches [incubator-gluten]

2024-06-02 Thread via GitHub
marin-ma commented on PR #5951: URL: https://github.com/apache/incubator-gluten/pull/5951#issuecomment-2144091204 /Benchmark Velox

Re: [PR] [VL] Daily Update Velox Version (2024_06_03) [incubator-gluten]

2024-06-02 Thread via GitHub
github-actions[bot] commented on PR #5956: URL: https://github.com/apache/incubator-gluten/pull/5956#issuecomment-2144072255 Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues?

[PR] [VL] Daily Update Velox Version (2024_06_03) [incubator-gluten]

2024-06-02 Thread via GitHub
GlutenPerfBot opened a new pull request, #5956: URL: https://github.com/apache/incubator-gluten/pull/5956 Upstream Velox's New Commits: ```txt 4acb520a7 by Pedro Pedreira, Add technical governance mechanics description (9990) c30c3c687 by duanmeng, Refactor fuzzers (10004)

Re: [PR] [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240603) [incubator-gluten]

2024-06-02 Thread via GitHub
github-actions[bot] commented on PR #5955: URL: https://github.com/apache/incubator-gluten/pull/5955#issuecomment-2144049981 Run Gluten Clickhouse CI

Re: [PR] [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240603) [incubator-gluten]

2024-06-02 Thread via GitHub
github-actions[bot] commented on PR #5955: URL: https://github.com/apache/incubator-gluten/pull/5955#issuecomment-2144049894 https://github.com/apache/incubator-gluten/issues/1632

[PR] [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240603) [incubator-gluten]

2024-06-02 Thread via GitHub
kyligence-git opened a new pull request, #5955: URL: https://github.com/apache/incubator-gluten/pull/5955 Auto commit by gluten daily build, please check the build status and merge it if it's green.

Re: [PR] [GLUTEN-5944][CH] Fallback to run delta vacuum command [incubator-gluten]

2024-06-02 Thread via GitHub
GlutenPerfBot commented on PR #5945: URL: https://github.com/apache/incubator-gluten/pull/5945#issuecomment-2143995297 = Performance report for TPCH SF2000 with Velox backend, for reference only query

Re: [PR] [GLUTEN-5944][CH] Fallback to run delta vacuum command [incubator-gluten]

2024-06-02 Thread via GitHub
GlutenPerfBot commented on PR #5945: URL: https://github.com/apache/incubator-gluten/pull/5945#issuecomment-2143977872 = Performance report for TPCDS SF2000 with Velox backend, for reference only query

Re: [PR] [VL] Use conf to control C2R occupied memory [incubator-gluten]

2024-06-02 Thread via GitHub
Yohahaha commented on code in PR #5952: URL: https://github.com/apache/incubator-gluten/pull/5952#discussion_r1623413974 ## cpp/core/operators/c2r/ColumnarToRow.h: ## @@ -27,7 +28,15 @@ class ColumnarToRowConverter { virtual ~ColumnarToRowConverter() = default; -

Re: [PR] [GLUTEN-5953][VL] Prevent pushdown filters with unsupported data types to scan node [incubator-gluten]

2024-06-02 Thread via GitHub
github-actions[bot] commented on PR #5954: URL: https://github.com/apache/incubator-gluten/pull/5954#issuecomment-2143854957 https://github.com/apache/incubator-gluten/issues/5953

[PR] [GLUTEN-5953][VL] Prevent pushdown filters with unsupported data types to scan node [incubator-gluten]

2024-06-02 Thread via GitHub
WangGuangxin opened a new pull request, #5954: URL: https://github.com/apache/incubator-gluten/pull/5954 ## What changes were proposed in this pull request? Prevent pushdown filters with unsupported data types to scan node in case the scan fallback to vanilla scan operator

Re: [PR] [VL] Use conf to control C2R occupied memory [incubator-gluten]

2024-06-02 Thread via GitHub
XinShuoWang commented on PR #5799: URL: https://github.com/apache/incubator-gluten/pull/5799#issuecomment-2143763454 > why the PR is closed? Sorry, I accidentally force-pushed the main branch, which caused the PR to be closed. The new PR is here:

Re: [PR] [VL] Use conf to control C2R occupied memory [incubator-gluten]

2024-06-02 Thread via GitHub
github-actions[bot] commented on PR #5952: URL: https://github.com/apache/incubator-gluten/pull/5952#issuecomment-2143762992 Run Gluten Clickhouse CI

[PR] [VL] Use conf to control C2R occupied memory [incubator-gluten]

2024-06-02 Thread via GitHub
XinShuoWang opened a new pull request, #5952: URL: https://github.com/apache/incubator-gluten/pull/5952 ## What changes were proposed in this pull request? In the current design, the Column2Row operation is completed in one go, which consumes a lot of memory and causes the program

Re: [PR] [VL] Use conf to control C2R occupied memory [incubator-gluten]

2024-06-02 Thread via GitHub
github-actions[bot] commented on PR #5952: URL: https://github.com/apache/incubator-gluten/pull/5952#issuecomment-2143762893 Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues?

Re: [PR] [VL] Optimize the performance of hash based shuffle by accumulating batches [incubator-gluten]

2024-06-02 Thread via GitHub
github-actions[bot] commented on PR #5951: URL: https://github.com/apache/incubator-gluten/pull/5951#issuecomment-2143761589 Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues?

[PR] [VL] Optimize the performance of hash based shuffle by accumulating batches [incubator-gluten]

2024-06-02 Thread via GitHub
XinShuoWang opened a new pull request, #5951: URL: https://github.com/apache/incubator-gluten/pull/5951 ## What changes were proposed in this pull request? I used perf to observe the benchmark and found that the most time-consuming functions were `splitFixedWidthValueBuffer` and

Re: [PR] [VL] Support columnar collect limit [incubator-gluten]

2024-06-01 Thread via GitHub
github-actions[bot] commented on PR #5266: URL: https://github.com/apache/incubator-gluten/pull/5266#issuecomment-2143661440 This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days. -- This is an automated

Re: [PR] [GLUTEN-5229][CH] Fix GlutenClickHouseMergeTreeWriteOnHDFSSuite and GlutenClickHouseMergeTreeWriteOnS3Suite after gluten domain name changes [incubator-gluten]

2024-06-01 Thread via GitHub
github-actions[bot] commented on PR #5300: URL: https://github.com/apache/incubator-gluten/pull/5300#issuecomment-2143661436 This PR was auto-closed because it has been stalled for 10 days with no activity. Please feel free to reopen if it is still valid. Thanks. -- This is an automated

Re: [PR] [GLUTEN-5229][CH] Fix GlutenClickHouseMergeTreeWriteOnHDFSSuite and GlutenClickHouseMergeTreeWriteOnS3Suite after gluten domain name changes [incubator-gluten]

2024-06-01 Thread via GitHub
github-actions[bot] closed pull request #5300: [GLUTEN-5229][CH] Fix GlutenClickHouseMergeTreeWriteOnHDFSSuite and GlutenClickHouseMergeTreeWriteOnS3Suite after gluten domain name changes URL: https://github.com/apache/incubator-gluten/pull/5300 -- This is an automated message from the

[PR] [VL] Daily Update Velox Version (2024_06_02) [incubator-gluten]

2024-06-01 Thread via GitHub
GlutenPerfBot opened a new pull request, #5950: URL: https://github.com/apache/incubator-gluten/pull/5950 Upstream Velox's New Commits: ```txt 7773f764e by Pedro Eugenio Rocha Pedreira, Vectorizing merge join (9763) b88aadccf by Masha Basmanova, Optimize EvalErrors (9996)

Re: [PR] [VL] Daily Update Velox Version (2024_06_02) [incubator-gluten]

2024-06-01 Thread via GitHub
github-actions[bot] commented on PR #5950: URL: https://github.com/apache/incubator-gluten/pull/5950#issuecomment-2143639459 Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues?

Re: [PR] [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240602) [incubator-gluten]

2024-06-01 Thread via GitHub
github-actions[bot] commented on PR #5949: URL: https://github.com/apache/incubator-gluten/pull/5949#issuecomment-2143621178 Run Gluten Clickhouse CI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240602) [incubator-gluten]

2024-06-01 Thread via GitHub
github-actions[bot] commented on PR #5949: URL: https://github.com/apache/incubator-gluten/pull/5949#issuecomment-2143621137 https://github.com/apache/incubator-gluten/issues/1632 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[PR] [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240602) [incubator-gluten]

2024-06-01 Thread via GitHub
kyligence-git opened a new pull request, #5949: URL: https://github.com/apache/incubator-gluten/pull/5949 Auto commit by gluten daily build, please check the build status and merge it if it's green. -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [PR] [GLUTEN-5944][CH] Fallback to run delta vacuum command [incubator-gluten]

2024-06-01 Thread via GitHub
GlutenPerfBot commented on PR #5945: URL: https://github.com/apache/incubator-gluten/pull/5945#issuecomment-2143558531 = Performance report for TPCH SF2000 with Velox backend, for reference only query

Re: [PR] [GLUTEN-5944][CH] Fallback to run delta vacuum command [incubator-gluten]

2024-06-01 Thread via GitHub
GlutenPerfBot commented on PR #5945: URL: https://github.com/apache/incubator-gluten/pull/5945#issuecomment-2143539946 = Performance report for TPCDS SF2000 with Velox backend, for reference only query

Re: [PR] [VL] Fall back collect_set, min and max when input is complex type [incubator-gluten]

2024-06-01 Thread via GitHub
zhli1142015 commented on PR #5934: URL: https://github.com/apache/incubator-gluten/pull/5934#issuecomment-2143428122 Got it, thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [VL] Fall back collect_set, min and max when input is complex type [incubator-gluten]

2024-06-01 Thread via GitHub
jinchengchenghh commented on PR #5934: URL: https://github.com/apache/incubator-gluten/pull/5934#issuecomment-2143425769 This is because it does not use the Arrow jar version modified and compiled by Gluten; after this PR, it will not happen. https://github.com/apache/incubator-gluten/pull/5850

Re: [PR] [GLUTEN-5414] [VL] Support arrow csv option and schema [incubator-gluten]

2024-06-01 Thread via GitHub
github-actions[bot] commented on PR #5850: URL: https://github.com/apache/incubator-gluten/pull/5850#issuecomment-2143425625 Run Gluten Clickhouse CI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] Arrow CSV reader peak memory is very large [incubator-gluten]

2024-06-01 Thread via GitHub
jinchengchenghh commented on issue #5766: URL: https://github.com/apache/incubator-gluten/issues/5766#issuecomment-2143424599 I mark this format as splittable false, so it should not split.
