[PR] fix bounds accumulator reset in HashJoinExec dynamic filter pushdown [datafusion]

2025-09-01 Thread via GitHub
adriangb opened a new pull request, #17371: URL: https://github.com/apache/datafusion/pull/17371 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe

[PR] add id and update callbacks to dynamic filters [datafusion]

2025-09-01 Thread via GitHub
adriangb opened a new pull request, #17370: URL: https://github.com/apache/datafusion/pull/17370 The idea is that this might help in distributed systems where we want to send dynamic filter updates across the wire

Re: [PR] feat: Support hdfs with OpenDAL [datafusion-comet]

2025-09-01 Thread via GitHub
wForget commented on PR #2244: URL: https://github.com/apache/datafusion-comet/pull/2244#issuecomment-3243815946 > I'm still on it @wForget, the local hdfs cluster setup having some issue Thank you for your verification and feedback.

Re: [I] Rewrite `datafusion-sqlancer` in Rust [datafusion]

2025-09-01 Thread via GitHub
arpity22 commented on issue #14535: URL: https://github.com/apache/datafusion/issues/14535#issuecomment-3243736427 Hi! This looks really interesting. I’d like to work on it (just as a side project for fun). Is anyone else working on it at the moment who I could sync up with?

Re: [I] Comet 0.9.1 jars are nearly 2x larger than 0.9.0 jars [datafusion-comet]

2025-09-01 Thread via GitHub
comphead commented on issue #2232: URL: https://github.com/apache/datafusion-comet/issues/2232#issuecomment-3243733620 > 0.9.1 has a new native library that was not shipped in 0.9.0: > > ``` > -rw-rw-r-- 1 andy andy 227531568 Jan 22 2020 org/apache/comet/darwin/aarch64/libcomet.d

Re: [PR] feature: sort by/cluster by/distribute by [datafusion]

2025-09-01 Thread via GitHub
chenkovsky commented on PR #16310: URL: https://github.com/apache/datafusion/pull/16310#issuecomment-3243466236 @Dandandan @alamb could you please review this PR

Re: [PR] feat: Make supported hadoop filesystem schemes configurable [datafusion-comet]

2025-09-01 Thread via GitHub
comphead commented on PR #2272: URL: https://github.com/apache/datafusion-comet/pull/2272#issuecomment-3243706237 @parthchandra cc

Re: [PR] feat: Support hdfs with OpenDAL [datafusion-comet]

2025-09-01 Thread via GitHub
comphead commented on PR #2244: URL: https://github.com/apache/datafusion-comet/pull/2244#issuecomment-3243703540 I'm still on it @wForget, the local hdfs cluster setup having some issue

Re: [PR] chore(deps): bump aws-credential-types from 1.2.5 to 1.2.6 in /native [datafusion-comet]

2025-09-01 Thread via GitHub
comphead merged PR #2275: URL: https://github.com/apache/datafusion-comet/pull/2275

Re: [PR] chore(deps): bump mimalloc from 0.1.47 to 0.1.48 in /native [datafusion-comet]

2025-09-01 Thread via GitHub
comphead commented on PR #2276: URL: https://github.com/apache/datafusion-comet/pull/2276#issuecomment-3243700148 @dependabot rebase

Re: [PR] chore(deps): bump bindgen from 0.72.0 to 0.72.1 in /native [datafusion-comet]

2025-09-01 Thread via GitHub
comphead merged PR #2274: URL: https://github.com/apache/datafusion-comet/pull/2274

Re: [PR] chore(deps): bump cc from 1.2.34 to 1.2.35 in /native [datafusion-comet]

2025-09-01 Thread via GitHub
comphead merged PR #2277: URL: https://github.com/apache/datafusion-comet/pull/2277

Re: [PR] Redesign ownership model between `FileScanConfig` and `FileSource`s [datafusion]

2025-09-01 Thread via GitHub
comphead commented on PR #17242: URL: https://github.com/apache/datafusion/pull/17242#issuecomment-3243650877 Thanks @waynexia for the diagram and explanation. Definitely agree for the simplification, abstractions indeed are overly flexible, more than needed and getting this simplified

Re: [PR] Redesign ownership model between `FileScanConfig` and `FileSource`s [datafusion]

2025-09-01 Thread via GitHub
adriangb commented on PR #17242: URL: https://github.com/apache/datafusion/pull/17242#issuecomment-3243531886 @waynexia thanks so much for the input! I agree with most of your points, let's see what @friendlymatthew thinks > FileFormat takes FileScanConfig and format config stored in

Re: [PR] feat: support multi-threaded writing of Parquet files with modular encryption [datafusion]

2025-09-01 Thread via GitHub
adamreeve commented on code in PR #16738: URL: https://github.com/apache/datafusion/pull/16738#discussion_r2314740865 ## datafusion/datasource-parquet/src/file_format.rs: ## @@ -1654,7 +1636,8 @@ async fn output_single_parquet_file_parallelized( object_store_writer: Box,

Re: [I] Unparsing of CROSS JOINs with filters is generating incorrect queries [datafusion]

2025-09-01 Thread via GitHub
chenkovsky commented on issue #17359: URL: https://github.com/apache/datafusion/issues/17359#issuecomment-3243494724 untake

[PR] Relax sort [datafusion-comet]

2025-09-01 Thread via GitHub
hsiang-c opened a new pull request, #2279: URL: https://github.com/apache/datafusion-comet/pull/2279 ## Which issue does this PR close? Closes #. https://github.com/apache/datafusion-comet/issues/1854 ## Rationale for this change ## What changes are includ

Re: [PR] fix: implement lazy evaluation in Coalesce function [datafusion-comet]

2025-09-01 Thread via GitHub
coderfender commented on PR #2270: URL: https://github.com/apache/datafusion-comet/pull/2270#issuecomment-3243260142 Rebased with main branch.

Re: [I] Enable merge queue in github to avoid commit confliction. [datafusion]

2025-09-01 Thread via GitHub
blaginin commented on issue #6880: URL: https://github.com/apache/datafusion/issues/6880#issuecomment-3243080686 Great infra team has given us temporary access to merge things bypassing the CI checks https://github.com/user-attachments/assets/c1868119-0166-403a-88b0-554e65580bac

[PR] chore(deps): bump procfs from 0.17.0 to 0.18.0 in /native [datafusion-comet]

2025-09-01 Thread via GitHub
dependabot[bot] opened a new pull request, #2278: URL: https://github.com/apache/datafusion-comet/pull/2278 Bumps [procfs](https://github.com/eminence/procfs) from 0.17.0 to 0.18.0. Release notes Sourced from [procfs's releases](https://github.com/eminence/procfs/releases). v

Re: [PR] Merge queue prep [datafusion-sqlparser-rs]

2025-09-01 Thread via GitHub
blaginin commented on PR #2007: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2007#issuecomment-3243072245 https://github.com/user-attachments/assets/5414cd60-c097-41e3-956b-74d0965a06ac In case something would go wrong during testing, we now have a way to forc

Re: [I] Optimize the join operators [datafusion]

2025-09-01 Thread via GitHub
jonathanc-n commented on issue #16710: URL: https://github.com/apache/datafusion/issues/16710#issuecomment-3243164676 @AdamGS yes I think we can open another issue to separate it from this one

Re: [PR] chore: notice and cargo deps cleanup [datafusion-ballista]

2025-09-01 Thread via GitHub
milenkovicm merged PR #1295: URL: https://github.com/apache/datafusion-ballista/pull/1295

Re: [PR] fix: Disable CollectLeft join as it is broken in ballista [datafusion-ballista]

2025-09-01 Thread via GitHub
milenkovicm merged PR #1301: URL: https://github.com/apache/datafusion-ballista/pull/1301

Re: [PR] fix: `ShuffleReader` should return statistics [datafusion-ballista]

2025-09-01 Thread via GitHub
milenkovicm merged PR #1302: URL: https://github.com/apache/datafusion-ballista/pull/1302

Re: [PR] fix: `UnresolvedShuffleExec` should support `with_new_children` [datafusion-ballista]

2025-09-01 Thread via GitHub
milenkovicm merged PR #1300: URL: https://github.com/apache/datafusion-ballista/pull/1300

[PR] chore(deps): bump mimalloc from 0.1.47 to 0.1.48 in /native [datafusion-comet]

2025-09-01 Thread via GitHub
dependabot[bot] opened a new pull request, #2276: URL: https://github.com/apache/datafusion-comet/pull/2276 Bumps [mimalloc](https://github.com/purpleprotocol/mimalloc_rust) from 0.1.47 to 0.1.48. Release notes Sourced from https://github.com/purpleprotocol/mimalloc_rust/releases

[PR] chore(deps): bump aws-credential-types from 1.2.5 to 1.2.6 in /native [datafusion-comet]

2025-09-01 Thread via GitHub
dependabot[bot] opened a new pull request, #2275: URL: https://github.com/apache/datafusion-comet/pull/2275 Bumps [aws-credential-types](https://github.com/smithy-lang/smithy-rs) from 1.2.5 to 1.2.6. Commits See full diff in https://github.com/smithy-lang/smithy-rs/commits";>co

[PR] chore(deps): bump bindgen from 0.72.0 to 0.72.1 in /native [datafusion-comet]

2025-09-01 Thread via GitHub
dependabot[bot] opened a new pull request, #2274: URL: https://github.com/apache/datafusion-comet/pull/2274 Bumps [bindgen](https://github.com/rust-lang/rust-bindgen) from 0.72.0 to 0.72.1. Release notes Sourced from https://github.com/rust-lang/rust-bindgen/releases";>bindgen's r

[PR] feat: Add support for `COUNT(DISTINCT expr)` [datafusion-comet]

2025-09-01 Thread via GitHub
andygrove opened a new pull request, #2273: URL: https://github.com/apache/datafusion-comet/pull/2273 ## Which issue does this PR close? Part of https://github.com/apache/datafusion-comet/issues/1267 ## Rationale for this change Increase coverage of TPC-H

Re: [PR] Add PostgreSQL `CREATE USER` and `ALTER USER` support [datafusion-sqlparser-rs]

2025-09-01 Thread via GitHub
ramnes commented on code in PR #2015: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2015#discussion_r2313932290 ## src/parser/alter.rs: ## @@ -39,6 +39,15 @@ impl Parser<'_> { )) } +pub fn parse_alter_user(&mut self) -> Result { +if dia

Re: [I] add support for filter clause in window functions [datafusion]

2025-09-01 Thread via GitHub
geoffreyclaude commented on issue #674: URL: https://github.com/apache/datafusion/issues/674#issuecomment-3242919086 I need this for MATCH_RECOGNIZE, I'll push a PR tomorrow.

Re: [I] Optimize the join operators [datafusion]

2025-09-01 Thread via GitHub
AdamGS commented on issue #16710: URL: https://github.com/apache/datafusion/issues/16710#issuecomment-3242712532 While running benchmarks between datafusion and duckdb (for [vortex](https://github.com/vortex-data/vortex)) I've also noticed that TPC-DS query 72 slows down significantly for d

Re: [I] Failed to execute TPC-DS Q47 in ballista [datafusion-ballista]

2025-09-01 Thread via GitHub
milenkovicm closed issue #1296: Failed to execute TPC-DS Q47 in ballista URL: https://github.com/apache/datafusion-ballista/issues/1296

Re: [PR] Ensure stage-level sort requirements are enforced in distributed planning [datafusion-ballista]

2025-09-01 Thread via GitHub
milenkovicm merged PR #1306: URL: https://github.com/apache/datafusion-ballista/pull/1306

Re: [PR] chore: notice and cargo deps cleanup [datafusion-ballista]

2025-09-01 Thread via GitHub
andygrove commented on code in PR #1295: URL: https://github.com/apache/datafusion-ballista/pull/1295#discussion_r2307477736 ## NOTICE.txt: ## @@ -1,54 +1,20 @@ -Apache Arrow -Copyright 2016-2019 The Apache Software Foundation +Apache Ballista Review Comment: ```suggestion

Re: [PR] Add PostgreSQL `CREATE USER` and `ALTER USER` support [datafusion-sqlparser-rs]

2025-09-01 Thread via GitHub
ramnes commented on code in PR #2015: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2015#discussion_r2313924164 ## src/parser/alter.rs: ## @@ -39,6 +39,15 @@ impl Parser<'_> { )) } +pub fn parse_alter_user(&mut self) -> Result { Review Comment

Re: [PR] chore: Refactor remaining predicate expression serde [datafusion-comet]

2025-09-01 Thread via GitHub
andygrove merged PR #2265: URL: https://github.com/apache/datafusion-comet/pull/2265

[PR] fix: window unparsing [datafusion]

2025-09-01 Thread via GitHub
chenkovsky opened a new pull request, #17367: URL: https://github.com/apache/datafusion/pull/17367 ## Which issue does this PR close? - Closes #17360. ## Rationale for this change in LogicalPlan::Filter unparsing, if there's a window expr, it should be converted to qu

Re: [PR] Add array_transform function [datafusion]

2025-09-01 Thread via GitHub
timsaucer commented on PR #17289: URL: https://github.com/apache/datafusion/pull/17289#issuecomment-3242283062 > Tim, would we be able to implement lambda functions on top of this PR or more work needed? > > Something like: > > ```sql > SELECT array_sort(array('Hello', 'Worl

Re: [PR] feat: Make supported hadoop filesystem schemes configurable [datafusion-comet]

2025-09-01 Thread via GitHub
codecov-commenter commented on PR #2272: URL: https://github.com/apache/datafusion-comet/pull/2272#issuecomment-3242158617 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/2272?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

[PR] feat: Make supported hadoop filesystem schemes configurable [datafusion-comet]

2025-09-01 Thread via GitHub
wForget opened a new pull request, #2272: URL: https://github.com/apache/datafusion-comet/pull/2272 ## Which issue does this PR close? Closes #2271. ## Rationale for this change Currently we prefer to use jvm-based libhdfs to implement native hdfs reader, which m

[I] Support additional hadoop file systems [datafusion-comet]

2025-09-01 Thread via GitHub
wForget opened a new issue, #2271: URL: https://github.com/apache/datafusion-comet/issues/2271 ### What is the problem the feature request solves? Currently we prefer to use jvm-based libhdfs to implement native hdfs reader, which means we can support more hadoop file systems. But cur

Re: [I] Potential revamp of broadcast compression policy [datafusion-comet]

2025-09-01 Thread via GitHub
akupchinskiy commented on issue #2216: URL: https://github.com/apache/datafusion-comet/issues/2216#issuecomment-3241071375 @parthchandra [This](https://github.com/apache/spark/blob/7007e1c7ad646bfdc2a89579b2abaa2b3facc6af/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala

Re: [I] Unparsing of Window functions is generating incorrect queries [datafusion]

2025-09-01 Thread via GitHub
chenkovsky commented on issue #17360: URL: https://github.com/apache/datafusion/issues/17360#issuecomment-3241961536 take

Re: [PR] feat: support multi-threaded writing of Parquet files with modular encryption [datafusion]

2025-09-01 Thread via GitHub
rok commented on code in PR #16738: URL: https://github.com/apache/datafusion/pull/16738#discussion_r2313569898 ## datafusion/datasource-parquet/src/file_format.rs: ## @@ -1654,7 +1636,8 @@ async fn output_single_parquet_file_parallelized( object_store_writer: Box, dat

Re: [PR] Add PostgreSQL `CREATE USER` and `ALTER USER` support [datafusion-sqlparser-rs]

2025-09-01 Thread via GitHub
ramnes commented on code in PR #2015: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2015#discussion_r2313348699 ## src/ast/mod.rs: ## @@ -3314,6 +3314,8 @@ pub enum Statement { CreateRole { names: Vec, if_not_exists: bool, +/// Wheth

Re: [PR] Redshift: UNLOAD [datafusion-sqlparser-rs]

2025-09-01 Thread via GitHub
yoavcloud commented on code in PR #2013: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2013#discussion_r2313313013 ## src/parser/mod.rs: ## @@ -16477,19 +16569,35 @@ impl<'a> Parser<'a> { } pub fn parse_unload(&mut self) -> Result { +self.expec