Re: [I] Implementing `From` for `sqlparser::ast::Statement` variants [datafusion-sqlparser-rs]

2025-09-08 Thread via GitHub
LucaCappelletti94 commented on issue #2020: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/2020#issuecomment-3268966727 @iffyio could you kindly lmk your opinion on the matter before I start a PR? -- This is an automated message from the Apache Git Service. To respond to th

Re: [PR] fix: Fallback length function with binary input [datafusion-comet]

2025-09-08 Thread via GitHub
codecov-commenter commented on PR #2349: URL: https://github.com/apache/datafusion-comet/pull/2349#issuecomment-3268681463 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/2349?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [I] `EXPLAIN VERBOSE` only works when format is set to (non-default) 'indent' [datafusion]

2025-09-08 Thread via GitHub
petern48 commented on issue #17480: URL: https://github.com/apache/datafusion/issues/17480#issuecomment-3268882942 > Good idea! Perhaps we can override `EXPLAIN ANALYZE` too? That's a good idea, too! I tried it in the cli, and `explain analyze` already overrides to `indent`. The code

[PR] build: Fix CI? [datafusion-comet]

2025-09-08 Thread via GitHub
andygrove opened a new pull request, #2353: URL: https://github.com/apache/datafusion-comet/pull/2353 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

Re: [PR] Added derive trait `Copy` to `OrderByOptions` struct [datafusion-sqlparser-rs]

2025-09-08 Thread via GitHub
iffyio merged PR #2021: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2021

Re: [PR] chore: Split expression serde hash map into separate categories [datafusion-comet]

2025-09-08 Thread via GitHub
rishvin commented on PR #2322: URL: https://github.com/apache/datafusion-comet/pull/2322#issuecomment-3268588193 Looks good!

[PR] chore(deps): bump twox-hash from 2.1.1 to 2.1.2 in /native [datafusion-comet]

2025-09-08 Thread via GitHub
dependabot[bot] opened a new pull request, #2335: URL: https://github.com/apache/datafusion-comet/pull/2335 Bumps [twox-hash](https://github.com/shepmaster/twox-hash) from 2.1.1 to 2.1.2. Changelog Sourced from https://github.com/shepmaster/twox-hash/blob/main/CHANGELOG.md twox-h

[PR] chore(deps): bump log from 0.4.27 to 0.4.28 in /native [datafusion-comet]

2025-09-08 Thread via GitHub
dependabot[bot] opened a new pull request, #2333: URL: https://github.com/apache/datafusion-comet/pull/2333 Bumps [log](https://github.com/rust-lang/log) from 0.4.27 to 0.4.28. Release notes Sourced from log's releases (https://github.com/rust-lang/log/releases). 0.4.28 Wh

[PR] chore(deps): bump actions/download-artifact from 4 to 5 [datafusion-comet]

2025-09-08 Thread via GitHub
dependabot[bot] opened a new pull request, #2332: URL: https://github.com/apache/datafusion-comet/pull/2332 Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4 to 5. Release notes Sourced from https://github.com/actions/download-artifact/releases

[PR] chore(deps): bump log4rs from 1.3.0 to 1.4.0 in /native [datafusion-comet]

2025-09-08 Thread via GitHub
dependabot[bot] opened a new pull request, #2334: URL: https://github.com/apache/datafusion-comet/pull/2334 Bumps [log4rs](https://github.com/estk/log4rs) from 1.3.0 to 1.4.0. Release notes Sourced from log4rs's releases (https://github.com/estk/log4rs/releases). v1.4.0 -- Ke

Re: [PR] Push down preferred sorts into `TableScan` logical plan node [datafusion]

2025-09-08 Thread via GitHub
pepijnve commented on code in PR #17337: URL: https://github.com/apache/datafusion/pull/17337#discussion_r2329567588 ## datafusion/optimizer/src/push_down_sort.rs: ## @@ -0,0 +1,580 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

Re: [PR] Dynamic filters blog post (rev 2) [datafusion-site]

2025-09-08 Thread via GitHub
alamb commented on PR #103: URL: https://github.com/apache/datafusion-site/pull/103#issuecomment-3266091514 > The different partitions must not have scanned data which included both extremes, resulting in an efficient dynamic filter. > > Would it be feasible to have [`ColumnBounds`](

Re: [I] Push down entire hash table from HashJoinExec into scans [datafusion]

2025-09-08 Thread via GitHub
alamb commented on issue #17171: URL: https://github.com/apache/datafusion/issues/17171#issuecomment-3266105128 Another possibility is to use a data structure like a Bloom Filter, which I think is what Spark does. https://issues.apache.org/jira/browse/SPARK-32268 has a bit of backgro

Re: [PR] better preserve statistics when applying limits [datafusion]

2025-09-08 Thread via GitHub
adriangb commented on PR #17381: URL: https://github.com/apache/datafusion/pull/17381#issuecomment-3266612893 @xudong963 CI is green

Re: [PR] Support csv truncated rows in datafusion [datafusion]

2025-09-08 Thread via GitHub
alamb commented on PR #17465: URL: https://github.com/apache/datafusion/pull/17465#issuecomment-3266784819 Thanks @zhuqi-lucas!

Re: [D] Using External Indexes, Metadata Stores, Catalogs and Caches to Accelerate Queries on Apache Parquet - Apache DataFusion Blog [datafusion-site]

2025-09-08 Thread via GitHub
GitHub user giscus[bot] closed a discussion: Using External Indexes, Metadata Stores, Catalogs and Caches to Accelerate Queries on Apache Parquet - Apache DataFusion Blog # Using External Indexes, Metadata Stores, Catalogs and Caches to Accelerate Queries on Apache Parquet - Apache DataFusion

Re: [I] Push down entire hash table from HashJoinExec into scans [datafusion]

2025-09-08 Thread via GitHub
alamb commented on issue #17171: URL: https://github.com/apache/datafusion/issues/17171#issuecomment-3267280510 > My counter argument to this would be that this is only a problem if the size of your build side ≈ the size of your probe side, but if that's the case you already probably have a

[PR] Add support for ClickHouse CSE. [datafusion-sqlparser-rs]

2025-09-08 Thread via GitHub
pravic opened a new pull request, #2024: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2024 https://clickhouse.com/docs/sql-reference/statements/select/with#common-scalar-expressions: ```sql WITH AS ``` fixes #1514. Unfortunately, this changes the publi

[I] [native_iceberg_compat] Add support for custom authentication [datafusion-comet]

2025-09-08 Thread via GitHub
andygrove opened a new issue, #2340: URL: https://github.com/apache/datafusion-comet/issues/2340 ### What is the problem the feature request solves? This is mostly a documentation and testing task, since this is already implemented. ### Describe the potential solution _

[I] Custom authentication [datafusion-comet]

2025-09-08 Thread via GitHub
andygrove opened a new issue, #2341: URL: https://github.com/apache/datafusion-comet/issues/2341 ### What is the problem the feature request solves? # Custom Authentication & External File Systems *(Access hdfs/hadoop-aws via JNI)* ## 1. HDFS support via `fs-hdfs` - [x] Fo

Re: [PR] Unnest Correlated Subquery [datafusion]

2025-09-08 Thread via GitHub
duongcongtoai commented on PR #17110: URL: https://github.com/apache/datafusion/pull/17110#issuecomment-3267559977 PR to fix null propagation: https://github.com/irenjj/datafusion/pull/1/files

Re: [I] `DataFrame.cache()` does not work in distributed environments [datafusion]

2025-09-08 Thread via GitHub
milenkovicm commented on issue #17297: URL: https://github.com/apache/datafusion/issues/17297#issuecomment-3267800666 > > [datafusion/datafusion/core/src/execution/context/mod.rs](https://github.com/apache/datafusion/blob/fd7df66724f958a2d44ba1fda1b11dc6833f0296/datafusion/core/src/execution

Re: [PR] make `giscus` comment section opt-in to comply with ASF policy [datafusion-site]

2025-09-08 Thread via GitHub
kevinjqliu commented on PR #106: URL: https://github.com/apache/datafusion-site/pull/106#issuecomment-3267013012 Thank you for the screenshot showing network traffic! I updated the PR with your suggestion.

[I] Upgrade to DataFusion 50.0.0 [datafusion-comet]

2025-09-08 Thread via GitHub
andygrove opened a new issue, #2343: URL: https://github.com/apache/datafusion-comet/issues/2343 ### What is the problem the feature request solves? _No response_ ### Describe the potential solution _No response_ ### Additional context _No response_

Re: [PR] Add support for ClickHouse CSE. [datafusion-sqlparser-rs]

2025-09-08 Thread via GitHub
pravic commented on code in PR #2024: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2024#discussion_r2332055186 ## src/parser/mod.rs: ## @@ -12260,6 +12260,27 @@ impl<'a> Parser<'a> { }) } +/// Parse a CTE or CSE. +pub fn parse_cte_or_cse(&

Re: [PR] feat(spark): implement Spark `map` function `map_from_arrays` [datafusion]

2025-09-08 Thread via GitHub
SparkApplicationMaster commented on code in PR #17456: URL: https://github.com/apache/datafusion/pull/17456#discussion_r2331438737 ## datafusion/spark/src/function/map/map_from_arrays.rs: ## @@ -0,0 +1,207 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

Re: [PR] chore(deps): bump wasm-bindgen-test from 0.3.50 to 0.3.51 [datafusion]

2025-09-08 Thread via GitHub
comphead merged PR #17470: URL: https://github.com/apache/datafusion/pull/17470

Re: [PR] feat: [iceberg] delete rows support using selection vectors [datafusion-comet]

2025-09-08 Thread via GitHub
comphead commented on code in PR #2346: URL: https://github.com/apache/datafusion-comet/pull/2346#discussion_r2331760084 ## common/src/main/java/org/apache/comet/vector/CometSelectionVector.java: ## @@ -0,0 +1,279 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] feat: implement job data cleanup in pull-staged strategy #1219 [datafusion-ballista]

2025-09-08 Thread via GitHub
KR-bluejay commented on code in PR #1314: URL: https://github.com/apache/datafusion-ballista/pull/1314#discussion_r2331838811 ## ballista/executor/src/execution_loop.rs: ## @@ -88,8 +90,29 @@ pub async fn poll_loop match poll_work_result { Ok(result) =>

Re: [PR] POC: datafusion-cli instrumented object store [datafusion]

2025-09-08 Thread via GitHub
BlakeOrth commented on PR #17266: URL: https://github.com/apache/datafusion/pull/17266#issuecomment-3268395582 > Also, BTW tried it out but it doesn't seem to be working anymore @alamb I've found the bug and fixed this behavior. Although this is one of those scenarios where I'm somewh

Re: [PR] feat: implement job data cleanup in pull-staged strategy #1219 [datafusion-ballista]

2025-09-08 Thread via GitHub
milenkovicm commented on code in PR #1314: URL: https://github.com/apache/datafusion-ballista/pull/1314#discussion_r2331232678 ## ballista/executor/src/execution_loop.rs: ## @@ -88,8 +90,29 @@ pub async fn poll_loop match poll_work_result { Ok(result) =>

[PR] Always use 'indent' format for explain verbose [datafusion]

2025-09-08 Thread via GitHub
petern48 opened a new pull request, #17481: URL: https://github.com/apache/datafusion/pull/17481 ## Which issue does this PR close? - Closes #17480 ## Rationale for this change `datafusion-cli` uses `tree` format by default. In order to get proper explain ver

[PR] POC: `ClassicJoin` for PWMJ [datafusion]

2025-09-08 Thread via GitHub
jonathanc-n opened a new pull request, #17482: URL: https://github.com/apache/datafusion/pull/17482 ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tes

[PR] chore: Remove IcebergCometBatchReader.java [datafusion-comet]

2025-09-08 Thread via GitHub
comphead opened a new pull request, #2347: URL: https://github.com/apache/datafusion-comet/pull/2347 ## Which issue does this PR close? Remove unused code Closes #. ## Rationale for this change ## What changes are included in this PR? ##

Re: [PR] WIP: Upgrade to arrow 56.1.0 [datafusion]

2025-09-08 Thread via GitHub
alamb commented on PR #17275: URL: https://github.com/apache/datafusion/pull/17275#issuecomment-3267868833 I saw this, @nuno-faria. I hope to look at it tomorrow.

Re: [PR] feat: [iceberg] delete rows support using selection vectors [datafusion-comet]

2025-09-08 Thread via GitHub
parthchandra commented on code in PR #2346: URL: https://github.com/apache/datafusion-comet/pull/2346#discussion_r2331698398 ## native/core/src/execution/operators/scan.rs: ## @@ -239,6 +239,87 @@ impl ScanExec { let mut timer = arrow_ffi_time.timer(); +// C

[I] `EXPLAIN VERBOSE` only works when format is set to (non-default) 'indent' [datafusion]

2025-09-08 Thread via GitHub
petern48 opened a new issue, #17480: URL: https://github.com/apache/datafusion/issues/17480 ### Describe the bug On the `datafusion-cli`, `tree` was made the default explain format in [this PR](https://github.com/apache/datafusion/pull/15427). Now, when we use `EXPLAIN VERBOSE`, we s

Re: [I] `EXPLAIN VERBOSE` only works when format is set to (non-default) 'indent' [datafusion]

2025-09-08 Thread via GitHub
petern48 commented on issue #17480: URL: https://github.com/apache/datafusion/issues/17480#issuecomment-3268694949 take

Re: [PR] Blog: Add table of contents to blog article [datafusion-site]

2025-09-08 Thread via GitHub
kevinjqliu commented on code in PR #107: URL: https://github.com/apache/datafusion-site/pull/107#discussion_r2330729667 ## plugins/extract_toc/README.md: ## @@ -0,0 +1,137 @@ +Extract Table of Content + + +A Pelican plugin to extract table of contents (To

Re: [PR] docs: Add note about Root CA Certificate location with native scans [datafusion-comet]

2025-09-08 Thread via GitHub
comphead commented on code in PR #2325: URL: https://github.com/apache/datafusion-comet/pull/2325#discussion_r2330734463 ## docs/source/user-guide/latest/datasources.md: ## @@ -175,6 +175,13 @@ The `native_datafusion` and `native_iceberg_compat` Parquet scan implementations

Re: [PR] feat: Implement `DFSchema.print_schema_tree()` method [datafusion]

2025-09-08 Thread via GitHub
alamb commented on code in PR #17459: URL: https://github.com/apache/datafusion/pull/17459#discussion_r2330714653 ## datafusion/common/src/dfschema.rs: ## @@ -863,6 +863,208 @@ impl DFSchema { .zip(self.inner.fields().iter()) .map(|(qualifier, field)| (

[PR] fix(SubqueryAlias): use maybe_project_redundant_column [datafusion]

2025-09-08 Thread via GitHub
notfilippo opened a new pull request, #17478: URL: https://github.com/apache/datafusion/pull/17478 ## Which issue does this PR close? - Closes #17405. ## Rationale for this change When creating nested `SubqueryAlias` operations in complex joins, DataFusion was incorrectl

[PR] Support join cardinality estimation if distinct_count is set [datafusion]

2025-09-08 Thread via GitHub
jackkleeman opened a new pull request, #17476: URL: https://github.com/apache/datafusion/pull/17476 The goal of this PR is to allow cardinality statistics being passed through joins even if fields don't have max and min values set, as long as a distinct value estimate is provided. Cu

Re: [PR] Push down preferred sorts into `TableScan` logical plan node [datafusion]

2025-09-08 Thread via GitHub
adriangb commented on code in PR #17337: URL: https://github.com/apache/datafusion/pull/17337#discussion_r2330741000 ## datafusion/optimizer/src/push_down_sort.rs: ## @@ -0,0 +1,580 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

Re: [PR] [branch-50] fix: Implement AggregateUDFImpl::reverse_expr for StringAgg (#17165) [datafusion]

2025-09-08 Thread via GitHub
alamb commented on PR #17473: URL: https://github.com/apache/datafusion/pull/17473#issuecomment-3266991329 Thanks @comphead and @nuno-faria

Re: [PR] docs: Update supported expressions and operators in user guide [datafusion-comet]

2025-09-08 Thread via GitHub
comphead commented on code in PR #2327: URL: https://github.com/apache/datafusion-comet/pull/2327#discussion_r2330739057 ## docs/source/user-guide/latest/datatypes.md: ## @@ -19,27 +19,29 @@ # Supported Spark Data Types Review Comment: when Comet says supported does it me

Re: [PR] docs: Use `sphinx-reredirects` for redirects [datafusion-comet]

2025-09-08 Thread via GitHub
andygrove merged PR #2324: URL: https://github.com/apache/datafusion-comet/pull/2324

Re: [PR] Fix `PartialOrd` for logical plan nodes and expressions [datafusion]

2025-09-08 Thread via GitHub
alamb commented on code in PR #17438: URL: https://github.com/apache/datafusion/pull/17438#discussion_r2330666140 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2114,7 +2116,9 @@ pub struct Values { // Manual implementation needed because of `schema` field. Comparison excl

Re: [PR] docs: Update supported expressions and operators in user guide [datafusion-comet]

2025-09-08 Thread via GitHub
comphead commented on code in PR #2327: URL: https://github.com/apache/datafusion-comet/pull/2327#discussion_r2330746423 ## docs/source/user-guide/latest/operators.md: ## @@ -22,16 +22,24 @@ The following Spark operators are currently replaced with native versions. Query stage

Re: [PR] Generalize struct-to-struct casting with CastOptions and SchemaAdapter integration [datafusion]

2025-09-08 Thread via GitHub
adriangb commented on code in PR #17468: URL: https://github.com/apache/datafusion/pull/17468#discussion_r2330852818 ## datafusion/common/src/nested_struct.rs: ## @@ -215,40 +271,81 @@ mod tests { }; } +fn field(name: &str, data_type: DataType) -> Field { +

Re: [PR] Generalize struct-to-struct casting with CastOptions and SchemaAdapter integration [datafusion]

2025-09-08 Thread via GitHub
adriangb commented on PR #17468: URL: https://github.com/apache/datafusion/pull/17468#issuecomment-3267209382 Btw I approved but let's leave this up for another day or so to see if anyone else has feedback

Re: [PR] Dynamic filters blog post (rev 2) [datafusion-site]

2025-09-08 Thread via GitHub
djanderson commented on code in PR #103: URL: https://github.com/apache/datafusion-site/pull/103#discussion_r2330868489 ## content/blog/2025-09-10-dynamic-filters.md: ## @@ -0,0 +1,643 @@ +--- +layout: post +title: Dynamic Filters: Passing Information Between Operators During Ex

Re: [PR] POC: datafusion-cli instrumented object store [datafusion]

2025-09-08 Thread via GitHub
BlakeOrth commented on PR #17266: URL: https://github.com/apache/datafusion/pull/17266#issuecomment-3267229694 @alamb Thanks for the review! I'll take a look into why it's suddenly stopped working (or perhaps it's a "works on my machine" situation, which is also never good). > I thin

[I] [native_iceberg_compat] Add support for Parquet modular decryption [datafusion-comet]

2025-09-08 Thread via GitHub
andygrove opened a new issue, #2339: URL: https://github.com/apache/datafusion-comet/issues/2339 ### What is the problem the feature request solves? Placeholder. Details TBD. - Comet needs native KMS provider that can call into Spark via JNI ### Describe the potential sol

Re: [PR] feature: sort by/cluster by/distribute by [datafusion]

2025-09-08 Thread via GitHub
alamb commented on PR #16310: URL: https://github.com/apache/datafusion/pull/16310#issuecomment-3267969860 Sadly I don't think I will have time to review this feature for a while.

Re: [PR] fix: Incorrect memory accounting in `array_agg` function [datafusion]

2025-09-08 Thread via GitHub
github-actions[bot] closed pull request #16519: fix: Incorrect memory accounting in `array_agg` function URL: https://github.com/apache/datafusion/pull/16519

Re: [PR] test: add fuzz test for doing aggregation with larger than memory groups and sorting with limited memory [datafusion]

2025-09-08 Thread via GitHub
github-actions[bot] closed pull request #15727: test: add fuzz test for doing aggregation with larger than memory groups and sorting with limited memory URL: https://github.com/apache/datafusion/pull/15727

Re: [PR] Statistics: Implement SampledDistribution variant to Distribution to … [datafusion]

2025-09-08 Thread via GitHub
github-actions[bot] closed pull request #16614: Statistics: Implement SampledDistribution variant to Distribution to … URL: https://github.com/apache/datafusion/pull/16614

Re: [I] Various issues with Comet's handling of aggregates [datafusion-comet]

2025-09-08 Thread via GitHub
andygrove commented on issue #2294: URL: https://github.com/apache/datafusion-comet/issues/2294#issuecomment-3268247088 duplicate of https://github.com/apache/datafusion-comet/issues/1267

Re: [PR] Support csv truncated rows in datafusion [datafusion]

2025-09-08 Thread via GitHub
zhuqi-lucas merged PR #17465: URL: https://github.com/apache/datafusion/pull/17465

Re: [PR] Support csv truncated rows in datafusion [datafusion]

2025-09-08 Thread via GitHub
zhuqi-lucas commented on PR #17465: URL: https://github.com/apache/datafusion/pull/17465#issuecomment-326862 Thank you @xudong963, @alamb!

[PR] fix: Fallback length function with non-string input [datafusion-comet]

2025-09-08 Thread via GitHub
wForget opened a new pull request, #2349: URL: https://github.com/apache/datafusion-comet/pull/2349 ## Which issue does this PR close? Closes #2338. ## Rationale for this change The `length` function panics with binary input ## What changes are included in this PR

Re: [PR] feat: Make supported hadoop filesystem schemes configurable [datafusion-comet]

2025-09-08 Thread via GitHub
parthchandra merged PR #2272: URL: https://github.com/apache/datafusion-comet/pull/2272

[PR] chore: Add hdfs feature test job [datafusion-comet]

2025-09-08 Thread via GitHub
wForget opened a new pull request, #2350: URL: https://github.com/apache/datafusion-comet/pull/2350 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes t

Re: [PR] POC: `ClassicJoin` for PWMJ [datafusion]

2025-09-08 Thread via GitHub
jonathanc-n commented on code in PR #17482: URL: https://github.com/apache/datafusion/pull/17482#discussion_r2331953794 ## datafusion/sqllogictest/test_files/joins.slt: ## @@ -5161,6 +5178,44 @@ WHERE k1 < 0 +# PiecewiseMergeJoin Test +statement ok +set datafusion.exec

[PR] ignore [datafusion-comet]

2025-09-08 Thread via GitHub
andygrove opened a new pull request, #2352: URL: https://github.com/apache/datafusion-comet/pull/2352 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

Re: [I] `EXPLAIN VERBOSE` only works when format is set to (non-default) 'indent' [datafusion]

2025-09-08 Thread via GitHub
2010YOUY01 commented on issue #17480: URL: https://github.com/apache/datafusion/issues/17480#issuecomment-3268722339 Good idea! Perhaps we can override `EXPLAIN ANALYZE` too?

Re: [PR] docs: Add note about Root CA Certificate location with native scans [datafusion-comet]

2025-09-08 Thread via GitHub
mbutrovich commented on code in PR #2325: URL: https://github.com/apache/datafusion-comet/pull/2325#discussion_r2331486201 ## docs/source/user-guide/latest/datasources.md: ## @@ -175,6 +175,13 @@ The `native_datafusion` and `native_iceberg_compat` Parquet scan implementations

Re: [PR] Improve `Hash` and `Ord` speed for `dyn LogicalType` [datafusion]

2025-09-08 Thread via GitHub
findepi merged PR #17437: URL: https://github.com/apache/datafusion/pull/17437

Re: [PR] Extract complex default impls from AggregateUDFImpl trait [datafusion]

2025-09-08 Thread via GitHub
findepi merged PR #17391: URL: https://github.com/apache/datafusion/pull/17391

Re: [PR] fix: modify the type coercion logic to avoid planning error [datafusion]

2025-09-08 Thread via GitHub
kosiew commented on code in PR #17418: URL: https://github.com/apache/datafusion/pull/17418#discussion_r2329417002 ## datafusion/sqllogictest/test_files/select.slt: ## @@ -620,6 +620,12 @@ select * from (values (1)) LIMIT 10*100; 1 +# select both nulls with basic arithm

[PR] Add table of contents to blog article [datafusion-site]

2025-09-08 Thread via GitHub
nuno-faria opened a new pull request, #107: URL: https://github.com/apache/datafusion-site/pull/107 Having a table of contents on the side makes an article easier to follow in my opinion. It looks like this on wide screens, following the page when scrolling: https://github.com/user-att

[PR] chore(deps): bump log from 0.4.27 to 0.4.28 [datafusion]

2025-09-08 Thread via GitHub
dependabot[bot] opened a new pull request, #17471: URL: https://github.com/apache/datafusion/pull/17471 Bumps [log](https://github.com/rust-lang/log) from 0.4.27 to 0.4.28. Release notes Sourced from log's releases (https://github.com/rust-lang/log/releases). 0.4.28 What's

Re: [PR] feat: Support distributed plan in `EXPLAIN` command [datafusion-ballista]

2025-09-08 Thread via GitHub
milenkovicm commented on PR #1309: URL: https://github.com/apache/datafusion-ballista/pull/1309#issuecomment-3265134524 @danielhumanmod I will try to review it in the next few days, thanks a lot.

Re: [I] job data cleanup does not work if `pull-staged` strategy selected [datafusion-ballista]

2025-09-08 Thread via GitHub
milenkovicm commented on issue #1219: URL: https://github.com/apache/datafusion-ballista/issues/1219#issuecomment-3265143781 Thanks for the PR @KR-bluejay 1. I'm not sure, will have a look 2. I don't think this is a user-facing change, so it does not matter much; will have a look a

[PR] chore(deps): bump wasm-bindgen-test from 0.3.50 to 0.3.51 [datafusion]

2025-09-08 Thread via GitHub
dependabot[bot] opened a new pull request, #17470: URL: https://github.com/apache/datafusion/pull/17470 Bumps [wasm-bindgen-test](https://github.com/wasm-bindgen/wasm-bindgen) from 0.3.50 to 0.3.51. Commits See full diff in https://github.com/wasm-bindgen/wasm-bindgen/commits

Re: [PR] Fix ambiguous column names in substrait conversion as a result of literals having the same name during conversion. [datafusion]

2025-09-08 Thread via GitHub
xanderbailey commented on code in PR #17299: URL: https://github.com/apache/datafusion/pull/17299#discussion_r2329524658 ## datafusion/substrait/src/logical_plan/consumer/rel/project_rel.rs: ## @@ -62,7 +62,17 @@ pub async fn from_project_rel( // to transform it

Re: [I] job data cleanup does not work if `pull-staged` strategy selected [datafusion-ballista]

2025-09-08 Thread via GitHub
KR-bluejay commented on issue #1219: URL: https://github.com/apache/datafusion-ballista/issues/1219#issuecomment-3265180186 Got it, thank you for the update! I'll wait for your feedback.

[I] Incorrect null literal handling for `to_local_time()` function (SQLancer) [datafusion]

2025-09-08 Thread via GitHub
2010YOUY01 opened a new issue, #17472: URL: https://github.com/apache/datafusion/issues/17472 ### Describe the bug datafusion-cli is compiled from the latest main commit https://github.com/apache/datafusion/commit/d19bf524e384bc24e509c70f1806b6f330829529 ``` > select to_loca

Re: [I] `length` function panic with binary input [datafusion-comet]

2025-09-08 Thread via GitHub
wForget commented on issue #2338: URL: https://github.com/apache/datafusion-comet/issues/2338#issuecomment-3266161390 Currently, Comet uses DataFusion's `character_length` function, which only supports string types. https://github.com/apache/datafusion/blob/main/datafusion/functions/

Re: [PR] Dynamic filters blog post (rev 2) [datafusion-site]

2025-09-08 Thread via GitHub
adriangb commented on PR #103: URL: https://github.com/apache/datafusion-site/pull/103#issuecomment-3266291107 Morning bike ride thought: the goal of a hash join is to split up the work into multiple partitions so that we can do work in parallel. The hashing of the join keys is just one way

Re: [PR] Dynamic filters blog post (rev 2) [datafusion-site]

2025-09-08 Thread via GitHub
alamb commented on PR #103: URL: https://github.com/apache/datafusion-site/pull/103#issuecomment-3266398980 (BTW I can't merge this PR unless another committer approves it) https://github.com/user-attachments/assets/296afcf4-60d6-451a-a69e-c80220487469

Re: [I] Release DataFusion `50.0.0` (Aug/Sep 2025) [datafusion]

2025-09-08 Thread via GitHub
timsaucer commented on issue #16799: URL: https://github.com/apache/datafusion/issues/16799#issuecomment-3266345417 Tested the branch on `datafusion-python` and it went mostly smoothly. https://github.com/apache/datafusion-python/pull/1231

Re: [PR] Dynamic filters blog post (rev 2) [datafusion-site]

2025-09-08 Thread via GitHub
alamb commented on PR #103: URL: https://github.com/apache/datafusion-site/pull/103#issuecomment-3266396241 > Yes I can share some ad-hoc tests, using a simple join query with TPC-H data (sf=20). The ideal execution plan for the following query is to first filter `customer` by `c_phone` and

Re: [PR] Dynamic filters blog post (rev 2) [datafusion-site]

2025-09-08 Thread via GitHub
alamb commented on PR #103: URL: https://github.com/apache/datafusion-site/pull/103#issuecomment-3266424754 > Morning bike ride thought: the goal of a hash join is to split up the work into multiple partitions so that we can do work in parallel. The hashing of the join keys is just one way

Re: [I] Release DataFusion `50.0.0` (Aug/Sep 2025) [datafusion]

2025-09-08 Thread via GitHub
xudong963 commented on issue #16799: URL: https://github.com/apache/datafusion/issues/16799#issuecomment-3265825357 FYI, I'm testing 50 for the mv repo, and I plan to start the vote process on Thu/Fri.

Re: [PR] fix: modify the type coercion logic to avoid planning error [datafusion]

2025-09-08 Thread via GitHub
kosiew commented on code in PR #17418: URL: https://github.com/apache/datafusion/pull/17418#discussion_r2329411507 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -316,6 +321,17 @@ impl<'a> BinaryTypeCoercer<'a> { } } +#[inline] +fn is_both_null(lhs: &DataTy

Re: [I] COALESCE expr in datafusion should perform lazy evaluation of the operands [datafusion]

2025-09-08 Thread via GitHub
alamb closed issue #17322: COALESCE expr in datafusion should perform lazy evaluation of the operands URL: https://github.com/apache/datafusion/issues/17322

[PR] chore(deps): bump uuid from 1.18.0 to 1.18.1 in /native [datafusion-comet]

2025-09-08 Thread via GitHub
dependabot[bot] opened a new pull request, #2336: URL: https://github.com/apache/datafusion-comet/pull/2336 Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.18.0 to 1.18.1. Release notes Sourced from [uuid's releases](https://github.com/uuid-rs/uuid/releases). v1.18.1 W

Re: [PR] Improve `PartialEq`, `Eq` speed for `LexOrdering`, make `PartialEq` and `PartialOrd` consistent [datafusion]

2025-09-08 Thread via GitHub
findepi commented on code in PR #17442: URL: https://github.com/apache/datafusion/pull/17442#discussion_r2330219757 ## datafusion/physical-expr-common/src/sort_expr.rs: ## @@ -367,8 +367,21 @@ impl LexOrdering { /// Creates a new [`LexOrdering`] from the given vector of sor

Re: [PR] Improve `PartialEq`, `Eq` speed for `LexOrdering`, make `PartialEq` and `PartialOrd` consistent [datafusion]

2025-09-08 Thread via GitHub
findepi commented on code in PR #17442: URL: https://github.com/apache/datafusion/pull/17442#discussion_r2330208915 ## datafusion/physical-expr-common/src/sort_expr.rs: ## @@ -367,8 +367,21 @@ impl LexOrdering { /// Creates a new [`LexOrdering`] from the given vector of sor

Re: [PR] Window Functions Order Conservation -- Follow-up On Set Monotonicity [datafusion]

2025-09-08 Thread via GitHub
findepi commented on code in PR #14813: URL: https://github.com/apache/datafusion/pull/14813#discussion_r2330233448 ## datafusion/physical-plan/src/windows/mod.rs: ## @@ -337,30 +342,151 @@ pub(crate) fn window_equivalence_properties( input: &Arc, window_exprs: &[Arc],

Re: [PR] fix: Expose hash to FFI udf/udaf/udwf to fix their Eq [datafusion]

2025-09-08 Thread via GitHub
findepi commented on PR #17350: URL: https://github.com/apache/datafusion/pull/17350#issuecomment-3266293326 @timsaucer what if we simply don't do https://github.com/apache/datafusion/issues/17087 ?

Re: [PR] Window Functions Order Conservation -- Follow-up On Set Monotonicity [datafusion]

2025-09-08 Thread via GitHub
berkaysynnada commented on code in PR #14813: URL: https://github.com/apache/datafusion/pull/14813#discussion_r2330386653 ## datafusion/physical-plan/src/windows/mod.rs: ## @@ -337,30 +342,151 @@ pub(crate) fn window_equivalence_properties( input: &Arc, window_exprs: &

Re: [PR] Enable dynamic filter pushdown for LEFT/RIGHT/SEMI/ANTI/Mark joins; surface probe metadata in plans; add join-preservation docs [datafusion]

2025-09-08 Thread via GitHub
adriangb commented on PR #17090: URL: https://github.com/apache/datafusion/pull/17090#issuecomment-3266527817 Amazing! On Mon, Sep 8, 2025 at 2:05 AM kosiew ***@***.***> wrote: > *kosiew* left a comment (apache/datafusion#17090) >

Re: [PR] Add PhysicalExpr::is_volatile_node to upgrade guide [datafusion]

2025-09-08 Thread via GitHub
adriangb commented on code in PR #17443: URL: https://github.com/apache/datafusion/pull/17443#discussion_r2330274634 ## docs/source/library-user-guide/upgrading.md: ## @@ -285,6 +285,24 @@ If you have custom implementations of `FileOpener` or work directly with `FileOp [#173

Re: [I] Release DataFusion `50.0.0` (Aug/Sep 2025) [datafusion]

2025-09-08 Thread via GitHub
mbutrovich commented on issue #16799: URL: https://github.com/apache/datafusion/issues/16799#issuecomment-3266559582 I just bumped my draft Comet PR to use branch-50 instead of a recent commit on main. I'll check on CI after my next flight.

Re: [PR] chore(deps): bump log4rs from 1.3.0 to 1.4.0 in /native [datafusion-comet]

2025-09-08 Thread via GitHub
andygrove merged PR #2334: URL: https://github.com/apache/datafusion-comet/pull/2334

[PR] chore(deps): bump cc from 1.2.35 to 1.2.36 in /native [datafusion-comet]

2025-09-08 Thread via GitHub
dependabot[bot] opened a new pull request, #2337: URL: https://github.com/apache/datafusion-comet/pull/2337 Bumps [cc](https://github.com/rust-lang/cc-rs) from 1.2.35 to 1.2.36. Changelog Sourced from [cc's changelog](https://github.com/rust-lang/cc-rs/blob/main/CHANGELOG.md).

Re: [PR] fix: lazy evaluation for coalesce [datafusion]

2025-09-08 Thread via GitHub
alamb merged PR #17357: URL: https://github.com/apache/datafusion/pull/17357
