Re: [PR] chore: refactor Substrait consumer's "rename_field" and implement the rest of types [datafusion]

2025-06-10 Thread via GitHub
gabotechs commented on code in PR #16345: URL: https://github.com/apache/datafusion/pull/16345#discussion_r2139335258 ## datafusion/substrait/src/logical_plan/consumer/utils.rs: ## @@ -81,98 +81,167 @@ pub(super) fn next_struct_field_name( } } -pub(super) fn rename_field

Re: [PR] Add compression option to SpillManager [datafusion]

2025-06-10 Thread via GitHub
ding-young commented on PR #16268: URL: https://github.com/apache/datafusion/pull/16268#issuecomment-2961345444 This is ready for review :) @2010YOUY01 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] Investigate performance tradeoff in compressing spill files [datafusion]

2025-06-10 Thread via GitHub
ding-young commented on issue #16367: URL: https://github.com/apache/datafusion/issues/16367#issuecomment-2961338341 take

[I] Investigate performance tradeoff in compressing spill files [datafusion]

2025-06-10 Thread via GitHub
ding-young opened a new issue, #16367: URL: https://github.com/apache/datafusion/issues/16367 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/16065 , Related to https://github.com/apache/datafusion/issues/14078

Re: [PR] Add support for glob string in datafusion-cli query [datafusion]

2025-06-10 Thread via GitHub
a-agmon commented on PR #16332: URL: https://github.com/apache/datafusion/pull/16332#issuecomment-2961294401 @alamb - thank you very much for the generous comments. I appreciate it. Re naming - I completely agree. Was just wondering whether it's better to introduce one function that infe

[I] row-wise min and max [datafusion]

2025-06-10 Thread via GitHub
drtconway opened a new issue, #16366: URL: https://github.com/apache/datafusion/issues/16366 ### Is your feature request related to a problem or challenge? Thanks for an awesome platform! I'm really loving it! In R `pmin` and `pmax` do row-wise min and max respectively. They're

Re: [PR] [PoC] Add API for tracking distinct buffers in `MemoryPool` by reference count [datafusion]

2025-06-10 Thread via GitHub
Dandandan commented on PR #16359: URL: https://github.com/apache/datafusion/pull/16359#issuecomment-2961277605 > Thank you, this solves the memory overcounting issue across batches. I got some suggestions/questions. > > This new pool implementation might cause some issue for `MemoryR

Re: [PR] Unify Metadata Handing: use `FieldMetadata` in `Expr::Alias` and `ExprSchemable` [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16320: URL: https://github.com/apache/datafusion/pull/16320#issuecomment-2961275410 🤖: Benchmark completed Details ``` group alamb_field_metadata2 main -

Re: [I] [Epic] Pipeline breaking cancellation support and improvement [datafusion]

2025-06-10 Thread via GitHub
zhuqi-lucas commented on issue #16353: URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2961255260 Thank you @alamb !

Re: [PR] Add support for glob string in datafusion-cli query [datafusion]

2025-06-10 Thread via GitHub
alamb commented on code in PR #16332: URL: https://github.com/apache/datafusion/pull/16332#discussion_r2139195148 ## datafusion-cli/src/functions.rs: ## @@ -460,3 +473,92 @@ impl TableFunctionImpl for ParquetMetadataFunc { Ok(Arc::new(parquet_metadata)) } } + +///

Re: [PR] [PoC] Add API for tracking distinct buffers in `MemoryPool` by reference count [datafusion]

2025-06-10 Thread via GitHub
Dandandan commented on code in PR #16359: URL: https://github.com/apache/datafusion/pull/16359#discussion_r2139196390 ## datafusion/execution/src/memory_pool/mod.rs: ## @@ -131,14 +133,58 @@ pub trait MemoryPool: Send + Sync + std::fmt::Debug { /// This must always succeed

Re: [PR] fix: Fix SparkSha2 to be compliant with Spark response and add support for Int32 [datafusion]

2025-06-10 Thread via GitHub
rishvin commented on PR #16350: URL: https://github.com/apache/datafusion/pull/16350#issuecomment-2961232868 Thanks @andygrove / @getChan for the feedback. Please review the changes.

Re: [PR] Re-Add CodeCov [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #15256: URL: https://github.com/apache/datafusion/pull/15256#issuecomment-2961234354 I am trying to clean up the review queue and it wasn't clear to me what the plan for this PR is, so marking it as draft

Re: [PR] Support datafusion-cli access to public S3 buckets that do not require authentication [datafusion]

2025-06-10 Thread via GitHub
alamb commented on code in PR #16300: URL: https://github.com/apache/datafusion/pull/16300#discussion_r2139183428 ## datafusion-cli/src/object_storage.rs: ## @@ -105,9 +106,52 @@ pub async fn get_s3_object_store_builder( builder = builder.with_allow_http(*allow_http);

[I] Improve performance of `datafusion-cli` when reading from remote storage [datafusion]

2025-06-10 Thread via GitHub
alamb opened a new issue, #16365: URL: https://github.com/apache/datafusion/issues/16365 ### Is your feature request related to a problem or challenge? - Part of https://github.com/apache/datafusion/pull/16300/files While testing https://github.com/apache/datafusion/pull/16300,

Re: [PR] Unify Metadata Handing: use `FieldMetadata` in `Expr::Alias` and `ExprSchemable` [datafusion]

2025-06-10 Thread via GitHub
alamb commented on code in PR #16320: URL: https://github.com/apache/datafusion/pull/16320#discussion_r2139153886 ## datafusion/expr/src/expr.rs: ## @@ -601,7 +625,7 @@ pub struct Alias { pub expr: Box, pub relation: Option, pub name: String, -pub metadata: Op

Re: [I] DF 48 upgrade guide missing window function breaking change [datafusion]

2025-06-10 Thread via GitHub
alamb commented on issue #16326: URL: https://github.com/apache/datafusion/issues/16326#issuecomment-2961213135 I believe this was fixed in - https://github.com/apache/datafusion/pull/16313

Re: [PR] Unify Metadata Handing: use `FieldMetadata` in `Expr::Alias` and `ExprSchemable` [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16320: URL: https://github.com/apache/datafusion/pull/16320#issuecomment-2961193827 🤖 `./gh_compare_branch_bench.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh) Running Linux aal-dev 6.11.0-1015-gcp #15~

Re: [PR] Encapsulate metadata for literals on to a `FieldMetadata` structure [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16317: URL: https://github.com/apache/datafusion/pull/16317#issuecomment-2961187091 Here is a follow on PR to also use the same structure in `Expr::Alias` (which also had metadata, but was stored as a HashMap) - https://github.com/apache/datafusion/pull/16320/files

Re: [I] Request to update crates.io ownership [datafusion]

2025-06-10 Thread via GitHub
alamb commented on issue #16323: URL: https://github.com/apache/datafusion/issues/16323#issuecomment-2961181201 I am sorry @xudong963 -- there were apparently many other crates I didn't add you as an owner for. I have just sent a bunch more invitations -- hopefully after you accept them th

Re: [PR] feat: mapping sql Char/Text/String default to Utf8View [datafusion]

2025-06-10 Thread via GitHub
zhuqi-lucas commented on code in PR #16290: URL: https://github.com/apache/datafusion/pull/16290#discussion_r2139142869 ## datafusion/sqllogictest/test_files/arrow_files.slt: ## @@ -61,22 +61,12 @@ LOCATION '../core/tests/data/partitioned_table_arrow/' PARTITIONED BY (part);

Re: [PR] Add support `UInt64` and other integer data types for `to_hex` [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16335: URL: https://github.com/apache/datafusion/pull/16335#issuecomment-2961170037 Thanks @tlm365 and @xudong963

Re: [I] Request to update crates.io ownership [datafusion]

2025-06-10 Thread via GitHub
alamb commented on issue #16323: URL: https://github.com/apache/datafusion/issues/16323#issuecomment-2961176597 👀

Re: [PR] Revert use file schema in parquet pruning [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16086: URL: https://github.com/apache/datafusion/pull/16086#issuecomment-2961173895 FWIW @phillipleblanc also hit this as well in SpiceAI. See this for more details - https://github.com/spiceai/spiceai/pull/6178

Re: [PR] Add support `UInt64` and other integer data types for `to_hex` [datafusion]

2025-06-10 Thread via GitHub
alamb merged PR #16335: URL: https://github.com/apache/datafusion/pull/16335

Re: [I] to_hex cannot take UInt64 [datafusion]

2025-06-10 Thread via GitHub
alamb closed issue #16327: to_hex cannot take UInt64 URL: https://github.com/apache/datafusion/issues/16327

Re: [PR] Encapsulate metadata for literals on to a `FieldMetadata` structure [datafusion]

2025-06-10 Thread via GitHub
alamb merged PR #16317: URL: https://github.com/apache/datafusion/pull/16317

Re: [PR] Document `copy_array_data` function with example [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16361: URL: https://github.com/apache/datafusion/pull/16361#issuecomment-2961165506 Thank you for the review @2010YOUY01

Re: [I] Optimize `NestedLoopJoinExec` Memory Usage [datafusion]

2025-06-10 Thread via GitHub
UBarney commented on issue #16364: URL: https://github.com/apache/datafusion/issues/16364#issuecomment-2961161122 take

Re: [PR] fix: support read Struct by user schema [datafusion-comet]

2025-06-10 Thread via GitHub
comphead commented on code in PR #1860: URL: https://github.com/apache/datafusion-comet/pull/1860#discussion_r2139136916 ## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ## @@ -2728,4 +2728,35 @@ class CometExpressionSuite extends CometTestBase with Adaptive

Re: [PR] Perf: load default Utf8View for CSV datatype [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16243: URL: https://github.com/apache/datafusion/pull/16243#issuecomment-2961158371 Marking as draft as I think this PR is no longer waiting on feedback and I am trying to make it easier to find PRs in need of review. Please mark it as ready for review when it is read

Re: [PR] feat: Parquet modular encryption [datafusion]

2025-06-10 Thread via GitHub
alamb commented on code in PR #16351: URL: https://github.com/apache/datafusion/pull/16351#discussion_r2139135378 ## datafusion/common/src/config.rs: ## @@ -591,6 +930,12 @@ config_namespace! { /// writing out already in-memory data, such as from a cached /// d

[I] Optimize `NestedLoopJoinExec` Memory Usage [datafusion]

2025-06-10 Thread via GitHub
UBarney opened a new issue, #16364: URL: https://github.com/apache/datafusion/issues/16364 ### Is your feature request related to a problem or challenge? The current Nested Loop Join implementation follows this simplified logic: 1. Buffer the Build Side: All data from the left (buil

Re: [I] Ensure Substrait consumer can handle expressions in VirtualTable [datafusion]

2025-06-10 Thread via GitHub
lorenarosati commented on issue #16363: URL: https://github.com/apache/datafusion/issues/16363#issuecomment-2961113864 take

[I] Ensure Substrait consumer can handle expressions in VirtualTable [datafusion]

2025-06-10 Thread via GitHub
lorenarosati opened a new issue, #16363: URL: https://github.com/apache/datafusion/issues/16363 ### Describe the bug When we convert a Substrait plan (which includes a virtual table) to a DataFusion LogicalPlan using the `from_substrait_plan()` [function](https://github.com/apache/da

Re: [PR] Revert use file schema in parquet pruning [datafusion]

2025-06-10 Thread via GitHub
xudong963 commented on PR #16086: URL: https://github.com/apache/datafusion/pull/16086#issuecomment-2961055915 > Great to hear! Sorry for the inconvenience... No problem!

Re: [PR] Revert use file schema in parquet pruning [datafusion]

2025-06-10 Thread via GitHub
adriangb commented on PR #16086: URL: https://github.com/apache/datafusion/pull/16086#issuecomment-2961049931 Great to hear! Sorry for the inconvenience...

Re: [PR] Revert use file schema in parquet pruning [datafusion]

2025-06-10 Thread via GitHub
xudong963 commented on PR #16086: URL: https://github.com/apache/datafusion/pull/16086#issuecomment-2961047647 Fyi, we've upgraded to DF47 with the fix successfully.

Re: [PR] Document `copy_array_data` function with example [datafusion]

2025-06-10 Thread via GitHub
2010YOUY01 commented on code in PR #16361: URL: https://github.com/apache/datafusion/pull/16361#discussion_r2139062181 ## datafusion/common/src/scalar/mod.rs: ## @@ -3527,6 +3527,33 @@ impl ScalarValue { } } +/// Compacts the data of an `ArrayData` into a new `ArrayData`

Re: [PR] docs: Expand `MemoryPool` docs with related structs [datafusion]

2025-06-10 Thread via GitHub
2010YOUY01 merged PR #16289: URL: https://github.com/apache/datafusion/pull/16289

Re: [PR] Update parser recursion limit from 50 to 100 [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15622: URL: https://github.com/apache/datafusion/pull/15622#issuecomment-2961018161 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] [PoC] Add API for tracking distinct buffers in `MemoryPool` by reference count [datafusion]

2025-06-10 Thread via GitHub
2010YOUY01 commented on code in PR #16359: URL: https://github.com/apache/datafusion/pull/16359#discussion_r2139040880 ## datafusion/execution/src/memory_pool/mod.rs: ## @@ -131,14 +133,58 @@ pub trait MemoryPool: Send + Sync + std::fmt::Debug { /// This must always succeed

Re: [PR] WIP: Aggregate UDF FFI [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15510: URL: https://github.com/apache/datafusion/pull/15510#issuecomment-2961018258 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] chore: move `optimize_subquery_sort` into optimizer [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15441: URL: https://github.com/apache/datafusion/pull/15441#issuecomment-2961018385 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] feat(sql): add diagnostic for wrong number of function arguments [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15490: URL: https://github.com/apache/datafusion/pull/15490#issuecomment-2961018338 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] WIP: Test enabling Parquet filter pushdown with parquet caching page cache reader [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15506: URL: https://github.com/apache/datafusion/pull/15506#issuecomment-2961018300 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] fix: union all by name [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15603: URL: https://github.com/apache/datafusion/pull/15603#issuecomment-2961018203 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] docs: add conventional commit guide and PR title examples [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15638: URL: https://github.com/apache/datafusion/pull/15638#issuecomment-2961018128 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

[PR] Simplify predicates in filter [datafusion]

2025-06-10 Thread via GitHub
xudong963 opened a new pull request, #16362: URL: https://github.com/apache/datafusion/pull/16362 ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes teste

Re: [PR] Document Table Constraint Enforcement Behavior in Custom Table Providers Guide [datafusion]

2025-06-10 Thread via GitHub
alamb commented on code in PR #16340: URL: https://github.com/apache/datafusion/pull/16340#discussion_r2138991917 ## docs/source/library-user-guide/table-constraints.md: ## @@ -0,0 +1,46 @@ + + +# Table Constraint Enforcement + +Table providers can describe table constraints usi

Re: [I] Request to update crates.io ownership [datafusion]

2025-06-10 Thread via GitHub
xudong963 commented on issue #16323: URL: https://github.com/apache/datafusion/issues/16323#issuecomment-2960933902 Hi @alamb , it seems there is still something wrong: ``` Uploading datafusion-common v48.0.0 (/Users/xudong/opensource/datafusion/datafusion/common) error: failed t

Re: [PR] fix: create file for empty stream [datafusion]

2025-06-10 Thread via GitHub
alamb commented on code in PR #16342: URL: https://github.com/apache/datafusion/pull/16342#discussion_r2138987979 ## datafusion/core/src/datasource/file_format/csv.rs: ## @@ -795,6 +796,25 @@ mod tests { Ok(()) } +#[tokio::test] +async fn test_csv_write_e

Re: [PR] Fix cp_solver doc formatting [datafusion]

2025-06-10 Thread via GitHub
alamb merged PR #16352: URL: https://github.com/apache/datafusion/pull/16352

Re: [PR] Fix array_agg memory over use [datafusion]

2025-06-10 Thread via GitHub
alamb commented on code in PR #16346: URL: https://github.com/apache/datafusion/pull/16346#discussion_r2138985658 ## datafusion/functions-aggregate/src/array_agg.rs: ## @@ -313,7 +315,11 @@ impl Accumulator for ArrayAggAccumulator { }; if !val.is_empty() { -

[PR] Document `copy_array_data` function with example [datafusion]

2025-06-10 Thread via GitHub
alamb opened a new pull request, #16361: URL: https://github.com/apache/datafusion/pull/16361 ## Which issue does this PR close? - Closes #. ## Rationale for this change - While reviewing https://github.com/apache/datafusion/pull/16346 from @gabotechs I ran into

Re: [PR] chore(deps): bump clap from 4.5.39 to 4.5.40 [datafusion]

2025-06-10 Thread via GitHub
alamb merged PR #16354: URL: https://github.com/apache/datafusion/pull/16354

Re: [PR] chore(deps): bump syn from 2.0.101 to 2.0.102 [datafusion]

2025-06-10 Thread via GitHub
alamb merged PR #16355: URL: https://github.com/apache/datafusion/pull/16355

Re: [PR] Add note in upgrade guide about changes to `Expr::Scalar` in 48.0.0 [datafusion]

2025-06-10 Thread via GitHub
alamb commented on code in PR #16360: URL: https://github.com/apache/datafusion/pull/16360#discussion_r2138947851 ## docs/source/library-user-guide/upgrading.md: ## @@ -200,7 +235,7 @@ working but no one knows due to lack of test coverage). [api deprecation guidelines]: http

Re: [PR] feat: add metadata to literal expressions [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16170: URL: https://github.com/apache/datafusion/pull/16170#issuecomment-2960852408 I also made a PR to add a note to the upgrade guide here: - https://github.com/apache/datafusion/pull/16360

[PR] Add note in upgrade guide about changes to `Expr::Scalar` in 48.0.0 [datafusion]

2025-06-10 Thread via GitHub
alamb opened a new pull request, #16360: URL: https://github.com/apache/datafusion/pull/16360 ## Which issue does this PR close? - Follow on to https://github.com/apache/datafusion/pull/16170 ## Rationale for this change I hit another required change while testing the delta-rs up

Re: [I] [Epic] Pipeline breaking cancellation support and improvement [datafusion]

2025-06-10 Thread via GitHub
alamb commented on issue #16353: URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2960837514 Thank you @zhuqi-lucas -- I also added this as a wishlist item for https://github.com/apache/datafusion/issues/16235

Re: [PR] chore: Replace archived actions-rs/install action [datafusion-sqlparser-rs]

2025-06-10 Thread via GitHub
alamb commented on code in PR #1876: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1876#discussion_r2138936424 ## .github/workflows/rust.yml: ## @@ -85,11 +88,8 @@ jobs: uses: ./.github/actions/setup-builder with: rust-version: ${{ matrix.ru

Re: [PR] chore: Replace archived actions-rs/install action [datafusion-sqlparser-rs]

2025-06-10 Thread via GitHub
alamb merged PR #1876: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1876

Re: [PR] chore: add a utils method to getColumnReader with SQLConf [datafusion-comet]

2025-06-10 Thread via GitHub
huaxingao commented on PR #360: URL: https://github.com/apache/datafusion-comet/pull/360#issuecomment-2960779443 I don't think this is needed any more, so closing it

Re: [PR] chore: add a utils method to getColumnReader with SQLConf [datafusion-comet]

2025-06-10 Thread via GitHub
huaxingao closed pull request #360: chore: add a utils method to getColumnReader with SQLConf URL: https://github.com/apache/datafusion-comet/pull/360

Re: [PR] chore: refactor Substrait consumer's "rename_field" and implement the rest of types [datafusion]

2025-06-10 Thread via GitHub
westonpace commented on code in PR #16345: URL: https://github.com/apache/datafusion/pull/16345#discussion_r2138834078 ## datafusion/substrait/src/logical_plan/consumer/utils.rs: ## @@ -81,98 +81,167 @@ pub(super) fn next_struct_field_name( } } -pub(super) fn rename_fiel

Re: [I] Add fuzz testing to CI [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on issue #1373: URL: https://github.com/apache/datafusion-comet/issues/1373#issuecomment-2960683018 We do now have a `CometFuzzSuite`

Re: [I] Add fuzz testing to CI [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed issue #1373: Add fuzz testing to CI URL: https://github.com/apache/datafusion-comet/issues/1373

Re: [PR] feat: support RangePartitioning with native shuffle [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on code in PR #1862: URL: https://github.com/apache/datafusion-comet/pull/1862#discussion_r2138821947 ## native/core/src/execution/shuffle/range_partitioner.rs: ## @@ -0,0 +1,432 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more co

Re: [PR] feat: support RangePartitioning with native shuffle [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on code in PR #1862: URL: https://github.com/apache/datafusion-comet/pull/1862#discussion_r2138822721 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -2904,6 +2903,8 @@ object QueryPlanSerde extends Logging with CometExprShim {

Re: [PR] feat: support RangePartitioning with native shuffle [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on code in PR #1862: URL: https://github.com/apache/datafusion-comet/pull/1862#discussion_r2138821450 ## native/core/src/execution/shuffle/range_partitioner.rs: ## @@ -0,0 +1,432 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more co

Re: [PR] feat: ANSI support for Add [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #616: feat: ANSI support for Add URL: https://github.com/apache/datafusion-comet/pull/616

Re: [PR] chore: Stop Running Spark SQL tests for Spark 3.5.4 and 3.5.5 [datafusion-comet]

2025-06-10 Thread via GitHub
codecov-commenter commented on PR #1870: URL: https://github.com/apache/datafusion-comet/pull/1870#issuecomment-2960634456 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1870?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] feat: Add aggregate expression fuzz testing in CI [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #1374: feat: Add aggregate expression fuzz testing in CI URL: https://github.com/apache/datafusion-comet/pull/1374

Re: [PR] Add ANSI support for Add, Subtract & Multiply [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #1135: URL: https://github.com/apache/datafusion-comet/pull/1135#issuecomment-2960630950 I am closing this PR since it is no longer active. Feel free to re-open if needed.

Re: [PR] Add ANSI support for Add, Subtract & Multiply [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #1135: Add ANSI support for Add, Subtract & Multiply URL: https://github.com/apache/datafusion-comet/pull/1135

Re: [PR] feat: add expression array_size [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #1122: URL: https://github.com/apache/datafusion-comet/pull/1122#issuecomment-2960629920 I am closing this PR since it is no longer active. Feel free to re-open if needed.

Re: [PR] feat: Implement ANSI support for Round [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #989: feat: Implement ANSI support for Round URL: https://github.com/apache/datafusion-comet/pull/989

Re: [PR] feat: add expression array_size [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #1122: feat: add expression array_size URL: https://github.com/apache/datafusion-comet/pull/1122

Re: [PR] feat: Implement ANSI support for Round [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #989: URL: https://github.com/apache/datafusion-comet/pull/989#issuecomment-2960629321 I am closing this PR since it is no longer active. Feel free to re-open if needed.

Re: [PR] chore: Override node name for CometSparkToColumnar [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #958: URL: https://github.com/apache/datafusion-comet/pull/958#issuecomment-2960628775 I am closing this PR since it is no longer active. Feel free to re-open if needed. -- This is an automated message from the Apache Git Service. To respond to the message, pleas

Re: [PR] chore: Override node name for CometSparkToColumnar [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #958: chore: Override node name for CometSparkToColumnar URL: https://github.com/apache/datafusion-comet/pull/958 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [PR] build: Upgrade Spark 4.0 to preview2 [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #955: build: Upgrade Spark 4.0 to preview2 URL: https://github.com/apache/datafusion-comet/pull/955 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] build: Upgrade Spark 4.0 to preview2 [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #955: URL: https://github.com/apache/datafusion-comet/pull/955#issuecomment-2960626602 I am closing this PR since it has not been active lately. Feel free to re-open if needed. -- This is an automated message from the Apache Git Service. To respond to the message

Re: [PR] build(deps): bump com.google.protobuf:protobuf-java from 3.19.6 to 3.25.5 [datafusion-comet]

2025-06-10 Thread via GitHub
dependabot[bot] commented on PR #954: URL: https://github.com/apache/datafusion-comet/pull/954#issuecomment-2960626032 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor versi

Re: [PR] build(deps): bump com.google.protobuf:protobuf-java from 3.19.6 to 3.25.5 [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #954: build(deps): bump com.google.protobuf:protobuf-java from 3.19.6 to 3.25.5 URL: https://github.com/apache/datafusion-comet/pull/954 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

Re: [PR] feat: ANSI support for Add [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #616: URL: https://github.com/apache/datafusion-comet/pull/616#issuecomment-2960624710 I am closing this PR since it has not been active lately. Feel free to re-open if needed. -- This is an automated message from the Apache Git Service. To respond to the message

Re: [PR] chore: add a utils method to getColumnReader with SQLConf [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #360: URL: https://github.com/apache/datafusion-comet/pull/360#issuecomment-2960623336 @huaxingao Is this PR still needed? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

Re: [PR] feat: Only create one native plan for a query on an executor [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #1203: URL: https://github.com/apache/datafusion-comet/pull/1203#issuecomment-2960622532 I am closing this PR since it has not been active lately. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] feat: Only create one native plan for a query on an executor [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #1203: feat: Only create one native plan for a query on an executor URL: https://github.com/apache/datafusion-comet/pull/1203 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] [PoC] Add API for tracking distinct arrays in `MemoryPool` by reference count [datafusion]

2025-06-10 Thread via GitHub
Dandandan commented on PR #16359: URL: https://github.com/apache/datafusion/pull/16359#issuecomment-2960579460 Hm - `.slice()` of course creates another `Arc`, so we actually have to use the buffers `Arc` rather than arrays. -- This is an automated message from the Apache Git Service. To
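
Note on the `.slice()` remark above: a minimal sketch using only the Rust standard library, not the Arrow or DataFusion API. `View`, `new_view`, `same_array` and `same_buffer` are hypothetical names introduced for illustration, and the sketch assumes that a slice wraps the same shared buffer in a freshly allocated `Arc`. Under that assumption, pointer identity on the outer `Arc` cannot recognise a slice as sharing memory with its parent, while pointer identity on the buffer `Arc` can.

```rust
use std::sync::Arc;

// Hypothetical stand-in for an array: a shared buffer plus an offset/length window.
struct View {
    buffer: Arc<Vec<i32>>,
    offset: usize,
    len: usize,
}

impl View {
    // Slicing allocates a new outer Arc but reuses the same underlying buffer Arc.
    fn slice(&self, offset: usize, len: usize) -> Arc<View> {
        assert!(offset + len <= self.len, "slice out of bounds");
        Arc::new(View {
            buffer: Arc::clone(&self.buffer),
            offset: self.offset + offset,
            len,
        })
    }

    // The values visible through this view.
    fn values(&self) -> &[i32] {
        &self.buffer[self.offset..self.offset + self.len]
    }
}

// Build a view covering the whole buffer.
fn new_view(data: Vec<i32>) -> Arc<View> {
    let len = data.len();
    Arc::new(View { buffer: Arc::new(data), offset: 0, len })
}

// Identity of the outer Arc: differs between a view and any slice of it.
fn same_array(a: &Arc<View>, b: &Arc<View>) -> bool {
    Arc::ptr_eq(a, b)
}

// Identity of the buffer Arc: preserved across slices.
fn same_buffer(a: &Arc<View>, b: &Arc<View>) -> bool {
    Arc::ptr_eq(&a.buffer, &b.buffer)
}
```

With `let a = new_view(vec![1, 2, 3, 4]); let b = a.slice(1, 2);`, `same_array(&a, &b)` is false while `same_buffer(&a, &b)` is true, so a tracker keyed on the outer pointer would count the shared memory twice.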

[PR] chore: Stop Running Spark SQL tests for Spark 3.5.4 and 3.5.5 [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove opened a new pull request, #1870: URL: https://github.com/apache/datafusion-comet/pull/1870 ## Which issue does this PR close? N/A ## Rationale for this change Reduce developer overhead of keeping multiple diff files up-to-date for Spark 3.5 and

Re: [D] Search Pushdown (e.g. Vector Search) Into Table Providers [datafusion]

2025-06-10 Thread via GitHub
GitHub user backkem added a comment to the discussion: Search Pushdown (e.g. Vector Search) Into Table Providers FWIW there is an existing implementation of the "full logical plan pushdown" in [datafusion-contrib/datafusion-federation](https://github.com/datafusion-contrib/datafusion-federatio

Re: [PR] chore: Skip some Spark SQL test runs on PRs [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #1868: chore: Skip some Spark SQL test runs on PRs URL: https://github.com/apache/datafusion-comet/pull/1868 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[PR] Add API for tracking distinct arrays [datafusion]

2025-06-10 Thread via GitHub
Dandandan opened a new pull request, #16359: URL: https://github.com/apache/datafusion/pull/16359 ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes teste

Re: [PR] chore: Enable more Spark SQL tests [datafusion-comet]

2025-06-10 Thread via GitHub
codecov-commenter commented on PR #1869: URL: https://github.com/apache/datafusion-comet/pull/1869#issuecomment-2960381202 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1869?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [I] [EPIC] Improve shuffle performance [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on issue #1123: URL: https://github.com/apache/datafusion-comet/issues/1123#issuecomment-2960359323 I'm going to go ahead and close this now that almost all of the issues linked to this epic have been implemented. We have definitely seen a significant improvement in shu
