[PR] Minor: fix extend sqllogical consistent with main test [datafusion]

2025-03-10 Thread via GitHub
zhuqi-lucas opened a new pull request, #15145: URL: https://github.com/apache/datafusion/pull/15145 ## Which issue does this PR close? fix extend sqllogical consistent with main test ## Rationale for this change Similar to: https://github.com/apache/datafusion/issu

Re: [PR] Implement tree rendering for `SortPreservingMergeExec` [datafusion]

2025-03-10 Thread via GitHub
2010YOUY01 commented on PR #15140: URL: https://github.com/apache/datafusion/pull/15140#issuecomment-2712575632 Thank you for making this happen. I have a suggestion: I think the only field needed inside `SPM` is the sort keys, how about making it consistent with those in `SortExec`?

Re: [PR] add support for `with` clauses (CTEs) in `delete` statements [datafusion-sqlparser-rs]

2025-03-10 Thread via GitHub
lovasoa commented on code in PR #1764: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1764#discussion_r1988500551 ## src/parser/mod.rs: ## @@ -10202,19 +10209,25 @@ impl<'a> Parser<'a> { } } +/// Parse a `WITH` clause, i.e. a `WITH` keyword foll

Re: [I] Change mapping of SQL `VARCHAR` from `Utf8` to `Utf8View` [datafusion]

2025-03-10 Thread via GitHub
zhuqi-lucas commented on issue #15096: URL: https://github.com/apache/datafusion/issues/15096#issuecomment-2712795395 Create the ticket for avro: - [ ] Support Utf8View for avro [#7262](https://github.com/apache/arrow-rs/issues/7262) -- This is an automated message from the

Re: [PR] shell script to collect Benchmarks [datafusion]

2025-03-10 Thread via GitHub
logan-keede commented on PR #15144: URL: https://github.com/apache/datafusion/pull/15144#issuecomment-2712783895 One problem that I sometimes encounter is that cargo decides to use `arrow-arith v53.4.0` for particular releases which ends up giving compilation error. I’m not sure why this

Re: [PR] Renaming Internal Structs [datafusion-python]

2025-03-10 Thread via GitHub
Spaarsh commented on code in PR #1059: URL: https://github.com/apache/datafusion-python/pull/1059#discussion_r1988468214 ## src/expr.rs: ## @@ -100,22 +100,37 @@ pub mod window; use sort_expr::{to_sort_expressions, PySortExpr}; +// Define the new RawExpr struct and impleme

[PR] shell script to collect Benchmarks [datafusion]

2025-03-10 Thread via GitHub
logan-keede opened a new pull request, #15144: URL: https://github.com/apache/datafusion/pull/15144 ## Which issue does this PR close? - Part of #5504 ## Rationale for this change > Here is a suggestion on how to proceed with this project: > 1. Create the converte

Re: [PR] add support for `with` clauses (CTEs) in `delete` statements [datafusion-sqlparser-rs]

2025-03-10 Thread via GitHub
iffyio commented on code in PR #1764: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1764#discussion_r1988453837 ## src/parser/mod.rs: ## @@ -10202,19 +10209,25 @@ impl<'a> Parser<'a> { } } +/// Parse a `WITH` clause, i.e. a `WITH` keyword follo

Re: [I] `ScalarValue::to_array` panics when getting statistics for List column [datafusion]

2025-03-10 Thread via GitHub
trueleo closed issue #5706: `ScalarValue::to_array` panics when getting statistics for List column URL: https://github.com/apache/datafusion/issues/5706 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] Add all missing table options to be handled in any order [datafusion-sqlparser-rs]

2025-03-10 Thread via GitHub
iffyio commented on code in PR #1747: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1747#discussion_r1988442245 ## src/ast/dml.rs: ## @@ -138,6 +143,30 @@ pub struct CreateTable { pub engine: Option, pub comment: Option, pub auto_increment_offset: O

[I] Invalid schema for unions in ViewTable [datafusion]

2025-03-10 Thread via GitHub
Friede80 opened a new issue, #15134: URL: https://github.com/apache/datafusion/issues/15134 ### Describe the bug When a ViewTable is created, the plan is run through the `Analyzer` with the `ExpandWildcardRule` and `TypeCoercion` rules. When this ViewTable is later inlined, it is run

Re: [PR] perf: unwrap cast for comparing ints =/!= strings [datafusion]

2025-03-10 Thread via GitHub
alan910127 commented on code in PR #15110: URL: https://github.com/apache/datafusion/pull/15110#discussion_r1988197176 ## datafusion/optimizer/src/simplify_expressions/unwrap_cast.rs: ## @@ -177,6 +192,45 @@ pub(super) fn is_cast_expr_and_support_unwrap_cast_in_comparison_for_i

Re: [PR] Minor: Fix invalid query in test [datafusion]

2025-03-10 Thread via GitHub
jayzhan211 merged PR #15131: URL: https://github.com/apache/datafusion/pull/15131 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@dat

[PR] fixed PushDownFilter bug [15047] [datafusion]

2025-03-10 Thread via GitHub
Jiashu-Hu opened a new pull request, #15142: URL: https://github.com/apache/datafusion/pull/15142 …revent this specific situation ## Which issue does this PR close? - Closes #[15047](https://github.com/apache/datafusion/issues/15047). ## Rationale for this change

Re: [I] beautify default column names [datafusion]

2025-03-10 Thread via GitHub
NevroHelios commented on issue #2027: URL: https://github.com/apache/datafusion/issues/2027#issuecomment-2712675671 Since it is still open can I work on it and submit a pr? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

Re: [I] Rewrite `datafusion-sqlancer` in Rust [datafusion]

2025-03-10 Thread via GitHub
2010YOUY01 commented on issue #14535: URL: https://github.com/apache/datafusion/issues/14535#issuecomment-2712646798 > Hello, I am interested in applying to work on this project for GSoC. After reading through [#11030](https://github.com/apache/datafusion/issues/11030) , it looks like the t

Re: [PR] Implement tree explain for `RepartitionExec` and `WorkTableExec` [datafusion]

2025-03-10 Thread via GitHub
2010YOUY01 commented on code in PR #15137: URL: https://github.com/apache/datafusion/pull/15137#discussion_r1988371514 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -179,19 +185,31 @@ physical_plan 06)└─┬─┘ 07)┌─┴

[PR] feat: Attach `Diagnostic` to more than one column errors in scalar_subquery and in_subquery [datafusion]

2025-03-10 Thread via GitHub
changsun20 opened a new pull request, #15143: URL: https://github.com/apache/datafusion/pull/15143 ## Which issue does this PR close? - Closes #14438. ## Rationale for this change This pull request enhances diagnostic information by attaching the `Diagnos

Re: [PR] feat/improve ruff test coverage [datafusion-python]

2025-03-10 Thread via GitHub
CrystalZhou0529 commented on code in PR #1055: URL: https://github.com/apache/datafusion-python/pull/1055#discussion_r1988370240 ## python/datafusion/udf.py: ## @@ -111,7 +111,27 @@ def __call__(self, *args: Expr) -> Expr: args_raw = [arg.expr for arg in args]

Re: [PR] Implement tree explain for AggregateExec [datafusion]

2025-03-10 Thread via GitHub
Weijun-H commented on code in PR #15103: URL: https://github.com/apache/datafusion/pull/15103#discussion_r1988354902 ## datafusion/physical-plan/src/aggregates/mod.rs: ## @@ -809,8 +809,60 @@ impl DisplayAs for AggregateExec { } } Displ

Re: [PR] Renaming Internal Structs [datafusion-python]

2025-03-10 Thread via GitHub
Spaarsh commented on code in PR #1059: URL: https://github.com/apache/datafusion-python/pull/1059#discussion_r1988360570 ## src/expr.rs: ## @@ -100,22 +100,37 @@ pub mod window; use sort_expr::{to_sort_expressions, PySortExpr}; +// Define the new RawExpr struct and impleme

Re: [I] Make it easier to run TPCH queries with datafusion-cli [datafusion]

2025-03-10 Thread via GitHub
matthewmturner commented on issue #14608: URL: https://github.com/apache/datafusion/issues/14608#issuecomment-2712539973 @clflushopt this is _awesome_. Once you release I will likely add this to [dft](https://github.com/datafusion-contrib/datafusion-dft). -- This is an automated message

Re: [I] Change in behavior for deep structure columns with the latest sql parser upgrade [datafusion]

2025-03-10 Thread via GitHub
chenkovsky commented on issue #15118: URL: https://github.com/apache/datafusion/issues/15118#issuecomment-2710431152 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[I] Implement tree explain for `CoalesceBatchesExec` [datafusion]

2025-03-10 Thread via GitHub
Standing-Man opened a new issue, #15141: URL: https://github.com/apache/datafusion/issues/15141 ### Is your feature request related to a problem or challenge? Part of #14914 ### Describe the solution you'd like _No response_ ### Describe alternatives you've conside

Re: [I] Implement tree explain for `CoalesceBatchesExec` [datafusion]

2025-03-10 Thread via GitHub
Standing-Man commented on issue #15141: URL: https://github.com/apache/datafusion/issues/15141#issuecomment-2712465888 @irenjj -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

Re: [I] Make it easier to run TPCH queries with datafusion-cli [datafusion]

2025-03-10 Thread via GitHub
clflushopt commented on issue #14608: URL: https://github.com/apache/datafusion/issues/14608#issuecomment-2712464986 For anyone following this issue I have a full port here https://github.com/clflushopt/tpchgen-rs and I am working on completing a first release (I have issues to track that m

Re: [I] Implement tree explain for `CoalesceBatchesExec` [datafusion]

2025-03-10 Thread via GitHub
Standing-Man commented on issue #15141: URL: https://github.com/apache/datafusion/issues/15141#issuecomment-2712465988 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[PR] Implement tree rendering for `SortPreservingMergeExec` [datafusion]

2025-03-10 Thread via GitHub
Standing-Man opened a new pull request, #15140: URL: https://github.com/apache/datafusion/pull/15140 ## Which issue does this PR close? - Closes #15139 and Part of #14914. ## Rationale for this change ## What changes are included in this PR? Imp

Re: [I] Implement tree explain for `SortPreservingMergeExec` [datafusion]

2025-03-10 Thread via GitHub
Standing-Man commented on issue #15139: URL: https://github.com/apache/datafusion/issues/15139#issuecomment-2712433934 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[PR] WIP: test parquet modular encryption support [datafusion]

2025-03-10 Thread via GitHub
alamb opened a new pull request, #15133: URL: https://github.com/apache/datafusion/pull/15133 ## Which issue does this PR close? - related to https://github.com/apache/arrow-rs/pull/6637 ## Rationale for this change I am using this PR to help verify that the chang

Re: [PR] support run mutiple queries in TPC-H benchmark [datafusion-ray]

2025-03-10 Thread via GitHub
zhangx commented on PR #82: URL: https://github.com/apache/datafusion-ray/pull/82#issuecomment-2712446883 > @zhangx thank you for submitting this! > > I submitted one at the same time with a similar fix, combined with a few other small changes that came up during benchmarking.

[PR] Implement tree explain for `RepartitionExec` and `WorkTableExec` [datafusion]

2025-03-10 Thread via GitHub
Standing-Man opened a new pull request, #15137: URL: https://github.com/apache/datafusion/pull/15137 ## Which issue does this PR close? - Closes #15097 and part of #14914. ## Rationale for this change ## What changes are included in this PR?

[I] Implement tree explain for `SortPreservingMergeExec` [datafusion]

2025-03-10 Thread via GitHub
Standing-Man opened a new issue, #15139: URL: https://github.com/apache/datafusion/issues/15139 ### Is your feature request related to a problem or challenge? Part of #14914 ### Describe the solution you'd like _No response_ ### Describe alternatives you've conside

Re: [I] Implement tree explain for `PlaceholderRowExec` [datafusion]

2025-03-10 Thread via GitHub
pranavJibhakate commented on issue #15138: URL: https://github.com/apache/datafusion/issues/15138#issuecomment-2712342026 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

Re: [PR] chore: add an "expr_planners" method to SessionState [datafusion]

2025-03-10 Thread via GitHub
niebayes commented on PR #15119: URL: https://github.com/apache/datafusion/pull/15119#issuecomment-2712347513 @alamb I wonder if we can remove the `register_expr_planners` and `expr_planners` from the `FunctionRegistry` trait. I have checked the codebase and they're only used by a test. And

Re: [PR] Add tests for simplification and coercion of `SessionContext::create_physical_expr` [datafusion]

2025-03-10 Thread via GitHub
jayzhan211 merged PR #15034: URL: https://github.com/apache/datafusion/pull/15034 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@dat

Re: [PR] preserve sql formatting through a parse + display roundtrip (partial implementation) [datafusion-sqlparser-rs]

2025-03-10 Thread via GitHub
github-actions[bot] closed pull request #1636: preserve sql formatting through a parse + display roundtrip (partial implementation) URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1636 -- This is an automated message from the Apache Git Service. To respond to the message, please l

Re: [PR] Find a way to communicate the ordering of a file back with the existi… [datafusion]

2025-03-10 Thread via GitHub
github-actions[bot] closed pull request #13933: Find a way to communicate the ordering of a file back with the existi… URL: https://github.com/apache/datafusion/pull/13933 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

Re: [PR] feat: add `register_metadata` function for `GroupsAccumulator` [datafusion]

2025-03-10 Thread via GitHub
jayzhan211 commented on code in PR #15022: URL: https://github.com/apache/datafusion/pull/15022#discussion_r1988229738 ## datafusion/expr-common/src/groups_accumulator.rs: ## @@ -251,3 +261,18 @@ pub trait GroupsAccumulator: Send { /// compute, not `O(num_groups)` fn s

Re: [I] Building project takes a *long* time (esp compilation time for `datafusion` core crate) [datafusion]

2025-03-10 Thread via GitHub
tustvold commented on issue #13814: URL: https://github.com/apache/datafusion/issues/13814#issuecomment-2711684269 IIRC that relates to type checking expressions, and therefore this would suggest the compiler is spending a lot of time resolving generics. At least historically non-boxed asyn

Re: [I] Implement tree explain for `PlaceholderRowExec` [datafusion]

2025-03-10 Thread via GitHub
Standing-Man commented on issue #15138: URL: https://github.com/apache/datafusion/issues/15138#issuecomment-2712286093 @irenjj -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

Re: [PR] Expand wildcard to actual expressions in `prepare_select_exprs` [datafusion]

2025-03-10 Thread via GitHub
alamb commented on code in PR #15090: URL: https://github.com/apache/datafusion/pull/15090#discussion_r1987814170 ## datafusion/sqllogictest/test_files/order.slt: ## @@ -985,13 +985,20 @@ drop table ambiguity_test; statement ok create table t(a0 int, a int, b int, c int) as va

Re: [PR] fix: unparse for subqueryalias [datafusion]

2025-03-10 Thread via GitHub
goldmedal commented on code in PR #15068: URL: https://github.com/apache/datafusion/pull/15068#discussion_r1987731975 ## datafusion/core/tests/sql/select.rs: ## @@ -350,3 +351,48 @@ async fn test_version_function() { assert_eq!(version.value(0), expected_version); } + +#

Re: [I] Building project takes a *long* time (esp compilation time for `datafusion` core crate) [datafusion]

2025-03-10 Thread via GitHub
comphead commented on issue #13814: URL: https://github.com/apache/datafusion/issues/13814#issuecomment-2711668760 might be compiler related? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] Int64 as default type for make_array function empty or null case [datafusion]

2025-03-10 Thread via GitHub
jayzhan211 commented on code in PR #10790: URL: https://github.com/apache/datafusion/pull/10790#discussion_r1988207529 ## datafusion/functions-array/src/make_array.rs: ## @@ -131,6 +131,11 @@ impl ScalarUDFImpl for MakeArray { } } +// Empty array is a special case that i

Re: [PR] fix: unparse for subqueryalias [datafusion]

2025-03-10 Thread via GitHub
alamb commented on code in PR #15068: URL: https://github.com/apache/datafusion/pull/15068#discussion_r1987794389 ## datafusion/core/tests/sql/select.rs: ## @@ -350,3 +351,48 @@ async fn test_version_function() { assert_eq!(version.value(0), expected_version); } + +#[tok

Re: [PR] perf: unwrap cast for comparing ints =/!= strings [datafusion]

2025-03-10 Thread via GitHub
alan910127 commented on code in PR #15110: URL: https://github.com/apache/datafusion/pull/15110#discussion_r1988192389 ## datafusion/optimizer/src/simplify_expressions/unwrap_cast.rs: ## @@ -177,6 +192,45 @@ pub(super) fn is_cast_expr_and_support_unwrap_cast_in_comparison_for_i

Re: [PR] Minor: Fix invalid query in test [datafusion]

2025-03-10 Thread via GitHub
jayzhan211 commented on PR #15131: URL: https://github.com/apache/datafusion/pull/15131#issuecomment-2712172487 Thanks @alamb -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

Re: [PR] Add tests for simplification and coercion of `SessionContext::create_physical_expr` [datafusion]

2025-03-10 Thread via GitHub
jayzhan211 commented on PR #15034: URL: https://github.com/apache/datafusion/pull/15034#issuecomment-2712168720 Thanks @alamb -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

Re: [I] Change naming of rust exposed structs to ease debugging [datafusion-python]

2025-03-10 Thread via GitHub
kylebarron commented on issue #853: URL: https://github.com/apache/datafusion-python/issues/853#issuecomment-2711787803 fwiw I always define my classes exported to Python with a Py prefix on the rust side, and then rename the actual export from within the pyclass macro -- This is an auto

Re: [I] `flatten` should be single-step, not recursive [datafusion]

2025-03-10 Thread via GitHub
delamarch3 commented on issue #13757: URL: https://github.com/apache/datafusion/issues/13757#issuecomment-2711632124 Hi @logan-keede, is it ok if I pick this up? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] feat: Add `datafusion-spark` crate [datafusion]

2025-03-10 Thread via GitHub
andygrove closed pull request #14392: feat: Add `datafusion-spark` crate URL: https://github.com/apache/datafusion/pull/14392 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] Fix broken `serde` feature [datafusion]

2025-03-10 Thread via GitHub
vadimpiven commented on code in PR #15124: URL: https://github.com/apache/datafusion/pull/15124#discussion_r1988170629 ## datafusion/core/Cargo.toml: ## @@ -79,7 +79,7 @@ recursive_protection = [ "datafusion-physical-optimizer/recursive_protection", "datafusion-sql/rec

Re: [PR] Fix broken `serde` feature [datafusion]

2025-03-10 Thread via GitHub
vadimpiven commented on PR #15124: URL: https://github.com/apache/datafusion/pull/15124#issuecomment-2712113391 @Weijun-H added test, please check that it is in the correct place. Also I was not sure if I should make a separate pipeline invocation, please let me know if I should -- This

Re: [PR] Per file filter evaluation [datafusion]

2025-03-10 Thread via GitHub
adriangb commented on code in PR #15057: URL: https://github.com/apache/datafusion/pull/15057#discussion_r1988140038 ## datafusion/datasource-parquet/src/opener.rs: ## @@ -111,18 +109,18 @@ impl FileOpener for ParquetOpener { .schema_adapter_factory .cr

[PR] chore: remove ScalarUDFImpl::return_type_from_exprs [datafusion]

2025-03-10 Thread via GitHub
Blizzara opened a new pull request, #15130: URL: https://github.com/apache/datafusion/pull/15130 use `return_type_from_args` instead ## Which issue does this PR close? - Closes #14729 ## Rationale for this change Implementing `return_type_from_exprs` is almost

Re: [PR] fix: enable full decimal to decimal support [datafusion-comet]

2025-03-10 Thread via GitHub
andygrove commented on code in PR #1385: URL: https://github.com/apache/datafusion-comet/pull/1385#discussion_r1987807688 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -1210,27 +1213,36 @@ class CometCastSuite extends CometTestBase with AdaptiveSparkPlanH

Re: [PR] Per file filter evaluation [datafusion]

2025-03-10 Thread via GitHub
adriangb commented on PR #15057: URL: https://github.com/apache/datafusion/pull/15057#issuecomment-2712042344 The example is now working and even does stats pruning of shredded columns 🚀 -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] Update python min version to 3.9 [datafusion-python]

2025-03-10 Thread via GitHub
kevinjqliu commented on PR #1043: URL: https://github.com/apache/datafusion-python/pull/1043#issuecomment-2711411064 @timsaucer nope this LGTM. I double check the changes in `examples/ffi-table-provider/Cargo.lock` -- This is an automated message from the Apache Git Service. To respond t

Re: [PR] Blog: Using Ordering for Better Plans in Apache DataFusion [datafusion-site]

2025-03-10 Thread via GitHub
ozankabak commented on code in PR #58: URL: https://github.com/apache/datafusion-site/pull/58#discussion_r1987861882 ## content/blog/2025-03-05-ordering-analysis.md: ## @@ -0,0 +1,353 @@ +--- +layout: post +title: Analysis of Ordering for Better Plans +date: 2025-03-05 +author:

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-03-10 Thread via GitHub
logan-keede commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2711982900 > > > Hey [@logan-keede](https://github.com/logan-keede) I would think this ticket is a good fit for GSoC [#14510](https://github.com/apache/datafusion/issues/14510) > >

Re: [PR] fix: Adjust CometTestShuffleMemoryAllocator instantiation [datafusion-comet]

2025-03-10 Thread via GitHub
viirya commented on code in PR #1485: URL: https://github.com/apache/datafusion-comet/pull/1485#discussion_r1988101908 ## spark/src/main/java/org/apache/spark/shuffle/comet/CometShuffleMemoryAllocator.java: ## @@ -48,30 +48,30 @@ public final class CometShuffleMemoryAllocator ex

[PR] Minor: Fix invalid query in test [datafusion]

2025-03-10 Thread via GitHub
alamb opened a new pull request, #15131: URL: https://github.com/apache/datafusion/pull/15131 ## Which issue does this PR close? - Related to https://github.com/apache/datafusion/pull/15090 ## Rationale for this change In https://github.com/apache/datafusion/pull/15090 @j

Re: [PR] fix: Adjust CometTestShuffleMemoryAllocator instantiation [datafusion-comet]

2025-03-10 Thread via GitHub
andygrove commented on code in PR #1485: URL: https://github.com/apache/datafusion-comet/pull/1485#discussion_r1988084656 ## spark/src/main/java/org/apache/spark/shuffle/comet/CometShuffleMemoryAllocator.java: ## @@ -48,30 +48,30 @@ public final class CometShuffleMemoryAllocator

Re: [I] PR builds failing in Spark SQL tests with org.xerial.snappy import issue [datafusion-comet]

2025-03-10 Thread via GitHub
andygrove commented on issue #1492: URL: https://github.com/apache/datafusion-comet/issues/1492#issuecomment-2711940237 deleting the caches in GitHub actions resolved this -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

Re: [PR] fix: Adjust CometTestShuffleMemoryAllocator instantiation [datafusion-comet]

2025-03-10 Thread via GitHub
andygrove commented on code in PR #1485: URL: https://github.com/apache/datafusion-comet/pull/1485#discussion_r1988082049 ## spark/src/main/java/org/apache/spark/shuffle/comet/CometShuffleMemoryAllocator.java: ## @@ -48,30 +48,30 @@ public final class CometShuffleMemoryAllocator

[PR] add support for with clauses in delete statements [datafusion-sqlparser-rs]

2025-03-10 Thread via GitHub
lovasoa opened a new pull request, #1764: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1764 fixes https://github.com/apache/datafusion-sqlparser-rs/issues/1763 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [datafusion]

2025-03-10 Thread via GitHub
niebayes commented on PR #14689: URL: https://github.com/apache/datafusion/pull/14689#issuecomment-2710184716 Hi, does anyone observe performance regression of aggregation query after this PR was merged? Not merely `count(*)` but almost all aggregation functions. -- This is an automated m

Re: [PR] fix: Adjust CometTestShuffleMemoryAllocator instantiation [datafusion-comet]

2025-03-10 Thread via GitHub
andygrove commented on code in PR #1485: URL: https://github.com/apache/datafusion-comet/pull/1485#discussion_r1987723154 ## spark/src/main/java/org/apache/spark/shuffle/comet/CometShuffleMemoryAllocator.java: ## @@ -48,30 +48,30 @@ public final class CometShuffleMemoryAllocator

Re: [PR] fix: Adjust CometTestShuffleMemoryAllocator instantiation [datafusion-comet]

2025-03-10 Thread via GitHub
andygrove commented on code in PR #1485: URL: https://github.com/apache/datafusion-comet/pull/1485#discussion_r1988060641 ## spark/src/main/java/org/apache/spark/shuffle/comet/CometShuffleMemoryAllocator.java: ## @@ -48,30 +48,30 @@ public final class CometShuffleMemoryAllocator

Re: [I] [EPIC] Complete `SQL EXPLAIN` Tree Rendering [datafusion]

2025-03-10 Thread via GitHub
alamb commented on issue #14914: URL: https://github.com/apache/datafusion/issues/14914#issuecomment-2711612710 > > Can make `tree` a subcommand of `explain`, like `explain tree `. > > It looks like this is something that [#15021](https://github.com/apache/datafusion/issues/15021) is

[I] CTEs (`WITH` clauses) in `UPDATE`, `DELETE` and `INSERT` statements [datafusion-sqlparser-rs]

2025-03-10 Thread via GitHub
lovasoa opened a new issue, #1763: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1763 SQLite, Postgres, and Mssql support WITH clauses attached to data-modifying expressions. ```sql with x(y) as (select 1) UPDATE demo SET Name = 'j'; ``` ```sql with x(

Re: [I] Update python min version to 3.9 [datafusion-python]

2025-03-10 Thread via GitHub
timsaucer commented on issue #1042: URL: https://github.com/apache/datafusion-python/issues/1042#issuecomment-2711462228 Note: previous comment was just to test the github workflow -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

Re: [PR] Minor: Fix invalid query in test [datafusion]

2025-03-10 Thread via GitHub
alamb commented on code in PR #15131: URL: https://github.com/apache/datafusion/pull/15131#discussion_r1987868432 ## datafusion/sqllogictest/test_files/order.slt: ## @@ -986,17 +986,26 @@ statement ok create table t(a0 int, a int, b int, c int) as values (1, 2, 3, 4), (5, 6, 7,

Re: [PR] fix: enable full decimal to decimal support [datafusion-comet]

2025-03-10 Thread via GitHub
himadripal commented on code in PR #1385: URL: https://github.com/apache/datafusion-comet/pull/1385#discussion_r1987983268 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -1210,27 +1213,36 @@ class CometCastSuite extends CometTestBase with AdaptiveSparkPlan

Re: [PR] Renaming Internal Structs [datafusion-python]

2025-03-10 Thread via GitHub
timsaucer commented on code in PR #1059: URL: https://github.com/apache/datafusion-python/pull/1059#discussion_r1987943713 ## src/expr.rs: ## @@ -100,22 +100,37 @@ pub mod window; use sort_expr::{to_sort_expressions, PySortExpr}; +// Define the new RawExpr struct and imple

Re: [I] Remove the need for registering an ObjectStore for remote files [datafusion-python]

2025-03-10 Thread via GitHub
kevinjqliu commented on issue #899: URL: https://github.com/apache/datafusion-python/issues/899#issuecomment-2711464518 I also ran into this issue integrating `iceberg-python` with `datafusion-python` using the FFI Table Provider. The integration goes from `iceberg-python` -> `iceberg-rust

Re: [PR] fix: enable full decimal to decimal support [datafusion-comet]

2025-03-10 Thread via GitHub
himadripal commented on code in PR #1385: URL: https://github.com/apache/datafusion-comet/pull/1385#discussion_r1987886072 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -1210,27 +1213,36 @@ class CometCastSuite extends CometTestBase with AdaptiveSparkPlan

Re: [PR] fix: enable full decimal to decimal support [datafusion-comet]

2025-03-10 Thread via GitHub
kazuyukitanimura commented on code in PR #1385: URL: https://github.com/apache/datafusion-comet/pull/1385#discussion_r1987953695 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -1210,27 +1213,36 @@ class CometCastSuite extends CometTestBase with AdaptiveSpa

Re: [I] Improve Parsing for KV Format in `tree` explain. [datafusion]

2025-03-10 Thread via GitHub
alamb closed issue #15098: Improve Parsing for KV Format in `tree` explain. URL: https://github.com/apache/datafusion/issues/15098 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

Re: [PR] Refactor EnforceDistribution test cases to demonstrate dependencies across optimizer runs. [datafusion]

2025-03-10 Thread via GitHub
alamb commented on PR #15074: URL: https://github.com/apache/datafusion/pull/15074#issuecomment-2710816307 I'll plan to merge this later today unless anyone else would like more time to review -- This is an automated message from the Apache Git Service. To respond to the message, please l

Re: [PR] Enable take comments to assign issues to users [datafusion-python]

2025-03-10 Thread via GitHub
timsaucer merged PR #1058: URL: https://github.com/apache/datafusion-python/pull/1058 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...

Re: [PR] Fix invalid schema for unions in ViewTables [datafusion]

2025-03-10 Thread via GitHub
Friede80 commented on PR #15135: URL: https://github.com/apache/datafusion/pull/15135#issuecomment-271158 I'm not sure if there are still valid uses of `coerce_union_schema` given only the set of logical plans, but if we can't change the api of a public function, it would be easy enough

Re: [I] Update python min version to 3.9 [datafusion-python]

2025-03-10 Thread via GitHub
timsaucer closed issue #1042: Update python min version to 3.9 URL: https://github.com/apache/datafusion-python/issues/1042 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

Re: [PR] perf: unwrap cast for comparing ints =/!= strings [datafusion]

2025-03-10 Thread via GitHub
alan910127 commented on code in PR #15110: URL: https://github.com/apache/datafusion/pull/15110#discussion_r1987355491 ## datafusion/optimizer/src/simplify_expressions/unwrap_cast.rs: ## @@ -177,6 +192,33 @@ pub(super) fn is_cast_expr_and_support_unwrap_cast_in_comparison_for_i

Re: [PR] perf: unwrap cast for comparing ints =/!= strings [datafusion]

2025-03-10 Thread via GitHub
alan910127 commented on code in PR #15110: URL: https://github.com/apache/datafusion/pull/15110#discussion_r1987353874 ## datafusion/optimizer/src/simplify_expressions/unwrap_cast.rs: ## @@ -177,6 +192,33 @@ pub(super) fn is_cast_expr_and_support_unwrap_cast_in_comparison_for_i

Re: [PR] chore: Stop disabling readside padding in TPC stability suite [datafusion-comet]

2025-03-10 Thread via GitHub
codecov-commenter commented on PR #1491: URL: https://github.com/apache/datafusion-comet/pull/1491#issuecomment-2710718614 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1491?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] perf: unwrap cast for comparing ints =/!= strings [datafusion]

2025-03-10 Thread via GitHub
alan910127 commented on code in PR #15110: URL: https://github.com/apache/datafusion/pull/15110#discussion_r1987350998 ## datafusion/optimizer/src/simplify_expressions/unwrap_cast.rs: ## @@ -177,6 +192,33 @@ pub(super) fn is_cast_expr_and_support_unwrap_cast_in_comparison_for_i

Re: [I] Serde feature is broken [datafusion]

2025-03-10 Thread via GitHub
vadimpiven closed issue #15122: Serde feature is broken URL: https://github.com/apache/datafusion/issues/15122 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e

Re: [I] Serde feature is broken [datafusion]

2025-03-10 Thread via GitHub
vadimpiven commented on issue #15122: URL: https://github.com/apache/datafusion/issues/15122#issuecomment-2710635830 Please reopen, closed by mistake by merging PR in fork. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

Re: [I] TPC-H benchmark does not run q15 [datafusion-ray]

2025-03-10 Thread via GitHub
zhangx commented on issue #81: URL: https://github.com/apache/datafusion-ray/issues/81#issuecomment-2710625487 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

Re: [PR] chore: remove deprecated variants of UDF's invoke (invoke, invoke_no_args, invoke_batch) [datafusion]

2025-03-10 Thread via GitHub
Blizzara commented on code in PR #15123: URL: https://github.com/apache/datafusion/pull/15123#discussion_r1987288818 ## datafusion/core/tests/physical_optimizer/projection_pushdown.rs: ## @@ -89,6 +92,10 @@ impl ScalarUDFImpl for DummyUDF { fn return_type(&self, _arg_types:

Re: [I] March 2025 ASF Board Report (March 12) [datafusion]

2025-03-10 Thread via GitHub
alamb commented on issue #13713: URL: https://github.com/apache/datafusion/issues/13713#issuecomment-2710480146 I have incorporated @robtandy and @kevinjqliu 's comments. Here is the current draft ``` ## Description: The mission of Apache DataFusion is the creation and maintenan

Re: [PR] Order Requirement Analysis [datafusion-site]

2025-03-10 Thread via GitHub
alamb commented on PR #58: URL: https://github.com/apache/datafusion-site/pull/58#issuecomment-2710581497 Giving it another read now -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c

Re: [PR] chore: remove deprecated variants of UDF's invoke (invoke, invoke_no_args, invoke_batch) [datafusion]

2025-03-10 Thread via GitHub
Weijun-H commented on code in PR #15123: URL: https://github.com/apache/datafusion/pull/15123#discussion_r1987269201 ## datafusion/core/tests/physical_optimizer/projection_pushdown.rs: ## @@ -89,6 +92,10 @@ impl ScalarUDFImpl for DummyUDF { fn return_type(&self, _arg_types:
