[PR] Minor: Optimize byte view benchmark to add more groups and more testing cases. [datafusion]

2025-07-22 Thread via GitHub
zhuqi-lucas opened a new pull request, #16862: URL: https://github.com/apache/datafusion/pull/16862 ## Which issue does this PR close? Follow-up https://github.com/apache/datafusion/pull/16826 1. I found that some caching affects this benchmark, so I added more groups in this PR.

Re: [PR] dissallow pushdown of volatile PhysicalExprs [datafusion]

2025-07-22 Thread via GitHub
theirix commented on PR #16861: URL: https://github.com/apache/datafusion/pull/16861#issuecomment-3105891200 Thank you! I'll check my cases. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [PR] Fix `next_up` and `next_down` behavior for zero float values [datafusion]

2025-07-22 Thread via GitHub
berkaysynnada commented on PR #16745: URL: https://github.com/apache/datafusion/pull/16745#issuecomment-3105944943 Hi again @liamzwbao. We’ve discussed this with @ozankabak, and the actual fix should be on the `PartialOrd` implementation of ScalarValue of floats. The comparison currently us

Re: [PR] test: Fix flaky join tests [datafusion]

2025-07-22 Thread via GitHub
2010YOUY01 commented on code in PR #16860: URL: https://github.com/apache/datafusion/pull/16860#discussion_r2224349405 ## datafusion/sqllogictest/test_files/joins.slt: ## @@ -4164,23 +4164,40 @@ AS VALUES (3, 3, true), (3, 3, false); -query B -SELECT * FROM t0 FULL JOIN

[PR] Snowflake: Numeric prefix for stage name part [datafusion-sqlparser-rs]

2025-07-22 Thread via GitHub
yoavcloud opened a new pull request, #1966: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1966 Add support for a numeric prefix for a stage name

Re: [PR] feat: Upgrade to the official DataFusion 49.0.0 release [datafusion-comet]

2025-07-22 Thread via GitHub
dharanad commented on PR #1997: URL: https://github.com/apache/datafusion-comet/pull/1997#issuecomment-3105504827 > @dharanad any plans on this one? I can take this if you busy @comphead Been occupied at work. Please feel free to pick this one up

Re: [PR] feat: Upgrade to the official DataFusion 49.0.0 release [datafusion-comet]

2025-07-22 Thread via GitHub
dharanad commented on PR #1997: URL: https://github.com/apache/datafusion-comet/pull/1997#issuecomment-3105504858 > @dharanad any plans on this one? I can take this if you busy @comphead Been occupied at work. Please feel free to pick this one up

Re: [PR] dissallow pushdown of volatile PhysicalExprs [datafusion]

2025-07-22 Thread via GitHub
adriangb commented on PR #16861: URL: https://github.com/apache/datafusion/pull/16861#issuecomment-3105503133 @theirix could you take a look? Are there any other expressions we should disallow? @alamb are there any existing APIs to get "volatility" from a `PhysicalExpr`? If not shou

[PR] dissallow pushdown of volatile PhysicalExprs [datafusion]

2025-07-22 Thread via GitHub
adriangb opened a new pull request, #16861: URL: https://github.com/apache/datafusion/pull/16861 Closes #16545

[PR] test: Fix flaky join tests [datafusion]

2025-07-22 Thread via GitHub
2010YOUY01 opened a new pull request, #16860: URL: https://github.com/apache/datafusion/pull/16860 ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included in this PR? When I was working on a j

Re: [PR] Deprecate `ExprSchema` functions [datafusion]

2025-07-22 Thread via GitHub
github-actions[bot] closed pull request #15847: Deprecate `ExprSchema` functions URL: https://github.com/apache/datafusion/pull/15847

Re: [PR] feat: add macros for DataFusionError variants [datafusion]

2025-07-22 Thread via GitHub
github-actions[bot] commented on PR #15946: URL: https://github.com/apache/datafusion/pull/15946#issuecomment-3105395460 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] fix: Allow ORDER BY aggregates not present in SELECT list [datafusion]

2025-07-22 Thread via GitHub
github-actions[bot] closed pull request #15876: fix: Allow ORDER BY aggregates not present in SELECT list URL: https://github.com/apache/datafusion/pull/15876

Re: [PR] Fix `datafusion-cli` memory leak by using `snmalloc` [datafusion]

2025-07-22 Thread via GitHub
github-actions[bot] closed pull request #15963: Fix `datafusion-cli` memory leak by using `snmalloc` URL: https://github.com/apache/datafusion/pull/15963

Re: [PR] Demonstrate wrong statistics reported from parquet [datafusion]

2025-07-22 Thread via GitHub
github-actions[bot] closed pull request #15977: Demonstrate wrong statistics reported from parquet URL: https://github.com/apache/datafusion/pull/15977

Re: [PR] Optimize hash partitioning for cache friendliness [datafusion]

2025-07-22 Thread via GitHub
github-actions[bot] closed pull request #15981: Optimize hash partitioning for cache friendliness URL: https://github.com/apache/datafusion/pull/15981

Re: [PR] Fix Correlated Subquery With Depth Larger Than One [datafusion]

2025-07-22 Thread via GitHub
github-actions[bot] commented on PR #16060: URL: https://github.com/apache/datafusion/pull/16060#issuecomment-3105395323 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [I] Release DataFusion `49.0.0` (July 2025) [datafusion]

2025-07-22 Thread via GitHub
comphead commented on issue #16235: URL: https://github.com/apache/datafusion/issues/16235#issuecomment-3105350373 Testing on Comet, seeing some test failures

Re: [I] Release DataFusion `49.0.0` (July 2025) [datafusion]

2025-07-22 Thread via GitHub
shehabgamin commented on issue #16235: URL: https://github.com/apache/datafusion/issues/16235#issuecomment-3105316541 Tested on Sail. Everything looks good! https://github.com/lakehq/sail/pull/653

[PR] speedup `date_trunc` (~7x faster) in some cases [datafusion]

2025-07-22 Thread via GitHub
waynexia opened a new pull request, #16859: URL: https://github.com/apache/datafusion/pull/16859 ## Which issue does this PR close? - follow-up of #14593 ## Rationale for this change Follows the comment https://github.com/apache/datafusion/pull/14593#disc

Re: [I] Physical plan pushdown for volatile predicates [datafusion]

2025-07-22 Thread via GitHub
adriangb commented on issue #16545: URL: https://github.com/apache/datafusion/issues/16545#issuecomment-3104993647 It seems reasonable to me to blacklist some filters. But given that users can create arbitrary trait implementations for PhysicalExpr we obviously can't black list everything.

Re: [PR] feat: Upgrade to the official DataFusion 49.0.0 release [datafusion-comet]

2025-07-22 Thread via GitHub
comphead commented on PR #1997: URL: https://github.com/apache/datafusion-comet/pull/1997#issuecomment-3104955716 @dharanad any plans on this one? I can take this if you busy

Re: [I] Physical plan pushdown for volatile predicates [datafusion]

2025-07-22 Thread via GitHub
theirix commented on issue #16545: URL: https://github.com/apache/datafusion/issues/16545#issuecomment-3104952628 > > I expect the physical plan optimiser doesn't perform pushdown of volatile predicates. > > I am not sure -- does this result in wrong results? We don't observe i

[PR] MySQL: ALTER TABLE RENAME AS [datafusion-sqlparser-rs]

2025-07-22 Thread via GitHub
altmannmarcelo opened a new pull request, #1965: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1965 Add support for RENAME AS in ALTER TABLE. This is a MySQL extension.

Re: [PR] Ensure Substrait consumer can handle expressions in VirtualTable [datafusion]

2025-07-22 Thread via GitHub
vbarua commented on code in PR #16857: URL: https://github.com/apache/datafusion/pull/16857#discussion_r2223862112 ## datafusion/substrait/tests/testdata/test_plans/select_count_from_select_1_virtual_table_expressions.substrait.json: ## @@ -0,0 +1,94 @@ +{ Review Comment: mi

Re: [PR] Fixes 3 bugs during serialization and deserialization of physical plans [datafusion]

2025-07-22 Thread via GitHub
NGA-TRAN commented on code in PR #16858: URL: https://github.com/apache/datafusion/pull/16858#discussion_r2223808490 ## datafusion/core/src/physical_planner.rs: ## @@ -1358,6 +1358,9 @@ impl DefaultPhysicalPlanner { physical_name(expr), ))?]

Re: [PR] Fixes 3 bugs during serialization and deserialization of physical plans [datafusion]

2025-07-22 Thread via GitHub
NGA-TRAN commented on code in PR #16858: URL: https://github.com/apache/datafusion/pull/16858#discussion_r2223805868 ## datafusion/proto/proto/datafusion.proto: ## @@ -859,6 +859,7 @@ message PhysicalScalarUdfNode { optional bytes fun_definition = 3; datafusion_common.Arro

[PR] Fixes 3 bugs during serialization and deserialization of physical plans [datafusion]

2025-07-22 Thread via GitHub
NGA-TRAN opened a new pull request, #16858: URL: https://github.com/apache/datafusion/pull/16858 ## Which issue does this PR close? - Closes #16772 ## Rationale for this change Serialization and deserialization of physical plans are currently incomplete

Re: [PR] Fix: common_sub_expression_eliminate optimizer rule failed [datafusion]

2025-07-22 Thread via GitHub
Col-Waltz commented on PR #16066: URL: https://github.com/apache/datafusion/pull/16066#issuecomment-3104751477 Thanks again for your comment. I tried to compact the code, but it leads to additional trait implementations in other parts of the project, so I assumed that it will be better to

Re: [PR] Fix: common_sub_expression_eliminate optimizer rule failed [datafusion]

2025-07-22 Thread via GitHub
Col-Waltz commented on code in PR #16066: URL: https://github.com/apache/datafusion/pull/16066#discussion_r2223769176 ## datafusion/optimizer/src/common_subexpr_eliminate.rs: ## @@ -316,6 +316,19 @@ impl CommonSubexprEliminate { } => {

Re: [I] Improve performance on ClickBench [datafusion-comet]

2025-07-22 Thread via GitHub
Iskander14yo commented on issue #2035: URL: https://github.com/apache/datafusion-comet/issues/2035#issuecomment-3104690263 @parthchandra `query.py` in the original PR contains the whole SparkSession configuration. Here you can see following: ```python df = spark.read.parquet("hits.par

Re: [I] Improve performance on ClickBench [datafusion-comet]

2025-07-22 Thread via GitHub
parthchandra commented on issue #2035: URL: https://github.com/apache/datafusion-comet/issues/2035#issuecomment-3104670999 Thanks @Iskander14yo. I'll try that. Do you know how the values for EventDate are meant to be interpreted? I tried days in unix epoch but that also filtered out all t

Re: [I] Only 4 tpc-h queries have matching physical plans before serialization and after deserialization [datafusion]

2025-07-22 Thread via GitHub
NGA-TRAN commented on issue #16772: URL: https://github.com/apache/datafusion/issues/16772#issuecomment-3104669997 There are 3 bugs in total during serialization and deserialization. I have narrowed down all repros and fixes for them. The PR will be out soon

Re: [PR] feat: enhance support for Decimal128 and Decimal256 [datafusion]

2025-07-22 Thread via GitHub
theirix commented on code in PR #16831: URL: https://github.com/apache/datafusion/pull/16831#discussion_r2223673060 ## datafusion/optimizer/src/simplify_expressions/utils.rs: ## @@ -168,10 +133,17 @@ pub fn is_one(s: &Expr) -> bool { Expr::Literal(ScalarValue::Float64(S

Re: [PR] feat: enhance support for Decimal128 and Decimal256 [datafusion]

2025-07-22 Thread via GitHub
findepi commented on code in PR #16831: URL: https://github.com/apache/datafusion/pull/16831#discussion_r2223630607 ## datafusion/common/src/scalar/mod.rs: ## @@ -1790,6 +1808,27 @@ impl ScalarValue { (Self::Float64(Some(l)), Self::Float64(Some(r))) => {

Re: [PR] feat: enhance support for Decimal128 and Decimal256 [datafusion]

2025-07-22 Thread via GitHub
theirix commented on code in PR #16831: URL: https://github.com/apache/datafusion/pull/16831#discussion_r2223625696 ## datafusion/common/src/scalar/mod.rs: ## @@ -1790,6 +1808,27 @@ impl ScalarValue { (Self::Float64(Some(l)), Self::Float64(Some(r))) => {

Re: [PR] feat: enhance support for Decimal128 and Decimal256 [datafusion]

2025-07-22 Thread via GitHub
theirix commented on code in PR #16831: URL: https://github.com/apache/datafusion/pull/16831#discussion_r2223620746 ## datafusion/common/src/scalar/mod.rs: ## @@ -1382,6 +1382,12 @@ impl ScalarValue { DataType::Float16 => ScalarValue::Float16(Some(f16::from_f32(1.0

[PR] Ensure Substrait consumer can handle expressions in VirtualTable [datafusion]

2025-07-22 Thread via GitHub
lorenarosati opened a new pull request, #16857: URL: https://github.com/apache/datafusion/pull/16857 ## Which issue does this PR close? - Closes #16363. ## Rationale for this change Virtual tables in Substrait can have either literal values or expressions, an

[PR] MINOR: add unit tests for chr function [datafusion]

2025-07-22 Thread via GitHub
waynexia opened a new pull request, #16856: URL: https://github.com/apache/datafusion/pull/16856 ## Which issue does this PR close? - Closes #. ## Rationale for this change I wonder if `chr()` works for larger or negative input, as I read from [`encode_ut

Re: [PR] Speed up `chr` UDF (~4x faster) [datafusion]

2025-07-22 Thread via GitHub
waynexia commented on PR #14700: URL: https://github.com/apache/datafusion/pull/14700#issuecomment-3104237037 > how long does it take to compile the benches library on running `cargo bench` for me it takes ~1min to compile `cargo bench --bench chr`

Re: [PR] Address memory over-accounting in array_agg [datafusion]

2025-07-22 Thread via GitHub
gabotechs commented on code in PR #16816: URL: https://github.com/apache/datafusion/pull/16816#discussion_r2223443238 ## datafusion/functions-aggregate/src/array_agg.rs: ## @@ -315,11 +313,7 @@ impl Accumulator for ArrayAggAccumulator { }; if !val.is_empty()

Re: [PR] feat: support literal for ARRAY top level [datafusion-comet]

2025-07-22 Thread via GitHub
comphead commented on PR #1978: URL: https://github.com/apache/datafusion-comet/pull/1978#issuecomment-3104162980 Depends on https://github.com/apache/datafusion-comet/pull/1997

Re: [PR] Feat: Impl array flatten func [datafusion-comet]

2025-07-22 Thread via GitHub
comphead commented on code in PR #2039: URL: https://github.com/apache/datafusion-comet/pull/2039#discussion_r2223413371 ## spark/src/main/scala/org/apache/comet/serde/arrays.scala: ## @@ -378,3 +378,38 @@ object CometCreateArray extends CometExpressionSerde { } } } + +

Re: [PR] Feat: Impl array flatten func [datafusion-comet]

2025-07-22 Thread via GitHub
comphead commented on code in PR #2039: URL: https://github.com/apache/datafusion-comet/pull/2039#discussion_r2223409537 ## docs/spark_expressions_support.md: ## @@ -98,7 +98,7 @@ - [x] arrays_overlap - [ ] arrays_zip - [x] element_at - - [ ] flatten + - [x] flatten Revie

Re: [PR] Chore: Improve array contains test coverage [datafusion-comet]

2025-07-22 Thread via GitHub
comphead commented on code in PR #2030: URL: https://github.com/apache/datafusion-comet/pull/2030#discussion_r2223402109 ## spark/src/main/scala/org/apache/comet/serde/arrays.scala: ## @@ -136,7 +136,7 @@ object CometArrayAppend extends CometExpressionSerde with IncompatExpr {

Re: [PR] Address memory over-accounting in array_agg [datafusion]

2025-07-22 Thread via GitHub
findepi commented on code in PR #16816: URL: https://github.com/apache/datafusion/pull/16816#discussion_r2223260832 ## datafusion/functions-aggregate/src/array_agg.rs: ## @@ -315,11 +313,7 @@ impl Accumulator for ArrayAggAccumulator { }; if !val.is_empty() {

Re: [PR] Improve async_udf example and docs [datafusion]

2025-07-22 Thread via GitHub
findepi commented on code in PR #16846: URL: https://github.com/apache/datafusion/pull/16846#discussion_r2223254701 ## datafusion-examples/examples/async_udf.rs: ## @@ -15,104 +15,104 @@ // specific language governing permissions and limitations // under the License. -use ar

Re: [PR] Improve async_udf example and docs [datafusion]

2025-07-22 Thread via GitHub
findepi commented on code in PR #16846: URL: https://github.com/apache/datafusion/pull/16846#discussion_r2223250443 ## docs/source/library-user-guide/functions/adding-udfs.md: ## @@ -419,6 +432,7 @@ impl ScalarUDFImpl for AsyncUpper { Ok(DataType::Utf8) } +//

Re: [PR] fix: clean up [iceberg] integration APIs [datafusion-comet]

2025-07-22 Thread via GitHub
huaxingao commented on PR #2032: URL: https://github.com/apache/datafusion-comet/pull/2032#issuecomment-3103829451 @parthchandra I have put back the methods and marked them deprecated. Could you please take one more look?

[PR] Feat: Impl array flatten func [datafusion-comet]

2025-07-22 Thread via GitHub
kazantsev-maksim opened a new pull request, #2039: URL: https://github.com/apache/datafusion-comet/pull/2039 ## Which issue does this PR close? Related to Epic: https://github.com/apache/datafusion-comet/issues/1042 array_except: flatten(array(array(1, 2), array(3, 4))) => [1,2,3,4]

Re: [I] Only 4 tpc-h queries have matching physical plans before serialization and after deserialization [datafusion]

2025-07-22 Thread via GitHub
NGA-TRAN commented on issue #16772: URL: https://github.com/apache/datafusion/issues/16772#issuecomment-3103487562 I am investigating this and hopefully will have a fix

Re: [PR] [branch-49] chore: use `equals_datatype` for `BinaryExpr`. Cherry pick to DF 49.0 [datafusion]

2025-07-22 Thread via GitHub
comphead commented on PR #16847: URL: https://github.com/apache/datafusion/pull/16847#issuecomment-3103215052 Thanks @xudong963 for the review

Re: [PR] Docs: Update Upgrading.md to reflect 49.0.0 is released [datafusion]

2025-07-22 Thread via GitHub
alamb commented on PR #16853: URL: https://github.com/apache/datafusion/pull/16853#issuecomment-3103194835 I'll plan to merge this PR once the DataFusion 49 release is complete

Re: [PR] feat: Allow tree explain format width to be customizable [datafusion]

2025-07-22 Thread via GitHub
alamb merged PR #16827: URL: https://github.com/apache/datafusion/pull/16827

Re: [PR] chore(deps): bump aws-credential-types from 1.2.3 to 1.2.4 [datafusion]

2025-07-22 Thread via GitHub
findepi merged PR #16815: URL: https://github.com/apache/datafusion/pull/16815

[PR] Snowflake: GRANT CREATE SCHEMA, GRANT .. ON ALL FUNCTIONS IN SCHEMA [datafusion-sqlparser-rs]

2025-07-22 Thread via GitHub
yoavcloud opened a new pull request, #1964: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1964 This PR addresses two issues: 1. Added missing `CREATE SCHEMA` privilege 2. Added missing `ON ALL FUNCTIONS IN SCHEMA` target object for grant

Re: [I] Release DataFusion `49.0.0` (July 2025) [datafusion]

2025-07-22 Thread via GitHub
alamb commented on issue #16235: URL: https://github.com/apache/datafusion/issues/16235#issuecomment-3102465161 I made a few documentation PRs as well: - https://github.com/apache/datafusion/pull/16855 - https://github.com/apache/datafusion/pull/16853 - https://github.com/apache/data

[PR] [main] Update version to 49.0.0, add 49.0.0 changelog [datafusion]

2025-07-22 Thread via GitHub
alamb opened a new pull request, #16855: URL: https://github.com/apache/datafusion/pull/16855 ## Which issue does this PR close? - part of https://github.com/apache/datafusion/issues/16235 ## Rationale for this change Let's bring the changes from the 49 branch to main so

Re: [PR] [branch-49] Final Changelog Tweaks [datafusion]

2025-07-22 Thread via GitHub
alamb merged PR #16852: URL: https://github.com/apache/datafusion/pull/16852

[PR] Fix: unnest with alias reports error [datafusion]

2025-07-22 Thread via GitHub
xudong963 opened a new pull request, #16854: URL: https://github.com/apache/datafusion/pull/16854 ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes teste

[PR] Docs: Update Upgrading.md to reflect 49.0.0 is released [datafusion]

2025-07-22 Thread via GitHub
alamb opened a new pull request, #16853: URL: https://github.com/apache/datafusion/pull/16853 ## Which issue does this PR close? - part of https://github.com/apache/datafusion/issues/16235 - part of https://github.com/apache/datafusion/issues/16799 ## Rationale for this chang

Re: [I] Memory accounting model discussion [datafusion]

2025-07-22 Thread via GitHub
notfilippo commented on issue #16841: URL: https://github.com/apache/datafusion/issues/16841#issuecomment-3102449595 According to the docs in [`MemoryPool`](https://docs.rs/datafusion/latest/datafusion/execution/memory_pool/trait.MemoryPool.html#memory-management-design): > DataFusion

[PR] [branch-49] Final Changelog Tweaks [datafusion]

2025-07-22 Thread via GitHub
alamb opened a new pull request, #16852: URL: https://github.com/apache/datafusion/pull/16852 ## Which issue does this PR close? - part of https://github.com/apache/datafusion/issues/16235 - related to https://github.com/apache/datafusion/pull/16847 ## Rationale for this

Re: [I] Optimize concatenation of complex data type, such as list, struct [datafusion]

2025-07-22 Thread via GitHub
xudong963 commented on issue #16838: URL: https://github.com/apache/datafusion/issues/16838#issuecomment-3102442296 Thanks @zhuqi-lucas. Our scenario is `list(struct{})`, and the inner fields of struct are like: ```rust let schema = Arc::new(Schema::new(vec![ Field::new("col1",

Re: [PR] Fix flaky test case in joins.slt [datafusion]

2025-07-22 Thread via GitHub
xudong963 merged PR #16849: URL: https://github.com/apache/datafusion/pull/16849

Re: [PR] chore(deps): bump sysinfo from 0.35.2 to 0.36.1 [datafusion]

2025-07-22 Thread via GitHub
xudong963 merged PR #16850: URL: https://github.com/apache/datafusion/pull/16850

Re: [PR] Improve async_udf example and docs [datafusion]

2025-07-22 Thread via GitHub
alamb commented on code in PR #16846: URL: https://github.com/apache/datafusion/pull/16846#discussion_r306119 ## datafusion-examples/examples/async_udf.rs: ## @@ -15,104 +15,104 @@ // specific language governing permissions and limitations // under the License. -use arro

Re: [PR] Improve async_udf example and docs [datafusion]

2025-07-22 Thread via GitHub
xudong963 commented on code in PR #16846: URL: https://github.com/apache/datafusion/pull/16846#discussion_r295505 ## datafusion-examples/examples/async_udf.rs: ## @@ -15,104 +15,104 @@ // specific language governing permissions and limitations // under the License. -use

Re: [I] Release DataFusion `49.0.0` (July 2025) [datafusion]

2025-07-22 Thread via GitHub
alamb commented on issue #16235: URL: https://github.com/apache/datafusion/issues/16235#issuecomment-3102413513 Thanks @xudong963 🚀

Re: [I] Release DataFusion `49.0.0` (July 2025) [datafusion]

2025-07-22 Thread via GitHub
xudong963 commented on issue #16235: URL: https://github.com/apache/datafusion/issues/16235#issuecomment-3102384393 Vote process: https://lists.apache.org/thread/c2nq0tloxz0xg1dnhdbjgs19kroynfb1

[PR] fix: `PlaceholderRowExec::partition_statistics` [datafusion]

2025-07-22 Thread via GitHub
crepererum opened a new pull request, #16851: URL: https://github.com/apache/datafusion/pull/16851 ## Which issue does this PR close? \- ## Rationale for this change The current stats are very vague and can actually be better. ## What changes are included in this PR? Us

Re: [I] Release DataFusion `49.0.0` (July 2025) [datafusion]

2025-07-22 Thread via GitHub
xudong963 commented on issue #16235: URL: https://github.com/apache/datafusion/issues/16235#issuecomment-3102356529 > [@alamb](https://github.com/alamb) I will be done testing by tomorrow EOD. If that blocks the RC, I can always test the RC. +1, I'll test mv lib with RC1

Re: [I] Release DataFusion `49.0.0` (July 2025) [datafusion]

2025-07-22 Thread via GitHub
xudong963 commented on issue #16235: URL: https://github.com/apache/datafusion/issues/16235#issuecomment-3102346353 FYI, I'm doing the voting process

Re: [PR] Improve async_udf example and docs [datafusion]

2025-07-22 Thread via GitHub
alamb commented on code in PR #16846: URL: https://github.com/apache/datafusion/pull/16846#discussion_r178654 ## docs/source/library-user-guide/functions/adding-udfs.md: ## @@ -434,13 +448,17 @@ impl AsyncScalarUDFImpl for AsyncUpper { Some(10) } +/// Thi

Re: [PR] Improve async_udf example and docs [datafusion]

2025-07-22 Thread via GitHub
alamb commented on code in PR #16846: URL: https://github.com/apache/datafusion/pull/16846#discussion_r172192 ## docs/source/library-user-guide/functions/adding-udfs.md: ## @@ -345,12 +354,17 @@ async fn main() { } ``` -## Adding a Scalar Async UDF +## Adding a Async Sca

Re: [PR] [branch-49] chore: use `equals_datatype` for `BinaryExpr`. Cherry pick to DF 49.0 [datafusion]

2025-07-22 Thread via GitHub
alamb commented on PR #16847: URL: https://github.com/apache/datafusion/pull/16847#issuecomment-3102220306 Thanks @comphead and @xudong963

Re: [PR] [branch-49] chore: use `equals_datatype` for `BinaryExpr`. Cherry pick to DF 49.0 [datafusion]

2025-07-22 Thread via GitHub
alamb merged PR #16847: URL: https://github.com/apache/datafusion/pull/16847

Re: [PR] Blog: Fix page overflow [datafusion-site]

2025-07-22 Thread via GitHub
alamb merged PR #92: URL: https://github.com/apache/datafusion-site/pull/92

Re: [PR] Blog: Fix page overflow [datafusion-site]

2025-07-22 Thread via GitHub
alamb commented on PR #92: URL: https://github.com/apache/datafusion-site/pull/92#issuecomment-3102153976 > Sorry I’m not available all week. I’ll try to catch up on things next Monday. No worries -- I'll merge this one and we can always adjust in the future if we need to. Tha

Re: [PR] Add note to upgrade guide about MSRV update [datafusion]

2025-07-22 Thread via GitHub
alamb merged PR #16845: URL: https://github.com/apache/datafusion/pull/16845

Re: [PR] Add note to upgrade guide about MSRV update [datafusion]

2025-07-22 Thread via GitHub
alamb commented on PR #16845: URL: https://github.com/apache/datafusion/pull/16845#issuecomment-3102149601 https://github.com/user-attachments/assets/cffd0caf-03d4-4a38-90ee-10fa83d6a22b Thank you @findepi 🤦

[PR] chore(deps): bump sysinfo from 0.35.2 to 0.36.1 [datafusion]

2025-07-22 Thread via GitHub
dependabot[bot] opened a new pull request, #16850: URL: https://github.com/apache/datafusion/pull/16850 Bumps [sysinfo](https://github.com/GuillaumeGomez/sysinfo) from 0.35.2 to 0.36.1. Changelog Sourced from https://github.com/GuillaumeGomez/sysinfo/blob/master/CHANGELOG.md sysi

Re: [PR] chore(deps): bump sysinfo from 0.35.2 to 0.36.0 [datafusion]

2025-07-22 Thread via GitHub
dependabot[bot] closed pull request #16747: chore(deps): bump sysinfo from 0.35.2 to 0.36.0 URL: https://github.com/apache/datafusion/pull/16747

Re: [PR] chore(deps): bump sysinfo from 0.35.2 to 0.36.0 [datafusion]

2025-07-22 Thread via GitHub
dependabot[bot] commented on PR #16747: URL: https://github.com/apache/datafusion/pull/16747#issuecomment-3101819137 Superseded by #16850.

[I] URL in doc comments is returning HTTP 404 (not found) [datafusion-sqlparser-rs]

2025-07-22 Thread via GitHub
jcsherin opened a new issue, #1963: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1963 While reviewing https://github.com/apache/datafusion/pull/16839#issuecomment-3097357784 I found these broken links in the doc comments. 1. The ClickHouse `attach-partitionpart` link m

Re: [I] CI: Check broken links in src doc comments [datafusion]

2025-07-22 Thread via GitHub
jcsherin commented on issue #16840: URL: https://github.com/apache/datafusion/issues/16840#issuecomment-3101422403 @Adez017 Thank you. These are the commands I used. To generate rust docs (same as `ci/scripts/rust_docs.sh`): ```text $ RUSTDOCFLAGS="-D warnings" cargo doc --documen

Re: [PR] Address memory over-accounting in array_agg [datafusion]

2025-07-22 Thread via GitHub
gabotechs commented on code in PR #16816: URL: https://github.com/apache/datafusion/pull/16816#discussion_r2221469424 ## datafusion/functions-aggregate/src/array_agg.rs: ## @@ -315,11 +313,7 @@ impl Accumulator for ArrayAggAccumulator { }; if !val.is_empty()

Re: [I] CI: Check broken links in src doc comments [datafusion]

2025-07-22 Thread via GitHub
Adez017 commented on issue #16840: URL: https://github.com/apache/datafusion/issues/16840#issuecomment-3101400436 > > hi [@2010YOUY01](https://github.com/2010YOUY01) i would try to validate the concerned issue and find out a way > > Thank you. > > You can try `lychee` mentioned

Re: [PR] Fix flaky test case in joins.slt [datafusion]

2025-07-22 Thread via GitHub
findepi commented on PR #16849: URL: https://github.com/apache/datafusion/pull/16849#issuecomment-3101398359 cc @parkma99, @alamb

Re: [I] CI: Check broken links in src doc comments [datafusion]

2025-07-22 Thread via GitHub
2010YOUY01 commented on issue #16840: URL: https://github.com/apache/datafusion/issues/16840#issuecomment-3101392719 > hi [@2010YOUY01](https://github.com/2010YOUY01) i would try to validate the concerned issue and find out a way Thank you. You can try `lychee` mentioned by ht

Re: [PR] chore(deps): bump aws-credential-types from 1.2.3 to 1.2.4 [datafusion]

2025-07-22 Thread via GitHub
findepi commented on PR #16815: URL: https://github.com/apache/datafusion/pull/16815#issuecomment-3101386952 This failed due to a flaky test 🙁 - https://github.com/apache/datafusion/pull/16849

[PR] Fix flaky test case in joins.slt [datafusion]

2025-07-22 Thread via GitHub
findepi opened a new pull request, #16849: URL: https://github.com/apache/datafusion/pull/16849 The query lacks total ordering, so `rowsort` is appropriate.

Re: [PR] Address memory over-accounting in array_agg [datafusion]

2025-07-22 Thread via GitHub
findepi commented on code in PR #16816: URL: https://github.com/apache/datafusion/pull/16816#discussion_r2221442844 ## datafusion/functions-aggregate/src/array_agg.rs: ## @@ -315,11 +313,7 @@ impl Accumulator for ArrayAggAccumulator { }; if !val.is_empty() {

Re: [PR] Improve async_udf example and docs [datafusion]

2025-07-22 Thread via GitHub
findepi commented on code in PR #16846: URL: https://github.com/apache/datafusion/pull/16846#discussion_r2221418135 ## datafusion-examples/examples/async_udf.rs: ## @@ -127,118 +123,45 @@ fn animal() -> Result { Ok(RecordBatch::try_new(schema, vec![id_array, name_array])?)