Re: [I] Simple Functions [datafusion]

2025-02-13 Thread via GitHub
shehabgamin commented on issue #12635: URL: https://github.com/apache/datafusion/issues/12635#issuecomment-2655822219 @findepi Thank you for thoroughly exploring this topic and creating such a detailed design document! @linhr and I have reviewed it and have some initial thoughts:

Re: [I] Parser error with GROUP BY with multiple filters on DataFusion 45 [datafusion]

2025-02-13 Thread via GitHub
2010YOUY01 commented on issue #14633: URL: https://github.com/apache/datafusion/issues/14633#issuecomment-2655916784 > > So as a workaround we could use any dialect that supports it (e.g. postgresql), gotcha. > > That sounds like it should work. From some googling it looks like the `

[PR] Differentiate LEFT JOIN from LEFT OUTER JOIN [datafusion-sqlparser-rs]

2025-02-13 Thread via GitHub
mvzink opened a new pull request, #1726: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1726 And same for RIGHT JOIN. Although these are generally syntactically equivalent, I believe it is preferable to preserve the original input when rendering. -- This is an automate

Re: [PR] Minor: remove unnecessary `update_plan_from_children` call [datafusion]

2025-02-13 Thread via GitHub
wiedld commented on PR #14650: URL: https://github.com/apache/datafusion/pull/14650#issuecomment-2657959828 > Thank you for starting to work on this code @xudong963 > > BTW I believe @wiedld also plans to work to make this code better, see > > > Maybe you can work togethe

Re: [I] Attach `Diagnostic` to "wrong number of arguments" error [datafusion]

2025-02-13 Thread via GitHub
Chen-Yuan-Lai commented on issue #14432: URL: https://github.com/apache/datafusion/issues/14432#issuecomment-2658302749 Hi @eliaperantoni sorry for the long delay. I’ve been taking some time to familiarize myself with this issue. I found that the "wrong number of arguments" error may occur i

Re: [PR] Minor: remove unnecessary `update_plan_from_children` call [datafusion]

2025-02-13 Thread via GitHub
xudong963 commented on PR #14650: URL: https://github.com/apache/datafusion/pull/14650#issuecomment-2658391157 > Also, it looks like the failing CI tests are related to the sort enforcement. Such as not removing unnecessary sorts in some of the children. The problem is that the code s

Re: [PR] Minor: remove unnecessary `update_plan_from_children` call [datafusion]

2025-02-13 Thread via GitHub
xudong963 commented on PR #14650: URL: https://github.com/apache/datafusion/pull/14650#issuecomment-2658392362 > I can make sure to add you as a reviewer on those. Please lmk if/how you want to coordinate. Sure, please add me.

Re: [PR] Signature::Coercible with user defined implicit casting [datafusion]

2025-02-13 Thread via GitHub
jayzhan211 commented on code in PR #14440: URL: https://github.com/apache/datafusion/pull/14440#discussion_r1955414348 ## datafusion/expr-common/src/signature.rs: ## @@ -365,7 +365,12 @@ impl TypeSignature { } } -/// get all possible types for the given `Type

Re: [PR] Signature::Coercible with user defined implicit casting [datafusion]

2025-02-13 Thread via GitHub
jayzhan211 commented on PR #14440: URL: https://github.com/apache/datafusion/pull/14440#issuecomment-2658030158 Enum looks good to me but "adding user defined coercion variant" is not clear to me since I think the Coercible signature is user-defined builtin, they can change the type they ne

Re: [I] Documentation regarding running/regenerating stability test plans [datafusion-comet]

2025-02-13 Thread via GitHub
andygrove commented on issue #1393: URL: https://github.com/apache/datafusion-comet/issues/1393#issuecomment-2658030599 I agree that the documentation is in need of improvement. I also found that the files get generated to `$SPARK_HOME`, so I usually set this env var to point to the root o

Re: [I] Documentation regarding running/regenerating stability test plans [datafusion-comet]

2025-02-13 Thread via GitHub
andygrove commented on issue #1393: URL: https://github.com/apache/datafusion-comet/issues/1393#issuecomment-2658031676 It is also necessary to use different Java versions for different Spark versions. I use JDK 11 for Spark 3.x.x and JDK 17 for Spark 4.x.x

[I] Feature: support Timestamp with TZ for function `to_unixtime` [datafusion]

2025-02-13 Thread via GitHub
xudong963 opened a new issue, #14659: URL: https://github.com/apache/datafusion/issues/14659 ### Is your feature request related to a problem or challenge? ```sql > select to_unixtime(arrow_cast('2020-01-01 00:10:20.123'::timestamp, 'Timestamp(Second, Some("America/New_York"))'));

[I] Support SQL pipe operator [datafusion]

2025-02-13 Thread via GitHub
simonvandel opened a new issue, #14660: URL: https://github.com/apache/datafusion/issues/14660 ### Is your feature request related to a problem or challenge? Google BigQuery is releasing support for a pipe operator in their supported SQL dialect. See https://cloud.google.com/bigquery/

Re: [I] Attach `Diagnostic` to "wrong number of arguments" error [datafusion]

2025-02-13 Thread via GitHub
eliaperantoni commented on issue #14432: URL: https://github.com/apache/datafusion/issues/14432#issuecomment-2658498131 Hey @Chen-Yuan-Lai, absolutely no problem! Was checking in to see if you needed any help 😊. I see, that's a bit unfortunate. It seems like the first message is produced by

Re: [PR] feat/adding scalar-variable expression [datafusion-sqlparser-rs]

2025-02-13 Thread via GitHub
github-actions[bot] commented on PR #1260: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1260#issuecomment-2658092811 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or

Re: [PR] feat: Support On-Demand Repartition [datafusion]

2025-02-13 Thread via GitHub
Weijun-H commented on PR #14411: URL: https://github.com/apache/datafusion/pull/14411#issuecomment-2658410453 > > > > I wonder why tpch_mem_sf10 is slower for some queries? Might it be possible the created memtable is not created evenly because of the new round robin (that might be fixable

Re: [I] External sorting not working for (maybe only for string columns??) [datafusion]

2025-02-13 Thread via GitHub
xuchen-plus commented on issue #12136: URL: https://github.com/apache/datafusion/issues/12136#issuecomment-2655841477 > Thanks [@xuchen-plus](https://github.com/xuchen-plus) , unluckily after change to 32MB for sort_spill_reservation_bytes, it still failed for me, i am not sure which i am m

Re: [I] Documentation regarding running/regenerating stability test plans [datafusion-comet]

2025-02-13 Thread via GitHub
wForget commented on issue #1393: URL: https://github.com/apache/datafusion-comet/issues/1393#issuecomment-2655858437 You can try adding `-am` like `./mvnw -am -pl spark ...`, or run your command after `mvn install`.

Re: [PR] feat: add table source to DML proto to eliminate need for table lookup after deserialisation [datafusion]

2025-02-13 Thread via GitHub
alamb commented on code in PR #14631: URL: https://github.com/apache/datafusion/pull/14631#discussion_r1954394569 ## datafusion/core/src/datasource/memory.rs: ## @@ -648,9 +649,14 @@ mod tests { // Create a table scan logical plan to read from the source table

Re: [I] External sorting not working for (maybe only for string columns??) [datafusion]

2025-02-13 Thread via GitHub
alamb commented on issue #12136: URL: https://github.com/apache/datafusion/issues/12136#issuecomment-2656430639 > I have also encountered the same problem with string views. > > DataFusion uses `interleave` function to produce merged batches, and `interleave` tends to produce batches

Re: [I] Test DataFusion 45.0.0 with Sail [datafusion]

2025-02-13 Thread via GitHub
alamb commented on issue #14408: URL: https://github.com/apache/datafusion/issues/14408#issuecomment-2656435262 Since we have released DF 45 I think we can claim the testing is complete

Re: [I] CSVExec projection pushdown [datafusion]

2025-02-13 Thread via GitHub
alamb commented on issue #14161: URL: https://github.com/apache/datafusion/issues/14161#issuecomment-2656437386 Now that we have DataSourceExec, is this still an issue?

Re: [I] Test DataFusion 45.0.0 with Sail [datafusion]

2025-02-13 Thread via GitHub
alamb closed issue #14408: Test DataFusion 45.0.0 with Sail URL: https://github.com/apache/datafusion/issues/14408

Re: [I] External sorting not working for (maybe only for string columns??) [datafusion]

2025-02-13 Thread via GitHub
Kontinuation commented on issue #12136: URL: https://github.com/apache/datafusion/issues/12136#issuecomment-2656449956 > I think the fix for [apache/arrow-rs#6779](https://github.com/apache/arrow-rs/pull/6779) is in DataFusion 45 -- does this still happen? I am using the latest main

Re: [I] Reduce duplicate Projection when PARQUET_PUSHDOWN_FILTERS is on. [datafusion]

2025-02-13 Thread via GitHub
alamb closed issue #4398: Reduce duplicate Projection when PARQUET_PUSHDOWN_FILTERS is on. URL: https://github.com/apache/datafusion/issues/4398

Re: [I] Reduce duplicate Projection when PARQUET_PUSHDOWN_FILTERS is on. [datafusion]

2025-02-13 Thread via GitHub
alamb commented on issue #4398: URL: https://github.com/apache/datafusion/issues/4398#issuecomment-2656452977 The current plan on main has no extra ProjectionExec: ``` (venv) andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion/benchmarks/data/tpch_sf1$ datafusion-cli DataFusi

Re: [PR] fix: AQE creating a non-supported Final HashAggregate post-shuffle [datafusion-comet]

2025-02-13 Thread via GitHub
EmilyMatt commented on PR #1390: URL: https://github.com/apache/datafusion-comet/pull/1390#issuecomment-2656455632 @andygrove I've updated "test final min/max/count with result expressions" in the Aggregate suite to verify this more deeply, and I think many of the queries in the stability

Re: [I] Support lazy `SchemaProvider`s [datafusion]

2025-02-13 Thread via GitHub
alamb closed issue #1792: Support lazy `SchemaProvider`s URL: https://github.com/apache/datafusion/issues/1792

Re: [PR] fix: Reduce cast.rs logic from parquet_support.rs for experimental native readers [datafusion-comet]

2025-02-13 Thread via GitHub
parthchandra commented on PR #1387: URL: https://github.com/apache/datafusion-comet/pull/1387#issuecomment-2656383868 More info - with `native_iceberg_compat`: ``` from_type: Timestamp(Nanosecond, None) to_type: Timestamp(Microsecond, Some("America/Los_Angeles")) array: Primi

Re: [PR] Support bounds evaluation for temporal data types [datafusion]

2025-02-13 Thread via GitHub
ch-sc commented on PR #14523: URL: https://github.com/apache/datafusion/pull/14523#issuecomment-2656384285 > I try to add support for NotEq. If I find a solution, I can fix it within this PR itself—if that's okay with you. Sure, feel free to make any adjustment as you see fit.

Re: [I] External sorting not working for (maybe only for string columns??) [datafusion]

2025-02-13 Thread via GitHub
Kontinuation commented on issue #12136: URL: https://github.com/apache/datafusion/issues/12136#issuecomment-2656400964 I have also encountered the same problem with string views. DataFusion uses `interleave` function to produce merged batches, and `interleave` tends to produce batches

[PR] Add support --mem-pool-type and --memory-limit options for all benchmarks [datafusion]

2025-02-13 Thread via GitHub
Kontinuation opened a new pull request, #14642: URL: https://github.com/apache/datafusion/pull/14642 ## Which issue does this PR close? - Closes #14641. ## Rationale for this change I had to run sort-tpch queries with memory limit when testing fixes for memory re

Re: [I] Make `datafusion::physical_optimizer::projection_pushdown` public [datafusion]

2025-02-13 Thread via GitHub
alamb closed issue #8529: Make `datafusion::physical_optimizer::projection_pushdown` public URL: https://github.com/apache/datafusion/issues/8529

Re: [I] Make `datafusion::physical_optimizer::projection_pushdown` public [datafusion]

2025-02-13 Thread via GitHub
alamb commented on issue #8529: URL: https://github.com/apache/datafusion/issues/8529#issuecomment-2656443656 It appears `projection_pushdown` is now public: https://docs.rs/datafusion/latest/datafusion/physical_optimizer/projection_pushdown/index.html

Re: [PR] feat: metadata columns [datafusion]

2025-02-13 Thread via GitHub
chenkovsky commented on PR #14057: URL: https://github.com/apache/datafusion/pull/14057#issuecomment-2656443184 > I am sorry I have not been following along this PR closely -- is it ready for a final review @chenkovsky and @adriangb ? @alamb The discussion was very lively. fro

[PR] refactor: Make catalog datasource [datafusion]

2025-02-13 Thread via GitHub
logan-keede opened a new pull request, #14643: URL: https://github.com/apache/datafusion/pull/14643 ## Which issue does this PR close? - part of #1 ## Rationale for this change refer to discussion in https://github.com/apache/datafusion/pull/14616 ## Wh

Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2025-02-13 Thread via GitHub
alamb commented on issue #2326: URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2656486069 I think we will claim datafusion has pretty good structured type support now, so let's close this epic and treat additional work as new issues / items

Re: [I] Add the ability to create tables with deeply nested schemas in SQL [datafusion]

2025-02-13 Thread via GitHub
alamb commented on issue #11746: URL: https://github.com/apache/datafusion/issues/11746#issuecomment-2656489040 I think this is working well now ``` > create table t as values ( [ {'name': 'n1', 'info': {'color': 'r', 'size': 1}}, {'name': 'n2', 'info'

Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2025-02-13 Thread via GitHub
alamb closed issue #2326: [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) URL: https://github.com/apache/datafusion/issues/2326

[PR] bug: Fix memory reservation and allocation problems for SortExec [datafusion]

2025-02-13 Thread via GitHub
Kontinuation opened a new pull request, #14644: URL: https://github.com/apache/datafusion/pull/14644 ## Which issue does this PR close? - Hopefully it closes #10073, but it is still an incomplete solution. ## Rationale for this change I had a hard time making Data

Re: [I] Add the ability to create tables with deeply nested schemas in SQL [datafusion]

2025-02-13 Thread via GitHub
alamb closed issue #11746: Add the ability to create tables with deeply nested schemas in SQL URL: https://github.com/apache/datafusion/issues/11746

Re: [PR] Update Community Events in concepts-readings-events.md [datafusion]

2025-02-13 Thread via GitHub
oznur-synnada commented on PR #14629: URL: https://github.com/apache/datafusion/pull/14629#issuecomment-2656493238 Sure, @alamb I'll dive in and come up with a list of links that could be added to this page.

[PR] Update features / status documentation page [datafusion]

2025-02-13 Thread via GitHub
alamb opened a new pull request, #14645: URL: https://github.com/apache/datafusion/pull/14645 ## Which issue does this PR close? - closes https://github.com/apache/datafusion/issues/825 - closes https://github.com/apache/datafusion/issues/1222 ## Rationale for this change

Re: [I] Add documentation for support for skipping Parquet row groups [datafusion]

2025-02-13 Thread via GitHub
alamb commented on issue #825: URL: https://github.com/apache/datafusion/issues/825#issuecomment-2656506160 Proposed to add in - https://github.com/apache/datafusion/pull/14645

Re: [I] Add note about structured field access to supported SQL docs [datafusion]

2025-02-13 Thread via GitHub
alamb commented on issue #1222: URL: https://github.com/apache/datafusion/issues/1222#issuecomment-2656506031 Proposed to add in - https://github.com/apache/datafusion/pull/14645

[I] doc: Add Sleeper to known users list [datafusion]

2025-02-13 Thread via GitHub
m09526 opened a new issue, #14646: URL: https://github.com/apache/datafusion/issues/14646 ### Is your feature request related to a problem or challenge? [Sleeper](https://github.com/gchq/sleeper) is GCHQ's open-source, cloud-native, log-structured merge tree based, scalable key-value

Re: [I] Exact filter pushdown can be duplicated in some instances [datafusion]

2025-02-13 Thread via GitHub
alamb commented on issue #1116: URL: https://github.com/apache/datafusion/issues/1116#issuecomment-2656521042 When exact pushdown is enabled, there is no filter being applied in DataFusion anymore 🎉 Specifically, there is no `FilterExec` in this plan: ```sql > set datafusio

Re: [I] Exact filter pushdown can be duplicated in some instances [datafusion]

2025-02-13 Thread via GitHub
alamb closed issue #1116: Exact filter pushdown can be duplicated in some instances URL: https://github.com/apache/datafusion/issues/1116

Re: [I] doc: Add Sleeper to known users list [datafusion]

2025-02-13 Thread via GitHub
alamb commented on issue #14646: URL: https://github.com/apache/datafusion/issues/14646#issuecomment-2656525714 Thanks @m09526 -- this is a great idea -- can you make a PR to do so in https://github.com/apache/datafusion/blob/3802adf685e828ee6fbac3d16f1890e9ef33a34f/docs/source/user

Re: [PR] Implement predicate pruning for not like expressions [datafusion]

2025-02-13 Thread via GitHub
UBarney commented on PR #14567: URL: https://github.com/apache/datafusion/pull/14567#issuecomment-2656543100 @alamb This transformation is correct in this case according to this [doc](https://docs.rs/datafusion/latest/datafusion/physical_optimizer/pruning/struct.PruningPredicate.h

Re: [PR] feat: add table source to DML proto to eliminate need for table lookup after deserialisation [datafusion]

2025-02-13 Thread via GitHub
milenkovicm commented on PR #14631: URL: https://github.com/apache/datafusion/pull/14631#issuecomment-2655995363 thanks @alamb will address comments and do cleanup later.

Re: [PR] support simple/cross lateral joins [datafusion]

2025-02-13 Thread via GitHub
alamb commented on PR #14595: URL: https://github.com/apache/datafusion/pull/14595#issuecomment-2656160457 > So back to the semantics of the lateral join, Thank you @skyzh -- this is an amazing description. Thank you -- hopefully we can include that in the source code eventually once

Re: [PR] fix: Reduce cast.rs logic from parquet_support.rs for experimental native readers [datafusion-comet]

2025-02-13 Thread via GitHub
mbutrovich commented on PR #1387: URL: https://github.com/apache/datafusion-comet/pull/1387#issuecomment-2656275607 > So it looks like this PR does not have a negative impact on `native_iceberg_compat` Well that's good news!

Re: [PR] fix: Reduce cast.rs logic from parquet_support.rs for experimental native readers [datafusion-comet]

2025-02-13 Thread via GitHub
mbutrovich commented on code in PR #1387: URL: https://github.com/apache/datafusion-comet/pull/1387#discussion_r1954307910 ## native/spark-expr/src/utils.rs: ## @@ -72,9 +72,6 @@ pub fn array_with_timezone( Some(DataType::Timestamp(_, Some(_))) => {

Re: [PR] fix: Reduce cast.rs logic from parquet_support.rs for experimental native readers [datafusion-comet]

2025-02-13 Thread via GitHub
mbutrovich commented on PR #1387: URL: https://github.com/apache/datafusion-comet/pull/1387#issuecomment-2656274826 > I had the impression that this PR had originally reduced the number of failures in native_iceberg_compat as well but that is no longer true after the cleanup. Is that corre

Re: [PR] function: Allow more expressive array signatures [datafusion]

2025-02-13 Thread via GitHub
jayzhan211 commented on PR #14532: URL: https://github.com/apache/datafusion/pull/14532#issuecomment-2656285292 `array_element` doesn't change the list, so not coercing to a list makes sense to me.

Re: [PR] feat: Add support for --mem-pool-type and --memory-limit options to multiple benchmarks [datafusion]

2025-02-13 Thread via GitHub
Kontinuation commented on PR #14642: URL: https://github.com/apache/datafusion/pull/14642#issuecomment-2656545407 `sort_spill_reservation_bytes` is also an important configuration to tune for benchmarks involving sorts, so I think we may also want to add it to benchmarking tools.

Re: [PR] Update Community Events in concepts-readings-events.md [datafusion]

2025-02-13 Thread via GitHub
alamb commented on PR #14629: URL: https://github.com/apache/datafusion/pull/14629#issuecomment-2656544122 Thank you !

Re: [I] Pushdown projection below NestedLoopJoinExec [datafusion]

2025-02-13 Thread via GitHub
Dandandan commented on issue #14161: URL: https://github.com/apache/datafusion/issues/14161#issuecomment-2656539839 I took the liberty to edit the title according to what @berkaysynnada shared. It seems it could use a test to show the embedded projection from `NestedLoopJoinExec` isn't pushe

Re: [PR] Implement predicate pruning for not like expressions [datafusion]

2025-02-13 Thread via GitHub
alamb commented on PR #14567: URL: https://github.com/apache/datafusion/pull/14567#issuecomment-2656570666 > @alamb Thanks for such a detailed example and explanation. I think this transformation is correct in this case. > > according to this [doc](https://docs.rs/datafusion/latest/d

Re: [PR] Add supports for Hive's `SELECT ... GROUP BY .. GROUPING SETS` syntax [datafusion-sqlparser-rs]

2025-02-13 Thread via GitHub
wugeer commented on code in PR #1653: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1653#discussion_r1954510319 ## src/ast/query.rs: ## @@ -2537,6 +2542,19 @@ impl fmt::Display for GroupByWithModifier { GroupByWithModifier::Rollup => write!(f, "WITH R

Re: [PR] bug: Fix memory reservation and allocation problems for SortExec [datafusion]

2025-02-13 Thread via GitHub
Kontinuation commented on code in PR #14644: URL: https://github.com/apache/datafusion/pull/14644#discussion_r1954497832 ## datafusion/physical-plan/src/sorts/sort.rs: ## @@ -369,7 +356,7 @@ impl ExternalSorter { .with_metrics(self.metrics.baseline.clone())

Re: [PR] Implement predicate pruning for not like expressions [datafusion]

2025-02-13 Thread via GitHub
alamb commented on code in PR #14567: URL: https://github.com/apache/datafusion/pull/14567#discussion_r1954515692 ## datafusion/physical-optimizer/src/pruning.rs: ## @@ -1710,6 +1711,76 @@ fn build_like_match( Some(combined) } +// For predicate `col NOT LIKE 'const_prefi

Re: [PR] Add supports for Hive's `SELECT ... GROUP BY .. GROUPING SETS` syntax [datafusion-sqlparser-rs]

2025-02-13 Thread via GitHub
wugeer commented on code in PR #1653: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1653#discussion_r1954497259 ## src/parser/mod.rs: ## @@ -9092,6 +9092,16 @@ impl<'a> Parser<'a> { }); } } +if self.di

Re: [PR] bug: Fix memory reservation and allocation problems for SortExec [datafusion]

2025-02-13 Thread via GitHub
Kontinuation commented on code in PR #14644: URL: https://github.com/apache/datafusion/pull/14644#discussion_r1954496042 ## datafusion/physical-plan/src/sorts/stream.rs: ## @@ -171,20 +173,32 @@ impl std::fmt::Debug for FieldCursorStream { } impl FieldCursorStream { -pu

Re: [PR] function: Allow more expressive array signatures [datafusion]

2025-02-13 Thread via GitHub
jkosh44 commented on PR #14532: URL: https://github.com/apache/datafusion/pull/14532#issuecomment-2656668319 Thanks for the review @jayzhan211 and for all of your helpful feedback!

Re: [PR] Implement predicate pruning for not like expressions [datafusion]

2025-02-13 Thread via GitHub
alamb commented on PR #14567: URL: https://github.com/apache/datafusion/pull/14567#issuecomment-2656655495 > `t.parquet` is in [zip file](https://github.com/user-attachments/files/18784196/t.zip) (github doesn't allow upload parquet file 😓) BTW you can create such files using datafu

Re: [PR] feat: metadata columns [datafusion]

2025-02-13 Thread via GitHub
adriangb commented on PR #14057: URL: https://github.com/apache/datafusion/pull/14057#issuecomment-2656813742 yep I agree. TLDR Andrew is: 1. There are two approaches, #14057 (this PR) and #14362. 2. Fundamentally it is not super clear how to define the feature set of a "system column

Re: [PR] refactor: Make catalog datasource [datafusion]

2025-02-13 Thread via GitHub
logan-keede commented on PR #14643: URL: https://github.com/apache/datafusion/pull/14643#issuecomment-2656628651 do we want `DataSource` trait in `datasource` crate? will that be a breaking change since we won't be able to re-export `DataSource` in `datafusion_physical_plan` without non-tri

Re: [PR] refactor: Make catalog datasource [datafusion]

2025-02-13 Thread via GitHub
logan-keede commented on PR #14643: URL: https://github.com/apache/datafusion/pull/14643#issuecomment-2656630232 > do we want `DataSource` trait in `datasource` crate? will that be a breaking change since we won't be able to re-export `DataSource` in `datafusion_physical_plan` without non-t

Re: [PR] Add supports for Hive's `SELECT ... GROUP BY .. GROUPING SETS` syntax [datafusion-sqlparser-rs]

2025-02-13 Thread via GitHub
wugeer commented on code in PR #1653: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1653#discussion_r1954521877 ## src/parser/mod.rs: ## @@ -9092,6 +9092,16 @@ impl<'a> Parser<'a> { }); } } +if self.di

Re: [PR] Add supports for Hive's `SELECT ... GROUP BY .. GROUPING SETS` syntax [datafusion-sqlparser-rs]

2025-02-13 Thread via GitHub
wugeer commented on PR #1653: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1653#issuecomment-2656643152 @iffyio Thank you for your guidance! I have resubmitted the code and hope to receive your review again. :)

Re: [I] Logically repartition files by row splits [datafusion]

2025-02-13 Thread via GitHub
AdamGS commented on issue #14607: URL: https://github.com/apache/datafusion/issues/14607#issuecomment-2656656413 that seems like an appropriate place, I'll try and play around with that sometime this week and I'll share if I get anything I like.

Re: [PR] refactor: Make catalog datasource [datafusion]

2025-02-13 Thread via GitHub
jayzhan211 commented on PR #14643: URL: https://github.com/apache/datafusion/pull/14643#issuecomment-2656705498 We should! It is in physical-plan because there was no datasource crate then. It should be part of the datasource crate.

Re: [PR] bug: improve schema checking for `insert into` cases [datafusion]

2025-02-13 Thread via GitHub
jonahgao commented on code in PR #14572: URL: https://github.com/apache/datafusion/pull/14572#discussion_r1954582396 ## datafusion/common/src/dfschema.rs: ## @@ -1028,21 +1028,41 @@ impl SchemaExt for Schema { }) } -fn logically_equivalent_names_and_types

Re: [PR] bug: improve schema checking for `insert into` cases [datafusion]

2025-02-13 Thread via GitHub
jonahgao commented on code in PR #14572: URL: https://github.com/apache/datafusion/pull/14572#discussion_r1954592060 ## datafusion/common/src/dfschema.rs: ## @@ -1028,21 +1028,41 @@ impl SchemaExt for Schema { }) } -fn logically_equivalent_names_and_types

[PR] fix: add to_timestamp_nanos [datafusion-python]

2025-02-13 Thread via GitHub
chenkovsky opened a new pull request, #1020: URL: https://github.com/apache/datafusion-python/pull/1020 # Which issue does this PR close? No # Rationale for this change to_timestamp_nanos has a wrapper, but not exported from rust. # What changes are included in th

Re: [PR] bug: improve schema checking for `insert into` cases [datafusion]

2025-02-13 Thread via GitHub
jonahgao commented on code in PR #14572: URL: https://github.com/apache/datafusion/pull/14572#discussion_r1954600078 ## datafusion/common/src/dfschema.rs: ## @@ -1028,21 +1028,32 @@ impl SchemaExt for Schema { }) } -fn logically_equivalent_names_and_types

Re: [PR] Implement predicate pruning for not like expressions [datafusion]

2025-02-13 Thread via GitHub
UBarney commented on code in PR #14567: URL: https://github.com/apache/datafusion/pull/14567#discussion_r1954594478 ## datafusion/physical-optimizer/src/pruning.rs: ## @@ -4061,6 +4132,106 @@ mod tests { prune_with_expr(expr, &schema, &statistics, expected_ret); }

Re: [PR] bug: improve schema checking for `insert into` cases [datafusion]

2025-02-13 Thread via GitHub
jonahgao commented on code in PR #14572: URL: https://github.com/apache/datafusion/pull/14572#discussion_r1954616404 ## datafusion/sqllogictest/test_files/insert_to_external.slt: ## @@ -60,17 +60,16 @@ STORED AS parquet LOCATION 'test_files/scratch/insert_to_external/parquet_ty

[PR] docs: Add Sleeper to list of known users [datafusion]

2025-02-13 Thread via GitHub
m09526 opened a new pull request, #14648: URL: https://github.com/apache/datafusion/pull/14648 ## Which issue does this PR close? - Closes #14646. ## Rationale for this change ## What changes are included in this PR? Modifies [https://githu

Re: [I] Parser error with GROUP BY with multiple filters on DataFusion 45 [datafusion]

2025-02-13 Thread via GitHub
jkosh44 commented on issue #14633: URL: https://github.com/apache/datafusion/issues/14633#issuecomment-2656871273 > > > So as a workaround we could use any dialect that supports it (e.g. postgresql), gotcha. > > > > > > That sounds like it should work. From some googling it looks

Re: [I] Attach `Diagnostic` to "incompatible type in unary expression" error [datafusion]

2025-02-13 Thread via GitHub
eliaperantoni commented on issue #14433: URL: https://github.com/apache/datafusion/issues/14433#issuecomment-2656881495 Hey @alan910127 how is it going with this ticket :) Can I help with anything? -- This is an automated message from the Apache Git Service. To respond to the message, plea

Re: [I] Attach `Diagnostic` to "function x does not exist" error [datafusion]

2025-02-13 Thread via GitHub
eliaperantoni commented on issue #14430: URL: https://github.com/apache/datafusion/issues/14430#issuecomment-2656885264 Hey @onlyjackfrost how is it going with this ticket :) Can I help with anything? -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [I] Attach `Diagnostic` to "wrong number of arguments" error [datafusion]

2025-02-13 Thread via GitHub
eliaperantoni commented on issue #14432: URL: https://github.com/apache/datafusion/issues/14432#issuecomment-2656884080 Hey @Chen-Yuan-Lai how is it going with this ticket :) Can I help with anything? -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [I] Attach `Diagnostic` to "more than one column in subquery" error [datafusion]

2025-02-13 Thread via GitHub
eliaperantoni commented on issue #14438: URL: https://github.com/apache/datafusion/issues/14438#issuecomment-2656882826 Hey @irenji how is it going with this ticket :) Can I help with anything? -- This is an automated message from the Apache Git Service. To respond to the message, please l

Re: [I] Attach `Diagnostic` to syntax errors [datafusion]

2025-02-13 Thread via GitHub
eliaperantoni commented on issue #14437: URL: https://github.com/apache/datafusion/issues/14437#issuecomment-2656882461 Hey @irenjj how is it going with this ticket :) Can I help with anything? -- This is an automated message from the Apache Git Service. To respond to the message, please l

Re: [I] Attach `Diagnostic` to "duplicate table name" error [datafusion]

2025-02-13 Thread via GitHub
eliaperantoni commented on issue #14436: URL: https://github.com/apache/datafusion/issues/14436#issuecomment-2656883539 Hey @zjregee how is it going with this ticket :) Can I help with anything? -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [I] Emit warning with attached `Diagnostic` when doing `= NULL` [datafusion]

2025-02-13 Thread via GitHub
eliaperantoni commented on issue #14434: URL: https://github.com/apache/datafusion/issues/14434#issuecomment-2656882060 Hey @ugoa how is it going with this ticket :) Can I help with anything? -- This is an automated message from the Apache Git Service. To respond to the message, please log

Re: [I] Attach `Diagnostic` to "invalid function argument types" error [datafusion]

2025-02-13 Thread via GitHub
eliaperantoni commented on issue #14431: URL: https://github.com/apache/datafusion/issues/14431#issuecomment-2656884665 Hey @dentiny how is it going with this ticket :) Can I help with anything? -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [I] multiply overflow in stats.rs [datafusion]

2025-02-13 Thread via GitHub
Omega359 commented on issue #13775: URL: https://github.com/apache/datafusion/issues/13775#issuecomment-2656916888 I think the bug is still there but I haven't seen it in a while tbh so whatever changes were made to the sqllogictests seem to no longer trigger it. On Thu, Feb 13,

Re: [PR] bug: improve schema checking for `insert into` cases [datafusion]

2025-02-13 Thread via GitHub
zhuqi-lucas commented on code in PR #14572: URL: https://github.com/apache/datafusion/pull/14572#discussion_r1954710305 ## datafusion/common/src/dfschema.rs: ## @@ -1028,21 +1028,41 @@ impl SchemaExt for Schema { }) } -fn logically_equivalent_names_and_ty

Re: [PR] perf: Drop RowConverter from GroupOrderingPartial [datafusion]

2025-02-13 Thread via GitHub
ctsk commented on PR #14566: URL: https://github.com/apache/datafusion/pull/14566#issuecomment-2656916325 Thank you for taking the time to review! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] function: Allow more expressive array signatures [datafusion]

2025-02-13 Thread via GitHub
jkosh44 commented on PR #14532: URL: https://github.com/apache/datafusion/pull/14532#issuecomment-2656959911 I think this might also be an API change? I don't have the permissions to add the tag though. -- This is an automated message from the Apache Git Service. To respond to the message

Re: [PR] Little changes "cache control" [datafusion]

2025-02-13 Thread via GitHub
alamb commented on PR #14611: URL: https://github.com/apache/datafusion/pull/14611#issuecomment-2656230770 Sorry for the back and forth @Ramjee194. From my perspective the issue is that I don't understand the implications of this change. The ticket and this PR basically say "th

Re: [I] coercion of input types in `coalesce` leads to type unsupported arrow cast [datafusion]

2025-02-13 Thread via GitHub
alamb closed issue #14581: coercion of input types in `coalesce` leads to type unsupported arrow cast URL: https://github.com/apache/datafusion/issues/14581 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[PR] minor: simplify `union_extract` code [datafusion]

2025-02-13 Thread via GitHub
alamb opened a new pull request, #14640: URL: https://github.com/apache/datafusion/pull/14640 Draft until https://github.com/apache/datafusion/pull/12116 is done ## Which issue does this PR close? - follow on to https://github.com/apache/datafusion/pull/12116

Re: [PR] Fallback to Utf8View for `Dict(_, Utf8View)` in `type_union_resolution_coercion` [datafusion]

2025-02-13 Thread via GitHub
alamb merged PR #14602: URL: https://github.com/apache/datafusion/pull/14602 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [PR] Update Community Events in concepts-readings-events.md [datafusion]

2025-02-13 Thread via GitHub
alamb commented on PR #14629: URL: https://github.com/apache/datafusion/pull/14629#issuecomment-2656242973 THANK YOU! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To uns

Re: [PR] Update Community Events in concepts-readings-events.md [datafusion]

2025-02-13 Thread via GitHub
alamb commented on PR #14629: URL: https://github.com/apache/datafusion/pull/14629#issuecomment-2656247444 @oznur-synnada I wonder if you might be willing to add some additional links to this page. For example there are some links on the weekly summaries I write that maybe could be
