Re: [I] Support `Time` Parquet Data Page Statistics [datafusion]

2024-07-02 Thread via GitHub
MohamedAbdeen21 commented on issue #4: URL: https://github.com/apache/datafusion/issues/4#issuecomment-2189530823 Hi @myeunee, you can just comment "take" and it will be automatically assigned to you. -- This is an automated message from the Apache Git Service. To respond to the m

[PR] Pyo3 refactorings [datafusion-python]

2024-07-02 Thread via GitHub
Michael-J-Ward opened a new pull request, #740: URL: https://github.com/apache/datafusion-python/pull/740 # Which issue does this PR close? Part of #727. # Rationale for this change [pyo3 recommend](https://github.com/PyO3/pyo3/pull/4274) using the Python token from a `Bound

[PR] Fix overflow in pow [datafusion]

2024-07-02 Thread via GitHub
LorrensP-2158466 opened a new pull request, #11124: URL: https://github.com/apache/datafusion/pull/11124 ## Which issue does this PR close? Closes #11075. ## Rationale for this change ## What changes are included in this PR? - check overflow for integer
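
The preview above mentions checking overflow for integers in `pow`. As a minimal sketch of that idea in plain Rust (not the PR's actual change, which is truncated here), the standard library's `i64::checked_pow` can report overflow as an error instead of wrapping or panicking. The helper name `checked_power` and the error strings are assumptions made for illustration.

```rust
/// Raise `base` to `exp`, returning an error on overflow.
/// `checked_power` is a hypothetical helper, not a DataFusion API.
fn checked_power(base: i64, exp: i64) -> Result<i64, String> {
    // `i64::checked_pow` takes a `u32` exponent, so reject anything outside that range.
    if exp < 0 || exp > u32::MAX as i64 {
        return Err(format!("exponent {} is out of range", exp));
    }
    // `checked_pow` returns `None` when the result does not fit in an `i64`.
    base.checked_pow(exp as u32)
        .ok_or_else(|| format!("overflow computing {}^{}", base, exp))
}
```

With this sketch, `checked_power(2, 10)` succeeds, while `checked_power(2, 64)` returns an error because the result does not fit in 64 bits.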

Re: [I] Running tests uses 50.1GB on Ubuntu [datafusion]

2024-07-02 Thread via GitHub
jcsherin commented on issue #11105: URL: https://github.com/apache/datafusion/issues/11105#issuecomment-2189678641 On Ubuntu, ```sh $ lsb_release -rc Release: 22.04 Codename:jammy ``` After `cargo clean` and running `cargo t`: ```sh $ du -h -d2 target

Re: [PR] fix: modulo op with negative zero divisor produces Nan [datafusion-comet]

2024-07-02 Thread via GitHub
kazuyukitanimura commented on code in PR #585: URL: https://github.com/apache/datafusion-comet/pull/585#discussion_r1653344873 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -2240,7 +2240,15 @@ object QueryPlanSerde extends Logging with ShimQueryPlan
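
The PR title refers to a modulo with a negative-zero divisor producing NaN. The following is a minimal Rust sketch of the underlying floating-point behaviour only, not the Scala change under review (its diff is truncated above). It assumes the desired result for any zero divisor is a null, modelled here as `None`; the name `null_safe_rem` is hypothetical.

```rust
/// Remainder that yields `None` for any zero divisor, including `-0.0`.
/// `null_safe_rem` is a hypothetical name used for illustration only.
fn null_safe_rem(dividend: f64, divisor: f64) -> Option<f64> {
    // IEEE 754 equality treats 0.0 and -0.0 as equal,
    // so a single comparison covers both signs of zero.
    if divisor == 0.0 {
        None
    } else {
        Some(dividend % divisor)
    }
}
```

Without the guard, `5.0_f64 % -0.0_f64` evaluates to NaN, which is the symptom named in the title. A check that inspects the sign or bit pattern of the divisor instead of using `==` could miss the negative zero.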

[I] Support Map as a ScalarValue [datafusion]

2024-07-02 Thread via GitHub
Blizzara opened a new issue, #11128: URL: https://github.com/apache/datafusion/issues/11128 ### Is your feature request related to a problem or challenge? ScalarValue does not currently support Maps, see e.g. https://github.com/apache/datafusion/issues/8262#issuecomment-1852700799.

[PR] feat: Support Map type in Substrait conversions [datafusion]

2024-07-02 Thread via GitHub
Blizzara opened a new pull request, #11129: URL: https://github.com/apache/datafusion/pull/11129 ## Which issue does this PR close? Closes #. ## Rationale for this change We don't currently support Map types in the Substrait consumer/producer at all. As Map type

Re: [PR] feat: Add support for Timestamp data types in data page statistics. [datafusion]

2024-07-02 Thread via GitHub
efredine commented on code in PR #11123: URL: https://github.com/apache/datafusion/pull/11123#discussion_r1654781564 ## datafusion/core/src/datasource/physical_plan/parquet/statistics.rs: ## @@ -713,6 +692,15 @@ macro_rules! get_data_page_statistics { )),

Re: [PR] feat: Add support for Timestamp data types in data page statistics. [datafusion]

2024-07-02 Thread via GitHub
alamb merged PR #11123: URL: https://github.com/apache/datafusion/pull/11123

[PR] fix: Ignore nullability in Substrait structs [datafusion]

2024-07-02 Thread via GitHub
Blizzara opened a new pull request, #11130: URL: https://github.com/apache/datafusion/pull/11130 Similar to https://github.com/apache/datafusion/pull/10874/files ## Which issue does this PR close? Closes #. ## Rationale for this change Arrow requires sc

Re: [I] Large `OR` list overflows the stack [datafusion]

2024-07-02 Thread via GitHub
alamb commented on issue #9375: URL: https://github.com/apache/datafusion/issues/9375#issuecomment-2191659377 > we could restrict the minimum elements of CommutativeExpr to 3. That could make sense My biggest concern with this proposal is its potential impact on backwards compa

Re: [I] Support `Timestamp` Parquet Data Page Statistics [datafusion]

2024-07-02 Thread via GitHub
alamb closed issue #2: Support `Timestamp` Parquet Data Page Statistics URL: https://github.com/apache/datafusion/issues/2

Re: [I] bug: CAST timestamp to string ignores timezone prior to Spark 3.4 [datafusion-comet]

2024-07-02 Thread via GitHub
suibianwanwank commented on issue #468: URL: https://github.com/apache/datafusion-comet/issues/468#issuecomment-2194576912 I've recently been learning about the project and can be assigned this issue, to learn more about it, if it hasn't already been resolved. Thanks.

Re: [PR] Support Date Parquet Data Page Statistics [datafusion]

2024-07-02 Thread via GitHub
dharanad commented on PR #11135: URL: https://github.com/apache/datafusion/pull/11135#issuecomment-2194639375 Hello @alamb, can you please help me with a review?

[I] Support unparsing the Value Plan of Array (List) to SQL String [datafusion]

2024-07-02 Thread via GitHub
goldmedal opened a new issue, #11144: URL: https://github.com/apache/datafusion/issues/11144 ### Is your feature request related to a problem or challenge? In `unparser/expr.rs`, list scalar values haven't been supported yet. ```rust ScalarValue::FixedSizeList(_a) => not_impl_err

Re: [I] ASOF join support / Specialize Range Joins [datafusion]

2024-07-02 Thread via GitHub
ozankabak commented on issue #318: URL: https://github.com/apache/datafusion/issues/318#issuecomment-2194718023 @jirislav I'm not sure if you chose your tone intentionally to be this way and I have to say I don't really appreciate it. Nevertheless I will try to answer as best as I ca

Re: [I] Comet can't use CometShuffleManager on Yarn Cluster [datafusion-comet]

2024-07-02 Thread via GitHub
dpengpeng closed issue #592: Comet can't use CometShuffleManager on Yarn Cluster URL: https://github.com/apache/datafusion-comet/issues/592

Re: [I] Comet can't use CometShuffleManager on Yarn Cluster [datafusion-comet]

2024-07-02 Thread via GitHub
dpengpeng commented on issue #592: URL: https://github.com/apache/datafusion-comet/issues/592#issuecomment-2194740384 this problem is not a bug

Re: [I] Support `Dictionary` in Parquet Metadata Statistics [datafusion]

2024-07-02 Thread via GitHub
efredine commented on issue #11145: URL: https://github.com/apache/datafusion/issues/11145#issuecomment-2196997212 take

[PR] minor: consolidate `gcd` related tests [datafusion]

2024-07-02 Thread via GitHub
jonahgao opened a new pull request, #11164: URL: https://github.com/apache/datafusion/pull/11164 ## Which issue does this PR close? N/A ## Rationale for this change ## What changes are included in this PR? The tests for the `GCD` function exist in both `scalar.slt

Re: [PR] expose table name in proto extension codec [datafusion]

2024-07-02 Thread via GitHub
alamb commented on PR #11139: URL: https://github.com/apache/datafusion/pull/11139#issuecomment-2197044780 Thanks again @leoyvens

Re: [PR] expose table name in proto extension codec [datafusion]

2024-07-02 Thread via GitHub
alamb merged PR #11139: URL: https://github.com/apache/datafusion/pull/11139

Re: [PR] [draft] Add `LogicalType`, try to support user-defined types [datafusion]

2024-07-02 Thread via GitHub
notfilippo commented on code in PR #11160: URL: https://github.com/apache/datafusion/pull/11160#discussion_r1659356418 ## datafusion/common/src/logical_type/extension.rs: ## @@ -0,0 +1,289 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

Re: [I] Fix CometShuffleManager to adapt IndexShuffleBlockResolver signature change [datafusion-comet]

2024-07-02 Thread via GitHub
viirya commented on issue #609: URL: https://github.com/apache/datafusion-comet/issues/609#issuecomment-2197789674 That Spark PR is targeting the `master` branch, how does it do with Spark 3.4? In any way, I think it is only an issue once you use Spark 3.4 with that signature change

Re: [PR] Better Cast name for display [datafusion]

2024-07-02 Thread via GitHub
github-actions[bot] commented on PR #10276: URL: https://github.com/apache/datafusion/pull/10276#issuecomment-2197840913 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [I] `spark.comet.memory.overhead.min` not respected when submitting jobs with Comet with Spark on Kubernetes [datafusion-comet]

2024-07-02 Thread via GitHub
viirya commented on issue #605: URL: https://github.com/apache/datafusion-comet/issues/605#issuecomment-2197796786 If `CometDriverPlugin` is loaded, it should overwrite the executor overhead with the existing executor overhead plus the Comet overhead.

Re: [PR] fix: Support dictionary type in parquet metadata statistics. [datafusion]

2024-07-02 Thread via GitHub
efredine commented on code in PR #11169: URL: https://github.com/apache/datafusion/pull/11169#discussion_r1659431592 ## datafusion/core/src/datasource/file_format/parquet.rs: ## @@ -1439,6 +1441,57 @@ mod tests { Ok(()) } +#[tokio::test] +async fn test_st

Re: [PR] fix: Support dictionary type in parquet metadata statistics. [datafusion]

2024-07-02 Thread via GitHub
efredine commented on PR #11169: URL: https://github.com/apache/datafusion/pull/11169#issuecomment-2197801141 Thanks @alamb @appletreeisyellow - I should have reviewed the tests more closely - thanks for the feedback. I will make the adjustments now.

Re: [PR] fix(typo): unqualifed to unqualified [datafusion]

2024-07-02 Thread via GitHub
waynexia merged PR #11159: URL: https://github.com/apache/datafusion/pull/11159

Re: [PR] RFC / Prototype user defined sql planner might look like [datafusion]

2024-07-02 Thread via GitHub
jayzhan211 commented on PR #11168: URL: https://github.com/apache/datafusion/pull/11168#issuecomment-2197850460 I think it is a great idea overall. There are only two things that I think we could improve on 1. Avoid clone like what you have mentioned, I don't have a good solution for now

[PR] feat: add example for copy to [datafusion]

2024-07-02 Thread via GitHub
tshauck opened a new pull request, #11174: URL: https://github.com/apache/datafusion/pull/11174 ## Which issue does this PR close? Closes #11079 ## Rationale for this change Adds an example of how to `COPY table TO` a custom file format. ## What changes

[I] Add support for SessionState in supports_filters_pushdown for a Custom Data Source [datafusion]

2024-07-02 Thread via GitHub
cisaacson opened a new issue, #11193: URL: https://github.com/apache/datafusion/issues/11193 ### Is your feature request related to a problem or challenge? We need the ability to get the `TaskContext.task_id` any place where a Custom Data Source is invoked. As it stands currently, the

Re: [I] Upgrade window UDF api [datafusion-python]

2024-07-02 Thread via GitHub
timsaucer commented on issue #730: URL: https://github.com/apache/datafusion-python/issues/730#issuecomment-2200117630 @Michael-J-Ward I know this issue was already closed, but I have proposed a method to remove the workaround.

Re: [PR] fix: Support Substrait's compound names also for window functions [datafusion]

2024-07-02 Thread via GitHub
alamb commented on PR #11163: URL: https://github.com/apache/datafusion/pull/11163#issuecomment-2200185730 Thanks @Blizzara

Re: [I] Remove `Min`/`Max` references from `AggregateStatistics` [datafusion]

2024-07-02 Thread via GitHub
alamb commented on issue #11153: URL: https://github.com/apache/datafusion/issues/11153#issuecomment-2200193003 Thanks @Rachelint -- I also have some time later this week to help with this issue (if it might help to have one or two examples as we work through https://github.com/apache/data

Re: [I] Add support for SessionState in supports_filters_pushdown for a Custom Data Source [datafusion]

2024-07-02 Thread via GitHub
alamb commented on issue #11193: URL: https://github.com/apache/datafusion/issues/11193#issuecomment-2200205905 I think the usecase of passing some sort of state to `TableProvider` methods is a good and useful thing https://docs.rs/datafusion/latest/datafusion/datasource/provider/tra

[PR] Tsaucer/find window fn [datafusion-python]

2024-07-02 Thread via GitHub
timsaucer opened a new pull request, #747: URL: https://github.com/apache/datafusion-python/pull/747 # Which issue does this PR close? Closes #730. # Rationale for this change The current fix is a work around for the `sum` function. We have hard coded it to use the aggr

Re: [PR] fix: Support Substrait's compound names also for window functions [datafusion]

2024-07-02 Thread via GitHub
Blizzara commented on code in PR #11163: URL: https://github.com/apache/datafusion/pull/11163#discussion_r1660998022 ## datafusion/substrait/src/logical_plan/consumer.rs: ## @@ -1247,36 +1164,50 @@ pub async fn from_substrait_rex( None => substrait_err!("Cast expres

Re: [PR] fix: Support Substrait's compound names also for window functions [datafusion]

2024-07-02 Thread via GitHub
alamb commented on code in PR #11163: URL: https://github.com/apache/datafusion/pull/11163#discussion_r1661072390 ## datafusion/substrait/src/logical_plan/consumer.rs: ## @@ -1247,36 +1164,50 @@ pub async fn from_substrait_rex( None => substrait_err!("Cast expressio

Re: [PR] fix: Support Substrait's compound names also for window functions [datafusion]

2024-07-02 Thread via GitHub
alamb merged PR #11163: URL: https://github.com/apache/datafusion/pull/11163

Re: [PR] Feat: Implement hf:// / "hugging face" integration in datafusion-cli [datafusion]

2024-07-02 Thread via GitHub
xinlifoobar commented on PR #10792: URL: https://github.com/apache/datafusion/pull/10792#issuecomment-2203151992 > Sorry for the delay -- I plan to review this tomorrow Sorry, I lost my GitHub connection for a couple of days and just returned. Also, thanks. Please take your time. -

[PR] Implement user defined planner for extract [datafusion]

2024-07-02 Thread via GitHub
xinlifoobar opened a new pull request, #11215: URL: https://github.com/apache/datafusion/pull/11215 ## Which issue does this PR close? Part of #11207 ## Rationale for this change ## What changes are included in this PR? ## Are these changes

Re: [PR] fix: Incorrect LEFT JOIN evaluation result on OR conditions [datafusion]

2024-07-02 Thread via GitHub
ozankabak commented on code in PR #11203: URL: https://github.com/apache/datafusion/pull/11203#discussion_r1662486115 ## datafusion/optimizer/src/push_down_filter.rs: ## @@ -441,11 +442,11 @@ fn push_down_all_join( // Extract from OR clause, generate new predicates for bo

Re: [PR] fix: Incorrect LEFT JOIN evaluation result on OR conditions [datafusion]

2024-07-02 Thread via GitHub
mustafasrepo commented on code in PR #11203: URL: https://github.com/apache/datafusion/pull/11203#discussion_r1662527290 ## datafusion/optimizer/src/push_down_filter.rs: ## @@ -441,11 +442,11 @@ fn push_down_all_join( // Extract from OR clause, generate new predicates for

Re: [PR] fix: Incorrect LEFT JOIN evaluation result on OR conditions [datafusion]

2024-07-02 Thread via GitHub
viirya commented on code in PR #11203: URL: https://github.com/apache/datafusion/pull/11203#discussion_r1662511033 ## datafusion/sqllogictest/test_files/join.slt: ## @@ -793,3 +793,148 @@ DROP TABLE companies statement ok DROP TABLE leads + + +## Test ON clause predicate

Re: [PR] Introduce user defined SQL planner API [datafusion]

2024-07-02 Thread via GitHub
alamb commented on code in PR #11180: URL: https://github.com/apache/datafusion/pull/11180#discussion_r1661062684 ## datafusion/expr/src/planner.rs: ## @@ -114,8 +114,12 @@ pub trait UserDefinedSQLPlanner { /// An operator with two arguments to plan /// -/// Note `left` and

Re: [PR] Covert grouping to udaf [datafusion]

2024-07-02 Thread via GitHub
jayzhan211 commented on PR #11147: URL: https://github.com/apache/datafusion/pull/11147#issuecomment-2203205893 🚀

Re: [PR] Covert grouping to udaf [datafusion]

2024-07-02 Thread via GitHub
jayzhan211 merged PR #11147: URL: https://github.com/apache/datafusion/pull/11147

Re: [PR] Allow user defined SQL planners to be registered [datafusion]

2024-07-02 Thread via GitHub
jayzhan211 merged PR #11208: URL: https://github.com/apache/datafusion/pull/11208

Re: [PR] Recursive `unnest` [datafusion]

2024-07-02 Thread via GitHub
jayzhan211 merged PR #11062: URL: https://github.com/apache/datafusion/pull/11062

Re: [PR] Enable `clone_on_ref_ptr`clippy lint on some crates [datafusion]

2024-07-02 Thread via GitHub
ozankabak commented on PR #11157: URL: https://github.com/apache/datafusion/pull/11157#issuecomment-2203111458 This is a great change. However, instead of doing this in such a big PR, can we open an epic and do it in multiple small PRs one crate at a time? Doing that will be much more frien

Re: [PR] Covert grouping to udaf [datafusion]

2024-07-02 Thread via GitHub
alamb commented on PR #11147: URL: https://github.com/apache/datafusion/pull/11147#issuecomment-2203114522 Thanks @Rachelint -- this is great. > Remember to add the test in roundtrip_expr_api in datafusion/proto/tests/cases/roundtrip_logical_plan.rs, others LGTM. As I am sli

Re: [PR] Allow user defined SQL planners to be registered [datafusion]

2024-07-02 Thread via GitHub
jayzhan211 commented on code in PR #11208: URL: https://github.com/apache/datafusion/pull/11208#discussion_r1662564114 ## datafusion/core/tests/user_defined/user_defined_sql_planner.rs: ## @@ -0,0 +1,88 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or mor

Re: [PR] fix: Incorrect LEFT JOIN evaluation result on OR conditions [datafusion]

2024-07-02 Thread via GitHub
viirya commented on code in PR #11203: URL: https://github.com/apache/datafusion/pull/11203#discussion_r1662578537 ## datafusion/optimizer/src/push_down_filter.rs: ## @@ -441,11 +442,11 @@ fn push_down_all_join( // Extract from OR clause, generate new predicates for both

Re: [PR] fix: Incorrect LEFT JOIN evaluation result on OR conditions [datafusion]

2024-07-02 Thread via GitHub
ozankabak commented on code in PR #11203: URL: https://github.com/apache/datafusion/pull/11203#discussion_r1662588440 ## datafusion/optimizer/src/push_down_filter.rs: ## @@ -435,22 +436,37 @@ fn push_down_all_join( { right_push.push(on)

[PR] Support DuckDB style stuct syntax [datafusion]

2024-07-02 Thread via GitHub
jayzhan211 opened a new pull request, #11214: URL: https://github.com/apache/datafusion/pull/11214 ## Which issue does this PR close? Closes #9820 . ## Rationale for this change ## What changes are included in this PR? ## Are these changes t

Re: [PR] Minor: Move MemoryCatalog*Provider into a module, improve comments [datafusion]

2024-07-02 Thread via GitHub
jonahgao commented on code in PR #11183: URL: https://github.com/apache/datafusion/pull/11183#discussion_r1662614512 ## datafusion/core/src/catalog/mod.rs: ## @@ -16,16 +16,30 @@ // under the License. //! Interfaces and default implementations of catalogs and schemas. +//! +

Re: [PR] fix: Incorrect LEFT JOIN evaluation result on OR conditions [datafusion]

2024-07-02 Thread via GitHub
mustafasrepo commented on PR #11203: URL: https://github.com/apache/datafusion/pull/11203#issuecomment-2203369671 > Looks like my comment [#11203 (comment)](https://github.com/apache/datafusion/pull/11203#discussion_r1662588836) was in progress when @viirya changed the code in [17b2d43](ht

Re: [I] Create fixed size list table with syntax [] [datafusion]

2024-07-02 Thread via GitHub
vaibhawvipul commented on issue #10303: URL: https://github.com/apache/datafusion/issues/10303#issuecomment-2203160525 > I think we may be able to use the syntax https://docs.rs/sqlparser/latest/sqlparser/ast/enum.DataType.html#variant.Array > > So something like this to create a 256

Re: [PR] Enhance short circuit handling in `CommonSubexprEliminate` [datafusion]

2024-07-02 Thread via GitHub
alamb commented on code in PR #11197: URL: https://github.com/apache/datafusion/pull/11197#discussion_r1662520975 ## datafusion/optimizer/src/common_subexpr_eliminate.rs: ## @@ -1012,19 +1013,22 @@ impl TreeNodeRewriter for CommonSubexprRewriter<'_, '_> { self.alia

Re: [PR] Covert grouping to udaf [datafusion]

2024-07-02 Thread via GitHub
Rachelint commented on PR #11147: URL: https://github.com/apache/datafusion/pull/11147#issuecomment-2203194934 Thanks for > Thanks @Rachelint -- this is great. > > > Remember to add the test in roundtrip_expr_api in datafusion/proto/tests/cases/roundtrip_logical_plan.rs, other

Re: [I] Convert `Grouping` to UDAF [datafusion]

2024-07-02 Thread via GitHub
jayzhan211 closed issue #10906: Convert `Grouping` to UDAF URL: https://github.com/apache/datafusion/issues/10906

Re: [PR] Covert grouping to udaf [datafusion]

2024-07-02 Thread via GitHub
alamb commented on PR #11147: URL: https://github.com/apache/datafusion/pull/11147#issuecomment-2203215896 > Thanks @alamb for helping, and sorry for the delay. > Thanks all for the reviews @jayzhan211 @dharanad No worries -- thank you for the contributions!

Re: [PR] Make statistics_from_parquet_meta a sync function [datafusion]

2024-07-02 Thread via GitHub
alamb commented on PR #11205: URL: https://github.com/apache/datafusion/pull/11205#issuecomment-2203221134 Thanks again

Re: [PR] Make statistics_from_parquet_meta a sync function [datafusion]

2024-07-02 Thread via GitHub
alamb merged PR #11205: URL: https://github.com/apache/datafusion/pull/11205

Re: [PR] fix: Incorrect LEFT JOIN evaluation result on OR conditions [datafusion]

2024-07-02 Thread via GitHub
alamb commented on PR #11203: URL: https://github.com/apache/datafusion/pull/11203#issuecomment-2203273038 Looks like my comment https://github.com/apache/datafusion/pull/11203#discussion_r1662588836 was in progress when @viirya changed the code in https://github.com/apache/datafusion/pul

Re: [I] Support recursive unnest [datafusion]

2024-07-02 Thread via GitHub
jayzhan211 closed issue #10660: Support recursive unnest URL: https://github.com/apache/datafusion/issues/10660

Re: [PR] [draft] Add `LogicalType`, try to support user-defined types [datafusion]

2024-07-02 Thread via GitHub
notfilippo commented on code in PR #11160: URL: https://github.com/apache/datafusion/pull/11160#discussion_r1662576117 ## datafusion/common/src/logical_type/extension.rs: ## @@ -0,0 +1,289 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

Re: [I] Register SQL planners in `SessionState::new()` [datafusion]

2024-07-02 Thread via GitHub
dharanad commented on issue #11216: URL: https://github.com/apache/datafusion/issues/11216#issuecomment-2203366285 Hello @jayzhan211, I would like to work on this. QQ: Is this issue as straightforward as it sounds, or is there more to it?

Re: [PR] fix: Incorrect LEFT JOIN evaluation result on OR conditions [datafusion]

2024-07-02 Thread via GitHub
alamb commented on code in PR #11203: URL: https://github.com/apache/datafusion/pull/11203#discussion_r1662588836 ## datafusion/optimizer/src/push_down_filter.rs: ## @@ -441,11 +442,11 @@ fn push_down_all_join( // Extract from OR clause, generate new predicates for both s

Re: [PR] Enhance short circuit handling in `CommonSubexprEliminate` [datafusion]

2024-07-02 Thread via GitHub
haohuaijin commented on code in PR #11197: URL: https://github.com/apache/datafusion/pull/11197#discussion_r1662621500 ## datafusion/optimizer/src/common_subexpr_eliminate.rs: ## @@ -1012,19 +1013,22 @@ impl TreeNodeRewriter for CommonSubexprRewriter<'_, '_> { self

Re: [PR] fix: Incorrect LEFT JOIN evaluation result on OR conditions [datafusion]

2024-07-02 Thread via GitHub
ozankabak commented on code in PR #11203: URL: https://github.com/apache/datafusion/pull/11203#discussion_r1662625529 ## datafusion/optimizer/src/push_down_filter.rs: ## @@ -441,11 +442,11 @@ fn push_down_all_join( // Extract from OR clause, generate new predicates for bo

Re: [PR] initial prettier unparse [datafusion]

2024-07-02 Thread via GitHub
MohamedAbdeen21 commented on code in PR #11186: URL: https://github.com/apache/datafusion/pull/11186#discussion_r1662866175 ## datafusion/sql/src/unparser/expr.rs: ## @@ -618,6 +681,50 @@ impl Unparser<'_> { } } +// TODO: operator precedence should be defined

Re: [PR] fix: Improve error "BroadcastExchange is not supported" [datafusion-comet]

2024-07-02 Thread via GitHub
parthchandra commented on PR #577: URL: https://github.com/apache/datafusion-comet/pull/577#issuecomment-2203886808 Rebased on main, but funnily did not get any conflicts.

Re: [I] Register SQL planners in `SessionState::new()` [datafusion]

2024-07-02 Thread via GitHub
dharanad commented on issue #11216: URL: https://github.com/apache/datafusion/issues/11216#issuecomment-2203903859 take

[I] Implement user defined planner for `date_part` [datafusion]

2024-07-02 Thread via GitHub
dharanad opened a new issue, #11220: URL: https://github.com/apache/datafusion/issues/11220 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/11207 ### Describe the solution you'd like _No response_ #

[I] Implement user defined planner for `create_struct` [datafusion]

2024-07-02 Thread via GitHub
dharanad opened a new issue, #11221: URL: https://github.com/apache/datafusion/issues/11221 ### Is your feature request related to a problem or challenge? Part of #11207 ### Describe the solution you'd like _No response_ ### Describe alternatives you've considered

[I] Implement user defined planner for `create_named_struct` [datafusion]

2024-07-02 Thread via GitHub
dharanad opened a new issue, #11222: URL: https://github.com/apache/datafusion/issues/11222 ### Is your feature request related to a problem or challenge? Part of #11207 ### Describe the solution you'd like _No response_ ### Describe alternatives you've considered

[I] Implement user defined planner for `sql_overlay_to_expr` [datafusion]

2024-07-02 Thread via GitHub
dharanad opened a new issue, #11223: URL: https://github.com/apache/datafusion/issues/11223 ### Is your feature request related to a problem or challenge? Part of #11207 ### Describe the solution you'd like _No response_ ### Describe alternatives you've considered

Re: [PR] feat: Implement initial framework for cost-based optimizations to avoid moving to Comet in some cases [datafusion-comet]

2024-07-02 Thread via GitHub
parthchandra commented on code in PR #618: URL: https://github.com/apache/datafusion-comet/pull/618#discussion_r1662916008 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -192,6 +192,50 @@ class CometSparkSessionExtensions } } + /** +

Re: [PR] initial prettier unparse [datafusion]

2024-07-02 Thread via GitHub
MohamedAbdeen21 commented on code in PR #11186: URL: https://github.com/apache/datafusion/pull/11186#discussion_r1662925217 ## datafusion/sql/tests/cases/plan_to_sql.rs: ## @@ -314,3 +314,53 @@ fn test_table_references_in_plan_to_sql() { "SELECT \"table\".id, \"table\".

Re: [I] Optimize `to_timestamp` [datafusion]

2024-07-02 Thread via GitHub
alamb closed issue #9090: Optimize `to_timestamp` URL: https://github.com/apache/datafusion/issues/9090

Re: [I] [Epic] Complete pulling out special SQL planning from the Sql Parser [datafusion]

2024-07-02 Thread via GitHub
dharanad commented on issue #11207: URL: https://github.com/apache/datafusion/issues/11207#issuecomment-2203932975 I've created issues for a couple of tasks. Please let me know if you think anything needs updating in the descriptions. I'm new here and learning from the experienced folks lik

Re: [I] Implement user defined planner for `create_named_struct` [datafusion]

2024-07-02 Thread via GitHub
dharanad commented on issue #11222: URL: https://github.com/apache/datafusion/issues/11222#issuecomment-2203935220 take

Re: [I] Implement user defined planner for `sql_overlay_to_expr` [datafusion]

2024-07-02 Thread via GitHub
dharanad commented on issue #11223: URL: https://github.com/apache/datafusion/issues/11223#issuecomment-2203935381 take

[PR] Implement ScalarValue::Map [datafusion]

2024-07-02 Thread via GitHub
goldmedal opened a new pull request, #11224: URL: https://github.com/apache/datafusion/pull/11224 ## Which issue does this PR close? Closes #11128 ## Rationale for this change ## What changes are included in this PR? ## Are these changes te

[PR] Fix running examples readme [datafusion]

2024-07-02 Thread via GitHub
findepi opened a new pull request, #11225: URL: https://github.com/apache/datafusion/pull/11225 Some examples are runnable from any place (e.g. `csv_sql`), but some expect a specific working directory (e.g. `regexp`). Running from `datafusion-examples/examples` is tested on CI so guaranteed

[PR] Fix docs wordings [datafusion]

2024-07-02 Thread via GitHub
findepi opened a new pull request, #11226: URL: https://github.com/apache/datafusion/pull/11226 (no comment)

Re: [I] July 10, 2024 ASF Board Report [datafusion]

2024-07-02 Thread via GitHub
alamb commented on issue #10282: URL: https://github.com/apache/datafusion/issues/10282#issuecomment-2203947017 Here is a draft for anyone's comments / review: https://docs.google.com/document/d/1lV-cFZGHCSrTiaLW1gyEMDKW-9nf47UW8xK19QCqbVk/edit

Re: [PR] Add example of dataframe API aggregations [datafusion]

2024-07-02 Thread via GitHub
findepi commented on PR #11219: URL: https://github.com/apache/datafusion/pull/11219#issuecomment-2203961980 cc @alamb @jayzhan211 @edmondop

Re: [PR] fix: Be more lenient in interpreting input args for builtin window functions [datafusion]

2024-07-02 Thread via GitHub
alamb commented on PR #11199: URL: https://github.com/apache/datafusion/pull/11199#issuecomment-2203979952 🤔 looks like a CI test is failing now

Re: [PR] Support FixedSizedBinaryArray Parquet Data Page Statistics [datafusion]

2024-07-02 Thread via GitHub
dharanad commented on code in PR #11200: URL: https://github.com/apache/datafusion/pull/11200#discussion_r1662961525 ## datafusion/core/src/datasource/physical_plan/parquet/statistics.rs: ## @@ -903,6 +917,21 @@ macro_rules! get_data_page_statistics {

[I] BinaryOp supporting multiple parameters in Substrait [datafusion]

2024-07-02 Thread via GitHub
Lordworms opened a new issue, #11227: URL: https://github.com/apache/datafusion/issues/11227 ### Is your feature request related to a problem or challenge? While working on #10710 and integrating TPC-H query 2, I found this bug here https://github.com/apache/datafusion/asset

Re: [PR] Add ANSI support for Subtract #535 [datafusion-comet]

2024-07-02 Thread via GitHub
dharanad commented on code in PR #593: URL: https://github.com/apache/datafusion-comet/pull/593#discussion_r1662968648 ## core/src/execution/proto/expr.proto: ## @@ -219,6 +219,7 @@ message Subtract { Expr right = 2; bool fail_on_error = 3; Review Comment: Okay. It mak

Re: [PR] Add ANSI support for Subtract #535 [datafusion-comet]

2024-07-02 Thread via GitHub
dharanad commented on PR #593: URL: https://github.com/apache/datafusion-comet/pull/593#issuecomment-2204005207 I am planning to implement my solution as an extension to the work done in #616. Will resume once that PR is merged.

Re: [PR] feat: Add support for CreateNamedStruct [datafusion-comet]

2024-07-02 Thread via GitHub
andygrove commented on code in PR #620: URL: https://github.com/apache/datafusion-comet/pull/620#discussion_r1662973530 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -2141,6 +2141,25 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde w

Re: [PR] feat: IsNaN expression in Comet [datafusion-comet]

2024-07-02 Thread via GitHub
eejbyfeldt commented on PR #612: URL: https://github.com/apache/datafusion-comet/pull/612#issuecomment-2204024830 > Could you also update `docs/source/user-guide/expressions.md` to show that we now support `is_nan`? Good idea, added.

Re: [PR] feat: Add support for CreateNamedStruct [datafusion-comet]

2024-07-02 Thread via GitHub
eejbyfeldt commented on code in PR #620: URL: https://github.com/apache/datafusion-comet/pull/620#discussion_r1662997372 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -2141,6 +2141,25 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde

Re: [PR] feat: Add support for CreateNamedStruct [datafusion-comet]

2024-07-02 Thread via GitHub
eejbyfeldt commented on code in PR #620: URL: https://github.com/apache/datafusion-comet/pull/620#discussion_r1663002521 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -2626,6 +2645,11 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde

Re: [PR] Enhance short circuit handling in `CommonSubexprEliminate` [datafusion]

2024-07-02 Thread via GitHub
peter-toth commented on code in PR #11197: URL: https://github.com/apache/datafusion/pull/11197#discussion_r1663008804 ## datafusion/optimizer/src/common_subexpr_eliminate.rs: ## @@ -1799,4 +1803,34 @@ mod test { assert!(result.len() == 1); Ok(()) } + +
