Re: [I] Extract registering default features from `SessionState` and into its own function [datafusion]

2024-07-08 Thread via GitHub
jayzhan211 commented on issue #11320: URL: https://github.com/apache/datafusion/issues/11320#issuecomment-2213204802 Should we cut it even more granularly to several of `register_*` functions or configure them like existing `create_default_catalog_and_schema`? -- This is an automated mess

[I] Support to uparse logical plan with timestamp cast to string [datafusion]

2024-07-08 Thread via GitHub
sgrebnov opened a new issue, #11325: URL: https://github.com/apache/datafusion/issues/11325 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/9494 to enhance support for Unparsing LogicalPlan to SQL String. Conver

Re: [PR] Implement TPCH substrait integration test, support tpch_4 and tpch_5 [datafusion]

2024-07-08 Thread via GitHub
Blizzara commented on code in PR #11311: URL: https://github.com/apache/datafusion/pull/11311#discussion_r1668128566 ## datafusion/substrait/tests/cases/consumer_integration.rs: ## @@ -90,6 +90,40 @@ mod tests { Ok(ctx) } +async fn create_context_tpch4() -> R

[PR] Support to uparse logical plans with timestamp cast to string [datafusion]

2024-07-08 Thread via GitHub
sgrebnov opened a new pull request, #11326: URL: https://github.com/apache/datafusion/pull/11326 ## Which issue does this PR close? Closes #11325 ## Rationale for this change ## What changes are included in this PR? Convert Arrow timestamp to `ast::DateTime::Ti

[PR] Implement user defined planner for sql_substring_to_expr [datafusion]

2024-07-08 Thread via GitHub
xinlifoobar opened a new pull request, #11327: URL: https://github.com/apache/datafusion/pull/11327 ## Which issue does this PR close? Closes #11245 ## Rationale for this change ## What changes are included in this PR? ## Are these changes

Re: [PR] Introduce user defined SQL planner API [datafusion]

2024-07-08 Thread via GitHub
lewiszlw commented on PR #11180: URL: https://github.com/apache/datafusion/pull/11180#issuecomment-2213398266 Sorry, I didn't follow up this topic before. I'm thinking if we can have one UserDefinedSQLPlanner which all methods have default official implementation, users just need to impl me

[PR] Update substrait requirement from 0.35.0 to 0.36.0 [datafusion]

2024-07-08 Thread via GitHub
dependabot[bot] opened a new pull request, #11328: URL: https://github.com/apache/datafusion/pull/11328 Updates the requirements on [substrait](https://github.com/substrait-io/substrait-rs) to permit the latest version. Release notes Sourced from https://github.com/substrait-io/su

Re: [I] `SanityCheckPlan` failed on `sql_planning`1 benchmark: , Plan("Child: [\"ProjectionExec: expr=[]\", \" CoalesceBatchesExec: target_batch_size=8192\", [datafusion]

2024-07-08 Thread via GitHub
berkaysynnada commented on issue #11322: URL: https://github.com/apache/datafusion/issues/11322#issuecomment-2213416478 The problem is in tpc-ds/9. The issue arises because `NestedLoopJoinExec` requires SinglePartition from the left side, but MemoryExec sends 0 partitions. We will discuss a

Re: [PR] Support `IS NULL` and `IS NOT NULL` on Unions [datafusion]

2024-07-08 Thread via GitHub
samuelcolvin commented on code in PR #11321: URL: https://github.com/apache/datafusion/pull/11321#discussion_r1668254864 ## datafusion/physical-expr/src/expressions/is_null.rs: ## @@ -145,4 +201,65 @@ mod tests { Ok(()) } + +fn union_fields() -> UnionFields {

Re: [PR] Support `IS NULL` and `IS NOT NULL` on Unions [datafusion]

2024-07-08 Thread via GitHub
samuelcolvin commented on code in PR #11321: URL: https://github.com/apache/datafusion/pull/11321#discussion_r1668257787 ## datafusion/physical-expr/src/expressions/is_null.rs: ## @@ -100,6 +110,49 @@ impl PhysicalExpr for IsNullExpr { } } +pub(crate) fn union_is_null(un

Re: [PR] Support `IS NULL` and `IS NOT NULL` on Unions [datafusion]

2024-07-08 Thread via GitHub
samuelcolvin commented on code in PR #11321: URL: https://github.com/apache/datafusion/pull/11321#discussion_r1668259962 ## datafusion/physical-expr/src/expressions/is_null.rs: ## @@ -74,9 +77,16 @@ impl PhysicalExpr for IsNullExpr { fn evaluate(&self, batch: &RecordBatch)

[PR] fix: When consuming Substrait, temporarily rename clashing duplicate columns [datafusion]

2024-07-08 Thread via GitHub
Blizzara opened a new pull request, #11329: URL: https://github.com/apache/datafusion/pull/11329 ## Which issue does this PR close? Related to https://github.com/apache/datafusion/issues/10815, and follows up from https://github.com/apache/datafusion/pull/11049#issuecomment-2185795038

[PR] [Draft] option enable_options_value_normalization [datafusion]

2024-07-08 Thread via GitHub
xinlifoobar opened a new pull request, #11330: URL: https://github.com/apache/datafusion/pull/11330 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes teste

Re: [I] Stop changing the case for COPY TO option values [datafusion]

2024-07-08 Thread via GitHub
xinlifoobar commented on issue #10853: URL: https://github.com/apache/datafusion/issues/10853#issuecomment-2213448883 > enable_options_value_normalization I drafted #11330 to discuss this. Let me know your thoughts!

Re: [PR] feat: ANSI support for Add [datafusion-comet]

2024-07-08 Thread via GitHub
dharanad commented on code in PR #616: URL: https://github.com/apache/datafusion-comet/pull/616#discussion_r1668297829 ## core/src/execution/datafusion/expressions/binary.rs: ## @@ -0,0 +1,202 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

Re: [PR] allow alias in predicate [datafusion]

2024-07-08 Thread via GitHub
samuelcolvin commented on PR #11307: URL: https://github.com/apache/datafusion/pull/11307#issuecomment-2213504048 I've updated this to un-alias the predicate. I'm not sure if this was your suggestion, but I don't think it's necessary to un-alias recursively within `Filter::try_new`

Re: [PR] Improve stats convert performance for Binary/String/Boolean arrays [datafusion]

2024-07-08 Thread via GitHub
alamb merged PR #11319: URL: https://github.com/apache/datafusion/pull/11319

Re: [I] Improve performance of DataPage statistics extraction using StringBuilder [datafusion]

2024-07-08 Thread via GitHub
alamb closed issue #11281: Improve performance of DataPage statistics extraction using StringBuilder URL: https://github.com/apache/datafusion/issues/11281

Re: [PR] Fix typos in datafusion-examples/datafusion-cli/docs [datafusion]

2024-07-08 Thread via GitHub
alamb merged PR #11259: URL: https://github.com/apache/datafusion/pull/11259

Re: [PR] feat: ANSI support for Add [datafusion-comet]

2024-07-08 Thread via GitHub
dharanad commented on code in PR #616: URL: https://github.com/apache/datafusion-comet/pull/616#discussion_r1668351284 ## core/src/execution/datafusion/expressions/binary.rs: ## @@ -0,0 +1,202 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

Re: [PR] Introduce user defined SQL planner API [datafusion]

2024-07-08 Thread via GitHub
alamb commented on PR #11180: URL: https://github.com/apache/datafusion/pull/11180#issuecomment-2213584028 > We have to loop UserDefinedSQLPlanner instances in current design, it seems not very efficient and has much boilerplate code. And these UserDefinedSQLPlanner instaces might h

[PR] Minor: Fix Failing TPC-DS Test [datafusion]

2024-07-08 Thread via GitHub
berkaysynnada opened a new pull request, #11331: URL: https://github.com/apache/datafusion/pull/11331 ## Which issue does this PR close? Closes #11322. ## Rationale for this change We need to generate at least one partition to have a valid plan that confo

Re: [PR] HashJoin can preserve the right ordering when join type is Right [datafusion]

2024-07-08 Thread via GitHub
ozankabak commented on PR #11276: URL: https://github.com/apache/datafusion/pull/11276#issuecomment-2213646661 Seems like we have incorporated all the feedback. I will merge this later on today unless we get more feedback for improvements. Thanks everyone

Re: [PR] [Draft] option enable_options_value_normalization [datafusion]

2024-07-08 Thread via GitHub
berkaysynnada commented on PR #11330: URL: https://github.com/apache/datafusion/pull/11330#issuecomment-2213653483 Thanks @xinlifoobar working on this issue. What I suggest is that we can let people to define their own normalizations. Rather than directing all values to lowercase, pe

Re: [I] Separate Spark-compatibility expressions into a library [datafusion-comet]

2024-07-08 Thread via GitHub
Blizzara commented on issue #630: URL: https://github.com/apache/datafusion-comet/issues/630#issuecomment-2213760276 > I created #637 so that we can start splitting code out into separate crates. Nice, glad to hear this resonates and thanks! > I also started looking at what wo

Re: [I] Implement SQLancer (a end-to-end SQL fuzz testing library) [datafusion]

2024-07-08 Thread via GitHub
2010YOUY01 commented on issue #11030: URL: https://github.com/apache/datafusion/issues/11030#issuecomment-2213769947 The initial implementation is done (with ~10 bugs found 👀 ) The code is now at https://github.com/datafusion-contrib/datafusion-sqllancer, also with a more detailed descri

Re: [PR] [Draft] option enable_options_value_normalization [datafusion]

2024-07-08 Thread via GitHub
xinlifoobar commented on PR #11330: URL: https://github.com/apache/datafusion/pull/11330#issuecomment-2213809531 > Thanks @xinlifoobar working on this issue. > > What I suggest is that we can let people to define their own normalizations. Rather than directing all values to lowercase,

Re: [PR] Minor: Fix Failing TPC-DS Test [datafusion]

2024-07-08 Thread via GitHub
alamb merged PR #11331: URL: https://github.com/apache/datafusion/pull/11331

Re: [I] `SanityCheckPlan` failed on `sql_planning`1 benchmark: , Plan("Child: [\"ProjectionExec: expr=[]\", \" CoalesceBatchesExec: target_batch_size=8192\", [datafusion]

2024-07-08 Thread via GitHub
alamb closed issue #11322: `SanityCheckPlan` failed on `sql_planning`1 benchmark: , Plan("Child: [\"ProjectionExec: expr=[]\", \" CoalesceBatchesExec: target_batch_size=8192\", URL: https://github.com/apache/datafusion/issues/11322

Re: [PR] HashJoin can preserve the right ordering when join type is Right [datafusion]

2024-07-08 Thread via GitHub
ozankabak merged PR #11276: URL: https://github.com/apache/datafusion/pull/11276

Re: [PR] Update substrait requirement from 0.35.0 to 0.36.0 [datafusion]

2024-07-08 Thread via GitHub
alamb merged PR #11328: URL: https://github.com/apache/datafusion/pull/11328

Re: [PR] Support to uparse logical plans with timestamp cast to string [datafusion]

2024-07-08 Thread via GitHub
alamb merged PR #11326: URL: https://github.com/apache/datafusion/pull/11326

Re: [I] Support to uparse logical plan with timestamp cast to string [datafusion]

2024-07-08 Thread via GitHub
alamb closed issue #11325: Support to uparse logical plan with timestamp cast to string URL: https://github.com/apache/datafusion/issues/11325

Re: [PR] Implement user defined planner for sql_substring_to_expr [datafusion]

2024-07-08 Thread via GitHub
alamb merged PR #11327: URL: https://github.com/apache/datafusion/pull/11327

Re: [PR] RFC: Make it easier to call window functions via expression API (and add example) [datafusion]

2024-07-08 Thread via GitHub
timsaucer commented on PR #6746: URL: https://github.com/apache/datafusion/pull/6746#issuecomment-2213934754 I should have time to work on this near the end of the week or early next week. I'm nearing the end of a large PR for `datafusion-python` and can dive back on this topic in once t

[I] NestedLoopJoin can preserve the order of right table [datafusion]

2024-07-08 Thread via GitHub
berkaysynnada opened a new issue, #11332: URL: https://github.com/apache/datafusion/issues/11332 ### Is your feature request related to a problem or challenge? `NestedLoopJoinExec` can preserve the order of the right table, if it has some, similar to `HashJoinExec`. ### Describ

Re: [I] Extract registering default features from `SessionState` and into its own function [datafusion]

2024-07-08 Thread via GitHub
Omega359 commented on issue #11320: URL: https://github.com/apache/datafusion/issues/11320#issuecomment-2214007578 I was actually looking at this last week interestingly enough. My thought was to have a parent `register_all_defaults` which delegates to individual functions for registering d

Re: [PR] [Draft] option enable_options_value_normalization [datafusion]

2024-07-08 Thread via GitHub
berkaysynnada commented on PR #11330: URL: https://github.com/apache/datafusion/pull/11330#issuecomment-2214018210 > > Thanks @xinlifoobar working on this issue. > > What I suggest is that we can let people to define their own normalizations. Rather than directing all values to lowercase,

Re: [PR] [Draft] option enable_options_value_normalization [datafusion]

2024-07-08 Thread via GitHub
xinlifoobar commented on PR #11330: URL: https://github.com/apache/datafusion/pull/11330#issuecomment-2214086218 > But IMO it won't be a good idea... I thought the *configs are static values. What should you do if you would keep a config file for function values?

Re: [I] Reduce test duplication in tests for data page stattistics [datafusion]

2024-07-08 Thread via GitHub
alamb commented on issue #11000: URL: https://github.com/apache/datafusion/issues/11000#issuecomment-2214099764 I propose we consolidate / review the tests when we move this code upstream: https://github.com/apache/arrow-rs/issues/4328

Re: [PR] allow alias in predicate [datafusion]

2024-07-08 Thread via GitHub
jonahgao commented on code in PR #11307: URL: https://github.com/apache/datafusion/pull/11307#discussion_r1668653670 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2130,16 +2130,10 @@ impl Filter { } } -// filter predicates should not be alia

Re: [PR] allow alias in predicate [datafusion]

2024-07-08 Thread via GitHub
jonahgao commented on code in PR #11307: URL: https://github.com/apache/datafusion/pull/11307#discussion_r1668666851 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2130,16 +2130,10 @@ impl Filter { } } -// filter predicates should not be alia

Re: [I] Consider renaming `UserDefinedSQLPlanner` to `ExprPlanner` [datafusion]

2024-07-08 Thread via GitHub
andygrove commented on issue #11304: URL: https://github.com/apache/datafusion/issues/11304#issuecomment-2214253308 @alamb @samuelcolvin This seems to be the only issue blocking the release. Is anyone planning on working on this? If not, I'll pick it up.

[PR] Fix bug when pushing projection under joins [datafusion]

2024-07-08 Thread via GitHub
jonahgao opened a new pull request, #11333: URL: https://github.com/apache/datafusion/pull/11333 ## Which issue does this PR close? Closes #11269. Closes #11275. ## Rationale for this change The bug occurs at the [`new_indices_for_join_filter`](https://github.com/

Re: [I] Consider renaming `UserDefinedSQLPlanner` to `ExprPlanner` [datafusion]

2024-07-08 Thread via GitHub
samuelcolvin commented on issue #11304: URL: https://github.com/apache/datafusion/issues/11304#issuecomment-2214323216 Please do.

Re: [PR] allow alias in predicate [datafusion]

2024-07-08 Thread via GitHub
samuelcolvin commented on code in PR #11307: URL: https://github.com/apache/datafusion/pull/11307#discussion_r1668798569 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2130,16 +2130,10 @@ impl Filter { } } -// filter predicates should not be

Re: [PR] Fix bug when pushing projection under joins [datafusion]

2024-07-08 Thread via GitHub
jonahgao commented on code in PR #11333: URL: https://github.com/apache/datafusion/pull/11333#discussion_r1668799533 ## datafusion/sqllogictest/test_files/join.slt: ## @@ -986,3 +986,61 @@ DROP TABLE employees statement ok DROP TABLE department + + +# Test issue: https://git

Re: [I] Release DataFusion `40.0.0` [datafusion]

2024-07-08 Thread via GitHub
samuelcolvin commented on issue #11077: URL: https://github.com/apache/datafusion/issues/11077#issuecomment-2214341580 Please could we include #11307, and maybe even #11321?

Re: [PR] Improve `DataFrame` Users Guide [datafusion]

2024-07-08 Thread via GitHub
efredine commented on PR #11324: URL: https://github.com/apache/datafusion/pull/11324#issuecomment-2214357564 Reviewing this sparks a lot of broader thoughts for me. First of all, I'm not sure we need the distinction between "user guide" and "library user guide" when it comes to dat

Re: [PR] feat: ANSI support for Add [datafusion-comet]

2024-07-08 Thread via GitHub
planga82 commented on code in PR #616: URL: https://github.com/apache/datafusion-comet/pull/616#discussion_r1668815147 ## core/src/execution/datafusion/expressions/binary.rs: ## @@ -0,0 +1,202 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

Re: [PR] allow alias in predicate [datafusion]

2024-07-08 Thread via GitHub
jonahgao commented on code in PR #11307: URL: https://github.com/apache/datafusion/pull/11307#discussion_r1668840897 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2130,16 +2130,10 @@ impl Filter { } } -// filter predicates should not be alia

Re: [PR] Fix bug when pushing projection under joins [datafusion]

2024-07-08 Thread via GitHub
jonahgao commented on PR #11333: URL: https://github.com/apache/datafusion/pull/11333#issuecomment-2214432019 @berkaysynnada, could you please help review this?

[PR] adding an unused dependency checker [datafusion-comet]

2024-07-08 Thread via GitHub
vaibhawvipul opened a new pull request, #640: URL: https://github.com/apache/datafusion-comet/pull/640 ## Which issue does this PR close? Closes #. ## Rationale for this change Removing unused dependencies. Better for streamlining the project and frees up

Re: [PR] chore: Convert Rust project into a workspace [datafusion-comet]

2024-07-08 Thread via GitHub
andygrove merged PR #637: URL: https://github.com/apache/datafusion-comet/pull/637

Re: [PR] feat: adding an unused dependency checker [datafusion-comet]

2024-07-08 Thread via GitHub
vaibhawvipul closed pull request #640: feat: adding an unused dependency checker URL: https://github.com/apache/datafusion-comet/pull/640

Re: [PR] feat: adding an unused dependency checker [datafusion-comet]

2024-07-08 Thread via GitHub
vaibhawvipul commented on PR #640: URL: https://github.com/apache/datafusion-comet/pull/640#issuecomment-2214503801 Lots of conflicts. I will resubmit.

Re: [PR] allow alias in predicate [datafusion]

2024-07-08 Thread via GitHub
andygrove commented on code in PR #11307: URL: https://github.com/apache/datafusion/pull/11307#discussion_r1668897084 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2130,16 +2130,10 @@ impl Filter { } } -// filter predicates should not be ali

Re: [PR] Improve and test dataframe API examples in docs [datafusion]

2024-07-08 Thread via GitHub
efredine commented on code in PR #11290: URL: https://github.com/apache/datafusion/pull/11290#discussion_r1668904890 ## docs/source/library-user-guide/using-the-dataframe-api.md: ## @@ -19,129 +19,267 @@ # Using the DataFrame API -## What is a DataFrame +The [Users Guide] i

[PR] add cargo machete to remove udeps [datafusion-comet]

2024-07-08 Thread via GitHub
vaibhawvipul opened a new pull request, #641: URL: https://github.com/apache/datafusion-comet/pull/641 ## Which issue does this PR close? Closes #. ## Rationale for this change Removing unused dependencies. Better for streamlining the project and frees up

Re: [PR] Improve `DataFrame` Users Guide [datafusion]

2024-07-08 Thread via GitHub
comphead commented on code in PR #11324: URL: https://github.com/apache/datafusion/pull/11324#discussion_r1668931470 ## docs/source/user-guide/dataframe.md: ## @@ -19,17 +19,30 @@ # DataFrame API -A DataFrame represents a logical set of rows with the same named columns, si

Re: [PR] Improve and test dataframe API examples in docs [datafusion]

2024-07-08 Thread via GitHub
alamb commented on code in PR #11290: URL: https://github.com/apache/datafusion/pull/11290#discussion_r1668938569 ## docs/source/library-user-guide/using-the-dataframe-api.md: ## @@ -19,129 +19,267 @@ # Using the DataFrame API -## What is a DataFrame +The [Users Guide] intr

Re: [I] DataFusion weekly project plan (Andrew Lamb) - July 8, 2024 [datafusion]

2024-07-08 Thread via GitHub
alamb commented on issue #11334: URL: https://github.com/apache/datafusion/issues/11334#issuecomment-2214636404 Review Queue Arrow - [ ] https://github.com/apache/arrow-rs/pull/6000 - [ ] https://github.com/apache/arrow-rs/pull/6011 - [ ] https://github.com/apache/arrow-rs/pul

Re: [I] DataFusion weekly project plan (Andrew Lamb) - July 1, 2024 [datafusion]

2024-07-08 Thread via GitHub
alamb closed issue #11190: DataFusion weekly project plan (Andrew Lamb) - July 1, 2024 URL: https://github.com/apache/datafusion/issues/11190

Re: [PR] Improve volatile expression handling in `CommonSubexprEliminate` [datafusion]

2024-07-08 Thread via GitHub
alamb merged PR #11265: URL: https://github.com/apache/datafusion/pull/11265

Re: [PR] Improve volatile expression handling in `CommonSubexprEliminate` [datafusion]

2024-07-08 Thread via GitHub
alamb commented on PR #11265: URL: https://github.com/apache/datafusion/pull/11265#issuecomment-2214669069 Thanks again @peter-toth and @haohuaijin

Re: [PR] Change array agg result from empty list to null if no row qualifed [datafusion]

2024-07-08 Thread via GitHub
alamb commented on code in PR #11299: URL: https://github.com/apache/datafusion/pull/11299#discussion_r1668961727 ## datafusion/sqllogictest/test_files/aggregate.slt: ## @@ -2744,28 +2735,84 @@ SELECT ARRAY_AGG([1]) [[1]] -# test_approx_percentile_cont_decimal_support -

Re: [PR] Convert `nth_value` to UDAF [datafusion]

2024-07-08 Thread via GitHub
alamb commented on PR #11287: URL: https://github.com/apache/datafusion/pull/11287#issuecomment-2214670401 I merged this PR up from main to ensure all the CI still passes and once they do I plan to merge it

Re: [PR] Implement TPCH substrait integration test, support tpch_4 and tpch_5 [datafusion]

2024-07-08 Thread via GitHub
alamb commented on code in PR #11311: URL: https://github.com/apache/datafusion/pull/11311#discussion_r1668976073 ## datafusion/substrait/src/logical_plan/consumer.rs: ## @@ -1297,6 +1298,32 @@ pub async fn from_substrait_rex( outer_ref_columns,

Re: [PR] Support `IS NULL` and `IS NOT NULL` on Unions [datafusion]

2024-07-08 Thread via GitHub
alamb merged PR #11321: URL: https://github.com/apache/datafusion/pull/11321

Re: [I] Union columns can never be `NULL` [datafusion]

2024-07-08 Thread via GitHub
alamb closed issue #11162: Union columns can never be `NULL` URL: https://github.com/apache/datafusion/issues/11162

Re: [PR] feat: add cargo machete to remove udeps [datafusion-comet]

2024-07-08 Thread via GitHub
vaibhawvipul commented on PR #641: URL: https://github.com/apache/datafusion-comet/pull/641#issuecomment-2214729925 CI failed because of a network error. Can we do a re-run?

[I] abs implementation seems incorrect [datafusion-comet]

2024-07-08 Thread via GitHub
andygrove opened a new issue, #642: URL: https://github.com/apache/datafusion-comet/issues/642 ### Describe the bug ### Describe the bug I was reviewing the `abs` logic as part of https://github.com/apache/datafusion-comet/pull/638 and noticed that it seems incorrect. I

Re: [PR] feat: add cargo machete to remove udeps [datafusion-comet]

2024-07-08 Thread via GitHub
viirya commented on PR #641: URL: https://github.com/apache/datafusion-comet/pull/641#issuecomment-2214783158 > CI failed because of network error. Can we do a re-run? Sure. Triggered a re-run.

Re: [PR] chore: Make shuffle compression level configurable [datafusion-comet]

2024-07-08 Thread via GitHub
parthchandra commented on code in PR #632: URL: https://github.com/apache/datafusion-comet/pull/632#discussion_r1669032147 ## common/src/main/scala/org/apache/comet/CometConf.scala: ## @@ -175,6 +175,12 @@ object CometConf extends ShimCometConf { .stringConf .createWit

Re: [I] abs implementation seems incorrect [datafusion-comet]

2024-07-08 Thread via GitHub
andygrove commented on issue #642: URL: https://github.com/apache/datafusion-comet/issues/642#issuecomment-2214793802 I added tests in https://github.com/apache/datafusion-comet/pull/638 that confirm the bug

Re: [I] Convert `NthValue` to UDAF [datafusion]

2024-07-08 Thread via GitHub
alamb closed issue #11284: Convert `NthValue` to UDAF URL: https://github.com/apache/datafusion/issues/11284

Re: [PR] Convert `nth_value` to UDAF [datafusion]

2024-07-08 Thread via GitHub
alamb merged PR #11287: URL: https://github.com/apache/datafusion/pull/11287

Re: [PR] Convert `nth_value` to UDAF [datafusion]

2024-07-08 Thread via GitHub
alamb commented on PR #11287: URL: https://github.com/apache/datafusion/pull/11287#issuecomment-2214795739 🚀

Re: [PR] Convert `nth_value` to UDAF [datafusion]

2024-07-08 Thread via GitHub
alamb commented on PR #11287: URL: https://github.com/apache/datafusion/pull/11287#issuecomment-2214795920 Thanks again @jayzhan211 and @jcsherin

Re: [PR] Implement TPCH substrait integration test, support tpch_4 and tpch_5 [datafusion]

2024-07-08 Thread via GitHub
alamb merged PR #11311: URL: https://github.com/apache/datafusion/pull/11311

Re: [PR] Implement TPCH substrait integration test, support tpch_4 and tpch_5 [datafusion]

2024-07-08 Thread via GitHub
alamb commented on PR #11311: URL: https://github.com/apache/datafusion/pull/11311#issuecomment-2214796233 🚀

Re: [PR] Enable `clone_on_ref_ptr` clippy lint on physical-plan crate [datafusion]

2024-07-08 Thread via GitHub
alamb merged PR #11241: URL: https://github.com/apache/datafusion/pull/11241

[PR] Minor: update dashmap [datafusion]

2024-07-08 Thread via GitHub
alamb opened a new pull request, #11335: URL: https://github.com/apache/datafusion/pull/11335 ## Which issue does this PR close? N/A ## Rationale for this change Let's keep up with dependencies Previously Dashmap `6.0.0` was yanked, but is now available: https://githu

Re: [PR] feat: Create new `datafusion-comet-expr` crate containing Spark-compatible DataFusion expressions [datafusion-comet]

2024-07-08 Thread via GitHub
andygrove commented on code in PR #638: URL: https://github.com/apache/datafusion-comet/pull/638#discussion_r1669040245 ## native/spark-expr/Cargo.toml: ## @@ -0,0 +1,38 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

Re: [PR] build(deps): update dashmap requirement from 5.5.0 to 6.0.0 [datafusion]

2024-07-08 Thread via GitHub
alamb commented on PR #10994: URL: https://github.com/apache/datafusion/pull/10994#issuecomment-2214810819 To update to 6.0.1: https://github.com/apache/datafusion/pull/11335

Re: [I] User guide should have a reference listing all the built-in functions [datafusion]

2024-07-08 Thread via GitHub
alamb closed issue #2872: User guide should have a reference listing all the built-in functions URL: https://github.com/apache/datafusion/issues/2872

Re: [I] User guide should have a reference listing all the built-in functions [datafusion]

2024-07-08 Thread via GitHub
alamb commented on issue #2872: URL: https://github.com/apache/datafusion/issues/2872#issuecomment-2214856406 I think we now have this: https://datafusion.apache.org/user-guide/sql/scalar_functions.html, https://datafusion.apache.org/user-guide/sql/aggregate_functions.html, https://datafu

Re: [I] Improve array data types [datafusion]

2024-07-08 Thread via GitHub
alamb closed issue #7288: Improve array data types URL: https://github.com/apache/datafusion/issues/7288

Re: [I] Improve array data types [datafusion]

2024-07-08 Thread via GitHub
alamb commented on issue #7288: URL: https://github.com/apache/datafusion/issues/7288#issuecomment-2214859989 I think @jayzhan211 has basically done this and the code is substantially cleaner (and different) than when this ticket was filed so closing it down. If there is any specific sugge

Re: [I] [Epic] General ticket for the concept of the practical implementation of `ARRAY` [datafusion]

2024-07-08 Thread via GitHub
alamb closed issue #6980: [Epic] General ticket for the concept of the practical implementation of `ARRAY` URL: https://github.com/apache/datafusion/issues/6980

Re: [I] [Epic] General ticket for the concept of the practical implementation of `ARRAY` [datafusion]

2024-07-08 Thread via GitHub
alamb commented on issue #6980: URL: https://github.com/apache/datafusion/issues/6980#issuecomment-2214863083 This looks pretty good to me now, I think we have done all the major items so closing this epic as good. We can handle small cleanup items as normal tickets. Please let me know if y

[I] Add a "Gentle Introduction to Arrow / Record Batches" [datafusion]

2024-07-08 Thread via GitHub
alamb opened a new issue, #11336: URL: https://github.com/apache/datafusion/issues/11336 ### Is your feature request related to a problem or challenge? As @efredine notes on https://github.com/apache/datafusion/pull/11290 / https://github.com/apache/datafusion/pull/11290#discussion_

Re: [PR] Improve and test dataframe API examples in docs [datafusion]

2024-07-08 Thread via GitHub
alamb commented on code in PR #11290: URL: https://github.com/apache/datafusion/pull/11290#discussion_r1669077319 ## docs/source/library-user-guide/using-the-dataframe-api.md: ## @@ -19,129 +19,267 @@ # Using the DataFrame API -## What is a DataFrame +The [Users Guide] intr

Re: [PR] fix: When consuming Substrait, temporarily rename clashing duplicate columns [datafusion]

2024-07-08 Thread via GitHub
alamb commented on code in PR #11329: URL: https://github.com/apache/datafusion/pull/11329#discussion_r1669083208 ## datafusion/substrait/src/logical_plan/consumer.rs: ## @@ -403,22 +403,33 @@ pub async fn from_substrait_rel( let mut input = LogicalPlanBuilder::

[I] Create a Comet dockerfile [datafusion-comet]

2024-07-08 Thread via GitHub
comphead opened a new issue, #643: URL: https://github.com/apache/datafusion-comet/issues/643 ### What is the problem the feature request solves? I would like to run the Comet accelerator with Spark in a Kubernetes environment, and I would like instructions and a Dockerfile for doing so. ### Descr

Re: [I] `spark.comet.memory.overhead.min` not respected when submitting jobs with Comet with Spark on Kubernetes [datafusion-comet]

2024-07-08 Thread via GitHub
comphead commented on issue #605: URL: https://github.com/apache/datafusion-comet/issues/605#issuecomment-2214898912 Depends on https://github.com/apache/datafusion-comet/issues/643

Re: [PR] fix: When consuming Substrait, temporarily rename clashing duplicate columns [datafusion]

2024-07-08 Thread via GitHub
alamb merged PR #11329: URL: https://github.com/apache/datafusion/pull/11329

[PR] Minor: remove clones and unnecessary Arcs in `from_substrait_rex` [datafusion]

2024-07-08 Thread via GitHub
alamb opened a new pull request, #11337: URL: https://github.com/apache/datafusion/pull/11337 ## Which issue does this PR close? Closes #. ## Rationale for this change While reviewing https://github.com/apache/datafusion/pull/11329 from @Blizzara I was confused a

Re: [PR] fix: When consuming Substrait, temporarily rename clashing duplicate columns [datafusion]

2024-07-08 Thread via GitHub
alamb commented on code in PR #11329: URL: https://github.com/apache/datafusion/pull/11329#discussion_r1669099740 ## datafusion/substrait/src/logical_plan/consumer.rs: ## @@ -403,22 +403,33 @@ pub async fn from_substrait_rel( let mut input = LogicalPlanBuilder::
