[I] Cannot run `group by` without aggregation function due to 'Invalid aggregate expression' [datafusion]

2024-09-02 Thread via GitHub
Tanger opened a new issue, #12298: URL: https://github.com/apache/datafusion/issues/12298 ### Describe the bug Return an error like below message. Error in response: DataFusionError("Internal error: Invalid aggregate expression 'Column(Column { relation: Some(Bare { table:

[PR] fix hash-repartition panic [datafusion]

2024-09-02 Thread via GitHub
thinh2 opened a new pull request, #12297: URL: https://github.com/apache/datafusion/pull/12297 ## Which issue does this PR close? Closes #12057 . ## Rationale for this change In #12057, the root cause of the panic is because Hash Repartition Execution rec

Re: [PR] Support `map_keys` & `map_values` for MAP type [datafusion]

2024-09-02 Thread via GitHub
dharanad commented on PR #12194: URL: https://github.com/apache/datafusion/pull/12194#issuecomment-2325670489 Thank You @Weijun-H @jayzhan211 @Blizzara -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] implement max_by aggregate function [datafusion]

2024-09-02 Thread via GitHub
Lordworms commented on code in PR #12284: URL: https://github.com/apache/datafusion/pull/12284#discussion_r1741473849 ## datafusion/functions-aggregate/src/max_by.rs: ## @@ -0,0 +1,168 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lice

Re: [I] Support of timestamps and steps of less than a day for `generate_series` [datafusion]

2024-09-02 Thread via GitHub
Abdullahsab3 commented on issue #11822: URL: https://github.com/apache/datafusion/issues/11822#issuecomment-2325663126 Sky is the limit :-) I think postgres supports all kinds of intervals (except nanos since those are not supported by postgresql), but they handle `generate_series` differen

Re: [PR] Implement `kurtosis_pop` UDAF [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 commented on code in PR #12273: URL: https://github.com/apache/datafusion/pull/12273#discussion_r1741434584 ## datafusion/functions-aggregate/src/kurtosis_pop.rs: ## @@ -0,0 +1,199 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribut

Re: [PR] ignore: Experimenting with sharing Native/NativeUtils instead of creating per iterator [datafusion-comet]

2024-09-02 Thread via GitHub
andygrove commented on PR #903: URL: https://github.com/apache/datafusion-comet/pull/903#issuecomment-2325541234 This change causes a mix of errors in CI: ``` java.lang.IllegalStateException: RefCnt has gone negative at org.apache.comet.shaded.arrow.util.Preconditions.check

Re: [PR] Support `skewness(x)` in Aggregation function [datafusion]

2024-09-02 Thread via GitHub
2010YOUY01 commented on PR #12295: URL: https://github.com/apache/datafusion/pull/12295#issuecomment-2325498438 > Planning to add more logic tests. I think we can port tests from duckdb https://github.com/duckdb/duckdb/blob/main/test/sql/aggregate/aggregates/test_skewness.test

Re: [I] Support prepared statement arguments in the LIMIT clause [datafusion]

2024-09-02 Thread via GitHub
jonahgao commented on issue #12294: URL: https://github.com/apache/datafusion/issues/12294#issuecomment-2325489579 I think it can be easily supported after #9821

Re: [PR] Support null safe equals in extract_equijoin_predicate [datafusion]

2024-09-02 Thread via GitHub
github-actions[bot] commented on PR #11272: URL: https://github.com/apache/datafusion/pull/11272#issuecomment-2325462694 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [I] Support `map_values` for MAP type [datafusion]

2024-09-02 Thread via GitHub
Weijun-H commented on issue #12148: URL: https://github.com/apache/datafusion/issues/12148#issuecomment-2325458120 complete via #12194

Re: [I] Support `map_values` for MAP type [datafusion]

2024-09-02 Thread via GitHub
Weijun-H closed issue #12148: Support `map_values` for MAP type URL: https://github.com/apache/datafusion/issues/12148

Re: [PR] fix docker build in CI [datafusion-ballista]

2024-09-02 Thread via GitHub
andygrove merged PR #1046: URL: https://github.com/apache/datafusion-ballista/pull/1046

Re: [PR] include input fields as output for Substrait consumer [datafusion]

2024-09-02 Thread via GitHub
Lordworms commented on PR #12225: URL: https://github.com/apache/datafusion/pull/12225#issuecomment-2325414784 > I don't feel this is right - unless I misunderstood something. This tries to fix the problem at the root level of the plan, so that we'd produce the columns the plan asked for, a

[PR] fix docker build in CI [datafusion-ballista]

2024-09-02 Thread via GitHub
andygrove opened a new pull request, #1046: URL: https://github.com/apache/datafusion-ballista/pull/1046 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing cha

Re: [PR] Fix cargo build [datafusion-ballista]

2024-09-02 Thread via GitHub
andygrove merged PR #1045: URL: https://github.com/apache/datafusion-ballista/pull/1045

Re: [PR] Fix cargo build [datafusion-ballista]

2024-09-02 Thread via GitHub
andygrove commented on PR #1045: URL: https://github.com/apache/datafusion-ballista/pull/1045#issuecomment-2325406674 Cargo build is fixed with this PR but the docker build still fails. ``` ./dev/build-ballista-docker.sh: line 26: docker-compose: command not found ```

Re: [PR] Introduce `Signature::Coercible` [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 merged PR #12275: URL: https://github.com/apache/datafusion/pull/12275

[PR] ignore: Experimenting with sharing Native/NativeUtils instead of creating per iterator [datafusion-comet]

2024-09-02 Thread via GitHub
andygrove opened a new pull request, #903: URL: https://github.com/apache/datafusion-comet/pull/903 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes t

Re: [PR] feat: add extension for logical_plan_builder [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 commented on PR #12293: URL: https://github.com/apache/datafusion/pull/12293#issuecomment-2325391867 Why do we need `extension`? What do you mean by `build extension logical plan`

Re: [PR] Removes min/max/count comparison based on name in aggregate statistics [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 commented on code in PR #12296: URL: https://github.com/apache/datafusion/pull/12296#discussion_r1741296984 ## datafusion/expr/src/udaf.rs: ## @@ -262,6 +262,19 @@ impl AggregateUDF { self.inner.is_descending() } +/// Returns true if the function i

Re: [PR] fix: Add output to Comet operators equal and hashCode [datafusion-comet]

2024-09-02 Thread via GitHub
viirya commented on PR #902: URL: https://github.com/apache/datafusion-comet/pull/902#issuecomment-2325385902 Thanks @andygrove

Re: [PR] fix: Add output to Comet operators equal and hashCode [datafusion-comet]

2024-09-02 Thread via GitHub
viirya merged PR #902: URL: https://github.com/apache/datafusion-comet/pull/902

[PR] Fix build [datafusion-ballista]

2024-09-02 Thread via GitHub
andygrove opened a new pull request, #1045: URL: https://github.com/apache/datafusion-ballista/pull/1045 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing cha

Re: [I] Update `REGEXP_LIKE` scalar function to support Utf8View [datafusion]

2024-09-02 Thread via GitHub
Omega359 commented on issue #11910: URL: https://github.com/apache/datafusion/issues/11910#issuecomment-2325345730 See #12168 for what I think would be a very similar impl (and note my comment on that)

Re: [PR] Implement native support StringView for `CONTAINS` function [datafusion]

2024-09-02 Thread via GitHub
Omega359 commented on PR #12168: URL: https://github.com/apache/datafusion/pull/12168#issuecomment-2325344149 I am wondering since the `contains` udf relies on regex (which isn't documented at all btw) whether it should be a wrapper for `regexp_like` and moved to the regex module? The order

Re: [I] Support of timestamps and steps of less than a day for `generate_series` [datafusion]

2024-09-02 Thread via GitHub
Omega359 commented on issue #11822: URL: https://github.com/apache/datafusion/issues/11822#issuecomment-2325332205 I'm curious what the smallest interval that these udf's should support? Seconds? Millis? Nanos? Should calls like the following be supported `select generate_series('2021

Re: [I] Expose API to register a foreign TableProvider [datafusion-python]

2024-09-02 Thread via GitHub
ion-elgreco commented on issue #823: URL: https://github.com/apache/datafusion-python/issues/823#issuecomment-2325325928 > For Table Provider, what I've been investigating is how we could do something like `register_table_provider`. But even scratching the surface of making this stable acr

Re: [PR] chore: Revise batch pull approach to more follow C Data interface semantics [datafusion-comet]

2024-09-02 Thread via GitHub
viirya commented on PR #893: URL: https://github.com/apache/datafusion-comet/pull/893#issuecomment-2325248687 The last test failure is fixed by #902.

Re: [I] Support `map_values` for MAP type [datafusion]

2024-09-02 Thread via GitHub
dharanad commented on issue #12148: URL: https://github.com/apache/datafusion/issues/12148#issuecomment-2325248237 @Weijun-H This can be closed

[PR] fix: Add output to Comet operators equal and hashCode [datafusion-comet]

2024-09-02 Thread via GitHub
viirya opened a new pull request, #902: URL: https://github.com/apache/datafusion-comet/pull/902 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes test

Re: [PR] WIP: experiment with SMJ last buffered batch [datafusion]

2024-09-02 Thread via GitHub
comphead commented on PR #12082: URL: https://github.com/apache/datafusion/pull/12082#issuecomment-2325221509 > At what point in the code you are able to observe `0..1` for the key 2? I'm running the test from https://github.com/apache/datafusion/pull/12082#issuecomment-2319361185 and

[PR] Removes min/max/count comparison based on name in aggregate statistics [datafusion]

2024-09-02 Thread via GitHub
edmondop opened a new pull request, #12296: URL: https://github.com/apache/datafusion/pull/12296 ## Which issue does this PR close? Closes #11151 . ## Rationale for this change ## What changes are included in this PR? ## Are these changes tested

Re: [PR] Support `skewness(x)` in Aggregation function [datafusion]

2024-09-02 Thread via GitHub
dharanad commented on PR #12295: URL: https://github.com/apache/datafusion/pull/12295#issuecomment-2325189763 Planning to add more logic tests.

Re: [PR] Remove unsafe Send impl from PriorityMap [datafusion]

2024-09-02 Thread via GitHub
findepi commented on code in PR #12289: URL: https://github.com/apache/datafusion/pull/12289#discussion_r1741180122 ## datafusion/physical-plan/src/aggregates/topk/priority_map.rs: ## @@ -25,17 +25,12 @@ use datafusion_common::Result; /// A `Map` / `PriorityQueue` combo that

Re: [I] Performance regression after adding support for SMJ with join filter [datafusion-comet]

2024-09-02 Thread via GitHub
andygrove commented on issue #901: URL: https://github.com/apache/datafusion-comet/issues/901#issuecomment-2325149393 Disabling sortMergeJoin via configs restores the original performance.

Re: [PR] Support alternate format for Date32 unparsing (TEXT/SQLite) [datafusion]

2024-09-02 Thread via GitHub
sgrebnov commented on PR #12282: URL: https://github.com/apache/datafusion/pull/12282#issuecomment-2325148515 Wow, it was fast 🚀 Thank you @alamb

[PR] Support `skewness(x)` in Aggregation function [datafusion]

2024-09-02 Thread via GitHub
dharanad opened a new pull request, #12295: URL: https://github.com/apache/datafusion/pull/12295 ## Which issue does this PR close? Closes #12249 ## Rationale for this change ## What changes are included in this PR? Add UDAF to support `skewness`. R

Re: [PR] Support the custom terminator for the CSV file format [datafusion]

2024-09-02 Thread via GitHub
korowa commented on code in PR #12263: URL: https://github.com/apache/datafusion/pull/12263#discussion_r1741148011 ## datafusion/core/src/datasource/physical_plan/csv.rs: ## @@ -1210,6 +1244,44 @@ mod tests { crate::assert_batches_eq!(expected, &result); } +#

Re: [PR] Remove unsafe Send impl from PriorityMap [datafusion]

2024-09-02 Thread via GitHub
avantgardnerio commented on code in PR #12289: URL: https://github.com/apache/datafusion/pull/12289#discussion_r1741143078 ## datafusion/physical-plan/src/aggregates/topk/priority_map.rs: ## @@ -25,17 +25,12 @@ use datafusion_common::Result; /// A `Map` / `PriorityQueue` comb

Re: [I] Support `skewness(x)` in Aggregation function [datafusion]

2024-09-02 Thread via GitHub
dharanad commented on issue #12249: URL: https://github.com/apache/datafusion/issues/12249#issuecomment-2325113030 Referring : https://github.com/duckdb/duckdb/blob/main/src/core_functions/aggregate/distributive/skew.cpp

Re: [PR] Support the custom terminator for the CSV file format [datafusion]

2024-09-02 Thread via GitHub
goldmedal commented on code in PR #12263: URL: https://github.com/apache/datafusion/pull/12263#discussion_r1741127326 ## datafusion/core/src/datasource/physical_plan/csv.rs: ## @@ -1210,6 +1244,44 @@ mod tests { crate::assert_batches_eq!(expected, &result); } +

Re: [PR] Support the custom terminator for the CSV file format [datafusion]

2024-09-02 Thread via GitHub
goldmedal commented on code in PR #12263: URL: https://github.com/apache/datafusion/pull/12263#discussion_r1741091339 ## datafusion/core/src/datasource/physical_plan/csv.rs: ## @@ -112,6 +114,7 @@ impl CsvExecBuilder { has_header: false, delimiter: b','

Re: [I] feat: better exception when table doesn't exist [datafusion-python]

2024-09-02 Thread via GitHub
andygrove closed issue #796: feat: better exception when table doesn't exist URL: https://github.com/apache/datafusion-python/issues/796

Re: [PR] feat: better exception and message for table not found [datafusion-python]

2024-09-02 Thread via GitHub
andygrove merged PR #851: URL: https://github.com/apache/datafusion-python/pull/851

Re: [I] Render tables using html in notebooks. [datafusion-python]

2024-09-02 Thread via GitHub
andygrove closed issue #713: Render tables using html in notebooks. URL: https://github.com/apache/datafusion-python/issues/713

Re: [I] Support array expressions for __getitem__ [datafusion-python]

2024-09-02 Thread via GitHub
andygrove closed issue #810: Support array expressions for __getitem__ URL: https://github.com/apache/datafusion-python/issues/810

Re: [PR] Set of small features [datafusion-python]

2024-09-02 Thread via GitHub
andygrove merged PR #839: URL: https://github.com/apache/datafusion-python/pull/839

Re: [I] Add DataFrame transform function [datafusion-python]

2024-09-02 Thread via GitHub
andygrove closed issue #807: Add DataFrame transform function URL: https://github.com/apache/datafusion-python/issues/807

Re: [I] update GH Build workflow now that `macos-latest` is ARM64 [datafusion-python]

2024-09-02 Thread via GitHub
andygrove closed issue #831: update GH Build workflow now that `macos-latest` is ARM64 URL: https://github.com/apache/datafusion-python/issues/831

Re: [PR] build(ci): use proper mac runners [datafusion-python]

2024-09-02 Thread via GitHub
andygrove merged PR #841: URL: https://github.com/apache/datafusion-python/pull/841

Re: [PR] chore: fix typos [datafusion-python]

2024-09-02 Thread via GitHub
andygrove merged PR #844: URL: https://github.com/apache/datafusion-python/pull/844

Re: [PR] Add Window Functions for use with function builder [datafusion-python]

2024-09-02 Thread via GitHub
andygrove merged PR #808: URL: https://github.com/apache/datafusion-python/pull/808

[PR] feat: better exception and message for table not found [datafusion-python]

2024-09-02 Thread via GitHub
mesejo opened a new pull request, #851: URL: https://github.com/apache/datafusion-python/pull/851 # Which issue does this PR close? Closes #796. # What changes are included in this PR? Change from the broad Exception to a specific KeyError exception. # Are there any user-

Re: [PR] Fix issue with "to_date" failing to process dates later than year 2262 [datafusion]

2024-09-02 Thread via GitHub
Omega359 commented on PR #12227: URL: https://github.com/apache/datafusion/pull/12227#issuecomment-2324861122 > And, as a brief reminder… I just intended to fix the issue that the date calculation has an upper limit of 2262. Both, the full timestamp parsing as well as the UTC handling are a

[I] Support prepared statement arguments in the LIMIT clause [datafusion]

2024-09-02 Thread via GitHub
WeCodingNow opened a new issue, #12294: URL: https://github.com/apache/datafusion/issues/12294 ### Is your feature request related to a problem or challenge? DataFusion: v41.0.0 I want to be able to write the following query as a prepared statement: ```sql PREPARE get_

Re: [I] UNION schema depends on branch order in case of array values [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 commented on issue #12292: URL: https://github.com/apache/datafusion/issues/12292#issuecomment-2324827013 We should return error since integer is not able to cast to timestamp and vice versa

Re: [PR] Update the CONCAT scalar function to support Utf8View [datafusion]

2024-09-02 Thread via GitHub
devanbenz commented on PR #12224: URL: https://github.com/apache/datafusion/pull/12224#issuecomment-2324820327 > Thank you @devanbenz @findepi and @tshauck > > I think this PR is good enough to merge, but I agree with @devanbenz 's observation [#12224 (files)](https://github.com/apac

Re: [PR] Avoid RowConverter for multi group by [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 commented on code in PR #12269: URL: https://github.com/apache/datafusion/pull/12269#discussion_r1740962169 ## datafusion/physical-plan/src/aggregates/group_values/row_like.rs: ## @@ -0,0 +1,392 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

Re: [I] Casting existing timestamp to timestamp again strips timezone information [datafusion]

2024-09-02 Thread via GitHub
devanbenz commented on issue #12218: URL: https://github.com/apache/datafusion/issues/12218#issuecomment-2324818646 > I wouldn't say that timestamp has UTC time. timestamp is local date/time. You can think of it as a struct with fields year/month/day/hour/minute/seconds(fractional) and it h

Re: [I] Show Comet version on startup [datafusion-comet]

2024-09-02 Thread via GitHub
andygrove closed issue #894: Show Comet version on startup URL: https://github.com/apache/datafusion-comet/issues/894

Re: [PR] chore: print Comet native version to logs after Comet is initialized [datafusion-comet]

2024-09-02 Thread via GitHub
andygrove merged PR #900: URL: https://github.com/apache/datafusion-comet/pull/900

Re: [PR] Remove unsafe Send impl from PriorityMap [datafusion]

2024-09-02 Thread via GitHub
findepi commented on code in PR #12289: URL: https://github.com/apache/datafusion/pull/12289#discussion_r1740932826 ## datafusion/physical-plan/src/aggregates/topk/priority_map.rs: ## @@ -25,17 +25,12 @@ use datafusion_common::Result; /// A `Map` / `PriorityQueue` combo that

Re: [PR] Avoid RowConverter for multi group by [datafusion]

2024-09-02 Thread via GitHub
Dandandan commented on code in PR #12269: URL: https://github.com/apache/datafusion/pull/12269#discussion_r1740912268 ## datafusion/physical-plan/src/aggregates/group_values/row_like.rs: ## @@ -0,0 +1,392 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or m

Re: [PR] Int64 as default type for make_array function empty or null case [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 commented on PR #10790: URL: https://github.com/apache/datafusion/pull/10790#issuecomment-2324738358 > Also, it's likely to affect coercion rules in the future. NULL type is coercible to any other type, but Int64 probably won't be. I think any type is coercible for Null.

[PR] feat: add extension for logical_plan_builder [datafusion]

2024-09-02 Thread via GitHub
zhuliquan opened a new pull request, #12293: URL: https://github.com/apache/datafusion/pull/12293 ## Which issue does this PR close? Closes #. ## Rationale for this change Support build extension for builder ## What changes are included in this PR? add funct

Re: [I] UNION schema depends on branch order in case of array values [datafusion]

2024-09-02 Thread via GitHub
findepi commented on issue #12292: URL: https://github.com/apache/datafusion/issues/12292#issuecomment-2324723406 So far i did not reproduce the problem for queries without arrays, so it **might** be array specific and thus related to https://github.com/apache/datafusion/issues/12291.

[I] UNION schema depends on branch order in case of array values [datafusion]

2024-09-02 Thread via GitHub
findepi opened a new issue, #12292: URL: https://github.com/apache/datafusion/issues/12292 ### Describe the bug ``` > SELECT DISTINCT arrow_typeof(x[0]) FROM (SELECT make_array(2) x UNION ALL SELECT make_array(now()) x); +---+ | arrow_typeof(x[Int64(0)

[I] Arrays of non-coercible types are coercible [datafusion]

2024-09-02 Thread via GitHub
findepi opened a new issue, #12291: URL: https://github.com/apache/datafusion/issues/12291 ### Describe the bug `Int64` and `timestamp with time zone` are not coercible: ``` > SELECT 2 x UNION ALL SELECT now() x; type_coercion caused by Error during planning: Incompatibl

Re: [PR] Minor: Add `RuntimeEnvBuilder::build_arc()` [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 commented on PR #12213: URL: https://github.com/apache/datafusion/pull/12213#issuecomment-2324712602 Thanks @alamb and @findepi

Re: [PR] Minor: Add `RuntimeEnvBuilder::build_arc()` [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 merged PR #12213: URL: https://github.com/apache/datafusion/pull/12213

Re: [PR] Minor: Add `RuntimeEnvBuilder::build_arc()` [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 commented on PR #12213: URL: https://github.com/apache/datafusion/pull/12213#issuecomment-2324711623 I also have the same thought, maybe we don't really need Arc in some cases. We could gradually cleanup them later on

Re: [PR] Int64 as default type for make_array function empty or null case [datafusion]

2024-09-02 Thread via GitHub
findepi commented on PR #10790: URL: https://github.com/apache/datafusion/pull/10790#issuecomment-2324702824 it seems counter-intuitive to me to infer Int64 where it was not provided by a user nor schema of tables a query operates on. It might be transparent to the user, in which case it's

Re: [PR] Fixes missing `nth_value` UDAF expr function [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 merged PR #12279: URL: https://github.com/apache/datafusion/pull/12279

Re: [I] NthValue UDAF removed in v41 [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 closed issue #12278: NthValue UDAF removed in v41 URL: https://github.com/apache/datafusion/issues/12278

Re: [PR] Fixes missing `nth_value` UDAF expr function [datafusion]

2024-09-02 Thread via GitHub
jayzhan211 commented on PR #12279: URL: https://github.com/apache/datafusion/pull/12279#issuecomment-2324694584 Thanks @jcsherin

Re: [PR] Move `CombinePartialFinalAggregate` rule into physical-optimizer crate [datafusion]

2024-09-02 Thread via GitHub
lewiszlw merged PR #12167: URL: https://github.com/apache/datafusion/pull/12167

[I] Cannot use flight data from C++ client with DataFusion [datafusion]

2024-09-02 Thread via GitHub
EnricoMi opened a new issue, #12290: URL: https://github.com/apache/datafusion/issues/12290 ### Describe the bug Fetching data via Apache Arrow Flight (C++, Java, Python involved) and passing them to Apache DataFusion (Rust) does not work: Memory pointer from external sourc

Re: [PR] Fix Possible Congestion Scenario in SPM [datafusion]

2024-09-02 Thread via GitHub
berkaysynnada commented on PR #12230: URL: https://github.com/apache/datafusion/pull/12230#issuecomment-2324657547 @tustvold @alamb, could you please take a look when you have time?

Re: [PR] test: check record count and types in parquet window test [datafusion]

2024-09-02 Thread via GitHub
alamb commented on code in PR #12277: URL: https://github.com/apache/datafusion/pull/12277#discussion_r1740817386 ## datafusion/sqllogictest/test_files/parquet.slt: ## @@ -251,25 +251,21 @@ SELECT COUNT(*) FROM timestamp_with_tz; 131072 -# Perform the query: -query IPT

Re: [PR] Remove unsafe Send impl from PriorityMap [datafusion]

2024-09-02 Thread via GitHub
alamb commented on code in PR #12289: URL: https://github.com/apache/datafusion/pull/12289#discussion_r1740812783 ## datafusion/physical-plan/src/aggregates/topk/priority_map.rs: ## @@ -25,17 +25,12 @@ use datafusion_common::Result; /// A `Map` / `PriorityQueue` combo that ev

Re: [PR] Extract drive-by fixes from PR 12135 for easier reviewing [datafusion]

2024-09-02 Thread via GitHub
alamb merged PR #12240: URL: https://github.com/apache/datafusion/pull/12240

Re: [PR] Support alternate format for Date32 unparsing (TEXT/SQLite) [datafusion]

2024-09-02 Thread via GitHub
alamb merged PR #12282: URL: https://github.com/apache/datafusion/pull/12282

Re: [PR] Update prost-build requirement from =0.12.6 to =0.13.2 [datafusion]

2024-09-02 Thread via GitHub
alamb closed pull request #12287: Update prost-build requirement from =0.12.6 to =0.13.2 URL: https://github.com/apache/datafusion/pull/12287

Re: [PR] Update prost-build requirement from =0.12.6 to =0.13.2 [datafusion]

2024-09-02 Thread via GitHub
alamb commented on PR #12287: URL: https://github.com/apache/datafusion/pull/12287#issuecomment-2324592341 In https://github.com/apache/datafusion/pull/12032

Re: [PR] Update prost-build requirement from =0.12.6 to =0.13.2 [datafusion]

2024-09-02 Thread via GitHub
dependabot[bot] commented on PR #12287: URL: https://github.com/apache/datafusion/pull/12287#issuecomment-2324592402 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version

Re: [PR] Update the CONCAT scalar function to support Utf8View [datafusion]

2024-09-02 Thread via GitHub
alamb commented on code in PR #12224: URL: https://github.com/apache/datafusion/pull/12224#discussion_r1740777076 ## datafusion/functions/src/string/concat.rs: ## @@ -64,13 +66,36 @@ impl ScalarUDFImpl for ConcatFunc { &self.signature } -fn return_type(&self,

Re: [PR] feat: Add projection to FilterExec [datafusion]

2024-09-02 Thread via GitHub
eejbyfeldt commented on code in PR #12281: URL: https://github.com/apache/datafusion/pull/12281#discussion_r1740799498 ## datafusion/sqllogictest/test_files/aggregate.slt: ## @@ -5772,15 +5772,14 @@ logical_plan 03)Aggregate: groupBy=[[having_test.v1, having_test.v2]], agg

Re: [I] Expose API to register a foreign TableProvider [datafusion-python]

2024-09-02 Thread via GitHub
timsaucer commented on issue #823: URL: https://github.com/apache/datafusion-python/issues/823#issuecomment-2324523635 For Table Provider, what I've been investigating is how we could do something like `register_table_provider`. But even scratching the surface of making this stable across

Re: [I] Remove special casting of `Min` / `Max` built in `AggregateFunctions` [datafusion]

2024-09-02 Thread via GitHub
edmondop commented on issue #11151: URL: https://github.com/apache/datafusion/issues/11151#issuecomment-2324520382 Thanks. I can see how we are ready for min and max. I will create a similar additional method on the trait for count; I don't think it is there yet.

[PR] Remove unsafe Send impl from PriorityMap [datafusion]

2024-09-02 Thread via GitHub
findepi opened a new pull request, #12289: URL: https://github.com/apache/datafusion/pull/12289 An unsafe `Send` impl is not necessary. It is enough to require the referenced trait objects to be `Send`.

Re: [I] making `LexRequirement` an actual struct [datafusion]

2024-09-02 Thread via GitHub
berkaysynnada commented on issue #12255: URL: https://github.com/apache/datafusion/issues/12255#issuecomment-2324501070 Makes sense to me. It would help clarify whether the expr vector corresponds to a lex ordering or multiple global orderings. If someone is interested in working on this, I

Re: [PR] Minor: Add `RuntimeEnvBuilder::build_arc()` [datafusion]

2024-09-02 Thread via GitHub
alamb commented on PR #12213: URL: https://github.com/apache/datafusion/pull/12213#issuecomment-2324488044 > I like the simplification. Is this an established naming pattern in the ecosystem? I am not aware of this being an established pattern (I think DataFusion uses Arc's a bit mor

Re: [PR] feat: Add DateFieldExtractStyle::Strftime support for SqliteDialect unparser [datafusion]

2024-09-02 Thread via GitHub
alamb commented on PR #12161: URL: https://github.com/apache/datafusion/pull/12161#issuecomment-2324484217 > Thank you @peasee this looks good to me! I can't approve the pipelines to run, but I did check this out locally and manually ran some checks (all of which passed). > > @alamb

Re: [PR] Optimize `struct` and `named_struct` functions [datafusion]

2024-09-02 Thread via GitHub
alamb commented on code in PR #11688: URL: https://github.com/apache/datafusion/pull/11688#discussion_r1740743526 ## datafusion/functions/src/core/struct.rs: ## @@ -97,48 +98,3 @@ impl ScalarUDFImpl for StructFunc { struct_expr(args) } } - -#[cfg(test)] Review Co

Re: [I] making `LexRequirement` an actual struct [datafusion]

2024-09-02 Thread via GitHub
alamb commented on issue #12255: URL: https://github.com/apache/datafusion/issues/12255#issuecomment-2324468113 FYI @ozankabak and @mustafasrepo

Re: [PR] Introduce `Signature::Coercible` [datafusion]

2024-09-02 Thread via GitHub
alamb commented on PR #12275: URL: https://github.com/apache/datafusion/pull/12275#issuecomment-2324467292 This idea looks much nicer and more general @jayzhan211
