Re: [I] Attach `Diagnostic` to "incompatible type in unary expression" error [datafusion]

2025-02-05 Thread via GitHub
alan910127 commented on issue #14433: URL: https://github.com/apache/datafusion/issues/14433#issuecomment-2639055320 Hi @eliaperantoni, would you mind sharing your rendering code snippet? It would help me verify that my implementation is correct! -- This is an automated message from the Apache G

Re: [PR] fix: order by expr rewrite fix [datafusion]

2025-02-05 Thread via GitHub
berkaysynnada commented on PR #14486: URL: https://github.com/apache/datafusion/pull/14486#issuecomment-2639043430 I checked the newly added tests and they were failing before 👍🏻

Re: [I] Add Memory Profiling Functionality [datafusion]

2025-02-05 Thread via GitHub
berkaysynnada commented on issue #14510: URL: https://github.com/apache/datafusion/issues/14510#issuecomment-2639022364 Thank you @comphead for the great elaboration! I don't have a clear answer to your question yet, but since this is the final part of our work list, I believe someone can c

Re: [I] Implement Nicer / DuckDB style explain plans [datafusion]

2025-02-05 Thread via GitHub
irenjj commented on issue #9371: URL: https://github.com/apache/datafusion/issues/9371#issuecomment-2638964832 take

Re: [I] Project Ideas for GSoC 2025 (Google Summer of Code) [datafusion]

2025-02-05 Thread via GitHub
ozankabak commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2638951722 @XiangpengHao, do you think you can take a look at how we can divide up the work of larger-than-memory aggregations into smaller tasks? If this is possible, we can make it a p

Re: [I] Project Ideas for GSoC 2025 (Google Summer of Code) [datafusion]

2025-02-05 Thread via GitHub
ozankabak commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2638944956 > Besides, there are many well-defined tasks in our SQL fuzzer https://github.com/apache/datafusion/issues/11030 and I'm interested to mentor. I'll open an issue for GSOC proj

Re: [PR] Fix Type Coercion for UDF Arguments [datafusion]

2025-02-05 Thread via GitHub
shehabgamin commented on code in PR #14268: URL: https://github.com/apache/datafusion/pull/14268#discussion_r1944122850 ## datafusion/functions/src/string/ascii.rs: ## @@ -93,6 +95,33 @@ impl ScalarUDFImpl for AsciiFunc { make_scalar_function(ascii, vec![])(args) }

[I] MSSQL parsing errors [datafusion-sqlparser-rs]

2025-02-05 Thread via GitHub
matt-deboer opened a new issue, #1709: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1709 1. The `PRINT string | @variable | str_expression` statement in MSSQL doesn't parse as a valid statement, and causes all other statement parsing to fail. I'm processing it now by findin

Re: [PR] feat: add experimental remote HDFS support for native DataFusion reader [datafusion-comet]

2025-02-05 Thread via GitHub
wForget commented on code in PR #1359: URL: https://github.com/apache/datafusion-comet/pull/1359#discussion_r1944107142 ## native/core/src/parquet/parquet_support.rs: ## @@ -1861,6 +1863,42 @@ fn trim_end(s: &str) -> &str { } } +// Default object store which is local fil

Re: [PR] perf: improve performance of update metrics [datafusion-comet]

2025-02-05 Thread via GitHub
mbutrovich commented on code in PR #1329: URL: https://github.com/apache/datafusion-comet/pull/1329#discussion_r1944106936 ## native/core/src/execution/jni_api.rs: ## @@ -233,11 +242,12 @@ pub unsafe extern "system" fn Java_org_apache_comet_Native_createPlan( strea

Re: [PR] feat: Add `array_max` function support [datafusion]

2025-02-05 Thread via GitHub
erenavsarogullari commented on code in PR #14470: URL: https://github.com/apache/datafusion/pull/14470#discussion_r1944094462 ## datafusion/functions-nested/src/max.rs: ## @@ -0,0 +1,173 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor li

[PR] feat: add hint for missing fields [datafusion]

2025-02-05 Thread via GitHub
Lordworms opened a new pull request, #14521: URL: https://github.com/apache/datafusion/pull/14521 ## Which issue does this PR close? - Closes #14466 ## Rationale for this change ## What changes are included in this PR? ## Are these changes

Re: [I] Project Ideas for GSoC 2025 (Google Summer of Code) [datafusion]

2025-02-05 Thread via GitHub
2010YOUY01 commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2638799861 > Thanks [@Rachelint](https://github.com/Rachelint) -- you can be a co-mentor if you like. > > [@2010YOUY01](https://github.com/2010YOUY01), I'm not up-to-date with our

Re: [I] [EPIC] Decouple logical from physical types [datafusion]

2025-02-05 Thread via GitHub
jayzhan211 commented on issue #12622: URL: https://github.com/apache/datafusion/issues/12622#issuecomment-2638782736 > There should be no function for creating a ScalarValue from a physical object or DataType. I think getting `Scalar` from `DataType` or `ArrayRef` makes sense, so the

Re: [I] Implement UNION ALL BY NAME [datafusion]

2025-02-05 Thread via GitHub
rkrishn7 commented on issue #14508: URL: https://github.com/apache/datafusion/issues/14508#issuecomment-2638765741 take

Re: [I] Create UNION plan node with correct schema [datafusion]

2025-02-05 Thread via GitHub
rkrishn7 commented on issue #14380: URL: https://github.com/apache/datafusion/issues/14380#issuecomment-2638760053 Hello @findepi! I'd be happy to work on this. I believe it's as simple as shifting union schema coercion from the analyzer to logical plan building. However, since

Re: [I] Apply `take_function_args` to functions validating argument count [datafusion]

2025-02-05 Thread via GitHub
lgingerich commented on issue #14516: URL: https://github.com/apache/datafusion/issues/14516#issuecomment-2638730780 take

Re: [PR] feat: Add fair unified memory pool [datafusion-comet]

2025-02-05 Thread via GitHub
wForget commented on code in PR #1369: URL: https://github.com/apache/datafusion-comet/pull/1369#discussion_r1944025359 ## native/core/src/execution/fair_memory_pool.rs: ## @@ -0,0 +1,159 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor l

Re: [PR] feat: Add fair unified memory pool [datafusion-comet]

2025-02-05 Thread via GitHub
kazuyukitanimura commented on code in PR #1369: URL: https://github.com/apache/datafusion-comet/pull/1369#discussion_r1944024038 ## native/core/src/execution/fair_memory_pool.rs: ## @@ -0,0 +1,159 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more cont

Re: [PR] feat: Add fair unified memory pool [datafusion-comet]

2025-02-05 Thread via GitHub
wForget commented on code in PR #1369: URL: https://github.com/apache/datafusion-comet/pull/1369#discussion_r1944018500 ## native/core/src/execution/fair_memory_pool.rs: ## @@ -0,0 +1,159 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor l

Re: [PR] feat: Add fair unified memory pool [datafusion-comet]

2025-02-05 Thread via GitHub
wForget commented on code in PR #1369: URL: https://github.com/apache/datafusion-comet/pull/1369#discussion_r1944011729 ## docs/source/user-guide/configs.md: ## @@ -48,7 +48,7 @@ Comet provides the following configuration settings. | spark.comet.exec.hashJoin.enabled | Whether

Re: [PR] feat: add experimental remote HDFS support for native DataFusion reader [datafusion-comet]

2025-02-05 Thread via GitHub
comphead commented on code in PR #1359: URL: https://github.com/apache/datafusion-comet/pull/1359#discussion_r1943985797 ## native/core/src/parquet/parquet_support.rs: ## @@ -1861,6 +1863,42 @@ fn trim_end(s: &str) -> &str { } } +// Default object store which is local fi

Re: [PR] feat: add experimental remote HDFS support for native DataFusion reader [datafusion-comet]

2025-02-05 Thread via GitHub
wForget commented on code in PR #1359: URL: https://github.com/apache/datafusion-comet/pull/1359#discussion_r1943975592 ## native/core/src/parquet/parquet_support.rs: ## @@ -1861,6 +1863,42 @@ fn trim_end(s: &str) -> &str { } } +// Default object store which is local fil

Re: [PR] Replace is_sorted helper with standard one. [datafusion]

2025-02-05 Thread via GitHub
github-actions[bot] closed pull request #13608: Replace is_sorted helper with standard one. URL: https://github.com/apache/datafusion/pull/13608

Re: [PR] perf: improve performance of update metrics [datafusion-comet]

2025-02-05 Thread via GitHub
wForget commented on code in PR #1329: URL: https://github.com/apache/datafusion-comet/pull/1329#discussion_r1943950610 ## native/core/src/execution/metrics/utils.rs: ## @@ -55,60 +64,21 @@ pub fn update_comet_metric( Some(metrics.aggregate_by_name()) }; -upd

Re: [PR] disable coercison for unmatched struct type [datafusion]

2025-02-05 Thread via GitHub
jayzhan211 commented on code in PR #14409: URL: https://github.com/apache/datafusion/pull/14409#discussion_r1943937054 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -595,7 +595,7 @@ fn type_union_resolution_coercion( /// Handle type union resolution including s

Re: [I] Feb 4, 2025: This week(s) in DataFusion [datafusion]

2025-02-05 Thread via GitHub
comphead commented on issue #14491: URL: https://github.com/apache/datafusion/issues/14491#issuecomment-2638321685 > Turns out there is a Rust NYC event about DataFusion on Feb 11: https://www.meetup.com/rust-nyc/events/306004489 Interesting I can not see Empathic as DataFusion users

Re: [I] Feb 4, 2025: This week(s) in DataFusion [datafusion]

2025-02-05 Thread via GitHub
alamb commented on issue #14491: URL: https://github.com/apache/datafusion/issues/14491#issuecomment-2638318598 Turns out there is a Rust NYC event about DataFusion on Feb 11: https://www.meetup.com/rust-nyc/events/306004489

Re: [I] Add example to spark-expr crate [datafusion-comet]

2025-02-05 Thread via GitHub
viczsaurav commented on issue #1365: URL: https://github.com/apache/datafusion-comet/issues/1365#issuecomment-2638275943 I will give it a try

Re: [I] [Epic] Split datasources out from `datafusion` crate (`datafusion/core`) [datafusion]

2025-02-05 Thread via GitHub
davisp commented on issue #1: URL: https://github.com/apache/datafusion/issues/1#issuecomment-2638265550 @alamb Can you point me at whatever tool you used to generate those compile timing graphs? Those look like something I'd absolutely adopt in a bunch of projects.

Re: [PR] Example for using a separate threadpool for CPU bound work (try 2) [datafusion]

2025-02-05 Thread via GitHub
djanderson commented on code in PR #14286: URL: https://github.com/apache/datafusion/pull/14286#discussion_r1943824052 ## datafusion-examples/examples/thread_pools.rs: ## @@ -0,0 +1,238 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lic

Re: [PR] refactor: switch BooleanBufferBuilder to NullBufferBuilder in MaybeNullBufferBuilder [datafusion]

2025-02-05 Thread via GitHub
alamb commented on PR #14504: URL: https://github.com/apache/datafusion/pull/14504#issuecomment-2638241145 Looks like there are some CI failures so marking this PR as a draft

Re: [I] Add example to spark-expr crate [datafusion-comet]

2025-02-05 Thread via GitHub
viczsaurav commented on issue #1365: URL: https://github.com/apache/datafusion-comet/issues/1365#issuecomment-2638248696 I am working on this

Re: [I] [Epic] Split datasources out from `datafusion` crate (`datafusion/core`) [datafusion]

2025-02-05 Thread via GitHub
davisp commented on issue #1: URL: https://github.com/apache/datafusion/issues/1#issuecomment-2638247527 Whoa, this is awesome! I'm still ramping up on learning DataFusion internals as I add my own extensions and one thing that's been nagging at me is that it almost feels like

Re: [PR] Fix config_namespace macro symbol usage [datafusion]

2025-02-05 Thread via GitHub
alamb commented on code in PR #14520: URL: https://github.com/apache/datafusion/pull/14520#discussion_r1943799473 ## datafusion/common/src/config.rs: ## @@ -2093,3 +2094,22 @@ mod tests { assert_eq!(parsed_metadata.get("key_dupe"), Some(&Some("B".into(; } } +

Re: [PR] Fix config_namespace macro symbol usage [datafusion]

2025-02-05 Thread via GitHub
rkrishn7 commented on code in PR #14520: URL: https://github.com/apache/datafusion/pull/14520#discussion_r1943795899 ## datafusion/common/src/config.rs: ## @@ -2093,3 +2094,22 @@ mod tests { assert_eq!(parsed_metadata.get("key_dupe"), Some(&Some("B".into(; } }

Re: [PR] Relax physical schema validation [datafusion]

2025-02-05 Thread via GitHub
davisp commented on code in PR #14519: URL: https://github.com/apache/datafusion/pull/14519#discussion_r1943793782 ## datafusion/core/src/schema_equivalence.rs: ## @@ -0,0 +1,84 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agr

Re: [I] Project Ideas for GSoC 2025 [datafusion]

2025-02-05 Thread via GitHub
alamb commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2638232484 Another possible project that would be neat would be to implement variant support in parquet and DataFusion. There is not a ticket yet in DataFusion that I know of but there

Re: [PR] Add `DataFusionError::Collection` to return multiple `DataFusionError`s [datafusion]

2025-02-05 Thread via GitHub
eliaperantoni commented on code in PR #14439: URL: https://github.com/apache/datafusion/pull/14439#discussion_r1943780039 ## datafusion/common/src/error.rs: ## @@ -425,29 +442,38 @@ impl DataFusionError { "".to_owned() } -fn error_prefix(&self) -> &'static st

Re: [PR] Add `DataFusionError::Collection` to return multiple `DataFusionError`s [datafusion]

2025-02-05 Thread via GitHub
eliaperantoni commented on code in PR #14439: URL: https://github.com/apache/datafusion/pull/14439#discussion_r1943779212 ## datafusion/sql/src/select.rs: ## @@ -574,10 +574,19 @@ impl SqlToRel<'_, S> { empty_from: bool, planner_context: &mut PlannerContext,

Re: [PR] Add `DataFusionError::Collection` to return multiple `DataFusionError`s [datafusion]

2025-02-05 Thread via GitHub
eliaperantoni commented on code in PR #14439: URL: https://github.com/apache/datafusion/pull/14439#discussion_r1943778591 ## datafusion/common/src/error.rs: ## @@ -540,6 +570,37 @@ impl DataFusionError { DiagnosticsIterator { head: self }.next() } + +/// Some

[PR] Fix config_namespace macro symbol usage [datafusion]

2025-02-05 Thread via GitHub
davisp opened a new pull request, #14520: URL: https://github.com/apache/datafusion/pull/14520 The `config_namespace` macro was relying on a few symbols being properly imported before it is used. This removes that need by referring to the symbols directly with the `$crate` prefix. ## W
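
A minimal sketch of the `$crate` technique the PR description mentions, not the actual `config_namespace!` code. The `common::Result` alias, the `parse_setting!` macro, and the `user_code` module are hypothetical names made up for illustration. Because paths in `macro_rules!` expansions are resolved at the call site, a bare `Result<u64>` would only compile where the caller had imported the alias; spelling the path as `$crate::common::Result` removes that requirement.

```rust
pub mod common {
    /// Hypothetical crate-level alias, standing in for a library's own `Result` type.
    pub type Result<T> = std::result::Result<T, String>;
}

/// Hypothetical macro (an assumption for illustration, not `config_namespace!`).
/// It generates a parsing function whose return type is named through `$crate`,
/// so the expansion does not depend on what the caller has imported.
macro_rules! parse_setting {
    ($name:ident) => {
        pub fn $name(raw: &str) -> $crate::common::Result<u64> {
            raw.trim()
                .parse::<u64>()
                .map_err(|e| format!("invalid {}: {}", stringify!($name), e))
        }
    };
}

pub mod user_code {
    // Deliberately no `use crate::common::Result;` in this module:
    // the macro expansion still compiles because it uses `$crate::common::Result`.
    parse_setting!(batch_size);
}
```

Calling `user_code::batch_size(" 8192 ")` yields `Ok(8192)`, while a non-numeric input yields an `Err` carrying the setting name.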

Re: [I] The `config_namespace!` macro requires `datafusion::common::Result` to be in scope [datafusion]

2025-02-05 Thread via GitHub
davisp commented on issue #14518: URL: https://github.com/apache/datafusion/issues/14518#issuecomment-2638176310 @rkrishn7 Awesome, I'll open a PR in a few minutes. Thanks for the reminder on `$crate` being a thing. @logan-keede I'll add it back, no problem! Also that's not the cause

Re: [I] The `config_namespace!` macro requires `datafusion::common::Result` to be in scope [datafusion]

2025-02-05 Thread via GitHub
logan-keede commented on issue #14518: URL: https://github.com/apache/datafusion/issues/14518#issuecomment-2638152810 you are right, removing `#[macro_export]` does not seem to be related to the issue I was trying to solve. It was a miss on my part. Perhaps @davisp can fix that, while sol

Re: [PR] Validate and unpack function arguments tersely [datafusion]

2025-02-05 Thread via GitHub
findepi merged PR #14513: URL: https://github.com/apache/datafusion/pull/14513

Re: [PR] Relax physical schema validation [datafusion]

2025-02-05 Thread via GitHub
findepi commented on PR #14519: URL: https://github.com/apache/datafusion/pull/14519#issuecomment-2638113219 cc @eejbyfeldt @tv42 @comphead @jayzhan211

Re: [I] nullable Expr being constant fold to value can cause schema change and internal error [datafusion]

2025-02-05 Thread via GitHub
findepi commented on issue #13190: URL: https://github.com/apache/datafusion/issues/13190#issuecomment-2638111301 maybe we just change the check code - https://github.com/apache/datafusion/pull/14519

[PR] Relax physical schema validation [datafusion]

2025-02-05 Thread via GitHub
findepi opened a new pull request, #14519: URL: https://github.com/apache/datafusion/pull/14519 Physical plan can be further optimized. In particular, an expression can be determined as never null even if it wasn't known at the time of logical planning. Thus, the final schema check needs to

Re: [PR] Feat: support array_except function [datafusion-comet]

2025-02-05 Thread via GitHub
kazuyukitanimura commented on code in PR #1343: URL: https://github.com/apache/datafusion-comet/pull/1343#discussion_r1943711444 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -2387,6 +2387,8 @@ object QueryPlanSerde extends Logging with ShimQueryPla

Re: [I] nullable Expr being constant fold to value can cause schema change and internal error [datafusion]

2025-02-05 Thread via GitHub
tv42 commented on issue #13190: URL: https://github.com/apache/datafusion/issues/13190#issuecomment-2638095485 And a debugging hint for others, because this might not be a Datafusion bug in all cases: I refactored my TableProvider and forgot to apply the given Projection to the schem

Re: [PR] feat: Add fair unified memory pool [datafusion-comet]

2025-02-05 Thread via GitHub
kazuyukitanimura commented on PR #1369: URL: https://github.com/apache/datafusion-comet/pull/1369#issuecomment-2638088720 @andygrove @comphead @himadripal

Re: [I] The `config_namespace!` macro requires `datafusion::common::Result` to be in scope [datafusion]

2025-02-05 Thread via GitHub
rkrishn7 commented on issue #14518: URL: https://github.com/apache/datafusion/issues/14518#issuecomment-2638052626 Your solution makes sense to me @davisp! Consider using the [`$crate`](https://doc.rust-lang.org/reference/macros-by-example.html#r-macro.decl.hygiene.crate) metavariable withi

Re: [I] nullable Expr being constant fold to value can cause schema change and internal error [datafusion]

2025-02-05 Thread via GitHub
tv42 commented on issue #13190: URL: https://github.com/apache/datafusion/issues/13190#issuecomment-2638031512 For others stumbling on this, you can disable this check with ```rust let options = session_config.options_mut(); options.execution.skip_physical_aggrega

Re: [PR] fix: Capture nullability in `Values` node planning [datafusion]

2025-02-05 Thread via GitHub
rkrishn7 commented on code in PR #14472: URL: https://github.com/apache/datafusion/pull/14472#discussion_r1943660946 ## datafusion/expr/src/logical_plan/builder.rs: ## @@ -296,12 +298,14 @@ impl LogicalPlanBuilder { field_types.push(common_type.unwrap_or(DataType::N

[PR] Fix incorrect parsing of JsonAccess bracket notation after cast in Snowflae [datafusion-sqlparser-rs]

2025-02-05 Thread via GitHub
yoavcloud opened a new pull request, #1708: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1708 Some dialects support a size parameter for array type definitions, using bracket notation. For example: `INT[100]`. For other dialects, the bracket notation should be parsed as JsonA

Re: [I] Redundancy/Repeated calls in query function [datafusion]

2025-02-05 Thread via GitHub
TheBitsmith commented on issue #14448: URL: https://github.com/apache/datafusion/issues/14448#issuecomment-2637972090 Take

Re: [PR] Improve Unparser (scalar_to_sql) to respect dialect timestamp type overrides [datafusion]

2025-02-05 Thread via GitHub
sgrebnov commented on PR #14407: URL: https://github.com/apache/datafusion/pull/14407#issuecomment-2637962630 Thank you for review and merge @alamb

Re: [PR] Add suppport for Show Objects statement for the Snowflake parser [datafusion-sqlparser-rs]

2025-02-05 Thread via GitHub
iffyio commented on code in PR #1702: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1702#discussion_r1943620318 ## src/dialect/snowflake.rs: ## @@ -182,6 +183,15 @@ impl Dialect for SnowflakeDialect { return Some(parse_file_staging_command(kw, parser)

Re: [PR] Introduce unified `DataSourceExec` for provided datasources, remove `ParquetExec`, `CsvExec`, etc [datafusion]

2025-02-05 Thread via GitHub
ozankabak commented on PR #14224: URL: https://github.com/apache/datafusion/pull/14224#issuecomment-2637940645 @alamb -- great, we can incorporate your three suggestions before merging. Keep them coming if anything else comes up. @mertak-synnada, let's address these (and anything else that

Re: [PR] Introduce unified `DataSourceExec` for provided datasources, remove `ParquetExec`, `CsvExec`, etc [datafusion]

2025-02-05 Thread via GitHub
ozankabak commented on code in PR #14224: URL: https://github.com/apache/datafusion/pull/14224#discussion_r1943608331 ## datafusion/core/src/datasource/data_source.rs: ## @@ -0,0 +1,264 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lic

Re: [PR] test: [DO NOT MERGE] [datafusion-comet]

2025-02-05 Thread via GitHub
codecov-commenter commented on PR #1370: URL: https://github.com/apache/datafusion-comet/pull/1370#issuecomment-2637922024 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1370?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [I] Project Ideas for GSoC 2025 [datafusion]

2025-02-05 Thread via GitHub
Rachelint commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2637873769 > Thanks [@Rachelint](https://github.com/Rachelint) -- you can be a co-mentor if you like. > > [@2010YOUY01](https://github.com/2010YOUY01), I'm not up-to-date with our

Re: [I] Project Ideas for GSoC 2025 [datafusion]

2025-02-05 Thread via GitHub
ozankabak commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2637881796 > Thanks for bringing me here [@alamb](https://github.com/alamb) . I'm also happy to help mentoring students on performance related projects (although myself also seems to qua

Re: [PR] Require space after -- to start single line comment in MySQL [datafusion-sqlparser-rs]

2025-02-05 Thread via GitHub
iffyio merged PR #1705: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1705

Re: [PR] test: [DO NOT MERGE] [datafusion-comet]

2025-02-05 Thread via GitHub
kazuyukitanimura closed pull request #1370: test: [DO NOT MERGE] URL: https://github.com/apache/datafusion-comet/pull/1370

Re: [I] Project Ideas for GSoC 2025 [datafusion]

2025-02-05 Thread via GitHub
ozankabak commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2637864881 Thanks @Rachelint -- you can be a co-mentor if you like. @2010YOUY01, I'm not up-to-date with our status on larger-than-memory aggregation. If you think you can enrich t

Re: [I] Project Ideas for GSoC 2025 [datafusion]

2025-02-05 Thread via GitHub
XiangpengHao commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2637865928 Thanks for bringing me here @alamb . I'm also happy to help mentoring students on performance related projects (although myself also seems to qualify for GSoC lol), larger

[I] The `config_namespace!` macro requires `datafusion::common::Result` to be in scope [datafusion]

2025-02-05 Thread via GitHub
davisp opened a new issue, #14518: URL: https://github.com/apache/datafusion/issues/14518 ### Describe the bug Just got caught by this. If you don't have `datafusion::common::Result` in scope as `Result` when using `config_namespace!`, rust-analyzer returns diagnostics for `enum take

Re: [PR] Feature Unifying source execution plans [datafusion]

2025-02-05 Thread via GitHub
alamb commented on PR #14224: URL: https://github.com/apache/datafusion/pull/14224#issuecomment-2637849585 > (I am trying to upgrade InfluxDB IOx to use this PR. I will report back later today on its impact) To be clear, while this is definitely selfish for us at Influx, I also think

Re: [I] Project Ideas for GSoC 2025 [datafusion]

2025-02-05 Thread via GitHub
Rachelint commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2637844385 @ozankabak Willing to offer help but may not have enough bandwidth as main mentor (busy next few months finding a new job...) Agree with @alamb , in-memory aggregatio

Re: [PR] Parse Postgres VARBIT datatype [datafusion-sqlparser-rs]

2025-02-05 Thread via GitHub
iffyio commented on code in PR #1703: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1703#discussion_r1943551438 ## src/parser/mod.rs: ## @@ -8720,6 +8720,7 @@ impl<'a> Parser<'a> { Ok(DataType::Bit(self.parse_optional_precision()?))

Re: [PR] Parse Snowflake COPY INTO [datafusion-sqlparser-rs]

2025-02-05 Thread via GitHub
iffyio merged PR #1669: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1669

[PR] [DO NOT MERGE] Test [datafusion-comet]

2025-02-05 Thread via GitHub
kazuyukitanimura opened a new pull request, #1370: URL: https://github.com/apache/datafusion-comet/pull/1370 (no comment)

Re: [PR] 14044/enhancement/add xxhash algorithms in expression api [datafusion]

2025-02-05 Thread via GitHub
Spaarsh commented on PR #14367: URL: https://github.com/apache/datafusion/pull/14367#issuecomment-2637820853 @Omega359 I had not pushed the updated cargo.lock file, hence the tests were failing. I think now this PR is ready for review. The hash.stl file has been added as well. There is a sm

Re: [PR] feat: Add fair unified memory pool [datafusion-comet]

2025-02-05 Thread via GitHub
codecov-commenter commented on PR #1369: URL: https://github.com/apache/datafusion-comet/pull/1369#issuecomment-2637804842 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1369?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [I] Project Ideas for GSoC 2025 [datafusion]

2025-02-05 Thread via GitHub
ozankabak commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2637764808 @andygrove, would you be willing to mentor a student if they choose to work on the Spark-compatible functions crate?

Re: [PR] expose write options [datafusion-python]

2025-02-05 Thread via GitHub
kylebarron commented on code in PR #1006: URL: https://github.com/apache/datafusion-python/pull/1006#discussion_r1943493631 ## src/options.rs: ## @@ -0,0 +1,74 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

Re: [I] Comet can produce different results to Spark when averaging a decimal [datafusion-comet]

2025-02-05 Thread via GitHub
andygrove commented on issue #1354: URL: https://github.com/apache/datafusion-comet/issues/1354#issuecomment-2637709281 We should port the logic from `org.apache.spark.sql.types.Decimal#changePrecision` to resolve this.

Re: [PR] Replacing `SessionState` with `Session` and progress towards moving `FileFormatFactory` out of `datasource` [datafusion]

2025-02-05 Thread via GitHub
logan-keede commented on PR #14517: URL: https://github.com/apache/datafusion/pull/14517#issuecomment-2637704059 @alamb do you think this is correct approach to make progress? PS: I am working on fixing CI

Re: [I] Comet can produce different results to Spark when averaging a decimal [datafusion-comet]

2025-02-05 Thread via GitHub
andygrove commented on issue #1354: URL: https://github.com/apache/datafusion-comet/issues/1354#issuecomment-2637700758 The issue is with the cast of the avg float64 value `0.5153125`. Spark rounds up to `0.515313` and Comet rounds down (or truncates) to `0.515312`.
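
The one-digit difference described above can be reproduced with a small sketch. This is an illustration only, not Comet's code and not a port of Spark's `Decimal#changePrecision`. It assumes the value is held as an integer unscaled decimal (`5153125` at scale 7) rather than as a float64, so that the comparison is exact; the function names are hypothetical.

```rust
/// Drop the last decimal digit by truncating toward zero.
/// (Hypothetical helper, used only to illustrate the truncating behaviour.)
pub fn rescale_truncate(unscaled: i128) -> i128 {
    unscaled / 10
}

/// Drop the last decimal digit, rounding half away from zero ("half up").
/// (Hypothetical helper, used only to illustrate the rounding behaviour.)
pub fn rescale_half_up(unscaled: i128) -> i128 {
    let quotient = unscaled / 10; // integer division truncates toward zero
    let remainder = unscaled % 10; // carries the sign of `unscaled`
    if remainder.abs() >= 5 {
        quotient + unscaled.signum()
    } else {
        quotient
    }
}
```

With the unscaled value `5153125` (that is, `0.5153125` at scale 7), `rescale_truncate` gives `515312` and `rescale_half_up` gives `515313`, matching the `0.515312` and `0.515313` results quoted in the comment.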

Re: [I] Project Ideas for GSoC 2025 [datafusion]

2025-02-05 Thread via GitHub
ozankabak commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2637692590 @alamb, thank you for sharing your thoughts. I added a difficulty column to guide students as they browse the projects. I agree that the "Advanced" level projects are either f

Re: [PR] Replacing `SessionState` with `Session` and progress towards moving `FileFormatFactory` out of `datasource` [datafusion]

2025-02-05 Thread via GitHub
logan-keede commented on code in PR #14517: URL: https://github.com/apache/datafusion/pull/14517#discussion_r1943450081 ## datafusion/catalog-listing/src/file_scan_config.rs: ## @@ -0,0 +1,1616 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[PR] Replacing `SessionState` with `Session` and progress towards moving `FileFormatFactory` out of `datasource` [datafusion]

2025-02-05 Thread via GitHub
logan-keede opened a new pull request, #14517: URL: https://github.com/apache/datafusion/pull/14517 ## Which issue does this PR close? - part of https://github.com/apache/datafusion/issues/14462 - Part of https://github.com/apache/datafusion/issues/1. ## Rationale

Re: [PR] Feature Unifying source execution plans [datafusion]

2025-02-05 Thread via GitHub
alamb commented on PR #14224: URL: https://github.com/apache/datafusion/pull/14224#issuecomment-2637667253 I am trying to upgrade InfluxDB IOx to use this PR. I will report back later today on its impact -- This is an automated message from the Apache Git Service. To respond to the messag

Re: [I] Comet can produce different results to Spark when averaging a decimal [datafusion-comet]

2025-02-05 Thread via GitHub
andygrove commented on issue #1354: URL: https://github.com/apache/datafusion-comet/issues/1354#issuecomment-2637651966 It turns out that Comet isn't using its `AvgDecimal` expression but instead it is using the `Avg` expression and is operating on an `UnscaledValue` which represents the

[PR] feat: Add fair unified memory pool [datafusion-comet]

2025-02-05 Thread via GitHub
kazuyukitanimura opened a new pull request, #1369: URL: https://github.com/apache/datafusion-comet/pull/1369 ## Which issue does this PR close? ## Rationale for this change Current Comet unified memory pool is a greedy pool. One thread (consumer) can take a large amount of memo

Re: [PR] 14044/enhancement/add xxhash algorithms in expression api [datafusion]

2025-02-05 Thread via GitHub
Omega359 commented on PR #14367: URL: https://github.com/apache/datafusion/pull/14367#issuecomment-2637647464 > > Beyond the comments I've added it would be very welcome to include .slt tests for these hash functions. Search for md5 or sha256 for examples of current tests. > > @Omega

Re: [PR] 14044/enhancement/add xxhash algorithms in expression api [datafusion]

2025-02-05 Thread via GitHub
Spaarsh commented on PR #14367: URL: https://github.com/apache/datafusion/pull/14367#issuecomment-2637627050 > Beyond the comments I've added it would be very welcome to include .slt tests for these hash functions. Search for md5 or sha256 for examples of current tests. @Omega359 I f

Re: [PR] feat: Add `datafusion-spark` crate [datafusion]

2025-02-05 Thread via GitHub
andygrove commented on PR #14392: URL: https://github.com/apache/datafusion/pull/14392#issuecomment-2637613868 I filed an issue in Comet to do the necessary work (testing and examples) to prepare to move the crate into DataFusion repo. https://github.com/apache/datafusion-comet/issues

Re: [PR] Add `DataFusionError::Collection` to return multiple `DataFusionError`s [datafusion]

2025-02-05 Thread via GitHub
alamb commented on code in PR #14439: URL: https://github.com/apache/datafusion/pull/14439#discussion_r1943361006 ## datafusion/common/src/error.rs: ## @@ -334,6 +343,14 @@ impl Error for DataFusionError { DataFusionError::Context(_, e) => Some(e.as_ref()),

Re: [I] Buildable release builds [datafusion]

2025-02-05 Thread via GitHub
findepi commented on issue #14479: URL: https://github.com/apache/datafusion/issues/14479#issuecomment-2637572096 Let's do this. We upgrade the compiler version manually anyway, so we can update one more place, I believe. -- This is an automated message from the Apache Git Service. To respond

Re: [PR] Validate and unpack function arguments tersely [datafusion]

2025-02-05 Thread via GitHub
findepi commented on PR #14513: URL: https://github.com/apache/datafusion/pull/14513#issuecomment-2637552879 thank you @alamb @mbrobbel @comphead for your reviews! > BTW I bet others would love a chance to clean up the code using this function if we filed a ticket for them to do so.

[I] Apply `take_function_args` to functions validating argument count [datafusion]

2025-02-05 Thread via GitHub
findepi opened a new issue, #14516: URL: https://github.com/apache/datafusion/issues/14516 https://github.com/apache/datafusion/pull/14513 introduced a helper to make code more readable and error messages more consistent. apply the pattern in other functions **_candidate_** places
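
As a rough illustration of the pattern this issue refers to (validating an argument count and unpacking the arguments in one step), here is a self-contained sketch. It is an assumption-laden stand-in, not the helper from the linked PR: the name `take_args`, the `Vec<T>` input, the `String` error type, and the message wording are all hypothetical.

```rust
use std::convert::TryFrom;

/// Check that exactly `N` arguments were supplied and return them as a
/// fixed-size array, so callers can destructure instead of indexing.
/// Hypothetical stand-in, not the DataFusion helper itself.
fn take_args<const N: usize, T>(function_name: &str, args: Vec<T>) -> Result<[T; N], String> {
    let got = args.len();
    // `TryFrom<Vec<T>> for [T; N]` succeeds only when the length is exactly N.
    <[T; N]>::try_from(args).map_err(|_| {
        format!("{function_name} function requires {N} argument(s), got {got}")
    })
}
```

A caller would write something like `let [a, b, c] = take_args::<3, _>("nvl2", args)?;`, which replaces a manual length check followed by `args[0]`, `args[1]`, `args[2]` and produces the same error text for every function that uses it.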

Re: [PR] Support Limit pushdown for `MemoryExec` [datafusion]

2025-02-05 Thread via GitHub
alamb commented on PR #14502: URL: https://github.com/apache/datafusion/pull/14502#issuecomment-2637549139 > @alamb, this is subsumed by the unified source PR. Let's only merge this if we encounter an issue that will delay the other one, but let's hold off otherwise to avoid conflicts

Re: [PR] fix: Capture nullability in `Values` node planning [datafusion]

2025-02-05 Thread via GitHub
alamb commented on PR #14472: URL: https://github.com/apache/datafusion/pull/14472#issuecomment-2637546836 I merged up from main to get a clean CI run -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] Fix: Avoid recursive external error wrapping [datafusion]

2025-02-05 Thread via GitHub
alamb commented on PR #14371: URL: https://github.com/apache/datafusion/pull/14371#issuecomment-2637543568 Marking as an API change as this will result in downstream users having to handle new DataFusionError variants (which is fine, I just want to correctly capture the idea) -- This i

Re: [PR] Validate and unpack function arguments tersely [datafusion]

2025-02-05 Thread via GitHub
findepi commented on code in PR #14513: URL: https://github.com/apache/datafusion/pull/14513#discussion_r1943355544 ## datafusion/functions/src/core/nvl2.rs: ## @@ -104,27 +105,19 @@ impl ScalarUDFImpl for NVL2Func { } fn coerce_types(&self, arg_types: &[DataType]) -

Re: [PR] Support returning multiple `DataFusionError`s [datafusion]

2025-02-05 Thread via GitHub
alamb commented on PR #14439: URL: https://github.com/apache/datafusion/pull/14439#issuecomment-2637541478 Marking as an API change as the new enum will need to be handled by downstream users -- This is an automated message from the Apache Git Service. To respond to the message, please lo
