Re: [PR] Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. Introduce signature that do non-comparison coercion [datafusion]

2024-05-01 Thread via GitHub
jayzhan211 commented on code in PR #10268: URL: https://github.com/apache/datafusion/pull/10268#discussion_r1586956681 ## datafusion/expr/src/type_coercion/functions.rs: ## @@ -293,6 +314,32 @@ fn maybe_data_types( Some(new_type) } +/// Check if the current argument

Re: [PR] Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. Introduce signature that do non-comparison coercion [datafusion]

2024-05-01 Thread via GitHub
jayzhan211 commented on code in PR #10268: URL: https://github.com/apache/datafusion/pull/10268#discussion_r1587073299 ## datafusion/expr/src/type_coercion/functions.rs: ## @@ -53,20 +54,31 @@ pub fn data_types( } } -let valid_types =

Re: [PR] Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. Introduce signature that do non-comparison coercion [datafusion]

2024-05-01 Thread via GitHub
jayzhan211 commented on code in PR #10268: URL: https://github.com/apache/datafusion/pull/10268#discussion_r1587073726 ## datafusion/expr/src/type_coercion/functions.rs: ## @@ -53,20 +54,31 @@ pub fn data_types( } } -let valid_types =

[PR] feat: Switch to use Rust stable by default [datafusion-comet]

2024-05-01 Thread via GitHub
sunchao opened a new pull request, #373: URL: https://github.com/apache/datafusion-comet/pull/373 ## Which issue does this PR close? Closes #. ## Rationale for this change There is little reason for Comet to use nightly at the moment. The only feature we

Re: [I] Making Comet Common Module Engine Independent [datafusion-comet]

2024-05-01 Thread via GitHub
sunchao commented on issue #329: URL: https://github.com/apache/datafusion-comet/issues/329#issuecomment-2089524991 The original purpose of `comet-common` module is to make it engine-agnostic so it can be used for other use cases like Iceberg. Unfortunately we didn't have time to make it

Re: [PR] feat(7181): cascading loser tree merges [datafusion]

2024-05-01 Thread via GitHub
wiedld commented on PR #7379: URL: https://github.com/apache/datafusion/pull/7379#issuecomment-2089478253 Working on other things. If/when we circle back, we'll be recreating differently. -- This is an automated message from the Apache Git Service. To respond to the message, please log

Re: [PR] feat(7181): cascading loser tree merges [datafusion]

2024-05-01 Thread via GitHub
wiedld closed pull request #7379: feat(7181): cascading loser tree merges URL: https://github.com/apache/datafusion/pull/7379

Re: [PR] Remove ScalarFunctionDefinition [datafusion]

2024-05-01 Thread via GitHub
jayzhan211 commented on code in PR #10325: URL: https://github.com/apache/datafusion/pull/10325#discussion_r1587001540 ## datafusion/expr/src/expr_schema.rs: ## @@ -138,25 +138,21 @@ impl ExprSchemable for Expr { .iter() .map(|e|

Re: [PR] Minor: add a few more dictionary unwrap tests [datafusion]

2024-05-01 Thread via GitHub
jayzhan211 merged PR #10335: URL: https://github.com/apache/datafusion/pull/10335

Re: [PR] Minor: reuse the group key row buffer to avoid reallocation [datafusion]

2024-05-01 Thread via GitHub
github-actions[bot] commented on PR #7426: URL: https://github.com/apache/datafusion/pull/7426#issuecomment-2089392509 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] feat(7181): cascading loser tree merges [datafusion]

2024-05-01 Thread via GitHub
github-actions[bot] commented on PR #7379: URL: https://github.com/apache/datafusion/pull/7379#issuecomment-2089392557 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] Use make_array to handle SQLExpr::Array. [datafusion]

2024-05-01 Thread via GitHub
github-actions[bot] commented on PR #7427: URL: https://github.com/apache/datafusion/pull/7427#issuecomment-2089392469 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [I] Prepare Spark 4.0 shims [datafusion-comet]

2024-05-01 Thread via GitHub
kazuyukitanimura commented on issue #372: URL: https://github.com/apache/datafusion-comet/issues/372#issuecomment-2089389739 I am on it

Re: [PR] Add `SessionContext`/`SessionState::create_physical_expr()` to create `PhysicalExpressions` from `Expr`s [datafusion]

2024-05-01 Thread via GitHub
phillipleblanc commented on code in PR #10330: URL: https://github.com/apache/datafusion/pull/10330#discussion_r1586971614 ## datafusion/common/src/dfschema.rs: ## @@ -806,6 +820,12 @@ impl From<> for Schema { } } +impl AsRef for DFSchema { Review Comment: Very

[PR] Introducing repartitioning outside [datafusion]

2024-05-01 Thread via GitHub
edmondop opened a new pull request, #10338: URL: https://github.com/apache/datafusion/pull/10338 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tested?

Re: [PR] Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. Introduce signature that do non-comparison coercion [datafusion]

2024-05-01 Thread via GitHub
jayzhan211 commented on code in PR #10268: URL: https://github.com/apache/datafusion/pull/10268#discussion_r1586960617 ## datafusion/expr/src/signature.rs: ## @@ -92,14 +92,22 @@ pub enum TypeSignature { /// A function such as `concat` is `Variadic(vec![DataType::Utf8,

Re: [PR] Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. Introduce signature that do non-comparison coercion [datafusion]

2024-05-01 Thread via GitHub
jayzhan211 commented on code in PR #10268: URL: https://github.com/apache/datafusion/pull/10268#discussion_r1586955192 ## datafusion/expr/src/signature.rs: ## @@ -92,14 +92,22 @@ pub enum TypeSignature { /// A function such as `concat` is `Variadic(vec![DataType::Utf8,

Re: [PR] Minor: Add additional coalesce tests [datafusion]

2024-05-01 Thread via GitHub
jayzhan211 merged PR #10334: URL: https://github.com/apache/datafusion/pull/10334

Re: [I] Move `create_physical_expr` to `physical-expr-common` [datafusion]

2024-05-01 Thread via GitHub
jayzhan211 commented on issue #10074: URL: https://github.com/apache/datafusion/issues/10074#issuecomment-2089325105 @alamb First of all, we plan to pull the aggregate function out from `datafusion-physical-expr`. And, we also split `datafusion-physical-common` from

Re: [PR] Fix: Sort Merge Join crashes on TPCH Q21 [datafusion]

2024-05-01 Thread via GitHub
comphead commented on PR #10304: URL: https://github.com/apache/datafusion/pull/10304#issuecomment-2089299005 Q21 contains both LeftSemi and LeftAnti joins. Hopefully LeftSemi works now, solving LeftAnti

Re: [PR] Fix `coalesce`, `struct` and `named_strct` expr_fn function to take multiple arguments [datafusion]

2024-05-01 Thread via GitHub
jayzhan211 commented on code in PR #10321: URL: https://github.com/apache/datafusion/pull/10321#discussion_r1586925126 ## datafusion/functions/src/core/mod.rs: ## @@ -39,14 +42,68 @@ make_udf_function!(getfield::GetFieldFunc, GET_FIELD, get_field);

Re: [PR] Fix `coalesce`, `struct` and `named_strct` expr_fn function to take multiple arguments [datafusion]

2024-05-01 Thread via GitHub
Omega359 commented on code in PR #10321: URL: https://github.com/apache/datafusion/pull/10321#discussion_r1586921438 ## datafusion/functions/src/core/mod.rs: ## @@ -39,14 +42,68 @@ make_udf_function!(getfield::GetFieldFunc, GET_FIELD, get_field);

[I] Avoid inlining non deterministic CTE [datafusion]

2024-05-01 Thread via GitHub
tgujar opened a new issue, #10337: URL: https://github.com/apache/datafusion/issues/10337 ### Describe the bug Currently DataFusion inlines all CTEs, so a non-deterministic expression can be executed multiple times, producing different results ### To Reproduce

Re: [PR] feat: Implement Spark-compatible CAST between integer types [datafusion-comet]

2024-05-01 Thread via GitHub
andygrove commented on code in PR #340: URL: https://github.com/apache/datafusion-comet/pull/340#discussion_r1586886509 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -807,11 +833,22 @@ class CometCastSuite extends CometTestBase with

Re: [PR] Stop copying LogicalPlan and Exprs in EliminateNestedUnion [datafusion]

2024-05-01 Thread via GitHub
emgeee commented on PR #10319: URL: https://github.com/apache/datafusion/pull/10319#issuecomment-2089232074 I went ahead and fixed the formatting so all tests pass and managed to remove 1 clone() call from within the `coerce_plan_expr_for_schema()` call stack. If we want to

Re: [PR] feat: Implement Spark-compatible CAST from string to timestamp types [datafusion-comet]

2024-05-01 Thread via GitHub
andygrove commented on PR #335: URL: https://github.com/apache/datafusion-comet/pull/335#issuecomment-2089227013 This is looking great @vaibhawvipul. I think this is close to being ready to merge and then have some follow on issues for remaining items. I think the one thing I would

Re: [PR] Add `SessionContext`/`SessionState::create_physical_expr()` to create `PhysicalExpressions` from `Expr`s [datafusion]

2024-05-01 Thread via GitHub
westonpace commented on code in PR #10330: URL: https://github.com/apache/datafusion/pull/10330#discussion_r1586867994 ## datafusion/core/src/execution/context/mod.rs: ## @@ -510,6 +515,34 @@ impl SessionContext { } } +/// Create a [`PhysicalExpr`] from an

Re: [PR] feat: Determine ordering of file groups [datafusion]

2024-05-01 Thread via GitHub
suremarc commented on code in PR #9593: URL: https://github.com/apache/datafusion/pull/9593#discussion_r1586860589 ## datafusion/core/src/datasource/physical_plan/mod.rs: ## @@ -473,15 +468,281 @@ fn get_projected_output_ordering( // since rest of the orderings are

Re: [PR] Stop copying LogicalPlan and Exprs in `DecorrelatePredicateSubquery` [datafusion]

2024-05-01 Thread via GitHub
alamb merged PR #10318: URL: https://github.com/apache/datafusion/pull/10318

Re: [I] Stop copying LogicalPlan and Exprs in `DecorrelatePredicateSubquery ` [datafusion]

2024-05-01 Thread via GitHub
alamb closed issue #10289: Stop copying LogicalPlan and Exprs in `DecorrelatePredicateSubquery ` URL: https://github.com/apache/datafusion/issues/10289

Re: [PR] Stop copying LogicalPlan and Exprs in `DecorrelatePredicateSubquery` [datafusion]

2024-05-01 Thread via GitHub
alamb commented on PR #10318: URL: https://github.com/apache/datafusion/pull/10318#issuecomment-2089182854 Thanks for the review @waynexia

Re: [PR] Fix `coalesce`, `struct` and `named_strct` expr_fn function to take multiple arguments [datafusion]

2024-05-01 Thread via GitHub
alamb commented on code in PR #10321: URL: https://github.com/apache/datafusion/pull/10321#discussion_r1586856895 ## datafusion/functions/src/core/mod.rs: ## @@ -39,14 +42,68 @@ make_udf_function!(getfield::GetFieldFunc, GET_FIELD, get_field);

Re: [I] Move `create_physical_expr` to `physical-expr-common` [datafusion]

2024-05-01 Thread via GitHub
alamb commented on issue #10074: URL: https://github.com/apache/datafusion/issues/10074#issuecomment-2089174413 > The overall idea is to enable us to import common things to Expr::AggregateFunction which lives in datafusion-expr. In the chart you have in

Re: [PR] feat: unwrap casts of string and dictionary columns [datafusion]

2024-05-01 Thread via GitHub
alamb commented on code in PR #10323: URL: https://github.com/apache/datafusion/pull/10323#discussion_r1586823032 ## datafusion/optimizer/src/unwrap_cast_in_comparison.rs: ## @@ -298,19 +306,47 @@ fn is_support_data_type(data_type: ) -> bool { ) } +/// Returns true if

Re: [I] [Epic] A Collection of Sort Based Optimizations [datafusion]

2024-05-01 Thread via GitHub
alamb commented on issue #10313: URL: https://github.com/apache/datafusion/issues/10313#issuecomment-2089129451 > Would this ticket be an appropriate place to add tickets related to pushing down sorts to federated query engines? I know that this was discussed previously (i.e. #7871) and it

Re: [PR] docs: update docs CI to install python-311 requirements [datafusion-python]

2024-05-01 Thread via GitHub
andygrove merged PR #661: URL: https://github.com/apache/datafusion-python/pull/661

Re: [PR] feat: Determine ordering of file groups [datafusion]

2024-05-01 Thread via GitHub
alamb commented on PR #9593: URL: https://github.com/apache/datafusion/pull/9593#issuecomment-2089122508 Filed https://github.com/apache/datafusion/issues/10336 to track enabling this flag by default

Re: [I] Enable `split_file_groups_by_statistics` by default [datafusion]

2024-05-01 Thread via GitHub
alamb commented on issue #10336: URL: https://github.com/apache/datafusion/issues/10336#issuecomment-2089121776 Example test coverage we should add I think: https://github.com/apache/datafusion/pull/9593#discussion_r1585517605

Re: [I] [Epic] A Collection of Sort Based Optimizations [datafusion]

2024-05-01 Thread via GitHub
alamb commented on issue #10313: URL: https://github.com/apache/datafusion/issues/10313#issuecomment-2089116574 Update here: we merged https://github.com/apache/datafusion/pull/9593 and now will work on increasing the test coverage to enable it by default (tracked in

[I] Enable `split_file_groups_by_statistics` by default [datafusion]

2024-05-01 Thread via GitHub
alamb opened a new issue, #10336: URL: https://github.com/apache/datafusion/issues/10336 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/10313 In https://github.com/apache/datafusion/pull/9593, @suremarc added a

Re: [PR] feat: unwrap casts of string and dictionary columns [datafusion]

2024-05-01 Thread via GitHub
erratic-pattern commented on code in PR #10323: URL: https://github.com/apache/datafusion/pull/10323#discussion_r1586805024 ## datafusion/optimizer/src/unwrap_cast_in_comparison.rs: ## @@ -298,19 +306,47 @@ fn is_support_data_type(data_type: ) -> bool { ) } +/// Returns

Re: [PR] feat: Determine ordering of file groups [datafusion]

2024-05-01 Thread via GitHub
alamb commented on code in PR #9593: URL: https://github.com/apache/datafusion/pull/9593#discussion_r1586804068 ## datafusion/core/src/datasource/physical_plan/mod.rs: ## @@ -473,15 +468,281 @@ fn get_projected_output_ordering( // since rest of the orderings are

Re: [I] Use file statistics in query planning to avoid sorting when unecessary [datafusion]

2024-05-01 Thread via GitHub
alamb closed issue #7490: Use file statistics in query planning to avoid sorting when unecessary URL: https://github.com/apache/datafusion/issues/7490

Re: [PR] feat: Determine ordering of file groups [datafusion]

2024-05-01 Thread via GitHub
alamb merged PR #9593: URL: https://github.com/apache/datafusion/pull/9593

Re: [PR] feat: unwrap casts of string and dictionary columns [datafusion]

2024-05-01 Thread via GitHub
erratic-pattern commented on code in PR #10323: URL: https://github.com/apache/datafusion/pull/10323#discussion_r1586797735 ## datafusion/optimizer/src/unwrap_cast_in_comparison.rs: ## @@ -298,19 +306,47 @@ fn is_support_data_type(data_type: ) -> bool { ) } +/// Returns

Re: [PR] feat: Determine ordering of file groups [datafusion]

2024-05-01 Thread via GitHub
alamb commented on PR #9593: URL: https://github.com/apache/datafusion/pull/9593#issuecomment-2089091928 > @alamb I added a config value, and I moved MinMaxStatistics to its own module as requested. I wasn't sure if I should delay addressing your feedback on tests to the next PR, since it

Re: [PR] feat: unwrap casts of string and dictionary columns [datafusion]

2024-05-01 Thread via GitHub
erratic-pattern commented on code in PR #10323: URL: https://github.com/apache/datafusion/pull/10323#discussion_r1586795632 ## datafusion/optimizer/src/unwrap_cast_in_comparison.rs: ## @@ -298,19 +306,47 @@ fn is_support_data_type(data_type: ) -> bool { ) } +/// Returns

Re: [I] Stop copying LogicalPlan and Exprs in `CommonSubexprEliminate` [datafusion]

2024-05-01 Thread via GitHub
alamb closed issue #10211: Stop copying LogicalPlan and Exprs in `CommonSubexprEliminate` URL: https://github.com/apache/datafusion/issues/10211

Re: [I] Stop copying LogicalPlan and Exprs in `CommonSubexprEliminate` [datafusion]

2024-05-01 Thread via GitHub
alamb commented on issue #10211: URL: https://github.com/apache/datafusion/issues/10211#issuecomment-2089088807 > @alamb, why did you reopen this? I am not sure. Sorry about that

Re: [PR] fix: preserve more dictionaries when coercing types [datafusion]

2024-05-01 Thread via GitHub
alamb closed pull request #10221: fix: preserve more dictionaries when coercing types URL: https://github.com/apache/datafusion/pull/10221

Re: [PR] fix: preserve more dictionaries when coercing types [datafusion]

2024-05-01 Thread via GitHub
alamb commented on PR #10221: URL: https://github.com/apache/datafusion/pull/10221#issuecomment-2089086256 I believe this is superseded by https://github.com/apache/datafusion/pull/10323

Re: [PR] feat: Implement Spark-compatible CAST from string to timestamp types [datafusion-comet]

2024-05-01 Thread via GitHub
andygrove commented on code in PR #335: URL: https://github.com/apache/datafusion-comet/pull/335#discussion_r1586766148 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -510,9 +558,246 @@ impl PhysicalExpr for Cast { } } +fn timestamp_parser(value: ,

Re: [PR] fix: limit with offset should return correct results [datafusion-comet]

2024-05-01 Thread via GitHub
viirya commented on code in PR #359: URL: https://github.com/apache/datafusion-comet/pull/359#discussion_r1586759865 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -340,7 +340,7 @@ class CometSparkSessionExtensions op

Re: [PR] feat: unwrap casts of string and dictionary columns [datafusion]

2024-05-01 Thread via GitHub
alamb commented on PR #10323: URL: https://github.com/apache/datafusion/pull/10323#issuecomment-2089031318 I made a PR with a few more tests here: https://github.com/apache/datafusion/pull/10335

[PR] Minor: add a few more dictionary unwrap tests [datafusion]

2024-05-01 Thread via GitHub
alamb opened a new pull request, #10335: URL: https://github.com/apache/datafusion/pull/10335 ## Which issue does this PR close? Small follow on to https://github.com/apache/datafusion/pull/10323 ## Rationale for this change I noticed a few more cases I thought should be

Re: [PR] feat: unwrap casts of string and dictionary columns [datafusion]

2024-05-01 Thread via GitHub
alamb commented on code in PR #10323: URL: https://github.com/apache/datafusion/pull/10323#discussion_r1586745915 ## datafusion/optimizer/src/unwrap_cast_in_comparison.rs: ## @@ -298,19 +306,47 @@ fn is_support_data_type(data_type: ) -> bool { ) } +/// Returns true if

Re: [I] Slow comparisions to dictionary columns with type coercion [datafusion]

2024-05-01 Thread via GitHub
alamb closed issue #10220: Slow comparisions to dictionary columns with type coercion URL: https://github.com/apache/datafusion/issues/10220

Re: [PR] feat: unwrap casts of string and dictionary columns [datafusion]

2024-05-01 Thread via GitHub
alamb merged PR #10323: URL: https://github.com/apache/datafusion/pull/10323

Re: [PR] fix: limit with offset should return correct results [datafusion-comet]

2024-05-01 Thread via GitHub
parthchandra commented on code in PR #359: URL: https://github.com/apache/datafusion-comet/pull/359#discussion_r1586750015 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -340,7 +340,7 @@ class CometSparkSessionExtensions op

Re: [PR] feat: Implement Spark-compatible CAST from string to timestamp types [datafusion-comet]

2024-05-01 Thread via GitHub
andygrove commented on code in PR #335: URL: https://github.com/apache/datafusion-comet/pull/335#discussion_r1586743530 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -86,6 +89,24 @@ macro_rules! cast_utf8_to_int { }}; } +macro_rules! cast_utf8_to_timestamp

Re: [PR] Minor: Add coalesce tests [datafusion]

2024-05-01 Thread via GitHub
alamb commented on code in PR #10334: URL: https://github.com/apache/datafusion/pull/10334#discussion_r1586740016 ## datafusion/sqllogictest/test_files/coalesce.slt: ## @@ -0,0 +1,374 @@ +# Licensed to the Apache Software Foundation (ASF) under one Review Comment: Given the

Re: [PR] Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. Introduce signature that do non-comparison coercion [datafusion]

2024-05-01 Thread via GitHub
alamb commented on PR #10268: URL: https://github.com/apache/datafusion/pull/10268#issuecomment-2088998120 I made a PR with just your wonderful tests in https://github.com/apache/datafusion/pull/10334/files

Re: [I] Unable to Run tests of CometShuffleEncryptionSuite [datafusion-comet]

2024-05-01 Thread via GitHub
ganeshkumar269 commented on issue #367: URL: https://github.com/apache/datafusion-comet/issues/367#issuecomment-2088994122 `./mvnw test -Dsuites=org.apache.comet.exec.CometShuffleEncryptionSuite` I ran the above command and got the same result

Re: [PR] Coverage: Add a manual test to show what Spark built in expression the DF can support directly [datafusion-comet]

2024-05-01 Thread via GitHub
parthchandra commented on code in PR #331: URL: https://github.com/apache/datafusion-comet/pull/331#discussion_r1586727820 ## spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala: ## @@ -123,7 +134,7 @@ class CometExpressionCoverageSuite extends

[I] Write a guide on contributing a new expression [datafusion-comet]

2024-05-01 Thread via GitHub
andygrove opened a new issue, #370: URL: https://github.com/apache/datafusion-comet/issues/370 ### What is the problem the feature request solves? We should write a detailed guide showing new contributors how to add support for a new expression in Comet. This would cover the Scala,

Re: [PR] Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. Introduce signature that do non-comparison coercion [datafusion]

2024-05-01 Thread via GitHub
alamb commented on code in PR #10268: URL: https://github.com/apache/datafusion/pull/10268#discussion_r1586698773 ## datafusion/expr/src/type_coercion/functions.rs: ## @@ -53,20 +54,31 @@ pub fn data_types( } } -let valid_types = get_valid_types(_signature,

[I] Plan first release [datafusion-comet]

2024-05-01 Thread via GitHub
andygrove opened a new issue, #369: URL: https://github.com/apache/datafusion-comet/issues/369 ### What is the problem the feature request solves? During the Comet public meeting this morning, there were questions about when the first official release would be. We do not really have

[I] Enable GitHub discussions [datafusion-comet]

2024-05-01 Thread via GitHub
andygrove opened a new issue, #368: URL: https://github.com/apache/datafusion-comet/issues/368 ### What is the problem the feature request solves? During the public Comet meeting this morning we discussed enabling GitHub discussions as a place for people to ask questions and also

[PR] remove expr node accumulation [datafusion]

2024-05-01 Thread via GitHub
MohamedAbdeen21 opened a new pull request, #10333: URL: https://github.com/apache/datafusion/pull/10333 ## Which issue does this PR close? Closes #10280. Thanks @JasonLi-cn for pointing out the relevant code snippet and the detailed description, saved me some time ❤️

Re: [PR] feat: Add a test for supported types of SortMergeJoin in DataFusion [datafusion-comet]

2024-05-01 Thread via GitHub
planga82 commented on code in PR #365: URL: https://github.com/apache/datafusion-comet/pull/365#discussion_r1586685070 ## common/src/main/scala/org/apache/comet/CometConf.scala: ## @@ -383,6 +383,13 @@ object CometConf { .booleanConf .createWithDefault(false) + val

Re: [I] Use non-comparison coercion for `Coalesce` or even avoid implicit casting for `Coalesce` [datafusion]

2024-05-01 Thread via GitHub
alamb commented on issue #10261: URL: https://github.com/apache/datafusion/issues/10261#issuecomment-2088944901 BTW this is a really nicely written ticket

Re: [I] Finalize SIGMOD 2024 paper ~(if accepted)~ [datafusion]

2024-05-01 Thread via GitHub
alamb commented on issue #8373: URL: https://github.com/apache/datafusion/issues/8373#issuecomment-2088935249 I think it is finally accepted  ``` -- Forwarded message - From: Tim Pollitt Date: Wed, May 1, 2024 at 2:44 PM Subject: ACM Proceedings

[I] Tracking Upgrade to Datafusion 37 [datafusion-python]

2024-05-01 Thread via GitHub
Michael-J-Ward opened a new issue, #663: URL: https://github.com/apache/datafusion-python/issues/663 I sketched out some of the upgrade in #662 and wanted to share what I encountered. Some major changes in between datafusion 36 and 37. ##

Re: [I] Unable to Run tests of CometShuffleEncryptionSuite [datafusion-comet]

2024-05-01 Thread via GitHub
viirya commented on issue #367: URL: https://github.com/apache/datafusion-comet/issues/367#issuecomment-2088932628 Have you tried to run it in command line?

Re: [PR] feat: Only allow supported and tested cast operations to be converted to native [datafusion-comet]

2024-05-01 Thread via GitHub
andygrove commented on code in PR #362: URL: https://github.com/apache/datafusion-comet/pull/362#discussion_r158786 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -1042,10 +1042,10 @@ object CometSparkSessionExtensions extends Logging {

Re: [I] DataFusion `38.0.0` Release [datafusion]

2024-05-01 Thread via GitHub
alamb commented on issue #10217: URL: https://github.com/apache/datafusion/issues/10217#issuecomment-2088930080 Here is a PR with a proposed API https://github.com/apache/datafusion/pull/10330 that would close https://github.com/apache/datafusion/issues/10181

Re: [I] Error "entered unreachable code: NamedStructField should be rewritten in OperatorToFunction" after upgrade to 37 [datafusion]

2024-05-01 Thread via GitHub
alamb commented on issue #10181: URL: https://github.com/apache/datafusion/issues/10181#issuecomment-2088929490 Here is a PR with a proposed new API: https://github.com/apache/datafusion/pull/10330

Re: [PR] Add APIs to create `PhysicalExpressions` from `Expr`s: `SessionContext::create_physical_expr()` and `SessionState::create_physical_expr()` [datafusion]

2024-05-01 Thread via GitHub
alamb commented on code in PR #10330: URL: https://github.com/apache/datafusion/pull/10330#discussion_r1586649331 ## datafusion-examples/examples/expr_api.rs: ## @@ -92,7 +90,8 @@ fn evaluate_demo() -> Result<()> { let expr = col("a").lt(lit(5)).or(col("a").eq(lit(8)));

Re: [PR] feat: Implement Spark-compatible CAST from string to timestamp types [datafusion-comet]

2024-05-01 Thread via GitHub
andygrove commented on code in PR #335: URL: https://github.com/apache/datafusion-comet/pull/335#discussion_r1586652474 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -528,14 +528,29 @@ class CometCastSuite extends CometTestBase with

Re: [PR] doc: Clean up supported JDKs in README [datafusion-comet]

2024-05-01 Thread via GitHub
andygrove merged PR #366: URL: https://github.com/apache/datafusion-comet/pull/366

Re: [PR] doc: Fix target typo in development.md [datafusion-comet]

2024-05-01 Thread via GitHub
andygrove merged PR #364: URL: https://github.com/apache/datafusion-comet/pull/364

Re: [PR] docs: Add a plugin overview page to the contributors guide [datafusion-comet]

2024-05-01 Thread via GitHub
parthchandra commented on code in PR #345: URL: https://github.com/apache/datafusion-comet/pull/345#discussion_r1586638726 ## docs/source/contributor-guide/plugin_overview.md: ## @@ -0,0 +1,50 @@ + + +# Comet Plugin Overview + +The entry point to Comet is the

Re: [I] Setup regular meetups for Comet development [datafusion-comet]

2024-05-01 Thread via GitHub
parthchandra commented on issue #217: URL: https://github.com/apache/datafusion-comet/issues/217#issuecomment-2088897629 For those who might be reading this in future: Meetups links: https://docs.google.com/document/d/1NBpkIAuU7O9h8Br5CbFksDhX-L9TyO9wmGLPMe0Plc8/edit?usp=sharing

Re: [I] Setup regular meetups for Comet development [datafusion-comet]

2024-05-01 Thread via GitHub
parthchandra closed issue #217: Setup regular meetups for Comet development URL: https://github.com/apache/datafusion-comet/issues/217

Re: [PR] feat: Add a test for supported types of SortMergeJoin in DataFusion [datafusion-comet]

2024-05-01 Thread via GitHub
comphead commented on code in PR #365: URL: https://github.com/apache/datafusion-comet/pull/365#discussion_r1586622441 ## common/src/main/scala/org/apache/comet/CometConf.scala: ## @@ -383,6 +383,13 @@ object CometConf { .booleanConf .createWithDefault(false) + val

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-05-01 Thread via GitHub
parthchandra commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1586606093 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -142,6 +233,202 @@ impl Cast { } } +/// Equivalent to

[I] Unable to Run tests of CometShuffleEncryptionSuite [datafusion-comet]

2024-05-01 Thread via GitHub
ganeshkumar269 opened a new issue, #367: URL: https://github.com/apache/datafusion-comet/issues/367 ### Describe the bug Whenever I execute the test class CometShuffleEncryptionSuite, the tests fail or crash with no meaningful error message, though I get this warning message

Re: [PR] chore: add a utils method to getColumnReader with SQLConf [datafusion-comet]

2024-05-01 Thread via GitHub
viirya commented on PR #360: URL: https://github.com/apache/datafusion-comet/pull/360#issuecomment-203549 Hmm, I think @parthchandra meant that by adding SQLConf, `common` will depend on Spark. Does it conflict with the direction to make `common` not Spark dependent?

Re: [I] explain plan disables some cases in ScanExecRule [datafusion-comet]

2024-05-01 Thread via GitHub
viirya commented on issue #322: URL: https://github.com/apache/datafusion-comet/issues/322#issuecomment-200855 Closed it now. Thanks @jc4x4 for reminding.

Re: [I] explain plan disables some cases in ScanExecRule [datafusion-comet]

2024-05-01 Thread via GitHub
viirya closed issue #322: explain plan disables some cases in ScanExecRule URL: https://github.com/apache/datafusion-comet/issues/322

Re: [PR] Jdk 11 in readme [datafusion-comet]

2024-05-01 Thread via GitHub
viirya commented on code in PR #366: URL: https://github.com/apache/datafusion-comet/pull/366#discussion_r1586603568 ## README.md: ## @@ -63,7 +63,7 @@ Linux, Apple OSX (Intel and M1) ## Requirements - Apache Spark 3.2, 3.3, or 3.4 -- JDK 8 and up +- JDK 11 and up (JDK 8

Re: [PR] Jdk 11 in readme [datafusion-comet]

2024-05-01 Thread via GitHub
viirya commented on PR #366: URL: https://github.com/apache/datafusion-comet/pull/366#issuecomment-2088874522 Thanks @edmondop

Re: [PR] feat: Implement Spark-compatible CAST between integer types [datafusion-comet]

2024-05-01 Thread via GitHub
ganeshkumar269 commented on PR #340: URL: https://github.com/apache/datafusion-comet/pull/340#issuecomment-2088848565 Hi @andygrove, I have added a check before we fetch sparkInvalidValue, defaulting it to EMPTY_STRING if ':' is not present. I also added comments on why we are
