Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
github-actions[bot] commented on PR #15482: URL: https://github.com/apache/datafusion/pull/15482#issuecomment-2974908124 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
alamb commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r2038079988 ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; Review Comment: (in case anyone else find this, you can get the full plan via `EXPLAIN FORMAT INDENT ...` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
alan910127 commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r2029954233 ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; Review Comment: EDIT: `EXPLAIN` works (only outputs physical plan) in the latest main branch, but I'm not sure if this behavior is expected :thinking: ```sql > create table t as select CAST(123 AS int) a; 0 row(s) fetched. Elapsed 0.021 seconds. > select * from t where a = '99'; type_coercion caused by Error during planning: Cannot coerce '99' to type 'Int32' > explain select * from t where a = '99'; +---+---+ | plan_type | plan | +---+---+ | physical_plan | ┌───┐ | | | │CoalesceBatchesExec│ | | | │ │ | | | │ target_batch_size:│ | | | │8192 │ | | | └─┬─┘ | | | ┌─┴─┐ | | | │ FilterExec│ | | | │ │ | | | │ predicate:│ | | | │ a = 99 │ | | | └─┬─┘ | | | ┌─┴─┐ | | | │ DataSourceExec │ | | | │ │ | | | │ bytes: 160│ | | | │ format: memory │ | | | │ rows: 1 │ | | | └───┘ | | | | +---+---+ ``` > Perhaps we can file a ticket to track it ~Issue created: #15598~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
alan910127 commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r2029954233 ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; Review Comment: EDIT: @alamb `EXPLAIN` works (only outputs physical plan) in the latest main branch, but I'm not sure if this behavior is expected :thinking: ```sql > create table t as select CAST(123 AS int) a; 0 row(s) fetched. Elapsed 0.021 seconds. > select * from t where a = '99'; type_coercion caused by Error during planning: Cannot coerce '99' to type 'Int32' > explain select * from t where a = '99'; +---+---+ | plan_type | plan | +---+---+ | physical_plan | ┌───┐ | | | │CoalesceBatchesExec│ | | | │ │ | | | │ target_batch_size:│ | | | │8192 │ | | | └─┬─┘ | | | ┌─┴─┐ | | | │ FilterExec│ | | | │ │ | | | │ predicate:│ | | | │ a = 99 │ | | | └─┬─┘ | | | ┌─┴─┐ | | | │ DataSourceExec │ | | | │ │ | | | │ bytes: 160│ | | | │ format: memory │ | | | │ rows: 1 │ | | | └───┘ | | | | +---+---+ ``` > Perhaps we can file a ticket to track it ~Issue created: #15598~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
alan910127 commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r2029961032 ## datafusion/optimizer/src/analyzer/type_coercion.rs: ## @@ -290,12 +290,31 @@ impl<'a> TypeCoercionRewriter<'a> { right: Expr, right_schema: &DFSchema, ) -> Result<(Expr, Expr)> { -let (left_type, right_type) = BinaryTypeCoercer::new( -&left.get_type(left_schema)?, -&op, -&right.get_type(right_schema)?, -) -.get_input_types()?; +let left_type = left.get_type(left_schema)?; +let right_type = right.get_type(right_schema)?; + +match (&left, &right) { Review Comment: I think both are valid -- duckdb supports something like `select int_col = '12'||'3'`, while postgres returns an error since it only treats string literals as `unknown` type (thus the filter was `int = text` => error). However, I think supporting expressions makes more sense semantically (I believe we don't have a concept like the `unknown` type?). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
alan910127 commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r2029954233 ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; Review Comment: EDIT: `EXPLAIN` works in the latest main branch, but I'm not sure if this behavior is expected :thinking: ```sql > create table t as select CAST(123 AS int) a; 0 row(s) fetched. Elapsed 0.021 seconds. > select * from t where a = '99'; type_coercion caused by Error during planning: Cannot coerce '99' to type 'Int32' > explain select * from t where a = '99'; +---+---+ | plan_type | plan | +---+---+ | physical_plan | ┌───┐ | | | │CoalesceBatchesExec│ | | | │ │ | | | │ target_batch_size:│ | | | │8192 │ | | | └─┬─┘ | | | ┌─┴─┐ | | | │ FilterExec│ | | | │ │ | | | │ predicate:│ | | | │ a = 99 │ | | | └─┬─┘ | | | ┌─┴─┐ | | | │ DataSourceExec │ | | | │ │ | | | │ bytes: 160│ | | | │ format: memory │ | | | │ rows: 1 │ | | | └───┘ | | | | +---+---+ ``` > Perhaps we can file a ticket to track it ~Issue created: #15598~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
alan910127 commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r2029954233 ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; Review Comment: > Perhaps we can file a ticket to track it Issue created: #15598 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
gabotechs commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r2024939228 ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; Review Comment: I imagine that instead of returning an empty plan, we should be expecting a runtime error here, something like ```suggestion statement error Cannot coerce '...' to type '...' explain select a from t where a = '999'; ``` I do not see any instance in the sql logic tests that are actually expecting an empty plan upon an error. ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; -logical_plan TableScan: t projection=[a], full_filters=[CAST(t.a AS Utf8) = Utf8("999")] + Review Comment: I can confirm that Postgres is also not able to plan for this query for the same reason: [link to Postgres fiddle](https://www.db-fiddle.com/f/4jyoMCicNSZpjMt4jFYoz5/16528) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
alan910127 commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r2025222514 ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; Review Comment: I think this is likely an EXPLAIN-related issue, but returning an error when no plan could be generated seems like a reasonable approach. I'm not entirely sure why it's implemented this way, maybe we should call for help :laughing: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
alamb commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r202663 ## datafusion/optimizer/src/analyzer/type_coercion.rs: ## @@ -290,12 +290,31 @@ impl<'a> TypeCoercionRewriter<'a> { right: Expr, right_schema: &DFSchema, ) -> Result<(Expr, Expr)> { -let (left_type, right_type) = BinaryTypeCoercer::new( -&left.get_type(left_schema)?, -&op, -&right.get_type(right_schema)?, -) -.get_input_types()?; +let left_type = left.get_type(left_schema)?; +let right_type = right.get_type(right_schema)?; + +match (&left, &right) { Review Comment: I am surprise this code is only triggered for literals -- I think coercion is supposed to happen based on data types not on expressions. Among other things, this code won't handle expressions (`date_col = '2025'||'-'||'02'` for example) I think we should change the base coercion rules for types ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; -logical_plan TableScan: t projection=[a], full_filters=[CAST(t.a AS Utf8) = Utf8("999")] + Review Comment: Let's file a ticket about the explain plan being missing ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; -logical_plan TableScan: t projection=[a], full_filters=[CAST(t.a AS Utf8) = Utf8("999")] + # The predicate should still have the column cast when the value is a NOT valid i32 query TT explain select a from t where a = '99.99'; -logical_plan TableScan: t projection=[a], full_filters=[CAST(t.a AS Utf8) = Utf8("99.99")] Review Comment: I think the comments in this file are now out of date -- so perhaps we can update them ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; Review Comment: Yeah, it is kind of wierd -- it would be nice to have at least some message in the explain plan if there was an error Perhaps we can file a ticket to track it I think we should update the test to just run the query (and expect an error) rather than `EXPLAIN` it ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; Review Comment: BTW this is a new test added last week: https://github.com/apache/datafusion/blame/190634bee1093d9d71786aa9c98ec207be05ea72/datafusion/sqllogictest/test_files/push_down_filter.slt#L231 - in https://github.com/apache/datafusion/pull/15110 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
alan910127 commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r2019692309 ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; -logical_plan TableScan: t projection=[a], full_filters=[CAST(t.a AS Utf8) = Utf8("999")] + Review Comment: I thought it's because it returnd a `plan_err!()` when the cast fails, but I'm not quite sure about that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
alan910127 commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r2019692309 ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; -logical_plan TableScan: t projection=[a], full_filters=[CAST(t.a AS Utf8) = Utf8("999")] + Review Comment: I thought it's because it returns a `plan_err!()` when the cast fails, but I'm not quite sure about that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
jayzhan211 commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r2019668440 ## datafusion/sqllogictest/test_files/push_down_filter.slt: ## @@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a], full_filters=[t.a != Int32(100)] query TT explain select a from t where a = '999'; -logical_plan TableScan: t projection=[a], full_filters=[CAST(t.a AS Utf8) = Utf8("999")] + Review Comment: why is there no plan? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix!: incorrect coercion when comparing with string literals [datafusion]
alan910127 commented on code in PR #15482: URL: https://github.com/apache/datafusion/pull/15482#discussion_r2019310584 ## datafusion/core/tests/expr_api/mod.rs: ## @@ -330,12 +330,12 @@ async fn test_create_physical_expr_coercion() { create_expr_test(lit(1i32).eq(col("id")), "CAST(1 AS Utf8) = id@0"); // compare int col to string literal `i = '202410'` // Note this casts the column (not the field) -create_expr_test(col("i").eq(lit("202410")), "CAST(i@1 AS Utf8) = 202410"); -create_expr_test(lit("202410").eq(col("i")), "202410 = CAST(i@1 AS Utf8)"); +create_expr_test(col("i").eq(lit("202410")), "i@1 = 202410"); +create_expr_test(lit("202410").eq(col("i")), "202410 = i@1"); // however, when simplified the casts on i should removed // https://github.com/apache/datafusion/issues/14944 -create_simplified_expr_test(col("i").eq(lit("202410")), "CAST(i@1 AS Utf8) = 202410"); -create_simplified_expr_test(lit("202410").eq(col("i")), "CAST(i@1 AS Utf8) = 202410"); +create_simplified_expr_test(col("i").eq(lit("202410")), "i@1 = 202410"); +create_simplified_expr_test(lit("202410").eq(col("i")), "i@1 = 202410"); Review Comment: not sure if this test is still needed since the literal casting behavior is not considered an "optimization" -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org