Trikooo commented on code in PR #20012:
URL: https://github.com/apache/datafusion/pull/20012#discussion_r2764082086


##########
datafusion/expr/src/expr_schema.rs:
##########
@@ -598,13 +598,32 @@ impl ExprSchemable for Expr {
                         )
                     })?;
 
-                let arguments = args
+                let coerced_values: Vec<Option<ScalarValue>> = args
                     .iter()
-                    .map(|e| match e {
-                        Expr::Literal(sv, _) => Some(sv),
-                        _ => None,
+                    .zip(new_fields.iter())
+                    .map(|(expr, field)| {
+                        let mut current_expr = expr;
+
+                        // Loop to remove all casting layers
+                        loop {
+                            match current_expr {
+                                Expr::Cast(cast) => current_expr = &cast.expr,
+                                Expr::TryCast(cast) => current_expr = 
&cast.expr,
+                                _ => break,
+                            }
+                        }
+
+                        let literal = if let Expr::Literal(sv, _) = 
current_expr {
+                            Some(sv)
+                        } else {
+                            None
+                        };
+
+                        literal.map(|sv| 
sv.cast_to(field.data_type())).transpose()

Review Comment:
   I've spent some time with DuckDB to see how it handles this and here is what 
I found during the instrumentation of these 3 scenarios:
   
   ### 1: 
   Query: `SELECT row(42)` 
   Note: `row` accepts ANY type so no implicit cast will happen here. 
   
   ```text
   [PHASE] 1. Expression Binding (Children Binding)
     Child 0 Bound As: BOUND_CONSTANT
     
   [PHASE] 2. Function Binding (Resolution)
     Candidate Function: row
     
   [PHASE] 3. Bind Callback (Return Type Calculation)
     -> Calling row bind callback...
       [CALLBACK] Inside struct_pack bind callback
         Argument 0:
           Type: INTEGER
           Expression Class: BOUND_CONSTANT
           -> Note: This is a Constant (Literal)
     -> Callback finished. Return Type determined as: STRUCT(INTEGER)
     
   [PHASE] 4. Argument Casting (Implicit)
   ```
   
   ### 2:
   
   Query: `SELECT row(CAST(42 AS BIGINT));`
   ```text
   [BINDER] Bound EXPLICIT CAST to BIGINT
   [PHASE] 1. Expression Binding (Children Binding)
     Child 0 Bound As: BOUND_CAST (Explicit Cast Wrapper)
     
   [PHASE] 2. Function Binding (Resolution)
     Candidate Function: row
     
   [PHASE] 3. Bind Callback (Return Type Calculation)
     -> Calling row bind callback...
       [CALLBACK] Inside struct_pack bind callback
         Argument 0:
           Type: BIGINT
           Expression Class: BOUND_CAST
           -> Note: This IS a BoundCastExpression (Explicit Cast)
     -> Callback finished. Return Type determined as: STRUCT(BIGINT)
     
   [PHASE] 4. Argument Casting (Implicit)
   ```
   
   ### 3: 
   
   Query: `SELECT substring('test', 1, 1);` 
   Note: `substring` expects `BIGINT` but we passed `INTEGER` here. 
   ```text
   [PHASE] 1. Expression Binding (Children Binding)
     Child 0 Bound As: BOUND_CONSTANT
     Child 1 Bound As: BOUND_CONSTANT
     Child 2 Bound As: BOUND_CONSTANT
     
   [PHASE] 2. Function Binding (Resolution)
     Candidate Function: substring
     
   [PHASE] 3. No Custom Bind Callback (Using Standard Return Type)
     Return Type: VARCHAR
     
   [PHASE] 4. Argument Casting (Implicit)
       [CAST] Materializing IMPLICIT Cast: INTEGER -> BIGINT
       [CAST] Materializing IMPLICIT Cast: INTEGER -> BIGINT
   ```
   
   
   
   Basically,  DuckDB is materializing explicit casts → calculating 
`return_type` → adds implicit casts. 
   
   - The return_type is calculated once and attached. 
   - When calculating the return type, the explicit casts are 
`BoundCastExpression` which is basically a cast expression with a validated 
target type, (they go from `CastExpression` to `BoundCastExpression` in the 
preceding phase). 
   
   
   I think this validates your suggestions @Jefffrey, so unless we agree on a 
temporary fix, I don't think this should get merged as is (unless we remove 
cast checks), I'm willing to take on the more involved approach of fixing this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to