Ted-Jiang commented on code in PR #2794:
URL: https://github.com/apache/arrow-datafusion/pull/2794#discussion_r907044238


##########
datafusion/expr/src/binary_rule.rs:
##########
@@ -185,6 +186,17 @@ fn comparison_order_coercion(
         .or_else(|| null_coercion(lhs_type, rhs_type))
 }
 
+fn string_numeric_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> {
+    use arrow::datatypes::DataType::*;
+    match (lhs_type, rhs_type) {

Review Comment:
   I tested on `748b6a65a5fa801595fd80a3c7b2728be3c9cdaa` (not this commit):
   
   ```
   explain select * from part where p_partkey in (1, 2, '3');

   +---------------+------+
   | plan_type     | plan |
   +---------------+------+
   | logical_plan  | Projection: #part.p_partkey, #part.p_name, #part.p_mfgr, #part.p_brand, #part.p_type, #part.p_size, #part.p_container, #part.p_retailprice, #part.p_comment |
   |               |   Filter: #part.p_partkey IN ([Int64(1), Int64(2), Utf8("3")]) |
   |               |     TableScan: part projection=Some([p_partkey, p_name, p_mfgr, p_brand, p_type, p_size, p_container, p_retailprice, p_comment]), partial_filters=[#part.p_partkey IN ([Int64(1), Int64(2), Utf8("3")])] |
   | physical_plan | ProjectionExec: expr=[p_partkey@0 as p_partkey, p_name@1 as p_name, p_mfgr@2 as p_mfgr, p_brand@3 as p_brand, p_type@4 as p_type, p_size@5 as p_size, p_container@6 as p_container, p_retailprice@7 as p_retailprice, p_comment@8 as p_comment] |
   |               |   CoalesceBatchesExec: target_batch_size=4096 |
   |               |     FilterExec: p_partkey@0 IN ([Literal { value: Int64(1) }, Literal { value: Int64(2) }, CastExpr { expr: Literal { value: Utf8("3") }, cast_type: Int64, cast_options: CastOptions { safe: false } }]) |
   |               |       RepartitionExec: partitioning=RoundRobinBatch(16) |
   |               |         ParquetExec: limit=None, partitions=[/Users/yangjiang/test-data/tpch-1g-oneFile/part/part-00000-3a3c2777-00d3-4c27-b917-4ff2145123dc-c000.snappy.parquet], projection=[p_partkey, p_name, p_mfgr, p_brand, p_type, p_size, p_container, p_retailprice, p_comment] |
   |               | |
   +---------------+------+
   ```
   
   So `int, int, utf8` is cast to `int, int, int`.
   
   In my opinion, after applying this patch the same list will instead be cast from `int, int, utf8` to `utf8, utf8, utf8`.
   I think that when list_values_size is large we construct a HashSet (see https://github.com/apache/arrow-datafusion/pull/2156), and casting to `int` should give better performance when building the HashSet. Am I right? 😄
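
To make the question concrete, here is a minimal standalone sketch of the two coercion directions for an IN list. It uses only the standard library; the `Literal` enum and both helper functions are hypothetical stand-ins for illustration, not DataFusion's `ScalarValue` or its actual coercion code.

```rust
use std::collections::HashSet;
use std::num::ParseIntError;

/// Hypothetical stand-in for the IN-list literals (not DataFusion's ScalarValue).
#[derive(Debug, Clone, PartialEq)]
enum Literal {
    Int64(i64),
    Utf8(String),
}

/// Coerce every literal to i64, as in the plan above (`int, int, utf8` -> `int, int, int`).
/// A non-numeric string makes the whole coercion fail instead of being silently dropped.
fn coerce_to_int(list: &[Literal]) -> Result<HashSet<i64>, ParseIntError> {
    list.iter()
        .map(|lit| match lit {
            Literal::Int64(v) => Ok(*v),
            Literal::Utf8(s) => s.parse::<i64>(),
        })
        .collect()
}

/// Coerce every literal to a string (`int, int, utf8` -> `utf8, utf8, utf8`).
/// Each set entry is now a heap-allocated String rather than a fixed-width integer.
fn coerce_to_utf8(list: &[Literal]) -> HashSet<String> {
    list.iter()
        .map(|lit| match lit {
            Literal::Int64(v) => v.to_string(),
            Literal::Utf8(s) => s.clone(),
        })
        .collect()
}
```

In this sketch, calling `coerce_to_int` on `[Int64(1), Int64(2), Utf8("3")]` gives a set of three `i64` keys, while `coerce_to_utf8` gives three `String` keys that each need an allocation and a variable-length hash. Whether that difference matters in practice would need a benchmark against the real implementation.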


