albertlockett commented on PR #22647:
URL: https://github.com/apache/datafusion/pull/22647#issuecomment-4606772028

   > Can we have an SLT case to prove it works end to end (to ensure it works 
with the signature code too)
   
   @Jefffrey if I'm not mistaken, we already have SLT tests for this:
   
https://github.com/apache/datafusion/blob/00c35d0c1e7dc5c1acd5ee7c0b16f1da047faf24/datafusion/sqllogictest/test_files/functions.slt#L513-L516
   
https://github.com/apache/datafusion/blob/00c35d0c1e7dc5c1acd5ee7c0b16f1da047faf24/datafusion/sqllogictest/test_files/functions.slt#L433-L436
   
   Invoking these functions doesn't fail when we invoke them on dictionary 
columns using the sql API. For example:
   ```rust
   use arrow::{
       array::{DictionaryArray, RecordBatch, StringArray, UInt8Array},
       util::pretty::print_batches,
   };
   use arrow_schema::{DataType, Field, Schema};
   use datafusion::{catalog::MemTable, functions::expr_fn::lower, 
prelude::SessionContext};
   use datafusion::{logical_expr::col, physical_expr::create_physical_expr};
   use datafusion_common::DFSchema;
   use std::sync::Arc;
   
   #[tokio::main]
   async fn main() {
       let schema = Arc::new(Schema::new(vec![
           Field::new("a", DataType::Utf8, false),
           Field::new(
               "b",
               DataType::Dictionary(Box::new(DataType::UInt8), 
Box::new(DataType::Utf8)),
               false,
           ),
       ]));
   
       let rb = RecordBatch::try_new(
           schema.clone(),
           vec![
               Arc::new(StringArray::from_iter_values(["a", "aa", "b"])),
               Arc::new(DictionaryArray::new(
                   UInt8Array::from_iter_values([0, 1, 0]),
                   Arc::new(StringArray::from_iter_values(["a", "B"])),
               )),
           ],
       )
       .unwrap();
   
       // -- Invoking through SQL API works ---
       let ctx = SessionContext::new();
       ctx.register_table(
           "tab1",
           Arc::new(MemTable::try_new(schema.clone(), 
vec![vec![rb.clone()]]).unwrap()),
       )
       .unwrap();
       let df = ctx.sql("select lower(b) from tab1").await.unwrap();
       let batches = df.collect().await.unwrap();
       print_batches(&batches).unwrap();
   
       // --- Invoking through expr API, does not work ---
       let physical_expr = create_physical_expr(
           &lower(col("b")),
           &DFSchema::try_from(schema).unwrap(),
           &ctx.state().execution_props(),
       )
       .unwrap();
   
       let expr_result = physical_expr.evaluate(&rb).unwrap(); // panics
       println!("{:?}", expr_result);
   }
   ```
   
   I think this is because when planning, we coerce the field to Utf8View. The 
`explain` for that SQL query:
   ```
   
+---------------+----------------------------------------------------------------------+
   | plan_type     | plan                                                       
          |
   
+---------------+----------------------------------------------------------------------+
   | logical_plan  | Projection: lower(CAST(tab1.b AS Utf8View))                
          |
   |               |   TableScan: tab1 projection=[b]                           
          |
   | physical_plan | ProjectionExec: expr=[lower(CAST(b@0 AS Utf8View)) as 
lower(tab1.b)] |
   |               |   DataSourceExec: partitions=1, partition_sizes=[1]        
          |
   |               |                                                            
          |
   
+---------------+----------------------------------------------------------------------+
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to