nuno-faria opened a new issue, #18818:
URL: https://github.com/apache/datafusion/issues/18818

   ### Describe the bug
   
   When the `expand_views_at_output` config is enabled to convert `Utf8View` to
   `LargeUtf8`, the names of the converted columns change: they gain the
   relation name as a prefix. I think the cause is that a `CAST` is added to
   change the type, so `Expr::qualified_name` returns "table.column" instead of
   just "column":
   
   ```rust
   // when we have a CAST we end up at the last match arm
   pub fn qualified_name(&self) -> (Option<TableReference>, String) {
       match self {
           Expr::Column(Column {
               relation,
               name,
               spans: _,
           }) => (relation.clone(), name.clone()),
           Expr::Alias(Alias { relation, name, .. }) => (relation.clone(), name.clone()),
           _ => (None, self.schema_name().to_string()),
       }
   }
   
   // which in turn calls
   SchemaDisplay(self)
   
   // which for cast simply calls SchemaDisplay(self) of the inner expression
   Expr::Cast(Cast { expr, .. }) | Expr::TryCast(TryCast { expr, .. }) => {
       write!(f, "{}", SchemaDisplay(expr))
   }
   
   // which for Column calls
   impl fmt::Display for Column {
       fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
           write!(f, "{}", self.flat_name())
       }
   }
   
   // which includes the relation + name, unlike the original qualified_name
   // for a regular Column
   ```
   
   I think one approach would be to update `qualified_name` by adding a match
   arm for casts. I would be happy to fix this, if it is indeed a bug and not
   expected behavior.
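
To make the idea concrete, here is a minimal, self-contained sketch. It is only an illustration under stated assumptions: the `Expr` enum below is a simplified stand-in and not DataFusion's actual type, and `qualified_name_current` / `qualified_name_proposed` are hypothetical names used to compare the two behaviors side by side. The real fix may need to treat `TryCast` and the returned relation differently.

```rust
// Simplified stand-ins for illustration only; these are NOT DataFusion's real types.
#[derive(Clone, Debug, PartialEq)]
pub enum Expr {
    Column { relation: Option<String>, name: String },
    Cast(Box<Expr>),
    Literal(i64),
}

impl Expr {
    /// Models the display path described above: a column renders as its
    /// flat name ("relation.name"), and a cast renders as its inner expression.
    pub fn schema_name(&self) -> String {
        match self {
            Expr::Column { relation: Some(r), name } => format!("{}.{}", r, name),
            Expr::Column { relation: None, name } => name.clone(),
            Expr::Cast(inner) => inner.schema_name(),
            Expr::Literal(v) => v.to_string(),
        }
    }

    /// Models the reported behavior: a cast falls through to the catch-all
    /// arm, so the relation ends up inside the name.
    pub fn qualified_name_current(&self) -> (Option<String>, String) {
        match self {
            Expr::Column { relation, name } => (relation.clone(), name.clone()),
            _ => (None, self.schema_name()),
        }
    }

    /// Hypothetical fix: look through the cast and reuse the qualified name
    /// of the inner expression.
    pub fn qualified_name_proposed(&self) -> (Option<String>, String) {
        match self {
            Expr::Column { relation, name } => (relation.clone(), name.clone()),
            Expr::Cast(inner) => inner.qualified_name_proposed(),
            _ => (None, self.schema_name()),
        }
    }
}
```

In this toy model, a cast over column `t.v` yields the name "t.v" with no relation under the current logic, while the proposed arm keeps the relation `t` and the name `v` separate. Whether the relation should be propagated through the cast, or dropped while keeping only the bare name, is a design choice left open here.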
   
   ### To Reproduce
   
   ```rust
   use datafusion::error::Result;
   use datafusion::prelude::{ParquetReadOptions, SessionContext};
   
   #[tokio::main]
   async fn main() -> Result<()> {
       let ctx = SessionContext::new();
       ctx.sql("copy (select 1 as k, 'a' as v) to 't.parquet'")
           .await?
           .collect()
           .await?;
       ctx.register_parquet("t", "t.parquet", ParquetReadOptions::new())
           .await?;
   
       let df = ctx.sql("select * from t").await?;
       df.clone().show().await?;
       println!("{:?}", df.collect().await?[0].schema());
   
       ctx.sql("set datafusion.optimizer.expand_views_at_output = true")
           .await?
           .collect()
           .await?;
   
       let df = ctx.sql("select * from t").await?;
       df.clone().show().await?;
       println!("{:?}", df.collect().await?[0].schema());
   
       Ok(())
   }
   ```
   
   `k` remains the same but `v` changes:
   ```
   +---+---+
   | k | v |
   +---+---+
   | 1 | a |
   +---+---+
   Schema { fields: [Field { name: "k", data_type: Int64 }, Field { name: "v", data_type: Utf8View }], metadata: {} }
   +---+-----+
   | k | t.v |
   +---+-----+
   | 1 | a   |
   +---+-----+
   Schema { fields: [Field { name: "k", data_type: Int64 }, Field { name: "t.v", data_type: LargeUtf8 }], metadata: {} }
   ```
   
   ### Expected behavior
   
   Maintaining the original column names.
   
   ### Additional context
   
   Tested on main.

