Re: [PR] fix: `PushDownFilter` for `GROUP BY` on uppercase col names [datafusion]

via GitHub Mon, 11 Aug 2025 02:20:13 -0700


aditanase commented on code in PR #16049:
URL: https://github.com/apache/datafusion/pull/16049#discussion_r2266120950



##########
datafusion/optimizer/src/push_down_filter.rs:
##########
@@ -978,7 +978,11 @@ impl OptimizerRule for PushDownFilter {
                 let group_expr_columns = agg
                     .group_expr
                     .iter()
-                    .map(|e| 
Ok(Column::from_qualified_name(e.schema_name().to_string())))
+                    .map(|e| {
+                        Ok(Column::from_qualified_name_ignore_case(

Review Comment:
   @alamb thanks for taking a look. I agree the fix looks fishy, and would love 
to better understand the mechanics of case comparison. We can keep the tests 
and come up with a better fix that addresses your concerns.
   
   It boils down to the `ignore_case` flag which is not very intuitive, as 
setting to `true` ends up keeping the UPPERCASE, while setting to false will 
lowercase the column name from the gby expression:
   
https://github.com/apache/datafusion/blob/main/datafusion/common/src/utils/mod.rs#L298-L299
   
   The problem is introduced as we're 
[roundtripping](https://github.com/apache/datafusion/blob/main/datafusion/optimizer/src/push_down_filter.rs#L981)
 the correctly parsed column name
   ```
   Column { relation: Some(Bare { table: "test" }), name: "A" }
   ``` 
   through
   ```
   Column::from_qualified_name(e.schema_name().to_string())
   ```
   Which will lowercase it, as now there are no more double quotes around the 
field name. That's why I switched to `Column::from_qualified_name_ignore_case`, 
to avoid lowercasing as we're already starting from parsed/cased names.
   
   Very open to suggestions



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] fix: `PushDownFilter` for `GROUP BY` on uppercase col names [datafusion]

Reply via email to