LiaCastaneda commented on code in PR #19856:
URL: https://github.com/apache/datafusion/pull/19856#discussion_r2821386586


##########
datafusion/substrait/src/logical_plan/consumer/utils.rs:
##########
@@ -396,10 +433,25 @@ impl NameTracker {
         &mut self,
         expr: Expr,
     ) -> datafusion::common::Result<Expr> {
-        match self.get_unique_name(expr.name_for_alias()?) {
-            (_, NameTrackerStatus::NeverSeen) => Ok(expr),
-            (name, NameTrackerStatus::SeenBefore) => Ok(expr.alias(name)),
+        if !self.would_conflict(&expr) {
+            self.insert(&expr);
+            return Ok(expr);
         }
+
+        // Name collision - need to generate a unique alias
+        let schema_name = expr.schema_name().to_string();
+        let mut counter = 0;
+        let candidate_name = loop {
+            let candidate_name = format!("{schema_name}__temp__{counter}");

Review Comment:
   There is also some logic to rename aliases to make them unique (used for 
avoiding duplicate names in join schemas 
[here](https://github.com/apache/datafusion/blob/468b690d71350bc19c7e7bafd5dc61800973d91e/datafusion/expr/src/logical_plan/builder.rs#L1591)
 and 
[here](https://github.com/apache/datafusion/blob/468b690d71350bc19c7e7bafd5dc61800973d91e/datafusion/expr/src/logical_plan/builder.rs#L1760))
 This generates plans with `:N` suffixes like 
[this](https://github.com/apache/datafusion/blob/468b690d71350bc19c7e7bafd5dc61800973d91e/datafusion/substrait/tests/cases/consumer_integration.rs#L614),
 but this operates on Arrow `Fields` rather than `Expr`, so it can't be easily 
unified with the `__temp__` mechanism. Maybe a future consistency improvement 
could standardize on one naming convention (using `__temp__{N}` everywhere), 
though probably the current distinction may be intentional (`__temp__` = 
substrait conversion, `:N` = standard deduplication)? 
   (I'm not suggesting any change with this, it's an open question if it makes 
sense)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to