findepi commented on code in PR #12224:
URL: https://github.com/apache/datafusion/pull/12224#discussion_r1739649597


##########
datafusion/functions/src/string/concat.rs:
##########
@@ -64,13 +66,19 @@ impl ScalarUDFImpl for ConcatFunc {
         &self.signature
     }
 
-    fn return_type(&self, _arg_types: &[DataType]) -> Result<DataType> {
-        Ok(Utf8)
+    fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> {
+        use DataType::*;
+        Ok(match &arg_types[0] {
+            Utf8View => Utf8View,
+            LargeUtf8 => LargeUtf8,
+            _ => Utf8,
+        })

Review Comment:
   > when you say binary concat operator are you talking about `||` as an 
operator?
   
   yes
   
   
   > the logic seems to assume all arguments are of the same type?
   > 
   > also, why not always return `Utf8`? the code performing actual 
concatenation seems to be always the same.
   
   the question still holds (why exactly we bias towards the first param type), 
but i am no longer convinced about my suggestion to use Utf8 always.
   
   i think we should "just" make sure `concat(a, b, c)`  is type-equivalent to 
`a || b || c`.
   the `||`'s logic apparently is
   - if any of the operands are Utf8View, the result is Utf8View
   - else, if any of the operands are LargeUtf8, the result is LargeUtf8
   
   cc @alamb 
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to