Fernando Pereira created SPARK-22771:
----------------------------------------

             Summary: SQL concat for binary 
                 Key: SPARK-22771
                 URL: https://issues.apache.org/jira/browse/SPARK-22771
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.2.1
            Reporter: Fernando Pereira
            Priority: Minor


The Spark SQL {{concat}} function automatically casts its arguments to 
StringType and returns a String.
This may match the behavior of traditional databases, but Spark has 
Binary as a standard type, and concatenating binary values seems reasonable 
if the result is another binary sequence.

Take Python 2 as an example, where both {{bytes}} and {{unicode}} can 
represent text: concatenating two values of the same type yields that same 
type, and when the types are intermixed (str + unicode) the most generic 
type is returned (unicode).

Following the same principle, I believe that concatenating binary values 
should return a binary. 
In terms of Spark behavior, this would affect only the case where all 
arguments are binary. All other cases should remain unchanged.
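
The proposed rule can be sketched in plain Python. This is only a hypothetical model of the type resolution, not Spark's implementation: the function name {{concat_proposed}} is made up for illustration, binary-to-string casting is assumed to be UTF-8 decoding, and null handling is ignored.

```python
from typing import Union

Value = Union[bytes, str]


def concat_proposed(*args: Value) -> Value:
    """Hypothetical model of the proposed concat typing rule."""
    if args and all(isinstance(a, bytes) for a in args):
        # Every argument is binary: keep the result binary.
        return b"".join(args)
    # Any other combination: cast each argument to a string, as today.
    # Assumption: binary values are decoded as UTF-8.
    return "".join(
        a.decode("utf-8") if isinstance(a, bytes) else str(a) for a in args
    )
```

With this rule, {{concat_proposed(b"ab", b"cd")}} stays {{bytes}}, while any call that mixes in a string, such as {{concat_proposed(b"ab", "cd")}}, still returns a string as it does now.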



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
