cloud-fan commented on code in PR #37483:
URL: https://github.com/apache/spark/pull/37483#discussion_r954014236


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala:
##########
@@ -2487,59 +2538,117 @@ case class Encode(value: Expression, charset: Expression)
   """,
   since = "3.3.0",
   group = "string_funcs")
-// scalastyle:on line.size.limit
-case class ToBinary(
-    expr: Expression,
-    format: Option[Expression],
-    nullOnInvalidFormat: Boolean = false) extends RuntimeReplaceable
-  with ImplicitCastInputTypes {
-
-  override lazy val replacement: Expression = format.map { f =>
-    assert(f.foldable && (f.dataType == StringType || f.dataType == NullType))
-    val value = f.eval()
-    if (value == null) {
-      Literal(null, BinaryType)
-    } else {
-      value.asInstanceOf[UTF8String].toString.toLowerCase(Locale.ROOT) match {
-        case "hex" => Unhex(expr)
-        case "utf-8" => Encode(expr, Literal("UTF-8"))
-        case "base64" => UnBase64(expr)

Review Comment:
   I know this PR is about fixing `to_binary`, but I'm wondering whether these 
three functions (`Unhex`, `Encode`, `UnBase64`) behave incorrectly as well.
   
   As a Spark user, I'd expect `to_binary(input, 'hex')` to behave the same as 
`unhex(input)`.
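
   To make that expectation concrete, here is a minimal stand-alone sketch in plain Scala. It is not Spark's implementation: `ToBinarySketch`, `unhexModel`, `unbase64Model`, `encodeUtf8Model` and `toBinaryModel` are hypothetical names, and rejecting odd-length hex input is a simplification chosen for the sketch. It only mirrors the shape of the `match` in the removed lines, where `"hex"` maps to `Unhex(expr)`, so that the "same behavior" property can be written as a check.

```scala
import java.nio.charset.StandardCharsets
import java.util.{Base64, Locale}

// Hypothetical stand-alone model of the format dispatch; not Spark code.
object ToBinarySketch {
  // Stand-in for unhex: None signals malformed input.
  // Assumption for this sketch only: odd-length input is treated as malformed.
  def unhexModel(s: String): Option[Array[Byte]] =
    if (s.length % 2 != 0 || !s.forall(c => Character.digit(c, 16) >= 0)) None
    else Some(s.grouped(2).map(pair => Integer.parseInt(pair, 16).toByte).toArray)

  // Stand-in for unbase64, backed by the JDK decoder.
  def unbase64Model(s: String): Option[Array[Byte]] =
    try Some(Base64.getDecoder.decode(s))
    catch { case _: IllegalArgumentException => None }

  // Stand-in for encode(expr, 'UTF-8').
  def encodeUtf8Model(s: String): Option[Array[Byte]] =
    Some(s.getBytes(StandardCharsets.UTF_8))

  // Same dispatch shape as the removed `replacement` body:
  // lower-case the format name, then delegate.
  def toBinaryModel(input: String, format: String): Option[Array[Byte]] =
    format.toLowerCase(Locale.ROOT) match {
      case "hex"    => unhexModel(input)
      case "utf-8"  => encodeUtf8Model(input)
      case "base64" => unbase64Model(input)
      case _        => None
    }

  def main(args: Array[String]): Unit = {
    // Property under discussion: the 'hex' format agrees with the
    // stand-alone function on valid and invalid input alike.
    for (s <- Seq("537061726b", "zz", "abc")) {
      val viaToBinary = toBinaryModel(s, "hex").map(_.toSeq)
      val viaUnhex = unhexModel(s).map(_.toSeq)
      assert(viaToBinary == viaUnhex, s"mismatch for input '$s'")
    }
    println("the 'hex' format agrees with the unhex model on all samples")
  }
}
```

   In a delegating design like this one the property holds by construction, which also means any questionable edge-case handling in the delegate is inherited by `to_binary`. Once the dispatch stops delegating, the same check would need to run against both code paths.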


