Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20024#discussion_r159589126
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -203,9 +203,26 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
       // UDFToString
       private[this] def castToString(from: DataType): Any => Any = from match {
         case BinaryType => buildCast[Array[Byte]](_, UTF8String.fromBytes)
    +    case StringType => buildCast[UTF8String](_, identity)
         case DateType => buildCast[Int](_, d => UTF8String.fromString(DateTimeUtils.dateToString(d)))
    --- End diff --
    
    We may convert a string to `UTF8String` and then convert it back, which is 
inefficient. I think we should create a special `StringBuilder` for 
`UTF8String`, e.g.
    ```
    class UTF8StringBuilder {
      public void append(UTF8String str)
    }
    ```
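
To make the idea concrete, here is a rough, self-contained sketch of such a builder. It is only an illustration under stated assumptions: plain UTF-8 `byte[]` values stand in for `UTF8String` so the snippet compiles without Spark, and the names `UTF8StringBuilderSketch` and `build` are hypothetical, not an existing Spark API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: raw UTF-8 byte arrays stand in for Spark's UTF8String
// so that the example compiles without Spark on the classpath.
final class UTF8StringBuilderSketch {
  private byte[] buffer = new byte[16];
  private int length = 0;

  // Grow the backing array geometrically so repeated appends stay cheap.
  private void ensureCapacity(int extra) {
    int required = length + extra;
    if (required > buffer.length) {
      buffer = Arrays.copyOf(buffer, Math.max(required, buffer.length * 2));
    }
  }

  // Append already-encoded UTF-8 bytes directly: no decode/encode round trip.
  public void append(byte[] utf8Bytes) {
    ensureCapacity(utf8Bytes.length);
    System.arraycopy(utf8Bytes, 0, buffer, length, utf8Bytes.length);
    length += utf8Bytes.length;
  }

  // Convenience overload for Java strings, which need exactly one encoding step.
  public void append(String str) {
    append(str.getBytes(StandardCharsets.UTF_8));
  }

  // Return a copy of the accumulated UTF-8 bytes.
  public byte[] build() {
    return Arrays.copyOf(buffer, length);
  }
}
```

The point of the design is that values which are already UTF-8 encoded get copied byte-for-byte into the buffer, so only literal separators coming from Java strings pay for an encoding step.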

