Fokko commented on a change in pull request #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType
URL: https://github.com/apache/spark/pull/26644#discussion_r350673743

##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/types/TestUDT.scala
##########
@@ -58,4 +63,22 @@ object TestUDT {
     override def equals(other: Any): Boolean = other.isInstanceOf[MyDenseVectorUDT]
   }
+
+  private[sql] class MyXMLGregorianCalendarUDT extends UserDefinedType[XMLGregorianCalendar] {
+    override def sqlType: DataType = TimestampType
+
+    override def serialize(obj: XMLGregorianCalendar): Any =
+      obj.toGregorianCalendar.getTimeInMillis * 1000
+
+    override def deserialize(datum: Any): XMLGregorianCalendar = {
+      val calendar = new GregorianCalendar
+      calendar.setTimeInMillis(datum.asInstanceOf[Long])
+      DatatypeFactory.newInstance.newXMLGregorianCalendar(calendar)
+    }
+
+    override def userClass: Class[XMLGregorianCalendar] = classOf[XMLGregorianCalendar]
+
+    // By setting this to a timestamp, we lose the information about the udt
+    override private[sql] def jsonValue: JValue = "timestamp"

Review comment:
With the normal UDT `jsonValue`, we would get:

```
override private[sql] def jsonValue: JValue = {
  ("type" -> "udt") ~
    ("class" -> this.getClass.getName) ~
    ("pyClass" -> pyUDT) ~
    ("sqlType" -> sqlType.jsonValue)
}
```

This writes the type as the UDT. If you later try to read the column in another job where the UDT registration hasn't been done, it will give an error. Therefore we would like to write the `XMLGregorianCalendar` as a normal timestamp.

However, when we append to the table, we want to be able to merge the `XMLGregorianCalendar` UDT into the timestamp. Without the added rule this wouldn't be possible.

Hope this helps.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
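
To make the reasoning in the review comment concrete, here is a minimal sketch in plain Scala. It is a toy model, not Spark's implementation: `SimpleType`, `SimpleTimestamp`, `SimpleString`, `SimpleUdt`, and `MergeSketch` are hypothetical names invented for illustration, and the JSON is built as plain strings rather than with json4s. It assumes the merge is asked for in the append direction only (existing table type on the left, incoming type on the right); how the real rule in the pull request is written is not shown in this message.

```scala
// Toy model only: hypothetical stand-ins, not Spark's DataType classes.
sealed trait SimpleType { def jsonValue: String }

case object SimpleTimestamp extends SimpleType {
  val jsonValue: String = "\"timestamp\""
}

case object SimpleString extends SimpleType {
  val jsonValue: String = "\"string\""
}

// A user-defined type that is stored as `sqlType`.
// With writeAsSqlType = true the schema JSON carries only the underlying
// native type, so a reader needs no knowledge of the UDT class.
final case class SimpleUdt(
    className: String,
    sqlType: SimpleType,
    writeAsSqlType: Boolean) extends SimpleType {
  def jsonValue: String =
    if (writeAsSqlType) sqlType.jsonValue
    else s"""{"type":"udt","class":"$className","sqlType":${sqlType.jsonValue}}"""
}

object MergeSketch {
  // Merge the type of an existing column (left) with an incoming one (right).
  def merge(left: SimpleType, right: SimpleType): Either[String, SimpleType] =
    (left, right) match {
      // Identical types always merge.
      case (l, r) if l == r => Right(l)
      // The extra rule: a UDT merges into the native type it is stored as.
      case (native, udt: SimpleUdt) if udt.sqlType == native => Right(native)
      // Anything else is treated as incompatible.
      case (l, r) => Left(s"Failed to merge incompatible types $l and $r")
    }
}
```

In this model, a `SimpleUdt` with `writeAsSqlType = true` serializes its schema exactly like `SimpleTimestamp`, and `MergeSketch.merge(SimpleTimestamp, udt)` yields `Right(SimpleTimestamp)`. Removing the second `case` makes the same call return a `Left`, which mirrors the append failure the comment describes.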