Fokko commented on a change in pull request #26644: [SPARK-30004][SQL] Allow merge UserDefinedType into a native DataType
URL: https://github.com/apache/spark/pull/26644#discussion_r350673743
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/types/TestUDT.scala
 ##########
 @@ -58,4 +63,22 @@ object TestUDT {
 
     override def equals(other: Any): Boolean = other.isInstanceOf[MyDenseVectorUDT]
   }
+
+  private[sql] class MyXMLGregorianCalendarUDT extends UserDefinedType[XMLGregorianCalendar] {
+    override def sqlType: DataType = TimestampType
+
+    override def serialize(obj: XMLGregorianCalendar): Any =
+      obj.toGregorianCalendar.getTimeInMillis * 1000
+
+    override def deserialize(datum: Any): XMLGregorianCalendar = {
+      val calendar = new GregorianCalendar
+      // serialize stores microseconds, so convert back to milliseconds here
+      calendar.setTimeInMillis(datum.asInstanceOf[Long] / 1000)
+      DatatypeFactory.newInstance.newXMLGregorianCalendar(calendar)
+    }
+
+    override def userClass: Class[XMLGregorianCalendar] = classOf[XMLGregorianCalendar]
+
+    // By setting this to a timestamp, we lose the information about the udt
+    override private[sql] def jsonValue: JValue = "timestamp"
 
 Review comment:
   With the normal UDT `jsonValue`, we would get:
   ```
     override private[sql] def jsonValue: JValue = {
       ("type" -> "udt") ~
         ("class" -> this.getClass.getName) ~
         ("pyClass" -> pyUDT) ~
         ("sqlType" -> sqlType.jsonValue)
     }
   ```
   This writes the type as the UDT. If you later try to read the column in
   another job where the UDT has not been registered through `UDTRegistration`,
   the read fails with an error. Therefore we would like to write the
   `XMLGregorianCalendar` as a normal timestamp. However, when we append to the
   table, we want to be able to merge the `XMLGregorianCalendar` UDT into the
   timestamp. Without the added rule this wouldn't be possible. Hope this helps.
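
   To make the intent of the rule concrete, here is a minimal sketch of the
   idea. It is a toy model and not the code in this PR: `ToyType`,
   `ToyTimestamp`, `ToyString`, `ToyUdt` and `ToyMerge` are hypothetical
   stand-ins for Spark's `DataType`, `TimestampType`, `StringType`,
   `UserDefinedType` and the schema merge logic. The only point is that a UDT
   is allowed to merge into the native type it is stored as.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
sealed trait ToyType
case object ToyTimestamp extends ToyType
case object ToyString extends ToyType

// A user-defined type that is stored as some underlying native type.
final case class ToyUdt(name: String, sqlType: ToyType) extends ToyType

object ToyMerge {
  // Returns the merged type, or None when the two types are incompatible.
  def merge(left: ToyType, right: ToyType): Option[ToyType] = (left, right) match {
    // Identical types always merge to themselves.
    case (l, r) if l == r => Some(l)
    // The extra rule: a UDT merges into the native type it is stored as.
    case (ToyUdt(_, sql), native) if sql == native => Some(native)
    case (native, ToyUdt(_, sql)) if sql == native => Some(native)
    // Anything else is treated as a conflict in this sketch.
    case _ => None
  }
}
```

   In this sketch, a table column written as `ToyTimestamp` can be appended
   with data typed as `ToyUdt("MyXMLGregorianCalendarUDT", ToyTimestamp)`, and
   the merged column stays a plain timestamp. Without the two UDT cases the
   same append would fall through to the conflict case.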
