[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555151#comment-14555151 ]
Justin Uang commented on SPARK-7768:
------------------------------------

Agreed. For example, we wanted to add a ZonedDateTimeUDT, so on the Python side we ended up having to create a wrapper class called ZonedDateTime, which was quite unfortunate.

{code}
class ZonedDateTime(object):
    """
    Wrapper class for datetime
    """
    # ZonedDateTimeUDT is our UDT class, defined elsewhere (not shown here).
    __UDT__ = ZonedDateTimeUDT()

    def __init__(self, dt):
        self.dt = dt

    def __repr__(self):
        return "ZonedDateTime({})".format(self.dt.__repr__())

    def __eq__(self, other):
        return type(self) == type(other) and self.__dict__ == other.__dict__

    def __ne__(self, other):
        return not self.__eq__(other)
{code}

The only tradeoff is that a specific Java/Python type might now map to multiple UDTs, so we might run into cases where we can't unambiguously infer what the Catalyst types are.

> Make user-defined type (UDT) API public
> ---------------------------------------
>
>                 Key: SPARK-7768
>                 URL: https://issues.apache.org/jira/browse/SPARK-7768
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Xiangrui Meng
>            Priority: Critical
>
> As the demand for UDTs increases beyond sparse/dense vectors in MLlib, it
> would be nice to make the UDT API public in 1.5.
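
To make the inference tradeoff mentioned in the comment concrete, here is a minimal sketch in plain Python. It does not use the Spark API: ToyUDT, UDTRegistry, register and infer_udt are hypothetical names invented for illustration, and LocalDateTimeUDT is a made-up second UDT. It only shows how inference stops being unambiguous once a single Python type (here datetime) is associated with more than one UDT.

```python
from datetime import datetime


class ToyUDT(object):
    """Hypothetical stand-in for a UDT; this is NOT the pyspark class."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "ToyUDT({!r})".format(self.name)


class UDTRegistry(object):
    """Hypothetical registry mapping a plain Python type to candidate UDTs."""

    def __init__(self):
        self._by_type = {}

    def register(self, py_type, udt):
        # Allow several UDTs per Python type, as the comment envisions.
        self._by_type.setdefault(py_type, []).append(udt)

    def infer_udt(self, value):
        # Inference only works when exactly one UDT is registered for the type.
        candidates = self._by_type.get(type(value), [])
        if not candidates:
            raise TypeError("no UDT registered for {}".format(type(value).__name__))
        if len(candidates) > 1:
            raise TypeError(
                "ambiguous: {} maps to {}".format(type(value).__name__, candidates)
            )
        return candidates[0]


if __name__ == "__main__":
    registry = UDTRegistry()
    registry.register(datetime, ToyUDT("ZonedDateTimeUDT"))
    print(registry.infer_udt(datetime(2015, 5, 21)))  # one candidate, so inference works

    registry.register(datetime, ToyUDT("LocalDateTimeUDT"))  # made-up second UDT
    try:
        registry.infer_udt(datetime(2015, 5, 21))
    except TypeError as exc:
        print("inference failed:", exc)
```

In a design like this, the caller would presumably have to state the intended UDT explicitly (for example in a schema) whenever more than one candidate exists, which is the cost of dropping the wrapper class.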