FYI, org.apache.spark.unsafe.types.CalendarInterval is undocumented in both the Scaladoc and Javadoc (the entire unsafe module is), but org.apache.spark.sql.types.CalendarIntervalType is exposed ( https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.types.CalendarIntervalType )
+1 for starting the discussion after 2.4.0. I would suggest we defer, as I said in the PR.

On Sun, Jul 29, 2018 at 6:58 PM, Daniel Mateus Pires <dmate...@gmail.com> wrote:

> Sounds good! @Xiao
>
> @Reynold AFAIK the only data type that is valid to cast to Calendar
> Interval is VARCHAR.
>
> Here is Postgres:
>
> postgres=# select CAST(CAST(interval '1 hour' AS varchar) AS interval);
>  interval
> ----------
>  01:00:00
> (1 row)
>
> (snippet comes from the JIRA)
>
> Thanks,
>
> Daniel
>
> On 27 July 2018 at 20:38, Xiao Li <gatorsm...@gmail.com> wrote:
>
>> The code freeze of the upcoming Spark 2.4 release is very close. How
>> about revisiting this and explicitly defining the supported scope
>> of CalendarIntervalType in the next release (Spark 3.0)?
>>
>> Thanks,
>>
>> Xiao
>>
>> 2018-07-27 10:45 GMT-07:00 Reynold Xin <r...@databricks.com>:
>>
>>> CalendarInterval is definitely externally visible.
>>>
>>> E.g. sql("select interval 1 day").dtypes would return
>>> "Array[(String, String)] = Array((interval 1 days,CalendarIntervalType))".
>>>
>>> However, I'm not sure what it means to support casting. What are the
>>> semantics for casting from any other data type to calendar interval? I
>>> can see casting from string and from the type itself, but not from any
>>> other data type.
>>>
>>> On Fri, Jul 27, 2018 at 10:34 AM, Daniel Mateus Pires
>>> <dmate...@gmail.com> wrote:
>>>
>>>> Hi Sparkers! (maybe Sparkles?)
>>>>
>>>> I just wanted to bring up the apparently "controversial" Calendar
>>>> Interval topic.
>>>>
>>>> I worked on https://issues.apache.org/jira/browse/SPARK-24702 and
>>>> https://github.com/apache/spark/pull/21706.
>>>>
>>>> The user reported unexpected behaviour: they were unable to cast to
>>>> the Calendar Interval type.
>>>> In the current version of Spark the following code works:
>>>>
>>>> scala> spark.sql("SELECT 'interval 1 hour' as a")
>>>>          .select(col("a").cast("calendarinterval")).show()
>>>> +----------------+
>>>> |               a|
>>>> +----------------+
>>>> |interval 1 hours|
>>>> +----------------+
>>>>
>>>> While the following doesn't:
>>>>
>>>> spark.sql("SELECT CALENDARINTERVAL('interval 1 hour') as a").show()
>>>>
>>>> Since the DataFrame API equivalent of the SQL worked, I thought adding
>>>> it would be an easy decision (it makes the two APIs consistent).
>>>>
>>>> However, I got push-back on the PR on the basis that "we do not plan
>>>> to expose Calendar Interval as a public type".
>>>>
>>>> Should there be a consensus on either removing CalendarIntervalType
>>>> from the public DataFrame API, OR making it consistent with the SQL?
>>>>
>>>> --
>>>> Best regards,
>>>> Daniel Mateus Pires
>>>> Data Engineer @ Hudson's Bay Company
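Reynold's question about cast semantics comes down to what a string-to-interval cast must do: parse the literal form of the interval. As a purely illustrative sketch (in Python, NOT Spark's CalendarInterval code; the function name, unit table, and seconds-based result are all assumptions for demonstration), such a parser might look like:

```python
import re

# Seconds per unit, for the unit names seen in the quoted examples.
# This table and the collapse-to-seconds design are assumptions; Spark's
# real CalendarInterval tracks month and microsecond components instead.
_UNIT_SECONDS = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "week": 604800,
}

def parse_interval(text: str) -> int:
    """Parse a string like "interval 1 hour 30 minutes" into total seconds."""
    # Drop the leading "interval" keyword, case-insensitively.
    body = re.sub(r"^\s*interval\s+", "", text.strip(), flags=re.IGNORECASE)
    # Collect every "<number> <unit>" pair, tolerating a plural "s".
    pairs = re.findall(
        r"(\d+)\s*(second|minute|hour|day|week)s?",
        body,
        flags=re.IGNORECASE,
    )
    if not pairs:
        raise ValueError(f"cannot parse interval string: {text!r}")
    return sum(int(n) * _UNIT_SECONDS[unit.lower()] for n, unit in pairs)

print(parse_interval("interval 1 hour"))          # 3600
print(parse_interval("interval 1 day 12 hours"))  # 129600
```

The sketch only shows why casting from string is well-defined while casting from, say, an integer is not: a string carries the unit information the parser needs, whereas a bare number does not.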