Hi Sparkers! (maybe Sparkles?)

I just wanted to bring up the apparently "controversial" Calendar Interval
topic.

I worked on: https://issues.apache.org/jira/browse/SPARK-24702 and
https://github.com/apache/spark/pull/21706

The user reported an unexpected behaviour: they weren’t able to cast to the
Calendar Interval type.

In the current version of Spark the following code works:
scala> spark.sql("SELECT 'interval 1 hour' as a").select(col("a").cast("calendarinterval")).show()
+----------------+
|               a|
+----------------+
|interval 1 hours|
+----------------+

While the following doesn’t:
spark.sql("SELECT CALENDARINTERVAL('interval 1 hour') as a").show()
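For comparison, other atomic types already have constructor-style functions in
SQL (registered internally as cast aliases), and that is presumably the shape
the user expected for CALENDARINTERVAL. For example, these work today (a quick
illustration of the existing cast aliases, not taken from the JIRA):

scala> spark.sql("SELECT string(1) AS a, int('2') AS b").show()

Those function calls are just casts under the hood; CalendarIntervalType is the
odd one out.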


Since the DataFrame API equivalent of the SQL already worked, I thought adding
the SQL function would be an easy decision, purely to keep the two APIs
consistent.

However, I got push-back on the PR on the basis that “we do not plan to expose
Calendar Interval as a public type”.
Should we aim for a consensus on either removing CalendarIntervalType from the
public DataFrame API, OR making the SQL side consistent with it?

--
Best regards,
Daniel Mateus Pires
Data Engineer @ Hudson's Bay Company
