FYI, org.apache.spark.unsafe.types.CalendarInterval is undocumented in both
scaladoc and javadoc (the entire unsafe module is), but
org.apache.spark.sql.types.CalendarIntervalType is exposed:
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.types.CalendarIntervalType
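
To make the asymmetry concrete, a small sketch (assuming a Spark 2.x
classpath; the (months, microseconds) constructor shown is the 2.x one):

import org.apache.spark.sql.types.CalendarIntervalType  // documented public type
import org.apache.spark.unsafe.types.CalendarInterval   // undocumented unsafe class

val dt: org.apache.spark.sql.types.DataType = CalendarIntervalType
println(dt.typeName)  // calendarinterval

// one hour expressed as the internal value class: 0 months, 3.6e9 microseconds
val iv = new CalendarInterval(0, 3600L * 1000 * 1000)
println(iv)  // prints something like "interval 1 hours"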

+1 for starting the discussion after 2.4.0. As I said in the PR, I would
suggest deferring it.

On Sun, Jul 29, 2018 at 6:58 PM, Daniel Mateus Pires <dmate...@gmail.com> wrote:

> Sounds good! @Xiao
>
> @Reynold AFAIK the only data type that is valid to cast to a calendar
> interval is VARCHAR.
>
> Here is the behaviour in Postgres:
>
> postgres=# select CAST(CAST(interval '1 hour' AS varchar) AS interval);
>  interval
> ----------
>  01:00:00
> (1 row)
>
> (snippet comes from the JIRA)
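>
> For comparison, a rough Spark-side sketch of the same string round-trip,
> via the DataFrame-API cast that works today (a sketch assuming a Spark
> 2.x shell; the trailing cast back to string goes through the interval's
> toString):
>
> import org.apache.spark.sql.functions.col
>
> spark.sql("SELECT 'interval 1 hour' AS a")
>   .select(col("a").cast("calendarinterval").cast("string"))
>   .show()
> // expected output: interval 1 hours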
>
> Thanks,
>
> Daniel
>
>
> On 27 July 2018 at 20:38, Xiao Li <gatorsm...@gmail.com> wrote:
>
>> The code freeze for the upcoming Spark 2.4 release is very close. How
>> about revisiting this and explicitly defining the support scope
>> of CalendarIntervalType in the next release (Spark 3.0)?
>>
>> Thanks,
>>
>> Xiao
>>
>>
>> 2018-07-27 10:45 GMT-07:00 Reynold Xin <r...@databricks.com>:
>>
>>> CalendarInterval is definitely externally visible.
>>>
>>> E.g. sql("select interval 1 day").dtypes would return "Array[(String,
>>> String)] = Array((interval 1 days,CalendarIntervalType))"
>>>
>>> However, I'm not sure what it means to support casting. What are the
>>> semantics for casting from any other data type to calendar interval? I can
>>> see string casting and casting from itself, but not from any other data type.
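>>>
>>> For what it's worth, a quick sketch of the string direction that does
>>> work today (assuming a Spark 2.x shell; casting any type to string
>>> falls back to the value's toString):
>>>
>>> spark.sql("SELECT CAST(interval 1 day AS string) AS s").show()
>>> // +---------------+
>>> // |              s|
>>> // +---------------+
>>> // |interval 1 days|
>>> // +---------------+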
>>>
>>> On Fri, Jul 27, 2018 at 10:34 AM Daniel Mateus Pires <dmate...@gmail.com>
>>> wrote:
>>>
>>>> Hi Sparkers! (maybe Sparkles?)
>>>>
>>>> I just wanted to bring up the apparently "controversial" Calendar
>>>> Interval topic.
>>>>
>>>> I worked on: https://issues.apache.org/jira/browse/SPARK-24702,
>>>> https://github.com/apache/spark/pull/21706
>>>>
>>>> The user was reporting an unexpected behaviour where he/she wasn’t able
>>>> to cast to a Calendar Interval type.
>>>>
>>>> In the current version of Spark the following code works:
>>>>
>>>> scala> spark.sql("SELECT 'interval 1 hour' as a").select(col("a").cast("calendarinterval")).show()
>>>> +----------------+
>>>> |               a|
>>>> +----------------+
>>>> |interval 1 hours|
>>>> +----------------+
>>>>
>>>>
>>>> While the following doesn't:
>>>>
>>>> spark.sql("SELECT CALENDARINTERVAL('interval 1 hour') as a").show()
>>>>
>>>>
>>>> Since the DataFrame API equivalent of the SQL worked, I thought adding
>>>> it would be an easy decision (to make the two consistent).
>>>>
>>>> However, I got push-back on the PR on the basis that "*we do not plan
>>>> to expose Calendar Interval as a public type*".
>>>> Should there be a consensus on either removing CalendarIntervalType
>>>> from the public DataFrame API OR making the SQL side consistent with it?
>>>>
>>>> --
>>>> Best regards,
>>>> Daniel Mateus Pires
>>>> Data Engineer @ Hudson's Bay Company
>>>>
>>>
>>
>
