> It's unclear to me that this needs to be introduced into the top-level

The same problem applies to the columnar format itself: how would you store an 
interval like 1 month 1 day 1 hour? It’s not possible without converting 
1 month to 30 days, which is incorrect because months differ in length.
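
To illustrate, here is a rough sketch only, not what the PR does. The names 
ComplexInterval, is_leap_year, days_in_month and one_month_in_days are made up 
for this example, and the three-field layout is an assumption. It shows that 
"1 month" covers a different number of days depending on the starting month, 
so the month part has to be stored as written:

```rust
/// Hypothetical layout: every component is kept exactly as the user wrote it.
/// Nothing is normalized into days or milliseconds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ComplexInterval {
    months: i32,
    days: i32,
    millis: i32,
}

/// Gregorian leap-year rule.
fn is_leap_year(year: i32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Number of days in the given month (1 = January, 12 = December).
fn days_in_month(year: i32, month: u32) -> u32 {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 => {
            if is_leap_year(year) {
                29
            } else {
                28
            }
        }
        _ => panic!("month must be in 1..=12"),
    }
}

/// Days covered by "+ 1 month" from a date in the given month, assuming the
/// day-of-month also exists in the following month (e.g. the 15th).
fn one_month_in_days(year: i32, month: u32) -> u32 {
    days_in_month(year, month)
}
```

With these helpers, one_month_in_days(2021, 1) is 31 and 
one_month_in_days(2021, 2) is 28, so no fixed factor such as 30 is right. 
An interval like ComplexInterval { months: 1, days: 1, millis: 3_600_000 } 
keeps all three parts until it is applied to a concrete timestamp.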

> On 17 Feb 2021, at 21:02, Wes McKinney <wesmck...@gmail.com> wrote:
> 
> It's unclear to me that this needs to be introduced into the top-level
> columnar format without more analysis — have you considered
> implementing this for DataFusion as an extension type for the time
> being?
> 
> On Wed, Feb 17, 2021 at 11:59 AM t...@dmtry.me <t...@dmtry.me> wrote:
>> 
>> Hi,
>> 
>> At the moment, there are only two types of IntervalUnit in Arrow:
>> 
>> - YearMonth - months stored as int32
>> - DayTime - days as int32 and time in milliseconds as int32 (64 bits in total)
>> 
>> Since DF uses Arrow, it’s not possible to store “complex” intervals such as 
>> 1 MONTH 1 DAY 1 HOUR.
>> I think the best way to understand the problem is to read this comment 
>> from the DF codebase: 
>> https://github.com/apache/arrow/blob/bca7d2fe84ccd8fc1129cb4d85448eb0779c52c3/rust/datafusion/src/sql/planner.rs#L1148
>> 
>>        // Interval is tricky thing
>>        // 1 day is not 24 hours because timezones, 1 year != 365/364! 30 days != 1 month
>>        // The true way to store and calculate intervals is to store it as it defined
>>        // Due the fact that Arrow supports only two types YearMonth (month) and DayTime (day, time)
>>        // It's not possible to store complex intervals
>>        // It's possible to do select (NOW() + INTERVAL '1 year') + INTERVAL '1 day'; as workaround
>>        if result_month != 0 && (result_days != 0 || result_millis != 0) {
>>            return Err(DataFusionError::NotImplemented(format!(
>>                "DF does not support intervals that have both a Year/Month part as well as Days/Hours/Mins/Seconds: {:?}. Hint: try breaking the interval into two parts, one with Year/Month and the other with Days/Hours/Mins/Seconds - e.g. (NOW() + INTERVAL '1 year') + INTERVAL '1 day'",
>>                value
>>            )));
>>        }
>> 
>> 
>> 
>> I prepared a PR, https://github.com/apache/arrow/pull/9516/files, that 
>> introduces a new IntervalUnit type called Complex, which stores both 
>> YearMonth and DayTime to support complex intervals.
>> I couldn’t find any documentation on how to propose an RFC for Arrow, so 
>> can anyone point me to it, or will the PR together with this email be enough?
>> 
>> Thanks.
