My intuition is that this would not be as easy as decimal_v2 to put
behind a flag. Is that right?



On Wed, Nov 1, 2017 at 2:26 PM, Greg Rahn <greg.r...@gmail.com> wrote:
> It might not be that simple.
>
> One side-effect of the return type promotion of EXTRACT and DATE_PART() is
> that INSERTs that use either could potentially fail after this change
> simply because of the type mismatch and the fact that Impala does not do
> implicit type casting even if the value could fit.
>
> For example:
> create table t(i int);  -- assume this is a destination for any EXTRACT or
> DATE_PART() value
> insert into t values( cast(10 as bigint));  -- emulating BIGINT return type
> ERROR: Expression 'cast(10 as bigint)' (type: BIGINT) would need to be cast
> to INT for column 'I'
>
> The solution here is an obvious CAST because w/o the nanosecond support, no
> values can overflow INT, but that still requires changes from end users.
>
> Similarly with #2, there is a value change.  Arguably a good change, but a
> change nonetheless.  Again, there is a trivial solution to provide an
> equivalent expression that returns the legacy value, but again, a change.
>
> Based on this I would suggest we apply this change to the 3.0 branch and
> make sure we call out both behavioral changes in the 3.0 docs.
>
>
>
> On Tue, Oct 31, 2017 at 2:48 PM, Zachary Amsden <zams...@cloudera.com>
> wrote:
>
>> In discussion about IMPALA-5607, it came up that implementing this as
>> described is a compatibility breaking change.  There are a couple of
>> required changes:
>>
>> 1) Return data type for date_part and EXTRACT FROM must be promoted to
>> BIGINT.
>> 2) MILLISECONDS now will include seconds part in the calculation of
>> milliseconds.
>>
>> I think the first is a non-issue; we're promoting the type to be wide
>> enough to hold nanoseconds precision values (with the seconds component
>> included).  An alternative could be to return this as a decimal type, but
>> that seems rather unwieldy for other date expressions so I'd prefer these
>> values to all be returned as integral types.
>>
>> The bigger issue is including seconds in the calculation of milliseconds,
>> microseconds and nanoseconds breaks the existing value returned for
>> milliseconds, which is just bare milliseconds with no seconds component.
>> For compatibility with other SQL implementations, I think we'd like to
>> include seconds with all of these date parts, but that is certainly
>> debatable.
>>
>> The question then, is anyone relying on this functionality that can't
>> easily workaround such a change?  The Impala documentation doesn't specify
>> this behavior either way, and there isn't a formal specification for how
>> sub-second granularity time is handled.  Whatever we decide, we should
>> document this going forward.
>>
>>  - Zach
>>

Reply via email to