[ https://issues.apache.org/jira/browse/IMPALA-8231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabor Kaszab updated IMPALA-8231: --------------------------------- Description: Impala allows e.g. having multiple year sections in a datetime pattern. {code:java} select to_timestamp('2018-21-01-01', 'yyyy-yy-MM-dd'); +------------------------------------------------+ │······················ | to_timestamp('2018-21-01-01', 'yyyy-yy-mm-dd') | │······················ +------------------------------------------------+ │······················ | 2021-01-01 00:00:00 | │······················ +------------------------------------------------+ {code} Here even the result is something weird: {code:java} select to_timestamp('21-2018-01-01', 'yy-yyyy-MM-dd'); +------------------------------------------------+ │······················ | to_timestamp('21-2018-01-01', 'yy-yyyy-mm-dd') | │······················ +------------------------------------------------+ │······················ | 3918-01-01 00:00:00 | │······················ +------------------------------------------------+ {code} I think having the mentioned patterns in a from_timestamp() is fine as that wouldn't make any inconsistencies in the result. However, in a to_timestamp() it's ambiguous which section to use for populating e.g. the year part of a timestamp. In that case I think returning an error is reasonable. +This proposal is in line with what Oracle does:+ Oracle forbids the same: {code:java} select to_timestamp('2018-19-11-19', 'YYYY-YY-MM-DD') from DUAL; ORA-01812: year may only be specified once {code} But Oracle allows the same format for conversions the other way around: {code:java} select to_char( to_timestamp('2018-11-19', 'YYYY-MM-DD'), 'YYYY-YY-MM-DD') from DUAL; 2018-18-11-19 {code} Note, that this issue is also true for any other datetime pattern element as there is no duplicate or conflict check during parsing. was: Impala allows e.g. having multiple year sections in a datetime pattern. {code:java} select to_timestamp('2018-21-01-01', 'yyyy-yy-MM-dd'); +------------------------------------------------+ │······················ | to_timestamp('2018-21-01-01', 'yyyy-yy-mm-dd') | │······················ +------------------------------------------------+ │······················ | 2021-01-01 00:00:00 | │······················ +------------------------------------------------+ {code} Here even the result is something weird: {code:java} select to_timestamp('21-2018-01-01', 'yy-yyyy-MM-dd'); +------------------------------------------------+ │······················ | to_timestamp('21-2018-01-01', 'yy-yyyy-mm-dd') | │······················ +------------------------------------------------+ │······················ | 3918-01-01 00:00:00 | │······················ +------------------------------------------------+ {code} I think having the mentioned patterns in a from_timestamp() is fine as that wouldn't make any inconsistencies in the result. However, in a to_timestamp() it's ambiguous which section to use for populating e.g. the year part of a timestamp. In that case I think returning an error is reasonable. Oracle forbids the same: {code:java} select to_timestamp('2018-19-11-19', 'YYYY-YY-MM-DD') from DUAL; ORA-01812: year may only be specified once {code} Note, that this issue is also true for any other datetime pattern element as there is no duplicate or conflict check during parsing. > Impala allows ambiguous datetime patterns with to_timestamp > ----------------------------------------------------------- > > Key: IMPALA-8231 > URL: https://issues.apache.org/jira/browse/IMPALA-8231 > Project: IMPALA > Issue Type: Bug > Components: Backend > Affects Versions: Impala 3.1.0 > Reporter: Gabor Kaszab > Priority: Major > > Impala allows e.g. having multiple year sections in a datetime pattern. > {code:java} > select to_timestamp('2018-21-01-01', 'yyyy-yy-MM-dd'); > +------------------------------------------------+ > │······················ > | to_timestamp('2018-21-01-01', 'yyyy-yy-mm-dd') | > │······················ > +------------------------------------------------+ > │······················ > | 2021-01-01 00:00:00 | > │······················ > +------------------------------------------------+ > {code} > Here even the result is something weird: > {code:java} > select to_timestamp('21-2018-01-01', 'yy-yyyy-MM-dd'); > +------------------------------------------------+ > │······················ > | to_timestamp('21-2018-01-01', 'yy-yyyy-mm-dd') | > │······················ > +------------------------------------------------+ > │······················ > | 3918-01-01 00:00:00 | > │······················ > +------------------------------------------------+ > {code} > I think having the mentioned patterns in a from_timestamp() is fine as that > wouldn't make any inconsistencies in the result. However, in a to_timestamp() > it's ambiguous which section to use for populating e.g. the year part of a > timestamp. In that case I think returning an error is reasonable. > > +This proposal is in line with what Oracle does:+ > Oracle forbids the same: > {code:java} > select to_timestamp('2018-19-11-19', 'YYYY-YY-MM-DD') from DUAL; > ORA-01812: year may only be specified once > {code} > But Oracle allows the same format for conversions the other way around: > {code:java} > select to_char( to_timestamp('2018-11-19', 'YYYY-MM-DD'), > 'YYYY-YY-MM-DD') from DUAL; > 2018-18-11-19 > {code} > Note, that this issue is also true for any other datetime pattern element as > there is no duplicate or conflict check during parsing. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org