[ https://issues.apache.org/jira/browse/IMPALA-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Csaba Ringhofer closed IMPALA-7723.
-----------------------------------
    Resolution: Invalid

> Recognize int64 timestamps in CREATE TABLE LIKE PARQUET
> -------------------------------------------------------
>
>                 Key: IMPALA-7723
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7723
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Csaba Ringhofer
>            Priority: Minor
>              Labels: parquet
>
> IMPALA-5050 adds support for reading int64-encoded Parquet timestamps. These
> columns have the int64 physical type, so their converted/logical types have to
> be used to differentiate them from BIGINTs. These columns can be read either as
> BIGINTs or as TIMESTAMPs, depending on the table's schema.
> CREATE TABLE LIKE PARQUET could also convert these columns to TIMESTAMP
> instead of BIGINT, but I decided to postpone adding this feature for two
> reasons:
> 1. It could break the following possible workflow:
> - generate Parquet files (that contain int64 timestamps) with some tool
> - use Impala's CREATE TABLE LIKE PARQUET + LOAD DATA to make the data
>   accessible as a table
> - run some queries that rely on interpreting these columns as integers
> A CAST(col AS BIGINT) in the query would make this even worse, as it would
> convert the timestamp to Unix time in seconds instead of micros/millis without
> any warning.
> 2. Adding support for int64 timestamps with nanosecond precision will require
> bumping Impala's parquet-hadoop-bundle dependency to a new major version,
> which may contain incompatible API changes.
> Note that parquet-hadoop-bundle is only used in CREATE TABLE LIKE PARQUET.
> The C++ parts of Impala rely only on parquet.thrift, which can be updated
> more easily.
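The silent unit change described in reason 1 can be illustrated with a minimal Python sketch. This is not Impala code: the function names and the sample value are hypothetical, and the sketch assumes a microsecond-precision int64 timestamp interpreted in UTC. It only mimics the behaviour the issue describes, where the same stored value yields micros when the column is a BIGINT but seconds once the column is a TIMESTAMP and the query casts it to BIGINT.

```python
import calendar
from datetime import datetime, timezone

MICROS_PER_SECOND = 1_000_000

def read_as_bigint(raw_micros: int) -> int:
    """Column declared as BIGINT: the stored int64 is returned unchanged."""
    return raw_micros

def read_as_timestamp(raw_micros: int) -> datetime:
    """Column declared as TIMESTAMP: the int64 is decoded as micros since the epoch."""
    seconds, micros = divmod(raw_micros, MICROS_PER_SECOND)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=micros)

def cast_timestamp_to_bigint(ts: datetime) -> int:
    """Mimics a CAST(ts AS BIGINT) that yields Unix time in whole seconds."""
    return calendar.timegm(ts.utctimetuple())

# Hypothetical stored value: microseconds since the Unix epoch.
raw = 1_539_600_000_123_456

as_integer = read_as_bigint(raw)
after_cast = cast_timestamp_to_bigint(read_as_timestamp(raw))

print(as_integer)  # value an existing integer-based query expects (micros)
print(after_cast)  # value the same query would see after the schema change (seconds)
```

Under these assumptions the two results differ by a factor of one million and the sub-second part is lost, which is why a query written against the BIGINT interpretation would return wrong results without any error.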