Bobeff created HIVE-13534: ----------------------------- Summary: xception when trying to access TIMESTAMP columns into parquet file using hive external table Key: HIVE-13534 URL: https://issues.apache.org/jira/browse/HIVE-13534 Project: Hive Issue Type: Bug Components: File Formats, Hive, Import/Export, JDBC Reporter: Bobeff Assignee: Sushanth Sowmyan Priority: Critical
Imported data was stored from a netezza datasource using a sqoop import command like this SQL DDL creation script of imported table looks like this CREATE TABLE "ADMIN"."MIS_AUX_ITR" ( "DDEBVAL" DATE, "DFINVAL" DATE, "NAUX" VARCHAR(6), "CDMNITR" VARCHAR(3), "CDERIMG" VARCHAR(1), "DDERIMG" DATE ); Import sqoop job is the following sqoop job --create import-name -- import --connect jdbc:netezza://server:port/database --username user --password pwd --table MIS_AUX_ITR --as-parquetfile --target-dir hdfs:///prod/ZA/dee/MIS_AUX_ITR -m 1 After import parquet file schema is the following > yarn jar /tmp/parquet-tools-1.6.0.jar schema > /prod/ZA/dee/MIS_AUX_ITR/2cf3e971-4c2c-408f-bd86-5d3cf3bd4fa5.parquet message MIS_AUX_ITR { optional int64 DDEBVAL; optional int64 DFINVAL; optional binary NAUX (UTF8); optional binary CDMNITR (UTF8); optional binary CDERIMG (UTF8); optional int64 DDERIMG; } In order to access data stored into the parquet file we created the external table below CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR ( `DDEBVAL` DATE, `DFINVAL` DATE, `NAUX` VARCHAR(6), `CDMNITR` VARCHAR(3), `CDERIMG` VARCHAR(1), `DDERIMG` DATE ) COMMENT 'Table DEE MIS_AUX_ITR' STORED AS PARQUET LOCATION '/prod/ZA/dee/MIS_AUX_ITR'; But when we try to list data from external table above we get the following exception hive> CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR_V_PPROD > ( > `DDEBVAL`DATE, > `DFINVAL`DATE, > `NAUX`VARCHAR(6), > `CDMNITR`VARCHAR(3), > `CDERIMG`VARCHAR(1), > `DDERIMG`DATE > ) > COMMENT 'Table DEE MIS_AUX_ITR_V_PROD' > STORED AS PARQUET > LOCATION > '/prod/ZA/dee/MIS_AUX_ITR_V_PPROD'; OK Time taken: 0.196 seconds hive> select * from za_dee.MIS_AUX_ITR_V_PPROD limit 100; OK SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.hive.serde2.io.DateWritable Time taken: 0.529 seconds hive> We also tried with the following external table CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR_V_PPROD_BI ( `DDEBVAL` BIGINT, `DFINVAL` BIGINT, `NAUX` VARCHAR(6), `CDMNITR` VARCHAR(3), `CDERIMG` VARCHAR(1), `DDERIMG` BIGINT ) COMMENT 'Table DEE MIS_AUX_ITR_V_PROD_BI' STORED AS PARQUET LOCATION '/prod/ZA/dee/MIS_AUX_ITR_V_PPROD'; Then the “Date” columns are shown as “timestamp” values as below hive> select DDEBVAL from za_dee.MIS_AUX_ITR_V_PPROD_BI limit 5; OK 1080770400000 1080770400000 1080770400000 1080770400000 1080770400000 Time taken: 0.081 seconds, Fetched: 5 row(s) hive> However “Date” values can be listed by casting as Timestamp hive> select cast(DDEBVAL as Timestamp) from za_dee.MIS_AUX_ITR_V_PPROD_BI limit 5; OK 2004-04-01 00:00:00 2004-04-01 00:00:00 2004-04-01 00:00:00 2004-04-01 00:00:00 2004-04-01 00:00:00 Time taken: 0.087 seconds, Fetched: 5 row(s) hive> We also have tested with an external table using TIMESTAMP type as shown below CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR ( `DDEBVAL` TIMESTAMP, `DFINVAL` TIMESTAMP, `NAUX` VARCHAR(6), `CDMNITR` VARCHAR(3), `CDERIMG` VARCHAR(1), `DDERIMG` TIMESTAMP ) COMMENT 'Table DEE MIS_AUX_ITR' STORED AS PARQUET LOCATION '/prod/ZA/dee/MIS_AUX_ITR'; But we got the same behavior: an exception when trying to access data from an Oracle DB. I tried this CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR_V_PPROD_TS ( `DDEBVAL` TIMESTAMP, `DFINVAL` TIMESTAMP, `NAUX` VARCHAR(6), `CDMNITR` VARCHAR(3), `CDERIMG` VARCHAR(1), `DDERIMG` TIMESTAMP ) COMMENT 'Table DEE MIS_AUX_ITR_V_PROD_TS' STORED AS PARQUET LOCATION '/prod/ZA/dee/MIS_AUX_ITR_V_PPROD'; and then i created and launched the sqoop job below sqoop job --create import-za_dee-MIS_AUX_ITR_V-full-default-import-PPROD -- import --connect jdbc:netezza:/<server>:<port>/db --username <user> --password <password> --table MIS_AUX_ITR_V --as-parquetfile --hive-import --hive-overwrite --hive-database za_dee --hive-table MIS_AUX_ITR_V_PPROD_TS -m 1 sqoop job --exec import-za_dee-MIS_AUX_ITR_V-full-default-import-PPROD the raising error is the following 16/04/11 17:15:09 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-sqoop/compile/3533e18a81a65fb8eb88ec9cef2f4688/codegen_MIS_AUX_ITR_V.jar 16/04/11 17:15:09 WARN manager.NetezzaManager: It looks like you are importing from Netezza. 16/04/11 17:15:09 WARN manager.NetezzaManager: This transfer can be faster! Use the --direct 16/04/11 17:15:09 WARN manager.NetezzaManager: option to exercise a Netezza-specific fast path. 16/04/11 17:15:17 INFO mapreduce.ImportJobBase: Beginning import of MIS_AUX_ITR_V 16/04/11 17:15:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "MIS_AUX_ITR_V" AS t WHERE 1=0 16/04/11 17:15:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "MIS_AUX_ITR_V" AS t WHERE 1=0 16/04/11 17:15:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "MIS_AUX_ITR_V" AS t WHERE 1=0 16/04/11 17:15:19 INFO hive.metastore: Trying to connect to metastore with URI thrift://slhdm007.maif.local:9083 16/04/11 17:15:19 INFO hive.metastore: Connected to metastore. 16/04/11 17:15:19 ERROR tool.ImportTool: Imported Failed: Cannot convert unsupported type: timestamp -- This message was sent by Atlassian JIRA (v6.3.4#6332)