[ https://issues.apache.org/jira/browse/HAWQ-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097046#comment-15097046 ]

Goden Yao edited comment on HAWQ-335 at 1/14/16 7:41 PM:
---------------------------------------------------------

What's the error message you see on the HAWQ side, and what does the PXF error 
log say?
Also, can you post the table definition you have in Hive, stored as Parquet?
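
For reference, the Hive-side definition should look roughly like the sketch 
below (column names mirrored from the HAWQ DDL above; the partition column is 
an assumption, since it is not given in the report). Running SHOW CREATE TABLE 
in Hive prints the exact DDL:
{code}
-- Hypothetical sketch only; get the real DDL with:
--   SHOW CREATE TABLE zc_parquet800_partitioned;
CREATE TABLE zc_parquet800_partitioned (
  start_time BIGINT,
  cdr_id INT,
  `offset` INT,   -- reserved word, hence backquoted in HiveQL
  calling STRING
  -- remaining columns elided; they should mirror the HAWQ definition
)
PARTITIONED BY (dt STRING)  -- hypothetical partition column; the real one is not shown
STORED AS PARQUET;
{code}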


was (Author: godenyao):
I noticed your definition has "offset" in quotes. Is that a typo?
Also, can you post the table definition you have in Hive, stored as Parquet?

> Cannot query parquet hive table through PXF
> -------------------------------------------
>
>                 Key: HAWQ-335
>                 URL: https://issues.apache.org/jira/browse/HAWQ-335
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: PXF
>    Affects Versions: 2.0.0-beta-incubating
>            Reporter: zharui
>            Assignee: Goden Yao
>
> I created an external table in HAWQ for a table that exists in Hive in 
> Parquet format, but I cannot query it from HAWQ. The segment processes stay 
> idle and nothing happens.
> The clause creating the external Hive Parquet table is as follows:
> {code}
> create external table zc_parquet800_partitioned 
> (
> start_time bigint,
> cdr_id int,
> "offset" int,
> calling varchar(255),
> imsi varchar(255),
> user_ip int,
> tmsi int,
> p_tmsi int,
> imei varchar(255),
> mcc int,
> mnc int,
> lac int,
> rac int,
> cell_id int,
> bsc_ip int,
> opc int,
> dpc int,
> sgsn_sg_ip int,
> ggsn_sg_ip int,
> sgsn_data_ip int,
> ggsn_data_ip int,
> apn varchar(255),
> rat int,
> service_type smallint,
> service_group smallint,
> up_packets int,
> down_packets int,
> up_bytes int,
> down_bytes int,
> up_speed real,
> down_speed real,
> trans_time int,
> first_time timestamp,
> end_time timestamp,
> is_end int,
> user_port int,
> proto_type int,
> dest_ip int,
> dest_port int,
> paging_count smallint,
> assignment_count smallint,
> joiner_id varchar(255),
> operation smallint,
> country smallint,
> loc_prov smallint,
> loc_city smallint,
> roam_prov smallint,
> roam_city smallint,
> sgsn varchar(255),
> bsc_rnc varchar(255),
> terminal_fac smallint,
> terminal_type int,
> terminal_class smallint,
> roaming_type smallint,
> host_operator smallint,
> net_type smallint, 
> time int, 
> calling_hash int) 
> LOCATION ('pxf://ws01.mzhen.cn:51200/zc_parquet800_partitioned?PROFILE=Hive') 
> FORMAT 'custom' (formatter='pxfwritable_import');
> {code}
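> For example, even a simple query such as the one below never returns 
> (illustrative statement; the exact query is not shown in this report):
> {code}
> -- illustrative probe; any SELECT against the table shows the same hang
> select count(*) from zc_parquet800_partitioned;
> {code}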
> The Catalina logs are as follows:
> {code}
> Jan 13, 2016 11:26:29 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not 
> initialize counter due to context is not a instance of 
> TaskInputOutputContext, but is 
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:26:29 AM INFO: parquet.hadoop.InternalParquetRecordReader: 
> RecordReader initialized will read a total of 1332450 records.
> Jan 13, 2016 11:26:29 AM INFO: parquet.hadoop.InternalParquetRecordReader: at 
> row 0. reading next block
> Jan 13, 2016 11:26:30 AM INFO: parquet.hadoop.InternalParquetRecordReader: 
> block read in memory in 398 ms. row count = 1332450
> Jan 13, 2016 11:26:58 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not 
> initialize counter due to context is not a instance of 
> TaskInputOutputContext, but is 
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:26:58 AM INFO: parquet.hadoop.InternalParquetRecordReader: 
> RecordReader initialized will read a total of 1460760 records.
> Jan 13, 2016 11:26:58 AM INFO: parquet.hadoop.InternalParquetRecordReader: at 
> row 0. reading next block
> Jan 13, 2016 11:26:59 AM INFO: parquet.hadoop.InternalParquetRecordReader: 
> block read in memory in 441 ms. row count = 1460760
> Jan 13, 2016 11:27:34 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not 
> initialize counter due to context is not a instance of 
> TaskInputOutputContext, but is 
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:27:34 AM INFO: parquet.hadoop.InternalParquetRecordReader: 
> RecordReader initialized will read a total of 1396605 records.
> Jan 13, 2016 11:27:34 AM INFO: parquet.hadoop.InternalParquetRecordReader: at 
> row 0. reading next block
> Jan 13, 2016 11:27:34 AM INFO: parquet.hadoop.InternalParquetRecordReader: 
> block read in memory in 367 ms. row count = 1396605
> Jan 13, 2016 11:28:06 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not 
> initialize counter due to context is not a instance of 
> TaskInputOutputContext, but is 
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:28:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: 
> RecordReader initialized will read a total of 1337385 records.
> Jan 13, 2016 11:28:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: at 
> row 0. reading next block
> Jan 13, 2016 11:28:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: 
> block read in memory in 348 ms. row count = 1337385
> Jan 13, 2016 11:28:32 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not 
> initialize counter due to context is not a instance of 
> TaskInputOutputContext, but is 
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:28:32 AM INFO: parquet.hadoop.InternalParquetRecordReader: 
> RecordReader initialized will read a total of 1322580 records.
> Jan 13, 2016 11:28:32 AM INFO: parquet.hadoop.InternalParquetRecordReader: at 
> row 0. reading next block
> Jan 13, 2016 11:28:33 AM INFO: parquet.hadoop.InternalParquetRecordReader: 
> block read in memory in 459 ms. row count = 1322580
> Jan 13, 2016 11:28:59 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not 
> initialize counter due to context is not a instance of 
> TaskInputOutputContext, but is 
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:28:59 AM INFO: parquet.hadoop.InternalParquetRecordReader: 
> RecordReader initialized will read a total of 1431150 records.
> Jan 13, 2016 11:28:59 AM INFO: parquet.hadoop.InternalParquetRecordReader
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
