[ https://issues.apache.org/jira/browse/HAWQ-335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Goden Yao resolved HAWQ-335.
----------------------------
       Resolution: Incomplete
    Fix Version/s:     (was: backlog)

Resolved as incomplete; no further information was provided by the user.

> Cannot query parquet hive table through PXF
> -------------------------------------------
>
>                 Key: HAWQ-335
>                 URL: https://issues.apache.org/jira/browse/HAWQ-335
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: PXF
>    Affects Versions: 2.0.0-beta-incubating
>            Reporter: zharui
>            Assignee: Goden Yao
>
> I created an external table in HAWQ for a table that exists in Hive in
> Parquet format, but I cannot query this table in HAWQ. The segment
> processes stay idle and nothing happens.
> The clause used to create the external Hive Parquet table is shown below:
> {code}
> create external table zc_parquet800_partitioned
> (
>     start_time bigint,
>     cdr_id int,
>     "offset" int,
>     calling varchar(255),
>     imsi varchar(255),
>     user_ip int,
>     tmsi int,
>     p_tmsi int,
>     imei varchar(255),
>     mcc int,
>     mnc int,
>     lac int,
>     rac int,
>     cell_id int,
>     bsc_ip int,
>     opc int,
>     dpc int,
>     sgsn_sg_ip int,
>     ggsn_sg_ip int,
>     sgsn_data_ip int,
>     ggsn_data_ip int,
>     apn varchar(255),
>     rat int,
>     service_type smallint,
>     service_group smallint,
>     up_packets int,
>     down_packets int,
>     up_bytes int,
>     down_bytes int,
>     up_speed real,
>     down_speed real,
>     trans_time int,
>     first_time timestamp,
>     end_time timestamp,
>     is_end int,
>     user_port int,
>     proto_type int,
>     dest_ip int,
>     dest_port int,
>     paging_count smallint,
>     assignment_count smallint,
>     joiner_id varchar(255),
>     operation smallint,
>     country smallint,
>     loc_prov smallint,
>     loc_city smallint,
>     roam_prov smallint,
>     roam_city smallint,
>     sgsn varchar(255),
>     bsc_rnc varchar(255),
>     terminal_fac smallint,
>     terminal_type int,
>     terminal_class smallint,
>     roaming_type smallint,
>     host_operator smallint,
>     net_type smallint,
>     time int,
>     calling_hash int)
> LOCATION ('pxf://ws01.mzhen.cn:51200/zc_parquet800_partitioned?PROFILE=Hive')
> FORMAT 'custom' (formatter='pxfwritable_import');
> {code}
> The catalina logs are shown below:
> {code}
> Jan 13, 2016 11:26:29 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:26:29 AM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 1332450 records.
> Jan 13, 2016 11:26:29 AM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
> Jan 13, 2016 11:26:30 AM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 398 ms. row count = 1332450
> Jan 13, 2016 11:26:58 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:26:58 AM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 1460760 records.
> Jan 13, 2016 11:26:58 AM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
> Jan 13, 2016 11:26:59 AM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 441 ms. row count = 1460760
> Jan 13, 2016 11:27:34 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:27:34 AM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 1396605 records.
> Jan 13, 2016 11:27:34 AM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
> Jan 13, 2016 11:27:34 AM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 367 ms. row count = 1396605
> Jan 13, 2016 11:28:06 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:28:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 1337385 records.
> Jan 13, 2016 11:28:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
> Jan 13, 2016 11:28:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 348 ms. row count = 1337385
> Jan 13, 2016 11:28:32 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:28:32 AM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 1322580 records.
> Jan 13, 2016 11:28:32 AM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
> Jan 13, 2016 11:28:33 AM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 459 ms. row count = 1322580
> Jan 13, 2016 11:28:59 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:28:59 AM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 1431150 records.
> Jan 13, 2016 11:28:59 AM INFO: parquet.hadoop.InternalParquetRecordReader
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)