[jira] [Created] (HAWQ-1366) HAWQ should throw error if finding dictionary encoding type for Parquet

Lili Ma (JIRA) Tue, 28 Feb 2017 01:43:29 -0800

Lili Ma created HAWQ-1366:
-----------------------------

             Summary: HAWQ should throw error if finding dictionary encoding type for Parquet
                 Key: HAWQ-1366
                 URL: https://issues.apache.org/jira/browse/HAWQ-1366
             Project: Apache HAWQ
          Issue Type: Bug
          Components: Storage
            Reporter: Lili Ma
            Assignee: Ed Espino
             Fix For: 2.2.0.0-incubating



HAWQ is based on Parquet format version 1.0, which does not support 
dictionary pages. Because hawq register can register Parquet format version 2.0 
data into HAWQ, HAWQ should throw an error when it finds an unsupported page 
type for a column.
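
The requested behavior can be sketched as a page-type check that fails loudly. This is an illustration in Python, not HAWQ's reader code: the page type numbering follows the Parquet Thrift definition, while `SUPPORTED_PAGE_TYPES`, `UnsupportedPageError`, and `check_page_type` are hypothetical names. It assumes, as described above, that only plain data pages are readable.

```python
from enum import IntEnum


class PageType(IntEnum):
    """Page types as numbered in the Parquet Thrift definition."""
    DATA_PAGE = 0
    INDEX_PAGE = 1
    DICTIONARY_PAGE = 2
    DATA_PAGE_V2 = 3


# Assumption: a reader limited to the 1.0 layout handles only plain data pages.
SUPPORTED_PAGE_TYPES = frozenset({PageType.DATA_PAGE})


class UnsupportedPageError(Exception):
    """Hypothetical error raised instead of returning wrong column values."""


def check_page_type(column_name, page_type_code):
    """Return the page type if it is readable, otherwise raise an error."""
    try:
        page_type = PageType(page_type_code)
    except ValueError:
        # The code does not map to any known Parquet page type.
        raise UnsupportedPageError(
            f"column {column_name!r}: unknown page type code {page_type_code}"
        ) from None
    if page_type not in SUPPORTED_PAGE_TYPES:
        raise UnsupportedPageError(
            f"column {column_name!r}: unsupported page type {page_type.name}"
        )
    return page_type
```

With a check of this kind, reading a dictionary page for a column would abort the query with a clear message instead of silently returning wrong values.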

Steps to reproduce:
1. In Hive, create a table and insert 5 records:
{code}
hive> create table tt (i int,
    >   fname varchar(100),
    >   title varchar(100),
    >   salary double
    > )
    > STORED AS PARQUET;
OK
Time taken: 0.029 seconds
hive> insert into tt values (5,    'OYLNUQSQIGWDWBKMDQNYUGYXOBDFGW',    'Sales',    80282.54),
    > (7,    'UKIPCBGKHDNEEXQHOFGKKFIZGLFNHE',    'Engineer',    10206.65),
    > (4,    'PTPIRDISZNTWNFRNBPCUKWXYFGSRBQ',    'Director',    63691.23),
    > (9,    'CTDCDYRURBZMBLNWHQNOQCYFFVULOP',    'Engineer',    63867.44),
    > (10,    'WZQGZJEEVDKOKTPRFKLVCBSBIYTEDK',    'Sales',    97720.08);
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = malili_20170228173956_f370414c-ddc8-4e6d-99e9-7c1fa1f678d1
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2017-02-28 17:39:58,713 Stage-1 map = 100%,  reduce = 0%
Ended Job = job_local2046305831_0004
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://127.0.0.1:8020/user/hive/warehouse/tt/.hive-staging_hive_2017-02-28_17-39-56_806_3518057455919651199-1/-ext-10000
Loading data to table default.tt
MapReduce Jobs Launched:
Stage-Stage-1:  HDFS Read: 3945 HDFS Write: 4226 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 1.975 seconds
hive> select * from tt;
OK
5       OYLNUQSQIGWDWBKMDQNYUGYXOBDFGW  Sales   80282.54
7       UKIPCBGKHDNEEXQHOFGKKFIZGLFNHE  Engineer        10206.65
4       PTPIRDISZNTWNFRNBPCUKWXYFGSRBQ  Director        63691.23
9       CTDCDYRURBZMBLNWHQNOQCYFFVULOP  Engineer        63867.44
10      WZQGZJEEVDKOKTPRFKLVCBSBIYTEDK  Sales   97720.08
Time taken: 0.056 seconds, Fetched: 5 row(s)
{code}
2. Create the table in HAWQ:
{code}
CREATE TABLE public.tt
(i int,
  fname varchar(100),
  title varchar(100),
  salary float8)
WITH (appendonly=true,orientation=parquet);
{code}
3. Run hawq register:
{code}
malilis-MacBook-Pro:Hawq_register malili$ hawq register -d postgres -f hdfs://localhost:8020/user/hive/warehouse/tt tt
20170228:17:40:25:090499 hawqregister:malilis-MacBook-Pro:malili-[INFO]:-try to connect database localhost:5432 postgres
20170228:17:40:33:090499 hawqregister:malilis-MacBook-Pro:malili-[INFO]:-New file(s) to be registered: ['hdfs://localhost:8020/user/hive/warehouse/tt/000000_0']
hdfscmd: "hadoop fs -mv hdfs://localhost:8020/user/hive/warehouse/tt/000000_0 hdfs://localhost:8020/hawq_default/16385/16387/49281/1"
20170228:17:40:41:090499 hawqregister:malilis-MacBook-Pro:malili-[INFO]:-Hawq Register Succeed.
{code}
4. Select from HAWQ. The title column comes back empty and no error is raised:
{code}
postgres=# select * from tt;
 i  |             fname              | title |  salary
----+--------------------------------+-------+----------
  5 | OYLNUQSQIGWDWBKMDQNYUGYXOBDFGW |       | 80282.54
  7 | UKIPCBGKHDNEEXQHOFGKKFIZGLFNHE |       | 10206.65
  4 | PTPIRDISZNTWNFRNBPCUKWXYFGSRBQ |       | 63691.23
  9 | CTDCDYRURBZMBLNWHQNOQCYFFVULOP |       | 63867.44
 10 | WZQGZJEEVDKOKTPRFKLVCBSBIYTEDK |       | 97720.08
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
