[ 
https://issues.apache.org/jira/browse/HAWQ-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15887672#comment-15887672
 ] 

Lili Ma commented on HAWQ-1366:
-------------------------------

Hive stores the title column with dictionary encoding.  Since HAWQ doesn't 
support this, the output is a little weird.

In the short term, HAWQ should throw an error for this case. In the long 
term, HAWQ should support reading and writing Parquet 2.0 data.
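
As a rough illustration of the short-term check (not HAWQ's actual code), 
the sketch below rejects a file when any column chunk lists a dictionary 
encoding. The encoding names PLAIN_DICTIONARY and RLE_DICTIONARY come from 
the Parquet format definition; the function names, the exception class, the 
(column_name, encodings) input shape, and the sample encodings for tt are 
all assumptions made for the example.

```python
# Encoding names as defined by the Encoding enum in the Parquet format.
DICTIONARY_ENCODINGS = {"PLAIN_DICTIONARY", "RLE_DICTIONARY"}


class UnsupportedParquetEncoding(Exception):
    """Hypothetical error type for encodings the reader cannot decode."""


def find_dictionary_columns(columns):
    """Return the sorted names of columns that list a dictionary encoding.

    `columns` is a simplified, hypothetical stand-in for Parquet
    column-chunk metadata: an iterable of (column_name, encodings) pairs.
    """
    return sorted({name for name, encodings in columns
                   if DICTIONARY_ENCODINGS & set(encodings)})


def check_column_encodings(columns):
    """Raise instead of silently returning empty values for such columns."""
    offending = find_dictionary_columns(columns)
    if offending:
        raise UnsupportedParquetEncoding(
            "dictionary encoding is not supported for column(s): "
            + ", ".join(offending))


# Illustrative metadata for table tt. These encodings are assumed for the
# example; they were not read from the Hive-written file.
tt_columns = [
    ("i", ["PLAIN", "RLE"]),
    ("fname", ["PLAIN", "RLE"]),
    ("title", ["PLAIN_DICTIONARY", "RLE"]),
    ("salary", ["PLAIN", "RLE"]),
]

try:
    check_column_encodings(tt_columns)
except UnsupportedParquetEncoding as err:
    print(f"ERROR: {err}")
```

A check like this could run either at register time or when a column chunk 
is first opened for a scan; which place fits HAWQ better is left open here.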


> HAWQ should throw error if finding dictionary encoding type for Parquet
> -----------------------------------------------------------------------
>
>                 Key: HAWQ-1366
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1366
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Storage
>            Reporter: Lili Ma
>            Assignee: Ed Espino
>             Fix For: 2.2.0.0-incubating
>
>
> HAWQ is based on Parquet format version 1.0, which does not support 
> dictionary pages, but hawq register may register Parquet format version 2.0 
> data into HAWQ. HAWQ should therefore throw an error when it finds an 
> unsupported page type for a column.
> Steps to reproduce:
> 1. In Hive, create a table and insert 5 records:
> {code}
> hive> create table tt (i int,
>     >   fname varchar(100),
>     >   title varchar(100),
>     >   salary double
>     > )
>     > STORED AS PARQUET;
> OK
> Time taken: 0.029 seconds
> hive> insert into tt values (5,    'OYLNUQSQIGWDWBKMDQNYUGYXOBDFGW',    
> 'Sales',    80282.54),
>     > (7,    'UKIPCBGKHDNEEXQHOFGKKFIZGLFNHE',    'Engineer',    10206.65),
>     > (4,    'PTPIRDISZNTWNFRNBPCUKWXYFGSRBQ',    'Director',    63691.23),
>     > (9,    'CTDCDYRURBZMBLNWHQNOQCYFFVULOP',    'Engineer',    63867.44),
>     > (10,    'WZQGZJEEVDKOKTPRFKLVCBSBIYTEDK',    'Sales',    97720.08);
> WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the 
> future versions. Consider using a different execution engine (i.e. spark, 
> tez) or using Hive 1.X releases.
> Query ID = malili_20170228173956_f370414c-ddc8-4e6d-99e9-7c1fa1f678d1
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Job running in-process (local Hadoop)
> 2017-02-28 17:39:58,713 Stage-1 map = 100%,  reduce = 0%
> Ended Job = job_local2046305831_0004
> Stage-4 is selected by condition resolver.
> Stage-3 is filtered out by condition resolver.
> Stage-5 is filtered out by condition resolver.
> Moving data to directory 
> hdfs://127.0.0.1:8020/user/hive/warehouse/tt/.hive-staging_hive_2017-02-28_17-39-56_806_3518057455919651199-1/-ext-10000
> Loading data to table default.tt
> MapReduce Jobs Launched:
> Stage-Stage-1:  HDFS Read: 3945 HDFS Write: 4226 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> OK
> Time taken: 1.975 seconds
> hive> select * from tt;
> OK
> 5     OYLNUQSQIGWDWBKMDQNYUGYXOBDFGW  Sales   80282.54
> 7     UKIPCBGKHDNEEXQHOFGKKFIZGLFNHE  Engineer        10206.65
> 4     PTPIRDISZNTWNFRNBPCUKWXYFGSRBQ  Director        63691.23
> 9     CTDCDYRURBZMBLNWHQNOQCYFFVULOP  Engineer        63867.44
> 10    WZQGZJEEVDKOKTPRFKLVCBSBIYTEDK  Sales   97720.08
> Time taken: 0.056 seconds, Fetched: 5 row(s)
> {code}
> 2. Create table in HAWQ
> {code}
> CREATE TABLE public.tt
> (i int,
>   fname varchar(100),
>   title varchar(100),
>   salary float8)
> WITH (appendonly=true,orientation=parquet);
> {code}
> 3. Run hawq register
> {code}
> malilis-MacBook-Pro:Hawq_register malili$ hawq register -d postgres -f 
> hdfs://localhost:8020/user/hive/warehouse/tt tt
> 20170228:17:40:25:090499 hawqregister:malilis-MacBook-Pro:malili-[INFO]:-try 
> to connect database localhost:5432 postgres
> 20170228:17:40:33:090499 hawqregister:malilis-MacBook-Pro:malili-[INFO]:-New 
> file(s) to be registered: 
> ['hdfs://localhost:8020/user/hive/warehouse/tt/000000_0']
> hdfscmd: "hadoop fs -mv hdfs://localhost:8020/user/hive/warehouse/tt/000000_0 
> hdfs://localhost:8020/hawq_default/16385/16387/49281/1"
> 20170228:17:40:41:090499 hawqregister:malilis-MacBook-Pro:malili-[INFO]:-Hawq 
> Register Succeed.
> {code}
> 4. Select from HAWQ (the title column comes back empty)
> {code}
> postgres=# select * from tt;
>  i  |             fname              | title |  salary
> ----+--------------------------------+-------+----------
>   5 | OYLNUQSQIGWDWBKMDQNYUGYXOBDFGW |       | 80282.54
>   7 | UKIPCBGKHDNEEXQHOFGKKFIZGLFNHE |       | 10206.65
>   4 | PTPIRDISZNTWNFRNBPCUKWXYFGSRBQ |       | 63691.23
>   9 | CTDCDYRURBZMBLNWHQNOQCYFFVULOP |       | 63867.44
>  10 | WZQGZJEEVDKOKTPRFKLVCBSBIYTEDK |       | 97720.08
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
