Jakub Kukul has posted comments on this change.

Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating 
impala tables.
......................................................................


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6550/3/docs/topics/impala_parquet.xml
File docs/topics/impala_parquet.xml:

PS3, Line 1154: ENUM
> Are enums logical types?
Yes, enums are logical types. The documentation for them has been missing, but 
I recently opened a PR to add it:
https://github.com/apache/parquet-format/pull/54


http://gerrit.cloudera.org:8080/#/c/6550/3/fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java:

Line 274:       // UTF8 is the type annotation Parquet uses for strings
> This comment needs to be updated. It would be good if you put a link to a s
Done


http://gerrit.cloudera.org:8080/#/c/6550/3/testdata/data/schemas/logicaltypes.parquet
File testdata/data/schemas/logicaltypes.parquet:

> How did you generate this file? Was it with Hive?
I generated this file from a protobuf file, using 
https://github.com/Parquet/parquet-mr/blob/master/parquet-protobuf/src/main/java/parquet/proto/ProtoParquetWriter.java.


http://gerrit.cloudera.org:8080/#/c/6550/2/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test
File testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test:

Line 57: create table $DATABASE.like_logicaltypes_file like parquet
> Please indicate here what the logic types in question are.
Done


Line 66: ---- TYPES
> You'll also want to see that SELECT works, I think.
This file only contains queries that test table creation, so a SELECT test 
probably doesn't belong here.

Also, I am not sure such a test is within the scope of this ticket. We just 
want to make sure that Parquet columns annotated with the ENUM logical type, 
e.g.:
```
optional binary string_col (ENUM);
```
end up as string columns in the Impala table definition, just as un-annotated 
Parquet columns do, e.g.:
```
optional binary string_col;
```

When an Impala table is created, these columns become regular string columns, 
and I believe there are already several tests for querying string columns.
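
To make the intended mapping concrete, here is a minimal Java sketch. It is 
not the code from CreateTableLikeFileStmt.java: the class name, the Annotation 
enum, and the impalaTypeForBinary method are hypothetical stand-ins, and it 
only models the three cases discussed in this review (no annotation, UTF8, 
ENUM) for a Parquet binary column.

```java
/** Hypothetical sketch; not Impala's actual CreateTableLikeFileStmt logic. */
final class EnumAsStringSketch {
  /** Stand-in for the Parquet annotations discussed in this review. */
  enum Annotation { NONE, UTF8, ENUM }

  /**
   * Returns the Impala column type name assumed for a Parquet binary column
   * carrying the given annotation.
   */
  static String impalaTypeForBinary(Annotation annotation) {
    switch (annotation) {
      case NONE:  // optional binary string_col;
      case UTF8:  // optional binary string_col (UTF8);
      case ENUM:  // optional binary string_col (ENUM);
        // All three cases are treated as a regular string column.
        return "STRING";
      default:
        throw new IllegalArgumentException("Unhandled annotation: " + annotation);
    }
  }

  public static void main(String[] args) {
    // Print the assumed mapping for each annotation.
    for (Annotation a : Annotation.values()) {
      System.out.println("binary (" + a + ") -> " + impalaTypeForBinary(a));
    }
  }
}
```

Under that assumption, the ENUM case is indistinguishable from a plain string 
column once the table exists, which is why the existing string query tests 
should already cover it.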


-- 
To view, visit http://gerrit.cloudera.org:8080/6550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jakub Kukul <jakub.ku...@gmail.com>
Gerrit-Reviewer: Jakub Kukul <jakub.ku...@gmail.com>
Gerrit-Reviewer: Jim Apple <jbapple-imp...@apache.org>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-HasComments: Yes
