Jakub Kukul has posted comments on this change.

Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
......................................................................

Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6550/3/docs/topics/impala_parquet.xml
File docs/topics/impala_parquet.xml:

PS3, Line 1154: ENUM
> Are enums logical types?

Yes, enums are logical types. The documentation for it has been missing, but I recently opened a PR to fix this: https://github.com/apache/parquet-format/pull/54


http://gerrit.cloudera.org:8080/#/c/6550/3/fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java:

Line 274: // UTF8 is the type annotation Parquet uses for strings
> This comment needs to be updated. It would be good if you put a link to a s

Done


http://gerrit.cloudera.org:8080/#/c/6550/3/testdata/data/schemas/logicaltypes.parquet
File testdata/data/schemas/logicaltypes.parquet:

> How did you generate this file? Was it with Hive?

I generated this file from a protobuf file, using https://github.com/Parquet/parquet-mr/blob/master/parquet-protobuf/src/main/java/parquet/proto/ProtoParquetWriter.java.


http://gerrit.cloudera.org:8080/#/c/6550/2/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test
File testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test:

Line 57: create table $DATABASE.like_logicaltypes_file like parquet
> Please indicate here what the logic types in question are.

Done


Line 66: ---- TYPES
> You'll also want to see that SELECT works, I think.

This file only contains queries that are testing table creation. Such a test probably doesn't belong here. Also, I am not sure if such a test is within the scope of this ticket.

We just want to make sure that Parquet columns annotated with the ENUM logical type, e.g.:

```
optional binary string_col (ENUM);
```

end up as string columns in the Impala table definition, just as is already the case for un-annotated Parquet columns, e.g.:

```
optional binary string_col;
```

When an Impala table is created, these columns become regular string columns, and I think there are already several tests for querying string columns.


-- 
To view, visit http://gerrit.cloudera.org:8080/6550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jakub Kukul <jakub.ku...@gmail.com>
Gerrit-Reviewer: Jakub Kukul <jakub.ku...@gmail.com>
Gerrit-Reviewer: Jim Apple <jbapple-imp...@apache.org>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-HasComments: Yes
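
For readers following the thread, the mapping described in the last comment (a binary column becomes an Impala STRING whether it is un-annotated, UTF8-annotated, or ENUM-annotated) could be sketched roughly as below. This is only an illustration under stated assumptions: the class `EnumMappingSketch`, its nested enums, and the method `toImpalaType` are hypothetical names invented here, not the actual code in CreateTableLikeFileStmt.java and not the parquet-mr API.

```java
import java.util.Optional;

// Hypothetical sketch only: none of these names come from Impala or parquet-mr.
class EnumMappingSketch {
  // Simplified stand-ins for a Parquet physical type and its logical annotation.
  enum PhysicalType { BINARY }
  enum Annotation { UTF8, ENUM }

  // Returns the Impala column type name for a Parquet column.
  // Only the binary cases discussed in the review are modelled here.
  static String toImpalaType(PhysicalType physical, Optional<Annotation> annotation) {
    if (physical != PhysicalType.BINARY) {
      throw new IllegalArgumentException("Unsupported physical type: " + physical);
    }
    if (!annotation.isPresent()) {
      // optional binary string_col;
      return "STRING";
    }
    switch (annotation.get()) {
      case UTF8: // optional binary string_col (UTF8);
      case ENUM: // optional binary string_col (ENUM);
        return "STRING";
      default:
        throw new IllegalArgumentException("Unsupported annotation: " + annotation.get());
    }
  }

  public static void main(String[] args) {
    System.out.println(toImpalaType(PhysicalType.BINARY, Optional.empty()));
    System.out.println(toImpalaType(PhysicalType.BINARY, Optional.of(Annotation.ENUM)));
  }
}
```

The point of the sketch is that the ENUM case falls through to the same branch as UTF8, so the resulting table definition is indistinguishable from one created from a plain string column.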