Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16893 )

Change subject: IMPALA-6434: Add support to decode RLE_DICTIONARY encoded pages
......................................................................


Patch Set 3:

(6 comments)

A few nits, but the code looks good to me overall.

http://gerrit.cloudera.org:8080/#/c/16893/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16893/3//COMMIT_MSG@10
PS3, Line 10: old PLAIN/PLAIN_DICTIONARY values.
Maybe you could emphasise that the data is still encoded the same way.


http://gerrit.cloudera.org:8080/#/c/16893/3//COMMIT_MSG@10
PS3, Line 10: PLAIN/
AFAIK PLAIN belongs to the new way: we use PLAIN for the dictionary page and 
RLE_DICTIONARY for the data pages.

The old way was to use PLAIN_DICTIONARY everywhere, which meant PLAIN 
encoding for the dictionary page and RLE-encoded dict keys for the data pages.
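
For illustration only, the two conventions described above could be sketched as follows. This is not the code in the patch: the enum is a local stand-in whose numeric values follow parquet.thrift, and PageKind, DictColumnEncoding() and the write_new_encodings parameter are hypothetical names.

```cpp
#include <cassert>

// Local stand-in for the Parquet Thrift Encoding enum (values as in
// parquet.thrift). Not the generated Impala type.
enum class Encoding { PLAIN = 0, PLAIN_DICTIONARY = 2, RLE_DICTIONARY = 8 };

// Hypothetical: which kind of page a header is being written for.
enum class PageKind { DICTIONARY, DATA };

// Returns the encoding value to put in the page header of a
// dictionary-encoded column. Only the label changes between the two
// conventions; the bytes of the pages are encoded the same way.
Encoding DictColumnEncoding(PageKind page, bool write_new_encodings) {
  // Old way: PLAIN_DICTIONARY on both the dictionary page and the data pages.
  if (!write_new_encodings) return Encoding::PLAIN_DICTIONARY;
  // New way: PLAIN on the dictionary page, RLE_DICTIONARY on the data pages.
  return page == PageKind::DICTIONARY ? Encoding::PLAIN
                                      : Encoding::RLE_DICTIONARY;
}

// A reader has to accept both labels for dictionary-encoded data pages.
bool IsDictionaryEncodedDataPage(Encoding e) {
  return e == Encoding::PLAIN_DICTIONARY || e == Encoding::RLE_DICTIONARY;
}
```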


http://gerrit.cloudera.org:8080/#/c/16893/3/be/src/exec/parquet/hdfs-parquet-table-writer.cc
File be/src/exec/parquet/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/16893/3/be/src/exec/parquet/hdfs-parquet-table-writer.cc@92
PS3, Line 92: use
nit: maybe write_new_parquet_dictionary_encodings to be more explicit?


http://gerrit.cloudera.org:8080/#/c/16893/3/be/src/exec/parquet/hdfs-parquet-table-writer.cc@881
PS3, Line 881: current_encoding_
I wonder if the code would be cleaner/less error-prone if 'current_encoding_' 
stored the actual encoding. So probably we could move this 'if' to the place 
where we set 'current_encoding_'.
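
A rough sketch of what I mean, under assumptions: ColumnWriterSketch and its methods are hypothetical and do not mirror the real column writer, and the enum is a local stand-in for the Thrift type. The point is only that the old/new choice is made once, where 'current_encoding_' is assigned, so the page-header code can use the member directly.

```cpp
#include <cassert>

// Local stand-in for the Parquet Thrift Encoding enum.
enum class Encoding { PLAIN = 0, PLAIN_DICTIONARY = 2, RLE_DICTIONARY = 8 };

// Hypothetical, heavily simplified writer; only the encoding bookkeeping
// is shown.
class ColumnWriterSketch {
 public:
  explicit ColumnWriterSketch(bool write_new_dict_encodings)
    : write_new_dict_encodings_(write_new_dict_encodings) {}

  // The only place that decides between the old and the new label.
  void StartDictionaryEncoding() {
    current_encoding_ = write_new_dict_encodings_ ? Encoding::RLE_DICTIONARY
                                                  : Encoding::PLAIN_DICTIONARY;
  }

  // E.g. when the dictionary grows too large and the writer gives up on it.
  void FallBackToPlain() { current_encoding_ = Encoding::PLAIN; }

  // Header code needs no extra 'if': the member already holds the value
  // that ends up in the file.
  Encoding DataPageHeaderEncoding() const { return current_encoding_; }

 private:
  const bool write_new_dict_encodings_;
  Encoding current_encoding_ = Encoding::PLAIN;
};
```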


http://gerrit.cloudera.org:8080/#/c/16893/3/be/src/exec/parquet/parquet-column-readers.cc
File be/src/exec/parquet/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/16893/3/be/src/exec/parquet/parquet-column-readers.cc@326
PS3, Line 326: so
nit: to


http://gerrit.cloudera.org:8080/#/c/16893/3/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/16893/3/testdata/data/README@593
PS3, Line 593:
Is the newline intentional?



--
To view, visit http://gerrit.cloudera.org:8080/16893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I90942022edcd5d96c720a1bde53879e50394660a
Gerrit-Change-Number: 16893
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Jan 2021 19:21:47 +0000
Gerrit-HasComments: Yes
