Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13396 )

Change subject: IMPALA-8450: Add support for zstd and lz4 in parquet
......................................................................


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/13396/6/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/13396/6/be/src/service/query-options.cc@761
PS6, Line 761: clevel, ZSTD_maxCLevel()));
nit: could be moved to the previous line


http://gerrit.cloudera.org:8080/#/c/13396/6/be/src/util/codec.h
File be/src/util/codec.h:

http://gerrit.cloudera.org:8080/#/c/13396/6/be/src/util/codec.h@63
PS6, Line 63:    private:
            :     friend class Codec;
            :     friend class HdfsParquetTableWriter;
            :     friend void TestCompression(int, int, int, 
THdfsCompression::type);
I would prefer to make the members public. If you want to keep private members, 
than this should be a "class" instead of "struct".


http://gerrit.cloudera.org:8080/#/c/13396/6/be/src/util/codec.h@68
PS6, Line 68: clevel_
Can you mention in a comment that currently only ZSTD uses this?
+ Using a longer name like compression_level_ would make the purpose clearer.


http://gerrit.cloudera.org:8080/#/c/13396/6/cmake_modules/FindZstd.cmake
File cmake_modules/FindZstd.cmake:

http://gerrit.cloudera.org:8080/#/c/13396/6/cmake_modules/FindZstd.cmake@36
PS6, Line 36:
nit: whitespace consistency


http://gerrit.cloudera.org:8080/#/c/13396/6/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/13396/6/tests/query_test/test_insert_parquet.py@144
PS6, Line 144: test_insert_parquet_multi_codecs
I would prefer to move most of the logic to a .test file.
An example .test file that creates tables with different query options:
https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/parquet-int64-timestamps.test


http://gerrit.cloudera.org:8080/#/c/13396/6/tests/query_test/test_insert_parquet.py@159
PS6, Line 159:     self.execute_query("set COMPRESSION_CODEC=ZSTD")
             :     insert_stmt = """insert into {0} values
             :         (6,false,8.99815,'2004-01-26 
05:00:31.759999','2004-01-26','Winter is coming'
             :          ,'~Dracarys!!~'),
             :         (7,false,9.99999,'1999-09-09 
09:00:59.999999','1999-09-09','I am Iron Man'
             :          ,'I am inevitable - <>#$%@*(#$^%*@$#'),
             :         (8,NULL,NULL,NULL,NULL,NULL,NULL)
The test could be probably shorter by CTAS-ing/inserting from alltypestiny, e.g 
"insert into {0} select * from functional.alltypestiny where id = 1 or id = 2", 
and compering the two tables at the end.  Note that some types (DATE) + NULLs 
are missing from alltypestiny.



--
To view, visit http://gerrit.cloudera.org:8080/13396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I98c6dcf3d0a873380e4fa4cf03eb7e924e4ee768
Gerrit-Change-Number: 13396
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 May 2019 15:49:10 +0000
Gerrit-HasComments: Yes

Reply via email to