Ethan Xue has posted comments on this change. ( http://gerrit.cloudera.org:8080/13857 )

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13857/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13857/2//COMMIT_MSG@11
PS2, Line 11: In Hadoop, the zlib library
            : (an implementation of the DEFLATE algorithm) is used
            : to compress text files into .DEFLATE files,
            : which are not in the raw deflate format but rather
            : the zlib format (has a zlib header and footer).
> would be good to mention that the zlib library supports three flavors of de
Done


http://gerrit.cloudera.org:8080/#/c/13857/2/be/src/util/codec.cc
File be/src/util/codec.cc:

http://gerrit.cloudera.org:8080/#/c/13857/2/be/src/util/codec.cc@150
PS2, Line 150:     case THdfsCompression::DEFLATE:
             :     case THdfsCompression::GZIP:
             :       decompressor->reset(new GzipDecompressor(mem_pool, reuse, false));
> so the decompressor doesn't need to differentiate between ZLIB, GZIP, and D
Yes. The compressor differentiates between GZIP and ZLIB/DEFLATE because it
needs to know which window bits value to use. The decompressor, on the other
hand, can implicitly detect both the GZIP and ZLIB/DEFLATE formats:
https://github.com/apache/impala/blob/2813d0c18414a5b7977cc713755daed7e53358ce/be/src/util/decompress.cc#L53-L58.
Note again, since it can be confusing, that THdfsCompression::DEFLATE and
THdfsCompression::ZLIB are equivalent.
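
To illustrate the window bits point, here is a minimal sketch using Python's
zlib module rather than Impala's C++ code. It assumes the standard zlib
convention: 15 selects the zlib wrapper, adding 16 selects the gzip wrapper,
a negative value selects raw deflate, and adding 32 on the inflate side asks
zlib to auto-detect a zlib or gzip header. The helper names (compress,
decompress_auto) and the payload are hypothetical and not taken from the patch.

```python
import zlib


def compress(data: bytes, wbits: int) -> bytes:
    """Compress data; wbits selects the container format around the deflate stream."""
    c = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, wbits)
    return c.compress(data) + c.flush()


# Hypothetical sample payload, for illustration only.
payload = b"impala deflate text scan\n" * 4

# The compressor must be told which format to write.
zlib_stream = compress(payload, zlib.MAX_WBITS)        # 15: zlib header and trailer
gzip_stream = compress(payload, zlib.MAX_WBITS | 16)   # 31: gzip header and trailer
raw_stream = compress(payload, -zlib.MAX_WBITS)        # -15: raw deflate, no wrapper

# The decompressor can use one setting for both wrapped formats.
AUTO_DETECT = zlib.MAX_WBITS | 32  # 47: detect zlib or gzip from the header


def decompress_auto(data: bytes) -> bytes:
    """Decompress a zlib- or gzip-wrapped stream without knowing which one it is."""
    return zlib.decompress(data, wbits=AUTO_DETECT)


for stream in (zlib_stream, gzip_stream):
    assert decompress_auto(stream) == payload
```

Header auto-detection covers only the two wrapped formats. A raw deflate stream
carries no header to detect, so it has to be inflated with an explicit negative
window bits value.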


http://gerrit.cloudera.org:8080/#/c/13857/1/tests/query_test/test_compressed_formats.py
File tests/query_test/test_compressed_formats.py:

http://gerrit.cloudera.org:8080/#/c/13857/1/tests/query_test/test_compressed_formats.py@71
PS1, Line 71:   def test_compressed_formats(self, vector):
> this test got skipped in your test run: https://jenkins.impala.io/job/ubunt
My guess is that it was skipped because of the code on line 65: if the
exploration strategy is 'core', this test is skipped. I will rebuild this patch
on Jenkins with the exploration strategy set to 'exhaustive'.



--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 2
Gerrit-Owner: Ethan Xue <ethan....@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Ethan Xue <ethan....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Comment-Date: Tue, 23 Jul 2019 20:58:21 +0000
Gerrit-HasComments: Yes
