Juan Yu has posted comments on this change.

Change subject: IMPALA-3038: Add multistream gzip/bzip2 test coverage
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/2543/7/be/src/util/decompress-test.cc
File be/src/util/decompress-test.cc:

Line 255:     // Repeatedly pick random-size input data(~1MB), compress it, 
then concatenate
> What does the ~1MB mean? I think this is why I got confused about L270 earl
I am trying to simulate pbzip2, which splits a large input into smaller chunks, 
compresses them in parallel, and then concatenates the results.
I take raw_input (which is 1M), shorten it by a random amount to get a variable 
length, then compress it. Repeating this produces multiple streams:
int len = RAW_INPUT_SIZE - (rand() % 1024);
compressor->ProcessBlock(false, len, raw_input, &compressed_length, &compressed_stream);

The total compressed output will be no more than 16M (this is to make sure 
it's larger than the 8M IO buffer). For the raw input I generated, the 
compression ratio is about 2:1, so I limit the total uncompressed input to no 
more than 32M.
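
To make the size budgeting concrete, here is a rough, self-contained sketch of the loop described above. It is not the code in decompress-test.cc: FakeCompress is a hypothetical stand-in that only mimics a 2:1 ratio (it is not gzip or bzip2), and MultiStream, BuildMultiStream, MAX_COMPRESSED_TOTAL and MAX_UNCOMPRESSED_TOTAL are names made up for illustration. Only RAW_INPUT_SIZE, the rand() % 1024 shortening, and the 16M/32M limits come from the reply.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Sizes taken from the reply; the two cap names are hypothetical.
constexpr size_t RAW_INPUT_SIZE = 1024 * 1024;               // 1M raw buffer
constexpr size_t MAX_COMPRESSED_TOTAL = 16 * 1024 * 1024;    // 16M compressed cap
constexpr size_t MAX_UNCOMPRESSED_TOTAL = 32 * 1024 * 1024;  // 32M cap (~2:1 ratio)

// Hypothetical result type: the concatenated streams plus bookkeeping.
struct MultiStream {
  std::vector<uint8_t> compressed;  // all streams, back to back
  std::vector<size_t> stream_lens;  // uncompressed length of each stream
  size_t uncompressed_total = 0;
};

// Stand-in for a real compressor: keeps every other byte to mimic a 2:1 ratio.
// A real test would call the codec under test here instead.
std::vector<uint8_t> FakeCompress(const uint8_t* data, size_t len) {
  std::vector<uint8_t> out;
  out.reserve((len + 1) / 2);
  for (size_t i = 0; i < len; i += 2) out.push_back(data[i]);
  return out;
}

// Repeatedly takes a slightly shortened prefix of raw_input, "compresses" it,
// and appends the result until adding one more stream would exceed a cap.
MultiStream BuildMultiStream(const std::vector<uint8_t>& raw_input, unsigned seed) {
  assert(raw_input.size() >= RAW_INPUT_SIZE);
  std::srand(seed);
  MultiStream result;
  while (true) {
    // Variable length: up to 1023 bytes shorter than the full raw buffer.
    size_t len = RAW_INPUT_SIZE - (std::rand() % 1024);
    std::vector<uint8_t> stream = FakeCompress(raw_input.data(), len);
    if (result.compressed.size() + stream.size() > MAX_COMPRESSED_TOTAL) break;
    if (result.uncompressed_total + len > MAX_UNCOMPRESSED_TOTAL) break;
    result.compressed.insert(result.compressed.end(), stream.begin(), stream.end());
    result.stream_lens.push_back(len);
    result.uncompressed_total += len;
  }
  return result;
}
```

With these limits the loop stops after roughly 31 or 32 streams, so the concatenated output lands just under 16M and comfortably above the 8M IO buffer.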


Line 266:     EXPECT_OK(Codec::CreateCompressor(&mem_pool_, true, format, 
&compressor));
> Move created compressor above comment
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/2543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b0e1971145dd457e71fc9c00ce7c06fff8dea88
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Juan Yu <j...@cloudera.com>
Gerrit-Reviewer: Juan Yu <j...@cloudera.com>
Gerrit-Reviewer: Skye Wanderman-Milne <s...@cloudera.com>
Gerrit-HasComments: Yes
