Hello Abhishek Rawat, Sahil Takiar, Bikramjeet Vig, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13857

to look at the new patch set (#6).

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
......................................................................

IMPALA-8549: Add support for scanning DEFLATE text files

Hadoop tools such as Hive and MapReduce support
reading and writing text files compressed using
the DEFLATE algorithm. In Hadoop, the zlib library
(an implementation of the DEFLATE algorithm) is used
to compress text files into .DEFLATE files. These
files are not in the raw deflate format but in the
zlib format: the zlib library supports three flavors
of deflate, and Hadoop uses the flavor that wraps
the deflate stream in a zlib header and trailer
rather than emitting raw deflate.

This patch adds support to Impala for scanning
.DEFLATE files of tables stored as text. Note that
although these files have a compression type of
DEFLATE in Impala, they should be treated as if their
compression type is ZLIB.
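
The distinction between the three flavors can be illustrated
with a small sketch using Python's standard zlib module. This
is not code from the patch; the sample data and the helper
looks_like_zlib are hypothetical and only show why a reader
must pick the zlib framing, not raw deflate, for such files.

```python
import zlib

# Hypothetical sample rows standing in for a text table file.
data = b"1,alice\n2,bob\n" * 100

# zlib format (RFC 1950): 2-byte header, deflate stream, Adler-32 trailer.
zlib_bytes = zlib.compress(data)

# Raw deflate (RFC 1951): negative wbits omits header and trailer.
raw = zlib.compressobj(wbits=-15)
raw_bytes = raw.compress(data) + raw.flush()

# gzip format (RFC 1952): wbits of 16 + 15 selects gzip framing.
gz = zlib.compressobj(wbits=31)
gzip_bytes = gz.compress(data) + gz.flush()


def looks_like_zlib(buf):
    """Check the RFC 1950 header: method nibble 8, 16-bit header divisible by 31."""
    if len(buf) < 2:
        return False
    return (buf[0] & 0x0F) == 8 and ((buf[0] << 8) | buf[1]) % 31 == 0


# Each flavor must be decoded with the matching wbits value.
assert zlib.decompress(zlib_bytes, 15) == data
assert zlib.decompress(raw_bytes, -15) == data
assert zlib.decompress(gzip_bytes, 31) == data
```

Decoding the zlib-wrapped bytes as raw deflate (or the reverse)
fails, which is why the framing matters when scanning.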

Testing:
There is a pre-existing unit test that validates
compressing/decompressing data with compression type
DEFLATE. Also modified the existing end-to-end tests
that simulate querying files of various formats and
compression types. All core tests pass.

Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
---
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/functional-query_exhaustive.csv
M tests/query_test/test_compressed_formats.py
5 files changed, 19 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13857/6
--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 6
Gerrit-Owner: Ethan Xue <ethan....@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Ethan Xue <ethan....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
