You would need to add a new codec to the Impala source tree. The codecs are
implemented in be/src/util/codec.h, be/src/util/compress.h and
be/src/util/decompress.h. There are a few other places you may need to
change. I would just run "git grep -i gzip" to see everywhere the gzip codec
is wired in and follow the same pattern.
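
As a rough illustration of the shape of the change, here is a minimal
standalone sketch. It is not Impala's actual API: CodecSketch, IdentityCodec
and CreateCodec are made-up names for this example, and the real interface in
be/src/util/codec.h is richer and has different signatures. The point is only
that a new codec usually means one more subclass plus one more entry wherever
codecs are looked up by name or type.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a codec base class. Impala's real
// Codec class (be/src/util/codec.h) has a different, richer interface.
class CodecSketch {
 public:
  virtual ~CodecSketch() = default;
  // Transforms `input` into `output`; returns false on failure.
  virtual bool ProcessBlock(const std::vector<uint8_t>& input,
                            std::vector<uint8_t>* output) = 0;
  virtual std::string name() const = 0;
};

// Placeholder codec that copies bytes unchanged. A real LZMA codec would
// call into the compression library here instead.
class IdentityCodec : public CodecSketch {
 public:
  bool ProcessBlock(const std::vector<uint8_t>& input,
                    std::vector<uint8_t>* output) override {
    if (output == nullptr) return false;
    *output = input;
    return true;
  }
  std::string name() const override { return "identity"; }
};

// Hypothetical factory: the one place that maps a codec name to a class.
// Adding a codec means adding a branch here. Returns nullptr if unknown.
std::unique_ptr<CodecSketch> CreateCodec(const std::string& name) {
  if (name == "identity") {
    return std::unique_ptr<CodecSketch>(new IdentityCodec());
  }
  // A new codec (for example "lzma") would be registered here.
  return nullptr;
}
```

In the real tree the equivalent of that factory is spread over several
files, which is why grepping for an existing codec is the quickest way to
find every spot that needs a new entry.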

For compressed text files you would also need to add support to the
frontend, e.g. in
fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java

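I'm also not sure whether there are any licensing issues here: parts of XZ
Utils are GPL licensed, although as far as I know the liblzma library itself
is not, so it is worth checking before pulling it in.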

On Sat, Jun 10, 2017 at 5:41 PM, 孙清孟 <sqm2...@gmail.com> wrote:

> I have added an LZMA codec (hadoop-xz) to Parquet (by modifying
> parquet-format and parquet-mr) for Hive, and I get a higher compression
> ratio.
>
> But how do I add a new codec to Impala?
>
