Alexander Behm created IMPALA-5990:
--------------------------------------

             Summary: End-to-end compression of metadata
                 Key: IMPALA-5990
                 URL: https://issues.apache.org/jira/browse/IMPALA-5990
             Project: IMPALA
          Issue Type: Improvement
          Components: Catalog, Frontend
    Affects Versions: Impala 2.9.0, Impala 2.8.0, Impala 2.10.0
            Reporter: Alexander Behm
            Assignee: Alexander Behm
            Priority: Critical


The metadata of large tables can become quite big, making it costly to hold in 
the statestore and disseminate to coordinator impalads. The metadata can even 
get so big that fundamental limits like the JVM 2GB array size and the Thrift 
4GB limit are hit, leading to downtime.

To reduce the statestore metadata topic size we have an existing flag, 
"--compact_catalog_topic", which LZ4-compresses the metadata payload for the C++ 
codepaths catalogd->statestore and statestore->impalad.
Unfortunately, the metadata is not compressed in the same way during the FE->BE 
transition on the catalogd and the BE->FE transition on the impalad.

The goal of this change is to enable end-to-end compression for the full path 
of metadata dissemination. The existing code paths also need significant 
cleanup/streamlining. Ideally, the new code should provide consistent size 
limits everywhere.
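
To make the "consistent size limits" idea concrete, here is a rough sketch of 
what a single shared compress/decompress helper with one size check could look 
like. This is not Impala code: all names are hypothetical, zlib stands in for 
LZ4 only because LZ4 is not in the Python standard library, and the limit is 
assumed to be the JVM array bound mentioned above.

```python
import zlib

# Assumption for illustration: one shared limit, set to the largest index a
# JVM byte[] can hold (2^31 - 1 bytes), per the 2GB array limit noted above.
MAX_METADATA_BYTES = 2**31 - 1


class MetadataTooLargeError(Exception):
    """Raised when a payload exceeds the shared size limit."""


def check_size(num_bytes: int, limit: int = MAX_METADATA_BYTES) -> None:
    """Single place where the size limit is enforced."""
    if num_bytes > limit:
        raise MetadataTooLargeError(
            f"metadata payload of {num_bytes} bytes exceeds limit of {limit}")


def compress_metadata(serialized: bytes,
                      limit: int = MAX_METADATA_BYTES) -> bytes:
    """Compress an already-serialized metadata payload (zlib as LZ4 stand-in)."""
    compressed = zlib.compress(serialized)
    check_size(len(compressed), limit)
    return compressed


def decompress_metadata(compressed: bytes,
                        limit: int = MAX_METADATA_BYTES) -> bytes:
    """Decompress a payload, refusing to produce more than `limit` bytes."""
    check_size(len(compressed), limit)
    decompressor = zlib.decompressobj()
    # Ask for at most limit + 1 bytes so an oversized payload is detected
    # without materializing all of it.
    out = decompressor.decompress(compressed, limit + 1)
    check_size(len(out), limit)
    return out
```

With helpers like these, the sender and the receiver on each hop would apply 
the same check to the same compressed bytes, instead of each codepath having 
its own limit.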






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
