[ https://issues.apache.org/jira/browse/IMPALA-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506446#comment-16506446 ]

Tianyi Wang commented on IMPALA-5990:
-------------------------------------

Today I learned that a Thrift message larger than 4GB can be used with TBufferedTransport and TBinaryProtocol. The limits lie elsewhere: TMemoryBuffer cannot handle a message larger than 4GB, Thrift cannot handle a single std::string larger than 4GB, etc. So after IMPALA-5990, we have seen a ~6GB compressed catalog and it works just fine.

> End-to-end compression of metadata
> ----------------------------------
>
>                 Key: IMPALA-5990
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5990
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog, Frontend
>    Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>            Reporter: Alexander Behm
>            Assignee: Tianyi Wang
>            Priority: Critical
>             Fix For: Impala 2.12.0
>
>
> The metadata of large tables can become quite big, making it costly to hold in
> the statestore and disseminate to coordinator impalads. The metadata can even
> get so big that fundamental limits like the JVM 2GB array size and the Thrift
> 4GB limit are hit and lead to downtime.
> For reducing the statestore metadata topic size we have an existing
> "compact_catalog_topic" flag which LZ4-compresses the metadata payload for
> the C++ codepaths catalogd->statestore and statestore->impalad.
> Unfortunately, the metadata is not compressed in the same way during the
> FE->BE transition on the catalogd and the BE->FE transition on the impalad.
> The goal of this change is to enable end-to-end compression for the full path
> of metadata dissemination. The existing code paths also need significant
> cleanup/streamlining. Ideally, the new code should provide consistent size
> limits everywhere.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
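
The distinction made in the comment (a per-field 4GB limit versus no limit on the total message size) can be illustrated with a small sketch. This is not Thrift or Impala code. It assumes, for illustration only, that the limit comes from a 32-bit length field, and all names and sizes (kMax32BitLength, FitsIn32BitLength, AllFieldsFit, TotalSize, ToLengthPrefix, the 3GB fields) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical constant (not from Thrift): the largest byte count that fits
// in an unsigned 32-bit length field, i.e. one byte short of 4 GiB.
constexpr std::uint64_t kMax32BitLength =
    std::numeric_limits<std::uint32_t>::max();

// Returns true if a single field of `size` bytes fits in a 32-bit length.
bool FitsIn32BitLength(std::uint64_t size) {
  return size <= kMax32BitLength;
}

// Returns true if every individual field fits, regardless of their sum.
bool AllFieldsFit(const std::vector<std::uint64_t>& field_sizes) {
  for (std::uint64_t size : field_sizes) {
    if (!FitsIn32BitLength(size)) return false;
  }
  return true;
}

// Sums the field sizes in 64 bits, so the total may exceed 4 GiB
// without wrapping around.
std::uint64_t TotalSize(const std::vector<std::uint64_t>& field_sizes) {
  std::uint64_t total = 0;
  for (std::uint64_t size : field_sizes) total += size;
  return total;
}

// Narrows a size to a 32-bit length prefix; the caller must have checked
// that it fits, which the assert enforces in debug builds.
std::uint32_t ToLengthPrefix(std::uint64_t size) {
  assert(FitsIn32BitLength(size));
  return static_cast<std::uint32_t>(size);
}
```

With two hypothetical fields of 3,000,000,000 bytes each, AllFieldsFit returns true while TotalSize returns 6,000,000,000, which no longer fits in 32 bits. Under the stated assumption, that is the situation the comment describes: a streamed message can exceed 4GB as long as no single string or in-memory buffer does.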