Bharath Vissapragada has uploaded a new patch set (#8). Change subject: IMPALA-5881: Use native allocation while building catalog updates ......................................................................
IMPALA-5881: Use native allocation while building catalog updates This patch moves the allocation of thrift structures for serializing the output of GetAllCatalogObjects() call into the native side. This is done to prevent the Catalog from hitting JVM array limitations (2GB maximum size) at scale. Additionally this patch also extends the --compact_catalog_top=true to apply TCompactProtocol for the above serialization to reduce the in-memory footprint. This patch also caps the native allocations to not go beyond 4GB due to Thrift library limitations. (Thrift internally uses uint32_t datatype to represent the message size which limits the size to ~4.2GB). Testing: Passed ASAN + HDFS core and DEBUG + HDFS exhaustive. Deployed the patch on a 16 node cluster and tested it on a Catalog-update topic of 3.5GB (uncompressed) or 780MB (compressed). Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 --- M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/rpc/jni-thrift-util.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/service/JniCatalog.java A fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java A fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java A fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java 10 files changed, 502 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7955/8 -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>