Hello Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11341

to look at the new patch set (#5).

Change subject: IMPALA-7424: Reduce in-memory footprint of incremental stats
......................................................................

IMPALA-7424: Reduce in-memory footprint of incremental stats

Currently, incremental stats are stored as chunked Base64 strings in the
HMS parameters map of partition objects. When stored in the catalogd,
each of these strings is a Java 'String' object that uses UTF-16
encoding and takes up to 2 bytes per character.

This patch converts the string representation into a gzipped byte array
when the partition is loaded in the catalogd, and the stats stay in that
form when they are transmitted to the coordinators.

To maintain backward compatibility, the persistent HMS representation of
stats has not been modified. The incremental stats are therefore still
written back in the chunked Base64 representation when the partition
state is serialized to HMS.

On a real-world catalog server whose memory footprint is dominated by
incremental stats, this patch showed a ~54% end-to-end heap size
reduction and a ~79% reduction in the memory footprint of the
incremental stats data structures.

This patch also improves the way callers check whether a partition has
incremental stats: the answer is computed once and reused later. Without
the patch, we deserialize the entire incremental stats structure every
time this information is needed, which triggers a spike in working
memory usage on catalogds/Impalads.

Testing: Ran core tests on the Catalog V1 implementation. Ran some
manual queries on the Catalog V2 implementation.

Change-Id: I39f02ebfa0c6e9b0baedd0d76058a1b34efb5a02
---
M common/thrift/CatalogObjects.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A fe/src/main/java/org/apache/impala/util/CompressionUtil.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
A fe/src/test/java/org/apache/impala/util/CompressionUtilTest.java
17 files changed, 333 insertions(+), 117 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/11341/5
--
To view, visit http://gerrit.cloudera.org:8080/11341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I39f02ebfa0c6e9b0baedd0d76058a1b34efb5a02
Gerrit-Change-Number: 11341
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
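
For readers following the change description, here is a rough sketch of the round trip it describes: chunked Base64 strings are joined, decoded, and held in memory as a gzipped byte array, then expanded back to chunked Base64 when written out. This is not the code from the patch. The class name, method names, and chunk size are hypothetical, and decoding the Base64 before compressing is an assumption; the actual CompressionUtil.java may work differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; none of these names come from the patch.
final class StatsCompressionSketch {

  // Gzip a byte array entirely in memory.
  static byte[] gzip(byte[] input) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
      gz.write(input);
    }
    // The stream must be closed before the bytes are read so the trailer is written.
    return bos.toByteArray();
  }

  // Inverse of gzip(): inflate a gzipped byte array.
  static byte[] gunzip(byte[] compressed) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buf = new byte[4096];
      int n;
      while ((n = gz.read(buf)) != -1) {
        bos.write(buf, 0, n);
      }
    }
    return bos.toByteArray();
  }

  // Base64-encode raw bytes and split the result into fixed-size chunks,
  // mimicking a chunked persistent representation.
  static List<String> splitBase64(byte[] raw, int chunkSize) {
    String encoded = Base64.getEncoder().encodeToString(raw);
    List<String> chunks = new ArrayList<>();
    for (int i = 0; i < encoded.length(); i += chunkSize) {
      chunks.add(encoded.substring(i, Math.min(encoded.length(), i + chunkSize)));
    }
    return chunks;
  }

  // Load path: join the chunks, decode the Base64, keep only gzipped bytes.
  static byte[] compressFromChunks(List<String> chunks) throws IOException {
    byte[] decoded = Base64.getDecoder().decode(String.join("", chunks));
    return gzip(decoded);
  }

  // Write-back path: inflate and re-chunk as Base64, so the persistent
  // format is unchanged.
  static List<String> toChunks(byte[] compressed, int chunkSize) throws IOException {
    return splitBase64(gunzip(compressed), chunkSize);
  }

  public static void main(String[] args) throws IOException {
    byte[] raw = new byte[5000];
    Arrays.fill(raw, (byte) 'a');  // Highly repetitive stand-in payload.
    List<String> chunks = splitBase64(raw, 1000);
    byte[] packed = compressFromChunks(chunks);
    int base64Chars = String.join("", chunks).length();
    System.out.println("Base64 chars held as String: " + base64Chars);
    System.out.println("gzipped bytes held in memory: " + packed.length);
    System.out.println("round trip ok: " + toChunks(packed, 1000).equals(chunks));
  }
}
```

The payload in main() is artificial and compresses far better than real stats would, so the printed sizes say nothing about the ~54% and ~79% figures reported above.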