[ https://issues.apache.org/jira/browse/IMPALA-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17254327#comment-17254327 ]
Vihang Karajgaonkar commented on IMPALA-2840:
---------------------------------------------

We already deduplicate the partition locations using this {{HdfsPartitionLocationCompressor}}.

> Avoid storing redundant information about partitions in the catalog
> -------------------------------------------------------------------
>
>                 Key: IMPALA-2840
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2840
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Catalog
>    Affects Versions: Impala 2.2.4
>            Reporter: Dimitris Tsirogiannis
>            Priority: Major
>              Labels: catalog-server, memory, performance, ramp-up
>
> For each partition we store the entire path in a string. For tables with
> large number of partitions, there is lots of redundancy that we should try
> to avoid in order to reduce the catalog's memory footprint.
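
For readers unfamiliar with the idea, the redundancy described in the issue can be illustrated with a minimal sketch of prefix deduplication. This is not the actual {{HdfsPartitionLocationCompressor}} code; the class name, method names, and example paths below are hypothetical, and the sketch simply assumes that partitions of one table mostly share a parent directory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; not Impala's HdfsPartitionLocationCompressor.
final class PartitionLocationSketch {
  // Each distinct parent directory (prefix) is stored exactly once.
  private final List<String> prefixes = new ArrayList<>();
  private final Map<String, Integer> prefixToIndex = new HashMap<>();

  // Compact form of one location: index of the shared prefix plus the
  // partition-specific suffix.
  static final class Location {
    final int prefixIndex;
    final String suffix;

    Location(int prefixIndex, String suffix) {
      this.prefixIndex = prefixIndex;
      this.suffix = suffix;
    }
  }

  // Splits the path after its last '/' and reuses the prefix if already seen.
  Location compress(String path) {
    int cut = path.lastIndexOf('/') + 1; // 0 when the path contains no '/'
    String prefix = path.substring(0, cut);
    String suffix = path.substring(cut);
    Integer index = prefixToIndex.get(prefix);
    if (index == null) {
      index = prefixes.size();
      prefixes.add(prefix);
      prefixToIndex.put(prefix, index);
    }
    return new Location(index, suffix);
  }

  // Rebuilds the full path from the compact form.
  String expand(Location location) {
    return prefixes.get(location.prefixIndex) + location.suffix;
  }

  int numPrefixes() {
    return prefixes.size();
  }
}
```

With this layout, two partitions under the same (made-up) directory, such as hdfs://nn/warehouse/t/day=1 and hdfs://nn/warehouse/t/day=2, share a single stored prefix, and each partition keeps only its short suffix and an integer index.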