[ https://issues.apache.org/jira/browse/IMPALA-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabor Kaszab reassigned IMPALA-11265:
-------------------------------------

    Assignee: Gabor Kaszab

> Iceberg tables have a large memory footprint in catalog cache
> -------------------------------------------------------------
>
>                 Key: IMPALA-11265
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11265
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Gabor Kaszab
>            Priority: Major
>              Labels: impala-iceberg
>
> During the investigation of IMPALA-11260, I found that the cache item size of
> an (IcebergApiTableCacheKey, org.apache.iceberg.BaseTable) pair can be 30MB.
> For instance, here are the cache items of the Iceberg table
> {{functional_parquet.iceberg_partitioned}}:
> {code:java}
> weigh=3792, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$TableCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$TableMetaRefImpl
> weigh=14960, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$IcebergMetaCacheKey, valueClass=class org.apache.impala.thrift.TPartialTableInfo
> weigh=30546992, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$IcebergApiTableCacheKey, valueClass=class org.apache.iceberg.BaseTable
> weigh=496, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=496, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=496, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=512, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=472, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionListCacheKey, valueClass=class java.util.ArrayList
> weigh=10328, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl{code}
> Note that this table has just 20 rows, yet its total memory footprint is
> about 30MB.
> For a normal partitioned Parquet table, the memory footprint is not that
> large. For instance, here are the cache items for
> {{functional_parquet.alltypes}}:
> {code:java}
> weigh=4216, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$TableCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$TableMetaRefImpl
> weigh=480, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=472, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=488, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=488, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=480, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=488, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=488, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=488, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=488, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=488, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=496, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=352, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=352, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$ColStatsCacheKey, valueClass=class org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj
> weigh=4248, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionListCacheKey, valueClass=class java.util.ArrayList
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1288, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1288, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1288, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1288, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1288, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1288, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1288, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1296, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1288, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1288, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl
> weigh=1288, keyClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionCacheKey, valueClass=class org.apache.impala.catalog.local.CatalogdMetaProvider$PartitionMetadataImpl{code}
> The total size is around 45KB.
> It is worth double-checking whether we need the whole
> org.apache.iceberg.BaseTable object. Maybe we can extract just what Impala
> needs into a custom value class.
> CC [~boroknagyz]
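
The two totals quoted in the issue (about 30MB and about 45KB) can be checked by summing the weigh= values from each listing. The following is a minimal sketch of that arithmetic only. The class name CacheWeightSum and the method totalWeight are hypothetical helpers written for this illustration and are not part of Impala. The weights in main are copied from the iceberg_partitioned listing in the issue description.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not Impala code.
public class CacheWeightSum {
  // Matches the "weigh=<bytes>" field of each logged cache item.
  private static final Pattern WEIGH = Pattern.compile("weigh=(\\d+)");

  /** Sums every weigh=<n> value found in the given log text. */
  public static long totalWeight(String log) {
    long total = 0;
    Matcher m = WEIGH.matcher(log);
    while (m.find()) {
      total += Long.parseLong(m.group(1));
    }
    return total;
  }

  public static void main(String[] args) {
    // Weights copied from the iceberg_partitioned listing in the issue.
    String iceberg = "weigh=3792 weigh=14960 weigh=30546992 weigh=496 "
        + "weigh=496 weigh=496 weigh=512 weigh=472 weigh=10328";
    long total = totalWeight(iceberg);
    long baseTable = 30546992L;
    System.out.println("iceberg_partitioned total bytes: " + total);
    System.out.println("BaseTable share of total: " + (double) baseTable / total);
  }
}
```

For the iceberg_partitioned listing the sum is 30578544 bytes, nearly all of it from the single BaseTable entry. Summing the functional_parquet.alltypes listing the same way gives 45536 bytes, which matches the "around 45KB" figure.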