[ 
https://issues.apache.org/jira/browse/HIVE-26893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sai Hemanth Gantasala reassigned HIVE-26893:
--------------------------------------------

    Assignee: Sai Hemanth Gantasala

> Extend batch partition APIs to ignore partition schemas
> -------------------------------------------------------
>
>                 Key: HIVE-26893
>                 URL: https://issues.apache.org/jira/browse/HIVE-26893
>             Project: Hive
>          Issue Type: New Feature
>          Components: Metastore
>            Reporter: Quanlong Huang
>            Assignee: Sai Hemanth Gantasala
>            Priority: Major
>
> Several HMS APIs return a list of partitions, e.g. 
> get_partitions_ps(), get_partitions_by_names(), and add_partitions_req() with 
> needResult=true. Each returned partition instance carries its own list of 
> FieldSchemas as the partition schema:
> {code:java}
> org.apache.hadoop.hive.metastore.api.Partition
> -> org.apache.hadoop.hive.metastore.api.StorageDescriptor
>    ->  cols: list<org.apache.hadoop.hive.metastore.api.FieldSchema> {code}
> This can take up a lot of memory for wide tables (e.g. with 2k 
> columns). See the heap histogram in IMPALA-11812 for an example.
> Some engines, such as Impala, don't actually use or respect the 
> partition-level schema, so transmitting it wastes network and serde 
> resources. It would be nice if these APIs provided an optional boolean flag 
> for ignoring partition schemas, so that HMS clients (e.g. Impala) don't need 
> to clear them later to save memory.
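
For illustration, the client-side workaround mentioned in the issue (clearing the per-partition column lists after fetching) might look roughly like the sketch below. The classes are minimal hypothetical stand-ins that only mirror the Partition -> StorageDescriptor -> cols nesting shown above; they are not the Thrift-generated metastore classes, and the helper names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionSchemaSketch {

    // Hypothetical stand-in for a column definition (name and type only).
    static class FieldSchema {
        final String name;
        final String type;
        FieldSchema(String name, String type) { this.name = name; this.type = type; }
    }

    // Hypothetical stand-in: holds the per-partition copy of the column list.
    static class StorageDescriptor {
        List<FieldSchema> cols = new ArrayList<>();
    }

    // Hypothetical stand-in for a partition returned by a batch API.
    static class Partition {
        final List<String> values = new ArrayList<>();
        StorageDescriptor sd = new StorageDescriptor();
    }

    // Builds a partition that carries its own copy of a wide schema.
    static Partition wideTablePartition(String value, int numCols) {
        Partition p = new Partition();
        p.values.add(value);
        for (int i = 0; i < numCols; i++) {
            p.sd.cols.add(new FieldSchema("c" + i, "string"));
        }
        return p;
    }

    // Counts the FieldSchema objects held across all partitions.
    static int countFieldSchemas(List<Partition> parts) {
        int total = 0;
        for (Partition p : parts) {
            if (p.sd != null && p.sd.cols != null) {
                total += p.sd.cols.size();
            }
        }
        return total;
    }

    // What a client has to do today: drop the schemas after receiving them.
    static void clearPartitionSchemas(List<Partition> parts) {
        for (Partition p : parts) {
            if (p.sd != null) {
                p.sd.cols = null;
            }
        }
    }

    public static void main(String[] args) {
        List<Partition> parts = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            parts.add(wideTablePartition("p" + i, 2000));
        }
        System.out.println("FieldSchema objects before clearing: " + countFieldSchemas(parts));
        clearPartitionSchemas(parts);
        System.out.println("FieldSchema objects after clearing: " + countFieldSchemas(parts));
    }
}
```

The object count grows as partitions times columns, and by the time a client clears the lists they have already been serialized and sent over the wire. A server-side flag like the one requested would presumably make this clearing step unnecessary.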



--
This message was sent by Atlassian Jira
(v8.20.10#820010)