Sergey Shelukhin created HIVE-6265:
--------------------------------------
Summary: dedup Metastore data structures or at least protocol
Key: HIVE-6265
URL: https://issues.apache.org/jira/browse/HIVE-6265
Project: Hive
Issue Type: Improvement
Reporter: Sergey Shelukhin
The Metastore currently stores a storage descriptor (SD) per partition, and the
column schema, serde, etc. per SD. Most of the time all partitions of a table
have the same setup, and the only thing that differs across SD/CD/... is the
location. In such cases we don't need to store these separately, or to send
them to the client separately when many partitions are retrieved for a large
table. Storage changes may be too complex with respect to backward
compatibility, and because DataNucleus is in the picture and controls the DB
schema and persistence. However, we can at least avoid sending lots of
duplicate data to the client over the network: the Thrift protocol can be
modified to omit duplicate data in a backward-compatible manner.
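
To make the idea concrete, here is a rough sketch of what deduplication on the
wire could look like. It is not Hive or Thrift code: every class, field, and
method name below is hypothetical and invented for illustration. It assumes a
response can carry a small list of distinct descriptors plus, per partition,
only the location and an index into that list. How this would map onto the
actual Thrift structs (presumably via new optional fields, so that old clients
keep working) is left open.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch only: none of these types exist in Hive or Thrift.
final class PartitionDedupSketch {

    /** Everything in a storage descriptor except the location (assumed shape). */
    static final class SharedSd {
        final List<String> columns;
        final String serde;
        final String inputFormat;

        SharedSd(List<String> columns, String serde, String inputFormat) {
            this.columns = new ArrayList<>(columns); // defensive copy
            this.serde = serde;
            this.inputFormat = inputFormat;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof SharedSd)) return false;
            SharedSd other = (SharedSd) o;
            return columns.equals(other.columns)
                && Objects.equals(serde, other.serde)
                && Objects.equals(inputFormat, other.inputFormat);
        }

        @Override public int hashCode() {
            return Objects.hash(columns, serde, inputFormat);
        }
    }

    /** Full form: each partition carries its own copy of the descriptor. */
    static final class Partition {
        final String location;
        final SharedSd sd;
        Partition(String location, SharedSd sd) { this.location = location; this.sd = sd; }
    }

    /** Compact form: location plus an index into the shared descriptor list. */
    static final class CompactPartition {
        final String location;
        final int sdIndex;
        CompactPartition(String location, int sdIndex) { this.location = location; this.sdIndex = sdIndex; }
    }

    static final class CompactResponse {
        final List<SharedSd> sharedSds;
        final List<CompactPartition> partitions;
        CompactResponse(List<SharedSd> sharedSds, List<CompactPartition> partitions) {
            this.sharedSds = sharedSds;
            this.partitions = partitions;
        }
    }

    /** Server side: emit each distinct descriptor once, in first-seen order. */
    static CompactResponse compact(List<Partition> partitions) {
        Map<SharedSd, Integer> indexBySd = new LinkedHashMap<>();
        List<CompactPartition> out = new ArrayList<>();
        for (Partition p : partitions) {
            Integer idx = indexBySd.get(p.sd);
            if (idx == null) {
                idx = indexBySd.size();
                indexBySd.put(p.sd, idx);
            }
            out.add(new CompactPartition(p.location, idx));
        }
        return new CompactResponse(new ArrayList<>(indexBySd.keySet()), out);
    }

    /** Client side: rebuild the full per-partition view from the compact form. */
    static List<Partition> expand(CompactResponse response) {
        List<Partition> out = new ArrayList<>();
        for (CompactPartition cp : response.partitions) {
            out.add(new Partition(cp.location, response.sharedSds.get(cp.sdIndex)));
        }
        return out;
    }

    public static void main(String[] args) {
        SharedSd sd = new SharedSd(List.of("id", "name"), "LazySimpleSerDe", "TextInputFormat");
        List<Partition> parts = new ArrayList<>();
        for (int day = 1; day <= 3; day++) {
            parts.add(new Partition("/warehouse/t/ds=2014-01-0" + day, sd));
        }
        CompactResponse r = compact(parts);
        System.out.println("shared descriptors: " + r.sharedSds.size()
            + ", partitions: " + r.partitions.size());
    }
}
```

With this layout, a table whose partitions all share one setup pays for the
column schema and serde once per response rather than once per partition; a
table with mixed setups still works and simply carries more shared entries.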
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)