[
https://issues.apache.org/jira/browse/HIVE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500526#comment-13500526
]
Shreepadma Venugopalan commented on HIVE-3678:
----------------------------------------------
With the changes from HIVE-3712, the column schema has *no* dependency on any
specific DB. It uses only simple data types, which are supported across DBs.
The primary motivation for changing the schema in HIVE-3712 was to avoid
storing column statistics fields as a BLOB. There are two problems with using
a BLOB:
a) BLOBs are designed to store large volumes of data, on the order of GBs, and
are hence stored outside the row. As a consequence, BLOBs don't perform well
for storing small amounts of data. While some DBs such as Oracle inline small
BLOBs, not all DBs do. BLOBs are the only practical choice for data whose size
is not known in advance, but they are overkill for storing around 100 bytes of
data.
b) There is no uniform BLOB support across DB vendors and versions.
Hence I don't really see the value in storing this as a JSON BLOB.
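
To make the two layouts in the comment above concrete, here is a minimal
sketch in Python using the standard-library sqlite3 module. The table and
column names (col_stats_typed, col_stats_blob, num_distincts and so on) are
hypothetical and are not taken from the HIVE-3712 schema. SQLite is used only
so that the example runs anywhere; it does not reproduce the out-of-row BLOB
storage cost described above, only the difference in how the fields are stored
and read back.

```python
import json
import sqlite3


def build_demo():
    """Create both hypothetical layouts in an in-memory DB and load one row each."""
    conn = sqlite3.connect(":memory:")

    # Typed layout: one plain, portable column per statistic (names are made up).
    conn.execute(
        "CREATE TABLE col_stats_typed ("
        " column_name VARCHAR(128) PRIMARY KEY,"
        " num_nulls BIGINT,"
        " num_distincts BIGINT,"
        " long_low_value BIGINT,"
        " long_high_value BIGINT)"
    )

    # BLOB layout: all statistics serialized as JSON into a single BLOB column.
    conn.execute(
        "CREATE TABLE col_stats_blob ("
        " column_name VARCHAR(128) PRIMARY KEY,"
        " stats BLOB)"
    )

    stats = {
        "num_nulls": 3,
        "num_distincts": 42,
        "long_low_value": 1,
        "long_high_value": 99,
    }
    conn.execute(
        "INSERT INTO col_stats_typed VALUES (?, ?, ?, ?, ?)",
        ("id", stats["num_nulls"], stats["num_distincts"],
         stats["long_low_value"], stats["long_high_value"]),
    )
    conn.execute(
        "INSERT INTO col_stats_blob VALUES (?, ?)",
        ("id", json.dumps(stats).encode("utf-8")),
    )
    return conn


def typed_distincts(conn, column_name):
    """Read one statistic directly; the DB returns just that field."""
    row = conn.execute(
        "SELECT num_distincts FROM col_stats_typed WHERE column_name = ?",
        (column_name,),
    ).fetchone()
    return row[0]


def blob_distincts(conn, column_name):
    """Read the whole BLOB, then deserialize it in the client to get one field."""
    row = conn.execute(
        "SELECT stats FROM col_stats_blob WHERE column_name = ?",
        (column_name,),
    ).fetchone()
    return json.loads(row[0].decode("utf-8"))["num_distincts"]
```

With the typed layout a single field can be selected directly, while the BLOB
layout always fetches and parses the whole serialized value, even though the
payload here is only around 100 bytes.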
> Add metastore upgrade scripts for column stats schema changes
> -------------------------------------------------------------
>
> Key: HIVE-3678
> URL: https://issues.apache.org/jira/browse/HIVE-3678
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Reporter: Shreepadma Venugopalan
> Assignee: Shreepadma Venugopalan
> Fix For: 0.10.0
>
> Attachments: HIVE-3678.1.patch.txt
>
>
> Add upgrade script for column statistics schema changes for
> Postgres/MySQL/Oracle/Derby