[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289085#comment-13289085 ]
Namit Jain commented on HIVE-2989: ---------------------------------- @Carl, most of the development in hive so far has been done in a iterative manner. There are so many examples of features that have been checked in multiple patches - indexes, views to name a few. This patch is not breaking anything existing, but is not ready for final consumption. There will be a couple of follow-ups which will make this patch useful for everyone's consumption. Are you proposing a new policy that only complete features should be allowed to be checked-in in a single patch ? That will slow the community down significantly. There will be multiple side-branches on which development will happen, and it will be very difficult to get them back in trunk. I don't think that has been the case for most of the development that has happened in the past. > Adding Table Links to Hive > -------------------------- > > Key: HIVE-2989 > URL: https://issues.apache.org/jira/browse/HIVE-2989 > Project: Hive > Issue Type: Improvement > Components: Metastore, Query Processor, Security > Affects Versions: 0.10.0 > Reporter: Bhushan Mandhani > Assignee: Bhushan Mandhani > Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, > HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, > HIVE-2989.6.patch.txt > > Original Estimate: 672h > Remaining Estimate: 672h > > This will add Table Links to Hive. This will be an alternate mechanism for a > user to access tables and data in a database that is different from the one > he is associated with. This feature can be used to provide access control (if > access to databasename.tablename in queries and "use database X" is turned > off in conjunction). > If db X wants to access one or more partitions from table T in db Y, the user > will issue: > CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N') > New partitions added to T will automatically be added to the link as well and > become available to X. However, if the link is specified to be static, that > will not be the case. The X user will then have to explicitly import each > partition of T that he needs. The command above will not actually make any > existing partitions of T available to X. Instead, we provide the following > command to add an existing partition to a link: > ALTER LINK T@Y ADD PARTITION (ds='2012-04-27') > The user will need to execute the above for each existing partition that > needs to be imported. For future partitions, Hive will take care of this. An > imported partition can be dropped from a link using a similar command. We > just specify "DROP" instead of "ADD". For querying the linked table, the X > user will refer to it as T@Y. Link Tables will only have read access and not > be writable. The entire Table Link alongwith all its imported partitions can > be dropped as follows: > DROP LINK TO T@Y > The above commands are purely MetaStore operations. The implementation will > rely on replicating the entire partition metadata when a partition is added > to a link. For every link that is created, we will add a new row to table > TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or > "STATIC_LINK_TABLE" if the link has been specified as static). A new column > LINK_TBL_ID will be added which will contain the id of the imported table. It > will be NULL for all other table types including the regular managed tables. > When a partition is added to a link, the new row in the table PARTITIONS will > point to the LINK_TABLE in the same database and not the master table in the > other database. We will replicate all the metadata for this partition from > the master database. The advantage of this approach is that fewer changes > will be needed in query processing and DDL for LINK_TABLEs. Also, commands > like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for > LINK_TABLEs too. Of course, even though the metadata is not shared, the > underlying data on disk is still shared. Hive still needs to know that when > dropping a partition which belongs to a LINK_TABLE, it should not drop the > underlying data from HDFS. Views and external tables cannot be imported from > one database to another. > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira