[jira] [Commented] (HIVE-2989) Adding Table Links to Hive

Namit Jain (JIRA) Mon, 04 Jun 2012 19:05:25 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289085#comment-13289085
 ]


Namit Jain commented on HIVE-2989:
----------------------------------

@Carl, most of the development in hive so far has been done in a iterative 
manner.
There are so many examples of features that have been checked in multiple 
patches - indexes, views to name a few.

This patch is not breaking anything existing, but is not ready for final 
consumption. There will be a couple of follow-ups 
which will make this patch useful for everyone's consumption. 

Are you proposing a new policy that only complete features should be allowed to 
be checked-in in a single patch ?
That will slow the community down significantly. There will be multiple 
side-branches on which development will happen,
and it will be very difficult to get them back in trunk.

I don't think that has been the case for most of the development that has 
happened in the past.
                
> Adding Table Links to Hive
> --------------------------
>
>                 Key: HIVE-2989
>                 URL: https://issues.apache.org/jira/browse/HIVE-2989
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore, Query Processor, Security
>    Affects Versions: 0.10.0
>            Reporter: Bhushan Mandhani
>            Assignee: Bhushan Mandhani
>         Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, 
> HIVE-2989.6.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-2989) Adding Table Links to Hive

Reply via email to