[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-2989:
---------------------------------

    Status: Open  (was: Patch Available)

@Bhushan: I added some more comments on phabricator. The main issue at this point is that the last version of the patch enabled SELECTs on tablelinks. This can't go in until a test is added that demonstrates someone can't use this to bypass SELECT authorization checks on target tables.

> Adding Table Links to Hive
> --------------------------
>
>                 Key: HIVE-2989
>                 URL: https://issues.apache.org/jira/browse/HIVE-2989
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore, Query Processor, Security
>    Affects Versions: 0.10.0
>            Reporter: Bhushan Mandhani
>            Assignee: Bhushan Mandhani
>         Attachments: HIVE-2989.1.patch.txt, HIVE-2989.10.patch.txt,
> HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt,
> HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt, HIVE-2989.9.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a
> user to access tables and data in a database that is different from the one
> he is associated with. This feature can be used to provide access control (if
> access to databasename.tablename in queries and "use database X" is turned
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user
> will issue:
>   CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and
> become available to X. However, if the link is specified to be static, that
> will not be the case. The X user will then have to explicitly import each
> partition of T that he needs. The command above will not actually make any
> existing partitions of T available to X. Instead, we provide the following
> command to add an existing partition to a link:
>   ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that
> needs to be imported. For future partitions, Hive will take care of this. An
> imported partition can be dropped from a link using a similar command. We
> just specify "DROP" instead of "ADD". For querying the linked table, the X
> user will refer to it as T@Y. Link Tables will only have read access and not
> be writable. The entire Table Link along with all its imported partitions can
> be dropped as follows:
>   DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will
> rely on replicating the entire partition metadata when a partition is added
> to a link. For every link that is created, we will add a new row to table
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column
> LINK_TBL_ID will be added which will contain the id of the imported table. It
> will be NULL for all other table types including the regular managed tables.
> When a partition is added to a link, the new row in the table PARTITIONS will
> point to the LINK_TABLE in the same database and not the master table in the
> other database. We will replicate all the metadata for this partition from
> the master database. The advantage of this approach is that fewer changes
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for
> LINK_TABLEs too. Of course, even though the metadata is not shared, the
> underlying data on disk is still shared. Hive still needs to know that when
> dropping a partition which belongs to a LINK_TABLE, it should not drop the
> underlying data from HDFS. Views and external tables cannot be imported from
> one database to another.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
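
For readers of this thread, the metastore bookkeeping in the issue description can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not code from any HIVE-2989 patch: ToyMetastore, TableRow, deleted_locations and the method names are hypothetical, the warehouse paths are made up, and only the TBL_TYPE values and the LINK_TBL_ID column come from the description. The model does not cover what happens to a link when the master table drops a partition, because the description does not say.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class TableRow:
    """One row of TBLS, reduced to the columns the description mentions."""
    tbl_id: int
    db: str
    name: str
    tbl_type: str                      # "MANAGED_TABLE", "LINK_TABLE" or "STATIC_LINK_TABLE"
    link_tbl_id: Optional[int] = None  # LINK_TBL_ID: None (NULL) unless the row is a link
    partitions: Dict[str, str] = field(default_factory=dict)  # partition spec -> data location


class ToyMetastore:
    """Hypothetical in-memory stand-in for the metastore; not Hive code."""

    def __init__(self) -> None:
        self.tables: Dict[int, TableRow] = {}
        self.deleted_locations: Set[str] = set()  # data that would be removed from HDFS

    def _insert(self, db: str, name: str, tbl_type: str,
                link_tbl_id: Optional[int] = None) -> TableRow:
        row = TableRow(len(self.tables) + 1, db, name, tbl_type, link_tbl_id)
        self.tables[row.tbl_id] = row
        return row

    def create_table(self, db: str, name: str) -> TableRow:
        return self._insert(db, name, "MANAGED_TABLE")

    def create_link(self, db: str, target: TableRow, static: bool = False) -> TableRow:
        # CREATE [STATIC] LINK TO T@Y: adds a TBLS row, imports no existing partitions.
        tbl_type = "STATIC_LINK_TABLE" if static else "LINK_TABLE"
        return self._insert(db, f"{target.name}@{target.db}", tbl_type, target.tbl_id)

    def add_partition(self, table: TableRow, spec: str, location: str) -> None:
        # A new partition on the master table is also added to every
        # non-static link that points at it.
        table.partitions[spec] = location
        for row in self.tables.values():
            if row.link_tbl_id == table.tbl_id and row.tbl_type == "LINK_TABLE":
                row.partitions[spec] = location

    def alter_link_add_partition(self, link: TableRow, spec: str) -> None:
        # ALTER LINK T@Y ADD PARTITION: replicate the master's partition metadata.
        master = self.tables[link.link_tbl_id]
        link.partitions[spec] = master.partitions[spec]

    def drop_partition(self, table: TableRow, spec: str) -> None:
        location = table.partitions.pop(spec)
        if table.link_tbl_id is None:
            # Only a non-link table owns the data; a link drop is metadata-only.
            self.deleted_locations.add(location)


ms = ToyMetastore()
t = ms.create_table("Y", "T")
ms.add_partition(t, "ds=2012-04-26", "/warehouse/Y/T/ds=2012-04-26")
link = ms.create_link("X", t)                       # dynamic link
static_link = ms.create_link("X2", t, static=True)  # static link
ms.add_partition(t, "ds=2012-04-27", "/warehouse/Y/T/ds=2012-04-27")
ms.alter_link_add_partition(static_link, "ds=2012-04-27")
ms.drop_partition(link, "ds=2012-04-27")            # must not touch shared data
```

In this sketch the dynamic link never sees ds=2012-04-26 (it existed before the link was created), picks up ds=2012-04-27 automatically, and dropping that partition from the link leaves deleted_locations empty, which is the "do not drop the underlying data from HDFS" requirement.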
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:
-----------------------------------

    Attachment: HIVE-2989.10.patch.txt

Added querying and addressed latest comments.
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:
-----------------------------------

    Hadoop Flags:   (was: Reviewed)
          Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:
-----------------------------------

    Attachment: HIVE-2989.9.patch.txt

Updated per comments.
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-2989:
---------------------------------

    Status: Open  (was: Patch Available)

@Bhushan: I added some review comments. Thanks.
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-2989:
--------------------------------

    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-2989:
---------------------------------

    Status: Open  (was: Patch Available)

@Bhushan: Please post an up-to-date version of this patch on either phabricator or reviewboard. Thanks.
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:
-----------------------------------

    Status: Patch Available  (was: Reopened)
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:
-----------------------------------

    Attachment: HIVE-2989.6.patch.txt
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-2989: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed. Thanks Bhushan
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-2989: --- Affects Version/s: (was: 0.8.1) 0.10.0 Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-2989: --- Attachment: HIVE-2989.5.patch.txt Had forgotten to rerecord a couple of negative tests.
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-2989: --- Attachment: HIVE-2989.4.patch.txt Updated per Namit's comments. Except for comments 6 and 7. 7 was already done and 6 was not entirely applicable.
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-2989: --- Attachment: HIVE-2989.3.patch.txt Added DROP and updated per comments.
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2989: - Fix Version/s: (was: 0.9.0)
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-2989: --- Attachment: HIVE-2989.2.patch.txt
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach updated HIVE-2989:
    Status: Open (was: Patch Available)
A couple of comments:
* There are open questions on the design proposal wiki page that need to be answered.
* This patch doesn't include metastore upgrade scripts.
Also, on its own this patch doesn't provide any benefit to users, but it does impose an additional burden since it forces a metastore upgrade. I don't think it makes sense to accept this burden until we reach a state where the feature can be demonstrated end-to-end.
> Adding Table Links to Hive
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
> Issue Type: Improvement
> Components: Metastore, Query Processor, Security
> Affects Versions: 0.8.1
> Reporter: Bhushan Mandhani
> Assignee: Bhushan Mandhani
> Fix For: 0.9.0
> Attachments: HIVE-2989.1.patch.txt
> Original Estimate: 672h
> Remaining Estimate: 672h
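For readers following the metastore-upgrade concern, here is a minimal sketch of the schema change the issue description implies, run against a toy SQLite database rather than a real metastore. The names TBLS, PARTITIONS, TBL_TYPE, LINK_TBL_ID, LINK_TABLE and STATIC_LINK_TABLE come from the description. Everything else is an assumption made for illustration: the reduced column set, the DB_NAME column, the SQLite types, and the helper function names. It is not the actual metastore schema or the upgrade script that the patch would need.

```python
import sqlite3


def create_pre_upgrade_schema(conn):
    """Create toy stand-ins for the metastore tables (hypothetical, reduced columns)."""
    conn.execute(
        "CREATE TABLE TBLS ("
        "TBL_ID INTEGER PRIMARY KEY, DB_NAME TEXT, TBL_NAME TEXT, TBL_TYPE TEXT)"
    )
    conn.execute(
        "CREATE TABLE PARTITIONS ("
        "PART_ID INTEGER PRIMARY KEY, TBL_ID INTEGER, PART_NAME TEXT)"
    )


def upgrade_for_links(conn):
    """Add the nullable LINK_TBL_ID column; it stays NULL for non-link tables."""
    conn.execute("ALTER TABLE TBLS ADD COLUMN LINK_TBL_ID INTEGER")


def create_link(conn, link_db, target_db, target_tbl, static=False):
    """Add a TBLS row for a link in link_db that points at target_tbl in target_db."""
    row = conn.execute(
        "SELECT TBL_ID FROM TBLS "
        "WHERE DB_NAME = ? AND TBL_NAME = ? AND LINK_TBL_ID IS NULL",
        (target_db, target_tbl),
    ).fetchone()
    if row is None:
        raise ValueError("target table not found")
    tbl_type = "STATIC_LINK_TABLE" if static else "LINK_TABLE"
    cur = conn.execute(
        "INSERT INTO TBLS (DB_NAME, TBL_NAME, TBL_TYPE, LINK_TBL_ID) "
        "VALUES (?, ?, ?, ?)",
        (link_db, target_tbl, tbl_type, row[0]),
    )
    return cur.lastrowid


def add_partition_to_link(conn, link_id, part_name):
    """Replicate one partition row so that the copy points at the link table."""
    target_id = conn.execute(
        "SELECT LINK_TBL_ID FROM TBLS WHERE TBL_ID = ?", (link_id,)
    ).fetchone()[0]
    src = conn.execute(
        "SELECT PART_NAME FROM PARTITIONS WHERE TBL_ID = ? AND PART_NAME = ?",
        (target_id, part_name),
    ).fetchone()
    if src is None:
        raise ValueError("partition not found on target table")
    conn.execute(
        "INSERT INTO PARTITIONS (TBL_ID, PART_NAME) VALUES (?, ?)",
        (link_id, src[0]),
    )
```

In this sketch the only structural change to an existing table is the added column, which is the part that would force existing deployments through an upgrade step. The copied partition row carries only metadata, consistent with the description's point that the underlying data stays shared.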
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bhushan Mandhani updated HIVE-2989:
    Fix Version/s: 0.9.0 (was: 0.10.0)
    Affects Version/s: 0.8.1
    Status: Patch Available (was: Open)
> Adding Table Links to Hive
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
> Issue Type: Improvement
> Components: Metastore, Query Processor, Security
> Affects Versions: 0.8.1
> Reporter: Bhushan Mandhani
> Assignee: Bhushan Mandhani
> Fix For: 0.9.0
> Attachments: HIVE-2989.1.patch.txt
> Original Estimate: 672h
> Remaining Estimate: 672h
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bhushan Mandhani updated HIVE-2989:
    Attachment: HIVE-2989.1.patch.txt
> Adding Table Links to Hive
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
> Issue Type: Improvement
> Components: Metastore, Query Processor, Security
> Reporter: Bhushan Mandhani
> Assignee: Bhushan Mandhani
> Fix For: 0.10.0
> Attachments: HIVE-2989.1.patch.txt
> Original Estimate: 672h
> Remaining Estimate: 672h