[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-06-12 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2989:
-

Status: Open  (was: Patch Available)

@Bhushan: I added some more comments on phabricator. The main issue at this 
point is that the last version of the patch enabled SELECTs on tablelinks. This 
can't go in until a test is added that demonstrates someone can't use this to 
bypass SELECT authorization checks on target tables.

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.10.0
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.10.patch.txt, 
> HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, 
> HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt, HIVE-2989.9.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-06-12 Thread Bhushan Mandhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhushan Mandhani updated HIVE-2989:
---

Attachment: HIVE-2989.10.patch.txt

Added querying and addressed latest comments.

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.10.0
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.10.patch.txt, 
> HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, 
> HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt, HIVE-2989.9.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-06-11 Thread Bhushan Mandhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhushan Mandhani updated HIVE-2989:
---

Hadoop Flags:   (was: Reviewed)
  Status: Patch Available  (was: Open)

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.10.0
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, 
> HIVE-2989.6.patch.txt, HIVE-2989.9.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-06-11 Thread Bhushan Mandhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhushan Mandhani updated HIVE-2989:
---

Attachment: HIVE-2989.9.patch.txt

Updated per comments.

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.10.0
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, 
> HIVE-2989.6.patch.txt, HIVE-2989.9.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-06-04 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2989:
-

Status: Open  (was: Patch Available)

@Bhushan: I added some review comments. Thanks.

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.10.0
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, 
> HIVE-2989.6.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-06-04 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-2989:


Status: Patch Available  (was: Open)

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.10.0
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, 
> HIVE-2989.6.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-06-03 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2989:
-

Status: Open  (was: Patch Available)

@Bhushan: Please post an up-to-date version of this patch on either phabricator 
or reviewboard. Thanks.

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.10.0
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, 
> HIVE-2989.6.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-06-01 Thread Bhushan Mandhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhushan Mandhani updated HIVE-2989:
---

Status: Patch Available  (was: Reopened)

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.10.0
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, 
> HIVE-2989.6.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-06-01 Thread Bhushan Mandhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhushan Mandhani updated HIVE-2989:
---

Attachment: HIVE-2989.6.patch.txt

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.10.0
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, 
> HIVE-2989.6.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-06-01 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2989:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed. Thanks Bhushan

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.10.0
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-06-01 Thread Bhushan Mandhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhushan Mandhani updated HIVE-2989:
---

Affects Version/s: (was: 0.8.1)
   0.10.0
   Status: Patch Available  (was: Open)

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.10.0
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-05-31 Thread Bhushan Mandhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhushan Mandhani updated HIVE-2989:
---

Attachment: HIVE-2989.5.patch.txt

Had forgotten to rerecord a couple of negative tests.

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.8.1
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-05-31 Thread Bhushan Mandhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhushan Mandhani updated HIVE-2989:
---

Attachment: HIVE-2989.4.patch.txt

Updated per Namit's comments. Except for comments 6 and 7. 7 was already done 
and 6 was not entirely applicable.

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.8.1
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-05-31 Thread Bhushan Mandhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhushan Mandhani updated HIVE-2989:
---

Attachment: HIVE-2989.3.patch.txt

Added DROP and updated per comments.

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.8.1
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, 
> HIVE-2989.3.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-05-30 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2989:
-

Fix Version/s: (was: 0.9.0)

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.8.1
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-05-28 Thread Bhushan Mandhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhushan Mandhani updated HIVE-2989:
---

Attachment: HIVE-2989.2.patch.txt

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.8.1
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Fix For: 0.9.0
>
> Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-05-25 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2989:
-

Status: Open  (was: Patch Available)

A couple comments:

* There are open questions on the design proposal wiki page that need to be 
answered.

* This patch doesn't include metastore upgrade scripts. Also, on it's own this 
patch doesn't provide any benefit to users, but it does impose an additional 
burden since it forces a metastore upgrade. I don't think it makes sense to 
accept this burden until we reach a state where the feature can be demonstrated 
end-to-end.


> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.8.1
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Fix For: 0.9.0
>
> Attachments: HIVE-2989.1.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-05-25 Thread Bhushan Mandhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhushan Mandhani updated HIVE-2989:
---

Fix Version/s: (was: 0.10.0)
   0.9.0
Affects Version/s: 0.8.1
   Status: Patch Available  (was: Open)

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Affects Versions: 0.8.1
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Fix For: 0.9.0
>
> Attachments: HIVE-2989.1.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

2012-05-25 Thread Bhushan Mandhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhushan Mandhani updated HIVE-2989:
---

Attachment: HIVE-2989.1.patch.txt

> Adding Table Links to Hive
> --
>
> Key: HIVE-2989
> URL: https://issues.apache.org/jira/browse/HIVE-2989
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Query Processor, Security
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
> Fix For: 0.10.0
>
> Attachments: HIVE-2989.1.patch.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This will add Table Links to Hive. This will be an alternate mechanism for a 
> user to access tables and data in a database that is different from the one 
> he is associated with. This feature can be used to provide access control (if 
> access to databasename.tablename in queries and "use database X" is turned 
> off in conjunction).
> If db X wants to access one or more partitions from table T in db Y, the user 
> will issue:
> CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
> New partitions added to T will automatically be added to the link as well and 
> become available to X. However, if the link is specified to be static, that 
> will not be the case. The X user will then have to explicitly import each 
> partition of T that he needs. The command above will not actually make any 
> existing partitions of T available to X. Instead, we provide the following 
> command to add an existing partition to a link:
> ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
> The user will need to execute the above for each existing partition that 
> needs to be imported. For future partitions, Hive will take care of this. An 
> imported partition can be dropped from a link using a similar command. We 
> just specify "DROP" instead of "ADD". For querying the linked table, the X 
> user will refer to it as T@Y. Link Tables will only have read access and not 
> be writable. The entire Table Link alongwith all its imported partitions can 
> be dropped as follows:
> DROP LINK TO T@Y
> The above commands are purely MetaStore operations. The implementation will 
> rely on replicating the entire partition metadata when a partition is added 
> to a link.  For every link that is created, we will add a new row to table 
> TBLS. The TBL_TYPE column will have a new kind of value "LINK_TABLE" (or 
> "STATIC_LINK_TABLE" if the link has been specified as static). A new column 
> LINK_TBL_ID will be added which will contain the id of the imported table. It 
> will be NULL for all other table types including the regular managed tables. 
> When a partition is added to a link, the new row in the table PARTITIONS will 
> point to the LINK_TABLE in the same database  and not the master table in the 
> other database. We will replicate all the metadata for this partition from 
> the master database. The advantage of this approach is that fewer changes 
> will be needed in query processing and DDL for LINK_TABLEs. Also, commands 
> like "SHOW TABLES" and "SHOW PARTITIONS" will work as expected for 
> LINK_TABLEs too. Of course, even though the metadata is not shared, the 
> underlying data on disk is still shared. Hive still needs to know that when 
> dropping a partition which belongs to a LINK_TABLE, it should not drop the 
> underlying data from HDFS. Views and external tables cannot be imported from 
> one database to another.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira