[jira] [Commented] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706598#comment-13706598 ]

Bhushan Mandhani commented on HIVE-2989:

Hi, Bhushan Mandhani is no longer at Facebook so this email address is no longer being monitored. If you need assistance, please contact another person who is currently at the company.

Adding Table Links to Hive
Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.10.patch.txt, HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt, HIVE-2989.9.patch.txt
Original Estimate: 672h
Remaining Estimate: 672h

This will add Table Links to Hive. This will be an alternate mechanism for a user to access tables and data in a database that is different from the one he is associated with. This feature can be used to provide access control (if access to databasename.tablename in queries and "use database X" are turned off in conjunction).

If db X wants to access one or more partitions from table T in db Y, the user will issue:

    CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')

New partitions added to T will automatically be added to the link as well and become available to X. However, if the link is specified to be static, that will not be the case. The X user will then have to explicitly import each partition of T that he needs.

The command above will not actually make any existing partitions of T available to X. Instead, we provide the following command to add an existing partition to a link:

    ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')

The user will need to execute the above for each existing partition that needs to be imported. For future partitions, Hive will take care of this. An imported partition can be dropped from a link using a similar command. We just specify DROP instead of ADD.

For querying the linked table, the X user will refer to it as T@Y. Link Tables will only have read access and not be writable. The entire Table Link along with all its imported partitions can be dropped as follows:

    DROP LINK TO T@Y

The above commands are purely MetaStore operations. The implementation will rely on replicating the entire partition metadata when a partition is added to a link.

For every link that is created, we will add a new row to table TBLS. The TBL_TYPE column will have a new kind of value LINK_TABLE (or STATIC_LINK_TABLE if the link has been specified as static). A new column LINK_TBL_ID will be added which will contain the id of the imported table. It will be NULL for all other table types including the regular managed tables.

When a partition is added to a link, the new row in the table PARTITIONS will point to the LINK_TABLE in the same database and not the master table in the other database. We will replicate all the metadata for this partition from the master database. The advantage of this approach is that fewer changes will be needed in query processing and DDL for LINK_TABLEs. Also, commands like SHOW TABLES and SHOW PARTITIONS will work as expected for LINK_TABLEs too.

Of course, even though the metadata is not shared, the underlying data on disk is still shared. Hive still needs to know that when dropping a partition which belongs to a LINK_TABLE, it should not drop the underlying data from HDFS.

Views and external tables cannot be imported from one database to another.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
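The metadata behaviour proposed in HIVE-2989 can be pictured with a toy in-memory model. This is only a sketch under stated assumptions: the class `TableLinkSketch`, its `Table` type and all method names are hypothetical, the sets stand in for the TBLS and PARTITIONS rows and for HDFS directories, and none of it is taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy in-memory model of the proposed link metadata; not Hive's metastore code. */
final class TableLinkSketch {
    enum TableType { MANAGED_TABLE, LINK_TABLE, STATIC_LINK_TABLE }

    /** Simplified TBLS row: linkTarget (LINK_TBL_ID) is null for every non-link table. */
    static final class Table {
        final String db;
        final String name;
        final TableType type;
        final Table linkTarget;
        final Set<String> partitions = new HashSet<>(); // per-table partition metadata

        Table(String db, String name, TableType type, Table linkTarget) {
            this.db = db;
            this.name = name;
            this.type = type;
            this.linkTarget = linkTarget;
        }
    }

    private final List<Table> tables = new ArrayList<>();
    private final Set<String> dataOnDisk = new HashSet<>(); // stands in for HDFS directories

    Table createTable(String db, String name) {
        Table t = new Table(db, name, TableType.MANAGED_TABLE, null);
        tables.add(t);
        return t;
    }

    /** CREATE [STATIC] LINK: adds a row but imports no existing partitions. */
    Table createLink(String db, Table target, boolean isStatic) {
        TableType type = isStatic ? TableType.STATIC_LINK_TABLE : TableType.LINK_TABLE;
        Table link = new Table(db, target.name + "@" + target.db, type, target);
        tables.add(link);
        return link;
    }

    /** New partition on the master table; non-static links pick it up automatically. */
    void addPartition(Table master, String spec) {
        master.partitions.add(spec);
        dataOnDisk.add(master.db + "/" + master.name + "/" + spec);
        for (Table t : tables) {
            if (t.type == TableType.LINK_TABLE && t.linkTarget == master) {
                t.partitions.add(spec); // replicate the partition metadata into the link
            }
        }
    }

    /** ALTER LINK ... ADD PARTITION: copies metadata for an existing partition. */
    void importPartition(Table link, String spec) {
        if (link.linkTarget == null || !link.linkTarget.partitions.contains(spec)) {
            throw new IllegalArgumentException("No such partition on the linked table: " + spec);
        }
        link.partitions.add(spec);
    }

    /** Dropping from a link removes metadata only; the shared data stays on disk. */
    void dropPartition(Table t, String spec) {
        t.partitions.remove(spec);
        if (t.linkTarget == null) {
            dataOnDisk.remove(t.db + "/" + t.name + "/" + spec);
        }
    }

    boolean dataExists(Table master, String spec) {
        return dataOnDisk.contains(master.db + "/" + master.name + "/" + spec);
    }
}
```

The point of the model is the last rule in the description: a link owns a copy of the partition metadata but never the data, so dropping a partition from a link must leave the master table's files untouched.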
[jira] [Created] (HIVE-4023) Improve Error Logging in MetaStore
Bhushan Mandhani created HIVE-4023:

Summary: Improve Error Logging in MetaStore
Key: HIVE-4023
URL: https://issues.apache.org/jira/browse/HIVE-4023
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Trivial

The RetryingHMSHandler should log the entire stack trace before throwing an exception.
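As a rough illustration of what HIVE-4023 asks for, the sketch below logs the full stack trace before rethrowing. It uses `java.util.logging` purely as a stand-in for whatever logging framework Hive uses, and `StackTraceLoggingSketch` and `invokeLogged` are hypothetical names, not part of RetryingHMSHandler.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for a retrying handler; not Hive's RetryingHMSHandler. */
final class StackTraceLoggingSketch {
    private static final Logger LOG =
            Logger.getLogger(StackTraceLoggingSketch.class.getName());

    /** Runs the action; on failure logs the throwable with its stack trace, then rethrows. */
    static <T> T invokeLogged(Callable<T> action) throws Exception {
        try {
            return action.call();
        } catch (Exception e) {
            // Passing the Throwable itself (not e.getMessage()) attaches the
            // whole stack trace to the log record.
            LOG.log(Level.SEVERE, "MetaStore call failed", e);
            throw e;
        }
    }

    public static void main(String[] args) {
        try {
            invokeLogged(() -> {
                throw new IllegalStateException("simulated failure");
            });
        } catch (Exception e) {
            System.out.println("rethrown: " + e.getMessage());
        }
    }
}
```

Logging before the throw matters because, once the exception crosses a process boundary, the server-side frames may no longer be available to the caller.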
[jira] [Updated] (HIVE-4023) Improve Error Logging in MetaStore
[ https://issues.apache.org/jira/browse/HIVE-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-4023:
    Attachment: HIVE-4023.1.patch.txt

Improve Error Logging in MetaStore
Key: HIVE-4023
URL: https://issues.apache.org/jira/browse/HIVE-4023
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Trivial
Attachments: HIVE-4023.1.patch.txt

The RetryingHMSHandler should log the entire stack trace before throwing an exception.
[jira] [Updated] (HIVE-4023) Improve Error Logging in MetaStore
[ https://issues.apache.org/jira/browse/HIVE-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-4023:
    Status: Patch Available (was: Open)

Improve Error Logging in MetaStore
Key: HIVE-4023
URL: https://issues.apache.org/jira/browse/HIVE-4023
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Trivial
Attachments: HIVE-4023.1.patch.txt

The RetryingHMSHandler should log the entire stack trace before throwing an exception.
[jira] [Resolved] (HIVE-3831) Add Command to Turn Sorting Off for a Bucketed Table
[ https://issues.apache.org/jira/browse/HIVE-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani resolved HIVE-3831.
    Resolution: Duplicate

Add Command to Turn Sorting Off for a Bucketed Table
Key: HIVE-3831
URL: https://issues.apache.org/jira/browse/HIVE-3831
Project: Hive
Issue Type: Bug
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

If we have specified a bucketed table as sorted on some columns, there is no Hive command to turn the sorting off for that table. There are scenarios where we need to do this.
[jira] [Resolved] (HIVE-3716) Create Table Like should support TableProperties
[ https://issues.apache.org/jira/browse/HIVE-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani resolved HIVE-3716.
    Resolution: Duplicate

Fixed by Kevin Wilfong in another jira.

Create Table Like should support TableProperties
Key: HIVE-3716
URL: https://issues.apache.org/jira/browse/HIVE-3716
Project: Hive
Issue Type: New Feature
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Trivial

Create Table Like currently doesn't allow the specification of TableProperties for the created table. It will be useful to allow that.
[jira] [Commented] (HIVE-3959) Update Partition Statistics in Metastore Layer
[ https://issues.apache.org/jira/browse/HIVE-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570910#comment-13570910 ]

Bhushan Mandhani commented on HIVE-3959:

Diff out at https://reviews.facebook.net/D8271
Still need some minor updates before I can submit the patch.

Update Partition Statistics in Metastore Layer
Key: HIVE-3959
URL: https://issues.apache.org/jira/browse/HIVE-3959
Project: Hive
Issue Type: Improvement
Components: Metastore, Statistics
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

When partitions are created using queries (insert overwrite and insert into) then the StatsTask updates all stats. However, when partitions are added directly through metadata-only partitions (either CLI or direct calls to Thrift Metastore) no stats are populated even if hive.stats.reliable is set to true. This puts us in a situation where we can't decide if stats are truly reliable or not.

We propose that the fast stats (numFiles and totalSize) which don't require a scan of the data should always be populated and be completely reliable. For now we are still excluding rowCount and rawDataSize because that will make these operations very expensive. Currently they are quick metadata-only ops.
[jira] [Created] (HIVE-3959) Update Partition Statistics in Metastore Layer
Bhushan Mandhani created HIVE-3959:

Summary: Update Partition Statistics in Metastore Layer
Key: HIVE-3959
URL: https://issues.apache.org/jira/browse/HIVE-3959
Project: Hive
Issue Type: Improvement
Components: Metastore, Statistics
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

When partitions are created using queries (insert overwrite and insert into) then the StatsTask updates all stats. However, when partitions are added directly through metadata-only partitions (either CLI or direct calls to Thrift Metastore) no stats are populated even if hive.stats.reliable is set to true. This puts us in a situation where we can't decide if stats are truly reliable or not.

We propose that the fast stats (numFiles and totalSize) which don't require a scan of the data should always be populated and be completely reliable. For now we are still excluding rowCount and rawDataSize because that will make these operations very expensive. Currently they are quick metadata-only ops.
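The "fast stats" named in HIVE-3959 are cheap because they come from a directory listing rather than from reading the data. A minimal sketch of that idea follows, assuming a local directory via `java.nio.file` as a stand-in for a partition location on HDFS; `FastStatsSketch`, `fastStats` and `demo` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Computes numFiles and totalSize from a listing, without opening any file. */
final class FastStatsSketch {
    /** Returns {numFiles, totalSize} for the regular files directly under dir. */
    static long[] fastStats(Path dir) {
        long numFiles = 0;
        long totalSize = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    numFiles++;
                    totalSize += Files.size(entry); // file metadata only, no data scan
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] {numFiles, totalSize};
    }

    /** Builds a throwaway partition directory with two small files and measures it. */
    static long[] demo() {
        try {
            Path dir = Files.createTempDirectory("partition");
            Files.write(dir.resolve("part-00000"), new byte[] {1, 2, 3});
            Files.write(dir.resolve("part-00001"), new byte[] {4, 5, 6, 7, 8});
            return fastStats(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] stats = demo();
        System.out.println("numFiles=" + stats[0] + " totalSize=" + stats[1]);
    }
}
```

By contrast, rowCount and rawDataSize would require reading every file, which is why the issue leaves them out of the metadata-only path.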
[jira] [Created] (HIVE-3831) Add Command to Turn Sorting Off for a Bucketed Table
Bhushan Mandhani created HIVE-3831:

Summary: Add Command to Turn Sorting Off for a Bucketed Table
Key: HIVE-3831
URL: https://issues.apache.org/jira/browse/HIVE-3831
Project: Hive
Issue Type: Bug
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

If we have specified a bucketed table as sorted on some columns, there is no Hive command to turn the sorting off for that table. There are scenarios where we need to do this.
[jira] [Created] (HIVE-3806) Ptest failing due to Argument list too long errors
Bhushan Mandhani created HIVE-3806:

Summary: Ptest failing due to Argument list too long errors
Key: HIVE-3806
URL: https://issues.apache.org/jira/browse/HIVE-3806
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

ptest creates a really huge shell command to delete from each test host those .q files that it should not be running. For TestCliDriver, the command has become long enough that it is over the threshold allowed by the shell. We should rewrite it so that the same semantics is captured in a shorter command.
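HIVE-3806 does not say how the command was shortened. One generic way to stay under a shell's length limit, shown here only as an alternative technique and not as the fix in the patch, is to split the argument list into several shorter commands. `CommandBatchingSketch` and `batchCommands` are hypothetical names, and the sketch does no shell quoting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Splits one long command into several that each stay within a length limit. */
final class CommandBatchingSketch {
    /**
     * Builds commands of the form "prefix arg arg ...", starting a new command
     * whenever the next argument would push the current one past maxLength.
     * A single argument that is too long on its own still gets its own command.
     */
    static List<String> batchCommands(String prefix, List<String> args, int maxLength) {
        List<String> commands = new ArrayList<>();
        StringBuilder current = new StringBuilder(prefix);
        boolean hasArgs = false;
        for (String arg : args) {
            if (hasArgs && current.length() + 1 + arg.length() > maxLength) {
                commands.add(current.toString());
                current = new StringBuilder(prefix);
                hasArgs = false;
            }
            current.append(' ').append(arg);
            hasArgs = true;
        }
        if (hasArgs) {
            commands.add(current.toString());
        }
        return commands;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("a.q", "b.q", "c.q");
        for (String command : batchCommands("rm -f", files, 13)) {
            System.out.println(command);
        }
    }
}
```

Batching keeps the semantics of the original command (every listed file is still deleted) at the cost of running more than one command per host.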
[jira] [Commented] (HIVE-3806) Ptest failing due to Argument list too long errors
[ https://issues.apache.org/jira/browse/HIVE-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532934#comment-13532934 ]

Bhushan Mandhani commented on HIVE-3806:

Diff at https://reviews.facebook.net/D7413

Ptest failing due to Argument list too long errors
Key: HIVE-3806
URL: https://issues.apache.org/jira/browse/HIVE-3806
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

ptest creates a really huge shell command to delete from each test host those .q files that it should not be running. For TestCliDriver, the command has become long enough that it is over the threshold allowed by the shell. We should rewrite it so that the same semantics is captured in a shorter command.
[jira] [Updated] (HIVE-3806) Ptest failing due to Argument list too long errors
[ https://issues.apache.org/jira/browse/HIVE-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-3806:
    Attachment: HIVE-3806.1.patch.txt

Ptest failing due to Argument list too long errors
Key: HIVE-3806
URL: https://issues.apache.org/jira/browse/HIVE-3806
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor
Attachments: HIVE-3806.1.patch.txt

ptest creates a really huge shell command to delete from each test host those .q files that it should not be running. For TestCliDriver, the command has become long enough that it is over the threshold allowed by the shell. We should rewrite it so that the same semantics is captured in a shorter command.
[jira] [Updated] (HIVE-3806) Ptest failing due to Argument list too long errors
[ https://issues.apache.org/jira/browse/HIVE-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-3806:
    Status: Patch Available (was: Open)

Ptest failing due to Argument list too long errors
Key: HIVE-3806
URL: https://issues.apache.org/jira/browse/HIVE-3806
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor
Attachments: HIVE-3806.1.patch.txt

ptest creates a really huge shell command to delete from each test host those .q files that it should not be running. For TestCliDriver, the command has become long enough that it is over the threshold allowed by the shell. We should rewrite it so that the same semantics is captured in a shorter command.
[jira] [Created] (HIVE-3780) RetryingMetaStoreClient Should Log the Caught Exception
Bhushan Mandhani created HIVE-3780:

Summary: RetryingMetaStoreClient Should Log the Caught Exception
Key: HIVE-3780
URL: https://issues.apache.org/jira/browse/HIVE-3780
Project: Hive
Issue Type: Bug
Reporter: Bhushan Mandhani
Priority: Trivial

Currently it logs the cause of the caught exception. It should log the caught exception itself.
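The difference HIVE-3780 describes can be seen by rendering both throwables. The sketch below is illustrative only: `CaughtExceptionLoggingSketch` and `render` are hypothetical names, and the exceptions are simulated rather than taken from RetryingMetaStoreClient.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.net.ConnectException;

/** Compares what is visible when logging e.getCause() versus e itself. */
final class CaughtExceptionLoggingSketch {
    /** Renders a throwable with its full "Caused by" chain, as a logger would. */
    static String render(Throwable t) {
        StringWriter out = new StringWriter();
        t.printStackTrace(new PrintWriter(out, true));
        return out.toString();
    }

    public static void main(String[] args) {
        Exception caught = new RuntimeException(
                "wrapper: metastore call failed",
                new ConnectException("root cause: connection refused"));

        // Rendering only the cause loses the wrapper's message and frames.
        System.out.println(render(caught.getCause()));

        // Rendering the caught exception keeps the wrapper and the cause chain.
        System.out.println(render(caught));
    }
}
```

Since the caught exception's trace already ends with the "Caused by" section, logging it loses nothing compared with logging the cause alone.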
[jira] [Assigned] (HIVE-3780) RetryingMetaStoreClient Should Log the Caught Exception
[ https://issues.apache.org/jira/browse/HIVE-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani reassigned HIVE-3780:
    Assignee: Bhushan Mandhani

RetryingMetaStoreClient Should Log the Caught Exception
Key: HIVE-3780
URL: https://issues.apache.org/jira/browse/HIVE-3780
Project: Hive
Issue Type: Bug
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Trivial

Currently it logs the cause of the caught exception. It should log the caught exception itself.
[jira] [Commented] (HIVE-3780) RetryingMetaStoreClient Should Log the Caught Exception
[ https://issues.apache.org/jira/browse/HIVE-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526762#comment-13526762 ]

Bhushan Mandhani commented on HIVE-3780:

Diff out at https://reviews.facebook.net/D7239

RetryingMetaStoreClient Should Log the Caught Exception
Key: HIVE-3780
URL: https://issues.apache.org/jira/browse/HIVE-3780
Project: Hive
Issue Type: Bug
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Trivial

Currently it logs the cause of the caught exception. It should log the caught exception itself.
[jira] [Updated] (HIVE-3780) RetryingMetaStoreClient Should Log the Caught Exception
[ https://issues.apache.org/jira/browse/HIVE-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-3780:
    Attachment: HIVE-3780.1.patch.txt

RetryingMetaStoreClient Should Log the Caught Exception
Key: HIVE-3780
URL: https://issues.apache.org/jira/browse/HIVE-3780
Project: Hive
Issue Type: Bug
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Trivial
Attachments: HIVE-3780.1.patch.txt

Currently it logs the cause of the caught exception. It should log the caught exception itself.
[jira] [Updated] (HIVE-3780) RetryingMetaStoreClient Should Log the Caught Exception
[ https://issues.apache.org/jira/browse/HIVE-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-3780:
    Status: Patch Available (was: Open)

RetryingMetaStoreClient Should Log the Caught Exception
Key: HIVE-3780
URL: https://issues.apache.org/jira/browse/HIVE-3780
Project: Hive
Issue Type: Bug
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Trivial
Attachments: HIVE-3780.1.patch.txt

Currently it logs the cause of the caught exception. It should log the caught exception itself.
[jira] [Updated] (HIVE-3400) Add Retries to Hive MetaStore Connections
[ https://issues.apache.org/jira/browse/HIVE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-3400:
    Attachment: HIVE-3400.3.patch.txt

Add Retries to Hive MetaStore Connections
Key: HIVE-3400
URL: https://issues.apache.org/jira/browse/HIVE-3400
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor
Labels: metastore
Attachments: HIVE-3400.1.patch.txt, HIVE-3400.2.patch.txt, HIVE-3400.3.patch.txt

Currently, when using Thrift to access the MetaStore, if the Thrift host dies, there is no mechanism to reconnect to some other host even if the MetaStore URIs variable in the Conf contains multiple hosts. Hive should retry and reconnect rather than throwing a communication link error.
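The retry-and-reconnect behaviour requested in HIVE-3400 can be sketched as a loop over the configured URIs. Everything here is an assumption made for illustration: `MetaStoreFailoverSketch`, the `Connector` interface, `connectWithRetries`, the host names and the `maxRounds` parameter are hypothetical and do not reflect the attached patches, and a real client would likely also wait between attempts.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Tries each configured MetaStore URI in turn until one connection succeeds. */
final class MetaStoreFailoverSketch {
    /** Hypothetical connection factory; a real client would open a Thrift transport here. */
    interface Connector<C> {
        C connect(String uri) throws Exception;
    }

    /**
     * Walks the URI list in order, up to maxRounds times, and returns the first
     * connection that succeeds. Throws if every attempt fails.
     */
    static <C> C connectWithRetries(List<String> uris, int maxRounds, Connector<C> connector)
            throws Exception {
        Exception last = null;
        for (int round = 0; round < maxRounds; round++) {
            for (String uri : uris) {
                try {
                    return connector.connect(uri);
                } catch (Exception e) {
                    last = e; // remember the failure and move on to the next host
                }
            }
        }
        throw new Exception(
                "Could not connect to any MetaStore URI after " + maxRounds + " round(s)", last);
    }

    public static void main(String[] args) throws Exception {
        List<String> uris = Arrays.asList("thrift://host-a:9083", "thrift://host-b:9083");
        String connection = connectWithRetries(uris, 3, uri -> {
            if (uri.contains("host-a")) {
                throw new IOException("simulated dead host: " + uri);
            }
            return "connected to " + uri;
        });
        System.out.println(connection);
    }
}
```

Keeping the last failure as the cause of the final exception means the caller still sees why the last host was unreachable.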
[jira] [Updated] (HIVE-3400) Add Retries to Hive MetaStore Connections
[ https://issues.apache.org/jira/browse/HIVE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-3400:
    Labels: metastore (was: )
    Status: Patch Available (was: Open)

Add Retries to Hive MetaStore Connections
Key: HIVE-3400
URL: https://issues.apache.org/jira/browse/HIVE-3400
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor
Labels: metastore
Attachments: HIVE-3400.1.patch.txt, HIVE-3400.2.patch.txt

Currently, when using Thrift to access the MetaStore, if the Thrift host dies, there is no mechanism to reconnect to some other host even if the MetaStore URIs variable in the Conf contains multiple hosts. Hive should retry and reconnect rather than throwing a communication link error.
[jira] [Commented] (HIVE-3400) Add Retries to Hive MetaStore Connections
[ https://issues.apache.org/jira/browse/HIVE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507626#comment-13507626 ]

Bhushan Mandhani commented on HIVE-3400:

Ashutosh, I've uploaded and submitted the latest patch. Thanks.

Add Retries to Hive MetaStore Connections
Key: HIVE-3400
URL: https://issues.apache.org/jira/browse/HIVE-3400
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor
Labels: metastore
Attachments: HIVE-3400.1.patch.txt, HIVE-3400.2.patch.txt

Currently, when using Thrift to access the MetaStore, if the Thrift host dies, there is no mechanism to reconnect to some other host even if the MetaStore URIs variable in the Conf contains multiple hosts. Hive should retry and reconnect rather than throwing a communication link error.
[jira] [Commented] (HIVE-3400) Add Retries to Hive MetaStore Connections
[ https://issues.apache.org/jira/browse/HIVE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507768#comment-13507768 ]

Bhushan Mandhani commented on HIVE-3400:

Ashutosh, we no longer need HIVE-3612. Jean is about to abandon that diff. I think we should keep these RetryingRawStore changes here since RetryingHMSHandler already catches JDOExceptions. But I can take it out if you prefer that.

Add Retries to Hive MetaStore Connections
Key: HIVE-3400
URL: https://issues.apache.org/jira/browse/HIVE-3400
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor
Labels: metastore
Attachments: HIVE-3400.1.patch.txt, HIVE-3400.2.patch.txt

Currently, when using Thrift to access the MetaStore, if the Thrift host dies, there is no mechanism to reconnect to some other host even if the MetaStore URIs variable in the Conf contains multiple hosts. Hive should retry and reconnect rather than throwing a communication link error.
[jira] [Commented] (HIVE-3400) Add Retries to Hive MetaStore Connections
[ https://issues.apache.org/jira/browse/HIVE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505175#comment-13505175 ]

Bhushan Mandhani commented on HIVE-3400:

Yes, I'll do Submit Patch after making one key change that Carl pointed out. Working on that now.

Add Retries to Hive MetaStore Connections
Key: HIVE-3400
URL: https://issues.apache.org/jira/browse/HIVE-3400
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor
Attachments: HIVE-3400.1.patch.txt

Currently, when using Thrift to access the MetaStore, if the Thrift host dies, there is no mechanism to reconnect to some other host even if the MetaStore URIs variable in the Conf contains multiple hosts. Hive should retry and reconnect rather than throwing a communication link error.
[jira] [Created] (HIVE-3744) Thrift create_table should check types of table columns
Bhushan Mandhani created HIVE-3744:

Summary: Thrift create_table should check types of table columns
Key: HIVE-3744
URL: https://issues.apache.org/jira/browse/HIVE-3744
Project: Hive
Issue Type: Bug
Components: Thrift API
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

The Thrift create_table() does not look at the datatype strings of Table objects coming in through Thrift. When someone fails to set one of them, we can end up with empty string for datatype and corrupt metadata.
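The check HIVE-3744 calls for amounts to rejecting a table definition whose column type strings are unset before anything is persisted. The sketch below covers only the null-or-blank case from the description; it does not validate that a type name is a real Hive type, and `ColumnTypeCheckSketch` and its methods are hypothetical names that model a table as a plain column-name-to-type map instead of the Thrift `Table` object.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Rejects table definitions that contain columns with a missing type string. */
final class ColumnTypeCheckSketch {
    /** Returns the names of columns whose type string is null or blank. */
    static List<String> columnsWithMissingType(Map<String, String> columnTypes) {
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, String> column : columnTypes.entrySet()) {
            String type = column.getValue();
            if (type == null || type.trim().isEmpty()) {
                bad.add(column.getKey());
            }
        }
        return bad;
    }

    /** Throws before any metadata would be written if a column type is missing. */
    static void validate(Map<String, String> columnTypes) {
        List<String> bad = columnsWithMissingType(columnTypes);
        if (!bad.isEmpty()) {
            throw new IllegalArgumentException("Missing type for column(s): " + bad);
        }
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "bigint");
        columns.put("name", ""); // the caller forgot to set the type
        System.out.println(columnsWithMissingType(columns));
    }
}
```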
[jira] [Created] (HIVE-3716) Create Table Like should support TableProperties
Bhushan Mandhani created HIVE-3716:

Summary: Create Table Like should support TableProperties
Key: HIVE-3716
URL: https://issues.apache.org/jira/browse/HIVE-3716
Project: Hive
Issue Type: New Feature
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Trivial

Create Table Like currently doesn't allow the specification of TableProperties for the created table. It will be useful to allow that.
[jira] [Updated] (HIVE-3626) RetryingHMSHandler should wrap JDOException inside MetaException
[ https://issues.apache.org/jira/browse/HIVE-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-3626:
    Attachment: HIVE-3626.1.patch.txt

RetryingHMSHandler should wrap JDOException inside MetaException
Key: HIVE-3626
URL: https://issues.apache.org/jira/browse/HIVE-3626
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor
Attachments: HIVE-3626.1.patch.txt

RetryingHMSHandler catches and retries for JDOExceptions. If retry limit is exceeded, it throws the caught JDOException. Instead it should wrap this up in a MetaException. That way the error message and stack trace can be successfully communicated back to the Hive Client by the Thrift host and the client can show better error messages and log messages. Otherwise, everything appears as a TException with no further debugging info.
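A rough picture of the wrapping described in HIVE-3626: retry on the datastore exception, and once the limit is exceeded, carry its message and stack trace inside a declared exception. The sketch assumes that only the message text of the declared exception reaches the client, which is why the trace is rendered into that message. `StoreException` and `DeclaredMetaException` are hypothetical stand-ins for JDOException and MetaException, and `callWithRetries` is not the RetryingHMSHandler code.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Retries a datastore call and wraps the final failure in a declared exception. */
final class WrapStoreExceptionSketch {
    /** Stand-in for a datastore-layer runtime exception such as a JDOException. */
    static final class StoreException extends RuntimeException {
        StoreException(String message) {
            super(message);
        }
    }

    /** Stand-in for a declared, message-carrying exception such as MetaException. */
    static final class DeclaredMetaException extends Exception {
        DeclaredMetaException(String message) {
            super(message);
        }
    }

    interface StoreCall<T> {
        T run();
    }

    /** Makes one initial attempt plus up to retryLimit retries. */
    static <T> T callWithRetries(StoreCall<T> call, int retryLimit) throws DeclaredMetaException {
        int attempts = Math.max(1, retryLimit + 1);
        StoreException last = null;
        for (int attempt = 0; attempt < attempts; attempt++) {
            try {
                return call.run();
            } catch (StoreException e) {
                last = e; // retry on datastore failures only
            }
        }
        // Retry limit exceeded: render message and stack trace into the declared exception.
        StringWriter trace = new StringWriter();
        last.printStackTrace(new PrintWriter(trace, true));
        throw new DeclaredMetaException(trace.toString());
    }

    public static void main(String[] args) {
        try {
            callWithRetries(() -> {
                throw new StoreException("simulated datastore failure");
            }, 2);
        } catch (DeclaredMetaException e) {
            System.out.println(e.getMessage());
        }
    }
}
```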
[jira] [Created] (HIVE-3626) RetryingHMSHandler should wrap JDOException inside MetaException
Bhushan Mandhani created HIVE-3626:

Summary: RetryingHMSHandler should wrap JDOException inside MetaException
Key: HIVE-3626
URL: https://issues.apache.org/jira/browse/HIVE-3626
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

RetryingHMSHandler catches and retries for JDOExceptions. If retry limit is exceeded, it throws the caught JDOException. Instead it should wrap this up in a MetaException. That way the error message and stack trace can be successfully communicated back to the Hive Client by the Thrift host and the client can show better error messages and log messages. Otherwise, everything appears as a TException with no further debugging info.
[jira] [Commented] (HIVE-3626) RetryingHMSHandler should wrap JDOException inside MetaException
[ https://issues.apache.org/jira/browse/HIVE-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485292#comment-13485292 ]

Bhushan Mandhani commented on HIVE-3626:

Diff at https://reviews.facebook.net/D6255

RetryingHMSHandler should wrap JDOException inside MetaException
Key: HIVE-3626
URL: https://issues.apache.org/jira/browse/HIVE-3626
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

RetryingHMSHandler catches and retries for JDOExceptions. If retry limit is exceeded, it throws the caught JDOException. Instead it should wrap this up in a MetaException. That way the error message and stack trace can be successfully communicated back to the Hive Client by the Thrift host and the client can show better error messages and log messages. Otherwise, everything appears as a TException with no further debugging info.
[jira] [Updated] (HIVE-3215) JobDebugger should use RunningJob.getTrackingURL
[ https://issues.apache.org/jira/browse/HIVE-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-3215:
    Attachment: HIVE-3215.5.patch.txt

Refreshed patch.

JobDebugger should use RunningJob.getTrackingURL
Key: HIVE-3215
URL: https://issues.apache.org/jira/browse/HIVE-3215
Project: Hive
Issue Type: Bug
Components: CLI, Diagnosability
Reporter: Ramkumar Vadali
Assignee: Bhushan Mandhani
Priority: Minor
Attachments: HIVE-3215.2.patch, HIVE-3215.3.patch.txt, HIVE-3215.4.patch.txt, HIVE-3215.5.patch.txt, HIVE-3215.patch

When a MR job fails, the JobDebugger tries to construct the job tracker URL by connecting to the job tracker, but that is better done by using RunningJob#getTrackingURL. Also, it tries to construct URLs to the tasks, which is not reliable, because the job could have been retired and the URL would not work.
[jira] [Updated] (HIVE-3215) JobDebugger should use RunningJob.getTrackingURL
[ https://issues.apache.org/jira/browse/HIVE-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-3215:
    Status: Patch Available (was: Open)

JobDebugger should use RunningJob.getTrackingURL
Key: HIVE-3215
URL: https://issues.apache.org/jira/browse/HIVE-3215
Project: Hive
Issue Type: Bug
Components: CLI, Diagnosability
Reporter: Ramkumar Vadali
Assignee: Bhushan Mandhani
Priority: Minor
Attachments: HIVE-3215.2.patch, HIVE-3215.3.patch.txt, HIVE-3215.4.patch.txt, HIVE-3215.5.patch.txt, HIVE-3215.patch

When a MR job fails, the JobDebugger tries to construct the job tracker URL by connecting to the job tracker, but that is better done by using RunningJob#getTrackingURL. Also, it tries to construct URLs to the tasks, which is not reliable, because the job could have been retired and the URL would not work.
[jira] [Updated] (HIVE-3400) Add Retries to Hive MetaStore Connections
[ https://issues.apache.org/jira/browse/HIVE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-3400: --- Status: Patch Available (was: Open) Add Retries to Hive MetaStore Connections - Key: HIVE-3400 URL: https://issues.apache.org/jira/browse/HIVE-3400 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3400.1.patch.txt Currently, when using Thrift to access the MetaStore, if the Thrift host dies, there is no mechanism to reconnect to some other host even if the MetaStore URIs variable in the Conf contains multiple hosts. Hive should retry and reconnect rather than throwing a communication link error.
[jira] [Updated] (HIVE-3400) Add Retries to Hive MetaStore Connections
[ https://issues.apache.org/jira/browse/HIVE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-3400: --- Attachment: HIVE-3400.1.patch.txt Add Retries to Hive MetaStore Connections - Key: HIVE-3400 URL: https://issues.apache.org/jira/browse/HIVE-3400 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3400.1.patch.txt Currently, when using Thrift to access the MetaStore, if the Thrift host dies, there is no mechanism to reconnect to some other host even if the MetaStore URIs variable in the Conf contains multiple hosts. Hive should retry and reconnect rather than throwing a communication link error.
[jira] [Commented] (HIVE-3400) Add Retries to Hive MetaStore Connections
[ https://issues.apache.org/jira/browse/HIVE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13446419#comment-13446419 ] Bhushan Mandhani commented on HIVE-3400: Thanks Ashutosh. I've updated per your suggestion. Add Retries to Hive MetaStore Connections - Key: HIVE-3400 URL: https://issues.apache.org/jira/browse/HIVE-3400 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3400.1.patch.txt Currently, when using Thrift to access the MetaStore, if the Thrift host dies, there is no mechanism to reconnect to some other host even if the MetaStore URIs variable in the Conf contains multiple hosts. Hive should retry and reconnect rather than throwing a communication link error.
[jira] [Created] (HIVE-3400) Add Retries to Hive MetaStore Connections
Bhushan Mandhani created HIVE-3400: -- Summary: Add Retries to Hive MetaStore Connections Key: HIVE-3400 URL: https://issues.apache.org/jira/browse/HIVE-3400 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Currently, when using Thrift to access the MetaStore, if the Thrift host dies, there is no mechanism to reconnect to some other host even if the MetaStore URIs variable in the Conf contains multiple hosts. Hive should retry and reconnect rather than throwing a communication link error. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
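The failover behaviour requested in HIVE-3400 can be sketched roughly as follows. This is an illustrative sketch only, not the code from the attached patch: the class `MetaStoreRetrySketch`, the `Connector` interface, and the `maxRounds` parameter are hypothetical names, and the connector stands in for whatever actually opens the Thrift transport to one host.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names are taken from the HIVE-3400 patch.
class MetaStoreRetrySketch {

    /** Stand-in for the code that opens a Thrift connection to one MetaStore host. */
    interface Connector<T> {
        T open(URI uri) throws Exception;
    }

    /** Splits a comma-separated URI list (as a MetaStore URIs setting might hold) into URIs. */
    static List<URI> parseUris(String commaSeparated) {
        List<URI> uris = new ArrayList<>();
        for (String part : commaSeparated.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                uris.add(URI.create(trimmed));
            }
        }
        return uris;
    }

    /**
     * Tries each URI in order. If every host fails, starts another pass over
     * the list, for at most maxRounds passes, and only then gives up.
     */
    static <T> T connect(List<URI> uris, int maxRounds, Connector<T> connector) {
        if (uris.isEmpty() || maxRounds < 1) {
            throw new IllegalArgumentException("need at least one URI and one round");
        }
        Exception last = null;
        for (int round = 0; round < maxRounds; round++) {
            for (URI uri : uris) {
                try {
                    return connector.open(uri);
                } catch (Exception e) {
                    last = e; // remember the failure and fall through to the next host
                }
            }
        }
        throw new IllegalStateException(
            "could not connect to any MetaStore URI after " + maxRounds + " round(s)", last);
    }
}
```

The point of the sketch is that a dead host costs one failed attempt rather than the whole session; a real client would probably also want a delay between rounds.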
[jira] [Commented] (HIVE-3400) Add Retries to Hive MetaStore Connections
[ https://issues.apache.org/jira/browse/HIVE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438983#comment-13438983 ] Bhushan Mandhani commented on HIVE-3400: Diff at https://reviews.facebook.net/D4791 Add Retries to Hive MetaStore Connections - Key: HIVE-3400 URL: https://issues.apache.org/jira/browse/HIVE-3400 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Currently, when using Thrift to access the MetaStore, if the Thrift host dies, there is no mechanism to reconnect to some other host even if the MetaStore URIs variable in the Conf contains multiple hosts. Hive should retry and reconnect rather than throwing a communication link error.
[jira] [Commented] (HIVE-3400) Add Retries to Hive MetaStore Connections
[ https://issues.apache.org/jira/browse/HIVE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439023#comment-13439023 ] Bhushan Mandhani commented on HIVE-3400: Yes, it is. Add Retries to Hive MetaStore Connections - Key: HIVE-3400 URL: https://issues.apache.org/jira/browse/HIVE-3400 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Currently, when using Thrift to access the MetaStore, if the Thrift host dies, there is no mechanism to reconnect to some other host even if the MetaStore URIs variable in the Conf contains multiple hosts. Hive should retry and reconnect rather than throwing a communication link error.
[jira] [Commented] (HIVE-3337) Create Table Like should copy configured Table Parameters
[ https://issues.apache.org/jira/browse/HIVE-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434799#comment-13434799 ] Bhushan Mandhani commented on HIVE-3337: Carl, did you get time to re-run the tests for this? Thanks. Create Table Like should copy configured Table Parameters - Key: HIVE-3337 URL: https://issues.apache.org/jira/browse/HIVE-3337 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Labels: configuration Attachments: HIVE-3337.1.patch.txt Currently, Create Table A Like B does not copy any Table Parameters of B into A. We will add a HiveConf variable that will allow users to specify a list of parameters that they want copied over to A for this command.
[jira] [Updated] (HIVE-3337) Create Table Like should copy configured Table Parameters
[ https://issues.apache.org/jira/browse/HIVE-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-3337: --- Attachment: HIVE-3337.1.patch.txt Create Table Like should copy configured Table Parameters - Key: HIVE-3337 URL: https://issues.apache.org/jira/browse/HIVE-3337 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3337.1.patch.txt Currently, Create Table A Like B does not copy any Table Parameters of B into A. We will add a HiveConf variable that will allow users to specify a list of parameters that they want copied over to A for this command.
[jira] [Updated] (HIVE-3337) Create Table Like should copy configured Table Parameters
[ https://issues.apache.org/jira/browse/HIVE-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-3337: --- Labels: configuration (was: ) Status: Patch Available (was: Open) Submitting patch corresponding to latest diff. Create Table Like should copy configured Table Parameters - Key: HIVE-3337 URL: https://issues.apache.org/jira/browse/HIVE-3337 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Labels: configuration Attachments: HIVE-3337.1.patch.txt Currently, Create Table A Like B does not copy any Table Parameters of B into A. We will add a HiveConf variable that will allow users to specify a list of parameters that they want copied over to A for this command.
[jira] [Created] (HIVE-3337) Create Table Like should copy configured Table Parameters
Bhushan Mandhani created HIVE-3337: -- Summary: Create Table Like should copy configured Table Parameters Key: HIVE-3337 URL: https://issues.apache.org/jira/browse/HIVE-3337 Project: Hive Issue Type: Improvement Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Currently, Create Table A Like B does not copy any Table Parameters of B into A. We will add a HiveConf variable that will allow users to specify a list of parameters that they want copied over to A for this command.
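The proposal in HIVE-3337 amounts to filtering the source table's parameters through a configured list of names. A minimal sketch of that filtering step, assuming the configured value is a comma-separated string, might look like this. The class and method names are hypothetical, and the actual HiveConf variable name is not given in this thread, so none is assumed here.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: names are illustrative and not taken from the HIVE-3337 patch.
class CreateTableLikeParamsSketch {

    /**
     * Returns only those parameters of the source table (B) whose names appear
     * in the configured comma-separated list; everything else is left out of
     * the new table (A).
     */
    static Map<String, String> copyConfiguredParams(Map<String, String> sourceParams,
                                                    String configuredNames) {
        Set<String> keep = new HashSet<>();
        for (String name : configuredNames.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                keep.add(trimmed);
            }
        }
        Map<String, String> copied = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : sourceParams.entrySet()) {
            if (keep.contains(entry.getKey())) {
                copied.put(entry.getKey(), entry.getValue());
            }
        }
        return copied;
    }
}
```

With an empty configured list nothing is copied, which matches the behaviour described as current in the issue.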
[jira] [Commented] (HIVE-3337) Create Table Like should copy configured Table Parameters
[ https://issues.apache.org/jira/browse/HIVE-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13429405#comment-13429405 ] Bhushan Mandhani commented on HIVE-3337: Because users might want to just copy the schema over without bringing over all parameters. It is too much of a burden for them to examine each parameter and decide whether it is applicable for them or not. At Facebook, we will have a core set of parameters that will be copied and everything else excluded. Create Table Like should copy configured Table Parameters - Key: HIVE-3337 URL: https://issues.apache.org/jira/browse/HIVE-3337 Project: Hive Issue Type: Improvement Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Currently, Create Table A Like B does not copy any Table Parameters of B into A. We will add a HiveConf variable that will allow users to specify a list of parameters that they want copied over to A for this command.
[jira] [Commented] (HIVE-3337) Create Table Like should copy configured Table Parameters
[ https://issues.apache.org/jira/browse/HIVE-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13429438#comment-13429438 ] Bhushan Mandhani commented on HIVE-3337: We have other parameters related to anonymizing user data that are not known to Hive but need to be copied over. Create Table Like should copy configured Table Parameters - Key: HIVE-3337 URL: https://issues.apache.org/jira/browse/HIVE-3337 Project: Hive Issue Type: Improvement Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Currently, Create Table A Like B does not copy any Table Parameters of B into A. We will add a HiveConf variable that will allow users to specify a list of parameters that they want copied over to A for this command.
[jira] [Commented] (HIVE-3337) Create Table Like should copy configured Table Parameters
[ https://issues.apache.org/jira/browse/HIVE-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13429450#comment-13429450 ] Bhushan Mandhani commented on HIVE-3337: Diff is at https://reviews.facebook.net/D4521 Create Table Like should copy configured Table Parameters - Key: HIVE-3337 URL: https://issues.apache.org/jira/browse/HIVE-3337 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Currently, Create Table A Like B does not copy any Table Parameters of B into A. We will add a HiveConf variable that will allow users to specify a list of parameters that they want copied over to A for this command.
[jira] [Updated] (HIVE-3215) JobDebugger should use RunningJob.getTrackingURL
[ https://issues.apache.org/jira/browse/HIVE-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-3215: --- Attachment: HIVE-3215.4.patch.txt JobDebugger should use RunningJob.getTrackingURL - Key: HIVE-3215 URL: https://issues.apache.org/jira/browse/HIVE-3215 Project: Hive Issue Type: Bug Components: CLI Reporter: Ramkumar Vadali Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3215.2.patch, HIVE-3215.3.patch.txt, HIVE-3215.4.patch.txt, HIVE-3215.patch When a MR job fails, the JobDebugger tries to construct the job tracker URL by connecting to the job tracker, but that is better done by using RunningJob#getTrackingURL. Also, it tries to construct URLs to the tasks, which is not reliable, because the job could have been retired and the URL would not work.
[jira] [Commented] (HIVE-3215) JobDebugger should use RunningJob.getTrackingURL
[ https://issues.apache.org/jira/browse/HIVE-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425447#comment-13425447 ] Bhushan Mandhani commented on HIVE-3215: Uploaded updated patch. JobDebugger should use RunningJob.getTrackingURL - Key: HIVE-3215 URL: https://issues.apache.org/jira/browse/HIVE-3215 Project: Hive Issue Type: Bug Components: CLI Reporter: Ramkumar Vadali Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3215.2.patch, HIVE-3215.3.patch.txt, HIVE-3215.4.patch.txt, HIVE-3215.patch When a MR job fails, the JobDebugger tries to construct the job tracker URL by connecting to the job tracker, but that is better done by using RunningJob#getTrackingURL. Also, it tries to construct URLs to the tasks, which is not reliable, because the job could have been retired and the URL would not work.
[jira] [Assigned] (HIVE-3215) JobDebugger should use RunningJob.getTrackingURL
[ https://issues.apache.org/jira/browse/HIVE-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani reassigned HIVE-3215: -- Assignee: Bhushan Mandhani (was: Ramkumar Vadali) JobDebugger should use RunningJob.getTrackingURL - Key: HIVE-3215 URL: https://issues.apache.org/jira/browse/HIVE-3215 Project: Hive Issue Type: Bug Components: CLI Reporter: Ramkumar Vadali Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3215.2.patch, HIVE-3215.patch When a MR job fails, the JobDebugger tries to construct the job tracker URL by connecting to the job tracker, but that is better done by using RunningJob#getTrackingURL. Also, it tries to construct URLs to the tasks, which is not reliable, because the job could have been retired and the URL would not work.
[jira] [Updated] (HIVE-3215) JobDebugger should use RunningJob.getTrackingURL
[ https://issues.apache.org/jira/browse/HIVE-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-3215: --- Attachment: HIVE-3215.3.patch.txt JobDebugger should use RunningJob.getTrackingURL - Key: HIVE-3215 URL: https://issues.apache.org/jira/browse/HIVE-3215 Project: Hive Issue Type: Bug Components: CLI Reporter: Ramkumar Vadali Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3215.2.patch, HIVE-3215.3.patch.txt, HIVE-3215.patch When a MR job fails, the JobDebugger tries to construct the job tracker URL by connecting to the job tracker, but that is better done by using RunningJob#getTrackingURL. Also, it tries to construct URLs to the tasks, which is not reliable, because the job could have been retired and the URL would not work.
[jira] [Updated] (HIVE-3215) JobDebugger should use RunningJob.getTrackingURL
[ https://issues.apache.org/jira/browse/HIVE-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-3215: --- Status: Patch Available (was: Open) JobDebugger should use RunningJob.getTrackingURL - Key: HIVE-3215 URL: https://issues.apache.org/jira/browse/HIVE-3215 Project: Hive Issue Type: Bug Components: CLI Reporter: Ramkumar Vadali Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3215.2.patch, HIVE-3215.3.patch.txt, HIVE-3215.patch When a MR job fails, the JobDebugger tries to construct the job tracker URL by connecting to the job tracker, but that is better done by using RunningJob#getTrackingURL. Also, it tries to construct URLs to the tasks, which is not reliable, because the job could have been retired and the URL would not work.
[jira] [Commented] (HIVE-3215) JobDebugger should use RunningJob.getTrackingURL
[ https://issues.apache.org/jira/browse/HIVE-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424101#comment-13424101 ] Bhushan Mandhani commented on HIVE-3215: Phabricator diff is at https://reviews.facebook.net/D4401 JobDebugger should use RunningJob.getTrackingURL - Key: HIVE-3215 URL: https://issues.apache.org/jira/browse/HIVE-3215 Project: Hive Issue Type: Bug Components: CLI Reporter: Ramkumar Vadali Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3215.2.patch, HIVE-3215.3.patch.txt, HIVE-3215.patch When a MR job fails, the JobDebugger tries to construct the job tracker URL by connecting to the job tracker, but that is better done by using RunningJob#getTrackingURL. Also, it tries to construct URLs to the tasks, which is not reliable, because the job could have been retired and the URL would not work.
[jira] [Updated] (HIVE-3124) Error in Removing ProtectMode from a Table
[ https://issues.apache.org/jira/browse/HIVE-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-3124: --- Attachment: HIVE-3124.1.patch.txt Error in Removing ProtectMode from a Table -- Key: HIVE-3124 URL: https://issues.apache.org/jira/browse/HIVE-3124 Project: Hive Issue Type: Bug Components: Metastore, Query Processor Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3124.1.patch.txt hive alter table table_name disable NO_DROP CASCADE; Failed with exception null FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
[jira] [Updated] (HIVE-3124) Error in Removing ProtectMode from a Table
[ https://issues.apache.org/jira/browse/HIVE-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-3124: --- Affects Version/s: 0.9.0 Status: Patch Available (was: Open) Error in Removing ProtectMode from a Table -- Key: HIVE-3124 URL: https://issues.apache.org/jira/browse/HIVE-3124 Project: Hive Issue Type: Bug Components: Metastore, Query Processor Affects Versions: 0.9.0 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3124.1.patch.txt hive alter table table_name disable NO_DROP CASCADE; Failed with exception null FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
[jira] [Commented] (HIVE-3124) Error in Removing ProtectMode from a Table
[ https://issues.apache.org/jira/browse/HIVE-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13396279#comment-13396279 ] Bhushan Mandhani commented on HIVE-3124: Carl, the uploaded version is the most recent version. Error in Removing ProtectMode from a Table -- Key: HIVE-3124 URL: https://issues.apache.org/jira/browse/HIVE-3124 Project: Hive Issue Type: Bug Components: Metastore, Query Processor Affects Versions: 0.9.0 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Attachments: HIVE-3124.1.patch.txt hive alter table table_name disable NO_DROP CASCADE; Failed with exception null FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
[jira] [Created] (HIVE-3144) Custom Deserializers Should be Used for Thrift Table Objects
Bhushan Mandhani created HIVE-3144: -- Summary: Custom Deserializers Should be Used for Thrift Table Objects Key: HIVE-3144 URL: https://issues.apache.org/jira/browse/HIVE-3144 Project: Hive Issue Type: Bug Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Some tables have custom deserializers. When constructing the column set for these, the deserializer class should be used instead of just reading from the MetaStore. This is already happening when constructing the Table object in package org/apache/hadoop/hive/ql/metadata. However, Thrift API is returning incorrect results for these. We want to move this logic out of this Table object and lower into the stack to fix this.
[jira] [Commented] (HIVE-3144) Custom Deserializers Should be Used for Thrift Table Objects
[ https://issues.apache.org/jira/browse/HIVE-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13295888#comment-13295888 ] Bhushan Mandhani commented on HIVE-3144: Yes it is. Custom Deserializers Should be Used for Thrift Table Objects Key: HIVE-3144 URL: https://issues.apache.org/jira/browse/HIVE-3144 Project: Hive Issue Type: Bug Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Some tables have custom deserializers. When constructing the column set for these, the deserializer class should be used instead of just reading from the MetaStore. This is already happening when constructing the Table object in package org/apache/hadoop/hive/ql/metadata. However, Thrift API is returning incorrect results for these. We want to move this logic out of this Table object and lower into the stack to fix this.
[jira] [Commented] (HIVE-3144) Custom Deserializers Should be Used for Thrift Table Objects
[ https://issues.apache.org/jira/browse/HIVE-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13295894#comment-13295894 ] Bhushan Mandhani commented on HIVE-3144: No they don't solve this problem. One of those two patches is anyway uncommitted. Custom Deserializers Should be Used for Thrift Table Objects Key: HIVE-3144 URL: https://issues.apache.org/jira/browse/HIVE-3144 Project: Hive Issue Type: Bug Components: Metastore Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Some tables have custom deserializers. When constructing the column set for these, the deserializer class should be used instead of just reading from the MetaStore. This is already happening when constructing the Table object in package org/apache/hadoop/hive/ql/metadata. However, Thrift API is returning incorrect results for these. We want to move this logic out of this Table object and lower into the stack to fix this.
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-2989: --- Attachment: HIVE-2989.10.patch.txt Added querying and addressed latest comments. Adding Table Links to Hive -- Key: HIVE-2989 URL: https://issues.apache.org/jira/browse/HIVE-2989 Project: Hive Issue Type: Improvement Components: Metastore, Query Processor, Security Affects Versions: 0.10.0 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Attachments: HIVE-2989.1.patch.txt, HIVE-2989.10.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt, HIVE-2989.9.patch.txt Original Estimate: 672h Remaining Estimate: 672h This will add Table Links to Hive. This will be an alternate mechanism for a user to access tables and data in a database that is different from the one he is associated with. This feature can be used to provide access control (if access to databasename.tablename in queries and use database X is turned off in conjunction). If db X wants to access one or more partitions from table T in db Y, the user will issue: CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N') New partitions added to T will automatically be added to the link as well and become available to X. However, if the link is specified to be static, that will not be the case. The X user will then have to explicitly import each partition of T that he needs. The command above will not actually make any existing partitions of T available to X. Instead, we provide the following command to add an existing partition to a link: ALTER LINK T@Y ADD PARTITION (ds='2012-04-27') The user will need to execute the above for each existing partition that needs to be imported. For future partitions, Hive will take care of this. An imported partition can be dropped from a link using a similar command. We just specify DROP instead of ADD. 
For querying the linked table, the X user will refer to it as T@Y. Link Tables will only have read access and not be writable. The entire Table Link along with all its imported partitions can be dropped as follows: DROP LINK TO T@Y The above commands are purely MetaStore operations. The implementation will rely on replicating the entire partition metadata when a partition is added to a link. For every link that is created, we will add a new row to table TBLS. The TBL_TYPE column will have a new kind of value LINK_TABLE (or STATIC_LINK_TABLE if the link has been specified as static). A new column LINK_TBL_ID will be added which will contain the id of the imported table. It will be NULL for all other table types including the regular managed tables. When a partition is added to a link, the new row in the table PARTITIONS will point to the LINK_TABLE in the same database and not the master table in the other database. We will replicate all the metadata for this partition from the master database. The advantage of this approach is that fewer changes will be needed in query processing and DDL for LINK_TABLEs. Also, commands like SHOW TABLES and SHOW PARTITIONS will work as expected for LINK_TABLEs too. Of course, even though the metadata is not shared, the underlying data on disk is still shared. Hive still needs to know that when dropping a partition which belongs to a LINK_TABLE, it should not drop the underlying data from HDFS. Views and external tables cannot be imported from one database to another.
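The rules that the description attaches to the new TBL_TYPE values can be summarised in a small sketch. This is only an illustration of the stated semantics, not code from any HIVE-2989 patch: the class and method names are hypothetical, only three table types are modelled, and treating STATIC_LINK_TABLE the same as LINK_TABLE for data deletion is an assumption (the description only mentions LINK_TABLE explicitly).

```java
// Hypothetical sketch of the table-link semantics described in HIVE-2989.
class TableLinkSketch {

    /** TBL_TYPE values relevant here; LINK_TABLE and STATIC_LINK_TABLE are the proposed additions. */
    enum TableType { MANAGED_TABLE, LINK_TABLE, STATIC_LINK_TABLE }

    /**
     * Whether dropping a partition of a table of this type should also remove
     * the underlying data. Link tables share data with the master table, so
     * only the replicated metadata goes away.
     * Assumption: static links behave like dynamic links in this respect.
     */
    static boolean deletesDataOnPartitionDrop(TableType type) {
        return type == TableType.MANAGED_TABLE;
    }

    /**
     * Whether a partition newly added to the master table T should be added
     * to the link automatically. Static links require an explicit
     * ALTER LINK ... ADD PARTITION for every partition.
     */
    static boolean importsNewPartitionsAutomatically(TableType type) {
        return type == TableType.LINK_TABLE;
    }

    /** Links are read-only from the importing database. */
    static boolean isWritable(TableType type) {
        return type == TableType.MANAGED_TABLE;
    }
}
```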
[jira] [Created] (HIVE-3124) Error in Removing ProtectMode from a Table
Bhushan Mandhani created HIVE-3124: -- Summary: Error in Removing ProtectMode from a Table Key: HIVE-3124 URL: https://issues.apache.org/jira/browse/HIVE-3124 Project: Hive Issue Type: Bug Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor hive> alter table table_name disable NO_DROP CASCADE; Failed with exception null FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
[jira] [Commented] (HIVE-3124) Error in Removing ProtectMode from a Table
[ https://issues.apache.org/jira/browse/HIVE-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294117#comment-13294117 ] Bhushan Mandhani commented on HIVE-3124: Diff out at https://reviews.facebook.net/D3615
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-2989: --- Attachment: HIVE-2989.9.patch.txt Updated per comments.
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-2989: --- Hadoop Flags: (was: Reviewed) Status: Patch Available (was: Open)
[jira] [Created] (HIVE-3113) Querying of Table Links
Bhushan Mandhani created HIVE-3113: -- Summary: Querying of Table Links Key: HIVE-3113 URL: https://issues.apache.org/jira/browse/HIVE-3113 Project: Hive Issue Type: New Feature Affects Versions: 0.10.0 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Implementation of querying of Table Links
[jira] [Created] (HIVE-3114) Split Thrift interface for Table Link Creation
Bhushan Mandhani created HIVE-3114: -- Summary: Split Thrift interface for Table Link Creation Key: HIVE-3114 URL: https://issues.apache.org/jira/browse/HIVE-3114 Project: Hive Issue Type: New Feature Affects Versions: 0.10.0 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Table Link creation through Thrift currently goes through the same method as Table creation. We want to move it out of there and into its own method.
[jira] [Created] (HIVE-3115) Table Links and Authorization
Bhushan Mandhani created HIVE-3115: -- Summary: Table Links and Authorization Key: HIVE-3115 URL: https://issues.apache.org/jira/browse/HIVE-3115 Project: Hive Issue Type: New Feature Affects Versions: 0.10.0 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Incorporate Table Links into the existing authorization framework in Hive. Add tests to check that no breach of security permissions is possible.
[jira] [Commented] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13289891#comment-13289891 ] Bhushan Mandhani commented on HIVE-2989: Carl, we discussed this among ourselves and strongly believe the user should be able to refer to the target table in his queries without having to think about "What do I call this table?" Briefly, we even considered the conventional X.Y access, but that had other issues and we wanted to disable that syntax completely in our system. I have made the changes you wanted for name validation in MetaStoreUtils, and there is minimal special-case logic there. You can look at it when I update the diff.
[jira] [Commented] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13289897#comment-13289897 ] Bhushan Mandhani commented on HIVE-2989: One criterion was that the user should not be able to create a Managed Table with a name that looks like the name of a Table Link. If user A creates such a table with the name X_at_Y, and user B comes along and tries to create a link to X@Y, the command will fail. Worse, user B could see that the table X_at_Y exists and query it, assuming it is a link to the table X he is trying to query. Also, the X@Y syntax is already used by Oracle for its Database Links, so we decided to use the same syntax. When you change the name of a target table, the names of the Links should be updated to reflect that. Sambavi will do that in the Alter Table patch.
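The naming argument in this comment can be made concrete with a short sketch. It assumes, purely for illustration, that ordinary table names are restricted to word characters while link names take the form table@database; the actual validation in MetaStoreUtils is not shown in this thread, and the function names below are hypothetical.

```python
import re

# Assumption: ordinary table names consist of word characters only.
_PLAIN_NAME = re.compile(r"\w+")
# Assumption: a link name is <table>@<database>, each part a plain name.
_LINK_NAME = re.compile(r"(\w+)@(\w+)")


def is_valid_table_name(name: str) -> bool:
    """True if `name` could be used for a regular (e.g. managed) table."""
    return _PLAIN_NAME.fullmatch(name) is not None


def parse_link_name(name: str):
    """Return (table, database) for a link name, or None if it is not one."""
    m = _LINK_NAME.fullmatch(name)
    return (m.group(1), m.group(2)) if m else None
```

Because "@" is not a word character, no name accepted by `is_valid_table_name` can also parse as a link name. A name such as X_at_Y, by contrast, is an ordinary table name, which is the collision the comment describes.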
[jira] [Commented] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13289056#comment-13289056 ] Bhushan Mandhani commented on HIVE-2989: It is not a bug. It is currently being implemented by Sambavi, who is handling the Alter Link and Alter Table commands. This particular patch contains Create, Desc and Drop. Sambavi is working on another patch that will do the two Alter commands. I will do querying in a follow-up patch of my own. There is a lot of functionality here that needs to be built out, and we are planning to do this in multiple patches. We would like to get this one in first because everything else will build on top of it. We have thought about all the questions you are bringing up and created the corresponding tasks. I'll update the design doc with this info. Sambavi and I will create the corresponding Jiras as well.
[jira] [Commented] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13289072#comment-13289072 ] Bhushan Mandhani commented on HIVE-2989: When a target table is dropped, all links pointing to it will automatically get dropped as well. That task is on my plate. I was planning to do that after querying. When someone comes along and creates a new target table, there is no link pointing to it. So there is no security vulnerability. I don't think we need to change struct TableIdentifier.
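The cascading drop described in this comment can be sketched as follows. This is a hypothetical illustration, not the planned patch: the dictionary layout and the function name `drop_table_cascade` are assumptions, and only the LINK_TBL_ID idea comes from the issue description.

```python
def drop_table_cascade(tbls: dict, target_id: int) -> list:
    """Drop a target table and every link whose LINK_TBL_ID points at it.

    `tbls` maps table id -> {"name": str, "link_tbl_id": int or None}
    (an assumed, simplified stand-in for the TBLS table).
    Returns the ids of the links that were dropped.
    """
    doomed = [tid for tid, row in tbls.items()
              if row.get("link_tbl_id") == target_id]
    for tid in doomed:
        del tbls[tid]
    tbls.pop(target_id, None)
    return sorted(doomed)


# Usage: table 1 is T in db Y; tables 2 and 3 are links to it.
tbls = {
    1: {"name": "T", "link_tbl_id": None},
    2: {"name": "T@Y", "link_tbl_id": 1},
    3: {"name": "T@Y", "link_tbl_id": 1},
    4: {"name": "U", "link_tbl_id": None},
}
dropped = drop_table_cascade(tbls, 1)

# A table later re-created under the same name gets a fresh id,
# so no surviving link can point at it.
tbls[5] = {"name": "T", "link_tbl_id": None}
```

Since links reference the target by id and are removed together with it, a new table with the same name starts with no links, which is the basis for the claim that there is no security vulnerability.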
[jira] [Commented] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13289074#comment-13289074 ] Bhushan Mandhani commented on HIVE-2989: Alter Table will be modified so that metadata changes made to the target table will propagate to the link table if appropriate. That is the expected behavior.
[jira] [Commented] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288340#comment-13288340 ]

Bhushan Mandhani commented on HIVE-2989:

@Carl It is in https://reviews.facebook.net/D3405. Thanks.

Adding Table Links to Hive
--
Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt
Original Estimate: 672h
Remaining Estimate: 672h
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:
---
Affects Version/s: 0.10.0 (was: 0.8.1)
Status: Patch Available (was: Open)

Adding Table Links to Hive
--
Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt
Original Estimate: 672h
Remaining Estimate: 672h
[jira] [Reopened] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani reopened HIVE-2989:

Reopening for further comments if any.

Adding Table Links to Hive
--
Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt
Original Estimate: 672h
Remaining Estimate: 672h
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:
---
Attachment: HIVE-2989.6.patch.txt

Adding Table Links to Hive
--
Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt
Original Estimate: 672h
Remaining Estimate: 672h
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:
---
Status: Patch Available (was: Reopened)

Adding Table Links to Hive
--
Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt
Original Estimate: 672h
Remaining Estimate: 672h
[jira] [Commented] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287662#comment-13287662 ]

Bhushan Mandhani commented on HIVE-2989:

Please review promptly.

Adding Table Links to Hive
--
Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt
Original Estimate: 672h
Remaining Estimate: 672h
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:
---
Attachment: HIVE-2989.3.patch.txt

Added DROP and updated per comments.

Adding Table Links to Hive
--
Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.8.1
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt
Original Estimate: 672h
Remaining Estimate: 672h
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:
---
Attachment: HIVE-2989.4.patch.txt

Updated per Namit's comments. Except for comments 6 and 7. 7 was already done and 6 was not entirely applicable.

Adding Table Links to Hive
--
Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.8.1
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt
Original Estimate: 672h
Remaining Estimate: 672h
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:
---
Attachment: HIVE-2989.5.patch.txt

Had forgotten to rerecord a couple of negative tests.

Adding Table Links to Hive
--
Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.8.1
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt
Original Estimate: 672h
Remaining Estimate: 672h
[jira] [Commented] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13284600#comment-13284600 ] Bhushan Mandhani commented on HIVE-2989: Carl, I've included the MetaStore upgrade scripts now. Regarding your first comment, let's discuss that on the design wiki page to keep the discussion in one place. Adding Table Links to Hive -- Key: HIVE-2989 URL: https://issues.apache.org/jira/browse/HIVE-2989 Project: Hive Issue Type: Improvement Components: Metastore, Query Processor, Security Affects Versions: 0.8.1 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Fix For: 0.9.0 Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt Original Estimate: 672h Remaining Estimate: 672h This will add Table Links to Hive. This will be an alternate mechanism for a user to access tables and data in a database that is different from the one he is associated with. This feature can be used to provide access control (if access to databasename.tablename in queries and use database X is turned off in conjunction). If db X wants to access one or more partitions from table T in db Y, the user will issue: CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N') New partitions added to T will automatically be added to the link as well and become available to X. However, if the link is specified to be static, that will not be the case. The X user will then have to explicitly import each partition of T that he needs. The command above will not actually make any existing partitions of T available to X. Instead, we provide the following command to add an existing partition to a link: ALTER LINK T@Y ADD PARTITION (ds='2012-04-27') The user will need to execute the above for each existing partition that needs to be imported. For future partitions, Hive will take care of this. An imported partition can be dropped from a link using a similar command. We just specify DROP instead of ADD. 
For querying the linked table, the X user will refer to it as T@Y. Link Tables will be read-only. The entire Table Link, along with all its imported partitions, can be dropped as follows: DROP LINK TO T@Y The above commands are purely MetaStore operations. The implementation will rely on replicating the entire partition metadata when a partition is added to a link. For every link that is created, we will add a new row to table TBLS. The TBL_TYPE column will have a new kind of value LINK_TABLE (or STATIC_LINK_TABLE if the link has been specified as static). A new column LINK_TBL_ID will be added, which will contain the id of the imported table. It will be NULL for all other table types, including regular managed tables. When a partition is added to a link, the new row in the table PARTITIONS will point to the LINK_TABLE in the same database and not the master table in the other database. We will replicate all the metadata for this partition from the master database. The advantage of this approach is that fewer changes will be needed in query processing and DDL for LINK_TABLEs. Also, commands like SHOW TABLES and SHOW PARTITIONS will work as expected for LINK_TABLEs too. Of course, even though the metadata is not shared, the underlying data on disk is still shared. Hive still needs to know that when dropping a partition which belongs to a LINK_TABLE, it should not drop the underlying data from HDFS. Views and external tables cannot be imported from one database to another.
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-2989: --- Attachment: HIVE-2989.1.patch.txt Adding Table Links to Hive -- Key: HIVE-2989 URL: https://issues.apache.org/jira/browse/HIVE-2989 Project: Hive Issue Type: Improvement Components: Metastore, Query Processor, Security Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Fix For: 0.10.0 Attachments: HIVE-2989.1.patch.txt Original Estimate: 672h Remaining Estimate: 672h This will add Table Links to Hive. This will be an alternate mechanism for a user to access tables and data in a database that is different from the one he is associated with. This feature can be used to provide access control (if access to databasename.tablename in queries and use database X is turned off in conjunction). If db X wants to access one or more partitions from table T in db Y, the user will issue: CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N') New partitions added to T will automatically be added to the link as well and become available to X. However, if the link is specified to be static, that will not be the case. The X user will then have to explicitly import each partition of T that he needs. The command above will not actually make any existing partitions of T available to X. Instead, we provide the following command to add an existing partition to a link: ALTER LINK T@Y ADD PARTITION (ds='2012-04-27') The user will need to execute the above for each existing partition that needs to be imported. For future partitions, Hive will take care of this. An imported partition can be dropped from a link using a similar command. We just specify DROP instead of ADD. For querying the linked table, the X user will refer to it as T@Y. Link Tables will only have read access and not be writable. 
The entire Table Link, along with all its imported partitions, can be dropped as follows: DROP LINK TO T@Y The above commands are purely MetaStore operations. The implementation will rely on replicating the entire partition metadata when a partition is added to a link. For every link that is created, we will add a new row to table TBLS. The TBL_TYPE column will have a new kind of value LINK_TABLE (or STATIC_LINK_TABLE if the link has been specified as static). A new column LINK_TBL_ID will be added, which will contain the id of the imported table. It will be NULL for all other table types, including regular managed tables. When a partition is added to a link, the new row in the table PARTITIONS will point to the LINK_TABLE in the same database and not the master table in the other database. We will replicate all the metadata for this partition from the master database. The advantage of this approach is that fewer changes will be needed in query processing and DDL for LINK_TABLEs. Also, commands like SHOW TABLES and SHOW PARTITIONS will work as expected for LINK_TABLEs too. Of course, even though the metadata is not shared, the underlying data on disk is still shared. Hive still needs to know that when dropping a partition which belongs to a LINK_TABLE, it should not drop the underlying data from HDFS. Views and external tables cannot be imported from one database to another.
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-2989: --- Fix Version/s: (was: 0.10.0) 0.9.0 Affects Version/s: 0.8.1 Status: Patch Available (was: Open) Adding Table Links to Hive -- Key: HIVE-2989 URL: https://issues.apache.org/jira/browse/HIVE-2989 Project: Hive Issue Type: Improvement Components: Metastore, Query Processor, Security Affects Versions: 0.8.1 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Fix For: 0.9.0 Attachments: HIVE-2989.1.patch.txt Original Estimate: 672h Remaining Estimate: 672h This will add Table Links to Hive. This will be an alternate mechanism for a user to access tables and data in a database that is different from the one he is associated with. This feature can be used to provide access control (if access to databasename.tablename in queries and use database X is turned off in conjunction). If db X wants to access one or more partitions from table T in db Y, the user will issue: CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N') New partitions added to T will automatically be added to the link as well and become available to X. However, if the link is specified to be static, that will not be the case. The X user will then have to explicitly import each partition of T that he needs. The command above will not actually make any existing partitions of T available to X. Instead, we provide the following command to add an existing partition to a link: ALTER LINK T@Y ADD PARTITION (ds='2012-04-27') The user will need to execute the above for each existing partition that needs to be imported. For future partitions, Hive will take care of this. An imported partition can be dropped from a link using a similar command. We just specify DROP instead of ADD. For querying the linked table, the X user will refer to it as T@Y. Link Tables will only have read access and not be writable. 
The entire Table Link, along with all its imported partitions, can be dropped as follows: DROP LINK TO T@Y The above commands are purely MetaStore operations. The implementation will rely on replicating the entire partition metadata when a partition is added to a link. For every link that is created, we will add a new row to table TBLS. The TBL_TYPE column will have a new kind of value LINK_TABLE (or STATIC_LINK_TABLE if the link has been specified as static). A new column LINK_TBL_ID will be added, which will contain the id of the imported table. It will be NULL for all other table types, including regular managed tables. When a partition is added to a link, the new row in the table PARTITIONS will point to the LINK_TABLE in the same database and not the master table in the other database. We will replicate all the metadata for this partition from the master database. The advantage of this approach is that fewer changes will be needed in query processing and DDL for LINK_TABLEs. Also, commands like SHOW TABLES and SHOW PARTITIONS will work as expected for LINK_TABLEs too. Of course, even though the metadata is not shared, the underlying data on disk is still shared. Hive still needs to know that when dropping a partition which belongs to a LINK_TABLE, it should not drop the underlying data from HDFS. Views and external tables cannot be imported from one database to another.
[jira] [Created] (HIVE-3040) Hive Exec jar should not include Thrift
Bhushan Mandhani created HIVE-3040: -- Summary: Hive Exec jar should not include Thrift Key: HIVE-3040 URL: https://issues.apache.org/jira/browse/HIVE-3040 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.8.1 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Fix For: 0.8.1 The hive-exec jar includes Thrift classes even though it does not need them. This can create problems because the copy bundled in hive-exec may be the wrong version of Thrift, and once it is loaded, other jars that need Thrift get stuck with that wrong version. We will remove Thrift from this jar.
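One way to confirm whether a given jar bundles Thrift classes is to list its entries, since a jar is a zip archive. The sketch below is a hedged illustration, not part of the HIVE-3040 patch: the helper name thrift_entries is hypothetical, and it assumes the Thrift classes live under the org/apache/thrift/ package path.

```python
import zipfile
from typing import List

# Assumption: bundled Thrift classes sit under this package path inside the jar.
THRIFT_PREFIX = "org/apache/thrift/"


def thrift_entries(jar) -> List[str]:
    """Return the Thrift .class entries found in a jar (path or file-like object)."""
    with zipfile.ZipFile(jar) as archive:
        return [name for name in archive.namelist()
                if name.startswith(THRIFT_PREFIX) and name.endswith(".class")]
```

An empty result for the hive-exec jar would indicate that Thrift is no longer bundled; a non-empty result lists the classes that could shadow another version on the classpath.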
[jira] [Updated] (HIVE-3001) Returning Meaningful Error Codes & Messages
[ https://issues.apache.org/jira/browse/HIVE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-3001: --- Attachment: HIVE-3001.1.patch.txt Reviewed in https://reviews.facebook.net/D3153 Returning Meaningful Error Codes & Messages --- Key: HIVE-3001 URL: https://issues.apache.org/jira/browse/HIVE-3001 Project: Hive Issue Type: New Feature Components: Diagnosability Affects Versions: 0.9.1 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Fix For: 0.9.1 Attachments: HIVE-3001.1.patch.txt Original Estimate: 48h Remaining Estimate: 48h Hive does not return meaningful error messages for runtime errors. Also, the same error code is returned for a whole bunch of unrelated errors, so a programmatic caller cannot decide whether it should retry or give up. This JIRA will get the ball rolling for having Hive return useful error codes and display useful messages when something goes wrong. I propose the following partitioning of error codes: 1 to 1: Errors that occur during semantic analysis and compilation of the query. Hive already does a pretty good job for these. Error codes will be attached to the error messages currently being used. 2 to 2: Runtime errors where Hive believes that retries will not succeed and the caller should not bother retrying. 3 to 3: Runtime errors which Hive thinks are probably transient and retrying may succeed. 4 to 4: Runtime errors where Hive is unable to say anything about whether retries will succeed or not. Ideally, we want to avoid using this range as much as possible. Once we have this in place, over time we can migrate errors occurring in Hive operators to use this scheme. This patch will deal with setting up the error code space, setting up the mechanism for failed MapReduce tasks to relay the error code back to the Hive client, and using this new scheme for a couple of common errors.
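The partitioning proposed in the HIVE-3001 description lets a programmatic caller turn an error code into a retry decision. The sketch below shows one way a client might do that. The range boundaries in the text above appear truncated, so the width used here (blocks of 10000 codes, starting at 10000) is an assumption, and ErrorClass, classify and should_retry are hypothetical names rather than anything in Hive.

```python
from enum import Enum


class ErrorClass(Enum):
    COMPILE = "semantic analysis or compilation error"
    PERMANENT = "runtime error, retries will not succeed"
    TRANSIENT = "runtime error, probably transient"
    UNKNOWN = "runtime error, no advice about retries"


# ASSUMPTION: each class owns a block of 10000 codes (10000-19999, 20000-29999, ...).
RANGE_WIDTH = 10000

_CLASS_BY_BLOCK = {
    1: ErrorClass.COMPILE,
    2: ErrorClass.PERMANENT,
    3: ErrorClass.TRANSIENT,
    4: ErrorClass.UNKNOWN,
}


def classify(code: int) -> ErrorClass:
    """Map an error code to its class based on the block it falls in."""
    block = code // RANGE_WIDTH
    if block not in _CLASS_BY_BLOCK:
        raise ValueError(f"error code {code} is outside the assumed code space")
    return _CLASS_BY_BLOCK[block]


def should_retry(code: int, retry_unknown: bool = False) -> bool:
    """Retry transient errors; the caller chooses the policy for unknown ones."""
    error_class = classify(code)
    if error_class is ErrorClass.TRANSIENT:
        return True
    return retry_unknown and error_class is ErrorClass.UNKNOWN
```

Leaving the policy for the "unable to say" range to the caller (retry_unknown) is a design choice of this sketch; the proposal itself only says that range should be used as little as possible.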
[jira] [Updated] (HIVE-3001) Returning Meaningful Error Codes & Messages
[ https://issues.apache.org/jira/browse/HIVE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhushan Mandhani updated HIVE-3001: --- Fix Version/s: (was: 0.9.1) 0.9.0 Labels: diagnostics (was: ) Affects Version/s: (was: 0.9.1) 0.8.1 Status: Patch Available (was: Open) Returning Meaningful Error Codes & Messages --- Key: HIVE-3001 URL: https://issues.apache.org/jira/browse/HIVE-3001 Project: Hive Issue Type: New Feature Components: Diagnosability Affects Versions: 0.8.1 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Labels: diagnostics Fix For: 0.9.0 Attachments: HIVE-3001.1.patch.txt Original Estimate: 48h Remaining Estimate: 48h Hive does not return meaningful error messages for runtime errors. Also, the same error code is returned for a whole bunch of unrelated errors, so a programmatic caller cannot decide whether it should retry or give up. This JIRA will get the ball rolling for having Hive return useful error codes and display useful messages when something goes wrong. I propose the following partitioning of error codes: 1 to 1: Errors that occur during semantic analysis and compilation of the query. Hive already does a pretty good job for these. Error codes will be attached to the error messages currently being used. 2 to 2: Runtime errors where Hive believes that retries will not succeed and the caller should not bother retrying. 3 to 3: Runtime errors which Hive thinks are probably transient and retrying may succeed. 4 to 4: Runtime errors where Hive is unable to say anything about whether retries will succeed or not. Ideally, we want to avoid using this range as much as possible. Once we have this in place, over time we can migrate errors occurring in Hive operators to use this scheme. This patch will deal with setting up the error code space, setting up the mechanism for failed MapReduce tasks to relay the error code back to the Hive client, and using this new scheme for a couple of common errors.
[jira] [Created] (HIVE-3001) Returning Meaningful Error Codes & Messages
Bhushan Mandhani created HIVE-3001: -- Summary: Returning Meaningful Error Codes & Messages Key: HIVE-3001 URL: https://issues.apache.org/jira/browse/HIVE-3001 Project: Hive Issue Type: New Feature Components: Diagnosability Affects Versions: 0.9.1 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor Fix For: 0.9.1 Hive does not return meaningful error messages for runtime errors. Also, the same error code is returned for a whole bunch of unrelated errors, so a programmatic caller cannot decide whether it should retry or give up. This JIRA will get the ball rolling for having Hive return useful error codes and display useful messages when something goes wrong. I propose the following partitioning of error codes: 1 to 1: Errors that occur during semantic analysis and compilation of the query. Hive already does a pretty good job for these. Error codes will be attached to the error messages currently being used. 2 to 2: Runtime errors where Hive believes that retries will not succeed and the caller should not bother retrying. 3 to 3: Runtime errors which Hive thinks are probably transient and retrying may succeed. 4 to 4: Runtime errors where Hive is unable to say anything about whether retries will succeed or not. Ideally, we want to avoid using this range as much as possible. Once we have this in place, over time we can migrate errors occurring in Hive operators to use this scheme. This patch will deal with setting up the error code space, setting up the mechanism for failed MapReduce tasks to relay the error code back to the Hive client, and using this new scheme for a couple of common errors.
[jira] [Created] (HIVE-2989) Adding Table Links to Hive
Bhushan Mandhani created HIVE-2989: -- Summary: Adding Table Links to Hive Key: HIVE-2989 URL: https://issues.apache.org/jira/browse/HIVE-2989 Project: Hive Issue Type: Improvement Components: Metastore, Query Processor, Security Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Fix For: 0.10.0 This will add Table Links to Hive. This will be an alternate mechanism for a user to access tables and data in a database that is different from the one he is associated with. This feature can be used to provide access control (if access to databasename.tablename in queries and use database X is turned off in conjunction). If db X wants to access one or more partitions from table T in db Y, the user will issue: CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N') New partitions added to T will automatically be added to the link as well and become available to X. However, if the link is specified to be static, that will not be the case. The X user will then have to explicitly import each partition of T that he needs. The command above will not actually make any existing partitions of T available to X. Instead, we provide the following command to add an existing partition to a link: ALTER LINK T@Y ADD PARTITION (ds='2012-04-27') The user will need to execute the above for each existing partition that needs to be imported. For future partitions, Hive will take care of this. An imported partition can be dropped from a link using a similar command. We just specify DROP instead of ADD. For querying the linked table, the X user will refer to it as T@Y. Link Tables will be read-only. The entire Table Link, along with all its imported partitions, can be dropped as follows: DROP LINK TO T@Y The above commands are purely MetaStore operations. The implementation will rely on replicating the entire partition metadata when a partition is added to a link. For every link that is created, we will add a new row to table TBLS.
The TBL_TYPE column will have a new kind of value LINK_TABLE (or STATIC_LINK_TABLE if the link has been specified as static). A new column LINK_TBL_ID will be added, which will contain the id of the imported table. It will be NULL for all other table types, including regular managed tables. When a partition is added to a link, the new row in the table PARTITIONS will point to the LINK_TABLE in the same database and not the master table in the other database. We will replicate all the metadata for this partition from the master database. The advantage of this approach is that fewer changes will be needed in query processing and DDL for LINK_TABLEs. Also, commands like SHOW TABLES and SHOW PARTITIONS will work as expected for LINK_TABLEs too. Of course, even though the metadata is not shared, the underlying data on disk is still shared. Hive still needs to know that when dropping a partition which belongs to a LINK_TABLE, it should not drop the underlying data from HDFS. Views and external tables cannot be imported from one database to another.
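The difference between a regular and a STATIC link, as described in this issue, comes down to what happens when the master table gains a partition. The toy model below illustrates that rule only; MasterTable and Link are hypothetical names, partitions are plain strings, and nothing here reflects the actual patch.

```python
from typing import List


class Link:
    """Hypothetical model of a table link held by the importing database."""

    def __init__(self, static: bool = False) -> None:
        self.static = static
        # Starts empty: creating a link does not import existing partitions.
        self.partitions: List[str] = []

    def add_partition(self, master: "MasterTable", spec: str) -> None:
        """Mimics ALTER LINK ... ADD PARTITION for a partition of the master."""
        if spec not in master.partitions:
            raise ValueError(f"{spec} is not a partition of the master table")
        if spec not in self.partitions:
            self.partitions.append(spec)

    def drop_partition(self, spec: str) -> None:
        """Mimics ALTER LINK ... DROP PARTITION; the master is untouched."""
        self.partitions.remove(spec)


class MasterTable:
    """Hypothetical model of table T in database Y."""

    def __init__(self) -> None:
        self.partitions: List[str] = []
        self.links: List[Link] = []

    def create_link(self, static: bool = False) -> Link:
        link = Link(static)
        self.links.append(link)
        return link

    def add_partition(self, spec: str) -> None:
        self.partitions.append(spec)
        # New partitions flow to every link unless the link is static.
        for link in self.links:
            if not link.static:
                link.partitions.append(spec)
```

For example, after creating one regular and one static link on a table that already has ds=2012-04-27, both links start empty; adding ds=2012-04-28 to the master makes it visible through the regular link only, and the static link sees it only after an explicit add_partition call.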