[jira] Created: (HIVE-1679) MetaStore does not detect and rollback failed transactions
MetaStore does not detect and rollback failed transactions
----------------------------------------------------------

                 Key: HIVE-1679
                 URL: https://issues.apache.org/jira/browse/HIVE-1679
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.5.0, 0.6.0, 0.7.0
            Reporter: Carl Steinbach


Most of the methods in HiveMetaStore and ObjectStore adhere to the following idiom when interacting with the ObjectStore:

{code}
boolean success = false;
try {
  ms.openTransaction();
  /* do some stuff */
  success = ms.commitTransaction();
} finally {
  if (!success) {
    ms.rollbackTransaction();
  }
}
{code}

The problem with this is that ObjectStore.commitTransaction() always returns TRUE:

{code}
public boolean commitTransaction() {
  assert (openTrasactionCalls >= 1);
  if (!currentTransaction.isActive()) {
    throw new RuntimeException("Commit is called, but transaction is not active. Either there are"
        + " mismatching open and close calls or rollback was called in the same trasaction");
  }
  openTrasactionCalls--;
  if ((openTrasactionCalls == 0) && currentTransaction.isActive()) {
    transactionStatus = TXN_STATUS.COMMITED;
    currentTransaction.commit();
  }
  return true;
}
{code}

Consequently, the transaction appears to always succeed and ObjectStore is never directed to rollback transactions that have actually failed.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
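A minimal sketch (not Hive's actual patch) of a commitTransaction() that reports real success instead of unconditionally returning true. The TXN_STATUS values and the openTrasactionCalls spelling mirror the snippet above; the JDO currentTransaction object is reduced to a plain boolean stub:

```java
public class CommitSketch {
    enum TXN_STATUS { OPEN, COMMITED, ROLLBACK }

    static TXN_STATUS transactionStatus = TXN_STATUS.OPEN;
    static int openTrasactionCalls = 1;  // spelling kept from the original field
    static boolean txnActive = true;     // stand-in for currentTransaction.isActive()

    static boolean commitTransaction() {
        // Sketched guard: a transaction that was already rolled back cannot
        // be committed, so report failure instead of pretending success.
        if (transactionStatus == TXN_STATUS.ROLLBACK) {
            return false;
        }
        if (!txnActive) {
            throw new RuntimeException("Commit is called, but transaction is not active.");
        }
        openTrasactionCalls--;
        if (openTrasactionCalls == 0 && txnActive) {
            transactionStatus = TXN_STATUS.COMMITED;
            txnActive = false;           // stand-in for currentTransaction.commit()
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(commitTransaction());   // prints true: normal commit
        transactionStatus = TXN_STATUS.ROLLBACK;
        System.out.println(commitTransaction());   // prints false: rolled back earlier
    }
}
```

With a return value that can actually be false, the `if (!success) ms.rollbackTransaction();` idiom in the callers starts doing useful work.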
[jira] Created: (HIVE-1681) ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
--------------------------------------------------------------------------------------------------------

                 Key: HIVE-1681
                 URL: https://issues.apache.org/jira/browse/HIVE-1681
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.5.0, 0.6.0, 0.7.0
            Reporter: Carl Steinbach
            Assignee: Carl Steinbach


Here's the code for ObjectStore.commitTransaction() and ObjectStore.rollbackTransaction():

{code}
public boolean commitTransaction() {
  assert (openTrasactionCalls >= 1);
  if (!currentTransaction.isActive()) {
    throw new RuntimeException("Commit is called, but transaction is not active. Either there are"
        + " mismatching open and close calls or rollback was called in the same trasaction");
  }
  openTrasactionCalls--;
  if ((openTrasactionCalls == 0) && currentTransaction.isActive()) {
    transactionStatus = TXN_STATUS.COMMITED;
    currentTransaction.commit();
  }
  return true;
}

public void rollbackTransaction() {
  if (openTrasactionCalls < 1) {
    return;
  }
  openTrasactionCalls = 0;
  if (currentTransaction.isActive() && transactionStatus != TXN_STATUS.ROLLBACK) {
    transactionStatus = TXN_STATUS.ROLLBACK;
    // could already be rolled back
    currentTransaction.rollback();
  }
}
{code}

Now suppose a nested transaction throws an exception which results in the nested pseudo-transaction calling rollbackTransaction(). This causes rollbackTransaction() to rollback the actual transaction, as well as to set openTransactionCalls=0 and transactionStatus = TXN_STATUS.ROLLBACK. Suppose also that this nested transaction squelches the original exception. In this case the stack will unwind and the caller will eventually try to commit the transaction by calling commitTransaction(), which will see that currentTransaction.isActive() returns FALSE and will throw a RuntimeException.

The fix for this problem is that commitTransaction() needs to first check transactionStatus and return immediately if transactionStatus == TXN_STATUS.ROLLBACK.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
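The failure mode described above can be sketched end to end. A nested pseudo-transaction rolls back, and with the proposed transactionStatus check in place the outer commitTransaction() returns false instead of throwing a RuntimeException. Names follow the quoted snippet; the JDO Transaction object is reduced to a boolean, so this is an illustration, not the committed patch:

```java
public class NestedTxnSketch {
    enum TXN_STATUS { NO_STATE, OPEN, COMMITED, ROLLBACK }

    static TXN_STATUS transactionStatus = TXN_STATUS.NO_STATE;
    static int openTrasactionCalls = 0;
    static boolean txnActive = false;

    static boolean openTransaction() {
        openTrasactionCalls++;
        if (openTrasactionCalls == 1) {
            txnActive = true;                 // currentTransaction.begin()
            transactionStatus = TXN_STATUS.OPEN;
        }
        return txnActive;
    }

    static void rollbackTransaction() {
        if (openTrasactionCalls < 1) {
            return;
        }
        openTrasactionCalls = 0;
        if (txnActive && transactionStatus != TXN_STATUS.ROLLBACK) {
            transactionStatus = TXN_STATUS.ROLLBACK;
            txnActive = false;                // currentTransaction.rollback()
        }
    }

    static boolean commitTransaction() {
        if (transactionStatus == TXN_STATUS.ROLLBACK) {
            return false;                     // the proposed early return
        }
        if (!txnActive) {
            throw new RuntimeException("Commit is called, but transaction is not active.");
        }
        openTrasactionCalls--;
        if (openTrasactionCalls == 0) {
            transactionStatus = TXN_STATUS.COMMITED;
            txnActive = false;                // currentTransaction.commit()
        }
        return true;
    }

    public static void main(String[] args) {
        openTransaction();        // outer caller
        openTransaction();        // nested pseudo-transaction
        rollbackTransaction();    // nested work fails and rolls everything back
        System.out.println(commitTransaction());  // prints false, no exception
    }
}
```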
[jira] Updated: (HIVE-1681) ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
[ https://issues.apache.org/jira/browse/HIVE-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1681:
---------------------------------

    Attachment: HIVE-1681.1.patch.txt

ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
--------------------------------------------------------------------------------------------------------

                 Key: HIVE-1681
                 URL: https://issues.apache.org/jira/browse/HIVE-1681
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.5.0, 0.6.0, 0.7.0
            Reporter: Carl Steinbach
            Assignee: Carl Steinbach
         Attachments: HIVE-1681.1.patch.txt

Here's the code for ObjectStore.commitTransaction() and ObjectStore.rollbackTransaction():

{code}
public boolean commitTransaction() {
  assert (openTrasactionCalls >= 1);
  if (!currentTransaction.isActive()) {
    throw new RuntimeException("Commit is called, but transaction is not active. Either there are"
        + " mismatching open and close calls or rollback was called in the same trasaction");
  }
  openTrasactionCalls--;
  if ((openTrasactionCalls == 0) && currentTransaction.isActive()) {
    transactionStatus = TXN_STATUS.COMMITED;
    currentTransaction.commit();
  }
  return true;
}

public void rollbackTransaction() {
  if (openTrasactionCalls < 1) {
    return;
  }
  openTrasactionCalls = 0;
  if (currentTransaction.isActive() && transactionStatus != TXN_STATUS.ROLLBACK) {
    transactionStatus = TXN_STATUS.ROLLBACK;
    // could already be rolled back
    currentTransaction.rollback();
  }
}
{code}

Now suppose a nested transaction throws an exception which results in the nested pseudo-transaction calling rollbackTransaction(). This causes rollbackTransaction() to rollback the actual transaction, as well as to set openTransactionCalls=0 and transactionStatus = TXN_STATUS.ROLLBACK. Suppose also that this nested transaction squelches the original exception. In this case the stack will unwind and the caller will eventually try to commit the transaction by calling commitTransaction(), which will see that currentTransaction.isActive() returns FALSE and will throw a RuntimeException.

The fix for this problem is that commitTransaction() needs to first check transactionStatus and return immediately if transactionStatus == TXN_STATUS.ROLLBACK.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach resolved HIVE-1157.
----------------------------------

    Resolution: Duplicate

UDFs can't be loaded via add jar when jar is on HDFS
----------------------------------------------------

                 Key: HIVE-1157
                 URL: https://issues.apache.org/jira/browse/HIVE-1157
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Philip Zeyliger
            Priority: Minor
         Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, HIVE-1157.patch.v4.txt, HIVE-1157.patch.v5.txt, HIVE-1157.patch.v6.txt, HIVE-1157.v2.patch.txt, output.txt

As discussed on the mailing list, it would be nice if you could use UDFs that are on jars on HDFS. The proposed implementation would be for add jar to recognize that the target file is on HDFS, copy it locally, and load it into the classpath.

{quote}
Hi folks,

I have a quick question about UDF support in Hive. I'm on the 0.5 branch. Can you use a UDF where the jar which contains the function is on HDFS, and not on the local filesystem? Specifically, the following does not seem to work:

# This is Hive 0.5, from svn
$ bin/hive
Hive history file=/tmp/philip/hive_job_log_philip_201002081541_370227273.txt
hive> add jar hdfs://localhost/FooTest.jar;
Added hdfs://localhost/FooTest.jar to class path
hive> create temporary function cube as 'com.cloudera.FooTestUDF';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

Does this work for other people? I could probably fix it by changing add jar to download remote jars locally, when necessary (to load them into the classpath), or update URLClassLoader (or whatever is underneath there) to read directly from HDFS, which seems a bit more fragile. But I wanted to make sure that my interpretation of what's going on is right before I have at it.

Thanks,

-- Philip
{quote}

{quote}
Yes that's correct. I prefer to download the jars in add jar.

Zheng
{quote}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
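The approach agreed on above can be sketched in a few lines: when "add jar" is handed a remote URI such as hdfs://..., stage the jar into a local temp file before putting it on the classpath. The actual HDFS copy (Hadoop's FileSystem.copyToLocalFile) is elided to a comment here, and the localizeJar() name is illustrative, not Hive's API:

```java
import java.io.File;
import java.net.URI;

public class AddJarSketch {
    static File localizeJar(String jarPath) throws Exception {
        URI uri = URI.create(jarPath);
        String scheme = uri.getScheme();
        if (scheme == null || scheme.equals("file")) {
            return new File(jarPath);   // already local, use as-is
        }
        // Remote jar: stage it into a local temp file, then hand that file
        // to the URLClassLoader behind "add jar".
        File local = File.createTempFile("hive-addjar-", ".jar");
        // fs.copyToLocalFile(new Path(jarPath), new Path(local.getPath()));
        return local;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(localizeJar("/tmp/FooTest.jar").getPath());
        System.out.println(localizeJar("hdfs://localhost/FooTest.jar").getName());
    }
}
```

Downloading, as opposed to teaching URLClassLoader to read HDFS directly, keeps the classloading path untouched, which is the trade-off Philip and Zheng settle on in the thread.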
[jira] Updated: (HIVE-1427) Provide metastore schema migration scripts (0.5 - 0.6)
[ https://issues.apache.org/jira/browse/HIVE-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1427:
---------------------------------

    Attachment: HIVE-1427.1.patch.txt

HIVE-1427.1.patch.txt:
* Upgrade scripts for derby and mysql.
* Includes all schema changes between 0.5.0 and branch-0.6, along with proposed changes in HIVE-1364.

I'm in the process of running upgrade tests on Derby and MySQL.

Provide metastore schema migration scripts (0.5 - 0.6)
------------------------------------------------------

                 Key: HIVE-1427
                 URL: https://issues.apache.org/jira/browse/HIVE-1427
             Project: Hadoop Hive
          Issue Type: Task
          Components: Metastore
            Reporter: Carl Steinbach
            Assignee: Carl Steinbach
             Fix For: 0.6.0
         Attachments: HIVE-1427.1.patch.txt

At a minimum this ticket covers packaging up example MySQL migration scripts (cumulative across all schema changes from 0.5 to 0.6) and explaining what to do with them in the release notes. This is also probably a good point at which to decide and clearly state which Metastore DBs we officially support in production, e.g. do we need to provide migration scripts for Derby?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1427) Provide metastore schema migration scripts (0.5 - 0.6)
[ https://issues.apache.org/jira/browse/HIVE-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1427: - Status: Patch Available (was: Open) Provide metastore schema migration scripts (0.5 - 0.6) --- Key: HIVE-1427 URL: https://issues.apache.org/jira/browse/HIVE-1427 Project: Hadoop Hive Issue Type: Task Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.6.0 Attachments: HIVE-1427.1.patch.txt At a minimum this ticket covers packaging up example MySQL migration scripts (cumulative across all schema changes from 0.5 to 0.6) and explaining what to do with them in the release notes. This is also probably a good point at which to decide and clearly state which Metastore DBs we officially support in production, e.g. do we need to provide migration scripts for Derby? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HIVE-1676) show table extended like does not work well with wildcards
[ https://issues.apache.org/jira/browse/HIVE-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach resolved HIVE-1676.
----------------------------------

    Resolution: Duplicate

show table extended like does not work well with wildcards
----------------------------------------------------------

                 Key: HIVE-1676
                 URL: https://issues.apache.org/jira/browse/HIVE-1676
             Project: Hadoop Hive
          Issue Type: Bug
            Reporter: Pradeep Kamath
            Priority: Minor

As evident from the output below, though there are tables that match the wildcard, the output from show table extended like does not contain the matches.

{noformat}
bin/hive -e "show tables 'foo*'"
Hive history file=/tmp/pradeepk/hive_job_log_pradeepk_201009301037_568707409.txt
OK
foo
foo2
Time taken: 3.417 seconds

bin/hive -e "show table extended like 'foo*'"
Hive history file=/tmp/pradeepk/hive_job_log_pradeepk_201009301037_410056681.txt
OK
Time taken: 2.948 seconds
{noformat}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1427) Provide metastore schema migration scripts (0.5 - 0.6)
[ https://issues.apache.org/jira/browse/HIVE-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916750#action_12916750 ]

Carl Steinbach commented on HIVE-1427:
--------------------------------------

@Ning: Will do. Can someone please review and commit HIVE-1364, since this ticket depends on it?

Provide metastore schema migration scripts (0.5 - 0.6)
------------------------------------------------------

                 Key: HIVE-1427
                 URL: https://issues.apache.org/jira/browse/HIVE-1427
             Project: Hadoop Hive
          Issue Type: Task
          Components: Metastore
            Reporter: Carl Steinbach
            Assignee: Carl Steinbach
             Fix For: 0.6.0

At a minimum this ticket covers packaging up example MySQL migration scripts (cumulative across all schema changes from 0.5 to 0.6) and explaining what to do with them in the release notes. This is also probably a good point at which to decide and clearly state which Metastore DBs we officially support in production, e.g. do we need to provide migration scripts for Derby?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1526: - Fix Version/s: (was: 0.6.0) Hive should depend on a release version of Thrift - Key: HIVE-1526 URL: https://issues.apache.org/jira/browse/HIVE-1526 Project: Hadoop Hive Issue Type: Task Components: Build Infrastructure, Clients Reporter: Carl Steinbach Assignee: Todd Lipcon Fix For: 0.7.0 Attachments: HIVE-1526.2.patch.txt, hive-1526.txt, libfb303.jar, libthrift.jar Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916751#action_12916751 ]

Carl Steinbach commented on HIVE-1526:
--------------------------------------

@Ning: I removed the 0.6 tag. Can you please review this change? Thanks.

Hive should depend on a release version of Thrift
-------------------------------------------------

                 Key: HIVE-1526
                 URL: https://issues.apache.org/jira/browse/HIVE-1526
             Project: Hadoop Hive
          Issue Type: Task
          Components: Build Infrastructure, Clients
            Reporter: Carl Steinbach
            Assignee: Todd Lipcon
             Fix For: 0.7.0
         Attachments: HIVE-1526.2.patch.txt, hive-1526.txt, libfb303.jar, libthrift.jar

Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Reopened: (HIVE-1524) parallel execution failed if mapred.job.name is set
[ https://issues.apache.org/jira/browse/HIVE-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reopened HIVE-1524: -- Reopening for backport to 0.6 parallel execution failed if mapred.job.name is set --- Key: HIVE-1524 URL: https://issues.apache.org/jira/browse/HIVE-1524 Project: Hadoop Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.5.0 Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1524-for-Hive-0.6.patch, HIVE-1524.2.patch, HIVE-1524.patch The plan file name was generated based on mapred.job.name. If the user specify mapred.job.name before the query, two parallel queries will have conflict plan file name. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1524) parallel execution failed if mapred.job.name is set
[ https://issues.apache.org/jira/browse/HIVE-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1524: - Fix Version/s: 0.6.0 Component/s: Query Processor parallel execution failed if mapred.job.name is set --- Key: HIVE-1524 URL: https://issues.apache.org/jira/browse/HIVE-1524 Project: Hadoop Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.5.0 Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1524-for-Hive-0.6.patch, HIVE-1524.2.patch, HIVE-1524.patch The plan file name was generated based on mapred.job.name. If the user specify mapred.job.name before the query, two parallel queries will have conflict plan file name. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915961#action_12915961 ]

Carl Steinbach commented on HIVE-675:
-------------------------------------

Hi Ning, I'm looking into it.

add database/schema support Hive QL
-----------------------------------

                 Key: HIVE-675
                 URL: https://issues.apache.org/jira/browse/HIVE-675
             Project: Hadoop Hive
          Issue Type: New Feature
          Components: Metastore, Query Processor
            Reporter: Prasad Chakka
            Assignee: Carl Steinbach
             Fix For: 0.6.0, 0.7.0
         Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675-backport-v6.1.patch.txt, HIVE-675-backport-v6.2.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt

Currently all Hive tables reside in a single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These namespaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in the metastore, but the Hive query parser should have this feature as well.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915986#action_12915986 ]

Carl Steinbach commented on HIVE-675:
-------------------------------------

@Ning: Can you please delete metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStoreRemote.java? In the backport patch this file is deleted and replaced with TestRemoteHiveMetaStore.java, but it looks like for some reason this file was not actually deleted when the patch was applied.

add database/schema support Hive QL
-----------------------------------

                 Key: HIVE-675
                 URL: https://issues.apache.org/jira/browse/HIVE-675
             Project: Hadoop Hive
          Issue Type: New Feature
          Components: Metastore, Query Processor
            Reporter: Prasad Chakka
            Assignee: Carl Steinbach
             Fix For: 0.6.0, 0.7.0
         Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675-backport-v6.1.patch.txt, HIVE-675-backport-v6.2.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt

Currently all Hive tables reside in a single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These namespaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in the metastore, but the Hive query parser should have this feature as well.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1157:
---------------------------------

    Attachment: HIVE-1157.patch.v5.txt

Attaching an updated version of Phil's patch that applies cleanly with -p0.

UDFs can't be loaded via add jar when jar is on HDFS
----------------------------------------------------

                 Key: HIVE-1157
                 URL: https://issues.apache.org/jira/browse/HIVE-1157
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Philip Zeyliger
            Priority: Minor
         Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, HIVE-1157.patch.v4.txt, HIVE-1157.patch.v5.txt, HIVE-1157.v2.patch.txt, output.txt

As discussed on the mailing list, it would be nice if you could use UDFs that are on jars on HDFS. The proposed implementation would be for add jar to recognize that the target file is on HDFS, copy it locally, and load it into the classpath.

{quote}
Hi folks,

I have a quick question about UDF support in Hive. I'm on the 0.5 branch. Can you use a UDF where the jar which contains the function is on HDFS, and not on the local filesystem? Specifically, the following does not seem to work:

# This is Hive 0.5, from svn
$ bin/hive
Hive history file=/tmp/philip/hive_job_log_philip_201002081541_370227273.txt
hive> add jar hdfs://localhost/FooTest.jar;
Added hdfs://localhost/FooTest.jar to class path
hive> create temporary function cube as 'com.cloudera.FooTestUDF';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

Does this work for other people? I could probably fix it by changing add jar to download remote jars locally, when necessary (to load them into the classpath), or update URLClassLoader (or whatever is underneath there) to read directly from HDFS, which seems a bit more fragile. But I wanted to make sure that my interpretation of what's going on is right before I have at it.

Thanks,

-- Philip
{quote}

{quote}
Yes that's correct. I prefer to download the jars in add jar.

Zheng
{quote}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1157:
---------------------------------

    Status: Patch Available  (was: Open)

UDFs can't be loaded via add jar when jar is on HDFS
----------------------------------------------------

                 Key: HIVE-1157
                 URL: https://issues.apache.org/jira/browse/HIVE-1157
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Philip Zeyliger
            Priority: Minor
         Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, HIVE-1157.patch.v4.txt, HIVE-1157.patch.v5.txt, HIVE-1157.v2.patch.txt, output.txt

As discussed on the mailing list, it would be nice if you could use UDFs that are on jars on HDFS. The proposed implementation would be for add jar to recognize that the target file is on HDFS, copy it locally, and load it into the classpath.

{quote}
Hi folks,

I have a quick question about UDF support in Hive. I'm on the 0.5 branch. Can you use a UDF where the jar which contains the function is on HDFS, and not on the local filesystem? Specifically, the following does not seem to work:

# This is Hive 0.5, from svn
$ bin/hive
Hive history file=/tmp/philip/hive_job_log_philip_201002081541_370227273.txt
hive> add jar hdfs://localhost/FooTest.jar;
Added hdfs://localhost/FooTest.jar to class path
hive> create temporary function cube as 'com.cloudera.FooTestUDF';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

Does this work for other people? I could probably fix it by changing add jar to download remote jars locally, when necessary (to load them into the classpath), or update URLClassLoader (or whatever is underneath there) to read directly from HDFS, which seems a bit more fragile. But I wanted to make sure that my interpretation of what's going on is right before I have at it.

Thanks,

-- Philip
{quote}

{quote}
Yes that's correct. I prefer to download the jars in add jar.

Zheng
{quote}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1526: - Status: Patch Available (was: Open) Hive should depend on a release version of Thrift - Key: HIVE-1526 URL: https://issues.apache.org/jira/browse/HIVE-1526 Project: Hadoop Hive Issue Type: Task Components: Build Infrastructure, Clients Reporter: Carl Steinbach Assignee: Todd Lipcon Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1526.2.patch.txt, hive-1526.txt, libfb303.jar, libthrift.jar Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1526:
---------------------------------

    Attachment: HIVE-1526.2.patch.txt

HIVE-1526.2.patch.txt:
* Manage slf4j dependencies with Ivy.
* Added slf4j dependencies to eclipse classpath.
* Added thriftif macro to ${hive.root}/build.xml which triggers recompilation of all thrift stubs.
* Modified odbc/Makefile to use Thrift libs and headers in THRIFT_HOME instead of the ones that were checked into service/include.
* Modified odbc/Makefile to build thrift-generated cpp artifacts in ql/src.
* Removed thrift headers/code from service/include (HIVE-1527).
* Added some missing #includes to the hiveclient source files in odbc/src/cpp.

Testing:
* Tested eclipse launch configurations.
* Built CPP hiveclient lib and tested against HiveServer using the HiveClientTestC program.

Hive should depend on a release version of Thrift
-------------------------------------------------

                 Key: HIVE-1526
                 URL: https://issues.apache.org/jira/browse/HIVE-1526
             Project: Hadoop Hive
          Issue Type: Task
          Components: Build Infrastructure, Clients
            Reporter: Carl Steinbach
            Assignee: Todd Lipcon
             Fix For: 0.6.0, 0.7.0
         Attachments: HIVE-1526.2.patch.txt, hive-1526.txt, libfb303.jar, libthrift.jar

Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1530) Include hive-default.xml and hive-log4j.properties in hive-common JAR
[ https://issues.apache.org/jira/browse/HIVE-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1530:
---------------------------------

           Status: Patch Available  (was: Open)
         Assignee: Carl Steinbach
    Fix Version/s: 0.7.0

Include hive-default.xml and hive-log4j.properties in hive-common JAR
---------------------------------------------------------------------

                 Key: HIVE-1530
                 URL: https://issues.apache.org/jira/browse/HIVE-1530
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Configuration
            Reporter: Carl Steinbach
            Assignee: Carl Steinbach
             Fix For: 0.7.0
         Attachments: HIVE-1530.1.patch.txt

hive-common-*.jar should include hive-default.xml and hive-log4j.properties, and similarly hive-exec-*.jar should include hive-exec-log4j.properties. The hive-default.xml file that currently sits in the conf/ directory should be removed.

Motivations for this change:
* We explicitly tell users that they should never modify hive-default.xml, yet give them the opportunity to do so by placing the file in the conf dir.
* Many users are familiar with the Hadoop configuration mechanism, which does not require *-default.xml files to be present in HADOOP_CONF_DIR, and assume that the same is true for HIVE_CONF_DIR.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
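Why bundling defaults inside the jar works: Hadoop-style configuration loaders resolve *-default.xml as classpath resources, so a file packaged in hive-common-*.jar is found the same way as one sitting in a conf/ directory. A generic sketch of that lookup (the resource name is from the issue; the surrounding class is illustrative, not Hive code):

```java
import java.io.InputStream;
import java.net.URL;

public class DefaultConfLookup {
    public static void main(String[] args) throws Exception {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        // Returns the first hive-default.xml visible on the classpath,
        // whether it sits in a directory or inside a jar; null if absent.
        URL res = cl.getResource("hive-default.xml");
        System.out.println(res == null ? "not on classpath" : res.toExternalForm());
        if (res != null) {
            try (InputStream in = res.openStream()) {
                System.out.println("readable: " + (in.read() >= 0));
            }
        }
    }
}
```

Because lookup is by classpath rather than by path under HIVE_CONF_DIR, removing hive-default.xml from conf/ costs users nothing, which is the second motivation listed above.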
[jira] Commented: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913933#action_12913933 ]

Carl Steinbach commented on HIVE-1526:
--------------------------------------

@Todd: Can you please regenerate this patch? Both 'patch -p0' and 'git apply -p0' fail. Thanks.

Hive should depend on a release version of Thrift
-------------------------------------------------

                 Key: HIVE-1526
                 URL: https://issues.apache.org/jira/browse/HIVE-1526
             Project: Hadoop Hive
          Issue Type: Task
          Components: Build Infrastructure, Clients
            Reporter: Carl Steinbach
            Assignee: Todd Lipcon
             Fix For: 0.6.0, 0.7.0
         Attachments: hive-1526.txt, libfb303.jar, libthrift.jar

Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1526: - Fix Version/s: 0.6.0 0.7.0 Component/s: Clients Hive should depend on a release version of Thrift - Key: HIVE-1526 URL: https://issues.apache.org/jira/browse/HIVE-1526 Project: Hadoop Hive Issue Type: Task Components: Build Infrastructure, Clients Reporter: Carl Steinbach Assignee: Todd Lipcon Fix For: 0.6.0, 0.7.0 Attachments: hive-1526.txt, libfb303.jar, libthrift.jar Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)
[ https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1364:
---------------------------------

    Attachment: HIVE-1364.3.patch.txt
                HIVE-1364.3.backport-060.patch.txt

Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)
--------------------------------------------------------------------------------

                 Key: HIVE-1364
                 URL: https://issues.apache.org/jira/browse/HIVE-1364
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.5.0
            Reporter: Carl Steinbach
            Assignee: Carl Steinbach
             Fix For: 0.6.0, 0.7.0
         Attachments: HIVE-1364.2.patch.txt, HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, HIVE-1364.patch

The value component of a SERDEPROPERTIES key/value pair is currently limited to a maximum length of 767 characters. I believe that the motivation for limiting the length to 767 characters is that this value is the maximum allowed length of an index in a MySQL database running on the InnoDB engine: http://bugs.mysql.com/bug.php?id=13315

* The Metastore OR mapping currently limits many fields (including SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite the fact that these fields are not indexed.
* The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535.
* We can expect many users to hit the 767 character limit on SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping serdeproperty to map a table that has many columns.

I propose increasing the maximum allowed length of SERDEPROPERTIES.PARAM_VALUE to 8192.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)
[ https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1364: - Status: Patch Available (was: Open) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters) Key: HIVE-1364 URL: https://issues.apache.org/jira/browse/HIVE-1364 Project: Hadoop Hive Issue Type: Bug Components: Metastore Affects Versions: 0.5.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1364.2.patch.txt, HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, HIVE-1364.patch The value component of a SERDEPROPERTIES key/value pair is currently limited to a maximum length of 767 characters. I believe that the motivation for limiting the length to 767 characters is that this value is the maximum allowed length of an index in a MySQL database running on the InnoDB engine: http://bugs.mysql.com/bug.php?id=13315 * The Metastore OR mapping currently limits many fields (including SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite the fact that these fields are not indexed. * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535. * We can expect many users to hit the 767 character limit on SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping serdeproperty to map a table that has many columns. I propose increasing the maximum allowed length of SERDEPROPERTIES.PARAM_VALUE to 8192. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)
[ https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1364: - Attachment: HIVE-1364.4.patch.txt HIVE-1364.4.backport-060.patch.txt Updated version of the patch with changes requested by John. Increase the maximum length of SERDEPROPERTIES values (currently 767 characters) Key: HIVE-1364 URL: https://issues.apache.org/jira/browse/HIVE-1364 Project: Hadoop Hive Issue Type: Bug Components: Metastore Affects Versions: 0.5.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1364.2.patch.txt, HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, HIVE-1364.4.backport-060.patch.txt, HIVE-1364.4.patch.txt, HIVE-1364.patch The value component of a SERDEPROPERTIES key/value pair is currently limited to a maximum length of 767 characters. I believe that the motivation for limiting the length to 767 characters is that this value is the maximum allowed length of an index in a MySQL database running on the InnoDB engine: http://bugs.mysql.com/bug.php?id=13315 * The Metastore OR mapping currently limits many fields (including SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite the fact that these fields are not indexed. * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535. * We can expect many users to hit the 767 character limit on SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping serdeproperty to map a table that has many columns. I propose increasing the maximum allowed length of SERDEPROPERTIES.PARAM_VALUE to 8192. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914255#action_12914255 ] Carl Steinbach commented on HIVE-1526: -- Sorry, that was a false alarm about the patch. Turns out the github Hive mirror lags the main repo by about a week. @Todd: This patch introduces unsatisfied dependencies on slf4j-api and slf4j-log4j12. Can you please update the patch to pull these dependencies down with Ivy? Hive should depend on a release version of Thrift - Key: HIVE-1526 URL: https://issues.apache.org/jira/browse/HIVE-1526 Project: Hadoop Hive Issue Type: Task Components: Build Infrastructure, Clients Reporter: Carl Steinbach Assignee: Todd Lipcon Fix For: 0.6.0, 0.7.0 Attachments: hive-1526.txt, libfb303.jar, libthrift.jar Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1517) ability to select across a database
[ https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1517: - Fix Version/s: (was: 0.6.0) @John: Yes, it's a checkpoint patch. Moving this to 0.7.0. ability to select across a database --- Key: HIVE-1517 URL: https://issues.apache.org/jira/browse/HIVE-1517 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: HIVE-1517.1.patch.txt After https://issues.apache.org/jira/browse/HIVE-675, we need a way to be able to select across a database for this feature to be useful. For eg: use db1 create table foo(); use db2 select .. from db1.foo. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1517) ability to select across a database
[ https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1517: - Attachment: HIVE-1517.1.patch.txt ability to select across a database --- Key: HIVE-1517 URL: https://issues.apache.org/jira/browse/HIVE-1517 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1517.1.patch.txt After https://issues.apache.org/jira/browse/HIVE-675, we need a way to be able to select across a database for this feature to be useful. For eg: use db1 create table foo(); use db2 select .. from db1.foo. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1615) Web Interface JSP needs Refactoring for removed meta store methods
[ https://issues.apache.org/jira/browse/HIVE-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1615: - Affects Version/s: 0.7.0 Web Interface JSP needs Refactoring for removed meta store methods -- Key: HIVE-1615 URL: https://issues.apache.org/jira/browse/HIVE-1615 Project: Hadoop Hive Issue Type: Bug Components: Web UI Affects Versions: 0.6.0, 0.7.0 Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.7.0 Attachments: hive-1615.patch.2.txt, hive-1615.patch.txt Some meta store methods being called from JSP have been removed. Really should prioritize compiling jsp into servlet code again. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1615) Web Interface JSP needs Refactoring for removed meta store methods
[ https://issues.apache.org/jira/browse/HIVE-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12910375#action_12910375 ] Carl Steinbach commented on HIVE-1615: -- This needs to be backported to 0.6. I verified that hive-1615.patch.2.txt applies cleanly to the 0.6 branch. Web Interface JSP needs Refactoring for removed meta store methods -- Key: HIVE-1615 URL: https://issues.apache.org/jira/browse/HIVE-1615 Project: Hadoop Hive Issue Type: Bug Components: Web UI Affects Versions: 0.6.0, 0.7.0 Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.7.0 Attachments: hive-1615.patch.2.txt, hive-1615.patch.txt Some meta store methods being called from JSP have been removed. Really should prioritize compiling jsp into servlet code again. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-33) [Hive]: Add ability to compute statistics on hive tables
[ https://issues.apache.org/jira/browse/HIVE-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-33: --- Issue Type: New Feature (was: Bug) [Hive]: Add ability to compute statistics on hive tables Key: HIVE-33 URL: https://issues.apache.org/jira/browse/HIVE-33 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Ashish Thusoo Assignee: Ahmed M Aly Add commands to collect partition and column level statistics in hive. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1361) table/partition level statistics
[ https://issues.apache.org/jira/browse/HIVE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1361: - Fix Version/s: 0.7.0 Affects Version/s: (was: 0.6.0) Component/s: Query Processor table/partition level statistics Key: HIVE-1361 URL: https://issues.apache.org/jira/browse/HIVE-1361 Project: Hadoop Hive Issue Type: Sub-task Components: Query Processor Reporter: Ning Zhang Assignee: Ahmed M Aly Fix For: 0.7.0 Attachments: HIVE-1361.java_only.patch, HIVE-1361.patch, stats0.patch At the first step, we gather table-level stats for non-partitioned table and partition-level stats for partitioned table. Future work could extend the table level stats to partitioned table as well. There are 3 major milestones in this subtask: 1) extend the insert statement to gather table/partition level stats on-the-fly. 2) extend metastore API to support storing and retrieving stats for a particular table/partition. 3) add an ANALYZE TABLE [PARTITION] statement in Hive QL to gather stats for existing tables/partitions. The proposed stats are: Partition-level stats: - number of rows - total size in bytes - number of files - max, min, average row sizes - max, min, average file sizes Table-level stats in addition to partition level stats: - number of partitions -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1649) Ability to update counters and status from TRANSFORM scripts
Ability to update counters and status from TRANSFORM scripts Key: HIVE-1649 URL: https://issues.apache.org/jira/browse/HIVE-1649 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Hadoop Streaming supports the ability to update counters and status by writing specially coded messages to the script's stderr stream. A streaming process can use the stderr to emit counter information. {{reporter:counter:group,counter,amount}} should be sent to stderr to update the counter. A streaming process can use the stderr to emit status information. To set a status, {{reporter:status:message}} should be sent to stderr. Hive should support these same features with its TRANSFORM mechanism. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
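The stderr protocol described above is simple enough to sketch as a small transform program. The class and method names below are illustrative only — this is not part of any Hive or Hadoop API, just a demonstration of the `reporter:counter:` and `reporter:status:` line formats a streaming/TRANSFORM process would emit:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

// Sketch of a TRANSFORM-style script that passes rows through on stdout
// and reports counters/status on stderr, in the format Hadoop Streaming
// recognizes. Names here are illustrative, not an existing API.
public class CounterReporter {

    // Build a counter-update line: reporter:counter:group,counter,amount
    static String counterLine(String group, String counter, long amount) {
        return "reporter:counter:" + group + "," + counter + "," + amount;
    }

    // Build a status-update line: reporter:status:message
    static String statusLine(String message) {
        return "reporter:status:" + message;
    }

    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        long rows = 0;
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line); // pass each row through unchanged
            rows++;
        }
        // Specially coded messages on stderr update counters and status.
        System.err.println(counterLine("MyTransform", "ROWS_SEEN", rows));
        System.err.println(statusLine("finished after " + rows + " rows"));
    }
}
```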
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Attachment: HIVE-675-backport-v6.2.patch.txt HIVE-675-backport-v6.2.patch.txt includes HIVE-1607. add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675-backport-v6.1.patch.txt, HIVE-675-backport-v6.2.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Status: Patch Available (was: Reopened) add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675-backport-v6.1.patch.txt, HIVE-675-backport-v6.2.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Attachment: HIVE-675-backport-v6.1.patch.txt Backport for 0.6.0. Should I squash HIVE-1607 into this or backport it separately? add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675-backport-v6.1.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1636) Implement SHOW TABLES {FROM | IN} db_name
Implement SHOW TABLES {FROM | IN} db_name --- Key: HIVE-1636 URL: https://issues.apache.org/jira/browse/HIVE-1636 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach Make it possible to list the tables in a specific database using the following syntax borrowed from MySQL: {noformat} SHOW TABLES [{FROM|IN} db_name] {noformat} See http://dev.mysql.com/doc/refman/5.0/en/show-tables.html -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-802) Bug in DataNucleus prevents Hive from building if inside a dir with '+' in it
[ https://issues.apache.org/jira/browse/HIVE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-802: Attachment: datanucleus-core-1.1.2-patched.jar Bug in DataNucleus prevents Hive from building if inside a dir with '+' in it - Key: HIVE-802 URL: https://issues.apache.org/jira/browse/HIVE-802 Project: Hadoop Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.5.0 Reporter: Todd Lipcon Assignee: Arvind Prabhakar Attachments: datanucleus-core-1.1.2-patched.jar There's a bug in DataNucleus that causes this issue: http://www.jpox.org/servlet/jira/browse/NUCCORE-371 To reproduce, simply put your hive source tree in a directory that contains a '+' character. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1613) hive --service jar looks for hadoop version but was not defined
[ https://issues.apache.org/jira/browse/HIVE-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1613: - Fix Version/s: 0.7.0 hive --service jar looks for hadoop version but was not defined --- Key: HIVE-1613 URL: https://issues.apache.org/jira/browse/HIVE-1613 Project: Hadoop Hive Issue Type: Bug Components: Clients Affects Versions: 0.5.1 Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.6.0, 0.7.0 Attachments: hive-1613.patch.txt hive --service jar fails. I have to open another ticket to clean up the scripts and unify functions like version detection. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1613) hive --service jar looks for hadoop version but was not defined
[ https://issues.apache.org/jira/browse/HIVE-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12907402#action_12907402 ] Carl Steinbach commented on HIVE-1613: -- +1 Looks good. hive --service jar looks for hadoop version but was not defined --- Key: HIVE-1613 URL: https://issues.apache.org/jira/browse/HIVE-1613 Project: Hadoop Hive Issue Type: Bug Components: Clients Affects Versions: 0.5.1 Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.6.0, 0.7.0 Attachments: hive-1613.patch.txt hive --service jar fails. I have to open another ticket to clean up the scripts and unify functions like version detection. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Reopened: (HIVE-1607) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
[ https://issues.apache.org/jira/browse/HIVE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reopened HIVE-1607: -- Backport to 0.6.0 Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675 Key: HIVE-1607 URL: https://issues.apache.org/jira/browse/HIVE-1607 Project: Hadoop Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: HIVE-1607.1.patch.txt, HIVE-1607.2.patch.txt Several methods were removed from the IMetaStoreClient interface as part of HIVE-675: {code} /** * Drop the table. * * @param tableName * The table to drop * @param deleteData * Should we delete the underlying data * @throws MetaException * Could not drop table properly. * @throws UnknownTableException * The table wasn't found. * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * The table wasn't found. */ public void dropTable(String tableName, boolean deleteData) throws MetaException, UnknownTableException, TException, NoSuchObjectException; /** * Get a table object. * * @param tableName * Name of the table to fetch. * @return An object representing the table. * @throws MetaException * Could not fetch the table * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * In case the table wasn't found. */ public Table getTable(String tableName) throws MetaException, TException, NoSuchObjectException; public boolean tableExists(String databaseName, String tableName) throws MetaException, TException, UnknownDBException; {code} These methods should be reinstated with a deprecation warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
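The reinstate-and-deprecate pattern the issue calls for typically looks like the sketch below: the old single-argument methods return as thin `@Deprecated` wrappers that delegate to the database-qualified versions added in HIVE-675. The class, the `DEFAULT_DATABASE_NAME` constant, and the delegate signatures here are assumptions for illustration, not the actual HIVE-1607 patch:

```java
// Sketch of reinstating a removed method as a deprecated wrapper.
// All names below are illustrative stand-ins, not Hive's real API.
public class LegacyMetaStoreClient {
    static final String DEFAULT_DATABASE_NAME = "default";

    // New, database-qualified API introduced by HIVE-675.
    public String getTable(String dbName, String tableName) {
        return dbName + "." + tableName; // stand-in for a real Table lookup
    }

    /** @deprecated Use the database-qualified {@link #getTable(String, String)}. */
    @Deprecated
    public String getTable(String tableName) {
        // Old callers implicitly operated on the default database.
        return getTable(DEFAULT_DATABASE_NAME, tableName);
    }
}
```

Compiling against the wrapper produces a deprecation warning, so existing callers keep working while being nudged toward the new signatures.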
[jira] Created: (HIVE-1623) Factor out Hadoop version check logic in bin/hive scripts
Factor out Hadoop version check logic in bin/hive scripts - Key: HIVE-1623 URL: https://issues.apache.org/jira/browse/HIVE-1623 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Carl Steinbach The same Hadoop version check logic is repeated in each of the following files: bin/ext/hiveserver.sh bin/ext/hwi.sh bin/ext/metastore.sh bin/ext/util/execHiveCmd.sh This code should be refactored into a version check function. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1607) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
[ https://issues.apache.org/jira/browse/HIVE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1607: - Fix Version/s: 0.6.0 Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675 Key: HIVE-1607 URL: https://issues.apache.org/jira/browse/HIVE-1607 Project: Hadoop Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1607.1.patch.txt, HIVE-1607.2.patch.txt Several methods were removed from the IMetaStoreClient interface as part of HIVE-675: {code} /** * Drop the table. * * @param tableName * The table to drop * @param deleteData * Should we delete the underlying data * @throws MetaException * Could not drop table properly. * @throws UnknownTableException * The table wasn't found. * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * The table wasn't found. */ public void dropTable(String tableName, boolean deleteData) throws MetaException, UnknownTableException, TException, NoSuchObjectException; /** * Get a table object. * * @param tableName * Name of the table to fetch. * @return An object representing the table. * @throws MetaException * Could not fetch the table * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * In case the table wasn't found. */ public Table getTable(String tableName) throws MetaException, TException, NoSuchObjectException; public boolean tableExists(String databaseName, String tableName) throws MetaException, TException, UnknownDBException; {code} These methods should be reinstated with a deprecation warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907048#action_12907048 ] Carl Steinbach commented on HIVE-675: - @Paul: No, not yet, but I think the following script should work (note that DESC is a reserved word in MySQL, so it must be quoted with backticks): {code} ALTER TABLE DBS MODIFY `DESC` VARCHAR(4000); ALTER TABLE DBS ADD COLUMN DB_LOCATION_URI VARCHAR(4000); {code} add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1517) ability to select across a database
[ https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1517: - Fix Version/s: 0.6.0 0.7.0 ability to select across a database --- Key: HIVE-1517 URL: https://issues.apache.org/jira/browse/HIVE-1517 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 After https://issues.apache.org/jira/browse/HIVE-675, we need a way to be able to select across a database for this feature to be useful. For eg: use db1 create table foo(); use db2 select .. from db1.foo. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Reopened: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reopened HIVE-675: - Working on a backport for 0.6.0 add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Fix Version/s: 0.6.0 add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1446) Move Hive Documentation from the wiki to version control
[ https://issues.apache.org/jira/browse/HIVE-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1446: - Fix Version/s: (was: 0.6.0) Postponing this work until 0.7.0 Move Hive Documentation from the wiki to version control Key: HIVE-1446 URL: https://issues.apache.org/jira/browse/HIVE-1446 Project: Hadoop Hive Issue Type: Task Components: Documentation Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: hive-1446-part-1.diff, hive-1446.diff, hive-logo-wide.png Move the Hive Language Manual (and possibly some other documents) from the Hive wiki to version control. This work needs to be coordinated with the hive-dev and hive-user community in order to avoid missing any edits as well as to avoid or limit unavailability of the docs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1476) Hive's metastore when run as a thrift service creates directories as the service user instead of the real user issuing create table/alter table etc.
[ https://issues.apache.org/jira/browse/HIVE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905622#action_12905622 ] Carl Steinbach commented on HIVE-1476: -- @Venkatesh: THRIFT-814 covers adding SPNEGO support to Thrift. Hive's metastore when run as a thrift service creates directories as the service user instead of the real user issuing create table/alter table etc. Key: HIVE-1476 URL: https://issues.apache.org/jira/browse/HIVE-1476 Project: Hadoop Hive Issue Type: Bug Affects Versions: 0.6.0, 0.7.0 Reporter: Pradeep Kamath Attachments: HIVE-1476.patch, HIVE-1476.patch.2 If the thrift metastore service is running as the user hive then all table directories as a result of create table are created as that user rather than the user who actually issued the create table command. This is different semantically from non-thrift mode (i.e. local mode) when clients directly connect to the metastore. In the latter case, directories are created as the real user. The thrift mode should do the same. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1476) Hive's metastore when run as a thrift service creates directories as the service user instead of the real user issuing create table/alter table etc.
[ https://issues.apache.org/jira/browse/HIVE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905623#action_12905623 ] Carl Steinbach commented on HIVE-1476: -- Edit: I mean THRIFT-889. Hive's metastore when run as a thrift service creates directories as the service user instead of the real user issuing create table/alter table etc. Key: HIVE-1476 URL: https://issues.apache.org/jira/browse/HIVE-1476 Project: Hadoop Hive Issue Type: Bug Affects Versions: 0.6.0, 0.7.0 Reporter: Pradeep Kamath Attachments: HIVE-1476.patch, HIVE-1476.patch.2 If the thrift metastore service is running as the user hive then all table directories as a result of create table are created as that user rather than the user who actually issued the create table command. This is different semantically from non-thrift mode (i.e. local mode) when clients directly connect to the metastore. In the latter case, directories are created as the real user. The thrift mode should do the same. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905686#action_12905686 ] Carl Steinbach commented on HIVE-1546: -- bq. we've agreed at the high level on the approach of creating Howl as a wrapper around Hive I thought Howl was supposed to be a wrapper around (and replacement for) the Hive metastore, not all of Hive. I think there are clear advantages to Hive and Howl sharing the same metastore code as long as they access this facility through the public API, but can't say the same for the two projects using the same CLI code if it means allowing external projects to depend on a loosely defined set of internal APIs. What benefits are we hoping to achieve by having Howl and Hive share the same CLI code, especially if Howl is only interested in a small part of it? What are the drawbacks of instead encouraging the Howl project to copy the CLI code and maintain their own version? Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905692#action_12905692 ] Carl Steinbach commented on HIVE-1546: -- What do you think of this option: we check the Howl SemanticAnalyzer into the Hive source tree and provide a config option that optionally enables it? This gives Howl the features they need without making the SemanticAnalyzer API public. Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905707#action_12905707 ] Carl Steinbach commented on HIVE-1546: -- I'm +1 on the approach outlined by John. Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-849) database_name.table_name.column_name not supported
[ https://issues.apache.org/jira/browse/HIVE-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905719#action_12905719 ] Carl Steinbach commented on HIVE-849: - @Namit: Correct, but this issue is also covered by HIVE-1517, and the comments in that ticket provide more details, so I decided to resolve this ticket as a duplicate of HIVE-1517. database_name.table_name.column_name not supported Key: HIVE-849 URL: https://issues.apache.org/jira/browse/HIVE-849 Project: Hadoop Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Carl Steinbach -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905757#action_12905757 ] Carl Steinbach commented on HIVE-1546: -- I gather from Ashutosh's latest patch that you want to do the following: * Provide your own implementation of HiveSemanticAnalyzerFactory. * Subclass SemanticAnalyzer * Subclass DDLSemanticAnalyzer I looked at the public and protected members in these classes and think that at a minimum we would have to mark the following classes as limited private and evolving: * HiveSemanticAnalyzerFactory * BaseSemanticAnalyzer * SemanticAnalyzer * DDLSemanticAnalyzer * ASTNode * HiveParser (i.e. Hive's ANTLR grammar) * Context (org.apache.hadoop.hive.ql.Context) * Task and FetchTask * QB * QBParseInfo * QBMetaData * QBJoinTree * CreateTableDesc So anytime we touch one of these classes we would need to coordinate with the Howl folks to make sure we aren't breaking one of their plugins? I don't think this is a good tradeoff if the main benefit we can expect is a simpler build and release process for Howl. Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1609) Support partition filtering in metastore
[ https://issues.apache.org/jira/browse/HIVE-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905769#action_12905769 ] Carl Steinbach commented on HIVE-1609: -- DynamicSerDe is the component that has a JavaCC dependency. I think DynamicSerDe (and TCTLSeparatedProtocol) were deprecated a long time ago. Should we try to remove this code? Support partition filtering in metastore Key: HIVE-1609 URL: https://issues.apache.org/jira/browse/HIVE-1609 Project: Hadoop Hive Issue Type: New Feature Components: Metastore Reporter: Ajay Kidave Fix For: 0.7.0 Attachments: hive_1609.patch, hive_1609_2.patch The metastore needs to have support for returning a list of partitions based on user specified filter conditions. This will be useful for tools which need to do partition pruning. Howl is one such use case. The way partition pruning is done during hive query execution need not be changed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1609) Support partition filtering in metastore
[ https://issues.apache.org/jira/browse/HIVE-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1609: - Fix Version/s: 0.7.0 (was: 0.6.0) Affects Version/s: (was: 0.5.0) Support partition filtering in metastore Key: HIVE-1609 URL: https://issues.apache.org/jira/browse/HIVE-1609 Project: Hadoop Hive Issue Type: New Feature Components: Metastore Reporter: Ajay Kidave Fix For: 0.7.0 Attachments: hive_1609.patch The metastore needs to have support for returning a list of partitions based on user specified filter conditions. This will be useful for tools which need to do partition pruning. Howl is one such use case. The way partition pruning is done during hive query execution need not be changed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1609) Support partition filtering in metastore
[ https://issues.apache.org/jira/browse/HIVE-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1609: - Status: Open (was: Patch Available) Support partition filtering in metastore Key: HIVE-1609 URL: https://issues.apache.org/jira/browse/HIVE-1609 Project: Hadoop Hive Issue Type: New Feature Components: Metastore Reporter: Ajay Kidave Fix For: 0.7.0 Attachments: hive_1609.patch The metastore needs to have support for returning a list of partitions based on user specified filter conditions. This will be useful for tools which need to do partition pruning. Howl is one such use case. The way partition pruning is done during hive query execution need not be changed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905390#action_12905390 ] Carl Steinbach commented on HIVE-1546: -- @Ashutosh: Can you provide some background on what you hope to accomplish with this? What is the motivating use case, i.e. what custom SemanticAnalyzer do you plan to write? Also, how are the new INPUTDRIVER and OUTPUTDRIVER properties used? By adding these to the Hive grammar it seems like we may be providing a mechanism for defining tables in the MetaStore that Hive can't read or write to. If that's the case what are your plans for adding this support to Hive? Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905406#action_12905406 ] Carl Steinbach commented on HIVE-1546: -- If the main motivation for this ticket is the ability to produce a crippled version of the HiveCLI that is only capable of executing DDL, then I think we should consider simpler approaches that don't involve making SemanticAnalyzer a public API. SemanticAnalyzer is in serious need of refactoring. Making this API public will severely restrict our ability to do this work in the future. Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905411#action_12905411 ] Carl Steinbach commented on HIVE-1546: -- @John: Good to know, but what's the motivation for this change? Was it covered in the back-channel discussions you mentioned above? And is making SemanticAnalyzer a public API really a good idea? Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1545) Add a bunch of UDFs and UDAFs
[ https://issues.apache.org/jira/browse/HIVE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1545: - Component/s: UDF Add a bunch of UDFs and UDAFs - Key: HIVE-1545 URL: https://issues.apache.org/jira/browse/HIVE-1545 Project: Hadoop Hive Issue Type: New Feature Components: UDF Reporter: Jonathan Chang Assignee: Jonathan Chang Priority: Minor Attachments: udfs.tar.gz Here are some UD(A)Fs which can be incorporated into the Hive distribution: UDFArgMax - Find the 0-indexed index of the largest argument. e.g., ARGMAX(4, 5, 3) returns 1. UDFBucket - Find the bucket in which the first argument belongs. e.g., BUCKET(x, b_1, b_2, b_3, ...), will return the smallest i such that x > b_{i} but x <= b_{i+1}. Returns 0 if x is smaller than all the buckets. UDFFindInArray - Finds the 1-index of the first element in the array given as the second argument. Returns 0 if not found. Returns NULL if either argument is NULL. E.g., FIND_IN_ARRAY(5, array(1,2,5)) will return 3. FIND_IN_ARRAY(5, array(1,2,3)) will return 0. UDFGreatCircleDist - Finds the great circle distance (in km) between two lat/long coordinates (in degrees). UDFLDA - Performs LDA inference on a vector given fixed topics. UDFNumberRows - Number successive rows starting from 1. Counter resets to 1 whenever any of its parameters changes. UDFPmax - Finds the maximum of a set of columns. e.g., PMAX(4, 5, 3) returns 5. UDFRegexpExtractAll - Like REGEXP_EXTRACT except that it returns all matches in an array. UDFUnescape - Returns the string unescaped (using C/Java style unescaping). UDFWhich - Given a boolean array, return the indices which are TRUE. UDFJaccard UDAFCollect - Takes all the values associated with a row and converts it into a list. Make sure to have: set hive.map.aggr = false; UDAFCollectMap - Like collect except that it takes tuples and generates a map. UDAFEntropy - Compute the entropy of a column. UDAFPearson (BROKEN!!!)
- Computes the Pearson correlation between two columns. UDAFTop - TOP(KEY, VAL) - returns the KEY associated with the largest value of VAL. UDAFTopN (BROKEN!!!) - Like TOP except returns a list of the keys associated with the N (passed as the third parameter) largest values of VAL. UDAFHistogram -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
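The BUCKET semantics described above (smallest i such that x > b_{i} but x <= b_{i+1}, with 0 when x falls below every boundary) can be sketched in plain Java. This is an illustrative helper written for this digest, not the UDFBucket source from udfs.tar.gz; the class and method names are made up.

```java
// Hypothetical sketch of the BUCKET semantics, not the actual UDFBucket code.
public class BucketSketch {

    // Returns the smallest i such that x > boundaries[i-1] but
    // x <= boundaries[i] (1-indexed over the boundary list), and 0
    // when x is smaller than or equal to the first boundary.
    public static int bucket(double x, double... boundaries) {
        int i = 0;
        while (i < boundaries.length && x > boundaries[i]) {
            i++;
        }
        return i;
    }
}
```

For example, with boundaries (1, 5, 10): a value of 4 lands in bucket 1, a value of 0.5 in bucket 0, and a value of 20 past the last boundary in bucket 3.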
[jira] Commented: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904860#action_12904860 ] Carl Steinbach commented on HIVE-1016: -- @Namit: GenericUDF.initialize() is called both at compile-time and run-time. Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1016.1.patch.txt There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. * Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904364#action_12904364 ] Carl Steinbach commented on HIVE-675: - @John: Are you referring to the changes I made to create_database/get_database/get_databases/drop_database in hive_metastore.thrift? In that file I replaced {{list<string> get_databases() throws(1:MetaException o1)}} with {{list<string> get_databases(1:string pattern) throws(1:MetaException o1)}} I can easily revert this change, but want to know if there are other things you think I need to fix. I think the changes I made to create_database and drop_database should not be an issue since these calls weren't actually supported in previous versions. add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1016: - Attachment: HIVE-1016.1.patch.txt Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1016.1.patch.txt There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. * Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1016: - Status: Patch Available (was: Open) Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1016.1.patch.txt There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. * Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904388#action_12904388 ] Carl Steinbach commented on HIVE-675: - I removed these methods since they implicitly target the {{default}} database. I can put them back with a deprecation warning, but I also want to point out that old code that depends on these methods is probably no longer correct now that Hive supports multiple databases. Removing these methods entirely may be the easiest way to help people find these errors. add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904400#action_12904400 ] Carl Steinbach commented on HIVE-1016: -- @Namit: I initially preferred that approach too, and I think it would make sense if all of the UDF classes inherited from the same abstract base class. However, we have a bunch of unrelated UDF base classes (UDF, UDAF, GenericUDF, GenericUDAFEvaluator (which already has a runtime init() method), and GenericUDTF), and taking the approach you suggested would require modifications to all of these classes as well as the code that calls them. I also think it's likely that we'll want to make more runtime context available to UDFs in the future, and it's easier to proxy this through the UDFContext singleton than to keep adding methods to each of the different UDF base classes. Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1016.1.patch.txt There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. 
* Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
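The UDFContext singleton design mentioned in the comment above can be sketched as follows. The name UDFContext and the idea of proxying runtime context through a shared singleton come from the discussion; the Map-based stand-in for Hadoop's JobConf is an assumption made to keep the sketch self-contained, and the actual HIVE-1016 patch may look different.

```java
import java.util.Map;

// Minimal sketch of a UDFContext singleton: one shared access point for
// runtime configuration, so the unrelated UDF base classes (UDF, UDAF,
// GenericUDF, GenericUDTF) don't each need a new init method.
public final class UDFContext {
    private static final UDFContext INSTANCE = new UDFContext();

    // Stand-in for org.apache.hadoop.mapred.JobConf, assumed here
    // to keep the example free of Hadoop dependencies.
    private Map<String, String> conf;

    private UDFContext() {}

    public static UDFContext get() {
        return INSTANCE;
    }

    // Called once by the operator during runtime initialization.
    public synchronized void setConf(Map<String, String> conf) {
        this.conf = conf;
    }

    // UDF code reads the job configuration through the singleton.
    public synchronized Map<String, String> getConf() {
        return conf;
    }
}
```

The design choice being argued for: adding context later (counters, task IDs, distributed-cache paths) means extending this one class rather than touching every UDF base class and its call sites.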
[jira] Updated: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1016: - Status: Open (was: Patch Available) Ok, I'll rework the patch with your suggestions. Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1016.1.patch.txt There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. * Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1607) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675 Key: HIVE-1607 URL: https://issues.apache.org/jira/browse/HIVE-1607 Project: Hadoop Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Several methods were removed from the IMetaStoreClient interface as part of HIVE-675: {code} /** * Drop the table. * * @param tableName * The table to drop * @param deleteData * Should we delete the underlying data * @throws MetaException * Could not drop table properly. * @throws UnknownTableException * The table wasn't found. * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * The table wasn't found. */ public void dropTable(String tableName, boolean deleteData) throws MetaException, UnknownTableException, TException, NoSuchObjectException; /** * Get a table object. * * @param tableName * Name of the table to fetch. * @return An object representing the table. * @throws MetaException * Could not fetch the table * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * In case the table wasn't found. */ public Table getTable(String tableName) throws MetaException, TException, NoSuchObjectException; public boolean tableExists(String databaseName, String tableName) throws MetaException, TException, UnknownDBException; {code} These methods should be reinstated with a deprecation warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
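One way to reinstate the removed methods with a deprecation warning, sketched below, is to make them thin overloads that delegate to the database-qualified variants introduced by HIVE-675. The method names follow the interface excerpt above, but the class name, the String return type standing in for the Table object, and the delegation details are illustrative assumptions, not the committed HIVE-1607 patch.

```java
// Hedged sketch: deprecated single-table-name overload delegating to the
// database-qualified API, preserving the pre-HIVE-675 behavior of
// implicitly targeting the default database.
public class MetaStoreClientSketch {
    public static final String DEFAULT_DATABASE_NAME = "default";

    // Database-qualified variant (the post-HIVE-675 API shape).
    public String getTable(String dbName, String tableName) {
        return dbName + "." + tableName;  // stand-in for the real metastore lookup
    }

    /** @deprecated As of HIVE-675, use {@link #getTable(String, String)}. */
    @Deprecated
    public String getTable(String tableName) {
        // Old callers implicitly meant the default database; make that explicit.
        return getTable(DEFAULT_DATABASE_NAME, tableName);
    }
}
```

Callers compiled against the old signature keep working (with a deprecation warning at compile time), while new code is steered toward the two-argument form.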
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904420#action_12904420 ] Carl Steinbach commented on HIVE-675: - @Ning: I filed HIVE-1607. Patch to follow shortly. Sorry for the inconvenience. add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1607) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
[ https://issues.apache.org/jira/browse/HIVE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1607: - Attachment: HIVE-1607.1.patch.txt Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675 Key: HIVE-1607 URL: https://issues.apache.org/jira/browse/HIVE-1607 Project: Hadoop Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1607.1.patch.txt Several methods were removed from the IMetaStoreClient interface as part of HIVE-675: {code} /** * Drop the table. * * @param tableName * The table to drop * @param deleteData * Should we delete the underlying data * @throws MetaException * Could not drop table properly. * @throws UnknownTableException * The table wasn't found. * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * The table wasn't found. */ public void dropTable(String tableName, boolean deleteData) throws MetaException, UnknownTableException, TException, NoSuchObjectException; /** * Get a table object. * * @param tableName * Name of the table to fetch. * @return An object representing the table. * @throws MetaException * Could not fetch the table * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * In case the table wasn't found. */ public Table getTable(String tableName) throws MetaException, TException, NoSuchObjectException; public boolean tableExists(String databaseName, String tableName) throws MetaException, TException, UnknownDBException; {code} These methods should be reinstated with a deprecation warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1607) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
[ https://issues.apache.org/jira/browse/HIVE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1607: - Status: Patch Available (was: Open) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675 Key: HIVE-1607 URL: https://issues.apache.org/jira/browse/HIVE-1607 Project: Hadoop Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1607.1.patch.txt Several methods were removed from the IMetaStoreClient interface as part of HIVE-675: {code} /** * Drop the table. * * @param tableName * The table to drop * @param deleteData * Should we delete the underlying data * @throws MetaException * Could not drop table properly. * @throws UnknownTableException * The table wasn't found. * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * The table wasn't found. */ public void dropTable(String tableName, boolean deleteData) throws MetaException, UnknownTableException, TException, NoSuchObjectException; /** * Get a table object. * * @param tableName * Name of the table to fetch. * @return An object representing the table. * @throws MetaException * Could not fetch the table * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * In case the table wasn't found. */ public Table getTable(String tableName) throws MetaException, TException, NoSuchObjectException; public boolean tableExists(String databaseName, String tableName) throws MetaException, TException, UnknownDBException; {code} These methods should be reinstated with a deprecation warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1607) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
[ https://issues.apache.org/jira/browse/HIVE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1607: - Attachment: HIVE-1607.2.patch.txt Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675 Key: HIVE-1607 URL: https://issues.apache.org/jira/browse/HIVE-1607 Project: Hadoop Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1607.1.patch.txt, HIVE-1607.2.patch.txt Several methods were removed from the IMetaStoreClient interface as part of HIVE-675: {code} /** * Drop the table. * * @param tableName * The table to drop * @param deleteData * Should we delete the underlying data * @throws MetaException * Could not drop table properly. * @throws UnknownTableException * The table wasn't found. * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * The table wasn't found. */ public void dropTable(String tableName, boolean deleteData) throws MetaException, UnknownTableException, TException, NoSuchObjectException; /** * Get a table object. * * @param tableName * Name of the table to fetch. * @return An object representing the table. * @throws MetaException * Could not fetch the table * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * In case the table wasn't found. */ public Table getTable(String tableName) throws MetaException, TException, NoSuchObjectException; public boolean tableExists(String databaseName, String tableName) throws MetaException, TException, UnknownDBException; {code} These methods should be reinstated with a deprecation warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1594) Typo of hive.merge.size.smallfiles.avgsize prevents change of value
[ https://issues.apache.org/jira/browse/HIVE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1594: - Fix Version/s: (was: 0.7.0) Affects Version/s: (was: 0.6.0) (was: 0.7.0) Typo of hive.merge.size.smallfiles.avgsize prevents change of value --- Key: HIVE-1594 URL: https://issues.apache.org/jira/browse/HIVE-1594 Project: Hadoop Hive Issue Type: Bug Components: Configuration Affects Versions: 0.5.0 Reporter: Yun Huang Yong Assignee: Yun Huang Yong Priority: Minor Fix For: 0.6.0 Attachments: HIVE-1594-0.5.patch, HIVE-1594.patch The setting is described as <name>hive.merge.size.smallfiles.avgsize</name>; however, common/src/java/org/apache/hadoop/hive/conf/HiveConf.java reads it as hive.merge.smallfiles.avgsize (note the missing '.size.'), so the user's setting has no effect and the value is stuck at the default of 16MB. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
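The failure mode above is easy to demonstrate in isolation: the user sets one key, the code looks up a slightly different key, and the default silently wins. A sketch using a plain java.util.Properties lookup as a simplified stand-in for HiveConf (key names follow the report; everything else is illustrative):

```java
import java.util.Properties;

// Demonstrates the HIVE-1594 bug pattern: a misspelled lookup key makes the
// user's setting invisible, so the hard-coded default is always returned.
public class ConfTypoSketch {
    static final long DEFAULT_AVG_SIZE = 16L * 1024 * 1024; // 16MB default

    static long readAvgSize(Properties conf, String key) {
        String v = conf.getProperty(key);
        return v == null ? DEFAULT_AVG_SIZE : Long.parseLong(v);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // The user sets the documented key...
        conf.setProperty("hive.merge.size.smallfiles.avgsize", "1000000");
        // ...but the code reads the misspelled key (missing ".size."),
        // so the setting has no effect and the 16MB default wins.
        System.out.println(readAvgSize(conf, "hive.merge.smallfiles.avgsize"));   // 16777216
        System.out.println(readAvgSize(conf, "hive.merge.size.smallfiles.avgsize")); // 1000000
    }
}
```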
[jira] Updated: (HIVE-1401) Web Interface can only browse default
[ https://issues.apache.org/jira/browse/HIVE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1401: - Fix Version/s: (was: 0.7.0) Web Interface can only browse default Key: HIVE-1401 URL: https://issues.apache.org/jira/browse/HIVE-1401 Project: Hadoop Hive Issue Type: New Feature Components: Web UI Affects Versions: 0.5.0 Reporter: Edward Capriolo Assignee: Edward Capriolo Fix For: 0.6.0 Attachments: HIVE-1401-1-patch.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1307) More generic and efficient merge method
[ https://issues.apache.org/jira/browse/HIVE-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1307: - Fix Version/s: 0.6.0 (was: 0.7.0) Affects Version/s: (was: 0.6.0) Component/s: Query Processor More generic and efficient merge method --- Key: HIVE-1307 URL: https://issues.apache.org/jira/browse/HIVE-1307 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.6.0 Attachments: HIVE-1307.0.patch, HIVE-1307.2.patch, HIVE-1307.3.patch, HIVE-1307.3_java.patch, HIVE-1307.4.patch, HIVE-1307.5.patch, HIVE-1307.6.patch, HIVE-1307.7.patch, HIVE-1307.8.patch, HIVE-1307.9.patch, HIVE-1307.patch, HIVE-1307_2_branch_0.6.patch, HIVE-1307_branch_0.6.patch, HIVE-1307_java_only.patch Currently if hive.merge.mapfiles/mapredfiles=true, a new mapreduce job is created to read the input files and output to one reducer for merging. This MR job is created at compile time, one MR job per partition. In the dynamic partition case, multiple partitions could be created at execution time, so generating the merge MR job at compile time is impossible. We should generalize the merge framework to allow multiple partitions; most of the time a map-only job should be sufficient if we use CombineHiveInputFormat. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
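The execution-time decision this issue proposes can be sketched as a small selection step: after the job runs and the dynamic partitions are known, pick the partitions whose files are small and cover all of them with a single map-only merge. The threshold logic and names below are illustrative, not the actual merge framework:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Sketch: given the file sizes discovered per partition at execution time,
// select the partitions a single map-only merge job should cover.
public class DynamicMergeSketch {
    static List<String> partitionsToMerge(Map<String, long[]> filesByPartition,
                                          long avgSizeThreshold) {
        List<String> toMerge = new ArrayList<>();
        for (Map.Entry<String, long[]> e : filesByPartition.entrySet()) {
            long total = 0;
            for (long s : e.getValue()) total += s;
            long avg = total / e.getValue().length;
            // Merge only partitions with multiple small files; all selected
            // partitions are handled by one map-only job, not one job each.
            if (e.getValue().length > 1 && avg < avgSizeThreshold) {
                toMerge.add(e.getKey());
            }
        }
        Collections.sort(toMerge);
        return toMerge;
    }
}
```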
[jira] Updated: (HIVE-1543) set abort in ExecMapper when Hive's record reader got an IOException
[ https://issues.apache.org/jira/browse/HIVE-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1543: - Component/s: Query Processor set abort in ExecMapper when Hive's record reader got an IOException Key: HIVE-1543 URL: https://issues.apache.org/jira/browse/HIVE-1543 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.6.0 Attachments: HIVE-1543.1.patch, HIVE-1543.2_branch0.6.patch, HIVE-1543.patch, HIVE-1543_branch0.6.patch When the RecordReader gets an IOException, ExecMapper does not know about it and will close the operators as if there were no error. We should catch this exception and avoid writing partial results to HDFS, which would be removed later anyway. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
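The intent of the fix can be sketched as an abort flag: if the reader throws mid-stream, remember the failure so close() skips the normal commit path instead of flushing partial output. The RecordReader interface and class names here are simplified stand-ins, not the real ExecMapper code:

```java
import java.io.IOException;

// Sketch: record a read failure so close() drops partial results.
public class AbortOnReadErrorSketch {
    interface RecordReader {
        String next() throws IOException; // returns null at end of input
    }

    private boolean abort = false;
    private boolean committed = false;

    public void run(RecordReader reader) {
        try {
            for (String rec = reader.next(); rec != null; rec = reader.next()) {
                // process one record here
            }
        } catch (IOException e) {
            abort = true; // remember the read failure for close()
        }
        close();
    }

    private void close() {
        // Without the flag, a read error would still reach the commit path
        // and write partial results; with it, they are dropped.
        if (!abort) {
            committed = true; // stand-in for flushing/committing output
        }
    }

    public boolean aborted() { return abort; }
    public boolean committed() { return committed; }
}
```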
[jira] Updated: (HIVE-1531) Make Hive build work with Ivy versions < 2.1.0
[ https://issues.apache.org/jira/browse/HIVE-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1531: - Fix Version/s: (was: 0.7.0) Make Hive build work with Ivy versions < 2.1.0 -- Key: HIVE-1531 URL: https://issues.apache.org/jira/browse/HIVE-1531 Project: Hadoop Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.6.0 Attachments: HIVE-1531.patch.txt Many projects in the Hadoop ecosystem still use Ivy 2.0.0 (including Hadoop and Pig), yet Hive requires version 2.1.0. Ordinarily this would not be a problem, but many users have a copy of an older version of Ivy in their $ANT_HOME directory, and this copy will always get picked up in preference to what the Hive build downloads for itself. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1411) DataNucleus throws NucleusException if core-3.1.1 JAR appears more than once on CLASSPATH
[ https://issues.apache.org/jira/browse/HIVE-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1411: - Fix Version/s: (was: 0.7.0) DataNucleus throws NucleusException if core-3.1.1 JAR appears more than once on CLASSPATH - Key: HIVE-1411 URL: https://issues.apache.org/jira/browse/HIVE-1411 Project: Hadoop Hive Issue Type: Bug Components: Metastore Affects Versions: 0.4.0, 0.4.1, 0.5.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.6.0 Attachments: HIVE-1411.patch.txt DataNucleus barfs when the core-3.1.1 JAR file appears more than once on the CLASSPATH: {code} 2010-03-06 12:33:25,565 ERROR exec.DDLTask (SessionState.java:printError(279)) - FAILED: Error in metadata: javax.jdo.JDOFatalInternalException: Unexpected exception caught. NestedThrowables: java.lang.reflect.InvocationTargetException org.apache.hadoop.hive.ql.metadata.HiveException: javax.jdo.JDOFatalInternalException: Unexpected exception caught. NestedThrowables: java.lang.reflect.InvocationTargetException at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:258) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:879) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:103) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:379) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:285) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) Caused by: javax.jdo.JDOFatalInternalException: Unexpected exception caught. 
NestedThrowables: java.lang.reflect.InvocationTargetException at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1186) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:803) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:698) at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:164) at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:181) at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:125) at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:104) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:130) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:146) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:118) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:100) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:74) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:783) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:794) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:252) ... 
12 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at javax.jdo.JDOHelper$16.run(JDOHelper.java:1956) at java.security.AccessController.doPrivileged(Native Method) at javax.jdo.JDOHelper.invoke(JDOHelper.java:1951) at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1159) ... 28 more Caused by: org.datanucleus.exceptions.NucleusException: Plugin (Bundle) org.eclipse.jdt.core is already registered. Ensure you do not have multiple JAR versions of the same plugin in the classpath. The URL file:/Users/hadop/hadoop-0.20.1+152/build/ivy/lib/Hadoop/common/core-3.1.1.jar is already registered, and you are trying to register an identical plugin located at URL file:/Users/hadop/hadoop-0.20.1+152/lib/core-3.1.1.jar. at
[jira] Updated: (HIVE-1492) FileSinkOperator should remove duplicated files from the same task based on file sizes
[ https://issues.apache.org/jira/browse/HIVE-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1492: - Fix Version/s: (was: 0.7.0) Affects Version/s: (was: 0.7.0) Component/s: Query Processor FileSinkOperator should remove duplicated files from the same task based on file sizes -- Key: HIVE-1492 URL: https://issues.apache.org/jira/browse/HIVE-1492 Project: Hadoop Hive Issue Type: Bug Components: Query Processor Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.6.0 Attachments: HIVE-1492.patch, HIVE-1492_branch-0.6.patch FileSinkOperator.jobClose() calls Utilities.removeTempOrDuplicateFiles() to retain only one file for each task. A task could produce multiple files due to failed attempts or speculative runs. The largest file should be retained rather than the first file for each task. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
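The retention rule above (group a task's duplicate files, keep the largest instead of the first seen) can be sketched with plain collections. The file-name format and the taskId extraction below are simplified stand-ins for what Utilities.removeTempOrDuplicateFiles() works with:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: for each task, retain only its largest output file.
public class KeepLargestSketch {
    // files: file name (e.g. "000001_0" = task 000001, attempt 0) -> size in bytes
    static List<String> retain(Map<String, Long> files) {
        Map<String, String> bestByTask = new HashMap<>();
        for (Map.Entry<String, Long> e : files.entrySet()) {
            String taskId = e.getKey().split("_")[0];
            String cur = bestByTask.get(taskId);
            // Keep the larger file, not merely the first one encountered.
            if (cur == null || files.get(cur) < e.getValue()) {
                bestByTask.put(taskId, e.getKey());
            }
        }
        List<String> kept = new ArrayList<>(bestByTask.values());
        Collections.sort(kept);
        return kept;
    }
}
```

With files {000001_0: 100, 000001_1: 500, 000002_0: 300}, the 500-byte speculative output wins over the 100-byte first attempt.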
[jira] Commented: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12903748#action_12903748 ] Carl Steinbach commented on HIVE-1016: -- Yes, I'm working on it. I'll have a patch ready for review by Monday. (Reassigned this back to myself). Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. * Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
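The proposed compile_init/exec_init split can be sketched as follows. The base class, method names, and the Map standing in for JobConf are hypothetical, taken from the proposal in this issue rather than the current GenericUDF API; "mapred.cache.localFiles" is used purely as an example key:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of two-phase UDF initialization: compile_init runs at compile time
// (no job context); exec_init runs at operator initialization with the
// runtime configuration, which is what DistributedCache access needs.
public class UdfInitSketch {
    static abstract class TwoPhaseUdf {
        void compile_init() {}                      // compile-time setup
        void exec_init(Map<String, String> conf) {} // runtime setup with the job conf
    }

    static class CacheAwareUdf extends TwoPhaseUdf {
        String localCacheFiles;

        @Override
        void exec_init(Map<String, String> conf) {
            // With the runtime conf in hand, a real UDF could call the static
            // DistributedCache methods that require a JobConf parameter.
            localCacheFiles = conf.get("mapred.cache.localFiles");
        }
    }

    public static void main(String[] args) {
        CacheAwareUdf udf = new CacheAwareUdf();
        Map<String, String> conf = new HashMap<>();
        conf.put("mapred.cache.localFiles", "/local/cache/lookup.dat");
        udf.exec_init(conf); // Operator initialization would invoke this at runtime
        System.out.println(udf.localCacheFiles);
    }
}
```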
[jira] Assigned: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reassigned HIVE-1016: Assignee: Carl Steinbach (was: Namit Jain) Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. * Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12903082#action_12903082 ] Carl Steinbach commented on HIVE-675: - Cited the wrong lines from the patch. Here are the correct ones: {noformat} ... diff --git metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStoreRemote.java metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStoreRemote.java deleted file mode 100644 index bc950b9..000 --- metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStoreRemote.java +++ /dev/null ... {noformat} add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-1593) udtf_explode.q is an empty file
[ https://issues.apache.org/jira/browse/HIVE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reassigned HIVE-1593: Assignee: Carl Steinbach (was: Paul Yang) udtf_explode.q is an empty file --- Key: HIVE-1593 URL: https://issues.apache.org/jira/browse/HIVE-1593 Project: Hadoop Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.6.0 Reporter: John Sichi Assignee: Carl Steinbach Priority: Minor Fix For: 0.7.0 Attachments: HIVE-1593.1.patch.txt jsichi-mac:clientpositive jsichi$ pwd /Users/jsichi/open/hive-trunk/ql/src/test/queries/clientpositive jsichi-mac:clientpositive jsichi$ cat udtf_explode.q jsichi-mac:clientpositive jsichi$ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1593) udtf_explode.q is an empty file
[ https://issues.apache.org/jira/browse/HIVE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1593: - Status: Patch Available (was: Open) * Reverted the contents of udtf_explode.q to its pre-HIVE-1031 state. * Updated udtf_explode.q.out udtf_explode.q is an empty file --- Key: HIVE-1593 URL: https://issues.apache.org/jira/browse/HIVE-1593 Project: Hadoop Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.6.0 Reporter: John Sichi Assignee: Carl Steinbach Priority: Minor Fix For: 0.7.0 Attachments: HIVE-1593.1.patch.txt jsichi-mac:clientpositive jsichi$ pwd /Users/jsichi/open/hive-trunk/ql/src/test/queries/clientpositive jsichi-mac:clientpositive jsichi$ cat udtf_explode.q jsichi-mac:clientpositive jsichi$ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Attachment: HIVE-675.13.patch.txt add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Status: Patch Available (was: Open) @Namit: Looks like I was missing some tab characters in the test outputs. I verified that all tests pass with HIVE-675.13.patch.txt add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1369) LazySimpleSerDe should be able to read classes that support some form of toString()
[ https://issues.apache.org/jira/browse/HIVE-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902589#action_12902589 ] Carl Steinbach commented on HIVE-1369: -- @Namit: Will do. LazySimpleSerDe should be able to read classes that support some form of toString() --- Key: HIVE-1369 URL: https://issues.apache.org/jira/browse/HIVE-1369 Project: Hadoop Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.5.0 Reporter: Alex Kozlov Assignee: Alex Kozlov Priority: Minor Fix For: 0.7.0 Attachments: HIVE-1369.patch, HIVE-1369.svn.patch Original Estimate: 2h Remaining Estimate: 2h Currently LazySimpleSerDe is able to deserialize only BytesWritable or Text objects. It should be pretty easy to extend the class to read any object that implements toString() method. Ideas or concerns? Alex K -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902636#action_12902636 ] Carl Steinbach commented on HIVE-675: - That's correct. I will start work on the backport patch after this gets committed to trunk. add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902041#action_12902041 ] Carl Steinbach commented on HIVE-675: - Namit, will do. add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Attachment: HIVE-675.12.patch.txt add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1589) Add HBase/ZK JARs to Eclipse classpath
Add HBase/ZK JARs to Eclipse classpath -- Key: HIVE-1589 URL: https://issues.apache.org/jira/browse/HIVE-1589 Project: Hadoop Hive Issue Type: Bug Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach The eclipse configuration was broken by the addition of HBase and ZK JARs in HIVE-1293. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1589) Add HBase/ZK JARs to Eclipse classpath
[ https://issues.apache.org/jira/browse/HIVE-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1589: - Attachment: HIVE-1589.1.patch.txt Add HBase/ZK JARs to Eclipse classpath -- Key: HIVE-1589 URL: https://issues.apache.org/jira/browse/HIVE-1589 Project: Hadoop Hive Issue Type: Bug Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1589.1.patch.txt The eclipse configuration was broken by the addition of HBase and ZK JARs in HIVE-1293. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1589) Add HBase/ZK JARs to Eclipse classpath
[ https://issues.apache.org/jira/browse/HIVE-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1589: - Status: Patch Available (was: Open) Add HBase/ZK JARs to Eclipse classpath -- Key: HIVE-1589 URL: https://issues.apache.org/jira/browse/HIVE-1589 Project: Hadoop Hive Issue Type: Bug Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1589.1.patch.txt The eclipse configuration was broken by the addition of HBase and ZK JARs in HIVE-1293. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1590) DDLTask.dropDatabase() needs to populate inputs/outputs
DDLTask.dropDatabase() needs to populate inputs/outputs --- Key: HIVE-1590 URL: https://issues.apache.org/jira/browse/HIVE-1590 Project: Hadoop Hive Issue Type: Bug Components: Metastore, Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach From Namit: bq. Also, inputs and outputs need to be populated for 'drop database ..' It should consist of all the tables/partitions in that database. Note that for the time being this does not make a difference since we don't allow you to drop a database that contains tables. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
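The entity population Namit asks for can be sketched as a simple expansion: DROP DATABASE lists the database plus every contained table in the task's outputs, so hooks and authorization checks see everything the command would remove. The string entities below are stand-ins for Hive's WriteEntity objects:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: build the output-entity list for DROP DATABASE.
public class DropDatabaseOutputsSketch {
    static List<String> outputsFor(String db, List<String> tablesInDb) {
        List<String> outputs = new ArrayList<>();
        outputs.add("database:" + db); // the database itself
        for (String table : tablesInDb) {
            // Each table in the dropped database (a fuller sketch would also
            // add each table's partitions).
            outputs.add("table:" + db + "." + table);
        }
        return outputs;
    }
}
```

Note that, as the issue says, this is a no-op today because a database containing tables cannot be dropped, so tablesInDb would currently always be empty.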
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-675:
--------------------------------
Status: Patch Available (was: Open)

add database/schema support Hive QL
-----------------------------------
Key: HIVE-675
URL: https://issues.apache.org/jira/browse/HIVE-675
Project: Hadoop Hive
Issue Type: New Feature
Components: Metastore, Query Processor
Reporter: Prasad Chakka
Assignee: Carl Steinbach
Fix For: 0.6.0, 0.7.0
Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt

Currently all Hive tables reside in a single namespace (default). Hive should support multiple namespaces (databases or schemas) so that users can create tables in their own namespaces. These namespaces can have different warehouse directories (with a default naming scheme) and possibly different properties. The metastore already has some support for this, but the Hive query parser should support it as well.
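The "default naming scheme" for per-database warehouse directories mentioned above can be sketched as below. This is an assumption-laden illustration, not Hive's exact implementation: the `.db` suffix and the layout (default-database tables directly under the warehouse root, other databases in their own subdirectory) are modeled here for clarity.

```java
// Illustrative sketch of a default warehouse-directory naming scheme for
// databases; the ".db" suffix and layout are assumptions for illustration.
public class WarehousePaths {
    static final String DEFAULT_DB = "default";

    // Tables in the default database live directly under the warehouse
    // root; every other database gets its own "<name>.db" subdirectory.
    static String databasePath(String warehouseRoot, String dbName) {
        if (DEFAULT_DB.equals(dbName)) {
            return warehouseRoot;
        }
        return warehouseRoot + "/" + dbName + ".db";
    }

    static String tablePath(String warehouseRoot, String dbName, String tableName) {
        return databasePath(warehouseRoot, dbName) + "/" + tableName;
    }

    public static void main(String[] args) {
        System.out.println(tablePath("/user/hive/warehouse", "default", "src"));
        // /user/hive/warehouse/src
        System.out.println(tablePath("/user/hive/warehouse", "mydb", "src"));
        // /user/hive/warehouse/mydb.db/src
    }
}
```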
[jira] Updated: (HIVE-1211) Tapping logs from child processes
[ https://issues.apache.org/jira/browse/HIVE-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1211:
---------------------------------
Attachment: HIVE-1211.3.patch.txt

Tapping logs from child processes
---------------------------------
Key: HIVE-1211
URL: https://issues.apache.org/jira/browse/HIVE-1211
Project: Hadoop Hive
Issue Type: Improvement
Components: Logging
Reporter: bc Wong
Assignee: bc Wong
Fix For: 0.7.0
Attachments: HIVE-1211-2.patch, HIVE-1211.1.patch, HIVE-1211.3.patch.txt

Stdout/stderr from child processes (e.g. {{MapRedTask}}) are redirected to the parent's stdout/stderr. There is little one can do to sort out which log is from which query.
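The attribution problem described above can be sketched as follows: tag each line of a child's output with a query identifier as it is pumped to the parent's log. This is an illustrative sketch, not the actual plumbing in the HIVE-1211 patches; the `pump` helper and `query_42` id are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Illustrative sketch: prefix each line of a child process's output with a
// query id so interleaved logs can be attributed to the right query.
public class ChildLogTapper {
    // Read the stream line by line, writing "[queryId] line" to the sink.
    static void pump(InputStream in, String queryId, Appendable sink) throws IOException {
        BufferedReader r = new BufferedReader(new InputStreamReader(in));
        String line;
        while ((line = r.readLine()) != null) {
            sink.append("[").append(queryId).append("] ").append(line).append("\n");
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a MapRedTask child process.
        Process child = new ProcessBuilder("echo", "hello").start();
        Thread tapper = new Thread(() -> {
            try {
                pump(child.getInputStream(), "query_42", System.out);
            } catch (IOException ignored) { }
        });
        tapper.start();
        tapper.join();
        child.waitFor();
        // prints: [query_42] hello
    }
}
```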
[jira] Commented: (HIVE-1211) Tapping logs from child processes
[ https://issues.apache.org/jira/browse/HIVE-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902128#action_12902128 ]

Carl Steinbach commented on HIVE-1211:
--------------------------------------
Rebased the patch to trunk. +1

Tapping logs from child processes
---------------------------------
Key: HIVE-1211
URL: https://issues.apache.org/jira/browse/HIVE-1211
Project: Hadoop Hive
Issue Type: Improvement
Components: Logging
Reporter: bc Wong
Assignee: bc Wong
Fix For: 0.7.0
Attachments: HIVE-1211-2.patch, HIVE-1211.1.patch, HIVE-1211.3.patch.txt

Stdout/stderr from child processes (e.g. {{MapRedTask}}) are redirected to the parent's stdout/stderr. There is little one can do to sort out which log is from which query.
[jira] Updated: (HIVE-1211) Tapping logs from child processes
[ https://issues.apache.org/jira/browse/HIVE-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1211:
---------------------------------
Attachment: HIVE-1211.4.patch.txt

Tapping logs from child processes
---------------------------------
Key: HIVE-1211
URL: https://issues.apache.org/jira/browse/HIVE-1211
Project: Hadoop Hive
Issue Type: Improvement
Components: Logging
Reporter: bc Wong
Assignee: bc Wong
Fix For: 0.7.0
Attachments: HIVE-1211-2.patch, HIVE-1211.1.patch, HIVE-1211.3.patch.txt, HIVE-1211.4.patch.txt

Stdout/stderr from child processes (e.g. {{MapRedTask}}) are redirected to the parent's stdout/stderr. There is little one can do to sort out which log is from which query.
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902208#action_12902208 ]

Carl Steinbach commented on HIVE-1157:
--------------------------------------
Hi Philip, please rebase the patch and I will take a look. Thanks.

UDFs can't be loaded via add jar when jar is on HDFS
----------------------------------------------------
Key: HIVE-1157
URL: https://issues.apache.org/jira/browse/HIVE-1157
Project: Hadoop Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Philip Zeyliger
Priority: Minor
Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, HIVE-1157.v2.patch.txt, output.txt

As discussed on the mailing list, it would be nice if you could use UDFs from jars that are on HDFS. The proposed implementation would be for add jar to recognize that the target file is on HDFS, copy it locally, and load it into the classpath.

{quote}
Hi folks,

I have a quick question about UDF support in Hive. I'm on the 0.5 branch. Can you use a UDF where the jar which contains the function is on HDFS, and not on the local filesystem? Specifically, the following does not seem to work:

# This is Hive 0.5, from svn
$ bin/hive
Hive history file=/tmp/philip/hive_job_log_philip_201002081541_370227273.txt
hive> add jar hdfs://localhost/FooTest.jar;
Added hdfs://localhost/FooTest.jar to class path
hive> create temporary function cube as 'com.cloudera.FooTestUDF';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

Does this work for other people? I could probably fix it by changing add jar to download remote jars locally when necessary (to load them into the classpath), or update URLClassLoader (or whatever is underneath there) to read directly from HDFS, which seems a bit more fragile. But I wanted to make sure that my interpretation of what's going on is right before I have at it.

Thanks,
-- Philip
{quote}

{quote}
Yes that's correct. I prefer to download the jars in add jar.

Zheng
{quote}

--
This message is automatically generated by JIRA.
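The preferred approach (download remote jars locally before loading them) can be sketched as follows. This is an illustrative simplification under stated assumptions: the `localize` helper is hypothetical, and the HDFS download is faked with a plain file copy — a real implementation would use Hadoop's FileSystem API to copy the jar before handing the local path to a URLClassLoader.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of the proposed 'add jar' fix: if the jar URI is non-local, copy
// it to a local temp file first, then load the local copy. (Illustrative;
// the remote fetch is faked with a local file copy standing in for HDFS.)
public class AddJarHelper {
    // No scheme or "file" means the jar is already on the local filesystem.
    static boolean isLocal(URI uri) {
        String scheme = uri.getScheme();
        return scheme == null || scheme.equals("file");
    }

    // Return a local path for the jar, "downloading" it first if needed.
    static Path localize(URI jarUri, Path remoteStandIn) throws IOException {
        if (isLocal(jarUri)) {
            return Path.of(jarUri.getPath());
        }
        Path local = Files.createTempFile("addjar-", ".jar");
        // Real code would copy from HDFS here instead of a local file.
        Files.copy(remoteStandIn, local, StandardCopyOption.REPLACE_EXISTING);
        return local; // this path can now go onto a URLClassLoader
    }

    public static void main(String[] args) throws IOException {
        Path fakeRemoteJar = Files.createTempFile("FooTest", ".jar");
        Path local = localize(URI.create("hdfs://localhost/FooTest.jar"), fakeRemoteJar);
        System.out.println(Files.exists(local)); // true
    }
}
```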