[jira] [Commented] (HIVE-2961) Remove need for storage descriptors for view partitions
[ https://issues.apache.org/jira/browse/HIVE-2961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257330#comment-13257330 ]

Kevin Wilfong commented on HIVE-2961:
-------------------------------------

We can revert those two patches instead. This patch does fix a null pointer exception that occurs for DESCRIBE FORMATTED on a view partition, but I can submit that in a separate patch.

> Remove need for storage descriptors for view partitions
> -------------------------------------------------------
>
> Key: HIVE-2961
> URL: https://issues.apache.org/jira/browse/HIVE-2961
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 0.9.0
> Reporter: Kevin Wilfong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2961.D2877.1.patch
>
> Storage descriptors were introduced for view partitions as part of HIVE-2795. This was to allow view partitions to have the concept of a region, as well as to fix an NPE that resulted from calling DESCRIBE FORMATTED on them. Since regions are no longer necessary for view partitions, and the NPE can be fixed by not displaying storage information for view partitions (or by displaying the view's storage information if this is preferred, although, since a view partition is purely metadata, this does not seem necessary), the storage descriptors are no longer needed.
> This also means the Python script that retroactively adds storage descriptors to existing view partitions can be removed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
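The null-guard fix described here can be sketched as follows; the class and method names are hypothetical stand-ins for illustration, not Hive's actual DESCRIBE FORMATTED implementation:

```java
// Hypothetical sketch of the NPE fix: when formatting partition metadata,
// emit nothing for the storage section if the partition is a view partition
// (which, being pure metadata, has no storage descriptor).
class DescribeFormatter {
    static String formatStorage(Object storageDescriptor, boolean isViewPartition) {
        if (isViewPartition || storageDescriptor == null) {
            return "";  // skip the storage section instead of dereferencing null
        }
        return storageDescriptor.toString();
    }
}
```

The alternative mentioned in the description (printing the view's own storage information) would simply substitute the view's descriptor before the null check.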
[jira] [Commented] (HIVE-2960) Stop testing concat of partitions containing control characters.
[ https://issues.apache.org/jira/browse/HIVE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256078#comment-13256078 ]

Kevin Wilfong commented on HIVE-2960:
-------------------------------------

Attached escape2.q.out because the diff, for some reason, thought it was a binary file.

> Stop testing concat of partitions containing control characters.
> ----------------------------------------------------------------
>
> Key: HIVE-2960
> URL: https://issues.apache.org/jira/browse/HIVE-2960
> Project: Hive
> Issue Type: Test
> Reporter: Kevin Wilfong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2960.D2865.1.patch, escape2.q.out
>
> We have been, for a short while, testing that concatenation commands work with partitions that contain ASCII control characters. This happened to work up until recently due to a happy coincidence in the way the Hive object's HiveConf was updated: it was updated often enough that it picked up configs set by the user, but not so often that it picked up the value of hive.query.string. With some recent changes it now needs to be updated more often, see https://issues.apache.org/jira/browse/HIVE-2918
> This breaks the process of launching a job to merge partitions that contain ASCII control characters. The job conf is constructed from the updated Hive conf, which contains the value of hive.query.string, which in turn contains ASCII control characters. When the job conf is serialized to XML it fails, because those characters are illegal in XML.
> Given that any query containing ASCII control characters failed even prior to this change, and hence these partitions cannot be queried directly, it seems reasonable to no longer support concatenating them either (which this change allows).
[jira] [Commented] (HIVE-2931) conf settings may be ignored
[ https://issues.apache.org/jira/browse/HIVE-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253174#comment-13253174 ]

Kevin Wilfong commented on HIVE-2931:
-------------------------------------

I'd prefer to use your patch because, as you said, it is a nearly identical fix, and it has test cases.

> conf settings may be ignored
> ----------------------------
>
> Key: HIVE-2931
> URL: https://issues.apache.org/jira/browse/HIVE-2931
> Project: Hive
> Issue Type: Bug
> Reporter: Namit Jain
> Assignee: Kevin Wilfong
> Attachments: HIVE-2931.D2781.1.patch, hive.2931.1.patch
>
> This is a pretty serious problem. If a conf variable is changed, Hive may not pick up the variable unless the metastore variables are changed. When any session variables are changed, it might be simpler to update the corresponding Hive conf.
[jira] [Commented] (HIVE-2918) Hive Dynamic Partition Insert - move task not considering 'hive.exec.max.dynamic.partitions' from CLI
[ https://issues.apache.org/jira/browse/HIVE-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253800#comment-13253800 ]

Kevin Wilfong commented on HIVE-2918:
-------------------------------------

I created a task to resolve the test failures here: https://issues.apache.org/jira/browse/HIVE-2952

> Hive Dynamic Partition Insert - move task not considering 'hive.exec.max.dynamic.partitions' from CLI
> -----------------------------------------------------------------------------------------------------
>
> Key: HIVE-2918
> URL: https://issues.apache.org/jira/browse/HIVE-2918
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.7.1, 0.8.0, 0.8.1
> Environment: Cent OS 64 bit
> Reporter: Bejoy KS
> Assignee: Carl Steinbach
> Attachments: HIVE-2918.D2703.1.patch
>
> Dynamic partition insert shows an error about the number of partitions created even after the default value of 'hive.exec.max.dynamic.partitions' is bumped up to 2000.
> Error Message: Failed with exception Number of dynamic partitions created is 1413, which is more than 1000. To solve this try to set hive.exec.max.dynamic.partitions to at least 1413.
>
> These are the properties set on the Hive CLI:
>
> hive> set hive.exec.dynamic.partition=true;
> hive> set hive.exec.dynamic.partition.mode=nonstrict;
> hive> set hive.exec.max.dynamic.partitions=2000;
> hive> set hive.exec.max.dynamic.partitions.pernode=2000;
>
> This is the query with console error log:
>
> hive> INSERT OVERWRITE TABLE partn_dyn Partition (pobox) SELECT country,state,pobox FROM non_partn_dyn;
> Total MapReduce jobs = 2
> Launching Job 1 out of 2
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201204021529_0002, Tracking URL = http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201204021529_0002
> Kill Command = /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=0.0.0.0:8021 -kill job_201204021529_0002
> 2012-04-02 16:05:28,619 Stage-1 map = 0%, reduce = 0%
> 2012-04-02 16:05:39,701 Stage-1 map = 100%, reduce = 0%
> 2012-04-02 16:05:50,800 Stage-1 map = 100%, reduce = 100%
> Ended Job = job_201204021529_0002
> Ended Job = 248865587, job is filtered out (removed at runtime).
> Moving data to: hdfs://0.0.0.0/tmp/hive-cloudera/hive_2012-04-02_16-05-24_919_5976014408587784412/-ext-1
> Loading data to table default.partn_dyn partition (pobox=null)
> Failed with exception Number of dynamic partitions created is 1413, which is more than 1000. To solve this try to set hive.exec.max.dynamic.partitions to at least 1413.
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
>
> I checked the job.xml of the first map-only job; the value hive.exec.max.dynamic.partitions=2000 is reflected there, but the move task takes the default value from hive-site.xml. If I change the value in hive-site.xml, the job completes successfully. Bottom line: the property 'hive.exec.max.dynamic.partitions' set on the CLI is not being considered by the move task.
[jira] [Commented] (HIVE-2952) escape1.q and escape2.q failing in trunk
[ https://issues.apache.org/jira/browse/HIVE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253803#comment-13253803 ]

Kevin Wilfong commented on HIVE-2952:
-------------------------------------

I must have been mistaken about the date; I reverted to a revision before 4/11 and I still see the tests failing. Either that, or there is something persistent causing the issue.

> escape1.q and escape2.q failing in trunk
> ----------------------------------------
>
> Key: HIVE-2952
> URL: https://issues.apache.org/jira/browse/HIVE-2952
> Project: Hive
> Issue Type: Test
> Environment: Mac OSX Lion
> Reporter: Kevin Wilfong
> Priority: Critical
>
> escape1.q and escape2.q have started failing, at least on Mac OS, but they succeed on Linux. The last time I saw them succeed on Mac was on the night of 4/11.
[jira] [Commented] (HIVE-2952) escape1.q and escape2.q failing in trunk
[ https://issues.apache.org/jira/browse/HIVE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253904#comment-13253904 ]

Kevin Wilfong commented on HIVE-2952:
-------------------------------------

For me, it fails while trying to process the partition part=a with the exception:

[junit] org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: pfile:/Users/kevinwilfong/Documents/old_hive/build/ql/scratchdir/hive_2012-04-13_17-01-55_592_712696764807580029/_task_tmp.-ext-10002/part=a/_tmp.00_0 to: pfile:/Users/kevinwilfong/Documents/old_hive/build/ql/scratchdir/hive_2012-04-13_17-01-55_592_712696764807580029/_tmp.-ext-10002/part=a/00_0
[junit]     at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:202)
[junit]     at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$300(FileSinkOperator.java:99)
[junit]     at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:720)
[junit]     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
[junit]     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
[junit]     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
[junit]     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
[junit]     at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
[junit]     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
[junit]     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
[junit]     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
[junit]     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
[junit] Caused by: java.io.FileNotFoundException: File file:/Users/kevinwilfong/Documents/old_hive/build/ql/scratchdir/hive_2012-04-13_17-01-55_592_712696764807580029/_task_tmp.-ext-10002/part=a/_tmp.00_0 does not exist.
[junit]     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
[junit]     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
[junit]     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
[junit]     at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:253)
[junit]     at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:406)
[junit]     at org.apache.hadoop.fs.FilterFileSystem.rename(FilterFileSystem.java:138)
[junit]     at org.apache.hadoop.fs.ProxyFileSystem.rename(ProxyFileSystem.java:159)
[junit]     at org.apache.hadoop.fs.FilterFileSystem.rename(FilterFileSystem.java:138)
[junit]     at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:196)
[junit]     ... 11 more

part=a is the first lower-case letter to be processed, and all the capital letters are processed before it. FWIW, I noticed that Mac is not case-sensitive for directory names, but Linux is, so this may be a clue.

> escape1.q and escape2.q failing in trunk
> Key: HIVE-2952
> URL: https://issues.apache.org/jira/browse/HIVE-2952
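The case-sensitivity suspicion at the end of the comment is easy to probe. This sketch (not part of any patch; the directory name mirrors the failing partition) creates a lower-case directory and asks the filesystem whether its upper-case twin exists; on a default Mac HFS+ volume it does, while on Linux ext4 it does not:

```java
import java.io.File;

// Probe whether a filesystem treats "part=a" and "part=A" as the same path,
// which would explain the rename collision in the stack trace above.
class CaseSensitivityProbe {
    static boolean isCaseInsensitive(File dir) {
        File lower = new File(dir, "part=a");
        lower.mkdir();
        try {
            // On a case-insensitive filesystem this lookup resolves to "part=a".
            return new File(dir, "part=A").exists();
        } finally {
            lower.delete();
        }
    }
}
```

If this returns true, the second directory creation for part=A silently reuses part=a's directory, so the later per-partition rename finds files missing, matching the FileNotFoundException above.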
[jira] [Commented] (HIVE-2918) Hive Dynamic Partition Insert - move task not considering 'hive.exec.max.dynamic.partitions' from CLI
[ https://issues.apache.org/jira/browse/HIVE-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253953#comment-13253953 ]

Kevin Wilfong commented on HIVE-2918:
-------------------------------------

@Carl The issue with escape2.q and this patch is that now, when the BlockMergeTask runs, the conf it gets from the Hive object has the value of hive.query.string populated, since the conf in the Hive object is updated more frequently than before this patch. The query string for many of the concatenate commands in that test includes characters which are illegal in XML 1.0, which Hadoop tries to produce from the conf when a job is submitted. This is an open issue in Hadoop: https://issues.apache.org/jira/browse/HADOOP-7542

There are a couple of ways we could deal with this issue:

1) Sanitize the query string wherever we set it (Driver's execute method and SessionState's setCmd method). This may have the added benefit of allowing users to execute queries (not just DDL commands) involving such characters. It could, however, end up escaping characters which were not escaped before and do not need to be, depending on how we handle the sanitization (this would happen, for example, if we used the Apache Commons library's Java escape method).

2) Sanitize it, or remove it from the job conf, in the BlockMergeTask. The only two places we could run into this issue are the BlockMergeTask and MapRedTask. We were already running into this issue in MapRedTask, and were only avoiding it in the BlockMergeTask (it appears) by luck, or because somebody intentionally used the conf from the Hive object there rather than the one in the BlockMergeTask.

> Hive Dynamic Partition Insert - move task not considering 'hive.exec.max.dynamic.partitions' from CLI
> Key: HIVE-2918
> URL: https://issues.apache.org/jira/browse/HIVE-2918
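Option (1), sanitizing the query string, might look like the sketch below. The legal-character ranges are the XML 1.0 `Char` production; the method names and the '?' replacement are illustrative, not what Hive ultimately committed, and this conservative per-char check also replaces surrogate halves rather than handling supplementary-plane pairs:

```java
// Sketch of option (1): strip characters illegal in XML 1.0 from the query
// string before it reaches the job conf (which Hadoop serializes as job.xml).
class QueryStringSanitizer {
    // Legal XML 1.0 characters: #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD.
    static boolean isLegalXml10(char c) {
        return c == 0x9 || c == 0xA || c == 0xD
            || (c >= 0x20 && c <= 0xD7FF)
            || (c >= 0xE000 && c <= 0xFFFD);
    }

    static String sanitize(String query) {
        StringBuilder sb = new StringBuilder(query.length());
        for (char c : query.toCharArray()) {
            sb.append(isLegalXml10(c) ? c : '?');  // replacement char is arbitrary
        }
        return sb.toString();
    }
}
```

ASCII control characters other than tab, LF, and CR fall outside every legal range, which is exactly why job.xml serialization fails for these partition names.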
[jira] [Commented] (HIVE-2923) testAclPositive in TestZooKeeperTokenStore failing in clean checkout when run on Mac
[ https://issues.apache.org/jira/browse/HIVE-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252521#comment-13252521 ]

Kevin Wilfong commented on HIVE-2923:
-------------------------------------

Thanks Thomas, that fixed the issue.

> testAclPositive in TestZooKeeperTokenStore failing in clean checkout when run on Mac
> ------------------------------------------------------------------------------------
>
> Key: HIVE-2923
> URL: https://issues.apache.org/jira/browse/HIVE-2923
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.9.0
> Environment: Mac OSX Lion
> Reporter: Kevin Wilfong
> Assignee: Thomas Weise
> Priority: Blocker
> Fix For: 0.9.0
> Attachments: HIVE-2923.patch
>
> When running testAclPositive in TestZooKeeperTokenStore in a clean checkout, it fails with the error:
>
> Failed to validate token path.
> org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: Failed to validate token path.
>     at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.init(ZooKeeperTokenStore.java:207)
>     at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.setConf(ZooKeeperTokenStore.java:225)
>     at org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore.testAclPositive(TestZooKeeperTokenStore.java:170)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at junit.framework.TestCase.runTest(TestCase.java:168)
>     at junit.framework.TestCase.runBare(TestCase.java:134)
>     at junit.framework.TestResult$1.protect(TestResult.java:110)
>     at junit.framework.TestResult.runProtected(TestResult.java:128)
>     at junit.framework.TestResult.run(TestResult.java:113)
>     at junit.framework.TestCase.run(TestCase.java:124)
>     at junit.framework.TestSuite.runTest(TestSuite.java:232)
>     at junit.framework.TestSuite.run(TestSuite.java:227)
>     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
>     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
>     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /zktokenstore-testAcl
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>     at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:778)
>     at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:119)
>     at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.init(ZooKeeperTokenStore.java:204)
>     ... 17 more
>
> This message is also printed to standard out: Unable to load realm mapping info from SCDynamicStore
> The test seems to run fine in Linux, but more than one developer has reported this on a Mac.
[jira] [Commented] (HIVE-2937) TestHiveServerSessions hangs when executed directly
[ https://issues.apache.org/jira/browse/HIVE-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253073#comment-13253073 ]

Kevin Wilfong commented on HIVE-2937:
-------------------------------------

@Ashutosh I dug deeper into the issue. I verified that the second HiveServerHandler does exit the method, just via an exception rather than via the return. This prevents the HiveServerHandler from being created. The test actually gets stuck making the first execute call on the second HiveClient, presumably because there is no handler on the other end of the connection to handle the request, so it waits forever for a response.

I admit I am equally confused as to why this started showing up so frequently recently. I had seen this problem a few times before; I had assumed it was caused by running the tests twice on the same machine. Now, however, I've run into this problem every time I try to run tests since yesterday.

As far as I can tell, there is not, and never was, any logic to handle this problem. So Navis's diff seems worth committing to fix that issue, and it is at least a plus that it gets us around this issue, even if the root cause of why this issue suddenly became so prominent is still unsolved. I would like to commit this patch, unless you have any objections.

> TestHiveServerSessions hangs when executed directly
> ---------------------------------------------------
>
> Key: HIVE-2937
> URL: https://issues.apache.org/jira/browse/HIVE-2937
> Project: Hive
> Issue Type: Test
> Reporter: Navis
> Assignee: Navis
> Priority: Trivial
> Attachments: HIVE-2937.D2697.1.patch
>
> {code}
> ant test -Doffline=true -Dtestcase=TestHiveServerSessions
> {code}
> Hangs infinitely. I couldn't determine the exact cause of the problem, but found that adding 'new HiveServer.HiveServerHandler();' in setup() makes the test succeed.
[jira] [Commented] (HIVE-2937) TestHiveServerSessions hangs when executed directly
[ https://issues.apache.org/jira/browse/HIVE-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251808#comment-13251808 ]

Kevin Wilfong commented on HIVE-2937:
-------------------------------------

I ran into this issue too. It seems to be caused by creating the HiveClients too quickly. The initialization of the HiveClient initializes the HiveServerHandler, which initializes the HMSHandler. The initializations of the HMSHandlers happen in such quick succession that when the second call to create the default db occurs, the first call hasn't finished creating the db yet, so it attempts to create the same db and never gets out of this method (possibly a Derby issue). It could also be fixed by adding a Thread.sleep between creating HiveClients, but Navis's solution seems much more appropriate.

@Ashutosh I don't think it is related to either of those patches, as previous builds appear to fail for the same reason: https://builds.apache.org/job/Hive-trunk-h0.21/1355/ Also, both of those patches should have only affected map reduce jobs, not the metastore.

> TestHiveServerSessions hangs when executed directly
> Key: HIVE-2937
> URL: https://issues.apache.org/jira/browse/HIVE-2937
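The race described here, where two handlers both see the default db missing and both try to create it, is a classic check-then-create race. A minimal in-memory illustration of an idempotent initializer, using a hypothetical store rather than the real Derby/JDO path:

```java
import java.util.concurrent.ConcurrentHashMap;

// Toy illustration (not Hive's metastore code) of making concurrent
// "create default db" calls idempotent: the loser of the race simply
// observes that the db already exists instead of wedging on a duplicate create.
class DefaultDbInitializer {
    private final ConcurrentHashMap<String, Boolean> dbs = new ConcurrentHashMap<>();

    void ensureDefaultDb() {
        // putIfAbsent is atomic, so two racing callers cannot both insert.
        dbs.putIfAbsent("default", Boolean.TRUE);
    }

    boolean exists(String name) {
        return dbs.containsKey(name);
    }
}
```

In the real metastore the equivalent move is to catch the "already exists" error from the second create and treat it as success; Navis's patch instead avoids the race by forcing the first handler's initialization to complete in setup().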
[jira] [Commented] (HIVE-2907) Hive error when dropping a table with large number of partitions
[ https://issues.apache.org/jira/browse/HIVE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250791#comment-13250791 ]

Kevin Wilfong commented on HIVE-2907:
-------------------------------------

+1. Tests passed, will commit.

> Hive error when dropping a table with large number of partitions
> ----------------------------------------------------------------
>
> Key: HIVE-2907
> URL: https://issues.apache.org/jira/browse/HIVE-2907
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 0.9.0
> Environment: General. Hive Metastore bug.
> Reporter: Mousom Dhar Gupta
> Assignee: Mousom Dhar Gupta
> Priority: Minor
> Fix For: 0.9.0
> Attachments: HIVE-2907.1.patch.txt, HIVE-2907.2.patch.txt, HIVE-2907.3.patch.txt, HIVE-2907.D2505.1.patch, HIVE-2907.D2505.2.patch, HIVE-2907.D2505.3.patch, HIVE-2907.D2505.4.patch, HIVE-2907.D2505.5.patch, HIVE-2907.D2505.6.patch, HIVE-2907.D2505.7.patch
> Original Estimate: 10h
> Remaining Estimate: 10h
>
> Running into an Out Of Memory error when trying to drop a table with 128K partitions. The dropTable methods in metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java and ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java encounter out-of-memory errors when dropping tables with lots of partitions, because they try to load the metadata for every partition into memory.
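The natural mitigation for the OOM described above is to page through partitions in bounded batches instead of materializing all 128K at once. A sketch with hypothetical names (not the actual patch):

```java
import java.util.List;

// Illustrative batching loop: process partition names in fixed-size slices
// so that only one batch's metadata needs to be resident at a time.
class BatchedDropper {
    static int dropInBatches(List<String> partitionNames, int batchSize) {
        int batches = 0;
        for (int i = 0; i < partitionNames.size(); i += batchSize) {
            List<String> batch = partitionNames.subList(
                i, Math.min(i + batchSize, partitionNames.size()));
            // A real implementation would fetch and drop this batch's
            // partition metadata here, then release it before the next slice.
            batches++;
        }
        return batches;
    }
}
```

The peak memory footprint then scales with batchSize rather than with the total partition count.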
[jira] [Commented] (HIVE-2530) Implement SHOW TBLPROPERTIES
[ https://issues.apache.org/jira/browse/HIVE-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250843#comment-13250843 ]

Kevin Wilfong commented on HIVE-2530:
-------------------------------------

Ashutosh: I agree that the information can already be viewed via a describe formatted command; however, for tables with a large number of properties, or even a large number of columns or large column comments, it becomes difficult for users to quickly find the property they're looking for. This will help provide a clean view and ease readability.

> Implement SHOW TBLPROPERTIES
> ----------------------------
>
> Key: HIVE-2530
> URL: https://issues.apache.org/jira/browse/HIVE-2530
> Project: Hive
> Issue Type: New Feature
> Reporter: Adam Kramer
> Assignee: Lei Zhao
> Priority: Minor
> Attachments: HIVE-2530.D2589.1.patch, HIVE-2530.D2589.2.patch, HIVE-2530.D2589.3.patch
>
> Since table properties can be defined arbitrarily, they should be easy for a user to query from the command-line.
>
> SHOW TBLPROPERTIES tblname;
> ...would show all of them, one per row: key \t value
>
> SHOW TBLPROPERTIES tblname (FOOBAR);
> ...would show just the value for the FOOBAR tblproperty.
[jira] [Commented] (HIVE-2797) Make the IP address of a Thrift client available to HMSHandler.
[ https://issues.apache.org/jira/browse/HIVE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237754#comment-13237754 ]

Kevin Wilfong commented on HIVE-2797:
-------------------------------------

Looks like something my editor must have done automatically: adding or removing a newline at the end of files. Anyway, I removed those lines from the patch manually (I hadn't intended to change that line anyway). HIVE-2797.7.patch contains this; I tried applying it to a fresh checkout and did not get that error.

> Make the IP address of a Thrift client available to HMSHandler.
> ---------------------------------------------------------------
>
> Key: HIVE-2797
> URL: https://issues.apache.org/jira/browse/HIVE-2797
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Reporter: Kevin Wilfong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2797.7.patch, HIVE-2797.D1701.1.patch, HIVE-2797.D1701.2.patch, HIVE-2797.D1701.3.patch, HIVE-2797.D1701.4.patch, HIVE-2797.D1701.5.patch, HIVE-2797.D1701.6.patch
>
> Currently, in unsecured mode, metastore Thrift calls are, from the HMSHandler's point of view, anonymous. If we expose the IP address of the Thrift client to the HMSHandler from the Processor, this will help to give some context, in particular for audit logging, of where the call is coming from.
[jira] [Commented] (HIVE-747) void type (null column) cannot be UNIONed with other type
[ https://issues.apache.org/jira/browse/HIVE-747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237217#comment-13237217 ]

Kevin Wilfong commented on HIVE-747:
------------------------------------

This is now producing wrong results.

select x from (select 'a' as x from src union all select NULL as x from src) a;

returns NULL for every row.

> void type (null column) cannot be UNIONed with other type
> ---------------------------------------------------------
>
> Key: HIVE-747
> URL: https://issues.apache.org/jira/browse/HIVE-747
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Ning Zhang
> Assignee: Ning Zhang
>
> hive> select * from (select NULL from zshao_tt union all select 1 from zshao_tt) x;
> FAILED: Error in semantic analysis: Schema of both sides of union should match: Column _c0 is of type void on first table and type int on second table
[jira] [Commented] (HIVE-942) use bucketing for group by
[ https://issues.apache.org/jira/browse/HIVE-942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233789#comment-13233789 ]

Kevin Wilfong commented on HIVE-942:
------------------------------------

If the table is bucketed and sorted on the group-by keys, the group by operator would not need to build a hash table, which would help with memory consumption.

> use bucketing for group by
> --------------------------
>
> Key: HIVE-942
> URL: https://issues.apache.org/jira/browse/HIVE-942
> Project: Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Namit Jain
>
> Group by on a bucketed column can be completely performed on the mapper if the split can be adjusted to span the key boundary.
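The memory argument in the comment can be made concrete: on input sorted by the group-by key, the aggregator only needs the current group's running state, emitting a result at each key boundary, instead of keeping a hash table over all groups. A minimal sketch for a count aggregate (illustrative, not Hive's GroupByOperator):

```java
import java.util.ArrayList;
import java.util.List;

// Streaming group-by over key-sorted input: O(1) aggregation state,
// versus O(#groups) for a hash-based aggregator on unsorted input.
class StreamingGroupBy {
    static List<String> countSorted(List<String> sortedKeys) {
        List<String> out = new ArrayList<>();
        String current = null;
        long count = 0;
        for (String k : sortedKeys) {
            if (current != null && !current.equals(k)) {
                out.add(current + "=" + count);  // key boundary: emit the finished group
                count = 0;
            }
            current = k;
            count++;
        }
        if (current != null) {
            out.add(current + "=" + count);  // flush the last group
        }
        return out;
    }
}
```

This only works if each mapper's split covers whole key ranges, which is exactly the "split can be adjusted to span the key boundary" condition in the issue description.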
[jira] [Commented] (HIVE-2853) Add pre event listeners to metastore
[ https://issues.apache.org/jira/browse/HIVE-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225588#comment-13225588 ]

Kevin Wilfong commented on HIVE-2853:
-------------------------------------

@Ashutosh This feature is just a new type of hook; unless you provide an implementation (and I am not providing any implementations in this patch), it has no effect, and in particular users will not see any change.

> Add pre event listeners to metastore
> ------------------------------------
>
> Key: HIVE-2853
> URL: https://issues.apache.org/jira/browse/HIVE-2853
> Project: Hive
> Issue Type: Improvement
> Reporter: Kevin Wilfong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2853.D2175.1.patch
>
> Currently there are event listeners in the metastore which run after the completion of a method. It would be useful to have similar hooks which run before the metastore method is executed. These can be used to make validation of names, locations, etc. customizable.
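The general shape of such a pre-event hook can be sketched as below; the interface and the name-validating listener are illustrative stand-ins, not the actual listener API added by the patch:

```java
// Toy pre-event hook: listeners run *before* the metastore operation and can
// veto it by throwing, in contrast to post-completion event listeners.
class PreEventExample {
    interface PreEventListener {
        void onEvent(String eventType, String objectName);
    }

    // Example listener making name validation pluggable, as the issue suggests.
    static PreEventListener nameValidator = (type, name) -> {
        if (!name.matches("[a-z_][a-z0-9_]*")) {
            throw new IllegalArgumentException("invalid name: " + name);
        }
    };

    // Returns whether the operation would be allowed to proceed.
    static boolean allowed(String type, String name) {
        try {
            nameValidator.onEvent(type, name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

Because a throwing listener aborts the operation before it runs, installing no listeners (the default, as the comment notes) leaves behavior unchanged.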
[jira] [Commented] (HIVE-2836) CREATE TABLE LIKE does not copy table type
[ https://issues.apache.org/jira/browse/HIVE-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222482#comment-13222482 ] Kevin Wilfong commented on HIVE-2836: - @Aniket Thanks for pointing that out. CREATE TABLE LIKE does not copy table type -- Key: HIVE-2836 URL: https://issues.apache.org/jira/browse/HIVE-2836 Project: Hive Issue Type: Bug Reporter: Kevin Wilfong Assignee: Kevin Wilfong After CREATE TABLE t LIKE t2, if t2 is external, t will still be managed.
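The bug class described in HIVE-2836 can be sketched abstractly (hypothetical types below, not Hive's actual metadata model): a "create like" operation that copies the column list but never copies the table type, so cloning an EXTERNAL table silently produces a MANAGED one.

```java
import java.util.List;

// Illustrative sketch of the CREATE TABLE LIKE bug and its fix.
public class CreateLikeDemo {
    enum TableType { MANAGED, EXTERNAL }

    record TableMeta(String name, List<String> columns, TableType type) {}

    // Buggy version: the new table's type defaults to MANAGED
    // regardless of the source table's type.
    static TableMeta createLikeBuggy(String name, TableMeta src) {
        return new TableMeta(name, src.columns(), TableType.MANAGED);
    }

    // Fixed version: the table type is carried over from the source.
    static TableMeta createLikeFixed(String name, TableMeta src) {
        return new TableMeta(name, src.columns(), src.type());
    }

    public static void main(String[] args) {
        TableMeta t2 = new TableMeta("t2", List.of("a", "b"), TableType.EXTERNAL);
        System.out.println(createLikeBuggy("t", t2).type());  // MANAGED (the bug)
        System.out.println(createLikeFixed("t", t2).type());  // EXTERNAL
    }
}
```

The distinction matters operationally: dropping a managed table deletes its data, while dropping an external table does not, so silently losing the EXTERNAL flag changes drop semantics for the copy.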
[jira] [Commented] (HIVE-2833) Fix test failures caused by HIVE-2716
[ https://issues.apache.org/jira/browse/HIVE-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221164#comment-13221164 ] Kevin Wilfong commented on HIVE-2833: - The issue is that updateConnectionURL is now being called after the ObjectStore is initially constructed. We use a Metastore Connection URL Hook to generate the URL to be used by JDO. Since this was being called after the ObjectStore was constructed, it was initially trying to connect to an incorrect default URL. I'll upload a patch once I come up with a test case. Fix test failures caused by HIVE-2716 - Key: HIVE-2833 URL: https://issues.apache.org/jira/browse/HIVE-2833 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Enis Soztutar
[jira] [Commented] (HIVE-2716) Move retry logic in HiveMetaStore to a separate class
[ https://issues.apache.org/jira/browse/HIVE-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221324#comment-13221324 ] Kevin Wilfong commented on HIVE-2716: - Ignore the Phabricator comment and patch for D2055; that was intended for another JIRA. Move retry logic in HiveMetaStore to a separate class --- Key: HIVE-2716 URL: https://issues.apache.org/jira/browse/HIVE-2716 Project: Hive Issue Type: Sub-task Components: Metastore Affects Versions: 0.9.0 Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.9.0 Attachments: HIVE-2716.D1227.1.patch, HIVE-2716.D1227.2.patch, HIVE-2716.D1227.3.patch, HIVE-2716.D1227.4.patch, HIVE-2716.D2055.1.patch, HIVE-2716.patch In HIVE-1219, method retrying for raw store operations was introduced to handle JDO operations more robustly. However, the abstraction for the RawStore operations can be moved to a separate class implementing RawStore, which should clean up the code base for HiveMetaStore.
[jira] [Commented] (HIVE-2833) Fix test failures caused by HIVE-2716
[ https://issues.apache.org/jira/browse/HIVE-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221326#comment-13221326 ] Kevin Wilfong commented on HIVE-2833: - kevinwilfong requested code review of HIVE-2716 [jira] Move retry logic in HiveMetaStore to a separate class. Reviewers: JIRA This fixes the issue where the implementation of RawStore was being initialized in the metastore before the JDOConnectionURLHook was called. I moved some code around so that the RawStore implementation is initialized during RetryingRawStore initialization, after the hook is called. In addition, the URL now needs to be updated in the thread-local configuration instead of the non-thread-local one, because by the time the hook is called, the thread-local configuration has already been initialized from the non-thread-local one. TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D2055 AFFECTED FILES metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreConnectionUrlHook.java metastore/src/test/org/apache/hadoop/hive/metastore/DummyJdoConnectionUrlHook.java metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java metastore/src/java/org/apache/hadoop/hive/metastore/RetryingRawStore.java
Fix test failures caused by HIVE-2716 - Key: HIVE-2833 URL: https://issues.apache.org/jira/browse/HIVE-2833 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Kevin Wilfong Attachments: HIVE-2716.D2055.1.patch
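The initialization-order fix described in the review above can be sketched abstractly. The names below are hypothetical stand-ins, not Hive's actual classes: the point is that the connection-URL hook must run and update the per-thread configuration before the store object is constructed, otherwise the store connects to the stale default URL.

```java
// Illustrative sketch of "run the URL hook before constructing the store".
public class InitOrderDemo {
    interface ConnectionUrlHook {
        String resolveUrl(String defaultUrl);
    }

    static class Store {
        final String url;
        Store(String url) { this.url = url; }  // "connects" at construction time
    }

    // Per-thread configuration, mirroring the thread-local conf in the comment.
    static final ThreadLocal<String> threadLocalUrl =
        ThreadLocal.withInitial(() -> "jdbc:derby:default");

    static Store initStore(ConnectionUrlHook hook) {
        // Fix: consult the hook and update the thread-local conf first...
        threadLocalUrl.set(hook.resolveUrl(threadLocalUrl.get()));
        // ...and only then construct the store from the updated conf.
        return new Store(threadLocalUrl.get());
    }

    public static void main(String[] args) {
        Store store = initStore(ignored -> "jdbc:mysql://metastore-db/hive");
        System.out.println(store.url);  // jdbc:mysql://metastore-db/hive
    }
}
```

Reversing the two statements in `initStore` reproduces the bug the comment describes: the store would capture the default URL, and the hook's result would only take effect for later consumers of the configuration.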
[jira] [Commented] (HIVE-1054) CHANGE COLUMN does not support changing partition column types.
[ https://issues.apache.org/jira/browse/HIVE-1054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209524#comment-13209524 ] Kevin Wilfong commented on HIVE-1054: - @Ashutosh What if we only allow changing the type of a partition column? I agree adding/dropping/renaming partition columns has a host of difficulties to manage, but changing a partition column's type should not require any change in HDFS, only a metadata change. CHANGE COLUMN does not support changing partition column types. - Key: HIVE-1054 URL: https://issues.apache.org/jira/browse/HIVE-1054 Project: Hive Issue Type: Bug Reporter: He Yongqiang
[jira] [Commented] (HIVE-2799) change the following thrift apis to add a region
[ https://issues.apache.org/jira/browse/HIVE-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208972#comment-13208972 ] Kevin Wilfong commented on HIVE-2799: - I was under the impression that this is true in the sense that, if you are using an old client which thinks a method takes one parameter and the server is running new code in which that method takes two parameters, it will still work. However, once you upgrade your client to the new code, you will always have to provide two parameters, at least if the client is in Java; I'm not sure whether this applies to all languages Thrift supports. I wanted to make sure that this would not break code that uses the current Thrift APIs, even after the clients are upgraded. change the following thrift apis to add a region Key: HIVE-2799 URL: https://issues.apache.org/jira/browse/HIVE-2799 Project: Hive Issue Type: New Feature Components: Metastore, Thrift API Reporter: Namit Jain Assignee: Kevin Wilfong
list<string> get_tables(1: string db_name, 2: string pattern) throws (1: MetaException o1)
list<string> get_all_tables(1: string db_name) throws (1: MetaException o1)
Table get_table(1:string dbname, 2:string tbl_name) throws (1:MetaException o1, 2:NoSuchObjectException o2)
list<Table> get_table_objects_by_name(1:string dbname, 2:list<string> tbl_names) throws (1:MetaException o1, 2:InvalidOperationException o2, 3:UnknownDBException o3)
list<string> get_table_names_by_filter(1:string dbname, 2:string filter, 3:i16 max_tables=-1) throws (1:MetaException o1, 2:InvalidOperationException o2, 3:UnknownDBException o3)
Partition add_partition(1:Partition new_part) throws(1:InvalidObjectException o1, 2:AlreadyExistsException o2, 3:MetaException o3)
i32 add_partitions(1:list<Partition> new_parts) throws(1:InvalidObjectException o1, 2:AlreadyExistsException o2, 3:MetaException o3)
Partition append_partition(1:string db_name, 2:string tbl_name, 3:list<string> part_vals) throws (1:InvalidObjectException o1, 2:AlreadyExistsException o2, 3:MetaException o3)
Partition append_partition_by_name(1:string db_name, 2:string tbl_name, 3:string part_name) throws (1:InvalidObjectException o1, 2:AlreadyExistsException o2, 3:MetaException o3)
bool drop_partition(1:string db_name, 2:string tbl_name, 3:list<string> part_vals, 4:bool deleteData) throws(1:NoSuchObjectException o1, 2:MetaException o2)
bool drop_partition_by_name(1:string db_name, 2:string tbl_name, 3:string part_name, 4:bool deleteData) throws(1:NoSuchObjectException o1, 2:MetaException o2)
Partition get_partition(1:string db_name, 2:string tbl_name, 3:list<string> part_vals) throws(1:MetaException o1, 2:NoSuchObjectException o2)
Partition get_partition_with_auth(1:string db_name, 2:string tbl_name, 3:list<string> part_vals, 4: string user_name, 5: list<string> group_names) throws(1:MetaException o1, 2:NoSuchObjectException o2)
Partition get_partition_by_name(1:string db_name 2:string tbl_name, 3:string part_name) throws(1:MetaException o1, 2:NoSuchObjectException o2)
list<Partition> get_partitions(1:string db_name, 2:string tbl_name, 3:i16 max_parts=-1) throws(1:NoSuchObjectException o1, 2:MetaException o2)
list<Partition> get_partitions_with_auth(1:string db_name, 2:string tbl_name, 3:i16 max_parts=-1, 4: string user_name, 5: list<string> group_names) throws(1:NoSuchObjectException o1, 2:MetaException o2)
list<string> get_partition_names(1:string db_name, 2:string tbl_name, 3:i16 max_parts=-1) throws(1:MetaException o2)
list<Partition> get_partitions_ps(1:string db_name 2:string tbl_name 3:list<string> part_vals, 4:i16 max_parts=-1) throws(1:MetaException o1, 2:NoSuchObjectException o2)
list<Partition> get_partitions_ps_with_auth(1:string db_name, 2:string tbl_name, 3:list<string> part_vals, 4:i16 max_parts=-1, 5: string user_name, 6: list<string> group_names) throws(1:NoSuchObjectException o1, 2:MetaException o2)
list<string> get_partition_names_ps(1:string db_name, 2:string tbl_name, 3:list<string> part_vals, 4:i16 max_parts=-1) throws(1:MetaException o1, 2:NoSuchObjectException o2)
list<Partition> get_partitions_by_filter(1:string db_name 2:string tbl_name 3:string filter, 4:i16 max_parts=-1) throws(1:MetaException o1, 2:NoSuchObjectException o2)
list<Partition> get_partitions_by_names(1:string db_name 2:string
[jira] [Commented] (HIVE-2612) support hive table/partitions exists in more than one region
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208978#comment-13208978 ] Kevin Wilfong commented on HIVE-2612: - @Ashutosh I'm sorry about that; there already were scripts with the name upgrade-0.8.0-to-0.9.0.mysql.sql, and I didn't realize those were Hive version numbers. I thought this was a metastore versioning system. I can move the SQL commands in those files into the 0.8.0-to-0.9.0 scripts and rename the schema-0.10.0 scripts to schema-0.9.0. support hive table/partitions exists in more than one region Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Kevin Wilfong Fix For: 0.9.0 Attachments: HIVE-2612.1.patch, HIVE-2612.2.patch.txt, HIVE-2612.3.patch.txt, HIVE-2612.4.patch.txt, HIVE-2612.6.patch.txt, HIVE-2612.7.patch.txt, HIVE-2612.8.patch.txt, HIVE-2612.D1569.1.patch, HIVE-2612.D1569.2.patch, HIVE-2612.D1569.3.patch, HIVE-2612.D1569.4.patch, HIVE-2612.D1569.5.patch, HIVE-2612.D1569.6.patch, HIVE-2612.D1569.7.patch, HIVE-2612.D1707.1.patch, hive.2612.5.patch 1) add region object into hive metastore 2) each partition/table has a primary region and a list of living regions, and also data location in each region
[jira] [Commented] (HIVE-2805) Move metastore upgrade scripts labeled 0.10.0 into scripts labeled 0.9.0
[ https://issues.apache.org/jira/browse/HIVE-2805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208980#comment-13208980 ] Kevin Wilfong commented on HIVE-2805: - See https://issues.apache.org/jira/browse/HIVE-2612 Move metastore upgrade scripts labeled 0.10.0 into scripts labeled 0.9.0 Key: HIVE-2805 URL: https://issues.apache.org/jira/browse/HIVE-2805 Project: Hive Issue Type: Task Reporter: Kevin Wilfong Assignee: Kevin Wilfong Move contents of upgrade-0.9.0-to-0.10.0.mysql.sql, upgrade-0.9.0-to-0.10.0.derby.sql into upgrade-0.8.0-to-0.9.0.mysql.sql, upgrade-0.8.0-to-0.9.0.derby.sql Rename hive-schema-0.10.0.derby.sql, hive-schema-0.10.0.mysql.sql to hive-schema-0.9.0.derby.sql, hive-schema-0.9.0.mysql.sql
[jira] [Commented] (HIVE-2799) change the following thrift apis to add a region
[ https://issues.apache.org/jira/browse/HIVE-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208997#comment-13208997 ] Kevin Wilfong commented on HIVE-2799: - @Ashutosh I agree, that will work as well. I meant I wanted to make sure I don't force people who don't want to use multi-region to have to add a region to their Thrift API calls, once both the server and client are upgraded. change the following thrift apis to add a region Key: HIVE-2799 URL: https://issues.apache.org/jira/browse/HIVE-2799 Project: Hive Issue Type: New Feature Components: Metastore, Thrift API Reporter: Namit Jain Assignee: Kevin Wilfong
[jira] [Commented] (HIVE-2799) change the following thrift apis to add a region
[ https://issues.apache.org/jira/browse/HIVE-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209016#comment-13209016 ] Kevin Wilfong commented on HIVE-2799: - I'm definitely in favor of that approach. The primary reason I intended to add duplicate Thrift API calls was to keep open source users happy, but if people are content with simply wrapping them in HiveMetaStoreClient, I am more than happy to oblige. change the following thrift apis to add a region Key: HIVE-2799 URL: https://issues.apache.org/jira/browse/HIVE-2799 Project: Hive Issue Type: New Feature Components: Metastore, Thrift API Reporter: Namit Jain Assignee: Kevin Wilfong
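The client-side wrapping discussed in the comments can be sketched as method overloading: the regenerated Thrift call gains a region parameter, and the client keeps the old signature as an overload that supplies a default region, so existing callers need no changes. The names below are hypothetical stand-ins, not the real HiveMetaStoreClient API, and the default region value is an assumption based on the empty-string default mentioned in HIVE-2612.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of preserving an old API signature via an overload.
public class RegionWrapperDemo {
    // Assumed default region name (empty string, per the HIVE-2612 discussion).
    static final String DEFAULT_REGION = "";

    // Stand-in for the regenerated Thrift call that now takes a region.
    static List<String> getTables(String dbName, String pattern, String region) {
        String r = region.isEmpty() ? "default" : region;
        return Arrays.asList(dbName + "." + pattern + "@" + r);
    }

    // Old signature preserved as a wrapper delegating with the default region,
    // so callers that predate multi-region keep compiling unchanged.
    static List<String> getTables(String dbName, String pattern) {
        return getTables(dbName, pattern, DEFAULT_REGION);
    }

    public static void main(String[] args) {
        System.out.println(getTables("db", "t*"));             // legacy callers, no region
        System.out.println(getTables("db", "t*", "region2"));  // region-aware callers
    }
}
```

This mirrors the trade-off in the thread: the wire-level Thrift method changes once, while source-level compatibility for upgraded Java clients is handled entirely in the client wrapper.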
[jira] [Commented] (HIVE-2799) change the following thrift apis to add a region
[ https://issues.apache.org/jira/browse/HIVE-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207033#comment-13207033 ] Kevin Wilfong commented on HIVE-2799: - Adding the following to the list to be changed.
void create_table(1:Table tbl) throws(1:AlreadyExistsException o1, 2:InvalidObjectException o2, 3:MetaException o3, 4:NoSuchObjectException o4)
void drop_table(1:string dbname, 2:string name, 3:bool deleteData) throws(1:NoSuchObjectException o1, 2:MetaException o3)
Index add_index(1:Index new_index, 2: Table index_table) throws(1:InvalidObjectException o1, 2:AlreadyExistsException o2, 3:MetaException o3)
change the following thrift apis to add a region Key: HIVE-2799 URL: https://issues.apache.org/jira/browse/HIVE-2799 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Kevin Wilfong
[jira] [Commented] (HIVE-2799) change the following thrift apis to add a region
[ https://issues.apache.org/jira/browse/HIVE-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207144#comment-13207144 ] Kevin Wilfong commented on HIVE-2799: - I will leave the current APIs unchanged; I intend to add additional APIs, like create_table_using_region, which will allow the user to provide a region name in addition to all normal arguments. change the following thrift apis to add a region Key: HIVE-2799 URL: https://issues.apache.org/jira/browse/HIVE-2799 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Kevin Wilfong
[jira] [Commented] (HIVE-2612) support hive table/partitions exists in more than one region
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205762#comment-13205762 ] Kevin Wilfong commented on HIVE-2612: - Attached a patch containing the updated metastore upgrade scripts. support hive table/partitions exists in more than one region Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Namit Jain
[jira] [Commented] (HIVE-2612) support hive table/partitions existing in more than one region
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205793#comment-13205793 ] Kevin Wilfong commented on HIVE-2612: - Updated the patch with some small changes:
- cleaned up a few remaining mentions of cluster instead of region
- always check/set the primary region name for a table in the metastore
support hive table/partitions existing in more than one region Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Namit Jain Attachments: HIVE-2612.1.patch, HIVE-2612.2.patch.txt, HIVE-2612.3.patch.txt, HIVE-2612.4.patch.txt, HIVE-2612.6.patch.txt, HIVE-2612.7.patch.txt, HIVE-2612.D1569.1.patch, HIVE-2612.D1569.2.patch, HIVE-2612.D1569.3.patch, HIVE-2612.D1569.4.patch, HIVE-2612.D1569.5.patch, HIVE-2612.D1569.6.patch, HIVE-2612.D1569.7.patch, hive.2612.5.patch
1) add region object into hive metastore
2) each partition/table has a primary region and a list of living regions, and also data location in each region
[jira] [Commented] (HIVE-2612) support hive table/partitions coexisting in more than one cluster
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13202719#comment-13202719 ] Kevin Wilfong commented on HIVE-2612: - I attached a patch to this JIRA which provides scripts to update the metastore for Derby, MySQL, and Postgres. It also changes the default cluster name to '' (empty string) and fixes an inconsistency where the PRIMARY_CLUSTER_NAME column in SDS had a different size than the CLUSTER_NAME column in CLUSTER_SDS.
support hive table/partitions coexisting in more than one cluster Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Namit Jain Attachments: HIVE-2612.1.patch, HIVE-2612.2.patch.txt, HIVE-2612.D1569.1.patch, HIVE-2612.D1569.2.patch, HIVE-2612.D1569.3.patch, HIVE-2612.D1569.4.patch, HIVE-2612.D1569.5.patch
1) add cluster object into hive metastore
2) each partition/table has a creation cluster and a list of living clusters, and also data location in each cluster
[jira] [Commented] (HIVE-2612) support hive table/partitions coexisting in more than one cluster
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13202783#comment-13202783 ] Kevin Wilfong commented on HIVE-2612: - This change will require users to update their metastore schema. The scripts in the patch should be sufficient, provided the schema is already up to date. The only schema changes are a new table and a new column added to SDS, so the update should not take long: no more than five minutes, depending on the size of the SDS table.
support hive table/partitions coexisting in more than one cluster Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Namit Jain Attachments: HIVE-2612.1.patch, HIVE-2612.2.patch.txt, HIVE-2612.D1569.1.patch, HIVE-2612.D1569.2.patch, HIVE-2612.D1569.3.patch, HIVE-2612.D1569.4.patch, HIVE-2612.D1569.5.patch, HIVE-2612.D1569.6.patch
1) add cluster object into hive metastore
2) each partition/table has a creation cluster and a list of living clusters, and also data location in each cluster
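The upgrade described in the comment boils down to one new table plus one new column on SDS. A minimal sketch of that shape of migration, using SQLite as a stand-in for Derby/MySQL/Postgres; the DDL below is illustrative only, not the actual upgrade script shipped with the patch:

```python
import sqlite3

# Illustrative sketch of the shape of the upgrade the comment describes:
# one new table plus one new column on SDS. SQLite stands in for
# Derby/MySQL/Postgres; the real upgrade scripts ship with the patch.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Pre-upgrade schema: a pared-down SDS table.
cur.execute("CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, LOCATION TEXT)")

# Upgrade step 1: add the new column (PRIMARY_CLUSTER_NAME per the comment),
# defaulting to '' to match the default cluster name.
cur.execute("ALTER TABLE SDS ADD COLUMN PRIMARY_CLUSTER_NAME TEXT DEFAULT ''")

# Upgrade step 2: create the new table mapping storage descriptors to clusters.
cur.execute(
    "CREATE TABLE CLUSTER_SDS ("
    " SD_ID INTEGER, CLUSTER_NAME TEXT, LOCATION TEXT,"
    " PRIMARY KEY (SD_ID, CLUSTER_NAME))"
)

cols = [row[1] for row in cur.execute("PRAGMA table_info(SDS)")]
```

Since adding a column and creating an empty table are cheap metadata operations, the "no more than five minutes" estimate is plausible even for a large SDS table.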
[jira] [Commented] (HIVE-2612) support hive table/partitions coexisting in more than one cluster
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13202935#comment-13202935 ] Kevin Wilfong commented on HIVE-2612: - I attached a patch which fixes an error where JDO looked for a column that doesn't exist in the schema created by the provided update scripts. The collection of MClusterStorageDescriptors was changed from a List to a Set, and a primary key was indicated in package.jdo. This fixes the error by removing the need to order the MClusterStorageDescriptors and by providing a way to uniquely identify them. The primary key is already present in the upgrade scripts provided.
support hive table/partitions coexisting in more than one cluster Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Namit Jain Attachments: HIVE-2612.1.patch, HIVE-2612.2.patch.txt, HIVE-2612.3.patch.txt, HIVE-2612.D1569.1.patch, HIVE-2612.D1569.2.patch, HIVE-2612.D1569.3.patch, HIVE-2612.D1569.4.patch, HIVE-2612.D1569.5.patch, HIVE-2612.D1569.6.patch, HIVE-2612.D1569.7.patch
1) add cluster object into hive metastore
2) each partition/table has a creation cluster and a list of living clusters, and also data location in each cluster
[jira] [Commented] (HIVE-2747) UNION ALL with subquery which selects NULL and performs group by fails
[ https://issues.apache.org/jira/browse/HIVE-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13199153#comment-13199153 ] Kevin Wilfong commented on HIVE-2747: - Namit, that looks like the issue in the JIRA https://issues.apache.org/jira/browse/HIVE-2305 UNION ALL with subquery which selects NULL and performs group by fails -- Key: HIVE-2747 URL: https://issues.apache.org/jira/browse/HIVE-2747 Project: Hive Issue Type: Bug Reporter: Kevin Wilfong Assignee: Kevin Wilfong Queries like the following from (select key, value, count(1) as count from src group by key, value union all select NULL as key, value, count(1) as count from src group by value) a select count(*); fail with the exception java.lang.NullPointerException at org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector.toString(StructObjectInspector.java:60) at java.lang.String.valueOf(String.java:2826) at java.lang.StringBuilder.append(StringBuilder.java:115) at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:110) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:427) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357) at org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:98) ... 18 more This should at least provide a more informative error message if not work. It works without the group by. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
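The more informative error message the report asks for would come from checking the branch column types up front: a bare NULL has no concrete type (Hive calls it void), so the UNION ALL branches cannot be reconciled. A hypothetical sketch of such a check in Python; the function name and messages are ours, not Hive's code:

```python
# Illustrative sketch of the check the report asks for: when unifying the
# column types of UNION ALL branches, a branch whose column is a bare NULL
# has no concrete type ("void" in Hive terms). Raising a descriptive error
# here is friendlier than the NullPointerException in UnionOperator.
def unify_union_schemas(branches):
    """branches: list of {column_name: type_name} dicts, one per subquery."""
    unified = {}
    for i, schema in enumerate(branches):
        for col, typ in schema.items():
            if typ == "void":
                raise TypeError(
                    f"branch {i}: column '{col}' is an untyped NULL; "
                    "give it an explicit type so the UNION ALL branches "
                    "agree on a schema"
                )
            unified.setdefault(col, typ)
    return unified
```

The usual workaround on the query side is to give the NULL an explicit type, e.g. writing CAST(NULL AS STRING) AS key in the second subquery.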
[jira] [Commented] (HIVE-2206) add a new optimizer for query correlation discovery and optimization
[ https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13190147#comment-13190147 ] Kevin Wilfong commented on HIVE-2206: - I tried running explain select * from (select * from src distribute by key sort by key) a join src b on a.key = b.key; using HIVE-2206.8.r1224646.patch.txt and I get the following exception: FAILED: Hive Internal Error: java.lang.ClassCastException(org.apache.hadoop.hive.ql.exec.SelectOperator cannot be cast to org.apache.hadoop.hive.ql.exec.ReduceSinkOperator) java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.SelectOperator cannot be cast to org.apache.hadoop.hive.ql.exec.ReduceSinkOperator at org.apache.hadoop.hive.ql.optimizer.CorrelationOptimizer$CorrelationNodeProc.findPeerReduceSinkOperators(CorrelationOptimizer.java:256) at org.apache.hadoop.hive.ql.optimizer.CorrelationOptimizer$CorrelationNodeProc.process(CorrelationOptimizer.java:503) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102) at org.apache.hadoop.hive.ql.optimizer.CorrelationOptimizer.transform(CorrelationOptimizer.java:193) at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:100) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7384) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:50) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430) at 
org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) add a new optimizer for query correlation discovery and optimization Key: HIVE-2206 URL: https://issues.apache.org/jira/browse/HIVE-2206 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Assignee: Yin Huai Attachments: HIVE-2206.1.patch.txt, HIVE-2206.2.patch.txt, HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, HIVE-2206.5-1.patch.txt, HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, HIVE-2206.7.patch.txt, HIVE-2206.8.r1224646.patch.txt, YSmartPatchForHive.patch, testQueries.2.q reference: http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2206) add a new optimizer for query correlation discovery and optimization
[ https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13190151#comment-13190151 ] Kevin Wilfong commented on HIVE-2206: - Nevermind, sorry, it was the distribute by followed by sort by. add a new optimizer for query correlation discovery and optimization Key: HIVE-2206 URL: https://issues.apache.org/jira/browse/HIVE-2206 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Assignee: Yin Huai Attachments: HIVE-2206.1.patch.txt, HIVE-2206.2.patch.txt, HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, HIVE-2206.5-1.patch.txt, HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, HIVE-2206.7.patch.txt, HIVE-2206.8.r1224646.patch.txt, YSmartPatchForHive.patch, testQueries.2.q reference: http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf
[jira] [Commented] (HIVE-2206) add a new optimizer for query correlation discovery and optimization
[ https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13190161#comment-13190161 ] Kevin Wilfong commented on HIVE-2206: - The above bug is a pre-existing issue with reduce sink deduplication. The following new exception is produced by the query: set hive.optimize.reducededuplication=false; explain select * from (select * from src distribute by key sort by key) a join src b on a.key = b.key; FAILED: Hive Internal Error: java.lang.ArrayIndexOutOfBoundsException(1) java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.hadoop.hive.ql.optimizer.CorrelationOptimizerUtils.createCorrelationCompositeReducesinkOperaotr(CorrelationOptimizerUtils.java:599) at org.apache.hadoop.hive.ql.optimizer.CorrelationOptimizerUtils.applyCorrelation(CorrelationOptimizerUtils.java:365) at org.apache.hadoop.hive.ql.optimizer.CorrelationOptimizer.transform(CorrelationOptimizer.java:198) at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:100) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7384) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:50) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) add a new optimizer for query correlation discovery and optimization Key: HIVE-2206 URL: https://issues.apache.org/jira/browse/HIVE-2206 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Assignee: Yin Huai Attachments: HIVE-2206.1.patch.txt, HIVE-2206.2.patch.txt, HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, HIVE-2206.5-1.patch.txt, HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, HIVE-2206.7.patch.txt, HIVE-2206.8.r1224646.patch.txt, YSmartPatchForHive.patch, testQueries.2.q reference: http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2674) get_partitions_ps throws TApplicationException if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-2674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189354#comment-13189354 ] Kevin Wilfong commented on HIVE-2674: - Ran svn up, and updated the diff. get_partitions_ps throws TApplicationException if table doesn't exist Key: HIVE-2674 URL: https://issues.apache.org/jira/browse/HIVE-2674 Project: Hive Issue Type: Bug Components: Metastore Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2674.D987.1.patch, HIVE-2674.D987.2.patch If the table passed to get_partitions_ps doesn't exist, an NPE is thrown by getPartitionPsQueryResults. There should be a check here, which throws a NoSuchObjectException if the table doesn't exist.
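The fix described in the issue is a straightforward guard: look the table up before querying its partitions, and raise the declared exception instead of letting a null dereference escape as a TApplicationException. A Python sketch of the pattern; the names and toy store structure are ours, not the metastore code:

```python
class NoSuchObjectException(Exception):
    """Stands in for the Thrift-declared NoSuchObjectException."""

# Illustrative sketch of the guard the issue describes (names are ours):
# look the table up first and fail with a declared, catchable exception
# instead of dereferencing None later.
def get_partitions_ps(store, db_name, tbl_name, part_vals):
    table = store.get((db_name, tbl_name))
    if table is None:
        raise NoSuchObjectException(
            f"table {db_name}.{tbl_name} does not exist"
        )
    # Proceed to match partitions against the partial spec part_vals:
    # a partition matches if its leading values equal the given prefix.
    return [p for p in table["partitions"]
            if p[: len(part_vals)] == part_vals]
```

Because NoSuchObjectException is declared in the method's throws clause, Thrift clients receive it as a typed exception they can handle, rather than an opaque TApplicationException.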
[jira] [Commented] (HIVE-2621) Allow multiple group bys with the same input data and spray keys to be run on the same reducer.
[ https://issues.apache.org/jira/browse/HIVE-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174945#comment-13174945 ] Kevin Wilfong commented on HIVE-2621: - There are currently two ways of getting common distincts: the current way checks that all distinct expressions in the subqueries are the same. My new code doesn't depend on this; it tries to construct subsets of the subqueries such that this holds within each subset. The advantage of doing it in the form

if (optimizeMultiGroupBy) {
  ...
} else {
  group queries by common distinct and group by expressions
  for each group:
    if (size of group > 1 etc.) {
      new code
    } else {
      old code
    }
}

is that the block of code inside the optimizeMultiGroupBy if statement can produce 2 map-reduce jobs where the new code might produce many. After looking at it more carefully, I can get rid of the singlemrMultiGroupBy if statement and the code within its block, because it produces the same result that my new code would, except that the new code can handle filters as well. After removing that code, the only remaining code above the if statement will be the poorly named getCommonDistinctExprs (it only returns the common distinct expressions provided a lot of conditions are met, including a requirement that all the distinct expressions are common), which I should be able to modify to use my new code.
Allow multiple group bys with the same input data and spray keys to be run on the same reducer. Key: HIVE-2621 URL: https://issues.apache.org/jira/browse/HIVE-2621 Project: Hive Issue Type: New Feature Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2621.1.patch.txt, HIVE-2621.D567.1.patch, HIVE-2621.D567.2.patch, HIVE-2621.D567.3.patch Currently, when a user runs a query, such as a multi-insert, where each insertion subclause consists of a simple query followed by a group by, the group bys for each clause are run on a separate reducer.
This requires writing the data for each group by clause to an intermediate file, and then reading it back. This uses a significant amount of the total CPU consumed by the query for an otherwise simple query. If the subclauses are grouped by their distinct expressions and group by keys, with all of the group by expressions for a group of subclauses run on a single reducer, this would reduce the amount of reading/writing to intermediate files for some queries. To do this, for each group of subclauses, in the mapper we would execute the filters for each subclause 'or'd together (provided each subclause has a filter) followed by a reduce sink. In the reducer, the child operators would be each subclause's filter followed by the group by and any subsequent operations. Note that this would require turning off map aggregation, so we would need to make using this type of plan configurable.
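The grouping step the description proposes can be sketched as follows. All names here are invented for illustration; this only shows the idea of bucketing subclauses by spray keys and distinct expressions, and OR'ing their filters for the mapper:

```python
from collections import defaultdict

# Illustrative sketch of the grouping the issue proposes: subclauses that
# share group by keys and distinct expressions can run on one reducer, with
# their filters OR'd together in the mapper-side filter.
def group_subclauses(subclauses):
    """subclauses: list of dicts with 'group_by', 'distincts', 'filter'."""
    groups = defaultdict(list)
    for sc in subclauses:
        key = (tuple(sc["group_by"]), tuple(sc["distincts"]))
        groups[key].append(sc)
    plans = []
    for (group_by, distincts), members in groups.items():
        filters = [sc["filter"] for sc in members if sc["filter"]]
        # A combined mapper filter is only safe if every member has a filter;
        # otherwise the mapper must pass all rows through to the reducer.
        mapper_filter = (
            " OR ".join(f"({f})" for f in filters)
            if len(filters) == len(members) else None
        )
        plans.append({"spray_keys": group_by + distincts,
                      "mapper_filter": mapper_filter,
                      "members": members})
    return plans
```

Each resulting plan corresponds to one reduce sink; in the reducer, each member subclause would reapply its own filter before its group by, as the description states.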
[jira] [Commented] (HIVE-2621) Allow multiple group bys with the same input data and spray keys to be run on the same reducer.
[ https://issues.apache.org/jira/browse/HIVE-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13161369#comment-13161369 ] Kevin Wilfong commented on HIVE-2621: - diff is here https://reviews.facebook.net/D567 Allow multiple group bys with the same input data and spray keys to be run on the same reducer. Key: HIVE-2621 URL: https://issues.apache.org/jira/browse/HIVE-2621 Project: Hive Issue Type: New Feature Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2621.1.patch.txt, HIVE-2621.D567.1.patch Currently, when a user runs a query, such as a multi-insert, where each insertion subclause consists of a simple query followed by a group by, the group bys for each clause are run on a separate reducer. This requires writing the data for each group by clause to an intermediate file, and then reading it back. This uses a significant amount of the total CPU consumed by the query for an otherwise simple query. If the subclauses are grouped by their distinct expressions and group by keys, with all of the group by expressions for a group of subclauses run on a single reducer, this would reduce the amount of reading/writing to intermediate files for some queries. To do this, for each group of subclauses, in the mapper we would execute the filters for each subclause 'or'd together (provided each subclause has a filter) followed by a reduce sink. In the reducer, the child operators would be each subclause's filter followed by the group by and any subsequent operations. Note that this would require turning off map aggregation, so we would need to make using this type of plan configurable.
[jira] [Commented] (HIVE-2619) Add hook to run in metastore's endFunction which can collect more fb303 counters
[ https://issues.apache.org/jira/browse/HIVE-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13160659#comment-13160659 ] Kevin Wilfong commented on HIVE-2619: - I also created a diff here: https://reviews.facebook.net/D555 (I thought it was supposed to update here automatically.) Add hook to run in metastore's endFunction which can collect more fb303 counters Key: HIVE-2619 URL: https://issues.apache.org/jira/browse/HIVE-2619 Project: Hive Issue Type: Improvement Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2619.1.patch.txt Create the potential for hooks to run in the endFunction method of HMSHandler which take the name of a function and whether or not it succeeded. Also, override getCounters from fb303 to allow these hooks to add counters which they collect, should this be desired. These hooks can be similar to EventListeners, but they should be more generic.
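In outline, the hook shape described, invoked from endFunction with the function name and success flag, and surfacing accumulated counters through getCounters, might look like this. This is a Python stand-in with invented names; the real interface would be a Java class wired into HMSHandler:

```python
from collections import Counter

# Illustrative sketch (our names, not Hive's) of the kind of hook the issue
# describes: something invoked from endFunction with the function name and
# whether it succeeded, accumulating counters that getCounters can expose.
class EndFunctionHook:
    def __init__(self):
        self.counters = Counter()

    def on_end_function(self, fn_name, succeeded, elapsed_ms):
        suffix = "success" if succeeded else "failure"
        self.counters[f"{fn_name}.{suffix}"] += 1
        self.counters[f"{fn_name}.total_ms"] += int(elapsed_ms)

    def get_counters(self):
        """Merged into the counters the fb303 getCounters call reports."""
        return dict(self.counters)
```

Keeping the hook interface this small (name, outcome, timing) is what makes it more generic than the existing EventListeners, which are tied to specific metastore events.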
[jira] [Commented] (HIVE-2518) pull junit jar from maven repos via ivy
[ https://issues.apache.org/jira/browse/HIVE-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13133575#comment-13133575 ] Kevin Wilfong commented on HIVE-2518: - @Ashutosh Sorry about that, I'll take care of it. pull junit jar from maven repos via ivy Key: HIVE-2518 URL: https://issues.apache.org/jira/browse/HIVE-2518 Project: Hive Issue Type: Improvement Reporter: He Yongqiang Assignee: Kevin Wilfong see https://issues.apache.org/jira/browse/HIVE-2505
[jira] [Commented] (HIVE-2505) Update junit jar in testlibs
[ https://issues.apache.org/jira/browse/HIVE-2505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130862#comment-13130862 ] Kevin Wilfong commented on HIVE-2505: - After applying the patch, junit-4.10.jar needs to be added to the directory trunk/testlibs/. The jar junit-3.8.1.jar needs to be removed from that same directory as well. Update junit jar in testlibs Key: HIVE-2505 URL: https://issues.apache.org/jira/browse/HIVE-2505 Project: Hive Issue Type: Improvement Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2505.1.patch.txt, junit-4.10.jar