test errors
I am seeing the following errors on a fresh Hive trunk:

[junit] Running org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore FAILED (crashed)

Is anyone else getting the same error?

Thanks,
-namit
Re: Turn around on patches that do not need full unit testing
I agree. Having a short-test and a long-test might make more sense, i.e. the long-test includes funky serdes and UDFs.

As for "In the meantime, checking in without tests may introduce bugs which can break a production cluster, which is costly": the solution is not to run trunk. Run only releases. All the tests are run by Jenkins post-commit, so we know when trunk is broken, and we should not cut a release if all the tests are not passing.

Also, we should not knowingly break the build or leave it broken. That is, we should strive to have all tests passing on trunk at all times, but not committing a typo patch for fear that the build might break does not make much sense. We can easily revert things in such a case.

Edward

On Sun, Jun 10, 2012 at 11:14 PM, Gang Liu g...@fb.com wrote:

Yeah, it is frustrating to take a long time to turn around a tiny change. That is understood. In the meantime, checking in without tests may introduce bugs which can break a production cluster, which is costly. I think the problem is not whether we should run tests, but that running tests takes a long time. If it took a reasonable time, like 30 minutes, we would have less pain. In summary, let us keep quality high by running tests for every commit, and aim to make the unit tests fast. By the way, we can run tests in parallel; a Hive wiki page has details.

Thanks

Sent from my iPhone

On Jun 10, 2012, at 7:29 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

Hive's unit tests take a long time. There are many simple patches we can get into Hive earlier if we drop the notion of running the full test suite to QA every patch. For example:

https://issues.apache.org/jira/browse/HIVE-3081 -- spelling mistakes that involved types
https://issues.apache.org/jira/browse/HIVE-3061 -- patches with code cleanup
https://issues.apache.org/jira/browse/HIVE-3048 -- patches that are one or two lines of code
https://issues.apache.org/jira/browse/HIVE-2288 -- patches that are only additive

Also, I do not believe we should kick a patch back to someone for every tiny change. For example, suppose someone commits 9000 lines of code, with one typo. I have seen similar situations where the status gets reverted back to OPEN. It takes the person working on it a day to get back into the patch again, and by the time someone comes back around to reviewing, another 3 days might go by.

This is similar to a situation in the supermarket where you can only use one coupon, so people walk in and out of the store 6 times to buy 6 items. Procedure and rules are followed, and the end result is really the same, but with 6 times the work. In this case the committer should just make the change, re-upload the patch, say 'committed with typo fixed', and commit.

Please comment,
Edward
[jira] [Commented] (HIVE-3107) More semantic analysis errors
[ https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292867#comment-13292867 ]

Richard Ding commented on HIVE-3107:

{code}
FAILED: Error in semantic analysis: Line 1:15 Expression not in GROUP BY key 'c'
12/06/11 09:36:56 ERROR ql.Driver: FAILED: Error in semantic analysis: Line 1:15 Expression not in GROUP BY key 'c'
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:15 Expression not in GROUP BY key 'c'
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7510)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2256)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2058)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5921)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6524)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7282)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
{code}

More semantic analysis errors

    Key: HIVE-3107
    URL: https://issues.apache.org/jira/browse/HIVE-3107
    Project: Hive
    Issue Type: Bug
    Components: Query Processor
    Affects Versions: 0.9.0
    Reporter: Richard Ding

This is related to HIVE-1922. Following queries all fail with various SemanticExceptions:

{code}
explain select t.c from t group by c;
explain select t.c from t group by c sort by t.c;
explain select t.c as c0 from t group by c0;
explain select t.c from t group by t.c sort by t.c;
{code}

It is true that one could always find a version of any of the above queries that works. But one has to find it by trial and error, and that doesn't work well with machine-generated SQL queries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: test errors
Works for me.

$ svn up svn
At revision 1348932.
$ svn st
$ ant clean package test -Dtestcase=TestZooKeeperTokenStore
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 1.783 sec

Any logs to look at?

Ashutosh

On Mon, Jun 11, 2012 at 5:33 AM, Namit Jain nj...@fb.com wrote:

I am seeing the following errors on a fresh Hive trunk:

[junit] Running org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore FAILED (crashed)

Is anyone else getting the same error?

Thanks,
-namit
[jira] [Commented] (HIVE-3107) More semantic analysis errors
[ https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292881#comment-13292881 ]

Richard Ding commented on HIVE-3107:

These are issues for us since many of our SQL queries are machine generated. If this is by design, then it is not documented (i.e., which statements are invalid). I think these are just defects and not by design.

The main problem with the above queries is that the column names are not in canonical form, so it's hard for the semantic analyzer to correlate column names in different clauses.

Here are some stack traces:

1. explain select t.c from t group by c;

{code}
FAILED: Error in semantic analysis: Line 1:15 Expression not in GROUP BY key 'c'
12/06/11 09:36:56 ERROR ql.Driver: FAILED: Error in semantic analysis: Line 1:15 Expression not in GROUP BY key 'c'
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:15 Expression not in GROUP BY key 'c'
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7510)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2256)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2058)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5921)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6524)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7282)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
{code}

2. explain select t.c as c0 from t group by c0;

{code}
FAILED: Error in semantic analysis: Line 1:41 Invalid table alias or column reference 'c0': (possible column names are: c)
12/06/11 09:50:18 ERROR ql.Driver: FAILED: Error in semantic analysis: Line 1:41 Invalid table alias or column reference 'c0': (possible column names are: c)
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:41 Invalid table alias or column reference 'c0': (possible column names are: c)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7510)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7464)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2739)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:3405)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5902)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6524)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7282)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
    at
Re: test errors
I was running into this too. For me, it was a permissions problem: the following directory was owned by root

/tmp/zookeeper_0/version-2/

Changing the ownership to me fixed the problem.

On 6/11/12 9:49 AM, Ashutosh Chauhan hashut...@apache.org wrote:

Works for me.

$ svn up svn
At revision 1348932.
$ svn st
$ ant clean package test -Dtestcase=TestZooKeeperTokenStore
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 1.783 sec

Any logs to look at?

Ashutosh

On Mon, Jun 11, 2012 at 5:33 AM, Namit Jain nj...@fb.com wrote:

I am seeing the following errors on a fresh Hive trunk:

[junit] Running org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore FAILED (crashed)

Is anyone else getting the same error?

Thanks,
-namit
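For anyone hitting the same crash, a quick way to check for the ownership problem described above is to walk the ZooKeeper scratch directory and list anything not owned by the current user. This is only a sketch: the helper name `dirs_not_owned_by` is made up for illustration, and `/tmp/zookeeper_0` is assumed to be the parent of the directory mentioned in the thread.

```python
import os


def dirs_not_owned_by(root, uid):
    """Return root and any directories beneath it whose owner is not uid."""
    if not os.path.isdir(root):
        return []
    mismatched = []
    # os.walk yields root itself first, then every subdirectory it can read.
    for dirpath, _dirnames, _filenames in os.walk(root):
        if os.stat(dirpath).st_uid != uid:
            mismatched.append(dirpath)
    return mismatched


if __name__ == "__main__":
    # Assumed location, taken from the path quoted in the thread.
    for path in dirs_not_owned_by("/tmp/zookeeper_0", os.getuid()):
        print("not owned by current user:", path)
```

If the script prints anything, changing the ownership of those directories (or removing them as root) before rerunning the test is the fix reported in this thread.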
[jira] [Updated] (HIVE-3078) Add inputs/outputs for create table, create view and so forth
[ https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shuai Ding updated HIVE-3078:

    Attachment: hive_input_output.patch

Fix the input/output for create table, drop table, create table like, create table as, create view, drop view,

Add inputs/outputs for create table, create view and so forth

    Key: HIVE-3078
    URL: https://issues.apache.org/jira/browse/HIVE-3078
    Project: Hive
    Issue Type: Bug
    Reporter: Shuai Ding
    Assignee: Shuai Ding
    Attachments: hive_input_output.patch
[jira] [Updated] (HIVE-2694) Add FORMAT UDF
[ https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-2694:

    Status: Open (was: Patch Available)

@Zhenxiao: Looks like the negative testcase outputs need to be updated. Can you please do this and then resubmit? Thanks.

Add FORMAT UDF

    Key: HIVE-2694
    URL: https://issues.apache.org/jira/browse/HIVE-2694
    Project: Hive
    Issue Type: New Feature
    Components: UDF
    Reporter: Carl Steinbach
    Assignee: Zhenxiao Luo
    Attachments: HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, HIVE-2694.3.patch.txt, HIVE-2694.D1149.1.patch, HIVE-2694.D1149.2.patch, HIVE-2694.D1149.3.patch, HIVE-2694.D2673.1.patch, HIVE-2694.D2673.1.patch
[jira] [Updated] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)
[ https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mithun Radhakrishnan updated HIVE-3098:

    Status: Patch Available (was: Open)

Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

    Key: HIVE-3098
    URL: https://issues.apache.org/jira/browse/HIVE-3098
    Project: Hive
    Issue Type: Bug
    Components: Shims
    Affects Versions: 0.9.0
    Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security turned on.
    Reporter: Mithun Radhakrishnan
    Assignee: Mithun Radhakrishnan
    Attachments: HIVE-3098.patch

The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing the Oracle backend). The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60 threads, in under 24 hours. The heap dump indicates that hadoop::FileSystem.CACHE had 100 instances of FileSystem, whose combined retained memory consumed the entire heap.

It boiled down to hadoop::UserGroupInformation::equals() being implemented such that the Subject member is compared for equality (==), and not equivalence (.equals()). This causes equivalent UGI instances to compare as unequal, and causes a new FileSystem instance to be created and cached.

The UGI.equals() is so implemented, incidentally, as a fix for yet another problem (HADOOP-6670); so it is unlikely that that implementation can be modified.

The solution for this is to check for UGI equivalence in HCatalog (i.e. in the Hive metastore), using a cache for UGI instances in the shims.

I have a patch to fix this. I'll upload it shortly. I just ran an overnight test to confirm that the memory leak has been arrested.
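The leak mechanism described in HIVE-3098 can be illustrated with a small model. This is a sketch in Python rather than the actual Java patch: `Subject`, `Ugi`, `FileSystemCache` and `UgiCache` are hypothetical stand-ins, not Hadoop or Hive classes, and keying the UGI cache by user name is an assumption made only for the illustration.

```python
class Subject:
    """Stand-in for a security subject (not Hadoop's class)."""

    def __init__(self, user):
        self.user = user


class Ugi:
    """Stand-in for a UGI that is equal to another only when both wrap the
    very same Subject object (identity, not equivalence)."""

    def __init__(self, subject):
        self.subject = subject

    def __eq__(self, other):
        return isinstance(other, Ugi) and self.subject is other.subject

    def __hash__(self):
        return id(self.subject)


class FileSystemCache:
    """Stand-in for a cache holding one entry per distinct key."""

    def __init__(self):
        self.entries = {}

    def get(self, ugi):
        if ugi not in self.entries:
            self.entries[ugi] = object()  # placeholder for a FileSystem
        return self.entries[ugi]


class UgiCache:
    """Hands back one canonical Ugi per user (assumed cache key)."""

    def __init__(self):
        self._by_user = {}

    def get(self, user):
        if user not in self._by_user:
            self._by_user[user] = Ugi(Subject(user))
        return self._by_user[user]


def entries_without_ugi_cache(requests):
    """A fresh Ugi per request: every request adds a cache entry."""
    fs_cache = FileSystemCache()
    for user in requests:
        fs_cache.get(Ugi(Subject(user)))
    return len(fs_cache.entries)


def entries_with_ugi_cache(requests):
    """Canonical Ugi per user: one cache entry per user."""
    fs_cache = FileSystemCache()
    ugis = UgiCache()
    for user in requests:
        fs_cache.get(ugis.get(user))
    return len(fs_cache.entries)
```

In this model, repeated requests from the same user grow the cache without bound in the first function and stay at one entry per user in the second, which is the effect the description attributes to caching UGI instances in the shims.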
[jira] [Commented] (HIVE-2694) Add FORMAT UDF
[ https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292938#comment-13292938 ]

Zhenxiao Luo commented on HIVE-2694:

@Carl: negative testcase outputs updated in the new patch HIVE-2694.4.patch.txt.

Add FORMAT UDF

    Key: HIVE-2694
    URL: https://issues.apache.org/jira/browse/HIVE-2694
    Project: Hive
    Issue Type: New Feature
    Components: UDF
    Reporter: Carl Steinbach
    Assignee: Zhenxiao Luo
    Attachments: HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, HIVE-2694.3.patch.txt, HIVE-2694.4.patch.txt, HIVE-2694.D1149.1.patch, HIVE-2694.D1149.2.patch, HIVE-2694.D1149.3.patch, HIVE-2694.D2673.1.patch, HIVE-2694.D2673.1.patch
[jira] [Updated] (HIVE-2694) Add FORMAT UDF
[ https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhenxiao Luo updated HIVE-2694:

    Attachment: HIVE-2694.4.patch.txt

Add FORMAT UDF

    Key: HIVE-2694
    URL: https://issues.apache.org/jira/browse/HIVE-2694
    Project: Hive
    Issue Type: New Feature
    Components: UDF
    Reporter: Carl Steinbach
    Assignee: Zhenxiao Luo
    Attachments: HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, HIVE-2694.3.patch.txt, HIVE-2694.4.patch.txt, HIVE-2694.D1149.1.patch, HIVE-2694.D1149.2.patch, HIVE-2694.D1149.3.patch, HIVE-2694.D2673.1.patch, HIVE-2694.D2673.1.patch
[jira] [Updated] (HIVE-2694) Add FORMAT UDF
[ https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhenxiao Luo updated HIVE-2694:

    Status: Patch Available (was: Open)

Add FORMAT UDF

    Key: HIVE-2694
    URL: https://issues.apache.org/jira/browse/HIVE-2694
    Project: Hive
    Issue Type: New Feature
    Components: UDF
    Reporter: Carl Steinbach
    Assignee: Zhenxiao Luo
    Attachments: HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, HIVE-2694.3.patch.txt, HIVE-2694.4.patch.txt, HIVE-2694.D1149.1.patch, HIVE-2694.D1149.2.patch, HIVE-2694.D1149.3.patch, HIVE-2694.D2673.1.patch, HIVE-2694.D2673.1.patch
[jira] [Commented] (HIVE-3106) Add option to make multi inserts more atomic
[ https://issues.apache.org/jira/browse/HIVE-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292940#comment-13292940 ]

Kevin Wilfong commented on HIVE-3106:

Spoke with njain offline. He suggested adding a dummy task which depends on the tasks each move task would depend on, and which has the move tasks as its children. This will reduce the number of dependency edges in the dependency graph. This dummy task (DependencyCollectionTask) will only be added if this option is turned on.

Add option to make multi inserts more atomic

    Key: HIVE-3106
    URL: https://issues.apache.org/jira/browse/HIVE-3106
    Project: Hive
    Issue Type: Improvement
    Components: Query Processor
    Reporter: Kevin Wilfong
    Assignee: Kevin Wilfong
    Attachments: HIVE-3106.1.patch.txt

Currently, with multi-insert queries, as soon as the output of one of the inserts is ready, the move task associated with that insert is run, creating the table/partition. However, if concurrency is enabled, the lock on this table/partition is not released until the entire query finishes, which can be much later.

This causes issues if, for example, a user is waiting for an output of the multi-insert query which is created long before the other outputs, and is checking for its existence using the metastore's Thrift methods (get_table/get_partition). In that case, the user will run their query which uses the output, and it will experience a timeout trying to acquire the lock on the table/partition.

If all the move tasks depend on the parents of all other move tasks, the output creation will be much closer to atomic, relieving this problem.
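The edge-count argument for the dummy task can be checked with a toy dependency graph. This is a sketch only: the task names (`mr1`, `move1`, and so on) and helper functions are invented for illustration and do not correspond to Hive's task classes, apart from the name DependencyCollectionTask taken from the comment above.

```python
def direct_edges(parents, move_tasks):
    """Every move task depends directly on every parent task."""
    return [(p, m) for p in parents for m in move_tasks]


def edges_via_collector(parents, move_tasks, collector="DependencyCollectionTask"):
    """Parents feed one dummy task, which fans out to the move tasks."""
    return [(p, collector) for p in parents] + [(collector, m) for m in move_tasks]


def runnable(task, edges, done):
    """A task may run once every task with an edge into it has finished."""
    return all(src in done for src, dst in edges if dst == task)


parents = ["mr1", "mr2", "mr3"]
moves = ["move1", "move2", "move3"]

full = direct_edges(parents, moves)          # 3 * 3 edges
slim = edges_via_collector(parents, moves)   # 3 + 3 edges
```

With n parents and m move tasks, the direct wiring needs n * m edges while the collector needs n + m, and in both graphs no move task becomes runnable until every parent has finished.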
[jira] [Updated] (HIVE-2969) Log Time To Submit metric with PerfLogger
[ https://issues.apache.org/jira/browse/HIVE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-2969:

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

Committed

Log Time To Submit metric with PerfLogger

    Key: HIVE-2969
    URL: https://issues.apache.org/jira/browse/HIVE-2969
    Project: Hive
    Issue Type: Wish
    Components: Logging
    Affects Versions: 0.10.0
    Reporter: Kevin Wilfong
    Assignee: Kevin Wilfong
    Priority: Minor
    Attachments: HIVE-2969.D2919.1.patch

Logging the time from when Driver.run starts to when we begin submitting jobs to map reduce would be helpful in determining how much of the lag in starting a query is due to Hive vs. Hadoop.
[jira] [Assigned] (HIVE-3085) make parallel tests work
[ https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shuai Ding reassigned HIVE-3085:

    Assignee: Shuai Ding (was: Namit Jain)

make parallel tests work

    Key: HIVE-3085
    URL: https://issues.apache.org/jira/browse/HIVE-3085
    Project: Hive
    Issue Type: Bug
    Reporter: Namit Jain
    Assignee: Shuai Ding
    Attachments: hive.3085.1.patch

https://cwiki.apache.org/Hive/unit-test-parallel-execution.html

I was trying to run the tests using the instructions above. I was able to run them using a single machine (parallelism of 4 in ~2 hours). The conf. file is as follows:

.hive_ptest.conf

{
  qfile_hosts: [ [root@MC, 4] ],
  other_hosts: [ [root@MC, 1] ],
  master_base_path: /data/users/tmp,
  host_base_path: /data/users/hivetests,
  java_home: /usr/local/jdk-6u24-64
}
[jira] [Created] (HIVE-3109) metastore state not cleared
Namit Jain created HIVE-3109:

    Summary: metastore state not cleared
    Key: HIVE-3109
    URL: https://issues.apache.org/jira/browse/HIVE-3109
    Project: Hive
    Issue Type: Bug
    Reporter: Namit Jain

When some of the tests are run in a particular order, random bugs are encountered.

ant test -Dtestcase=TestCliDriver -Dqfile=part_inherit_tbl_props.q,stats1.q

leads to an error in stats1.q

We ran into this error as part of parallel testing (HIVE-3085). As part of HIVE-3085, this will be fixed temporarily by clearing hive.metastore.partition.inherit.table.properties at the end of the test. But, in general, any property set in one .q file should not affect anything in other tests.
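The general point in HIVE-3109, that a property set by one .q file should not be visible to the next, can be modelled with a snapshot-and-restore around each file. This is a sketch under stated assumptions: `QFileSession` is a hypothetical stand-in for a test driver that runs several .q files in one process, and the default value `""` and the value `"a,b"` are made up for illustration.

```python
class QFileSession:
    """Hypothetical driver that runs several .q files against shared config."""

    def __init__(self, defaults):
        self.conf = dict(defaults)

    def run_qfile(self, settings, isolate):
        """Apply one file's settings and return the configuration it saw.

        With isolate=True the configuration is restored afterwards, so
        nothing set by this file leaks into the next one.
        """
        snapshot = dict(self.conf)
        try:
            self.conf.update(settings)
            return dict(self.conf)
        finally:
            if isolate:
                self.conf = snapshot


KEY = "hive.metastore.partition.inherit.table.properties"


def second_file_sees(isolate):
    session = QFileSession({KEY: ""})            # assumed default value
    session.run_qfile({KEY: "a,b"}, isolate)     # first file sets the property
    return session.run_qfile({}, isolate)[KEY]   # second file sets nothing
```

Without isolation the second file inherits whatever the first one set, which matches the order-dependent failure reported here; restoring a snapshot after every file removes the dependence on ordering without each test having to clean up after itself.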
[jira] [Assigned] (HIVE-3109) metastore state not cleared
[ https://issues.apache.org/jira/browse/HIVE-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain reassigned HIVE-3109:

    Assignee: Ashutosh Chauhan

metastore state not cleared

    Key: HIVE-3109
    URL: https://issues.apache.org/jira/browse/HIVE-3109
    Project: Hive
    Issue Type: Bug
    Reporter: Namit Jain
    Assignee: Ashutosh Chauhan

When some of the tests are run in a particular order, random bugs are encountered.

ant test -Dtestcase=TestCliDriver -Dqfile=part_inherit_tbl_props.q,stats1.q

leads to an error in stats1.q

We ran into this error as part of parallel testing (HIVE-3085). As part of HIVE-3085, this will be fixed temporarily by clearing hive.metastore.partition.inherit.table.properties at the end of the test. But, in general, any property set in one .q file should not affect anything in other tests.
[jira] [Commented] (HIVE-3106) Add option to make multi inserts more atomic
[ https://issues.apache.org/jira/browse/HIVE-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292961#comment-13292961 ]

Carl Steinbach commented on HIVE-3106:

@Kevin: I added some comments on phabricator.

Add option to make multi inserts more atomic

    Key: HIVE-3106
    URL: https://issues.apache.org/jira/browse/HIVE-3106
    Project: Hive
    Issue Type: Improvement
    Components: Query Processor
    Reporter: Kevin Wilfong
    Assignee: Kevin Wilfong
    Attachments: HIVE-3106.1.patch.txt

Currently, with multi-insert queries, as soon as the output of one of the inserts is ready, the move task associated with that insert is run, creating the table/partition. However, if concurrency is enabled, the lock on this table/partition is not released until the entire query finishes, which can be much later.

This causes issues if, for example, a user is waiting for an output of the multi-insert query which is created long before the other outputs, and is checking for its existence using the metastore's Thrift methods (get_table/get_partition). In that case, the user will run their query which uses the output, and it will experience a timeout trying to acquire the lock on the table/partition.

If all the move tasks depend on the parents of all other move tasks, the output creation will be much closer to atomic, relieving this problem.
[jira] [Commented] (HIVE-3109) metastore state not cleared
[ https://issues.apache.org/jira/browse/HIVE-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292967#comment-13292967 ]

Namit Jain commented on HIVE-3109:

Assigning to you, Ashutosh. Feel free to un-assign or pass it on. I vaguely remember you working on this parameter, but this is a more generic test cleanup problem.

metastore state not cleared

    Key: HIVE-3109
    URL: https://issues.apache.org/jira/browse/HIVE-3109
    Project: Hive
    Issue Type: Bug
    Reporter: Namit Jain
    Assignee: Ashutosh Chauhan

When some of the tests are run in a particular order, random bugs are encountered.

ant test -Dtestcase=TestCliDriver -Dqfile=part_inherit_tbl_props.q,stats1.q

leads to an error in stats1.q

We ran into this error as part of parallel testing (HIVE-3085). As part of HIVE-3085, this will be fixed temporarily by clearing hive.metastore.partition.inherit.table.properties at the end of the test. But, in general, any property set in one .q file should not affect anything in other tests.
[jira] [Commented] (HIVE-3085) make parallel tests work
[ https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292969#comment-13292969 ]

Shuai Ding commented on HIVE-3085:

https://reviews.facebook.net/D3585

make parallel tests work

    Key: HIVE-3085
    URL: https://issues.apache.org/jira/browse/HIVE-3085
    Project: Hive
    Issue Type: Bug
    Reporter: Namit Jain
    Assignee: Shuai Ding
    Attachments: hive.3085.1.patch

https://cwiki.apache.org/Hive/unit-test-parallel-execution.html

I was trying to run the tests using the instructions above. I was able to run them using a single machine (parallelism of 4 in ~2 hours). The conf. file is as follows:

.hive_ptest.conf

{
  qfile_hosts: [ [root@MC, 4] ],
  other_hosts: [ [root@MC, 1] ],
  master_base_path: /data/users/tmp,
  host_base_path: /data/users/hivetests,
  java_home: /usr/local/jdk-6u24-64
}
[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support
[ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:

    Summary: Hive List Bucketing - DDL support (was: Hive List Bucketing - DDL support (single column))

Hive List Bucketing - DDL support

    Key: HIVE-3072
    URL: https://issues.apache.org/jira/browse/HIVE-3072
    Project: Hive
    Issue Type: New Feature
    Components: SQL
    Reporter: Gang Tim Liu
    Assignee: Gang Tim Liu

If a Hive table column has skewed keys, query performance on non-skewed keys is always impacted. The Hive List Bucketing feature will address it: https://cwiki.apache.org/Hive/listbucketing.html

This jira issue will track the DDL change for the feature. It's for a single skewed column.
[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support
[ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:

    Description: If a Hive table column has skewed keys, query performance on non-skewed keys is always impacted. The Hive List Bucketing feature will address it: https://cwiki.apache.org/Hive/listbucketing.html This jira issue will track the DDL change for the feature. It's for both a single skewed column and multiple columns.

    was: If a Hive table column has skewed keys, query performance on non-skewed keys is always impacted. The Hive List Bucketing feature will address it: https://cwiki.apache.org/Hive/listbucketing.html This jira issue will track the DDL change for the feature. It's for a single skewed column.

Hive List Bucketing - DDL support

    Key: HIVE-3072
    URL: https://issues.apache.org/jira/browse/HIVE-3072
    Project: Hive
    Issue Type: New Feature
    Components: SQL
    Reporter: Gang Tim Liu
    Assignee: Gang Tim Liu

If a Hive table column has skewed keys, query performance on non-skewed keys is always impacted. The Hive List Bucketing feature will address it: https://cwiki.apache.org/Hive/listbucketing.html

This jira issue will track the DDL change for the feature. It's for both a single skewed column and multiple columns.
[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support
[ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292978#comment-13292978 ] Gang Tim Liu commented on HIVE-3072: Making progress on DML. The following syntax now works: create table T (c1 string, c2 string) list bucketed by (c1) with skew ('x1'); create table T (c1 string, c2 string, c3 string) list bucketed by (c1, c2) with skew (('x1', 'x2'), ('y1', 'y2')); Hive List Bucketing - DDL support - Key: HIVE-3072 URL: https://issues.apache.org/jira/browse/HIVE-3072 Project: Hive Issue Type: New Feature Components: SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it: https://cwiki.apache.org/Hive/listbucketing.html This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.
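As a rough illustration of the idea behind the syntax in this comment, and not Hive's implementation, the following Python sketch routes rows whose key matches a declared skewed value into their own bucket and sends everything else to a shared default bucket. The function name `list_bucket` and the `DEFAULT` label are made up for this example; how Hive actually lays out the data is described in the proposal linked from the issue.

```python
from collections import defaultdict

DEFAULT = "__default__"  # hypothetical label for the non-skewed bucket


def list_bucket(rows, key_cols, skewed_values):
    """Group rows by skewed key.

    rows: iterable of dicts, one per row.
    key_cols: tuple of column names that form the bucketing key.
    skewed_values: set of tuples, one tuple per declared skewed key.
    Returns one bucket per skewed key that occurs, plus a default bucket.
    """
    buckets = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        target = key if key in skewed_values else DEFAULT
        buckets[target].append(row)
    return dict(buckets)


# Mirrors "list bucketed by (c1) with skew ('x1')" on a toy data set.
rows = [
    {"c1": "x1", "c2": "a"},
    {"c1": "x1", "c2": "b"},
    {"c1": "z9", "c2": "c"},
]
buckets = list_bucket(rows, ("c1",), {("x1",)})
for name, members in buckets.items():
    print(name, len(members))
```

A query that filters on a non-skewed key would then only need to read the default bucket, which is the performance motivation given in the issue description.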
[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support
[ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292987#comment-13292987 ] Gang Tim Liu commented on HIVE-3072: We have rethought the release approach. We can deliver DDL and DML as separate patches or as a single patch. Each has pros and cons; neither is perfect. Separate patches make the release more manageable. A single patch makes more sense as a release, because with DDL but no DML you can't actually use list bucketing. We have to pick one, and we choose the single-patch approach. It reduces the overhead of a multiple-patch release, gives the community more time to review the proposal, and leaves room for us to adjust according to the proposal review. I will call for a proposal review again today. Hive List Bucketing - DDL support - Key: HIVE-3072 URL: https://issues.apache.org/jira/browse/HIVE-3072 Project: Hive Issue Type: New Feature Components: SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it: https://cwiki.apache.org/Hive/listbucketing.html This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.
Re: Hive List Bucketing - Feature Review
Dear all Hive developers, We are making good progress on implementing the list bucketing feature. It should be available within weeks. We'd like to call for a feature review again; please provide your comments. Thanks Tim On 6/1/12 10:13 AM, Gang Liu g...@fb.com wrote: Dear all, Please review the proposal and provide your comments: https://cwiki.apache.org/Hive/listbucketing.html Thanks Tim
Re: Hive List Bucketing - Feature Review
This link may work better for some people: https://cwiki.apache.org/confluence/display/Hive/ListBucketing Thanks. Carl On Mon, Jun 11, 2012 at 12:03 PM, Gang Liu g...@fb.com wrote: Dear all Hive developers, We are making good progress on implementing the list bucketing feature. It should be available within weeks. We'd like to call for a feature review again; please provide your comments. Thanks Tim On 6/1/12 10:13 AM, Gang Liu g...@fb.com wrote: Dear all, Please review the proposal and provide your comments: https://cwiki.apache.org/Hive/listbucketing.html Thanks Tim
Re: Hive List Bucketing - Feature Review
+ hcatalog-dev On Mon, Jun 11, 2012 at 12:09 PM, Carl Steinbach c...@cloudera.com wrote: This link may work better for some people: https://cwiki.apache.org/confluence/display/Hive/ListBucketing Thanks. Carl On Mon, Jun 11, 2012 at 12:03 PM, Gang Liu g...@fb.com wrote: Dear all Hive developers, We are making good progress on implementing the list bucketing feature. It should be available within weeks. We'd like to call for a feature review again; please provide your comments. Thanks Tim On 6/1/12 10:13 AM, Gang Liu g...@fb.com wrote: Dear all, Please review the proposal and provide your comments: https://cwiki.apache.org/Hive/listbucketing.html Thanks Tim
[jira] [Created] (HIVE-3110) ant very-clean package dies if user does not have permissions to remove dir.
Namit Jain created HIVE-3110: Summary: ant very-clean package dies if user does not have permissions to remove dir. Key: HIVE-3110 URL: https://issues.apache.org/jira/browse/HIVE-3110 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain
[jira] [Updated] (HIVE-2694) Add FORMAT UDF
[ https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2694: - Resolution: Fixed Fix Version/s: 0.10.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Zhenxiao! Add FORMAT UDF -- Key: HIVE-2694 URL: https://issues.apache.org/jira/browse/HIVE-2694 Project: Hive Issue Type: New Feature Components: UDF Reporter: Carl Steinbach Assignee: Zhenxiao Luo Fix For: 0.10.0 Attachments: HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, HIVE-2694.3.patch.txt, HIVE-2694.4.patch.txt, HIVE-2694.D1149.1.patch, HIVE-2694.D1149.2.patch, HIVE-2694.D1149.3.patch, HIVE-2694.D2673.1.patch, HIVE-2694.D2673.1.patch
[jira] [Updated] (HIVE-3110) ant very-clean package dies if user does not have permissions to remove dir.
[ https://issues.apache.org/jira/browse/HIVE-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3110: - Status: Patch Available (was: Open) ant very-clean package dies if user does not have permissions to remove dir. Key: HIVE-3110 URL: https://issues.apache.org/jira/browse/HIVE-3110 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain
[jira] [Commented] (HIVE-3110) ant very-clean package dies if user does not have permissions to remove dir.
[ https://issues.apache.org/jira/browse/HIVE-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293002#comment-13293002 ] Namit Jain commented on HIVE-3110: -- https://reviews.facebook.net/D3591 ant very-clean package dies if user does not have permissions to remove dir. Key: HIVE-3110 URL: https://issues.apache.org/jira/browse/HIVE-3110 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain
Re: Hive List Bucketing - Feature Review
Hi Carl, thanks Tim On 6/11/12 12:14 PM, Carl Steinbach c...@cloudera.com wrote: + hcatalog-dev On Mon, Jun 11, 2012 at 12:09 PM, Carl Steinbach c...@cloudera.com wrote: This link may work better for some people: https://cwiki.apache.org/confluence/display/Hive/ListBucketing Thanks. Carl On Mon, Jun 11, 2012 at 12:03 PM, Gang Liu g...@fb.com wrote: Dear all Hive developers, We are making good progress on implementing the list bucketing feature. It should be available within weeks. We'd like to call for a feature review again; please provide your comments. Thanks Tim On 6/1/12 10:13 AM, Gang Liu g...@fb.com wrote: Dear all, Please review the proposal and provide your comments: https://cwiki.apache.org/Hive/listbucketing.html Thanks Tim
[jira] [Updated] (HIVE-3110) ant very-clean package dies if user does not have permissions to remove dir.
[ https://issues.apache.org/jira/browse/HIVE-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3110: - Status: Open (was: Patch Available) ant very-clean package dies if user does not have permissions to remove dir. Key: HIVE-3110 URL: https://issues.apache.org/jira/browse/HIVE-3110 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain
[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support
[ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293008#comment-13293008 ] Carl Steinbach commented on HIVE-3072: -- If this feature requires metastore changes then I'd like to request that the first patch contain only changes to the metastore schema and metastore Thrift API. I would also prefer that the DML and DDL changes go in as a single patch since it a) prevents half-implemented features from showing up in releases and b) demonstrates that the feature actually works. Hive List Bucketing - DDL support - Key: HIVE-3072 URL: https://issues.apache.org/jira/browse/HIVE-3072 Project: Hive Issue Type: New Feature Components: SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it: https://cwiki.apache.org/Hive/listbucketing.html This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.
[jira] [Updated] (HIVE-3085) make parallel tests work
[ https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3085: - Component/s: Testing Infrastructure I added some comments to the review. Thanks. make parallel tests work Key: HIVE-3085 URL: https://issues.apache.org/jira/browse/HIVE-3085 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Namit Jain Assignee: Shuai Ding Attachments: hive.3085.1.patch https://cwiki.apache.org/Hive/unit-test-parallel-execution.html I was trying to run the tests using the instructions above. I was able to run them using a single machine (parallelism of 4 in ~2 hours). The conf. file is as follows: .hive_ptest.conf { qfile_hosts: [ [root@MC, 4] ], other_hosts: [ [root@MC, 1] ], master_base_path: /data/users/tmp, host_base_path: /data/users/hivetests, java_home: /usr/local/jdk-6u24-64 }
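The conf. file quoted in this issue appears to have lost its quoting in transit. A minimal sketch of the same settings as JSON, generated from Python, might look like the following. The keys and values are copied from the message; the quoting, and the reading of the second element of each host entry as that host's parallelism, are assumptions rather than something the message confirms. The helper `total_slots` is made up for this example.

```python
import json

# Values copied from the message above; quoting is assumed.
conf = {
    "qfile_hosts": [["root@MC", 4]],
    "other_hosts": [["root@MC", 1]],
    "master_base_path": "/data/users/tmp",
    "host_base_path": "/data/users/hivetests",
    "java_home": "/usr/local/jdk-6u24-64",
}


def total_slots(conf, key):
    """Sum the second element of every [host, n] pair in one host list."""
    return sum(slots for _host, slots in conf[key])


# Render the settings; the text could then be saved as .hive_ptest.conf.
text = json.dumps(conf, indent=2)
print(text)
print("qfile parallelism:", total_slots(conf, "qfile_hosts"))
```

Under that reading, the single-machine setup in the report uses four slots for the query-file tests and one for the remaining tests.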
[jira] [Updated] (HIVE-2969) Log Time To Submit metric with PerfLogger
[ https://issues.apache.org/jira/browse/HIVE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-2969: --- Affects Version/s: (was: 0.10.0) Fix Version/s: 0.10.0 Log Time To Submit metric with PerfLogger - Key: HIVE-2969 URL: https://issues.apache.org/jira/browse/HIVE-2969 Project: Hive Issue Type: Wish Components: Logging Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor Fix For: 0.10.0 Attachments: HIVE-2969.D2919.1.patch Logging the time from when Driver.run starts to when we begin submitting jobs to map reduce would be helpful in determining how much of the lag in starting a query is due to Hive vs. Hadoop.
[jira] [Commented] (HIVE-3078) Add inputs/outputs for create table, create view and so forth
[ https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293043#comment-13293043 ] Carl Steinbach commented on HIVE-3078: -- Is this ready for review? If so can you please submit a review request? Thanks. Add inputs/outputs for create table, create view and so forth - Key: HIVE-3078 URL: https://issues.apache.org/jira/browse/HIVE-3078 Project: Hive Issue Type: Bug Reporter: Shuai Ding Assignee: Shuai Ding Attachments: hive_input_output.patch
[jira] [Updated] (HIVE-3013) TestCliDriver cannot be debugged with eclipse since hadoop_home is set incorrectly
[ https://issues.apache.org/jira/browse/HIVE-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3013: - Resolution: Fixed Fix Version/s: 0.10.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. TestCliDriver cannot be debugged with eclipse since hadoop_home is set incorrectly -- Key: HIVE-3013 URL: https://issues.apache.org/jira/browse/HIVE-3013 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.9.0 Reporter: Namit Jain Assignee: Carl Steinbach Fix For: 0.10.0 Attachments: HIVE-3013.2.patch.txt, HIVE-3013.3.patch.txt, hive.3013.1.patch
[jira] [Commented] (HIVE-3085) make parallel tests work
[ https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293055#comment-13293055 ] Edward Capriolo commented on HIVE-3085: --- Do the 'parallel' tests still require a shared NFS mount? A while back someone told me I did not need NFS anymore because hadoop 'give me Big Datas'. Really though this shared NFS mount destroys the utility of this toolkit for me. make parallel tests work Key: HIVE-3085 URL: https://issues.apache.org/jira/browse/HIVE-3085 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Namit Jain Assignee: Shuai Ding Attachments: hive.3085.1.patch https://cwiki.apache.org/Hive/unit-test-parallel-execution.html I was trying to run the tests using the instructions above. I was able to run them using a single machine (parallelism of 4 in ~2 hours). The conf. file is as follows: .hive_ptest.conf { qfile_hosts: [ [root@MC, 4] ], other_hosts: [ [root@MC, 1] ], master_base_path: /data/users/tmp, host_base_path: /data/users/hivetests, java_home: /usr/local/jdk-6u24-64 }
[jira] [Commented] (HIVE-3013) TestCliDriver cannot be debugged with eclipse since hadoop_home is set incorrectly
[ https://issues.apache.org/jira/browse/HIVE-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293056#comment-13293056 ] Carl Steinbach commented on HIVE-3013: -- @Ashutosh: bq. HiveCLI.launch launches successfully, but doesn't work because of incorrect config. It seems to work for me, but maybe I'm just not trying the right commands. When you run it how does it fail? bq. FWIW, I ran all 887 ql tests from TestCliDriver. 36 of them failed. Investigating those will require separate kind of work so could be taken up in followup issue. I don't suppose you still have the list of failures available? TestCliDriver cannot be debugged with eclipse since hadoop_home is set incorrectly -- Key: HIVE-3013 URL: https://issues.apache.org/jira/browse/HIVE-3013 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.9.0 Reporter: Namit Jain Assignee: Carl Steinbach Fix For: 0.10.0 Attachments: HIVE-3013.2.patch.txt, HIVE-3013.3.patch.txt, hive.3013.1.patch
[jira] [Commented] (HIVE-3106) Add option to make multi inserts more atomic
[ https://issues.apache.org/jira/browse/HIVE-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293059#comment-13293059 ] Kevin Wilfong commented on HIVE-3106: - Per Carl's comments, explicitly stated the advantages/disadvantages, removed atomic from the name of the configuration variable, as this is not really true, and removed references to outputs in the description of the config. Also fixed an issue where, if a file was taking a long time to produce, there would still be a long time between when the tables/partitions are produced and when the locks on them are released. Now, when the option is set, the DependencyCollection task depends on the dependencies of the move tasks for files, but the move tasks for files do not depend on the DependencyCollection task, as there are no locks on these files so there would not be any advantage. Added a new test case for this additional functionality. Add option to make multi inserts more atomic Key: HIVE-3106 URL: https://issues.apache.org/jira/browse/HIVE-3106 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-3106.1.patch.txt Currently, with multi-insert queries, as soon as the output of one of the inserts is ready the move task associated with that insert is run, creating the table/partition. However, if concurrency is enabled the lock on this table/partition is not released until the entire query finishes, which can be much later. This causes issues if, for example, a user is waiting for an output of the multi-insert query which is created long before the other outputs, and checking for its existence using the metastore's Thrift methods (get_table/get_partition). In that case, the user will run their query which uses the output, and it will experience a timeout trying to acquire the lock on the table/partition.
If all the move tasks depend on the parents of all other move tasks, the output creation will be much closer to atomic, relieving this problem.
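The effect of that dependency change can be sketched with a toy model. This is only an illustration in Python, not Hive's task graph code: `move_start_times` and the finish times are hypothetical, and the `atomic` flag stands in for the configuration option discussed in the issue.

```python
def move_start_times(parent_finish, atomic):
    """Return the earliest time each move task can start.

    parent_finish maps an output name to the time at which the task
    producing that output finishes.
    atomic=False: each move waits only for its own parent (current behaviour).
    atomic=True: every move waits for the parents of all moves (proposed).
    """
    if atomic:
        barrier = max(parent_finish.values())
        return {name: barrier for name in parent_finish}
    return dict(parent_finish)


# Hypothetical finish times, in minutes, for two inserts of one query.
finish = {"t1": 5, "t2": 60}
print(move_start_times(finish, atomic=False))
print(move_start_times(finish, atomic=True))
```

With the option off, output t1 becomes visible long before the query ends and its lock is released; with it on, all outputs appear together near the end, so the window in which a reader can see a table it cannot yet lock is much shorter.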
[jira] [Updated] (HIVE-3085) make parallel tests work
[ https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3085: - Attachment: hive.3085.2.patch make parallel tests work Key: HIVE-3085 URL: https://issues.apache.org/jira/browse/HIVE-3085 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Namit Jain Assignee: Shuai Ding Attachments: hive.3085.1.patch, hive.3085.2.patch https://cwiki.apache.org/Hive/unit-test-parallel-execution.html I was trying to run the tests using the instructions above. I was able to run them using a single machine (parallelism of 4 in ~2 hours). The conf. file is as follows: .hive_ptest.conf { qfile_hosts: [ [root@MC, 4] ], other_hosts: [ [root@MC, 1] ], master_base_path: /data/users/tmp, host_base_path: /data/users/hivetests, java_home: /usr/local/jdk-6u24-64 }
[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support
[ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293082#comment-13293082 ] Gang Tim Liu commented on HIVE-3072: Yes, we are heading to a single patch approach. Yes, this feature requires metastore change. Hive List Bucketing - DDL support - Key: HIVE-3072 URL: https://issues.apache.org/jira/browse/HIVE-3072 Project: Hive Issue Type: New Feature Components: SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it: https://cwiki.apache.org/Hive/listbucketing.html This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.
[jira] [Assigned] (HIVE-2693) Add DECIMAL data type
[ https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reassigned HIVE-2693: Assignee: Josh Wills Add DECIMAL data type - Key: HIVE-2693 URL: https://issues.apache.org/jira/browse/HIVE-2693 Project: Hive Issue Type: New Feature Components: Query Processor, Types Reporter: Carl Steinbach Assignee: Josh Wills Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice template for how to do this.
[jira] [Commented] (HIVE-2693) Add DECIMAL data type
[ https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293088#comment-13293088 ] Carl Steinbach commented on HIVE-2693: -- Review request from a while ago: https://reviews.facebook.net/D1221 Add DECIMAL data type - Key: HIVE-2693 URL: https://issues.apache.org/jira/browse/HIVE-2693 Project: Hive Issue Type: New Feature Components: Query Processor, Types Reporter: Carl Steinbach Assignee: Josh Wills Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice template for how to do this.
[jira] [Created] (HIVE-3111) reduce the time for parallel unit tests for hive
Namit Jain created HIVE-3111: Summary: reduce the time for parallel unit tests for hive Key: HIVE-3111 URL: https://issues.apache.org/jira/browse/HIVE-3111 Project: Hive Issue Type: Bug Reporter: Namit Jain 1. Run the other tests in parallel with TestCliDriver and TestNegativeCliDriver 2. Run the tests that need super-user privilege in parallel with TestCliDriver and TestNegativeCliDriver
[jira] [Commented] (HIVE-3085) make parallel tests work
[ https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293095#comment-13293095 ] Namit Jain commented on HIVE-3085: -- Right now, the tests do require a shared mount if you want to run on multiple machines. This is good if we don't want to compile across all the machines. Having said that, I am also planning to use it on my machine only, and this should still help to finish the tests in about 1.5 hours. In that case, I was able to use local disk on my machine. This can be further optimized. Some of the options are: 1. Run the other tests in parallel with TestCliDriver and TestNegativeCliDriver 2. Run the tests that need super-user privilege in parallel with TestCliDriver and TestNegativeCliDriver Filed https://issues.apache.org/jira/browse/HIVE-3111 for that. make parallel tests work Key: HIVE-3085 URL: https://issues.apache.org/jira/browse/HIVE-3085 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Namit Jain Assignee: Shuai Ding Attachments: hive.3085.1.patch, hive.3085.2.patch https://cwiki.apache.org/Hive/unit-test-parallel-execution.html I was trying to run the tests using the instructions above. I was able to run them using a single machine (parallelism of 4 in ~2 hours). The conf. file is as follows: .hive_ptest.conf { qfile_hosts: [ [root@MC, 4] ], other_hosts: [ [root@MC, 1] ], master_base_path: /data/users/tmp, host_base_path: /data/users/hivetests, java_home: /usr/local/jdk-6u24-64 }
[jira] [Updated] (HIVE-3078) Add inputs/outputs for create table, create view and so forth
[ https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Ding updated HIVE-3078: - Status: Patch Available (was: Open) Add inputs/outputs for create table, create view and so forth - Key: HIVE-3078 URL: https://issues.apache.org/jira/browse/HIVE-3078 Project: Hive Issue Type: Bug Reporter: Shuai Ding Assignee: Shuai Ding Attachments: hive_input_output.patch
[jira] [Updated] (HIVE-3078) Add inputs/outputs for create table, create view and so forth
[ https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3078: - Status: Open (was: Patch Available) Please submit a review request on Phabricator or Reviewboard. Thanks. Add inputs/outputs for create table, create view and so forth - Key: HIVE-3078 URL: https://issues.apache.org/jira/browse/HIVE-3078 Project: Hive Issue Type: Bug Reporter: Shuai Ding Assignee: Shuai Ding Attachments: hive_input_output.patch
[jira] [Commented] (HIVE-3085) make parallel tests work
[ https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293117#comment-13293117 ] Shuai Ding commented on HIVE-3085: -- https://reviews.facebook.net/D3585 make parallel tests work Key: HIVE-3085 URL: https://issues.apache.org/jira/browse/HIVE-3085 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Namit Jain Assignee: Shuai Ding Attachments: hive.3085.1.patch, hive.3085.2.patch https://cwiki.apache.org/Hive/unit-test-parallel-execution.html I was trying to run the tests using the instructions above. I was able to run them using a single machine (parallelism of 4 in ~2 hours). The conf. file is as follows: .hive_ptest.conf { qfile_hosts: [ [root@MC, 4] ], other_hosts: [ [root@MC, 1] ], master_base_path: /data/users/tmp, host_base_path: /data/users/hivetests, java_home: /usr/local/jdk-6u24-64 }
[jira] [Commented] (HIVE-3078) Add inputs/outputs for create table, create view and so forth
[ https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293119#comment-13293119 ] Carl Steinbach commented on HIVE-3078: -- And please exclude the test updates from the review request to make it easier to read. Add inputs/outputs for create table, create view and so forth - Key: HIVE-3078 URL: https://issues.apache.org/jira/browse/HIVE-3078 Project: Hive Issue Type: Bug Reporter: Shuai Ding Assignee: Shuai Ding Attachments: hive_input_output.patch
[jira] [Commented] (HIVE-3085) make parallel tests work
[ https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293121#comment-13293121 ] Namit Jain commented on HIVE-3085: -- @Carl, I had a question about ant very-clean package. That takes a very long time (~20 min.) since we are downloading so many jar files. Wouldn't it be better to not populate hive*jar in ivy in our local builds? Then, ant clean package can run much faster. Or, what is the downside of removing '*hive*.jar' from .ivy2 and then running ant clean package? Other jars rarely change, but this saves ~10 min. in compile time. make parallel tests work Key: HIVE-3085 URL: https://issues.apache.org/jira/browse/HIVE-3085 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Namit Jain Assignee: Shuai Ding Attachments: hive.3085.1.patch, hive.3085.2.patch https://cwiki.apache.org/Hive/unit-test-parallel-execution.html I was trying to run the tests using the instructions above. I was able to run them using a single machine (parallelism of 4 in ~2 hours). The conf. file is as follows: .hive_ptest.conf { qfile_hosts: [ [root@MC, 4] ], other_hosts: [ [root@MC, 1] ], master_base_path: /data/users/tmp, host_base_path: /data/users/hivetests, java_home: /usr/local/jdk-6u24-64 }
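To see what removing '*hive*.jar' from .ivy2 would touch before trying it, a dry run along these lines may help. This is a sketch in Python, not part of the Hive build: `find_hive_jars` is a made-up helper, the cache location `~/.ivy2` is taken from the comment, and the script only lists matches without deleting anything.

```python
import fnmatch
import os


def find_hive_jars(cache_dir):
    """Return sorted paths under cache_dir whose file name matches '*hive*.jar'."""
    matches = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if fnmatch.fnmatch(name, "*hive*.jar"):
                matches.append(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    # Dry run: print the candidates only; nothing is removed.
    for path in find_hive_jars(os.path.expanduser("~/.ivy2")):
        print(path)
```

Whether dropping only these jars is safe for the build is exactly the question asked in the comment, so the listing is meant as an aid for that discussion, not as an answer to it.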
[jira] [Commented] (HIVE-3085) make parallel tests work
[ https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293124#comment-13293124 ]

Shuai Ding commented on HIVE-3085:

https://reviews.facebook.net/D3585

make parallel tests work

Key: HIVE-3085
URL: https://issues.apache.org/jira/browse/HIVE-3085
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Shuai Ding
Attachments: hive.3085.1.patch, hive.3085.2.patch
[jira] [Created] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed
Namit Jain created HIVE-3112:

Summary: clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed
Key: HIVE-3112
URL: https://issues.apache.org/jira/browse/HIVE-3112
Project: Hive
Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
[jira] [Updated] (HIVE-2693) Add DECIMAL data type
[ https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Wills updated HIVE-2693:

Attachment: HIVE-2693.patch

The old version of this change.

Add DECIMAL data type

Key: HIVE-2693
URL: https://issues.apache.org/jira/browse/HIVE-2693
Project: Hive
Issue Type: New Feature
Components: Query Processor, Types
Reporter: Carl Steinbach
Assignee: Josh Wills
Attachments: HIVE-2693.patch

Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice template for how to do this.
[jira] [Commented] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed
[ https://issues.apache.org/jira/browse/HIVE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293135#comment-13293135 ]

Kevin Wilfong commented on HIVE-3112:

+1 Will run tests

clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed

Key: HIVE-3112
URL: https://issues.apache.org/jira/browse/HIVE-3112
Project: Hive
Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:

Attachment: HIVE-2989.9.patch.txt

Updated per comments.

Adding Table Links to Hive

Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt, HIVE-2989.9.patch.txt
Original Estimate: 672h
Remaining Estimate: 672h

This will add Table Links to Hive. This will be an alternate mechanism for a user to access tables and data in a database that is different from the one he is associated with. This feature can be used to provide access control (if access to databasename.tablename in queries and use database X is turned off in conjunction).

If db X wants to access one or more partitions from table T in db Y, the user will issue:

    CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')

New partitions added to T will automatically be added to the link as well and become available to X. However, if the link is specified to be static, that will not be the case. The X user will then have to explicitly import each partition of T that he needs.

The command above will not actually make any existing partitions of T available to X. Instead, we provide the following command to add an existing partition to a link:

    ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')

The user will need to execute the above for each existing partition that needs to be imported. For future partitions, Hive will take care of this. An imported partition can be dropped from a link using a similar command. We just specify DROP instead of ADD.

For querying the linked table, the X user will refer to it as T@Y. Link Tables will only have read access and will not be writable. The entire Table Link, along with all its imported partitions, can be dropped as follows:

    DROP LINK TO T@Y

The above commands are purely MetaStore operations. The implementation will rely on replicating the entire partition metadata when a partition is added to a link.

For every link that is created, we will add a new row to table TBLS. The TBL_TYPE column will have a new kind of value LINK_TABLE (or STATIC_LINK_TABLE if the link has been specified as static). A new column LINK_TBL_ID will be added which will contain the id of the imported table. It will be NULL for all other table types including the regular managed tables.

When a partition is added to a link, the new row in the table PARTITIONS will point to the LINK_TABLE in the same database and not the master table in the other database. We will replicate all the metadata for this partition from the master database.

The advantage of this approach is that fewer changes will be needed in query processing and DDL for LINK_TABLEs. Also, commands like SHOW TABLES and SHOW PARTITIONS will work as expected for LINK_TABLEs too. Of course, even though the metadata is not shared, the underlying data on disk is still shared. Hive still needs to know that when dropping a partition which belongs to a LINK_TABLE, it should not drop the underlying data from HDFS.

Views and external tables cannot be imported from one database to another.
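The bookkeeping in the HIVE-2989 description can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `TblRow`, `ToyMetastore` and their methods are hypothetical names invented for illustration, not Hive's metastore API, and a plain list of partition specs stands in for the replicated rows in PARTITIONS.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TblRow:
    """One row of TBLS, reduced to the columns the description mentions."""
    tbl_id: int
    db: str
    name: str
    tbl_type: str = "MANAGED_TABLE"
    link_tbl_id: Optional[int] = None  # NULL for every non-link table
    partitions: List[str] = field(default_factory=list)  # stand-in for PARTITIONS rows


class ToyMetastore:
    """Hypothetical model of the proposal; not Hive's actual metastore."""

    def __init__(self) -> None:
        self.tbls: Dict[int, TblRow] = {}

    def create_table(self, db: str, name: str) -> TblRow:
        row = TblRow(len(self.tbls) + 1, db, name)
        self.tbls[row.tbl_id] = row
        return row

    def create_link(self, db: str, target: TblRow, static: bool = False) -> TblRow:
        # CREATE [STATIC] LINK TO T@Y: adds a TBLS row, imports no existing partitions.
        tbl_type = "STATIC_LINK_TABLE" if static else "LINK_TABLE"
        row = TblRow(len(self.tbls) + 1, db, f"{target.name}@{target.db}",
                     tbl_type, target.tbl_id)
        self.tbls[row.tbl_id] = row
        return row

    def add_partition(self, table: TblRow, spec: str) -> None:
        # A new partition on the master table is copied to every non-static link.
        table.partitions.append(spec)
        for row in self.tbls.values():
            if row.link_tbl_id == table.tbl_id and row.tbl_type == "LINK_TABLE":
                row.partitions.append(spec)

    def alter_link_add_partition(self, link: TblRow, spec: str) -> None:
        # ALTER LINK T@Y ADD PARTITION (...): import one existing partition.
        target = self.tbls[link.link_tbl_id]
        if spec not in target.partitions:
            raise KeyError(f"{spec} is not a partition of {target.name}")
        if spec not in link.partitions:
            link.partitions.append(spec)


ms = ToyMetastore()
t = ms.create_table("Y", "T")
ms.add_partition(t, "ds=2012-04-26")            # exists before any link is created
dynamic = ms.create_link("X", t)                # LINK_TABLE
static = ms.create_link("Z", t, static=True)    # STATIC_LINK_TABLE
ms.add_partition(t, "ds=2012-04-27")            # reaches only the non-static link
ms.alter_link_add_partition(static, "ds=2012-04-26")
print(dynamic.partitions, static.partitions)
```

Each link keeps its own copy of the partition entries, which mirrors the proposal's choice to replicate metadata rather than share it; the data on disk is not modeled here.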
[jira] [Updated] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhushan Mandhani updated HIVE-2989:

Hadoop Flags: (was: Reviewed)
Status: Patch Available (was: Open)

Adding Table Links to Hive

Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt, HIVE-2989.9.patch.txt
Original Estimate: 672h
Remaining Estimate: 672h
[jira] [Commented] (HIVE-3078) Add inputs/outputs for create table, create view and so forth
[ https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293137#comment-13293137 ]

Shuai Ding commented on HIVE-3078:

I have binaries in the diff, so I can't use arc.

Add inputs/outputs for create table, create view and so forth

Key: HIVE-3078
URL: https://issues.apache.org/jira/browse/HIVE-3078
Project: Hive
Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
Attachments: hive_input_output.patch
[jira] [Commented] (HIVE-3078) Add inputs/outputs for create table, create view and so forth
[ https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293162#comment-13293162 ]

Carl Steinbach commented on HIVE-3078:

I don't see any binaries in the patch. Not sure what you're referring to.

Add inputs/outputs for create table, create view and so forth

Key: HIVE-3078
URL: https://issues.apache.org/jira/browse/HIVE-3078
Project: Hive
Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
Attachments: hive_input_output.patch
[jira] [Created] (HIVE-3113) Querying of Table Links
Bhushan Mandhani created HIVE-3113:

Summary: Querying of Table Links
Key: HIVE-3113
URL: https://issues.apache.org/jira/browse/HIVE-3113
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

Implementation of querying of Table Links
[jira] [Updated] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed
[ https://issues.apache.org/jira/browse/HIVE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-3112:

Status: Open (was: Patch Available)

@Namit: Please add a comment to the test that points back to HIVE-3112. Thanks.

clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed

Key: HIVE-3112
URL: https://issues.apache.org/jira/browse/HIVE-3112
Project: Hive
Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
[jira] [Created] (HIVE-3114) Split Thrift interface for Table Link Creation
Bhushan Mandhani created HIVE-3114:

Summary: Split Thrift interface for Table Link Creation
Key: HIVE-3114
URL: https://issues.apache.org/jira/browse/HIVE-3114
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

Table Link creation through Thrift currently goes through the same method as Table creation. We want to move it out of there and into its own method.
[jira] [Created] (HIVE-3115) Table Links and Authorization
Bhushan Mandhani created HIVE-3115:

Summary: Table Links and Authorization
Key: HIVE-3115
URL: https://issues.apache.org/jira/browse/HIVE-3115
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

Incorporate Table Links into the existing authorization framework in Hive. Add tests to check that no breach of security permissions is possible.
[jira] [Commented] (HIVE-3078) Add inputs/outputs for create table, create view and so forth
[ https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293174#comment-13293174 ]

Shuai Ding commented on HIVE-3078:

https://reviews.facebook.net/D3603

Add inputs/outputs for create table, create view and so forth

Key: HIVE-3078
URL: https://issues.apache.org/jira/browse/HIVE-3078
Project: Hive
Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
Attachments: hive_input_output.patch
[jira] [Commented] (HIVE-3078) Add inputs/outputs for create table, create view and so forth
[ https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293175#comment-13293175 ]

Shuai Ding commented on HIVE-3078:

https://reviews.facebook.net/D3603

Add inputs/outputs for create table, create view and so forth

Key: HIVE-3078
URL: https://issues.apache.org/jira/browse/HIVE-3078
Project: Hive
Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
Attachments: hive_input_output.patch
[jira] [Commented] (HIVE-3085) make parallel tests work
[ https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293176#comment-13293176 ]

Carl Steinbach commented on HIVE-3085:

@Namit

bq. Won't it be better to not populate hive*jar in ivy in our local builds?

I don't think we can do that and also continue to list inter-subproject dependencies in the ivy.xml files. The basic reason why this doesn't work correctly for Hive is that the build is still not using Ivy correctly. More specifically, we're manually specifying the order in which subprojects are built instead of letting Ivy determine the order through dependency analysis.

bq. Or, what is the downside of removing '*hive*.jar' from .ivy2 and then running ant clean package.

The main downside is that the user may have configured ivy.cache.dir to be something other than ~/.ivy2, so to be sure that you're deleting the right files you have to get the value of ${ivy.cache.dir} (which didn't seem that straightforward the last time I looked at it: http://ant.apache.org/ivy/history/2.0.0/use/cleancache.html).

make parallel tests work

Key: HIVE-3085
URL: https://issues.apache.org/jira/browse/HIVE-3085
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Shuai Ding
Attachments: hive.3085.1.patch, hive.3085.2.patch
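As a rough illustration of the manual cleanup discussed in this thread, the sketch below lists (and optionally deletes) files matching `*hive*.jar` under a cache directory. It is not part of the Hive build: the function names are made up for this example, and the cache location is passed in explicitly precisely because, as noted above, ivy.cache.dir may not be ~/.ivy2.

```python
from pathlib import Path
from typing import List


def find_hive_jars(cache_dir: Path) -> List[Path]:
    """Return every file matching *hive*.jar anywhere under cache_dir."""
    return sorted(p for p in cache_dir.rglob("*hive*.jar") if p.is_file())


def remove_hive_jars(cache_dir: Path, dry_run: bool = True) -> List[Path]:
    """Delete the matching jars unless dry_run is set; return what matched."""
    jars = find_hive_jars(cache_dir)
    if not dry_run:
        for jar in jars:
            jar.unlink()
    return jars


if __name__ == "__main__":
    # Assumption: the cache lives at the default ~/.ivy2. Pass the real
    # ivy.cache.dir value instead if the build overrides it.
    for jar in remove_hive_jars(Path.home() / ".ivy2", dry_run=True):
        print("would delete", jar)
```

Defaulting to a dry run keeps the sketch from touching a shared cache by accident; all other cached jars are left in place, which is the point of the question.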
[jira] [Created] (HIVE-3116) Make very-clean Ant target more selective
Carl Steinbach created HIVE-3116:

Summary: Make very-clean Ant target more selective
Key: HIVE-3116
URL: https://issues.apache.org/jira/browse/HIVE-3116
Project: Hive
Issue Type: Bug
Components: Build Infrastructure
Reporter: Carl Steinbach
[jira] [Commented] (HIVE-3116) Make very-clean Ant target more selective
[ https://issues.apache.org/jira/browse/HIVE-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293180#comment-13293180 ]

Carl Steinbach commented on HIVE-3116:

Currently the very-clean target depends on ivy:cleancache. The original motivation for adding very-clean was to flush Hive artifacts out of the local Ivy cache, but the ivy:cleancache task actually deletes everything in ~/.ivy2.

The following page indicates that the ivy:cleancache task can be configured to use a specific Ivy settings file, which may allow us to limit the deleted artifacts to Hive only:

http://ant.apache.org/ivy/history/2.0.0/use/cleancache.html

Also relevant:

http://ant.apache.org/ivy/history/2.0.0/settings/caches.html

Make very-clean Ant target more selective

Key: HIVE-3116
URL: https://issues.apache.org/jira/browse/HIVE-3116
Project: Hive
Issue Type: Bug
Components: Build Infrastructure
Reporter: Carl Steinbach
[jira] [Updated] (HIVE-2745) Remove Hive's runtime dependency on bin/hadoop
[ https://issues.apache.org/jira/browse/HIVE-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-2745:

Summary: Remove Hive's runtime dependency on bin/hadoop (was: Remove Hive's runtime/test dependency on bin/hadoop)

Remove Hive's runtime dependency on bin/hadoop

Key: HIVE-2745
URL: https://issues.apache.org/jira/browse/HIVE-2745
Project: Hive
Issue Type: Improvement
Components: Build Infrastructure, Query Processor
Reporter: Carl Steinbach
[jira] [Created] (HIVE-3117) Determine order of subproject builds using ivy:buildlist task
Carl Steinbach created HIVE-3117:

Summary: Determine order of subproject builds using ivy:buildlist task
Key: HIVE-3117
URL: https://issues.apache.org/jira/browse/HIVE-3117
Project: Hive
Issue Type: Bug
Components: Build Infrastructure
Reporter: Carl Steinbach
[jira] [Commented] (HIVE-3117) Determine order of subproject builds using ivy:buildlist task
[ https://issues.apache.org/jira/browse/HIVE-3117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293190#comment-13293190 ]

Carl Steinbach commented on HIVE-3117:

We should use the ivy:buildlist task to determine the order of subproject builds instead of hardcoding this in the root build.xml file. Problems with the current approach include a) the fact that we're likely to pick up dirty Hive artifacts from the local Ivy cache and b) the fact that it's hard to prevent the subprojects from evolving circular dependencies.

References:
* http://ant.apache.org/ivy/history/latest-milestone/tutorial/multiproject.html
* http://stackoverflow.com/questions/4106143/ivy-simple-shared-repository

Determine order of subproject builds using ivy:buildlist task

Key: HIVE-3117
URL: https://issues.apache.org/jira/browse/HIVE-3117
Project: Hive
Issue Type: Bug
Components: Build Infrastructure
Reporter: Carl Steinbach
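The idea of deriving the build order from declared dependencies, rather than hardcoding it, can be shown with a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+) and is not how ivy:buildlist is implemented; the dependency edges between the subprojects are invented for illustration and do not come from Hive's ivy.xml files.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def build_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Order subprojects so that each one comes after everything it depends on."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as err:
        # The second element of args holds the nodes that form the cycle.
        raise ValueError(f"circular subproject dependency: {err.args[1]}") from err


# Hypothetical dependency map: subproject -> subprojects it depends on.
HYPOTHETICAL_DEPS = {
    "shims": set(),
    "common": {"shims"},
    "serde": {"common"},
    "metastore": {"serde"},
    "ql": {"metastore", "serde"},
}

print(build_order(HYPOTHETICAL_DEPS))
```

Because the order is computed from the dependency map, a circular dependency surfaces as an error at build time instead of creeping in unnoticed, which is point b) of the comment.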
[jira] [Commented] (HIVE-3085) make parallel tests work
[ https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293191#comment-13293191 ]

Carl Steinbach commented on HIVE-3085:

@Namit: I filed two followup tickets: HIVE-3116 and HIVE-3117. Thanks.

make parallel tests work

Key: HIVE-3085
URL: https://issues.apache.org/jira/browse/HIVE-3085
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Shuai Ding
Attachments: hive.3085.1.patch, hive.3085.2.patch
[jira] [Commented] (HIVE-3115) Table Links and Authorization
[ https://issues.apache.org/jira/browse/HIVE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293195#comment-13293195 ]

Carl Steinbach commented on HIVE-3115:

@Bhushan: Is this going to be done before or after HIVE-3113?

Table Links and Authorization

Key: HIVE-3115
URL: https://issues.apache.org/jira/browse/HIVE-3115
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor
[jira] [Commented] (HIVE-3114) Split Thrift interface for Table Link Creation
[ https://issues.apache.org/jira/browse/HIVE-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293198#comment-13293198 ]

Carl Steinbach commented on HIVE-3114:

@Bhushan: What's the timeline for doing this? I'm concerned that the Metastore Thrift interface is one of Hive's de facto public APIs, and any new functionality that appears in a release will need to be supported going forward. Why not just fix this in HIVE-2989 and eliminate the possibility that we're going to get stuck with an interface that we already know is broken?

Split Thrift interface for Table Link Creation

Key: HIVE-3114
URL: https://issues.apache.org/jira/browse/HIVE-3114
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor
[jira] [Commented] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293200#comment-13293200 ] Carl Steinbach commented on HIVE-2989: -- @Bhushan: I think HIVE-3114 (Split Thrift interface for TableLink creation) should be done in this patch instead of splitting it out into a followup ticket. Here's what I said in HIVE-3114: bq. I'm concerned that the Metastore Thrift interface is one of Hive's de facto public APIs, and any new functionality that appears in a release will need to be supported going forward. Why not just fix this in HIVE-2989 and eliminate the possibility that we're going to get stuck with an interface that we already know is broken? Adding Table Links to Hive -- Key: HIVE-2989 URL: https://issues.apache.org/jira/browse/HIVE-2989 Project: Hive Issue Type: Improvement Components: Metastore, Query Processor, Security Affects Versions: 0.10.0 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt, HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt, HIVE-2989.6.patch.txt, HIVE-2989.9.patch.txt Original Estimate: 672h Remaining Estimate: 672h This will add Table Links to Hive. This will be an alternate mechanism for a user to access tables and data in a database that is different from the one he is associated with. This feature can be used to provide access control (if access to databasename.tablename in queries and use database X is turned off in conjunction). If db X wants to access one or more partitions from table T in db Y, the user will issue: CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N') New partitions added to T will automatically be added to the link as well and become available to X. However, if the link is specified to be static, that will not be the case. The X user will then have to explicitly import each partition of T that he needs. 
The command above will not actually make any existing partitions of T available to X. Instead, we provide the following command to add an existing partition to a link: ALTER LINK T@Y ADD PARTITION (ds='2012-04-27') The user will need to execute the above for each existing partition that needs to be imported. For future partitions, Hive will take care of this. An imported partition can be dropped from a link using a similar command. We just specify DROP instead of ADD. For querying the linked table, the X user will refer to it as T@Y. Link Tables will only have read access and not be writable. The entire Table Link alongwith all its imported partitions can be dropped as follows: DROP LINK TO T@Y The above commands are purely MetaStore operations. The implementation will rely on replicating the entire partition metadata when a partition is added to a link. For every link that is created, we will add a new row to table TBLS. The TBL_TYPE column will have a new kind of value LINK_TABLE (or STATIC_LINK_TABLE if the link has been specified as static). A new column LINK_TBL_ID will be added which will contain the id of the imported table. It will be NULL for all other table types including the regular managed tables. When a partition is added to a link, the new row in the table PARTITIONS will point to the LINK_TABLE in the same database and not the master table in the other database. We will replicate all the metadata for this partition from the master database. The advantage of this approach is that fewer changes will be needed in query processing and DDL for LINK_TABLEs. Also, commands like SHOW TABLES and SHOW PARTITIONS will work as expected for LINK_TABLEs too. Of course, even though the metadata is not shared, the underlying data on disk is still shared. Hive still needs to know that when dropping a partition which belongs to a LINK_TABLE, it should not drop the underlying data from HDFS. Views and external tables cannot be imported from one database to another. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
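The metadata model described for HIVE-2989 can be illustrated with a small in-memory sketch. This is not Hive code: `ToyMetastore`, `TableRow`, `PartitionRow` and all method names are hypothetical, and only the column names `TBL_TYPE` / `LINK_TBL_ID` and the type values `LINK_TABLE` / `STATIC_LINK_TABLE` come from the description. It assumes a link gets its own TBLS row, that imported partitions are copied rows pointing at the link table, and that dropping a link partition must leave the shared data in place.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

LINK_TYPES = ("LINK_TABLE", "STATIC_LINK_TABLE")


@dataclass
class TableRow:
    """One row of TBLS, reduced to the columns discussed in the proposal."""
    tbl_id: int
    db: str
    name: str
    tbl_type: str
    link_tbl_id: Optional[int] = None  # NULL for every non-link table


@dataclass
class PartitionRow:
    """One row of PARTITIONS; location is the data path shared on HDFS."""
    tbl_id: int
    spec: str
    location: str


class ToyMetastore:
    """Hypothetical in-memory stand-in for the metastore tables."""

    def __init__(self) -> None:
        self.tables: Dict[int, TableRow] = {}
        self.partitions: List[PartitionRow] = []
        self.deleted_paths: List[str] = []  # paths whose data was removed

    def _new_table(self, db, name, tbl_type, link_tbl_id=None) -> TableRow:
        row = TableRow(len(self.tables) + 1, db, name, tbl_type, link_tbl_id)
        self.tables[row.tbl_id] = row
        return row

    def create_table(self, db: str, name: str) -> TableRow:
        return self._new_table(db, name, "MANAGED_TABLE")

    def create_link(self, db: str, target: TableRow, static: bool = False) -> TableRow:
        # The proposal excludes views and external tables; this sketch
        # simply accepts managed tables only.
        if target.tbl_type != "MANAGED_TABLE":
            raise ValueError("only managed tables can be linked in this sketch")
        tbl_type = "STATIC_LINK_TABLE" if static else "LINK_TABLE"
        return self._new_table(db, f"{target.name}@{target.db}", tbl_type, target.tbl_id)

    def partitions_of(self, table: TableRow) -> List[PartitionRow]:
        return [p for p in self.partitions if p.tbl_id == table.tbl_id]

    def add_partition(self, table: TableRow, spec: str, location: str) -> None:
        if table.tbl_type in LINK_TYPES:
            raise ValueError("link tables are read-only; use import_partition")
        self.partitions.append(PartitionRow(table.tbl_id, spec, location))
        # Non-static links pick up new partitions of the master table.
        for link in self.tables.values():
            if link.tbl_type == "LINK_TABLE" and link.link_tbl_id == table.tbl_id:
                self.partitions.append(PartitionRow(link.tbl_id, spec, location))

    def import_partition(self, link: TableRow, spec: str) -> None:
        if link.tbl_type not in LINK_TYPES:
            raise ValueError("import_partition only applies to link tables")
        master = self.tables[link.link_tbl_id]
        for p in self.partitions_of(master):
            if p.spec == spec:
                # Replicate the partition metadata under the link table.
                self.partitions.append(PartitionRow(link.tbl_id, p.spec, p.location))
                return
        raise KeyError(spec)

    def drop_partition(self, table: TableRow, spec: str) -> None:
        for p in self.partitions_of(table):
            if p.spec == spec:
                self.partitions.remove(p)
                # Only the owning table may delete the underlying data.
                if table.tbl_type not in LINK_TYPES:
                    self.deleted_paths.append(p.location)
                return
        raise KeyError(spec)
```

In this sketch, a partition that existed before the link was created has to be imported explicitly, later partitions propagate to non-static links, and dropping a partition through the link never touches `deleted_paths`.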
[jira] [Commented] (HIVE-2989) Adding Table Links to Hive
[ https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293201#comment-13293201 ]

Carl Steinbach commented on HIVE-2989:

@Bhushan: I'll look over the rest of patch later tonight. Thanks.

Adding Table Links to Hive

Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
[jira] [Updated] (HIVE-1362) column level statistics
[ https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1362:

Assignee: Shreepadma Venugopalan
Labels: (was: gsoc gsoc2012)

column level statistics

Key: HIVE-1362
URL: https://issues.apache.org/jira/browse/HIVE-1362
Project: Hive
Issue Type: Sub-task
Components: Statistics
Reporter: Ning Zhang
Assignee: Shreepadma Venugopalan
[jira] [Resolved] (HIVE-1940) Query Optimization Using Column Metadata and Histograms
[ https://issues.apache.org/jira/browse/HIVE-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach resolved HIVE-1940.

Resolution: Duplicate

Resolving this as a duplicate of HIVE-1938 and HIVE-1362.

Query Optimization Using Column Metadata and Histograms

Key: HIVE-1940
URL: https://issues.apache.org/jira/browse/HIVE-1940
Project: Hive
Issue Type: New Feature
Components: Metastore, Query Processor, Statistics
Reporter: Anja Gruenheid
Attachments: Agruenheid_ideas11.pdf, HiveMetaStore.pdf

The current basis for cost-based query optimization in Hive is information gathered on tables and partitions. To make further improvements in query optimization possible, the next step is to develop and implement ways to gather information on columns, as discussed in issue HIVE-33. After that, an implementation of histograms is a possible option to use and collect run-time statistics. Besides the actual implementation of these features, it is also necessary to develop a consistent storage model for the MetaStore.
[jira] [Commented] (HIVE-2950) Hive should store the full table schema in partition storage descriptors
[ https://issues.apache.org/jira/browse/HIVE-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293218#comment-13293218 ]

Ashutosh Chauhan commented on HIVE-2950:

@Travis, Can you upload the latest patch at jira. Unfortunately, Phabricator doesn't let you download a patch file.

Hive should store the full table schema in partition storage descriptors

Key: HIVE-2950
URL: https://issues.apache.org/jira/browse/HIVE-2950
[jira] [Commented] (HIVE-2950) Hive should store the full table schema in partition storage descriptors
[ https://issues.apache.org/jira/browse/HIVE-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293227#comment-13293227 ]

Travis Crawford commented on HIVE-2950:

The patch is actually the same - the one attached to Jira is up-to-date.

Hive should store the full table schema in partition storage descriptors

Key: HIVE-2950
URL: https://issues.apache.org/jira/browse/HIVE-2950
Project: Hive
Issue Type: Bug
Reporter: Travis Crawford
Assignee: Travis Crawford
Attachments: HIVE-2950.D2769.1.patch

Hive tables have a schema, which is copied into the partition storage descriptor when adding a partition. Currently only columns stored in the table storage descriptor are copied - columns that are reported by the serde are not copied. Instead of copying the table storage descriptor columns into the partition columns, the full table schema should be copied.

DETAILS

This is a little long but is necessary to show 3 things: current behavior when explicitly listing columns, behavior with HIVE-2941 patched in and serde reported columns, and finally the behavior with this patch (full table schema copied into the partition storage descriptor).

Here's an example of what currently happens. Note the following:

* the two manually-defined fields defined for the table are listed in the table storage descriptor.
* both fields are present in the partition storage descriptor

This works great because users who query for a partition can look at its storage descriptor and get the schema.

{code}
hive> create external table foo_test (name string, age int) partitioned by (part_dt string);
hive> describe extended foo_test;
OK
name string
age int
part_dt string

Detailed Table Information Table(tableName:foo_test, dbName:travis_test, owner:travis, createTime:1334256062, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:name, type:string, comment:null), FieldSchema(name:age, type:int, comment:null), FieldSchema(name:part_dt, type:string, comment:null)], location:hdfs://foo.com/warehouse/travis_test.db/foo_test, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, primaryRegionName:, secondaryRegions:[]), partitionKeys:[FieldSchema(name:part_dt, type:string, comment:null)], parameters:{EXTERNAL=TRUE, transient_lastDdlTime=1334256062}, viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE)
Time taken: 0.082 seconds
hive> alter table foo_test add partition (part_dt = '20120331T00Z') location 'hdfs://foo.com/foo/2012/03/31/00';
hive> describe extended foo_test partition (part_dt = '20120331T00Z');
OK
name string
age int
part_dt string

Detailed Partition Information Partition(values:[20120331T00Z], dbName:travis_test, tableName:foo_test, createTime:1334256131, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:name, type:string, comment:null), FieldSchema(name:age, type:int, comment:null), FieldSchema(name:part_dt, type:string, comment:null)], location:hdfs://foo.com/foo/2012/03/31/00, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, primaryRegionName:, secondaryRegions:[]), parameters:{transient_lastDdlTime=1334256131})
{code}

CURRENT BEHAVIOR WITH HIVE-2941 PATCHED IN

Now let's examine what happens when creating a table when the serde reports the schema. Notice the following:

* The table storage descriptor contains an empty list of columns. However, the table schema is available from the serde reflecting on the serialization class.
* The partition storage descriptor does contain a single part_dt column that was copied from the table partition keys. The actual data columns are not present.

{code}
hive> create external table travis_test.person_test partitioned by (part_dt string) row format serde com.twitter.elephantbird.hive.serde.ThriftSerDe with serdeproperties (serialization.class=com.twitter.elephantbird.examples.thrift.Person) stored as inputformat com.twitter.elephantbird.mapred.input.HiveMultiInputFormat outputformat
[jira] [Commented] (HIVE-3115) Table Links and Authorization
[ https://issues.apache.org/jira/browse/HIVE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293233#comment-13293233 ]

Namit Jain commented on HIVE-3115:

This will be done after HIVE-3113 from Facebook. Having said that, this is an open jira, and if someone wants to work on it, we would love to review it and take it forward.

Table Links and Authorization

Key: HIVE-3115
URL: https://issues.apache.org/jira/browse/HIVE-3115
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

Incorporate Table Links into the existing authorization framework in Hive. Add tests to check that no breach of security permissions is possible.
[jira] [Commented] (HIVE-3115) Table Links and Authorization
[ https://issues.apache.org/jira/browse/HIVE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293236#comment-13293236 ]

Carl Steinbach commented on HIVE-3115:

@Namit: I think this ticket either has to be done at the same time as HIVE-3113 or before it. Otherwise you're adding a pretty big security hole with the table links feature.

Table Links and Authorization

Key: HIVE-3115
URL: https://issues.apache.org/jira/browse/HIVE-3115
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

Incorporate Table Links into the existing authorization framework in Hive. Add tests to check that no breach of security permissions is possible.
[jira] [Commented] (HIVE-3114) Split Thrift interface for Table Link Creation
[ https://issues.apache.org/jira/browse/HIVE-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293237#comment-13293237 ]

Namit Jain commented on HIVE-3114:

I don't think so - there are many features (views/indexes etc.) which have been around for a long time without a thrift interface. Given that there is a validity check and existing thrift APIs cannot create invalid objects (barring bugs in the validity check), HIVE-3114 should not be a pre-req. for HIVE-2989.

Split Thrift interface for Table Link Creation

Key: HIVE-3114
URL: https://issues.apache.org/jira/browse/HIVE-3114
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

Table Link creation through Thrift currently goes through the same method as Table creation. We want to move it out of there and into its own method.
[jira] [Commented] (HIVE-3115) Table Links and Authorization
[ https://issues.apache.org/jira/browse/HIVE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293239#comment-13293239 ]

Namit Jain commented on HIVE-3115:

Security has been added recently, and there are many existing features which do not work very well with it. HIVE-3078 is completing the inputs/outputs list and trying to plug some of these holes. Given that there are many issues in security currently, it seems wrong to put the burden of security on links. Links is a new feature, and lots of bells and whistles will be added over time.

Table Links and Authorization

Key: HIVE-3115
URL: https://issues.apache.org/jira/browse/HIVE-3115
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

Incorporate Table Links into the existing authorization framework in Hive. Add tests to check that no breach of security permissions is possible.
[jira] [Commented] (HIVE-3114) Split Thrift interface for Table Link Creation
[ https://issues.apache.org/jira/browse/HIVE-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293243#comment-13293243 ]

Carl Steinbach commented on HIVE-3114:

bq. I don't think so - there are many features (views/indexes etc.) which have been around for a long time without a thrift interface.

That's true, and it was a mistake to do it that way. Instead of continuing to compound the effects of an earlier bad decision, can we please invest a little extra time and actually make the situation better?

Also, this is the sort of thing that should have been described up front in the design document. Since the List Bucketing feature requires similar changes, I'd like to see any metastore API changes that the feature requires explained in the design doc before they appear in a patch.

Split Thrift interface for Table Link Creation

Key: HIVE-3114
URL: https://issues.apache.org/jira/browse/HIVE-3114
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

Table Link creation through Thrift currently goes through the same method as Table creation. We want to move it out of there and into its own method.
[jira] [Commented] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed
[ https://issues.apache.org/jira/browse/HIVE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293246#comment-13293246 ]

Namit Jain commented on HIVE-3112:

Comments

clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed

Key: HIVE-3112
URL: https://issues.apache.org/jira/browse/HIVE-3112
Project: Hive
Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
[jira] [Updated] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed
[ https://issues.apache.org/jira/browse/HIVE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3112:

Status: Patch Available (was: Open)

clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed

Key: HIVE-3112
URL: https://issues.apache.org/jira/browse/HIVE-3112
Project: Hive
Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
[jira] [Commented] (HIVE-3114) Split Thrift interface for Table Link Creation
[ https://issues.apache.org/jira/browse/HIVE-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293250#comment-13293250 ]

Namit Jain commented on HIVE-3114:

@Carl, I agree with the comment on list bucketing. Please address it in the wiki, we should definitely take that into account.

Agreed it was a mistake for unclean thrift API. But, we cannot penalize a single feature, which we need urgently, for that. I am all for the thrift cleanup - it is just that the timing is not right from our side. I would be happy to help in any way if someone else takes the thrift API cleanup effort.

@Bhushan, in the links wiki - can you add a follow-up for the thrift interface ? Or, just add a link to all the follow-up jiras there.

Split Thrift interface for Table Link Creation

Key: HIVE-3114
URL: https://issues.apache.org/jira/browse/HIVE-3114
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

Table Link creation through Thrift currently goes through the same method as Table creation. We want to move it out of there and into its own method.
[jira] [Commented] (HIVE-3109) metastore state not cleared
[ https://issues.apache.org/jira/browse/HIVE-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293255#comment-13293255 ]

Kevin Wilfong commented on HIVE-3109:

HIVE-3112 unsets the parameter hive.metastore.partition.inherit.table.properties at the end of

ql/src/test/queries/clientpositive/part_inherit_tbl_props.q
ql/src/test/queries/clientpositive/part_inherit_tbl_props_with_star.q

As part of this JIRA, the unsetting should be removed, and the ant test command in the Description should still work.

metastore state not cleared

Key: HIVE-3109
URL: https://issues.apache.org/jira/browse/HIVE-3109
Project: Hive
Issue Type: Bug
Reporter: Namit Jain
Assignee: Ashutosh Chauhan

When some of the tests are run in a particular order, random bugs are encountered.

    ant test -Dtestcase=TestCliDriver -Dqfile=part_inherit_tbl_props.q,stats1.q

leads to an error in stats1.q.

We ran into this error as part of parallel testing (HIVE-3085). As part of HIVE-3085, this will be fixed temporarily by clearing hive.metastore.partition.inherit.table.properties at the end of the test. But, in general, any property set in one .q file should not affect anything in other tests.
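The general fix asked for in HIVE-3109 (a property set by one .q file must not leak into the next) can be sketched as a snapshot-and-restore wrapper around each test. This is only an illustration of the idea, not how Hive's test driver is implemented: `run_isolated` and the two stand-in test functions are hypothetical, the configuration is modelled as a plain dict, and the value `"a,b"` is made up.

```python
from typing import Callable, Dict, Iterable, List

Conf = Dict[str, str]

PROP = "hive.metastore.partition.inherit.table.properties"


def run_isolated(conf: Conf, tests: Iterable[Callable[[Conf], None]]) -> None:
    """Run each test against the shared conf and restore it afterwards."""
    for test in tests:
        snapshot = dict(conf)      # state before the test starts
        try:
            test(conf)
        finally:
            conf.clear()           # drop anything the test set or changed
            conf.update(snapshot)  # put the original values back


seen: List[str] = []


def part_inherit_tbl_props(conf: Conf) -> None:
    # Stands in for part_inherit_tbl_props.q, which sets the property.
    conf[PROP] = "a,b"


def stats1(conf: Conf) -> None:
    # Stands in for stats1.q; it records the value it observes.
    seen.append(conf.get(PROP, ""))


conf: Conf = {PROP: ""}
run_isolated(conf, [part_inherit_tbl_props, stats1])
```

With the wrapper in place the second test observes the original empty value, so the explicit unsetting at the end of the first .q file would no longer be needed in this model.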
[jira] [Commented] (HIVE-3061) hive.binary.record.max.length is a magic string
[ https://issues.apache.org/jira/browse/HIVE-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293259#comment-13293259 ]

Hudson commented on HIVE-3061:

Integrated in Hive-trunk-h0.21 #1479 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1479/])
HIVE-3061 hive.binary.record.max.length is a magic string (Edward Capriolo via namit) (Revision 1348808)

Result = SUCCESS
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1348808
Files :
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/BinaryRecordReader.java

hive.binary.record.max.length is a magic string

Key: HIVE-3061
URL: https://issues.apache.org/jira/browse/HIVE-3061
Project: Hive
Issue Type: Task
Affects Versions: 0.8.1
Reporter: Edward Capriolo
Assignee: Edward Capriolo
Attachments: HIVE-3061.patch.1.txt
[jira] [Commented] (HIVE-2694) Add FORMAT UDF
[ https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293260#comment-13293260 ]

Hudson commented on HIVE-2694:

Integrated in Hive-trunk-h0.21 #1479 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1479/])
HIVE-2694. Add FORMAT UDF (Zhenxiao Luo via cws) (Revision 1348976)

Result = SUCCESS
cws : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1348976
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFormatNumber.java
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong1.q
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong2.q
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong3.q
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong4.q
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong5.q
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong6.q
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong7.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_format_number.q
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong1.q.out
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong2.q.out
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong3.q.out
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong4.q.out
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong5.q.out
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong6.q.out
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/show_functions.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_format_number.q.out

Add FORMAT UDF

Key: HIVE-2694
URL: https://issues.apache.org/jira/browse/HIVE-2694
Project: Hive
Issue Type: New Feature
Components: UDF
Reporter: Carl Steinbach
Assignee: Zhenxiao Luo
Fix For: 0.10.0
Attachments: HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, HIVE-2694.3.patch.txt, HIVE-2694.4.patch.txt, HIVE-2694.D1149.1.patch, HIVE-2694.D1149.2.patch, HIVE-2694.D1149.3.patch, HIVE-2694.D2673.1.patch, HIVE-2694.D2673.1.patch
[jira] [Comment Edited] (HIVE-3114) Split Thrift interface for Table Link Creation
[ https://issues.apache.org/jira/browse/HIVE-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293261#comment-13293261 ] Carl Steinbach edited comment on HIVE-3114 at 6/12/12 1:37 AM: --- bq. Agreed it was a mistake for unclean thrift API. But, we cannot penalize a single feature, which we need urgently, for that. I am all for the thrift cleanup - it is just that the timing is not right from our side. I'm not asking you to clean up the entire metastore Thrift API. All I'm asking is for you to add a createTableLink() method instead of overloading the createTable() method. It's fine if both createTable() and createTableLink() use a common codepath behind the Thrift API, but the Thrift API needs to call these things out as distinct operations. bq. @Bhushan, in the links wiki - can you add a follow-up for the thrift interface ? When are these followups going to be addressed? If they aren't committed in time for the 0.10.0 release are you OK with us backing out these changes? was (Author: cwsteinbach): bq. Agreed it was a mistake for unclean thrift API. But, we cannot penalize a single feature, which we need urgently, for that. I am all for the thrift cleanup - it is just that the timing is not right from our side. I'm not asking you to clean up the entire metastore Thrift API. All I'm asking is for you to add a createTableLink() method instead of overloading the createTable() method. It's fine if both createTable() and createTableLink() use a common codepath behind the Thrift API, but the Thrift API needs to call these things out as distinct operations. bq. @Bhushan, in the links wiki - can you add a follow-up for the thrift interface ? When are these followups going to be addressed? If they aren't committed in time for the 0.10.0 release are you OK with us backing out these changes? 
Split Thrift interface for Table Link Creation
Key: HIVE-3114 URL: https://issues.apache.org/jira/browse/HIVE-3114 Project: Hive Issue Type: New Feature Affects Versions: 0.10.0 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor
Table Link creation through Thrift currently goes through the same method as Table creation. We want to move it out of there and into its own method.
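For readers following the HIVE-3114 thread, here is a minimal Java sketch of the split being asked for: two distinct operations at the API surface that share one codepath behind it. Everything in it is an assumption made for illustration. The interface, the in-memory store, the parameter lists and the string format are hypothetical and are not taken from the Hive metastore Thrift IDL or from any attached patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; none of these names come from the Hive metastore. */
final class TableLinkApiSketch {

    /** The API surface names table creation and link creation as separate operations. */
    interface MetastoreApi {
        void createTable(String db, String table);

        void createTableLink(String db, String linkName, String targetDb, String targetTable);
    }

    /** Toy in-memory implementation standing in for the real metastore handler. */
    static class InMemoryMetastore implements MetastoreApi {
        final List<String> entries = new ArrayList<>();

        @Override
        public void createTable(String db, String table) {
            // A plain table has no link target.
            createObject(db, table, null);
        }

        @Override
        public void createTableLink(String db, String linkName, String targetDb, String targetTable) {
            // A link records which table it points at.
            createObject(db, linkName, targetDb + "." + targetTable);
        }

        // Common codepath behind the API: callers never see this method.
        private void createObject(String db, String name, String linkTarget) {
            String kind = (linkTarget == null) ? "TABLE" : "LINK->" + linkTarget;
            entries.add(db + "." + name + " [" + kind + "]");
        }
    }

    public static void main(String[] args) {
        InMemoryMetastore ms = new InMemoryMetastore();
        ms.createTable("default", "events");
        ms.createTableLink("sandbox", "events_link", "default", "events");
        ms.entries.forEach(System.out::println);
    }
}
```

The point of the shape is that a client calling createTable() cannot create a link by accident, while the server side is still free to reuse a single implementation for both.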
[jira] [Commented] (HIVE-3115) Table Links and Authorization
[ https://issues.apache.org/jira/browse/HIVE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293263#comment-13293263 ]
Namit Jain commented on HIVE-3115:
If there is no one outside the walls of Facebook who is interested in using links, then not having authorization for links should not be a problem for anyone outside Facebook anyway. I am not saying having authorization for links is not a good idea - advanced users like Facebook will also need it, but this should not be coupled to the patch. It can definitely be done in a follow-up. If I remember right, security was added for most of the new features in follow-ups. In fact, having the patch will make it easy for other contributors, or us, to quickly address security.
Everyone has their own priority of features, and they are free to work on their own priorities. Some features which are very useful for advanced users may not be applicable for many other users right away, but they do get a free ride in the long run. Anyway, we already had a long discussion on the wiki - and there is no point repeating it.
Table Links and Authorization
Key: HIVE-3115 URL: https://issues.apache.org/jira/browse/HIVE-3115 Project: Hive Issue Type: New Feature Affects Versions: 0.10.0 Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor
Incorporate Table Links into the existing authorization framework in Hive. Add tests to check that no breach of security permissions is possible.
[jira] [Commented] (HIVE-3114) Split Thrift interface for Table Link Creation
[ https://issues.apache.org/jira/browse/HIVE-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293264#comment-13293264 ]
Namit Jain commented on HIVE-3114:
Absolutely not. A feature may not be ready for 0.10, but that does not mean that the code for that feature will be deleted. If 'links' is not available in 0.10, let us document it clearly - when all the jiras are ready, 'links' will be available in that release.
[jira] [Commented] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed
[ https://issues.apache.org/jira/browse/HIVE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293284#comment-13293284 ]
Kevin Wilfong commented on HIVE-3112:
+1 Looks like Namit addressed Carl's comments.
clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed
Key: HIVE-3112 URL: https://issues.apache.org/jira/browse/HIVE-3112 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain
[jira] [Commented] (HIVE-3085) make parallel tests work
[ https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293289#comment-13293289 ]
Shuai Ding commented on HIVE-3085:
https://reviews.facebook.net/D3585
make parallel tests work
Key: HIVE-3085 URL: https://issues.apache.org/jira/browse/HIVE-3085 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Namit Jain Assignee: Shuai Ding Attachments: hive.3085.1.patch, hive.3085.2.patch
https://cwiki.apache.org/Hive/unit-test-parallel-execution.html
I was trying to run the tests using the instructions above. I was able to run them using a single machine (parallelism of 4 in ~2 hours). The conf. file is as follows:
.hive_ptest.conf
{
  qfile_hosts: [ [root@MC, 4] ],
  other_hosts: [ [root@MC, 1] ],
  master_base_path: /data/users/tmp,
  host_base_path: /data/users/hivetests,
  java_home: /usr/local/jdk-6u24-64
}
[jira] [Updated] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed
[ https://issues.apache.org/jira/browse/HIVE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Wilfong updated HIVE-3112:
Resolution: Fixed
Status: Resolved (was: Patch Available)
Committed. Thanks Namit.