[jira] [Updated] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly
[ https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-2839: -- Attachment: HIVE-2839.D2079.4.patch navis updated the revision HIVE-2839 [jira] Filters on outer join with mapjoin hint is not applied correctly. Addressed comments Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D2079 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D2079?vs=27147id=27171#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java ql/src/test/queries/clientpositive/mapjoin1.q ql/src/test/results/clientpositive/mapjoin1.q.out To: JIRA, navis Cc: njain Filters on outer join with mapjoin hint is not applied correctly Key: HIVE-2839 URL: https://issues.apache.org/jira/browse/HIVE-2839 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0 Reporter: Navis Assignee: Navis Priority: Minor Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, HIVE-2839.D2079.4.patch Testing HIVE-2820, I've found some queries with mapjoin hint makes exceptions. 
{code} SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND true limit 10; FAILED: Hive Internal Error: java.lang.ClassCastException(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc) java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:363) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.generateMapJoinOperator(MapJoinProcessor.java:483) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.transform(MapJoinProcessor.java:689) at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7519) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:891) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) {code} and {code} SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND b.key * 10 '1000' limit 10; 
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325) at org.apache.hadoop.mapred.Child$4.run(Child.java:270) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:416) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127) at org.apache.hadoop.mapred.Child.main(Child.java:264) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325) at
[jira] [Updated] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly
[ https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-2839: -- Attachment: HIVE-2839.D2079.5.patch navis updated the revision HIVE-2839 [jira] Filters on outer join with mapjoin hint is not applied correctly. Typos, sorry. Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D2079 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D2079?vs=27171id=27177#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java ql/src/test/queries/clientpositive/mapjoin1.q ql/src/test/results/clientpositive/mapjoin1.q.out To: JIRA, navis Cc: njain Filters on outer join with mapjoin hint is not applied correctly Key: HIVE-2839 URL: https://issues.apache.org/jira/browse/HIVE-2839 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, HIVE-2839.D2079.4.patch, HIVE-2839.D2079.5.patch Testing HIVE-2820, I've found some queries with mapjoin hint makes exceptions. 
{code} SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND true limit 10; FAILED: Hive Internal Error: java.lang.ClassCastException(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc) java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:363) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.generateMapJoinOperator(MapJoinProcessor.java:483) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.transform(MapJoinProcessor.java:689) at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7519) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:891) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) {code} and {code} SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND b.key * 10 '1000' limit 10; 
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325) at org.apache.hadoop.mapred.Child$4.run(Child.java:270) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:416) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127) at org.apache.hadoop.mapred.Child.main(Child.java:264) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325) at
[jira] [Updated] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly
[ https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-2839: Affects Version/s: (was: 0.9.0) Status: Patch Available (was: Open) Filters on outer join with mapjoin hint is not applied correctly Key: HIVE-2839 URL: https://issues.apache.org/jira/browse/HIVE-2839 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, HIVE-2839.D2079.4.patch, HIVE-2839.D2079.5.patch Testing HIVE-2820, I've found some queries with mapjoin hint makes exceptions. {code} SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND true limit 10; FAILED: Hive Internal Error: java.lang.ClassCastException(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc) java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:363) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.generateMapJoinOperator(MapJoinProcessor.java:483) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.transform(MapJoinProcessor.java:689) at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7519) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:891) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at 
org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) {code} and {code} SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND b.key * 10 '1000' limit 10; java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325) at org.apache.hadoop.mapred.Child$4.run(Child.java:270) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:416) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127) at org.apache.hadoop.mapred.Child.main(Child.java:264) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:495) at 
org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143) ... 8 more {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: branch for ptf and windowing function
Hi Ashutosh,

My +1 for the proposal to create a separate branch for feature development. I do have one question in this regard: how do you plan on keeping this branch in sync with the trunk? If the branch is allowed to diverge indefinitely, the build from it will likely lag in features and fixes that are otherwise available on the trunk. It would be great if you could first synchronize the branch with the trunk and then follow a policy of periodic merges from the trunk into the development branch.

Regards,
Arvind Prabhakar

On Fri, Feb 1, 2013 at 10:11 AM, Ashutosh Chauhan hashut...@apache.org wrote:

Hi all,

Harish and Prajkta are doing some cool work over at https://issues.apache.org/jira/browse/HIVE-896. IMO it's a very useful feature for the community and our user base. Harish and Prajkta have been making steady progress on this for much of the last year in their GitHub repo, https://github.com/hbutani/hive, and much of the feature is now functional. However, it's quite a bit of work and new code, which will take some time before being ready for trunk. I propose that we create a new branch so that further development happens in the Apache repo instead of the GitHub repo. This gets us a few benefits:

a) It will avoid the situation we ended up in with HiveServer2, where useful new functionality arrived in one big patch, which made its review, and thus its inclusion in mainline, harder than it should have been.
b) The obvious advantages of development being done in Apache as opposed to GitHub, which are:
   i) It will make it easier for Apache Hive community members interested in this work (like me) to follow progress.
   ii) It will make it easier for Apache Hive community members interested in this work to contribute.
   iii) It will make it easier for Apache community members to review the work and provide feedback.
I further propose that we follow a commit-then-review policy for this feature branch, which will enable contributors to make rapid progress without waiting for lengthy review cycles. Hive committers interested in the work can either review the branch any time they want to provide feedback, or wait until contributors declare the work complete and propose merging it into trunk, and review it then. This is, in any case, a throwaway branch not intended for making releases. Unless I hear any objections, I will create a branch over the weekend.

Thanks,
Ashutosh
[jira] [Updated] (HIVE-2379) Hive/HBase integration could be improved
[ https://issues.apache.org/jira/browse/HIVE-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guido Serra aka Zeph updated HIVE-2379: --- Priority: Critical (was: Minor) Issue Type: Bug (was: Improvement) ahemm [~namitjain], sorry if I dare to raise this to Critical and switch it to a Bug, but it is preventing one of the major use cases from working at all. From hive's console I can sort of circumvent it, but from Hue or from a JDBC connector I have no easy way (it is even totally impossible from an end user's perspective). Is anyone reviewing the patch? I tried to compile and apply it, but the unit tests against 974918c Hive 0.9.0-rc0 release are failing, and I can't judge whether I get to a consistent state by applying it. Hive/HBase integration could be improved Key: HIVE-2379 URL: https://issues.apache.org/jira/browse/HIVE-2379 Project: Hive Issue Type: Bug Components: CLI, Clients, HBase Handler Affects Versions: 0.7.1, 0.8.0, 0.9.0 Reporter: Roman Shaposhnik Assignee: Navis Priority: Critical Attachments: HIVE-2379.D7347.1.patch For now any Hive/HBase queries would require the following jars to be explicitly added via hive's add jar command: add jar /usr/lib/hive/lib/hbase-0.90.1-cdh3u0.jar; add jar /usr/lib/hive/lib/hive-hbase-handler-0.7.0-cdh3u0.jar; add jar /usr/lib/hive/lib/zookeeper-3.3.1.jar; add jar /usr/lib/hive/lib/guava-r06.jar; the longer term solution, perhaps, should be to have the code at submit time call hbase's TableMapREduceUtil.addDependencyJar(job, HBaseStorageHandler.class) to ship it in distributedcache.
[jira] [Commented] (HIVE-2379) Hive/HBase integration could be improved
[ https://issues.apache.org/jira/browse/HIVE-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570077#comment-13570077 ] Guido Serra aka Zeph commented on HIVE-2379: (uhmm... looks like I stepped into HIVE-2937) {code} [junit] Running org.apache.hadoop.hive.service.TestHiveServerSessions [junit] Running org.apache.hadoop.hive.service.TestHiveServerSessions [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec {code} Hive/HBase integration could be improved Key: HIVE-2379 URL: https://issues.apache.org/jira/browse/HIVE-2379 Project: Hive Issue Type: Bug Components: CLI, Clients, HBase Handler Affects Versions: 0.7.1, 0.8.0, 0.9.0 Reporter: Roman Shaposhnik Assignee: Navis Priority: Critical Attachments: HIVE-2379.D7347.1.patch For now any Hive/HBase queries would require the following jars to be explicitly added via hive's add jar command: add jar /usr/lib/hive/lib/hbase-0.90.1-cdh3u0.jar; add jar /usr/lib/hive/lib/hive-hbase-handler-0.7.0-cdh3u0.jar; add jar /usr/lib/hive/lib/zookeeper-3.3.1.jar; add jar /usr/lib/hive/lib/guava-r06.jar; the longer term solution, perhaps, should be to have the code at submit time call hbase's TableMapREduceUtil.addDependencyJar(job, HBaseStorageHandler.class) to ship it in distributedcache.
[jira] [Commented] (HIVE-2379) Hive/HBase integration could be improved
[ https://issues.apache.org/jira/browse/HIVE-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570078#comment-13570078 ] Guido Serra aka Zeph commented on HIVE-2379: and... {code} [junit] Running org.apache.hadoop.hive.cli.TestCliDriver [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec {code} but not much output though Hive/HBase integration could be improved Key: HIVE-2379 URL: https://issues.apache.org/jira/browse/HIVE-2379 Project: Hive Issue Type: Bug Components: CLI, Clients, HBase Handler Affects Versions: 0.7.1, 0.8.0, 0.9.0 Reporter: Roman Shaposhnik Assignee: Navis Priority: Critical Attachments: HIVE-2379.D7347.1.patch For now any Hive/HBase queries would require the following jars to be explicitly added via hive's add jar command: add jar /usr/lib/hive/lib/hbase-0.90.1-cdh3u0.jar; add jar /usr/lib/hive/lib/hive-hbase-handler-0.7.0-cdh3u0.jar; add jar /usr/lib/hive/lib/zookeeper-3.3.1.jar; add jar /usr/lib/hive/lib/guava-r06.jar; the longer term solution, perhaps, should be to have the code at submit time call hbase's TableMapREduceUtil.addDependencyJar(job, HBaseStorageHandler.class) to ship it in distributedcache.
[jira] [Updated] (HIVE-3405) UDF to obtain a string with the first letter of each word in uppercase
[ https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Padma Ravindran updated HIVE-3405: -- Labels: patch (was: ) Affects Version/s: 0.11.0 0.9.1 0.8.1 0.9.0 0.10.0 Release Note: Initcap method tested. Please verify. Status: Patch Available (was: Open) Initcap method tested. Please verify. UDF to obtain a string with the first letter of each word in uppercase -- Key: HIVE-3405 URL: https://issues.apache.org/jira/browse/HIVE-3405 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.10.0, 0.9.0, 0.8.1, 0.9.1, 0.11.0 Reporter: Archana Nair Labels: patch Current Hive releases lack an INITCAP function, which returns a string with the first letter of each word in uppercase. INITCAP returns a string with the first letter of each word in uppercase and all other letters in the same case. Words are delimited by white space. This will be useful for report generation.
[jira] [Updated] (HIVE-3405) UDF to obtain a string with the first letter of each word in uppercase
[ https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Padma Ravindran updated HIVE-3405: -- Attachment: HIVE-3405.1.patch.txt Patch UDF to obtain a string with the first letter of each word in uppercase -- Key: HIVE-3405 URL: https://issues.apache.org/jira/browse/HIVE-3405 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.8.1, 0.9.0, 0.10.0, 0.9.1, 0.11.0 Reporter: Archana Nair Labels: patch Attachments: HIVE-3405.1.patch.txt Current Hive releases lack an INITCAP function, which returns a string with the first letter of each word in uppercase. INITCAP returns a string with the first letter of each word in uppercase and all other letters in the same case. Words are delimited by white space. This will be useful for report generation.
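The INITCAP behaviour described in HIVE-3405 can be sketched in plain Java as below. This is only an illustration of the stated semantics, not the code in HIVE-3405.1.patch.txt: the class name InitCap is hypothetical, "all other letters in same case" is read here as "left unchanged", and returning null for null input is an assumption borrowed from how SQL functions usually treat NULL.

```java
/** Illustrative INITCAP semantics; not the code from the attached patch. */
final class InitCap {
    private InitCap() {}

    /**
     * Uppercases the first character of each whitespace-delimited word and
     * leaves every other character unchanged. Null input yields null
     * (an assumption, see the note above the code).
     */
    static String initcap(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length());
        boolean atWordStart = true;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends the current word; copy it through as-is.
                atWordStart = true;
                out.append(c);
            } else {
                out.append(atWordStart ? Character.toUpperCase(c) : c);
                atWordStart = false;
            }
        }
        return out.toString();
    }
}
```

Calling InitCap.initcap("hello hive world") gives "Hello Hive World". An actual Hive UDF would wrap logic like this in a class extending Hive's UDF base class and operate on Hive's Text type; that wiring is left out here to keep the sketch self-contained.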
[jira] [Updated] (HIVE-3937) Hive Profiler
[ https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3937: - Status: Patch Available (was: Open) Hive Profiler - Key: HIVE-3937 URL: https://issues.apache.org/jira/browse/HIVE-3937 Project: Hive Issue Type: New Feature Reporter: Pamela Vagata Assignee: Pamela Vagata Priority: Minor Attachments: HIVE-3937.1.patch.txt, HIVE-3937.patch.2.txt, HIVE-3937.patch.3.txt, HIVE-3937.patch.4.txt, HIVE-3937.patch.5.txt Adding a Hive Profiler implementation which tracks inclusive wall times and call counts of the operators
[jira] [Updated] (HIVE-3937) Hive Profiler
[ https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3937: - Resolution: Fixed Fix Version/s: 0.11.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed. Thanks Pamela Hive Profiler - Key: HIVE-3937 URL: https://issues.apache.org/jira/browse/HIVE-3937 Project: Hive Issue Type: New Feature Reporter: Pamela Vagata Assignee: Pamela Vagata Priority: Minor Fix For: 0.11.0 Attachments: HIVE-3937.1.patch.txt, HIVE-3937.patch.2.txt, HIVE-3937.patch.3.txt, HIVE-3937.patch.4.txt, HIVE-3937.patch.5.txt Adding a Hive Profiler implementation which tracks inclusive wall times and call counts of the operators
[jira] [Commented] (HIVE-3937) Hive Profiler
[ https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570126#comment-13570126 ] Hudson commented on HIVE-3937: -- Integrated in hive-trunk-hadoop1 #67 (See [https://builds.apache.org/job/hive-trunk-hadoop1/67/]) HIVE-3937 Hive Profiler (Pamela Vagata via namit) (Revision 1442062) Result = ABORTED namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442062 Files : * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/conf/hive-default.xml.template * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilePublisher.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilePublisherInfo.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfiler.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerAggregateStat.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerConnectionInfo.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerStats.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerStatsAggregator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerUtils.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/HiveProfilerResultsHook.java * /hive/trunk/ql/src/test/queries/clientpositive/hiveprofiler0.q * /hive/trunk/ql/src/test/results/clientpositive/hiveprofiler0.q.out Hive Profiler - Key: HIVE-3937 URL: https://issues.apache.org/jira/browse/HIVE-3937 Project: Hive Issue Type: New Feature Reporter: Pamela Vagata Assignee: Pamela Vagata Priority: Minor Fix For: 0.11.0 Attachments: HIVE-3937.1.patch.txt, HIVE-3937.patch.2.txt, HIVE-3937.patch.3.txt, HIVE-3937.patch.4.txt, HIVE-3937.patch.5.txt Adding a Hive Profiler implementation which tracks inclusive wall times and call counts of the operators
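As a rough illustration of what "inclusive wall times and call counts of the operators" means, here is a minimal Java sketch. It is not the code committed for HIVE-3937: the class OperatorProfile and its methods are hypothetical names, and the sketch assumes single-threaded use. "Inclusive" here means that the time recorded for an operator includes the time spent in any operators it invokes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical per-operator profile: call counts and inclusive wall time. */
final class OperatorProfile {
    private static final class Stat {
        long calls;
        long inclusiveNanos;
    }

    private final Map<String, Stat> stats = new LinkedHashMap<>();

    /** Runs body and charges its full wall time, nested calls included, to operator. */
    void time(String operator, Runnable body) {
        long start = System.nanoTime();
        try {
            body.run();
        } finally {
            Stat stat = stats.computeIfAbsent(operator, name -> new Stat());
            stat.calls++;
            stat.inclusiveNanos += System.nanoTime() - start;
        }
    }

    /** Number of completed calls recorded for operator (0 if never seen). */
    long calls(String operator) {
        Stat stat = stats.get(operator);
        return stat == null ? 0 : stat.calls;
    }

    /** Total inclusive wall time in nanoseconds for operator (0 if never seen). */
    long inclusiveNanos(String operator) {
        Stat stat = stats.get(operator);
        return stat == null ? 0 : stat.inclusiveNanos;
    }
}
```

If a "join" operator calls a "select" operator inside its own timed section, the time recorded for "join" is at least the time recorded for "select", because the outer interval encloses the inner one.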
[jira] [Commented] (HIVE-3571) add a way to run a small unit quickly
[ https://issues.apache.org/jira/browse/HIVE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570127#comment-13570127 ] Hudson commented on HIVE-3571: -- Integrated in hive-trunk-hadoop1 #67 (See [https://builds.apache.org/job/hive-trunk-hadoop1/67/]) HIVE-3571 : add a way to run a small unit quickly (Navis via Ashutosh Chauhan) (Revision 1442043) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442043 Files : * /hive/trunk/build.properties * /hive/trunk/build.xml add a way to run a small unit quickly - Key: HIVE-3571 URL: https://issues.apache.org/jira/browse/HIVE-3571 Project: Hive Issue Type: Test Components: Testing Infrastructure Reporter: Namit Jain Assignee: Navis Fix For: 0.11.0 Attachments: HIVE-3571.1.patch.txt, HIVE-3571.D7695.1.patch, HIVE-3571.D7695.2.patch, HIVE-3571.D7695.3.patch A simple unit test: ant test -Dtestcase=TestCliDriver -Dqfile=groupby2.q takes a long time. There should be a quick way to achieve that for debugging.
[jira] [Commented] (HIVE-3956) TestMetaStoreAuthorization always uses the same port
[ https://issues.apache.org/jira/browse/HIVE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570128#comment-13570128 ] Hudson commented on HIVE-3956: -- Integrated in hive-trunk-hadoop1 #67 (See [https://builds.apache.org/job/hive-trunk-hadoop1/67/]) HIVE-3956 : TestMetaStoreAuthorization always uses the same port (Navis via Ashutosh Chauhan) (Revision 1442038) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442038 Files : * /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreAuthorization.java TestMetaStoreAuthorization always uses the same port Key: HIVE-3956 URL: https://issues.apache.org/jira/browse/HIVE-3956 Project: Hive Issue Type: Test Components: Tests Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.11.0 Attachments: HIVE-3956.D8253.1.patch Similar issue with HIVE-2959 and HIVE-3052. Using fixed port(1) for test.
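A common way to keep a test from always binding the same port is to ask the operating system for a free one. The sketch below shows that general technique in plain Java; it is not taken from HIVE-3956.D8253.1.patch, and the class name FreePort is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical helper that asks the OS for a currently unused TCP port. */
final class FreePort {
    private FreePort() {}

    static int findFreePort() {
        // Binding to port 0 lets the OS pick any free ephemeral port.
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The socket is closed before the port number is returned, so another process could in principle claim the port before the test server binds it. For tests that is usually an acceptable trade-off compared with a hard-coded port that collides whenever two test runs share a machine.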
[jira] [Commented] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly
[ https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570130#comment-13570130 ] Phabricator commented on HIVE-2839: --- njain has commented on the revision HIVE-2839 [jira] Filters on outer join with mapjoin hint is not applied correctly. INLINE COMMENTS ql/src/test/queries/clientpositive/mapjoin1.q:15 Can you add tests with hive.outerjoin.supports.filter also ? REVISION DETAIL https://reviews.facebook.net/D2079 To: JIRA, navis Cc: njain Filters on outer join with mapjoin hint is not applied correctly Key: HIVE-2839 URL: https://issues.apache.org/jira/browse/HIVE-2839 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, HIVE-2839.D2079.4.patch, HIVE-2839.D2079.5.patch Testing HIVE-2820, I've found some queries with mapjoin hint makes exceptions. 
{code} SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND true limit 10; FAILED: Hive Internal Error: java.lang.ClassCastException(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc) java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:363) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.generateMapJoinOperator(MapJoinProcessor.java:483) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.transform(MapJoinProcessor.java:689) at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7519) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:891) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) {code} and {code} SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND b.key * 10 '1000' limit 10; 
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325) at org.apache.hadoop.mapred.Child$4.run(Child.java:270) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:416) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127) at org.apache.hadoop.mapred.Child.main(Child.java:264) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:495) at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143) ... 8 more {code} -- This message is automatically generated by JIRA. If you think it was sent
[jira] [Commented] (HIVE-3559) UDF RIGHT(string,position) to HIVE
[ https://issues.apache.org/jira/browse/HIVE-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570282#comment-13570282 ]
Arun A K commented on HIVE-3559:
[~meenu] Please review the submitted patch. I have found a number of bugs in the current version of the patch, e.g.:
String null, length 3
String teststring, length 200
String test, length -1
...
Please take the time to fix these bugs and attach a new patch file.
UDF RIGHT(string,position) to HIVE
---
Key: HIVE-3559
URL: https://issues.apache.org/jira/browse/HIVE-3559
Project: Hive
Issue Type: New Feature
Components: UDF
Affects Versions: 0.9.0
Reporter: Vinaya Varghese
Assignee: Meenu K Chandran
Priority: Minor
Attachments: HIVE-3559.1.patch.txt, udf_right.q, udf_right.q.out
Introduction
A UDF (User Defined Function) to obtain the rightmost 'n' characters of a string in Hive.
Relevance
Current releases of Hive lack a function that returns the rightmost 'length' characters of a string. The function RIGHT(string,length) would return the rightmost 'length' characters of the string 'string', or NULL if any argument is NULL, which would be useful in HiveQL wherever strings are processed.
Functionality :-
Function Name: RIGHT(string,length)
Returns the rightmost 'length' characters of the string, or NULL if any argument is NULL.
Example: hive> SELECT RIGHT('https://www.irctc.com',3); -> 'com'
Usage :-
Case 1: To query a table to find details based on an https request
Table :- Transaction
Request_id|date|period_id|url_name
0001|01/07/2012|110001|https://www.irctc.com
0002|02/07/2012|110001|https://nextstep.tcs.com
0003|03/07/2012|110001|https://www.hdfcbank.com
0005|01/07/2012|110001|http://www.lmnm.org
0006|08/07/2012|110001|http://nextstart.gov
0007|10/07/2012|110001|https://netbanking.icicibank.com
0012|21/07/2012|110001|http://www.people.nic
0026|08/07/2012|110001|http://nextprobs.gov
00023|25/07/2012|110001|https://netbanking.canarabank.com
Query : select * from transaction where RIGHT(url_name,3)='com';
Result :-
0001|01/07/2012|110001|https://www.irctc.com
0002|02/07/2012|110001|https://nextstep.tcs.com
0003|03/07/2012|110001|https://www.hdfcbank.com
0007|10/07/2012|110001|https://netbanking.icicibank.com
00023|25/07/2012|110001|https://netbanking.canarabank.com
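The RIGHT semantics described in this issue, together with the edge cases raised in the review comment (a NULL string, a length longer than the string, a negative length), can be sketched in plain Java. This is not the code from HIVE-3559.1.patch.txt: the class name RightSketch and the method right are hypothetical, and the handling of out-of-range lengths (the whole string when the length exceeds it, an empty string when the length is zero or negative) is an assumption made for illustration, since the issue only specifies the NULL behaviour.

```java
/** Illustrative sketch only; not the HIVE-3559 patch. */
final class RightSketch {

    /**
     * Returns the rightmost {@code length} characters of {@code str},
     * or null if either argument is null.
     * Assumptions (not stated in the issue): a length <= 0 yields "",
     * and a length larger than the string yields the whole string.
     * Characters are counted as Java chars (UTF-16 code units).
     */
    static String right(String str, Integer length) {
        if (str == null || length == null) {
            return null;
        }
        if (length <= 0) {
            return "";
        }
        if (length >= str.length()) {
            return str;
        }
        return str.substring(str.length() - length);
    }

    public static void main(String[] args) {
        System.out.println(right("https://www.irctc.com", 3)); // com
        System.out.println(right(null, 3));                    // null
        System.out.println(right("teststring", 200));          // teststring
        System.out.println(right("test", -1).isEmpty());       // true
    }
}
```

The three inputs listed in the review comment map onto the three guard clauses, so each one has a defined result instead of an exception from substring.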
[jira] [Updated] (HIVE-896) Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.
[ https://issues.apache.org/jira/browse/HIVE-896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-896: --- Attachment: hive-896.3.patch.txt Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive. --- Key: HIVE-896 URL: https://issues.apache.org/jira/browse/HIVE-896 Project: Hive Issue Type: New Feature Components: OLAP, UDF Reporter: Amr Awadallah Priority: Minor Attachments: DataStructs.pdf, HIVE-896.1.patch.txt, Hive-896.2.patch.txt, hive-896.3.patch.txt Windowing functions are very useful for click stream processing and similar time-series/sliding-window analytics. More details at: http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1006709 http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1007059 http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1007032 -- amr -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-896) Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.
[ https://issues.apache.org/jira/browse/HIVE-896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570306#comment-13570306 ] Harish Butani commented on HIVE-896: Attached patch to be used as starting point for hive branch. Has minor changes since the last patch. Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive. --- Key: HIVE-896 URL: https://issues.apache.org/jira/browse/HIVE-896 Project: Hive Issue Type: New Feature Components: OLAP, UDF Reporter: Amr Awadallah Priority: Minor Attachments: DataStructs.pdf, HIVE-896.1.patch.txt, Hive-896.2.patch.txt, hive-896.3.patch.txt Windowing functions are very useful for click stream processing and similar time-series/sliding-window analytics. More details at: http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1006709 http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1007059 http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1007032 -- amr -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Requests
Hello, I opened two reviews for small issues, HIVE-3553[1] and HIVE-3725[2]. If you get a chance to review them and provide feedback, I would really appreciate it. Thanks, [1] https://reviews.apache.org/r/9275/ [2] https://reviews.apache.org/r/9276/ -- Swarnim
Re: branch for ptf and windowing fuction
Hi Arvind, Yes, the idea is to do periodic merges to keep the branch in sync with trunk; otherwise merging it into trunk later on will get unnecessarily complicated. Thanks, Ashutosh
On Mon, Feb 4, 2013 at 12:56 AM, Arvind Prabhakar arv...@apache.org wrote: Hi Ashutosh, My +1 for the proposal to create a separate branch for feature development. I do have one question in this regard: how do you plan on keeping this branch in sync with the trunk? If the branch is allowed to diverge indefinitely, it is likely that builds from it will lag in features and fixes that are otherwise available on the trunk. It would be great if you could first synchronize the branch with the trunk and then follow a policy of periodic merges from the trunk into the development branch. Regards, Arvind Prabhakar
On Fri, Feb 1, 2013 at 10:11 AM, Ashutosh Chauhan hashut...@apache.org wrote: Hi all, Harish and Prajkta are doing some cool work over at https://issues.apache.org/jira/browse/HIVE-896 IMO it's a very useful feature for the community and our user base. Harish and Prajkta have been making steady progress on this for much of the last year in their github repo https://github.com/hbutani/hive and much of the feature is now functional. However, it's quite a bit of work and new code, which will take some time before being ready for trunk. I propose that we create a new branch so that further development of this happens in the apache repo instead of the github repo. This gets us a few benefits:
a) It will avoid the situation we ended up in with HiveServer2, where useful new functionality arrived in one big patch, which made its review and thus its inclusion in mainline harder than it should have been.
b) The obvious advantages of development being done in apache as opposed to github, which are:
i) It will make it easier for apache hive community members interested in this work (like me) to follow progress.
ii) It will make it easier for apache hive community members interested in this work to contribute.
iii) It will make it easier for apache community members to review the work and provide feedback.
I further propose that we follow a commit-then-review policy for this feature branch, which will enable contributors to make rapid progress without waiting for lengthy review cycles. Hive committers interested in the work can either review the branch any time they want to provide feedback, or wait until contributors declare the work complete and propose merging it into trunk, and then review it. This is in any case a throwaway branch not intended for making releases. Unless I hear any objections, I will create a branch over the weekend. Thanks, Ashutosh
Re: branch for ptf and windowing fuction
Hi all, Cool. Seems like everyone is on board. I have created a new branch [1] based of current trunk and have committed latest patch attached on HIVE-896 to it. Check it out. Feel free to open jiras for this work and put up patches. I have added a new component called ptf-windowing on jira which you could use for issues related to this work. https://svn.apache.org/repos/asf/hive/branches/ptf-windowing/ Thanks, Ashutosh
[jira] [Created] (HIVE-3981) Split up tests in ptf_general_queries.q
Ashutosh Chauhan created HIVE-3981: -- Summary: Split up tests in ptf_general_queries.q Key: HIVE-3981 URL: https://issues.apache.org/jira/browse/HIVE-3981 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan ptf_general_queries.q currently has 62 queries and takes nearly 20 minutes to finish on my laptop. I think we should break it down into smaller .q files; otherwise adding a new query and debugging it will be a pain. I have split out the rcfile and seqfile tests from it to begin with. Also, this test currently fails because the original patch didn't have the .rc and .seq files (they were binary).
[jira] [Commented] (HIVE-3981) Split up tests in ptf_general_queries.q
[ https://issues.apache.org/jira/browse/HIVE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570400#comment-13570400 ] Ashutosh Chauhan commented on HIVE-3981: [~rhbutani] Can you attach the .rcfile and .seqfile here on jira? I can then generate the full patch. Split up tests in ptf_general_queries.q --- Key: HIVE-3981 URL: https://issues.apache.org/jira/browse/HIVE-3981 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3981.patch ptf_general_queries.q currently has 62 queries and takes nearly 20 minutes to finish on my laptop. I think we should break it down into smaller .q files; otherwise adding a new query and debugging it will be a pain. I have split out the rcfile and seqfile tests from it to begin with. Also, this test currently fails because the original patch didn't have the .rc and .seq files (they were binary).
[jira] [Commented] (HIVE-3937) Hive Profiler
[ https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570402#comment-13570402 ] Hudson commented on HIVE-3937: -- Integrated in Hive-trunk-hadoop2 #106 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/106/]) HIVE-3937 Hive Profiler (Pamela Vagata via namit) (Revision 1442062) Result = FAILURE namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1442062 Files : * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/conf/hive-default.xml.template * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilePublisher.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilePublisherInfo.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfiler.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerAggregateStat.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerConnectionInfo.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerStats.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerStatsAggregator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerUtils.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/HiveProfilerResultsHook.java * /hive/trunk/ql/src/test/queries/clientpositive/hiveprofiler0.q * /hive/trunk/ql/src/test/results/clientpositive/hiveprofiler0.q.out Hive Profiler - Key: HIVE-3937 URL: https://issues.apache.org/jira/browse/HIVE-3937 Project: Hive Issue Type: New Feature Reporter: Pamela Vagata Assignee: Pamela Vagata Priority: Minor Fix For: 0.11.0 Attachments: HIVE-3937.1.patch.txt, HIVE-3937.patch.2.txt, HIVE-3937.patch.3.txt, HIVE-3937.patch.4.txt, HIVE-3937.patch.5.txt Adding a Hive Profiler implementation which tracks inclusive wall times and call counts of the 
operators
[jira] [Commented] (HIVE-3571) add a way to run a small unit quickly
[ https://issues.apache.org/jira/browse/HIVE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570403#comment-13570403 ] Hudson commented on HIVE-3571: -- Integrated in Hive-trunk-hadoop2 #106 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/106/]) HIVE-3571 : add a way to run a small unit quickly (Navis via Ashutosh Chauhan) (Revision 1442043) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1442043 Files : * /hive/trunk/build.properties * /hive/trunk/build.xml add a way to run a small unit quickly - Key: HIVE-3571 URL: https://issues.apache.org/jira/browse/HIVE-3571 Project: Hive Issue Type: Test Components: Testing Infrastructure Reporter: Namit Jain Assignee: Navis Fix For: 0.11.0 Attachments: HIVE-3571.1.patch.txt, HIVE-3571.D7695.1.patch, HIVE-3571.D7695.2.patch, HIVE-3571.D7695.3.patch A simple unit test: ant test -Dtestcase=TestCliDriver -Dqfile=groupby2.q takes a long time. There should be a quick way to achieve that for debugging. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3956) TestMetaStoreAuthorization always uses the same port
[ https://issues.apache.org/jira/browse/HIVE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570404#comment-13570404 ] Hudson commented on HIVE-3956: -- Integrated in Hive-trunk-hadoop2 #106 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/106/]) HIVE-3956 : TestMetaStoreAuthorization always uses the same port (Navis via Ashutosh Chauhan) (Revision 1442038) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1442038 Files : * /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreAuthorization.java TestMetaStoreAuthorization always uses the same port Key: HIVE-3956 URL: https://issues.apache.org/jira/browse/HIVE-3956 Project: Hive Issue Type: Test Components: Tests Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.11.0 Attachments: HIVE-3956.D8253.1.patch Similar issue with HIVE-2959 and HIVE-3052. Using fixed port(1) for test. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
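A common way to avoid the fixed-port collisions described in this issue is to ask the operating system for a free ephemeral port and start the test service on that port. The sketch below uses only java.net.ServerSocket; the class and method names are hypothetical, and it is not taken from HIVE-3956.D8253.1.patch, whose contents are not shown in this message.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Illustrative sketch only; not the HIVE-3956 patch. */
final class FreePortSketch {

    /**
     * Binds a server socket to port 0 so that the OS assigns an unused
     * ephemeral port, then closes the socket and returns the port number.
     * Another process could claim the port before the caller binds it,
     * so a test using this should be prepared to retry.
     */
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("free port: " + findFreePort());
    }
}
```

A test would call findFreePort() once during setup and pass the result to both the server it starts and the client that connects to it, instead of hard-coding a port number.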
[jira] [Updated] (HIVE-3981) Split up tests in ptf_general_queries.q
[ https://issues.apache.org/jira/browse/HIVE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-3981: Attachment: part.seq part.rc Split up tests in ptf_general_queries.q --- Key: HIVE-3981 URL: https://issues.apache.org/jira/browse/HIVE-3981 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3981.patch, part.rc, part.seq ptf_general_queries.q currently has 62 queries and takes nearly 20 minutes to finish on my laptop. I think we should break it down into smaller .q files; otherwise adding a new query and debugging it will be a pain. I have split out the rcfile and seqfile tests from it to begin with. Also, this test currently fails because the original patch didn't have the .rc and .seq files (they were binary).
[jira] [Updated] (HIVE-3978) HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH
[ https://issues.apache.org/jira/browse/HIVE-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arup Malakar updated HIVE-3978: --- Attachment: HIVE-3978_branch_0.10_0.patch HIVE-3978_trunk_0.patch HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH - Key: HIVE-3978 URL: https://issues.apache.org/jira/browse/HIVE-3978 Project: Hive Issue Type: Bug Environment: hive-0.10 hcatalog-0.5 hadoop 0.23 hbase 0.94 Reporter: Arup Malakar Assignee: Arup Malakar Attachments: HIVE-3978_branch_0.10_0.patch, HIVE-3978_trunk_0.patch The following code gets executed only in case of cygwin. HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/,/:/g'` But since HIVE_AUX_JARS_PATH gets added to HADOOP_CLASSPATH, the comma should get replaced by : for all cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
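The effect of the proposed change can be illustrated outside the launcher script. The sketch below mirrors the quoted sed 's/,/:/g' substitution in Java and appends the result to an existing classpath string. The class name, method names and sample paths are hypothetical; the actual fix lives in the shell script touched by the attached patches, which are not reproduced in this message.

```java
/** Illustrative sketch only; the real change is in Hive's shell launcher. */
final class AuxJarsPathSketch {

    /** Converts a comma-separated jar list into ':'-separated classpath form. */
    static String toClasspathForm(String auxJars) {
        return auxJars == null ? "" : auxJars.replace(',', ':');
    }

    /** Appends the converted aux jars to an existing classpath, if any. */
    static String appendToClasspath(String classpath, String auxJars) {
        String converted = toClasspathForm(auxJars);
        if (converted.isEmpty()) {
            return classpath;
        }
        if (classpath == null || classpath.isEmpty()) {
            return converted;
        }
        return classpath + ":" + converted;
    }

    public static void main(String[] args) {
        // Hypothetical paths, used only to show the separator change.
        System.out.println(appendToClasspath(
                "/opt/hadoop/conf", "/opt/aux/a.jar,/opt/aux/b.jar"));
        // prints /opt/hadoop/conf:/opt/aux/a.jar:/opt/aux/b.jar
    }
}
```

Without the conversion, the comma-joined value would be treated as a single classpath entry, which is why the issue asks for the substitution to run in all cases rather than only under cygwin.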
[jira] [Updated] (HIVE-3978) HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH
[ https://issues.apache.org/jira/browse/HIVE-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arup Malakar updated HIVE-3978: --- Fix Version/s: 0.10.0 0.11.0 Release Note: Use ':' in HIVE_AUX_JARS_PATH instead of ',' Status: Patch Available (was: Open) Review: https://reviews.facebook.net/D8373 HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH - Key: HIVE-3978 URL: https://issues.apache.org/jira/browse/HIVE-3978 Project: Hive Issue Type: Bug Environment: hive-0.10 hcatalog-0.5 hadoop 0.23 hbase 0.94 Reporter: Arup Malakar Assignee: Arup Malakar Fix For: 0.11.0, 0.10.0 Attachments: HIVE-3978_branch_0.10_0.patch, HIVE-3978_trunk_0.patch The following code gets executed only in case of cygwin. HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/,/:/g'` But since HIVE_AUX_JARS_PATH gets added to HADOOP_CLASSPATH, the comma should get replaced by : for all cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3981) Split up tests in ptf_general_queries.q
[ https://issues.apache.org/jira/browse/HIVE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570485#comment-13570485 ]
Harish Butani commented on HIVE-3981:
I attached the files. Several things:
- The .rc and .seq files were in the original patch, so why do you need them here?
- We have struggled with this issue. The reason to keep all the tests in one file is to make sure all the PTF tests run before every commit. Can we have a JUnit suite for all these tests? I would request that developers continue to make sure all these tests run before a commit.
- A hack we use to speed up running all the tests is to explicitly set 'runningViaChild = false' at line 133 of MapRedTask.java. The 70 tests run in under 5 minutes with this setting.
Split up tests in ptf_general_queries.q --- Key: HIVE-3981 URL: https://issues.apache.org/jira/browse/HIVE-3981 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3981.patch, part.rc, part.seq ptf_general_queries.q currently has 62 queries and takes nearly 20 minutes to finish on my laptop. I think we should break it down into smaller .q files; otherwise adding a new query and debugging it will be a pain. I have split out the rcfile and seqfile tests from it to begin with. Also, this test currently fails because the original patch didn't have the .rc and .seq files (they were binary).
[jira] [Commented] (HIVE-3252) Add environment context to metastore Thrift calls
[ https://issues.apache.org/jira/browse/HIVE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570516#comment-13570516 ] Kevin Wilfong commented on HIVE-3252: - +1 Add environment context to metastore Thrift calls - Key: HIVE-3252 URL: https://issues.apache.org/jira/browse/HIVE-3252 Project: Hive Issue Type: Improvement Components: Metastore Reporter: John Reese Assignee: John Reese Priority: Minor Attachments: HIVE-3252.1.patch.txt, HIVE-3252.2.patch.txt Currently, in the Hive Thrift metastore API, create_table, add_partition, alter_table and alter_partition have with_environment_context analogs. It would be really useful to add similar methods for drop_partition, drop_table, and append_partition.
[jira] [Created] (HIVE-3982) Merge PTFDesc and PTFDef classes
Ashutosh Chauhan created HIVE-3982: -- Summary: Merge PTFDesc and PTFDef classes Key: HIVE-3982 URL: https://issues.apache.org/jira/browse/HIVE-3982 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan As discussed on https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271
[jira] [Updated] (HIVE-3982) Merge PTFDesc and PTFDef classes
[ https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3982: --- Attachment: hive-3982.patch This is a first stab at refactoring. There is more work to do to get rid of the ANTLR data structures. This patch just gets rid of the PTFDef class and uses PTFDesc everywhere. Merge PTFDesc and PTFDef classes Key: HIVE-3982 URL: https://issues.apache.org/jira/browse/HIVE-3982 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3982.patch As discussed on https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271
[jira] [Updated] (HIVE-3983) Select on table with hbase storage handler fails with an SASL error
[ https://issues.apache.org/jira/browse/HIVE-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arup Malakar updated HIVE-3983: --- Summary: Select on table with hbase storage handler fails with an SASL error (was: Select on table with hbase storage handler fails with an SASL) Select on table with hbase storage handler fails with an SASL error --- Key: HIVE-3983 URL: https://issues.apache.org/jira/browse/HIVE-3983 Project: Hive Issue Type: Bug Environment: hive-0.10 hbase-0.94.5.5 hadoop-0.23.3.1 hcatalog-0.5 Reporter: Arup Malakar The table is created using the following query: {code} CREATE TABLE hbase_table_1(key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "xyz"); {code} Doing a select on the table launches a map-reduce job. But the job fails with the following error: {code} 2013-02-02 01:31:07,500 FATAL [IPC Server handler 3 on 40118] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1348093718159_1501_m_00_0 - exited : java.io.IOException: java.lang.RuntimeException: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'. 
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:243) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:522) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:160) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:381) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152) Caused by: java.lang.RuntimeException: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'. 
at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$1.run(SecureClient.java:242) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.util.Methods.call(Methods.java:37) at org.apache.hadoop.hbase.security.User.call(User.java:590) at org.apache.hadoop.hbase.security.User.access$700(User.java:51) at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:444) at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.handleSaslConnectionFailure(SecureClient.java:203) at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.setupIOstreams(SecureClient.java:291) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:104) at $Proxy12.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.SecureRpcEngine.getProxy(SecureRpcEngine.java:146) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1291) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1278) at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:987) at
[jira] [Created] (HIVE-3983) Select on table with hbase storage handler fails with an SASL
Arup Malakar created HIVE-3983: -- Summary: Select on table with hbase storage handler fails with an SASL Key: HIVE-3983 URL: https://issues.apache.org/jira/browse/HIVE-3983 Project: Hive Issue Type: Bug Environment: hive-0.10 hbase-0.94.5.5 hadoop-0.23.3.1 hcatalog-0.5 Reporter: Arup Malakar The table is created using the following query: {code} CREATE TABLE hbase_table_1(key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "xyz"); {code} A select on the table launches a map-reduce job, but the job fails with the following error: {code} 2013-02-02 01:31:07,500 FATAL [IPC Server handler 3 on 40118] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1348093718159_1501_m_00_0 - exited : java.io.IOException: java.lang.RuntimeException: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'. at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:243) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:522) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:160) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:381) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152) Caused by:
java.lang.RuntimeException: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'. at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$1.run(SecureClient.java:242) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.util.Methods.call(Methods.java:37) at org.apache.hadoop.hbase.security.User.call(User.java:590) at org.apache.hadoop.hbase.security.User.access$700(User.java:51) at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:444) at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.handleSaslConnectionFailure(SecureClient.java:203) at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.setupIOstreams(SecureClient.java:291) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:104) at $Proxy12.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.SecureRpcEngine.getProxy(SecureRpcEngine.java:146) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1291) at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1278) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:987) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:882) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:984) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:886) at
[jira] [Assigned] (HIVE-3952) merge map-job followed by map-reduce job
[ https://issues.apache.org/jira/browse/HIVE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reassigned HIVE-3952: - Assignee: Vinod Kumar Vavilapalli I'd like to take a stab at it.. merge map-job followed by map-reduce job Key: HIVE-3952 URL: https://issues.apache.org/jira/browse/HIVE-3952 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Vinod Kumar Vavilapalli Consider the query like: select count(*) FROM ( select idOne, idTwo, value FROM bigTable JOIN smallTableOne on (bigTable.idOne = smallTableOne.idOne) ) firstjoin JOIN smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo); where smallTableOne and smallTableTwo are smaller than hive.auto.convert.join.noconditionaltask.size and hive.auto.convert.join.noconditionaltask is set to true. The joins are collapsed into mapjoins, and it leads to a map-only job (for the map-joins) followed by a map-reduce job (for the group by). Ideally, the map-only job should be merged with the following map-reduce job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby
[ https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570645#comment-13570645 ] Gunther Hagleitner commented on HIVE-2340: -- [~navis]: I think in general the logic should be to copy numReducers from parent to child, not the other way around. If Hive makes a decent estimate of reducers for the parent, that's probably the number you want to carry into the combined reduce stage, because that means each reducer is doing the desired amount of work. Buckets and order by are the only special cases I can think of where the number needs to be fixed. For those special cases, without knowing the cardinalities of join/group by/tables, it's indeed difficult to guess whether the optimization should be on or off. However, what do you think of using a max ratio of parent reducers to child reducers instead of a fixed minimum number of reducers for the child? With a default of 4, maybe. That is: if the parent has fewer than 4 times as many reducers as the child, collapse (assuming another job will be more expensive than the lower number of reducers); otherwise leave it alone. The optimization is only good if the input sizes of the child and parent reducers are similar, and expressing this as a ratio of the number of reducers is probably the closest we can get right now. This would enable the optimization for a larger body of queries (small tables, single input split, empty group by expr, etc).
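The ratio rule proposed in the comment above could be sketched roughly as follows. This is only an illustration and not Hive code: the class `ReducerRatioSketch`, the method `shouldCollapse`, and the constant `DEFAULT_MAX_RATIO` are hypothetical names, and the sketch assumes both reducer counts are already known, positive numbers.

```java
// Hypothetical sketch of the proposed max-ratio rule; not taken from any Hive patch.
final class ReducerRatioSketch {

    // Default suggested in the comment ("With a default of 4, maybe").
    static final int DEFAULT_MAX_RATIO = 4;

    /**
     * Returns true when the parent has fewer than maxRatio times as many
     * reducers as the child, i.e. when the two stages would be collapsed.
     */
    static boolean shouldCollapse(int parentReducers, int childReducers, int maxRatio) {
        if (parentReducers <= 0 || childReducers <= 0 || maxRatio <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        // Use long arithmetic so the multiplication cannot overflow.
        return (long) parentReducers < (long) maxRatio * childReducers;
    }

    public static void main(String[] args) {
        // 6 parent reducers vs 2 child reducers: 6 < 4 * 2, so collapse.
        System.out.println(shouldCollapse(6, 2, DEFAULT_MAX_RATIO));
        // 8 parent reducers vs 2 child reducers: 8 is not < 8, so leave alone.
        System.out.println(shouldCollapse(8, 2, DEFAULT_MAX_RATIO));
    }
}
```

The comparison is strict because the comment says "fewer than 4 times as many"; a parent with exactly 4 times the child's reducers is left alone in this sketch.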
optimize orderby followed by a groupby -- Key: HIVE-2340 URL: https://issues.apache.org/jira/browse/HIVE-2340 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Labels: perfomance Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt, HIVE-2340.D1209.10.patch, HIVE-2340.D1209.6.patch, HIVE-2340.D1209.7.patch, HIVE-2340.D1209.8.patch, HIVE-2340.D1209.9.patch, testclidriver.txt Before implementing optimizer for JOIN-GBY, try to implement RS-GBY optimizer(cluster-by following group-by).
Re: [VOTE] Amend Hive Bylaws + Add HCatalog Submodule
The following active Hive PMC members have cast votes:

Carl Steinbach: +1, +1
Ashutosh Chauhan: +1, +1
Edward Capriolo: +1, +1
Ashish Thusoo: +1, +1
Yongqiang He: +1, +1
Namit Jain: +1, +1

Three active PMC members have abstained from voting. Over the last week the following four Hive PMC members requested that their status be changed from active to emeritus member: jvs, prasadc, zhao, pauly.

Voting on these measures is now closed. Both measures have been approved with the required 2/3 majority of active Hive PMC members.

Thanks. Carl

On Thu, Jan 31, 2013 at 2:04 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote:
+1 and +1 non-binding. Great to see this happen! Thanks, +Vinod

On Thu, Jan 31, 2013 at 12:14 AM, Namit Jain nj...@fb.com wrote:
+1 and +1

On 1/30/13 6:53 AM, Gunther Hagleitner ghagleit...@hortonworks.com wrote:
+1 and +1 Thanks, Gunther.

On Tue, Jan 29, 2013 at 5:18 PM, Edward Capriolo edlinuxg...@gmail.com wrote:
Measure 1: +1 Measure 2: +1

On Mon, Jan 28, 2013 at 2:47 PM, Carl Steinbach c...@apache.org wrote:
I am calling a vote on the following two measures.

Measure 1: Amend Hive Bylaws to Define Submodules and Submodule Committers
If this measure passes the Apache Hive Project Bylaws will be amended with the following changes: https://cwiki.apache.org/confluence/display/Hive/Proposed+Changes+to+Hive+Bylaws+for+Submodule+Committers The motivation for these changes is discussed in the following email thread which appeared on the hive-dev and hcatalog-dev mailing lists: http://markmail.org/thread/u5nap7ghvyo7euqa

Measure 2: Create HCatalog Submodule and Adopt HCatalog Codebase
This measure provides for 1) the establishment of an HCatalog submodule in the Apache Hive Project, 2) the adoption of the Apache HCatalog codebase into the Hive HCatalog submodule, and 3) adding all currently active HCatalog committers as submodule committers on the Hive HCatalog submodule. Passage of this measure depends on the passage of Measure 1.
Voting: Both measures require +1 votes from 2/3 of active Hive PMC members in order to pass. All participants in the Hive project are encouraged to vote on these measures, but only votes from active Hive PMC members are binding. The voting period commences immediately and shall last a minimum of six days. Voting is carried out by replying to this email thread. You must indicate which measure you are voting on in order for your vote to be counted. More details about the voting process can be found in the Apache Hive Project Bylaws: https://cwiki.apache.org/confluence/display/Hive/Bylaws -- +Vinod Hortonworks Inc. http://hortonworks.com/
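As a quick check of the tally reported at the top of this thread: six active PMC members voted +1 on each measure and three abstained. Assuming those nine are all of the active PMC members (an assumption; the thread does not state the total explicitly), the 2/3 requirement is met exactly. The class and method names below are hypothetical.

```java
// Hypothetical arithmetic check of the 2/3 rule; names are illustrative only.
final class VoteTallySketch {

    /**
     * Returns true when yesVotes is at least 2/3 of activeMembers.
     * Integer cross-multiplication avoids floating-point rounding.
     */
    static boolean meetsTwoThirds(int yesVotes, int activeMembers) {
        if (yesVotes < 0 || activeMembers <= 0 || yesVotes > activeMembers) {
            throw new IllegalArgumentException("invalid vote counts");
        }
        return 3L * yesVotes >= 2L * activeMembers;
    }

    public static void main(String[] args) {
        int yes = 6;        // active PMC members who voted +1 on each measure
        int abstained = 3;  // active PMC members who abstained
        // Assumes yes + abstained covers every active PMC member.
        System.out.println(meetsTwoThirds(yes, yes + abstained));
    }
}
```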
Jenkins build is back to normal : Hive-0.9.1-SNAPSHOT-h0.21 #282
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/282/
[jira] [Updated] (HIVE-3981) Split up tests in ptf_general_queries.q
[ https://issues.apache.org/jira/browse/HIVE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3981: --- Attachment: hive-3981_2.patch Patch with ptf_seqfile.q.out and ptf_rcfile.q.out

Harish,
* Patch files are txt files; they don't usually contain binary data, so the part.rc and part.seq data are not contained in your patch file.
* -Dqfile is capable of handling multiple q files in one run, so contributors can specify -Dqfile=ptf_general_queries.q,ptf_rcfile.q,ptf_seqfile.q and it shouldn't be too hard for them to ensure they have run all tests.
* Running tests in-process is a good idea, but it has implications for the whole Hive project, so it probably needs to be discussed in a separate JIRA.

Split up tests in ptf_general_queries.q --- Key: HIVE-3981 URL: https://issues.apache.org/jira/browse/HIVE-3981 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3981_2.patch, hive-3981.patch, part.rc, part.seq ptf_general_queries.q currently has 62 queries and takes nearly 20 minutes to finish on my laptop. I think we should break it down into smaller .q files; otherwise adding a new query and debugging it will be a pain. I have split out the rcfile and seqfile tests from it to begin. Also, this test currently fails because the original patch didn't have the .rc and .seq files (they were binary).
[jira] [Commented] (HIVE-3937) Hive Profiler
[ https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570778#comment-13570778 ] Hudson commented on HIVE-3937: -- Integrated in Hive-trunk-h0.21 #1955 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1955/]) HIVE-3937 Hive Profiler (Pamela Vagata via namit) (Revision 1442062) Result = FAILURE namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442062

Files :
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilePublisher.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilePublisherInfo.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfiler.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerAggregateStat.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerConnectionInfo.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerStats.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerStatsAggregator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerUtils.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/HiveProfilerResultsHook.java
* /hive/trunk/ql/src/test/queries/clientpositive/hiveprofiler0.q
* /hive/trunk/ql/src/test/results/clientpositive/hiveprofiler0.q.out

Hive Profiler - Key: HIVE-3937 URL: https://issues.apache.org/jira/browse/HIVE-3937 Project: Hive Issue Type: New Feature Reporter: Pamela Vagata Assignee: Pamela Vagata Priority: Minor Fix For: 0.11.0 Attachments: HIVE-3937.1.patch.txt, HIVE-3937.patch.2.txt, HIVE-3937.patch.3.txt, HIVE-3937.patch.4.txt, HIVE-3937.patch.5.txt Adding a Hive Profiler implementation which tracks inclusive wall times and call counts of the operators
[jira] [Commented] (HIVE-3571) add a way to run a small unit quickly
[ https://issues.apache.org/jira/browse/HIVE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570779#comment-13570779 ] Hudson commented on HIVE-3571: -- Integrated in Hive-trunk-h0.21 #1955 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1955/]) HIVE-3571 : add a way to run a small unit quickly (Navis via Ashutosh Chauhan) (Revision 1442043) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442043

Files :
* /hive/trunk/build.properties
* /hive/trunk/build.xml

add a way to run a small unit quickly - Key: HIVE-3571 URL: https://issues.apache.org/jira/browse/HIVE-3571 Project: Hive Issue Type: Test Components: Testing Infrastructure Reporter: Namit Jain Assignee: Navis Fix For: 0.11.0 Attachments: HIVE-3571.1.patch.txt, HIVE-3571.D7695.1.patch, HIVE-3571.D7695.2.patch, HIVE-3571.D7695.3.patch A simple unit test: ant test -Dtestcase=TestCliDriver -Dqfile=groupby2.q takes a long time. There should be a quick way to achieve that for debugging.
[jira] [Commented] (HIVE-3956) TestMetaStoreAuthorization always uses the same port
[ https://issues.apache.org/jira/browse/HIVE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570780#comment-13570780 ] Hudson commented on HIVE-3956: -- Integrated in Hive-trunk-h0.21 #1955 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1955/]) HIVE-3956 : TestMetaStoreAuthorization always uses the same port (Navis via Ashutosh Chauhan) (Revision 1442038) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442038

Files :
* /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreAuthorization.java

TestMetaStoreAuthorization always uses the same port Key: HIVE-3956 URL: https://issues.apache.org/jira/browse/HIVE-3956 Project: Hive Issue Type: Test Components: Tests Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.11.0 Attachments: HIVE-3956.D8253.1.patch Similar issue with HIVE-2959 and HIVE-3052. Using fixed port(1) for test.
Hive-trunk-h0.21 - Build # 1955 - Failure
Changes for Build #1955 [namit] HIVE-3937 Hive Profiler (Pamela Vagata via namit) [hashutosh] HIVE-3571 : add a way to run a small unit quickly (Navis via Ashutosh Chauhan) [hashutosh] HIVE-3956 : TestMetaStoreAuthorization always uses the same port (Navis via Ashutosh Chauhan) 1 tests failed. REGRESSION: org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_aggregator_error_1 Error Message: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit. Stack Trace: junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit. at net.sf.antcontrib.logic.ForTask.doSequentialIteration(ForTask.java:259) at net.sf.antcontrib.logic.ForTask.doToken(ForTask.java:268) at net.sf.antcontrib.logic.ForTask.doTheTasks(ForTask.java:299) at net.sf.antcontrib.logic.ForTask.execute(ForTask.java:244) The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1955) Status: Failure Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1955/ to view the results.
[jira] [Updated] (HIVE-3982) Merge PTFDesc and PTFDef classes
[ https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3982: --- Attachment: hive-3982_2.patch The previous patch wasn't applying correctly. Updated patch. Merge PTFDesc and PTFDef classes Key: HIVE-3982 URL: https://issues.apache.org/jira/browse/HIVE-3982 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3982_2.patch, hive-3982.patch As discussed on https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271
[jira] [Commented] (HIVE-3981) Split up tests in ptf_general_queries.q
[ https://issues.apache.org/jira/browse/HIVE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570809#comment-13570809 ] Harish Butani commented on HIVE-3981: - Oh yes, I forgot. +1 Split up tests in ptf_general_queries.q --- Key: HIVE-3981 URL: https://issues.apache.org/jira/browse/HIVE-3981 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3981_2.patch, hive-3981.patch, part.rc, part.seq ptf_general_queries.q currently has 62 queries and takes nearly 20 minutes to finish on my laptop. I think we should break it down into smaller .q files; otherwise adding a new query and debugging it will be a pain. I have split out the rcfile and seqfile tests from it to begin. Also, this test currently fails because the original patch didn't have the .rc and .seq files (they were binary).
[jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby
[ https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570813#comment-13570813 ] Navis commented on HIVE-2340: - @Gunther Hagleitner: I also considered the ratio approach, but the number of reducers is calculated from the input size just before the job is submitted to Hadoop, so it cannot be known in the optimizer layer. Except for those special cases with order by and bucketing, the number of reducers for both RS is -1. So generally speaking, it's safe. optimize orderby followed by a groupby -- Key: HIVE-2340 URL: https://issues.apache.org/jira/browse/HIVE-2340 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Labels: perfomance Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt, HIVE-2340.D1209.10.patch, HIVE-2340.D1209.6.patch, HIVE-2340.D1209.7.patch, HIVE-2340.D1209.8.patch, HIVE-2340.D1209.9.patch, testclidriver.txt Before implementing optimizer for JOIN-GBY, try to implement RS-GBY optimizer(cluster-by following group-by).
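One possible reading of the point above, sketched under stated assumptions: at optimization time a ReduceSink's reducer count is -1 (not yet determined) unless order by or bucketing has fixed it, so a parent/child ratio cannot be computed in the general case, while two undetermined counts never conflict. The names `ReducerCountSketch`, `UNDETERMINED`, `ratioKnown`, and `compatible` are hypothetical, and this is not the logic of the attached patches.

```java
// Hypothetical sketch; it does not reproduce the HIVE-2340 patch.
final class ReducerCountSketch {

    // Value described in the comment for a reducer count that is not fixed at plan time.
    static final int UNDETERMINED = -1;

    /** A parent/child ratio only makes sense when both counts are fixed. */
    static boolean ratioKnown(int parentReducers, int childReducers) {
        return parentReducers > 0 && childReducers > 0;
    }

    /**
     * Two counts are treated as compatible when at least one of them is
     * still undetermined, or when both are fixed to the same value.
     */
    static boolean compatible(int parentReducers, int childReducers) {
        return parentReducers == UNDETERMINED
                || childReducers == UNDETERMINED
                || parentReducers == childReducers;
    }

    public static void main(String[] args) {
        // General case: both counts undetermined, no ratio available, no conflict.
        System.out.println(ratioKnown(UNDETERMINED, UNDETERMINED));
        System.out.println(compatible(UNDETERMINED, UNDETERMINED));
        // Special case: order by fixes one side to a single reducer.
        System.out.println(compatible(1, UNDETERMINED));
        // Both sides fixed to different values: treated as a conflict here.
        System.out.println(compatible(1, 4));
    }
}
```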
[jira] [Resolved] (HIVE-3981) Split up tests in ptf_general_queries.q
[ https://issues.apache.org/jira/browse/HIVE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-3981. Resolution: Fixed Committed to branch. Split up tests in ptf_general_queries.q --- Key: HIVE-3981 URL: https://issues.apache.org/jira/browse/HIVE-3981 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3981_2.patch, hive-3981.patch, part.rc, part.seq ptf_general_queries.q currently has 62 queries and takes nearly 20 minutes to finish on my laptop. I think we should break it down into smaller .q files; otherwise adding a new query and debugging it will be a pain. I have split out the rcfile and seqfile tests from it to begin. Also, this test currently fails because the original patch didn't have the .rc and .seq files (they were binary).
[jira] [Updated] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly
[ https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-2839: -- Attachment: HIVE-2839.D2079.6.patch navis updated the revision HIVE-2839 [jira] Filters on outer join with mapjoin hint is not applied correctly. Added tests Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D2079 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D2079?vs=27177&id=27231#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java ql/src/test/queries/clientpositive/mapjoin1.q ql/src/test/results/clientpositive/mapjoin1.q.out To: JIRA, navis Cc: njain Filters on outer join with mapjoin hint is not applied correctly Key: HIVE-2839 URL: https://issues.apache.org/jira/browse/HIVE-2839 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, HIVE-2839.D2079.4.patch, HIVE-2839.D2079.5.patch, HIVE-2839.D2079.6.patch While testing HIVE-2820, I found that some queries with a mapjoin hint throw exceptions.
{code} SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND true limit 10; FAILED: Hive Internal Error: java.lang.ClassCastException(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc) java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:363) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.generateMapJoinOperator(MapJoinProcessor.java:483) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.transform(MapJoinProcessor.java:689) at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7519) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:891) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) {code} and {code} SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND b.key * 10 '1000' limit 10; 
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325) at org.apache.hadoop.mapred.Child$4.run(Child.java:270) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:416) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127) at org.apache.hadoop.mapred.Child.main(Child.java:264) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325) at
[jira] [Commented] (HIVE-3982) Merge PTFDesc and PTFDef classes
[ https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570832#comment-13570832 ] Harish Butani commented on HIVE-3982: - Hi Ashutosh, Thanks for doing this. It looks like, by using the PTFDesc as the conf object on the PTFOperator, the regular Hive serialization of the work object just worked. I am fine with this patch, but we are planning to completely change the Spec and Def (Desc) classes to handle these things:
- make the Windowing PTF information explicit after Phase 1, so that it can be used by a Windowing Operator in the future.
- remove references from Desc classes to Spec classes and clean up the Desc classes.
It is a good time to simplify the translation from Spec to Desc and how the Desc is reinitialized at runtime. I hope you are ok with this.

Merge PTFDesc and PTFDef classes Key: HIVE-3982 URL: https://issues.apache.org/jira/browse/HIVE-3982 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3982_2.patch, hive-3982.patch As discussed on https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271
[jira] [Created] (HIVE-3984) Maintain a clear separation between Windowing PTF at the specification level.
Harish Butani created HIVE-3984: --- Summary: Maintain a clear separation between Windowing PTF at the specification level. Key: HIVE-3984 URL: https://issues.apache.org/jira/browse/HIVE-3984 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani This has multiple pieces:
- redefine the PTF Spec classes, as described in the Data Structs doc in Hive-896
- refactor PTFDesc classes based on this design
- refactor translation: both Phase 1 GenPlan phases
[jira] [Created] (HIVE-3985) Update new UDAFs introduced for Windowing to work with new Decimal Type
Harish Butani created HIVE-3985: --- Summary: Update new UDAFs introduced for Windowing to work with new Decimal Type Key: HIVE-3985 URL: https://issues.apache.org/jira/browse/HIVE-3985 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani
[jira] [Created] (HIVE-3986) Fix select expr processing in PTF Operator
Harish Butani created HIVE-3986: --- Summary: Fix select expr processing in PTF Operator Key: HIVE-3986 URL: https://issues.apache.org/jira/browse/HIVE-3986 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Select expressions that contain Lead/Lag functions are handled by the PTF Operator as a post-processing step after the output Partition is computed. The Select Expr Node Descs for these are incorrectly created using the Input RR. They should be created using the Output RR of the PTF, just like the having expression.
[jira] [Commented] (HIVE-3982) Merge PTFDesc and PTFDef classes
[ https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570870#comment-13570870 ] Ashutosh Chauhan commented on HIVE-3982: Yeah, I agree we need to do both these things. That's the goal. That's why my comment said first stab; more work to do. : ) This is the first step. If you are ok with this change, then I will go ahead and commit, since it simplifies the code a bit by getting rid of an unnecessary class. Merge PTFDesc and PTFDef classes Key: HIVE-3982 URL: https://issues.apache.org/jira/browse/HIVE-3982 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3982_2.patch, hive-3982.patch As discussed on https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271
[jira] [Commented] (HIVE-3982) Merge PTFDesc and PTFDef classes
[ https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570883#comment-13570883 ] Harish Butani commented on HIVE-3982: - +1 Merge PTFDesc and PTFDef classes Key: HIVE-3982 URL: https://issues.apache.org/jira/browse/HIVE-3982 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3982_2.patch, hive-3982.patch As discussed on https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271
[jira] [Created] (HIVE-3987) Update PTF invocation and windowing grammar
Harish Butani created HIVE-3987: --- Summary: Update PTF invocation and windowing grammar Key: HIVE-3987 URL: https://issues.apache.org/jira/browse/HIVE-3987 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Changes to the grammar to make it more standards-based: - support Partition Order style along with the Hive-specific Distribute/Cluster and Sort in the windowing specification. - PTF args should come after Input details, as in Aster. - tbd: do we need to support named parameters?
[jira] [Commented] (HIVE-3874) Create a new Optimized Row Columnar file format for Hive
[ https://issues.apache.org/jira/browse/HIVE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570888#comment-13570888 ] Kevin Wilfong commented on HIVE-3874: - @Owen: Here are a couple more issues I ran into; again, I can file JIRAs for these later once the code is checked in. Incorrect deserialization of doubles (leads to a lot of NaNs): https://reviews.facebook.net/D8379 Strings are written incorrectly when they span two chunks of a DynamicByteArray. E.g., if the original string is 'abcdefghi', the string written in the ORC file may be 'abcdefabc': https://reviews.facebook.net/D8385 Create a new Optimized Row Columnar file format for Hive Key: HIVE-3874 URL: https://issues.apache.org/jira/browse/HIVE-3874 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: hive.3874.2.patch, OrcFileIntro.pptx, orc.tgz There are several limitations of the current RC File format that I'd like to address by creating a new format: * each column value is stored as a binary blob, which means: ** the entire column value must be read, decompressed, and deserialized ** the file format can't use smarter type-specific compression ** push down filters can't be evaluated * the start of each row group needs to be found by scanning * user metadata can only be added to the file when the file is created * the file doesn't store the number of rows per file or row group * there is no mechanism for seeking to a particular row number, which is required for external indexes. * there is no mechanism for storing lightweight indexes within the file to enable push-down filters to skip entire row groups. * the type of the rows isn't stored in the file
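The chunk-spanning string issue reported above can be illustrated with a small sketch. It assumes a hypothetical chunked byte store, `ChunkedBytesSketch`, which is not ORC's actual `DynamicByteArray`; the class, its fields, and the chunk size used in the usage note are made up for illustration. The point is only that a copy crossing a chunk boundary has to advance to the next chunk and restart at its first byte.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical chunked byte store; a stand-in, not ORC's DynamicByteArray. */
final class ChunkedBytesSketch {
  private final int chunkSize;
  private final byte[][] chunks;

  ChunkedBytesSketch(byte[] data, int chunkSize) {
    this.chunkSize = chunkSize;
    int chunkCount = (data.length + chunkSize - 1) / chunkSize;
    this.chunks = new byte[chunkCount][chunkSize];
    for (int i = 0; i < data.length; i++) {
      chunks[i / chunkSize][i % chunkSize] = data[i];
    }
  }

  /** Copies `length` bytes starting at `offset`, crossing chunk boundaries as needed. */
  byte[] read(int offset, int length) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    int chunk = offset / chunkSize;
    int pos = offset % chunkSize;
    while (length > 0) {
      int n = Math.min(length, chunkSize - pos);
      out.write(chunks[chunk], pos, n);
      length -= n;
      chunk++; // move on to the next chunk ...
      pos = 0; // ... and restart at its first byte
    }
    return out.toByteArray();
  }
}
```

With a chunk size of 6, reading all nine bytes of 'abcdefghi' touches two chunks. Leaving out the `chunk++` step would re-read the first chunk and yield 'abcdefabc', which resembles the reported symptom; the source does not say whether that is the actual cause.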
Re: [VOTE] Amend Hive Bylaws + Add HCatalog Submodule
Most excellent. I'll start the vote in the HCatalog PPMC to approve this, and assuming that passes I'll then start a vote in the IPMC per the guidelines at http://incubator.apache.org/guides/graduation.html#subproject Alan. On Feb 4, 2013, at 2:27 PM, Carl Steinbach wrote: The following active Hive PMC members have cast votes: Carl Steinbach: +1, +1 Ashutosh Chauhan: +1, +1 Edward Capriolo: +1, +1 Ashish Thusoo: +1, +1 Yongqiang He: +1, +1 Namit Jain: +1, +1 Three active PMC members have abstained from voting. Over the last week the following four Hive PMC members requested that their status be changed from active to emeritus member: jvs, prasadc, zhao, pauly. Voting on these measures is now closed. Both measures have been approved with the required 2/3 majority of active Hive PMC members. Thanks. Carl On Thu, Jan 31, 2013 at 2:04 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: +1 and +1 non-binding. Great to see this happen! Thanks, +Vinod On Thu, Jan 31, 2013 at 12:14 AM, Namit Jain nj...@fb.com wrote: +1 and +1 On 1/30/13 6:53 AM, Gunther Hagleitner ghagleit...@hortonworks.com wrote: +1 and +1 Thanks, Gunther. On Tue, Jan 29, 2013 at 5:18 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Measure 1: +1 Measure 2: +1 On Mon, Jan 28, 2013 at 2:47 PM, Carl Steinbach c...@apache.org wrote: I am calling a vote on the following two measures.
Measure 1: Amend Hive Bylaws to Define Submodules and Submodule Committers If this measure passes the Apache Hive Project Bylaws will be amended with the following changes: https://cwiki.apache.org/confluence/display/Hive/Proposed+Changes+to+Hive+Bylaws+for+Submodule+Committers The motivation for these changes is discussed in the following email thread which appeared on the hive-dev and hcatalog-dev mailing lists: http://markmail.org/thread/u5nap7ghvyo7euqa Measure 2: Create HCatalog Submodule and Adopt HCatalog Codebase This measure provides for 1) the establishment of an HCatalog submodule in the Apache Hive Project, 2) the adoption of the Apache HCatalog codebase into the Hive HCatalog submodule, and 3) adding all currently active HCatalog committers as submodule committers on the Hive HCatalog submodule. Passage of this measure depends on the passage of Measure 1. Voting: Both measures require +1 votes from 2/3 of active Hive PMC members in order to pass. All participants in the Hive project are encouraged to vote on these measures, but only votes from active Hive PMC members are binding. The voting period commences immediately and shall last a minimum of six days. Voting is carried out by replying to this email thread. You must indicate which measure you are voting on in order for your vote to be counted. More details about the voting process can be found in the Apache Hive Project Bylaws: https://cwiki.apache.org/confluence/display/Hive/Bylaws -- +Vinod Hortonworks Inc. http://hortonworks.com/
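The 2/3 threshold in this thread can be checked with a little arithmetic. The sketch below assumes the number of active PMC members is nine, inferred from the tally above (six members voting +1 plus three abstaining); that count is an inference from the thread, not a figure the emails state outright, and the class and method names are hypothetical.

```java
/** Hypothetical helper for a "2/3 of active members" voting rule. */
final class SupermajoritySketch {
  /** Smallest number of +1 votes that is at least 2/3 of activeMembers. */
  static int requiredVotes(int activeMembers) {
    return (2 * activeMembers + 2) / 3; // integer ceiling of 2n/3
  }

  /** True when the +1 count meets the 2/3 threshold. */
  static boolean passes(int plusOnes, int activeMembers) {
    return plusOnes >= requiredVotes(activeMembers);
  }
}
```

Under that assumption, `requiredVotes(9)` is 6, so six +1 votes meet the threshold exactly, which is consistent with both measures being reported as approved.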
[jira] [Updated] (HIVE-3849) Aliased column in where clause for multi-groupby single reducer cannot be resolved
[ https://issues.apache.org/jira/browse/HIVE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-3849: -- Attachment: HIVE-3849.D7713.7.patch navis updated the revision HIVE-3849 [jira] Columns are not extracted for multi-groupby single reducer case sometimes. To address the comment, I've generified some important entities and found that it affects many classes. @namit: I think this is not what you expected, or is it? Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D7713 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D7713?vs=26973&id=27243#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/lib/DefaultGraphWalker.java ql/src/java/org/apache/hadoop/hive/ql/lib/DefaultRuleDispatcher.java ql/src/java/org/apache/hadoop/hive/ql/lib/Dispatcher.java ql/src/java/org/apache/hadoop/hive/ql/lib/GraphWalker.java ql/src/java/org/apache/hadoop/hive/ql/lib/Node.java ql/src/java/org/apache/hadoop/hive/ql/lib/NodeProcessor.java ql/src/java/org/apache/hadoop/hive/ql/lib/PreOrderWalker.java ql/src/java/org/apache/hadoop/hive/ql/lib/Rule.java ql/src/java/org/apache/hadoop/hive/ql/lib/RuleExactMatch.java ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ql/src/java/org/apache/hadoop/hive/ql/lib/TaskGraphWalker.java ql/src/java/org/apache/hadoop/hive/ql/lib/Utils.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractBucketJoinProc.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketMapJoinOptimizer.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPruner.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMROperator.java
ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRRedSink1.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRRedSink2.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRRedSink3.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRUnion1.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/GroupByOptimizer.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinFactory.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/PrunerExpressionOperatorFactory.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/PrunerOperatorFactory.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/SamplePruner.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/SkewJoinOptimizer.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedMergeBucketMapJoinOptimizer.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/ExprProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/OpProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrExprProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrOpProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/BucketingSortingOpProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MetadataOnlyOptimizer.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinResolver.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/unionproc/UnionProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/parse/ASTNode.java ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/parse/PrintOpTreeProcessor.java ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
[jira] [Commented] (HIVE-3959) Update Partition Statistics in Metastore Layer
[ https://issues.apache.org/jira/browse/HIVE-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570910#comment-13570910 ] Bhushan Mandhani commented on HIVE-3959: Diff out at https://reviews.facebook.net/D8271 Still need some minor updates before I can submit the patch. Update Partition Statistics in Metastore Layer -- Key: HIVE-3959 URL: https://issues.apache.org/jira/browse/HIVE-3959 Project: Hive Issue Type: Improvement Components: Metastore, Statistics Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Minor When partitions are created using queries (insert overwrite and insert into) then the StatsTask updates all stats. However, when partitions are added directly through metadata-only partitions (either CLI or direct calls to Thrift Metastore) no stats are populated even if hive.stats.reliable is set to true. This puts us in a situation where we can't decide if stats are truly reliable or not. We propose that the fast stats (numFiles and totalSize) which don't require a scan of the data should always be populated and be completely reliable. For now we are still excluding rowCount and rawDataSize because that will make these operations very expensive. Currently they are quick metadata-only ops.
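To make the distinction between fast stats and scan-based stats concrete, here is a minimal sketch of computing numFiles and totalSize from a directory listing alone. It assumes a partition laid out as a flat directory on a local filesystem and uses `java.nio.file` as a stand-in for whatever filesystem API the metastore would actually call; `FastStatsSketch` and its fields are hypothetical names, not part of the proposed patch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical holder for the two listing-only statistics. */
final class FastStatsSketch {
  final long numFiles;
  final long totalSize;

  FastStatsSketch(long numFiles, long totalSize) {
    this.numFiles = numFiles;
    this.totalSize = totalSize;
  }

  /** Counts regular files and sums their sizes without opening any of them. */
  static FastStatsSketch of(Path partitionDir) throws IOException {
    long files = 0;
    long bytes = 0;
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(partitionDir)) {
      for (Path entry : entries) {
        if (Files.isRegularFile(entry)) {
          files++;
          bytes += Files.size(entry); // file metadata only, no data scan
        }
      }
    }
    return new FastStatsSketch(files, bytes);
  }
}
```

Nothing here reads file contents, which is why these two values stay cheap; rowCount and rawDataSize would need the data itself to be read, as the issue description notes.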
Re: Review Request: Add support for pulling HBase columns with prefixes
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9276/#review16080 --- hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java https://reviews.apache.org/r/9276/#comment34401 This seems like a limited case of pattern matching. Swarnim, any way we can support generic regex matching instead? - Mark Grover On Feb. 3, 2013, 1:04 a.m., Swarnim Kulkarni wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9276/ --- (Updated Feb. 3, 2013, 1:04 a.m.) Review request for hive. Description --- Added support for pulling HBase columns just by providing prefixes and a wildcard. So a query could now look something like this: CREATE EXTERNAL TABLE hive_hbase_test ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,fam1:col*") TBLPROPERTIES ("hbase.table.name" = "TEST_HBASE_TABLE"); This would pull in all columns under column family fam1 which start with col. This gives a little more flexibility over the pull-all-columns format. This addresses bug HIVE-3725. https://issues.apache.org/jira/browse/HIVE-3725 Diffs - hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 7f37ba5 hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java a8ba9d9 hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java d35bb52 hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java e821282 Diff: https://reviews.apache.org/r/9276/diff/ Testing --- Added unit tests to demonstrate the new functionality. Also made sure that all existing unit tests passed. Thanks, Swarnim Kulkarni
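The prefix-plus-wildcard selection described in this review request can be sketched as follows. This is an illustration under stated assumptions, not the code in the patch: `QualifierPrefixSketch` and `select` are hypothetical names, and the sketch handles only three forms of a mapping entry (an exact qualifier, a trailing `*`, and a bare family such as `fam1:`).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical matcher for mapping entries such as "fam1:col*". */
final class QualifierPrefixSketch {
  /** Returns the qualifiers of `family` selected by `mappingEntry`. */
  static List<String> select(String mappingEntry, String family, List<String> qualifiers) {
    int colon = mappingEntry.indexOf(':');
    if (colon < 0) {
      throw new IllegalArgumentException("expected family:qualifier, got " + mappingEntry);
    }
    String mappedFamily = mappingEntry.substring(0, colon);
    String spec = mappingEntry.substring(colon + 1);

    List<String> selected = new ArrayList<>();
    if (!mappedFamily.equals(family)) {
      return selected; // entry refers to a different column family
    }
    // An empty spec means the whole family; a trailing '*' means prefix match.
    boolean prefixMatch = spec.isEmpty() || spec.endsWith("*");
    String prefix = spec.endsWith("*") ? spec.substring(0, spec.length() - 1) : spec;
    for (String qualifier : qualifiers) {
      if (prefixMatch ? qualifier.startsWith(prefix) : qualifier.equals(prefix)) {
        selected.add(qualifier);
      }
    }
    return selected;
  }
}
```

Given qualifiers `col1`, `col2`, and `other` in family `fam1`, the entry `fam1:col*` selects the first two. The reviewer's suggestion of generic regex matching would presumably swap the `startsWith` test for a `java.util.regex.Pattern` match; the thread does not say how that would be specified in the mapping string.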
[jira] [Commented] (HIVE-3725) Add support for pulling HBase columns with prefixes
[ https://issues.apache.org/jira/browse/HIVE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570925#comment-13570925 ] Mark Grover commented on HIVE-3725: --- Comments on reviewboard Add support for pulling HBase columns with prefixes --- Key: HIVE-3725 URL: https://issues.apache.org/jira/browse/HIVE-3725 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.9.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-3725.1.patch.txt The current HBase Hive integration supports reading many values from the same row by specifying a column family, and specifying just the column family pulls in all qualifiers within the family. We should add support for specifying a prefix for the qualifier, so that all columns that start with the prefix are automatically pulled in. Wildcard support would be ideal.
Re: Review Requests
Swarnim, I left some comments on reviewboard. On Mon, Feb 4, 2013 at 8:00 AM, kulkarni.swar...@gmail.com kulkarni.swar...@gmail.com wrote: Hello, I opened up two reviews for small issues, HIVE-3553[1] and HIVE-3725[2]. If you guys get a chance to review and provide feedback on them, I would really appreciate it. Thanks, [1] https://reviews.apache.org/r/9275/ [2] https://reviews.apache.org/r/9276/ -- Swarnim
[jira] [Comment Edited] (HIVE-3982) Merge PTFDesc and PTFDef classes
[ https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571030#comment-13571030 ] Ashutosh Chauhan edited comment on HIVE-3982 at 2/5/13 4:26 AM: Committed to branch. was (Author: ashutoshc): Committed to trunk. Merge PTFDesc and PTFDef classes Key: HIVE-3982 URL: https://issues.apache.org/jira/browse/HIVE-3982 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3982_2.patch, hive-3982.patch As discussed on https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271
[jira] [Resolved] (HIVE-3982) Merge PTFDesc and PTFDef classes
[ https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-3982. Resolution: Fixed Committed to trunk. Merge PTFDesc and PTFDef classes Key: HIVE-3982 URL: https://issues.apache.org/jira/browse/HIVE-3982 Project: Hive Issue Type: Task Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: hive-3982_2.patch, hive-3982.patch As discussed on https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271
Fwd: [VOTE] Graduate HCatalog from the incubator and become part of Hive
FYI. Alan. Begin forwarded message: From: Alan Gates ga...@hortonworks.com Date: February 4, 2013 10:18:09 PM PST To: hcatalog-...@incubator.apache.org Subject: [VOTE] Graduate HCatalog from the incubator and become part of Hive The Hive PMC has voted to accept HCatalog as a submodule of Hive. You can see the vote thread at http://mail-archives.apache.org/mod_mbox/hive-dev/201301.mbox/%3cCACf6RrzktBYD0suZxn3Pfv8XkR=vgwszrzyb_2qvesuj2vh...@mail.gmail.com%3e . We now need to vote to graduate from the incubator and become a submodule of Hive. This entails the following: 1) the establishment of an HCatalog submodule in the Apache Hive Project; 2) the adoption of the Apache HCatalog codebase into the Hive HCatalog submodule; and 3) adding all currently active HCatalog committers as submodule committers on the Hive HCatalog submodule. Definitions for all these can be found in the (now adopted) Hive bylaws at https://cwiki.apache.org/confluence/display/Hive/Proposed+Changes+to+Hive+Bylaws+for+Submodule+Committer. This vote will stay open for at least 72 hours (thus 23:00 PST on 2/7/13). PPMC members' votes are binding in this vote, though input from all is welcome. If this vote passes, the next step will be to submit the graduation motion to the Incubator PMC. Here's my +1. Alan.
[jira] [Updated] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly
[ https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-2839: - Status: Open (was: Patch Available) comments Filters on outer join with mapjoin hint is not applied correctly Key: HIVE-2839 URL: https://issues.apache.org/jira/browse/HIVE-2839 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, HIVE-2839.D2079.4.patch, HIVE-2839.D2079.5.patch, HIVE-2839.D2079.6.patch Testing HIVE-2820, I've found some queries with mapjoin hint makes exceptions. {code} SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND true limit 10; FAILED: Hive Internal Error: java.lang.ClassCastException(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc) java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:363) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.generateMapJoinOperator(MapJoinProcessor.java:483) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.transform(MapJoinProcessor.java:689) at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7519) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:891) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at 
org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) {code} and {code} SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND b.key * 10 '1000' limit 10; java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325) at org.apache.hadoop.mapred.Child$4.run(Child.java:270) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:416) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127) at org.apache.hadoop.mapred.Child.main(Child.java:264) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:495) at 
org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143) ... 8 more {code}
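The first trace above shows a constant (`true`) in the join filter being cast to a function-call descriptor. A minimal sketch of the general defensive pattern follows, using hypothetical stand-in types (`Expr`, `Const`, `Func`) rather than Hive's own expression classes; it is not the fix applied in the attached patches, only an illustration of checking the node type before casting when splitting a conjunction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical expression tree; stand-ins, not Hive's ExprNodeDesc classes. */
final class FilterExprSketch {
  interface Expr {}

  static final class Const implements Expr {
    final Object value;
    Const(Object value) { this.value = value; }
  }

  static final class Func implements Expr {
    final String name;
    final List<Expr> children;
    Func(String name, Expr... children) {
      this.name = name;
      this.children = Arrays.asList(children);
    }
  }

  /** Splits a conjunction into its terms without assuming every term is a Func. */
  static List<Expr> conjuncts(Expr expr) {
    List<Expr> terms = new ArrayList<>();
    if (expr instanceof Func && ((Func) expr).name.equals("and")) {
      for (Expr child : ((Func) expr).children) {
        terms.addAll(conjuncts(child));
      }
    } else {
      terms.add(expr); // constants such as `true` are kept as-is, never cast
    }
    return terms;
  }
}
```

For a condition shaped like `a.key = b.key AND true`, this returns two terms, the second of which is a `Const`; an unconditional cast of each term to `Func` is the step that would throw.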
[jira] [Commented] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly
[ https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571117#comment-13571117 ] Phabricator commented on HIVE-2839: --- njain has commented on the revision HIVE-2839 [jira] Filters on outer join with mapjoin hint is not applied correctly. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java:169 This should not be possible. We don't support join - mapjoin or union - mapjoin. I am not sure about LateralView - mapjoin, but if it is allowed, I will add a jira to stop that and fix it. ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java:181 Remove this function; assert that parents.size() == 1. ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java:128 Return a List instead of an ArrayList. ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java:130 result can be a List instead of an ArrayList. REVISION DETAIL https://reviews.facebook.net/D2079 To: JIRA, navis Cc: njain Filters on outer join with mapjoin hint is not applied correctly Key: HIVE-2839 URL: https://issues.apache.org/jira/browse/HIVE-2839 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, HIVE-2839.D2079.4.patch, HIVE-2839.D2079.5.patch, HIVE-2839.D2079.6.patch Testing HIVE-2820, I've found some queries with mapjoin hint makes exceptions.
[jira] [Updated] (HIVE-2991) Integrate Clover with Hive
[ https://issues.apache.org/jira/browse/HIVE-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Katsov updated HIVE-2991: -- Attachment: (was: hive.2991.2.trunk.patch) Integrate Clover with Hive -- Key: HIVE-2991 URL: https://issues.apache.org/jira/browse/HIVE-2991 Project: Hive Issue Type: Test Components: Testing Infrastructure Affects Versions: 0.9.0 Reporter: Ashutosh Chauhan Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2991.D2985.1.patch, hive.2991.1.branch-0.10.patch, hive.2991.1.branch-0.9.patch, hive.2991.1.trunk.patch, hive.2991.2.branch-0.10.patch, hive.2991.2.branch-0.9.patch, hive.2991.2.trunk.patch, hive-trunk-clover-html-report.zip Atlassian has donated a license for their code coverage tool Clover to the ASF. Let's make use of it to generate a code coverage report to figure out which areas of Hive are well tested and which ones are not. More information about the license can be found in Hadoop jira HADOOP-1718
[jira] [Updated] (HIVE-2991) Integrate Clover with Hive
[ https://issues.apache.org/jira/browse/HIVE-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Katsov updated HIVE-2991: -- Attachment: hive.2991.2.trunk.patch Integrate Clover with Hive -- Key: HIVE-2991 URL: https://issues.apache.org/jira/browse/HIVE-2991 Project: Hive Issue Type: Test Components: Testing Infrastructure Affects Versions: 0.9.0 Reporter: Ashutosh Chauhan Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2991.D2985.1.patch, hive.2991.1.branch-0.10.patch, hive.2991.1.branch-0.9.patch, hive.2991.1.trunk.patch, hive.2991.2.branch-0.10.patch, hive.2991.2.branch-0.9.patch, hive.2991.2.trunk.patch, hive-trunk-clover-html-report.zip Atlassian has donated a license for their code coverage tool Clover to the ASF. Let's make use of it to generate a code coverage report to figure out which areas of Hive are well tested and which ones are not. More information about the license can be found in Hadoop jira HADOOP-1718
[jira] [Updated] (HIVE-2991) Integrate Clover with Hive
[ https://issues.apache.org/jira/browse/HIVE-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Katsov updated HIVE-2991: -- Status: Patch Available (was: Open) Integrate Clover with Hive -- Key: HIVE-2991 URL: https://issues.apache.org/jira/browse/HIVE-2991 Project: Hive Issue Type: Test Components: Testing Infrastructure Affects Versions: 0.9.0 Reporter: Ashutosh Chauhan Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2991.D2985.1.patch, hive.2991.1.branch-0.10.patch, hive.2991.1.branch-0.9.patch, hive.2991.1.trunk.patch, hive.2991.2.branch-0.10.patch, hive.2991.2.branch-0.9.patch, hive.2991.2.trunk.patch, hive-trunk-clover-html-report.zip Atlassian has donated a license for their code coverage tool Clover to the ASF. Let's make use of it to generate a code coverage report to figure out which areas of Hive are well tested and which ones are not. More information about the license can be found in Hadoop jira HADOOP-1718