[jira] [Commented] (HIVE-1772) optimize join followed by a groupby
[ https://issues.apache.org/jira/browse/HIVE-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268913#comment-13268913 ]

Radhika Malik commented on HIVE-1772:

A group of us is trying to do this for a class project. We want to parallelize the process of JOIN followed by GROUP BY as follows:

The Map job is the same: it takes two TableScanOperators (plus any FilterOperators) and two ReduceSinkOperators.

The Reduce job, while computing the joins in the JoinOperator, also groups the results and performs any aggregates. It then pushes the results directly to a FileSinkOperator without a separate GroupByOperator.

Does anyone have suggestions on where we can get started in the code? Looking at Hive's architecture overview, it seems we want to change the Query Plan Generator in the compiler so that it generates different map-reduce tasks for queries that include a Join followed by a Group By. We are thinking of starting by modifying src/ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java, but we are not sure this is the right approach. Any input on how you think we should approach this would be great!

optimize join followed by a groupby
Key: HIVE-1772 URL: https://issues.apache.org/jira/browse/HIVE-1772 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Navis Attachments: HIVE-1772.1.patch

explain SELECT x.key, count(1) FROM src1 x JOIN src y ON (x.key = y.key) group by x.key;

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 depends on stages: Stage-1
  Stage-0 is a root stage

The above query issues 2 map-reduce jobs. The first MR job performs the join, whereas the second MR job performs the group by. Since the data is already sorted, the group by can be performed in the reducer of the join itself.

-- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
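For readers following HIVE-1772, the idea that "the group by can be performed in the reducer of the join itself" can be sketched in plain Java, outside Hive. This is only an illustration under stated assumptions, not Hive operator code: the class `JoinThenCount`, the method `countJoinedRows`, and the sorted key lists are hypothetical names invented here. It assumes both inputs arrive sorted by the join key (as they would at the reducer) and computes `count(1)` per key during the same pass that matches the join keys.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; none of these names come from Hive.
class JoinThenCount {
    // Both lists must be sorted by key. Returns key -> count(1) over the
    // inner join, computed in the same pass that matches the join keys.
    public static Map<String, Long> countJoinedRows(List<String> leftKeys,
                                                    List<String> rightKeys) {
        Map<String, Long> counts = new LinkedHashMap<>();
        int i = 0;
        int j = 0;
        while (i < leftKeys.size() && j < rightKeys.size()) {
            int cmp = leftKeys.get(i).compareTo(rightKeys.get(j));
            if (cmp < 0) {
                i++;            // key only on the left: no join output
            } else if (cmp > 0) {
                j++;            // key only on the right: no join output
            } else {
                String key = leftKeys.get(i);
                long left = 0;
                long right = 0;
                while (i < leftKeys.size() && leftKeys.get(i).equals(key)) {
                    i++;
                    left++;
                }
                while (j < rightKeys.size() && rightKeys.get(j).equals(key)) {
                    j++;
                    right++;
                }
                // The join emits left * right rows for this key, so the
                // aggregate is known without a second map-reduce job.
                counts.put(key, left * right);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> x = Arrays.asList("a", "a", "b", "d");
        List<String> y = Arrays.asList("a", "b", "b", "c");
        System.out.println(countJoinedRows(x, y)); // prints {a=2, b=2}
    }
}
```

Because the rows for one key are adjacent in sorted input, the aggregate for that key is complete as soon as the key changes, which is why no second shuffle is needed in this simplified setting.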
[jira] [Commented] (HIVE-3003) ant gen-test fails in ./ql
[ https://issues.apache.org/jira/browse/HIVE-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268928#comment-13268928 ]

Carl Steinbach commented on HIVE-3003:

@Tim: I added you to the wiki ACL. Let me know if you run into any problems.

ant gen-test fails in ./ql
Key: HIVE-3003 URL: https://issues.apache.org/jira/browse/HIVE-3003 Project: Hive Issue Type: Bug Components: Documentation Reporter: Gang Tim Liu Assignee: Gang Tim Liu Priority: Minor

ant gen-test fails in hive-root/ql.

Expected behavior
=================
BUILD SUCCESSFUL

Actual behavior
===============
BUILD FAILED
./hive/ql/build.xml:85: Problem: failed to create task or type if
Cause: The name is undefined.
Action: Check the spelling.
Action: Check that any custom tasks/types have been declared.
Action: Check that any presetdef/macrodef declarations have taken place.

How to reproduce it
===================
1. git clone git://github.com/facebook/arcanist.git
2. ant clean package eclipse-files
3. cd metastore/
4. ant model-jar
5. cd ../ql
6. ant gen-test

Details in https://cwiki.apache.org/confluence/display/Hive/GettingStarted+EclipseSetup
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #6
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/6/ -- [...truncated 5046 lines...] A ql/src/gen/thrift/gen-py/queryplan A ql/src/gen/thrift/gen-py/queryplan/ttypes.py A ql/src/gen/thrift/gen-py/queryplan/constants.py A ql/src/gen/thrift/gen-py/queryplan/__init__.py A ql/src/gen/thrift/gen-cpp A ql/src/gen/thrift/gen-cpp/queryplan_constants.h A ql/src/gen/thrift/gen-cpp/queryplan_types.cpp A ql/src/gen/thrift/gen-cpp/queryplan_types.h A ql/src/gen/thrift/gen-cpp/queryplan_constants.cpp A ql/src/gen/thrift/gen-rb A ql/src/gen/thrift/gen-rb/queryplan_types.rb A ql/src/gen/thrift/gen-rb/queryplan_constants.rb A ql/src/gen/thrift/gen-javabean A ql/src/gen/thrift/gen-javabean/org A ql/src/gen/thrift/gen-javabean/org/apache A ql/src/gen/thrift/gen-javabean/org/apache/hadoop A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/QueryPlan.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Adjacency.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Graph.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Task.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/AdjacencyType.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Stage.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/TaskType.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Query.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/StageType.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/NodeType.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Operator.java A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java A ql/src/gen/thrift/gen-php A ql/src/gen/thrift/gen-php/queryplan A ql/src/gen/thrift/gen-php/queryplan/queryplan_types.php A ql/src/gen-javabean A ql/src/gen-javabean/org A ql/src/gen-javabean/org/apache A ql/src/gen-javabean/org/apache/hadoop A ql/src/gen-javabean/org/apache/hadoop/hive A ql/src/gen-javabean/org/apache/hadoop/hive/ql A ql/src/gen-javabean/org/apache/hadoop/hive/ql/plan A ql/src/gen-javabean/org/apache/hadoop/hive/ql/plan/api A ql/src/gen-php A ql/build.xml A ql/if A ql/if/queryplan.thrift A pdk A pdk/ivy.xml A pdk/scripts A pdk/scripts/class-registration.xsl A pdk/scripts/build-plugin.xml A pdk/scripts/README A pdk/src A pdk/src/java A pdk/src/java/org A pdk/src/java/org/apache A pdk/src/java/org/apache/hive A pdk/src/java/org/apache/hive/pdk A pdk/src/java/org/apache/hive/pdk/FunctionExtractor.java A pdk/src/java/org/apache/hive/pdk/HivePdkUnitTest.java A pdk/src/java/org/apache/hive/pdk/HivePdkUnitTests.java A pdk/src/java/org/apache/hive/pdk/PluginTest.java A pdk/test-plugin A pdk/test-plugin/test A pdk/test-plugin/test/cleanup.sql A pdk/test-plugin/test/onerow.txt A pdk/test-plugin/test/setup.sql A pdk/test-plugin/src A pdk/test-plugin/src/org A pdk/test-plugin/src/org/apache A pdk/test-plugin/src/org/apache/hive A pdk/test-plugin/src/org/apache/hive/pdktest A pdk/test-plugin/src/org/apache/hive/pdktest/Rot13.java A pdk/test-plugin/build.xml A pdk/build.xml A build-offline.xml U. 
At revision 1334438 no change for http://svn.apache.org/repos/asf/hive/branches/branch-0.9 since the previous build [hive] $ /home/hudson/tools/ant/apache-ant-1.8.1/bin/ant -Dversion=0.9.1-SNAPSHOT very-clean tar binary Buildfile: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build.xml ivy-init-dirs: [echo] Project: hive [mkdir] Created dir: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy [mkdir] Created dir: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy/lib [mkdir] Created dir: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy/report [mkdir] Created dir: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy/maven ivy-download: [echo] Project: hive [get] Getting:
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #6
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/6/ -- [...truncated 5043 lines...] A ql/src/gen A ql/src/gen/thrift A ql/src/gen/thrift/gen-py A ql/src/gen/thrift/gen-py/queryplan A ql/src/gen/thrift/gen-py/queryplan/ttypes.py A ql/src/gen/thrift/gen-py/queryplan/constants.py A ql/src/gen/thrift/gen-py/queryplan/__init__.py A ql/src/gen/thrift/gen-cpp A ql/src/gen/thrift/gen-cpp/queryplan_constants.h A ql/src/gen/thrift/gen-cpp/queryplan_types.cpp A ql/src/gen/thrift/gen-cpp/queryplan_types.h A ql/src/gen/thrift/gen-cpp/queryplan_constants.cpp A ql/src/gen/thrift/gen-rb A ql/src/gen/thrift/gen-rb/queryplan_types.rb A ql/src/gen/thrift/gen-rb/queryplan_constants.rb A ql/src/gen/thrift/gen-javabean A ql/src/gen/thrift/gen-javabean/org A ql/src/gen/thrift/gen-javabean/org/apache A ql/src/gen/thrift/gen-javabean/org/apache/hadoop A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/QueryPlan.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Adjacency.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Graph.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Task.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/AdjacencyType.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Stage.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/TaskType.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Query.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/StageType.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/NodeType.java A 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Operator.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java A ql/src/gen/thrift/gen-php A ql/src/gen/thrift/gen-php/queryplan A ql/src/gen/thrift/gen-php/queryplan/queryplan_types.php A ql/src/gen-javabean A ql/src/gen-javabean/org A ql/src/gen-javabean/org/apache A ql/src/gen-javabean/org/apache/hadoop A ql/src/gen-javabean/org/apache/hadoop/hive A ql/src/gen-javabean/org/apache/hadoop/hive/ql A ql/src/gen-javabean/org/apache/hadoop/hive/ql/plan A ql/src/gen-javabean/org/apache/hadoop/hive/ql/plan/api A ql/src/gen-php A ql/build.xml A ql/if A ql/if/queryplan.thrift A pdk A pdk/ivy.xml A pdk/scripts A pdk/scripts/class-registration.xsl A pdk/scripts/build-plugin.xml A pdk/scripts/README A pdk/src A pdk/src/java A pdk/src/java/org A pdk/src/java/org/apache A pdk/src/java/org/apache/hive A pdk/src/java/org/apache/hive/pdk A pdk/src/java/org/apache/hive/pdk/FunctionExtractor.java A pdk/src/java/org/apache/hive/pdk/HivePdkUnitTest.java A pdk/src/java/org/apache/hive/pdk/HivePdkUnitTests.java A pdk/src/java/org/apache/hive/pdk/PluginTest.java A pdk/test-plugin A pdk/test-plugin/test A pdk/test-plugin/test/cleanup.sql A pdk/test-plugin/test/onerow.txt A pdk/test-plugin/test/setup.sql A pdk/test-plugin/src A pdk/test-plugin/src/org A pdk/test-plugin/src/org/apache A pdk/test-plugin/src/org/apache/hive A pdk/test-plugin/src/org/apache/hive/pdktest A pdk/test-plugin/src/org/apache/hive/pdktest/Rot13.java A pdk/test-plugin/build.xml A pdk/build.xml A build-offline.xml U. 
At revision 1334438 no change for http://svn.apache.org/repos/asf/hive/branches/branch-0.9 since the previous build [hive] $ /home/hudson/tools/ant/apache-ant-1.8.1/bin/ant -Dversion=0.9.1-SNAPSHOT very-clean tar binary Buildfile: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build.xml ivy-init-dirs: [echo] Project: hive [mkdir] Created dir: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy [mkdir] Created dir: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/lib [mkdir] Created dir: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/report [mkdir] Created dir:
[jira] [Commented] (HIVE-3003) ant gen-test fails in ./ql
[ https://issues.apache.org/jira/browse/HIVE-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269010#comment-13269010 ]

Gang Tim Liu commented on HIVE-3003:

@Carl Thanks a lot. I will let you know how it goes. Thanks again, Tim

ant gen-test fails in ./ql
Key: HIVE-3003 URL: https://issues.apache.org/jira/browse/HIVE-3003 Project: Hive Issue Type: Bug Components: Documentation Reporter: Gang Tim Liu Assignee: Gang Tim Liu Priority: Minor

ant gen-test fails in hive-root/ql.

Expected behavior
=================
BUILD SUCCESSFUL

Actual behavior
===============
BUILD FAILED
./hive/ql/build.xml:85: Problem: failed to create task or type if
Cause: The name is undefined.
Action: Check the spelling.
Action: Check that any custom tasks/types have been declared.
Action: Check that any presetdef/macrodef declarations have taken place.

How to reproduce it
===================
1. git clone git://github.com/facebook/arcanist.git
2. ant clean package eclipse-files
3. cd metastore/
4. ant model-jar
5. cd ../ql
6. ant gen-test

Details in https://cwiki.apache.org/confluence/display/Hive/GettingStarted+EclipseSetup
Hive-trunk-h0.21 - Build # 1413 - Still Failing
Changes for Build #1408

Changes for Build #1409
[hashutosh] HIVE-2990 Remove hadoop-source Ivy resolvers and Ant targets (Carl Steinbach via Ashutosh Chauhan)

Changes for Build #1410

Changes for Build #1411

Changes for Build #1412
[namit] HIVE-3002 Revert HIVE-2986 (Kevin Wilfong via namit)
[namit] HIVE-2994 Pass a environment context to metastore thrift APIs (Delia David via namit)

Changes for Build #1413

1 tests failed.
FAILED: org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1

Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.

Stack Trace:
junit.framework.AssertionFailedError: Unexpected exception See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
at junit.framework.Assert.fail(Assert.java:50)
at org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1(TestNegativeCliDriver.java:10552)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1413)
Status: Still Failing
Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1413/ to view the results.
[jira] [Commented] (HIVE-2529) metastore 0.8 upgrade script for PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269092#comment-13269092 ]

Phabricator commented on HIVE-2529:

cwsteinbach has accepted the revision HIVE-2529 [jira] metastore 0.8 upgrade script for PostgreSQL.

+1.

REVISION DETAIL https://reviews.facebook.net/D3027
BRANCH HIVE-2529

metastore 0.8 upgrade script for PostgreSQL
Key: HIVE-2529 URL: https://issues.apache.org/jira/browse/HIVE-2529 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.8.0 Reporter: John Sichi Assignee: Zhenxiao Luo Priority: Blocker Attachments: HIVE-2529.1.patch.txt, HIVE-2529.D3027.1.patch

I think you mentioned that this was in the works.
[jira] [Updated] (HIVE-2529) metastore 0.8 upgrade script for PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-2529:

Resolution: Fixed
Fix Version/s: 0.10.0
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Committed to trunk. Thanks Zhenxiao!

metastore 0.8 upgrade script for PostgreSQL
Key: HIVE-2529 URL: https://issues.apache.org/jira/browse/HIVE-2529 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.8.0 Reporter: John Sichi Assignee: Zhenxiao Luo Priority: Blocker Fix For: 0.10.0 Attachments: HIVE-2529.1.patch.txt, HIVE-2529.D3027.1.patch

I think you mentioned that this was in the works.
[jira] [Commented] (HIVE-2529) metastore 0.8 upgrade script for PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269094#comment-13269094 ]

Phabricator commented on HIVE-2529:

zhenxiao has committed the revision HIVE-2529 [jira] metastore 0.8 upgrade script for PostgreSQL.

Change committed by cws.

REVISION DETAIL https://reviews.facebook.net/D3027
COMMIT https://reviews.facebook.net/rHIVE1334537

metastore 0.8 upgrade script for PostgreSQL
Key: HIVE-2529 URL: https://issues.apache.org/jira/browse/HIVE-2529 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.8.0 Reporter: John Sichi Assignee: Zhenxiao Luo Priority: Blocker Fix For: 0.10.0 Attachments: HIVE-2529.1.patch.txt, HIVE-2529.D3027.1.patch

I think you mentioned that this was in the works.
[jira] [Commented] (HIVE-1719) Move RegexSerDe out of hive-contrib and over to hive-serde
[ https://issues.apache.org/jira/browse/HIVE-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269095#comment-13269095 ]

Phabricator commented on HIVE-1719:

cwsteinbach has requested changes to the revision HIVE-1719 [jira] Move RegexSerDe out of hive-contrib and over to hive-serde.

Looks good, but we need to copy the regex serde testcases from contrib over to ql. I also noticed that the negative testcase isn't documented, and doesn't seem to exercise any of the error conditions in RegexSerDe.initialize(). Can you please add some additional test coverage for these cases? Thanks.

REVISION DETAIL https://reviews.facebook.net/D3051
BRANCH HIVE-1719

Move RegexSerDe out of hive-contrib and over to hive-serde
Key: HIVE-1719 URL: https://issues.apache.org/jira/browse/HIVE-1719 Project: Hive Issue Type: Task Components: Serializers/Deserializers Reporter: Carl Steinbach Assignee: Shreepadma Venugopalan Attachments: HIVE-1719.D3051.1.patch, HIVE-1719.D3051.2.patch

RegexSerDe is as much a part of the standard Hive distribution as the other SerDes currently in hive-serde. I think we should move it over to the hive-serde module so that users don't have to go to the added effort of manually registering the contrib jar before using it.
[jira] [Created] (HIVE-3004) RegexSerDe should support column types besides STRING
Carl Steinbach created HIVE-3004:

Summary: RegexSerDe should support column types besides STRING
Key: HIVE-3004 URL: https://issues.apache.org/jira/browse/HIVE-3004 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Carl Steinbach
[jira] [Commented] (HIVE-3004) RegexSerDe should support column types besides STRING
[ https://issues.apache.org/jira/browse/HIVE-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269096#comment-13269096 ]

Carl Steinbach commented on HIVE-3004:

Currently RegexSerDe requires that all columns in the table be of type STRING. We should modify it so that it automatically converts to the other primitive column types. If feasible, it would also be nice to support Hive's complex column types.

RegexSerDe should support column types besides STRING
Key: HIVE-3004 URL: https://issues.apache.org/jira/browse/HIVE-3004 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Carl Steinbach
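The conversion proposed in HIVE-3004 might look roughly like the following stand-alone Java sketch. It does not use the Hive SerDe API: `TypedRegexRow`, `ColType`, `deserialize`, and `convert` are hypothetical names made up for illustration, and mapping unparsable values to null is an assumption of this sketch rather than documented RegexSerDe behavior. Each regex capture group is treated as one column and converted to the type declared for that column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; this is not the Hive SerDe interface.
class TypedRegexRow {
    enum ColType { STRING, INT, DOUBLE, BOOLEAN }

    // Returns one Object per column, or null when the line does not match
    // the pattern or the group count differs from the column count.
    public static List<Object> deserialize(Pattern pattern, ColType[] types, String line) {
        Matcher m = pattern.matcher(line);
        if (m.groupCount() != types.length || !m.matches()) {
            return null;
        }
        List<Object> row = new ArrayList<>();
        for (int c = 0; c < types.length; c++) {
            row.add(convert(m.group(c + 1), types[c]));
        }
        return row;
    }

    // Converts the captured text to the declared primitive type.
    public static Object convert(String text, ColType type) {
        if (text == null) {
            return null;
        }
        try {
            switch (type) {
                case INT:
                    return Integer.valueOf(text);
                case DOUBLE:
                    return Double.valueOf(text);
                case BOOLEAN:
                    return Boolean.valueOf(text);
                default:
                    return text;
            }
        } catch (NumberFormatException e) {
            return null; // assumption: unparsable values become NULL
        }
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(\\S+) (\\d+) ([0-9.]+)");
        ColType[] types = { ColType.STRING, ColType.INT, ColType.DOUBLE };
        System.out.println(deserialize(p, types, "GET 200 1.5")); // prints [GET, 200, 1.5]
    }
}
```

Complex column types (arrays, maps, structs) are left out of the sketch; they would need a per-type text format, which the issue does not specify.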
[jira] [Assigned] (HIVE-3004) RegexSerDe should support column types besides STRING
[ https://issues.apache.org/jira/browse/HIVE-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach reassigned HIVE-3004:

Assignee: Shreepadma Venugopalan

RegexSerDe should support column types besides STRING
Key: HIVE-3004 URL: https://issues.apache.org/jira/browse/HIVE-3004 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Carl Steinbach Assignee: Shreepadma Venugopalan
[jira] [Updated] (HIVE-3004) RegexSerDe should support other column types in addition to STRING
[ https://issues.apache.org/jira/browse/HIVE-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-3004:

Summary: RegexSerDe should support other column types in addition to STRING (was: RegexSerDe should support column types besides STRING)

RegexSerDe should support other column types in addition to STRING
Key: HIVE-3004 URL: https://issues.apache.org/jira/browse/HIVE-3004 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Carl Steinbach Assignee: Shreepadma Venugopalan
[jira] [Created] (HIVE-3005) Query containing LIMIT 0 clause should skip execution phase
Carl Steinbach created HIVE-3005:

Summary: Query containing LIMIT 0 clause should skip execution phase
Key: HIVE-3005 URL: https://issues.apache.org/jira/browse/HIVE-3005 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Carl Steinbach
[jira] [Updated] (HIVE-3005) Skip execution phase for queries that contain LIMIT 0 clause
[ https://issues.apache.org/jira/browse/HIVE-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-3005:

Summary: Skip execution phase for queries that contain LIMIT 0 clause (was: Query containing LIMIT 0 clause should skip execution phase)

Skip execution phase for queries that contain LIMIT 0 clause
Key: HIVE-3005 URL: https://issues.apache.org/jira/browse/HIVE-3005 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Carl Steinbach
[jira] [Commented] (HIVE-3005) Query containing LIMIT 0 clause should skip execution phase
[ https://issues.apache.org/jira/browse/HIVE-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269099#comment-13269099 ]

Carl Steinbach commented on HIVE-3005:

Executing a query that contains a LIMIT 0 clause is a trick that some clients (e.g. ODBC clients) employ in order to generate a result set without incurring the cost of actually executing the query. Unfortunately, this trick doesn't work with Hive:

{noformat}
hive> SELECT key FROM SRC LIMIT 0;
SELECT key FROM SRC LIMIT 0;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Execution log at: /tmp/carl/carl_20120505182828_a5405bcb-c156-4572-b2d8-eda2cc199a14.log
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2012-05-05 18:28:35,999 null map = 100%, reduce = 0%
Ended Job = job_local_0001
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
OK
{noformat}

Query containing LIMIT 0 clause should skip execution phase
Key: HIVE-3005 URL: https://issues.apache.org/jira/browse/HIVE-3005 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Carl Steinbach
[jira] [Assigned] (HIVE-3005) Skip execution phase for queries that contain LIMIT 0 clause
[ https://issues.apache.org/jira/browse/HIVE-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach reassigned HIVE-3005:

Assignee: Shreepadma Venugopalan

Skip execution phase for queries that contain LIMIT 0 clause
Key: HIVE-3005 URL: https://issues.apache.org/jira/browse/HIVE-3005 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Carl Steinbach Assignee: Shreepadma Venugopalan
[jira] [Created] (HIVE-3006) execution of queries with always false WHERE clauses
Carl Steinbach created HIVE-3006:

Summary: execution of queries with always false WHERE clauses
Key: HIVE-3006 URL: https://issues.apache.org/jira/browse/HIVE-3006 Project: Hive Issue Type: Improvement Reporter: Carl Steinbach
[jira] [Updated] (HIVE-3006) Skip execution of queries with always false WHERE clauses
[ https://issues.apache.org/jira/browse/HIVE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-3006:

Component/s: Query Processor
Summary: Skip execution of queries with always false WHERE clauses (was: execution of queries with always false WHERE clauses)

Skip execution of queries with always false WHERE clauses
Key: HIVE-3006 URL: https://issues.apache.org/jira/browse/HIVE-3006 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Carl Steinbach
[jira] [Assigned] (HIVE-3006) Skip execution of queries with always false WHERE clauses
[ https://issues.apache.org/jira/browse/HIVE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach reassigned HIVE-3006:

Assignee: Shreepadma Venugopalan

Skip execution of queries with always false WHERE clauses
Key: HIVE-3006 URL: https://issues.apache.org/jira/browse/HIVE-3006 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Carl Steinbach Assignee: Shreepadma Venugopalan
[jira] [Commented] (HIVE-3006) Skip execution of queries with always false WHERE clauses
[ https://issues.apache.org/jira/browse/HIVE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269100#comment-13269100 ]

Carl Steinbach commented on HIVE-3006:

Some clients (e.g. ODBC clients) execute queries with constant WHERE clauses that are always false in order to generate a result set without incurring the expense of actually executing the query. Unfortunately, this technique does not work with Hive:

{noformat}
hive> SELECT key FROM src WHERE false;
SELECT key FROM src WHERE false;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Execution log at: /tmp/carl/carl_20120505183939_4008b766-5637-4c5d-8cb0-22fb222ee228.log
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2012-05-05 18:39:15,923 null map = 100%, reduce = 0%
Ended Job = job_local_0001
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
OK
Time taken: 3.664 seconds
hive> SELECT key FROM src WHERE 1=0;
SELECT key FROM src WHERE 1=0;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Execution log at: /tmp/carl/carl_20120505184040_074bd38d-2697-40e2-996a-91a46aaad71b.log
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2012-05-05 18:40:11,407 null map = 100%, reduce = 0%
Ended Job = job_local_0001
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
OK
Time taken: 3.615 seconds
{noformat}

Skip execution of queries with always false WHERE clauses
Key: HIVE-3006 URL: https://issues.apache.org/jira/browse/HIVE-3006 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Carl Steinbach
About the performance of job execution on Amazon EMR
Hello,

Does performance improve if we increase the number of mappers and reduce the number of reducers? I have never set the number of mappers and reducers, and I don't know how to do it. With multiple nodes, how many mappers and reducers should I set relative to the number of instances to improve performance? And if a large table needs many mappers but we have already fixed the number of mappers, will that be a problem?

Several aspects of my task might affect performance:

1) I create many tables at run time and drop them after reusing them.
2) I use many joins in my queries. (Most queries take 10-11 jobs to execute.)
3) I use indexing in the task. I apply the index after inserting the values into the table, because I have to reuse it.
4) I also dynamically alter tables and insert values.

Do all these factors need to be addressed separately to improve performance, or will setting the mappers and reducers take care of them?

Thanks for helping me, and sorry for the inconvenience of my repeated questions.

--
Regards,
Bhavesh Shah
[jira] [Commented] (HIVE-3000) Potential infinite loop / log spew in ZookeeperHiveLockManager
[ https://issues.apache.org/jira/browse/HIVE-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269125#comment-13269125 ]

Namit Jain commented on HIVE-3000:
----------------------------------

https://reviews.facebook.net/D3063

Potential infinite loop / log spew in ZookeeperHiveLockManager
--------------------------------------------------------------
Key: HIVE-3000
URL: https://issues.apache.org/jira/browse/HIVE-3000
Project: Hive
Issue Type: Bug
Components: Locking
Affects Versions: 0.9.0
Reporter: Paul Yang
Attachments: HIVE-3000.D3063.1.patch

See ZooKeeperHiveLockManager.lock()

If ZooKeeper is in a bad state, it's possible to get an exception (e.g. org.apache.zookeeper.KeeperException$SessionExpiredException) when we call lockPrimitive(). There is a bug in the exception handler where the loop does not exit, because the break in the switch statement exits only the switch, not the do..while loop. Because tryNum is not incremented when the exception is thrown, lockPrimitive() will be called in an infinite loop, as fast as possible. Since the exception is printed for each call, Hive will produce significant log spew.
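The control-flow problem described in HIVE-3000 can be illustrated with a small, self-contained sketch. This is not the Hive source and not the attached patch: the class RetryLoopSketch, the Kind enum, classify(), the demoCap parameter, and the stub lockPrimitive() (which always throws, as if ZooKeeper were in a bad state) are all hypothetical names made up for illustration. The "fixed" variant shows only one possible way to bound the loop, assuming it is acceptable to count an attempt before making the call.

```java
public class RetryLoopSketch {
    /** Hypothetical failure kinds; the real code switches on a ZooKeeper error code. */
    enum Kind { TRANSIENT, SERIOUS }

    /** Hypothetical stub: always throws, simulating ZooKeeper in a bad state. */
    static Object lockPrimitive() throws Exception {
        throw new Exception("simulated session expiry");
    }

    /** Hypothetical classifier standing in for inspecting the exception. */
    static Kind classify(Exception e) {
        return Kind.SERIOUS;
    }

    /**
     * Buggy shape. Returns how many times lockPrimitive() was called.
     * demoCap exists only so this demonstration terminates.
     */
    public static int buggyCalls(int maxRetries, int demoCap) {
        int tryNum = 0;
        int calls = 0;
        do {
            try {
                calls++;
                Object lock = lockPrimitive();
                if (lock != null) {
                    break; // success: this break does leave the do..while
                }
                tryNum++; // skipped whenever lockPrimitive() throws
            } catch (Exception e) {
                switch (classify(e)) {
                    case TRANSIENT:
                        break; // leaves the switch only
                    default:
                        break; // also leaves the switch only, not the do..while
                }
                if (calls >= demoCap) {
                    return calls; // demo-only exit; without it the loop never ends
                }
            }
        } while (tryNum < maxRetries);
        return calls;
    }

    /** One possible fix: count the attempt up front so failures also use up retries. */
    public static int fixedCalls(int maxRetries) {
        int tryNum = 0;
        int calls = 0;
        do {
            tryNum++;
            try {
                calls++;
                Object lock = lockPrimitive();
                if (lock != null) {
                    break;
                }
            } catch (Exception e) {
                // A real implementation would log here and fall through to the loop condition.
            }
        } while (tryNum < maxRetries);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println("buggy calls: " + buggyCalls(3, 1000));
        System.out.println("fixed calls: " + fixedCalls(3));
    }
}
```

With a retry limit of 3, the buggy variant keeps calling the stub until the demo-only cap stops it, while the fixed variant gives up after three calls. A labeled break on the do..while would be another way to leave the loop from inside the switch.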
[jira] [Updated] (HIVE-3000) Potential infinite loop / log spew in ZookeeperHiveLockManager
[ https://issues.apache.org/jira/browse/HIVE-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-3000:
------------------------------

Attachment: HIVE-3000.D3063.1.patch

njain requested code review of HIVE-3000 [jira] Potential infinite loop / log spew in ZookeeperHiveLockManager.

Reviewers: JIRA

https://issues.apache.org/jira/browse/HIVE-3000

HIVE-3000 infinite loop in ZooKeeperHiveLockManager

See ZooKeeperHiveLockManager.lock()

If ZooKeeper is in a bad state, it's possible to get an exception (e.g. org.apache.zookeeper.KeeperException$SessionExpiredException) when we call lockPrimitive(). There is a bug in the exception handler where the loop does not exit, because the break in the switch statement exits only the switch, not the do..while loop. Because tryNum is not incremented when the exception is thrown, lockPrimitive() will be called in an infinite loop, as fast as possible. Since the exception is printed for each call, Hive will produce significant log spew.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D3063

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/6963/

Tip: use the X-Herald-Rules header to filter Herald messages in your client.

Potential infinite loop / log spew in ZookeeperHiveLockManager
--------------------------------------------------------------
Key: HIVE-3000
URL: https://issues.apache.org/jira/browse/HIVE-3000
Project: Hive
Issue Type: Bug
Components: Locking
Affects Versions: 0.9.0
Reporter: Paul Yang
Attachments: HIVE-3000.D3063.1.patch
[jira] [Updated] (HIVE-3000) Potential infinite loop / log spew in ZookeeperHiveLockManager
[ https://issues.apache.org/jira/browse/HIVE-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3000:
----------------------------

Status: Patch Available (was: Open)

Potential infinite loop / log spew in ZookeeperHiveLockManager
--------------------------------------------------------------
Key: HIVE-3000
URL: https://issues.apache.org/jira/browse/HIVE-3000
Project: Hive
Issue Type: Bug
Components: Locking
Affects Versions: 0.9.0
Reporter: Paul Yang
Attachments: HIVE-3000.D3063.1.patch
[jira] [Commented] (HIVE-3006) Skip execution of queries with always false WHERE clauses
[ https://issues.apache.org/jira/browse/HIVE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269130#comment-13269130 ]

Namit Jain commented on HIVE-3006:
----------------------------------

Is src a partitioned table? In that case, I think all the partitions would be pruned, which would give the right result. But I agree, this should also work for non-partitioned tables.

Skip execution of queries with always false WHERE clauses
---------------------------------------------------------
Key: HIVE-3006
URL: https://issues.apache.org/jira/browse/HIVE-3006
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Carl Steinbach
Assignee: Shreepadma Venugopalan
[jira] [Commented] (HIVE-2804) Task log retrieval fails on secure cluster
[ https://issues.apache.org/jira/browse/HIVE-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269133#comment-13269133 ]

Phabricator commented on HIVE-2804:
-----------------------------------

njain has commented on the revision HIVE-2804 [jira] Task log retrieval fails on secure cluster.

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFNPE.java:1 Can this file be added in ql/src/test instead? You can add the relevant function in the test.

REVISION DETAIL
  https://reviews.facebook.net/D3057

Task log retrieval fails on secure cluster
------------------------------------------
Key: HIVE-2804
URL: https://issues.apache.org/jira/browse/HIVE-2804
Project: Hive
Issue Type: Bug
Components: Diagnosability, Query Processor, Security
Reporter: Carl Steinbach
Assignee: Zhenxiao Luo
Attachments: HIVE-2804.1.patch.txt, HIVE-2804.D3057.1.patch