[jira] Commented: (HIVE-1605) regression and improvements in handling NULLs in joins
[ https://issues.apache.org/jira/browse/HIVE-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904088#action_12904088 ]

Amareshwari Sriramadasu commented on HIVE-1605:
---

MapJoinOperator.java still has a debug log. Otherwise, the patch looks fine.

> regression and improvements in handling NULLs in joins
> --
>
> Key: HIVE-1605
> URL: https://issues.apache.org/jira/browse/HIVE-1605
> Project: Hadoop Hive
> Issue Type: Improvement
> Reporter: Ning Zhang
> Assignee: Ning Zhang
> Attachments: HIVE-1605.2.patch, HIVE-1605.patch
>
>
> There are regressions in sort-merge map join after HIVE-741. There are a lot
> of OOM exceptions in SMBMapJoinOperator. This is caused by the HashMap
> maintained for each key to remember whether it is NULL. This takes too much
> memory when the tables are large.
> A second issue is in handling NULLs if the join keys are more than 1 column.
> This appears in regular MapJoin as well as SMBMapJoin. The code only checks
> if all the columns are NULL. It should return false in match if any joined
> value is NULL.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
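The multi-column NULL rule in the HIVE-1605 description can be illustrated with a small Java sketch. This is not the code from the patch: the class `JoinKeyNulls` and the method `hasAnyNulls` are hypothetical names, and `hasAllNulls` here only mirrors the behaviour the description attributes to the existing check (the real method in AbstractMapJoinOperator.java may look different). A join key is modelled simply as a list of column values.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only; not taken from the HIVE-1605 patch. */
final class JoinKeyNulls {

    /** Behaviour described as the bug: true only when every key column is NULL. */
    static boolean hasAllNulls(List<?> key) {
        if (key.isEmpty()) {
            return false;
        }
        for (Object column : key) {
            if (column != null) {
                return false;
            }
        }
        return true;
    }

    /** Behaviour the issue asks for: true as soon as any key column is NULL. */
    static boolean hasAnyNulls(List<?> key) {
        for (Object column : key) {
            if (column == null) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Two-column join key where only the second column is NULL.
        List<Object> key = Arrays.<Object>asList("a", null);
        System.out.println("hasAllNulls: " + hasAllNulls(key));
        System.out.println("hasAnyNulls: " + hasAnyNulls(key));
    }
}
```

With a key such as `("a", NULL)`, the all-NULL check does not flag the row, so it can still be matched; the any-NULL check flags it, which lets the join skip the match as the description requests.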
[jira] Updated: (HIVE-1605) regression and improvements in handling NULLs in joins
[ https://issues.apache.org/jira/browse/HIVE-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1605:
-

Attachment: HIVE-1605.2.patch

Thanks Amareshwari for the review. The attached HIVE-1605.2.patch addresses the issues.
[jira] Commented: (HIVE-1605) regression and improvements in handling NULLs in joins
[ https://issues.apache.org/jira/browse/HIVE-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904085#action_12904085 ]

Amareshwari Sriramadasu commented on HIVE-1605:
---

Ning, thanks for looking into this. A couple of minor comments:
* The patch has many debug logs and commented-out code. Do you want to remove them?
* Do you want to remove the hasAllNulls method from AbstractMapJoinOperator.java?
[jira] Updated: (HIVE-1605) regression and improvements in handling NULLs in joins
[ https://issues.apache.org/jira/browse/HIVE-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1605:
-

Attachment: HIVE-1605.patch

Passed all tests except scriptfile1.q in TestMinimrCliDriver on Hadoop 0.20. This test also failed on trunk.
[jira] Updated: (HIVE-1605) regression and improvements in handling NULLs in joins
[ https://issues.apache.org/jira/browse/HIVE-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1605:
-

Status: Patch Available (was: Open)
[jira] Updated: (HIVE-471) A UDF for simple reflection
[ https://issues.apache.org/jira/browse/HIVE-471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-471:

Status: Resolved (was: Patch Available)
Hadoop Flags: [Reviewed]
Resolution: Fixed

Committed. Thanks Edward

> A UDF for simple reflection
> ---
>
> Key: HIVE-471
> URL: https://issues.apache.org/jira/browse/HIVE-471
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Affects Versions: 0.6.0
> Reporter: Edward Capriolo
> Assignee: Edward Capriolo
> Priority: Minor
> Fix For: 0.7.0
>
> Attachments: hive-471-gen.diff, HIVE-471.1.patch, HIVE-471.2.patch,
> HIVE-471.3.patch, HIVE-471.4.patch, HIVE-471.5.patch, HIVE-471.6.patch.txt,
> hive-471.diff
>
>
> There are many methods in Java that are static and have no arguments or can
> be invoked with one simple parameter. More complicated functions will require
> a UDF but one generic one can work as a poor man's UDF.
> {noformat}
> SELECT reflect("java.lang.String", "valueOf", 1), reflect("java.lang.String",
> "isEmpty")
> FROM src LIMIT 1;
> {noformat}
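As a rough illustration of the idea in the HIVE-471 description, the following Java sketch calls a static method by class name and method name through `java.lang.reflect`. It is not the UDF committed for HIVE-471: `ReflectSketch` and `invokeStatic` are hypothetical names, and the sketch assumes the simplest case only, a public static method taking a single `int` argument, such as the `String.valueOf(1)` call in the query.

```java
import java.lang.reflect.Method;

/** Hypothetical illustration; not the UDF implementation from HIVE-471. */
final class ReflectSketch {

    /**
     * Looks up a public static method that takes one int parameter and invokes it.
     * Reflection failures are rethrown unchecked to keep call sites short.
     */
    static Object invokeStatic(String className, String methodName, int arg) {
        try {
            Class<?> cls = Class.forName(className);
            Method method = cls.getMethod(methodName, int.class);
            // The receiver is null because the method is assumed to be static.
            return method.invoke(null, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "reflection call failed: " + className + "." + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeStatic("java.lang.String", "valueOf", 1));
        System.out.println(invokeStatic("java.lang.Math", "abs", -5));
    }
}
```

A general-purpose version would also have to resolve overloads from the argument types actually passed and convert between Hive and Java types, which this sketch leaves out.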
[jira] Updated: (HIVE-1536) Add support for JDBC PreparedStatements
[ https://issues.apache.org/jira/browse/HIVE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1536:
-

Status: Resolved (was: Patch Available)
Hadoop Flags: [Reviewed]
Resolution: Fixed

Committed. Thanks Sean!

> Add support for JDBC PreparedStatements
> ---
>
> Key: HIVE-1536
> URL: https://issues.apache.org/jira/browse/HIVE-1536
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Drivers
> Affects Versions: 0.6.0
> Reporter: Sean Flatley
> Assignee: Sean Flatley
> Fix For: 0.7.0
>
> Attachments: all-tests-ant.log, HIVE-1536-2.patch, HIVE-1536-3.patch,
> HIVE-1536-changes-2.txt, HIVE-1536-changes-3.txt, HIVE-1536-changes.txt,
> HIVE-1536.patch, JdbcDriverTest-ant-2.log, JdbcDriverTest-ant.log,
> TestJdbcDriver-ant-3.log
>
>
> As a result of a Sprint which had us using Pentaho Data Integration with the
> Hive database we have updated the driver. Many PreparedStatement methods
> have been implemented. A patch will be attached tomorrow with a summary of
> changes.
> Note: A checkout of Hive/trunk was performed and the TestJdbcDriver test
> case was run. This was done before any modifications were made to the
> checked out project.
> The testResultSetMetaData failed:
> java.sql.SQLException: Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
>     at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:189)
>     at org.apache.hadoop.hive.jdbc.TestJdbcDriver.testResultSetMetaData(TestJdbcDriver.java:530)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at junit.framework.TestCase.runTest(TestCase.java:154)
>     at junit.framework.TestCase.runBare(TestCase.java:127)
>     at junit.framework.TestResult$1.protect(TestResult.java:106)
>     at junit.framework.TestResult.runProtected(TestResult.java:124)
>     at junit.framework.TestResult.run(TestResult.java:109)
>     at junit.framework.TestCase.run(TestCase.java:118)
>     at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     at junit.framework.TestSuite.run(TestSuite.java:203)
>     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
>     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
>     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)
> A co-worker did the same and the tests passed. Both environments were Ubuntu
> and Hadoop version 0.20.2.
> Tests added to the TestJdbcDriver by us were successful.
[jira] Commented: (HIVE-471) A UDF for simple reflection
[ https://issues.apache.org/jira/browse/HIVE-471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903987#action_12903987 ]

Namit Jain commented on HIVE-471:
-

+1

will commit if the tests pass
Running JUnit with eclipse on a fresh checkout of hive trunk fails with NPE
Hi,

I am trying to check out some Hive code here. Ubuntu 64bit, Java 1.6 64bit, Eclipse 3.6.

For the following chain of events:

ma...@maxim-desktop:~/workspace$ svn co http://svn.apache.org/repos/asf/hadoop/hive/trunk hadoop-hive-trunk
ma...@maxim-desktop:~/workspace$ cd hadoop-hive-trunk
ma...@maxim-desktop:~/workspace/hadoop-hive-trunk$ ant tar
ma...@maxim-desktop:~/workspace/hadoop-hive-trunk$ ant eclipse-files
>> Import new project in Eclipse
>> Right click on hadoop-hive-trunk > Run As > JUnit Test

I get the following error:

java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.TestExecDriver.(TestExecDriver.java:102)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at junit.framework.TestSuite.createTest(TestSuite.java:131)
    at junit.framework.TestSuite.addTestMethod(TestSuite.java:114)
    at junit.framework.TestSuite.(TestSuite.java:75)
    at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestLoader.getTest(JUnit3TestLoader.java:102)
    at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestLoader.loadTests(JUnit3TestLoader.java:59)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:452)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
java.lang.ExceptionInInitializerError
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at junit.framework.TestSuite.createTest(TestSuite.java:131)
    at junit.framework.TestSuite.addTestMethod(TestSuite.java:114)
    at junit.framework.TestSuite.(TestSuite.java:75)
    at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestLoader.getTest(JUnit3TestLoader.java:102)
    at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestLoader.loadTests(JUnit3TestLoader.java:59)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:452)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.lang.RuntimeException: Encountered throwable
    at org.apache.hadoop.hive.ql.exec.TestExecDriver.(TestExecDriver.java:127)
    ... 13 more

What am I doing wrong? I would like to be able to run the Hive unit tests to play a bit with the code.

Thank you,
Maxim.