[jira] [Commented] (HIVE-2471) Add timestamp column to the partition stats table.
[ https://issues.apache.org/jira/browse/HIVE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233258#comment-13233258 ]

Hudson commented on HIVE-2471:
------------------------------

Integrated in Hive-trunk-h0.21 #1322 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1322/])
HIVE-2471 Add timestamp column to the partition stats table. (Kevin Wilfong via namit) (Revision 1302739)

Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302739
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsSetupConstants.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java

Add timestamp column to the partition stats table.
--------------------------------------------------

Key: HIVE-2471
URL: https://issues.apache.org/jira/browse/HIVE-2471
Project: Hive
Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Attachments: HIVE-2471.1.patch.txt, HIVE-2471.D2367.1.patch, HIVE-2471.D2367.2.patch, HIVE-2471.D2367.3.patch

Occasionally, when entries are added to the partition stats table, the program is halted (by an exception, keyboard interrupt, etc.) before it can delete those entries. These build up to the point where the table gets very large, which hurts the performance of the frequently called update statement. To fix this, I am adding a column to the table which is auto-populated with the current timestamp. This will allow us to create scripts that periodically go through the table and clean out old entries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
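The periodic cleanup that the new timestamp column makes possible might look roughly like the sketch below. It is only an illustration under stated assumptions: the table name `PARTITION_STATS`, the column name `TS`, and the class `StatsCleanup` are hypothetical placeholders (the real names are defined by the patch), and the retention period is arbitrary.

```java
import java.sql.Timestamp;
import java.time.Duration;
import java.time.Instant;

public class StatsCleanup {
    // Hypothetical names, used for illustration only; the real table and
    // column names come from the HIVE-2471 patch.
    public static final String TABLE = "PARTITION_STATS";
    public static final String TS_COLUMN = "TS";

    /** Builds a parameterized DELETE that removes rows older than a cutoff. */
    public static String buildCleanupSql() {
        return "DELETE FROM " + TABLE + " WHERE " + TS_COLUMN + " < ?";
    }

    /** Computes the cutoff: rows stamped before (now - maxAge) count as stale. */
    public static Timestamp cutoff(Instant now, Duration maxAge) {
        return Timestamp.from(now.minus(maxAge));
    }

    public static void main(String[] args) {
        // A real script would bind the cutoff to the "?" parameter of a
        // java.sql.PreparedStatement and call executeUpdate().
        Timestamp staleBefore = cutoff(Instant.now(), Duration.ofDays(7));
        System.out.println(buildCleanupSql() + "  (cutoff: " + staleBefore + ")");
    }
}
```

Binding the cutoff as a parameter, rather than concatenating it into the SQL string, keeps the statement independent of the database's timestamp literal syntax.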
[jira] [Created] (HIVE-2881) Remove redundant key comparing in SMBMapJoinOperator
Remove redundant key comparing in SMBMapJoinOperator Key: HIVE-2881 URL: https://issues.apache.org/jira/browse/HIVE-2881 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-2881.D2379.1.patch Currently, SMBJoin compares keys twice in #findSmallestKey and #joinObject. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2881) Remove redundant key comparing in SMBMapJoinOperator
[ https://issues.apache.org/jira/browse/HIVE-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-2881: -- Attachment: HIVE-2881.D2379.1.patch navis requested code review of HIVE-2881 [jira] Remove redundant key comparing in SMBMapJoinOperator. Reviewers: JIRA DPAL-988 Remove redundant key comparing in SMBMapJoinOperator Currently, SMBJoin compares keys twice in #findSmallestKey and #joinObject. TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D2379 AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java MANAGE HERALD DIFFERENTIAL RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/5331/ Tip: use the X-Herald-Rules header to filter Herald messages in your client. Remove redundant key comparing in SMBMapJoinOperator Key: HIVE-2881 URL: https://issues.apache.org/jira/browse/HIVE-2881 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-2881.D2379.1.patch Currently, SMBJoin compares keys twice in #findSmallestKey and #joinObject. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-2084: --- Attachment: HIVE-2084.2.patch.txt Updated patch, and changed to Datanucleus v3.0.8. Does anyone still have any failing tests with this upgrade? Upgrade datanucleus from 2.0.3 to 3.0.1 --- Key: HIVE-2084 URL: https://issues.apache.org/jira/browse/HIVE-2084 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Ning Zhang Assignee: Carl Steinbach Labels: datanucleus Attachments: HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, HIVE-2084.patch It seems the datanucleus 2.2.3 does a better join in caching. The time it takes to get the same set of partition objects takes about 1/4 of the time it took for the first time. While with 2.0.3, it took almost the same amount of time in the second execution. We should retest the test case mentioned in HIVE-1853, HIVE-1862. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-2882) Problem with Hive using JDBC
Problem with Hive using JDBC
----------------------------

Key: HIVE-2882
URL: https://issues.apache.org/jira/browse/HIVE-2882
Project: Hive
Issue Type: Bug
Components: JDBC
Affects Versions: 0.7.1
Environment: Operating System - Ubuntu 11.10; Software - Hadoop-0.20.2, Hive-0.7.1
Reporter: Bhavesh Shah
Priority: Critical

I am trying to implement a task in Hive similar to a stored procedure in SQL (a block of queries). In SQL, when we write a cursor, we first execute a select query and then perform some actions while fetching the records. Likewise, I have fired a select query in Hive as:

    String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";
    Class.forName(driverName);
    Connection con = DriverManager.getConnection("jdbc:hive://localhost:1/default", "", "");
    String sql = null;
    Statement stmt = con.createStatement();
    Statement stmt1 = con.createStatement();
    ResultSet res = null;
    ResultSet rs1 = null;

    sql = "select a,c,b from tbl_name";
    res = stmt.executeQuery(sql);        // contains 30 records
    while (res.next()) {
        sql = "select d,e,f from t1";
        rs1 = stmt1.executeQuery(sql);
        // likewise, many more queries follow
    }

The problem is that the while loop executes only once instead of 30 times when the inner query (inside the while loop) gets executed. If I create two different connections for the two queries, then everything works fine:

    String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";
    Class.forName(driverName);
    Connection con = DriverManager.getConnection("jdbc:hive://localhost:1/default", "", "");
    Connection con1 = DriverManager.getConnection("jdbc:hive://localhost:1/default", "", "");
    String sql = null;
    Statement stmt = con.createStatement();
    Statement stmt1 = con1.createStatement();
    ResultSet res = null;
    ResultSet rs1 = null;

To sum up: when I iterate through a result set, do I need to use a different connection (and statement object) to execute other queries?
[jira] [Created] (HIVE-2883) Metastore client doesnt close connection properly
Metastore client doesnt close connection properly
-------------------------------------------------

Key: HIVE-2883
URL: https://issues.apache.org/jira/browse/HIVE-2883
Project: Hive
Issue Type: Bug
Reporter: Ashutosh Chauhan

While closing the connection, it always fails with the following trace. Seemingly, it doesn't have any harmful effects.

{code}
12/03/20 10:55:02 ERROR hive.metastore: Unable to shutdown local metastore client
org.apache.thrift.transport.TTransportException: Cannot write to null outputStream
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:163)
at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:91)
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
at com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:421)
at com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:415)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:310)
{code}
[jira] [Created] (HIVE-2884) When distinct is used in from of STDDEV, the statement fails with a Java exception error
When distinct is used in from of STDDEV, the statement fails with a Java exception error -- Key: HIVE-2884 URL: https://issues.apache.org/jira/browse/HIVE-2884 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.7.1 Reporter: Mauro Cazzari Given the following statement: select distinct STDDEV_SAMP(TXT_1.`age`) as AgeAlias, STDDEV_SAMP(TXT_1.`weight`) as WeightAlias from `CLASS` TXT_1; Hive generates a Java SQL exception error upon execution. If the distinct is removed, the statement runs fine. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-2885) select distinct string from table returns wrong result
select distinct string from table returns wrong result Key: HIVE-2885 URL: https://issues.apache.org/jira/browse/HIVE-2885 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.8.1 Reporter: Mauro Cazzari Give the following table: CREATE TABLE `MYTAB` (`a` DOUBLE) containing the values 1, 2, 3, and 4, the following SQL fails to produce the correct result: select distinct 'FOO' from `MYTAB` Note that this issue didn't show up with Hive 7. Only 8.1 seems to be affected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-2886) distinct with order by fails with Java SQL exception.
distinct with order by fails with Java SQL exception. - Key: HIVE-2886 URL: https://issues.apache.org/jira/browse/HIVE-2886 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.7.1 Reporter: Mauro Cazzari The following select: select distinct TXT_1.`a`, TXT_1.`b` from `MYTAB` TXT_1 order by TXT_1.`a` asc fails with a Java SQL exception. Note that if the distinct or the table alias is removed from the SQL, the statement executes fine. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2885) select distinct string from table returns wrong result
[ https://issues.apache.org/jira/browse/HIVE-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233663#comment-13233663 ] Mauro Cazzari commented on HIVE-2885: - I just verified that the problem seems to have gone away with the latest version of Hive. If anyone knows which Hive issue fixed this, feel free to mark it as a DUP. Thanks! select distinct string from table returns wrong result Key: HIVE-2885 URL: https://issues.apache.org/jira/browse/HIVE-2885 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.8.1 Reporter: Mauro Cazzari Give the following table: CREATE TABLE `MYTAB` (`a` DOUBLE) containing the values 1, 2, 3, and 4, the following SQL fails to produce the correct result: select distinct 'FOO' from `MYTAB` Note that this issue didn't show up with Hive 7. Only 8.1 seems to be affected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reassigned HIVE-2084: Assignee: Sushanth Sowmyan (was: Carl Steinbach) @Sushanth: Can you please submit a review request on phabricator? Thanks. https://cwiki.apache.org/Hive/phabricatorcodereview.html Upgrade datanucleus from 2.0.3 to 3.0.1 --- Key: HIVE-2084 URL: https://issues.apache.org/jira/browse/HIVE-2084 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Ning Zhang Assignee: Sushanth Sowmyan Labels: datanucleus Attachments: HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, HIVE-2084.patch It seems the datanucleus 2.2.3 does a better join in caching. The time it takes to get the same set of partition objects takes about 1/4 of the time it took for the first time. While with 2.0.3, it took almost the same amount of time in the second execution. We should retest the test case mentioned in HIVE-1853, HIVE-1862. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
Hive-trunk-h0.21 - Build # 1323 - Still Failing
Changes for Build #1322
[namit] HIVE-2471 Add timestamp column to the partition stats table. (Kevin Wilfong via namit)

Changes for Build #1323

1 tests failed.
REGRESSION: org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore

Error Message:
null

Stack Trace:
java.lang.NullPointerException
at org.apache.hadoop.hive.metastore.HiveMetaStore.getDelegationToken(HiveMetaStore.java:2749)
at org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.getDelegationTokenStr(TestHadoop20SAuthBridge.java:296)
at org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.obtainTokenAndAddIntoUGI(TestHadoop20SAuthBridge.java:303)
at org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore(TestHadoop20SAuthBridge.java:212)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1323)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1323/ to view the results.
[jira] [Commented] (HIVE-942) use bucketing for group by
[ https://issues.apache.org/jira/browse/HIVE-942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233789#comment-13233789 ]

Kevin Wilfong commented on HIVE-942:
------------------------------------

If the table is bucketed and sorted on the group by keys, the group by operator would not need to do a hash, which would help with memory consumption.

use bucketing for group by
--------------------------

Key: HIVE-942
URL: https://issues.apache.org/jira/browse/HIVE-942
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Namit Jain

Group by on a bucketed column can be completely performed on the mapper if the split can be adjusted to span the key boundary.
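To illustrate why sorted input removes the need for a hash: when all rows with the same key arrive contiguously, a group can be emitted as soon as the key changes, so only one running aggregate is held in memory at a time. The sketch below is a generic streaming count, not Hive's group by operator; the class and method names are hypothetical, and keys are assumed to be non-null.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class SortedGroupBy {
    /**
     * Counts rows per key for input that is already sorted on the key.
     * Only the current key and its running count are kept, instead of a
     * hash table holding every distinct key.
     */
    public static List<Map.Entry<String, Long>> countSorted(List<String> sortedKeys) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        String current = null;
        long count = 0;
        for (String key : sortedKeys) {
            if (count > 0 && !key.equals(current)) {
                // Key changed: the previous group is complete, so emit it.
                out.add(new SimpleEntry<>(current, count));
                count = 0;
            }
            current = key;
            count++;
        }
        if (count > 0) {
            out.add(new SimpleEntry<>(current, count)); // emit the final group
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(countSorted(Arrays.asList("a", "a", "b", "c", "c", "c")));
    }
}
```

The trade-off is that this only produces correct results when the sort order really matches the grouping keys; on unsorted input the same key would be emitted more than once.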
[jira] [Resolved] (HIVE-2262) mapjoin followed by union all, groupby does not work
[ https://issues.apache.org/jira/browse/HIVE-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan resolved HIVE-2262.
-----------------------------------

Resolution: Fixed

This is no longer reproducible on trunk. Feel free to reopen if there is some other variant which can produce this.

mapjoin followed by union all, groupby does not work
----------------------------------------------------

Key: HIVE-2262
URL: https://issues.apache.org/jira/browse/HIVE-2262
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.7.1
Reporter: yu xiang
Priority: Trivial

sql:

    CREATE TABLE nulltest2(int_data1 INT, int_data2 INT, boolean_data BOOLEAN, double_data DOUBLE, string_data STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
    CREATE TABLE nulltest3(int_data1 INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

    explain select int_data2, count(1) from (
      select /*+mapjoin(a)*/ int_data2, 1 as c1, 0 as c2 from nulltest2 a join nulltest3 b on (a.int_data1 = b.int_data1)
      union all
      select /*+mapjoin(a)*/ int_data2, 1 as c1, 2 as c2 from nulltest2 a join nulltest3 b on (a.int_data1 = b.int_data1)
    ) mapjointable group by int_data2;

exception:

    FAILED: Hive Internal Error: java.lang.NullPointerException(null)
    java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:156)
    at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan(GenMapRedUtils.java:551)
    at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan(GenMapRedUtils.java:514)
    at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.initPlan(GenMapRedUtils.java:125)
    at org.apache.hadoop.hive.ql.optimizer.GenMRRedSink1.process(GenMRRedSink1.java:76)
    at org.apache.hadoop.hive.ql.optimizer.GenMRRedSink3.process(GenMRRedSink3.java:64)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)

Analysis of the cause:
1. When mapjoin, union, and group by are used together, UnionProcFactory.MapJoinUnion() (optimizer) sets MapJoinSubq to true and sets up the UnionParseContext.
2. In GenMRUnion1, Hive calls mergeMapJoinUnion and also sets the task plan.
3. In GenMRRedSink3, Hive checks uCtx.isMapOnlySubq() and calls GenMRRedSink1().process() to initialize the plan. But the utask's plan has already been set; only the reducer still needs to be set. Also, the utask is processing a temporary table, so there is no topOp mapped to a table. That is where the null pointer exception comes from.

Solutions:
1. SQL solution: use a subquery to modify the SQL.
2. Code solution: in mergeMapJoinUnion, after the task plan has been set, set a settaskplan flag to true to indicate that the plan for this utask has been set. In GenMRRedSink3, if this flag is true, don't use GenMRRedSink1().process() to reinitialize the plan:

    if (uCtx.isMapOnlySubq() && !upc.isIssetTaskPlan())

I don't know whether the code solution is suitable. Is there any better solution? Thanks.
[jira] [Commented] (HIVE-2870) Throw an error when a nonexistent partition is accessed in strict mode
[ https://issues.apache.org/jira/browse/HIVE-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233825#comment-13233825 ] Phabricator commented on HIVE-2870: --- kevinwilfong has commented on the revision HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode. Could you add a test case for this, e.g. look at the files in ql/src/test/queries/clientnegative REVISION DETAIL https://reviews.facebook.net/D2319 Throw an error when a nonexistent partition is accessed in strict mode -- Key: HIVE-2870 URL: https://issues.apache.org/jira/browse/HIVE-2870 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Lucian Adrian Grijincu Priority: Minor Attachments: HIVE-2870.D2319.1.patch, HIVE-2870.D2319.2.patch Original Estimate: 24h Remaining Estimate: 24h When a table does not exist and someone tries to read from it in a query, Hive throws an error. But if a partition is directly accessed that does not exist, an error is not thrown. This is inconsistent and also leads to a lot of confused users who get no output. This task is to cause Hive to throw an error when the partition pruner for a query eliminates ALL existing partitions for some table when running in strict mode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2870) Throw an error when a nonexistent partition is accessed in strict mode
[ https://issues.apache.org/jira/browse/HIVE-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233844#comment-13233844 ] Phabricator commented on HIVE-2870: --- kevinwilfong has commented on the revision HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode. By a test case, I meant add a query to that list to test that the query actually does fail if there are no partitions. Also, it might be a good idea to add a test case in client positive to make sure you can turn it off. See https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-AddaUnitTest INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java:246-252 Sorry to be picky, but could you also add a small comment here just giving a quick explanation of what it does. REVISION DETAIL https://reviews.facebook.net/D2319 Throw an error when a nonexistent partition is accessed in strict mode -- Key: HIVE-2870 URL: https://issues.apache.org/jira/browse/HIVE-2870 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Lucian Adrian Grijincu Priority: Minor Attachments: HIVE-2870.D2319.1.patch, HIVE-2870.D2319.2.patch Original Estimate: 24h Remaining Estimate: 24h When a table does not exist and someone tries to read from it in a query, Hive throws an error. But if a partition is directly accessed that does not exist, an error is not thrown. This is inconsistent and also leads to a lot of confused users who get no output. This task is to cause Hive to throw an error when the partition pruner for a query eliminates ALL existing partitions for some table when running in strict mode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
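The behaviour requested in this issue can be pictured with the following minimal sketch. It is not the code under review in D2319: the class name, method name, exception type, and message are all hypothetical, and partitions are represented as plain strings purely for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class StrictPartitionCheck {
    /**
     * Hypothetical check run after partition pruning: in strict mode, fail
     * if the pruner left no partitions for the table, instead of silently
     * returning an empty result.
     */
    public static void checkPruned(String table, List<String> remainingPartitions, boolean strictMode) {
        if (strictMode && remainingPartitions.isEmpty()) {
            throw new IllegalStateException(
                "No partitions remain for table " + table + " after pruning");
        }
    }

    public static void main(String[] args) {
        // Non-strict mode: an empty result is allowed.
        checkPruned("t", Collections.<String>emptyList(), false);
        // Strict mode with a surviving partition: also fine.
        checkPruned("t", Arrays.asList("ds=2012-03-20"), true);
        System.out.println("checks passed");
    }
}
```

This mirrors the two test cases suggested in the review: a negative test where the query must fail in strict mode, and a positive test showing the check can be turned off.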
[jira] [Commented] (HIVE-2877) TABLESAMPLE(x PERCENT) tests fail on 0.22/0.23
[ https://issues.apache.org/jira/browse/HIVE-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233888#comment-13233888 ]

Carl Steinbach commented on HIVE-2877:
--------------------------------------

There are two distinct problems:

1) Many of the queries in split_sample.q and sample_islocalmode_hook.q are nondeterministic. This can be fixed by adding ORDER BY clauses.

2) The second problem is more serious. Both of the tests set mapred.max.split.size=300 and hive.merge.smallfiles.avgsize=1 in an effort to force the generation of multiple splits and multiple output files. However, Hadoop 0.20 is incapable of generating splits smaller than the block size when using CombineFileInputFormat, so only one split is generated. This has a significant impact on the results of the TABLESAMPLE(x PERCENT). This issue was fixed in MAPREDUCE-2046, which is included in 0.23.

Suggested Fixes:
# Make the queries deterministic
# Restrict these tests to Hadoop versions >= 0.22

TABLESAMPLE(x PERCENT) tests fail on 0.22/0.23
----------------------------------------------

Key: HIVE-2877
URL: https://issues.apache.org/jira/browse/HIVE-2877
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Carl Steinbach
Assignee: Carl Steinbach
[jira] [Updated] (HIVE-2797) Make the IP address of a Thrift client available to HMSHandler.
[ https://issues.apache.org/jira/browse/HIVE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-2797:
------------------------------

Attachment: HIVE-2797.D1701.5.patch

kevinwilfong updated the revision HIVE-2797 [jira] Make the IP address of a Thrift client available to HMSHandler.
Reviewers: JIRA, njain, ashutoshc

Really sorry about that Ashutosh. The TUGIContainingTransport does not extend TSocket, which caused the errors you saw. I added a getSocket method to the class which returns the Socket object if the underlying TTransport is an instance of TSocket, otherwise null. TUGIBasedProcessor's implementation of setIpAddress now uses this method and handles the case of null.

REVISION DETAIL
https://reviews.facebook.net/D1701

AFFECTED FILES
shims/src/common/java/org/apache/hadoop/hive/thrift/TUGIContainingTransport.java
metastore/src/test/org/apache/hadoop/hive/metastore/TestRemoteUGIHiveMetaStoreIpAddress.java
metastore/src/test/org/apache/hadoop/hive/metastore/IpAddressListener.java
metastore/src/test/org/apache/hadoop/hive/metastore/TestRemoteHiveMetaStoreIpAddress.java
metastore/src/java/org/apache/hadoop/hive/metastore/TUGIBasedProcessor.java
metastore/src/java/org/apache/hadoop/hive/metastore/TSetIpAddressProcessor.java
metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java

Make the IP address of a Thrift client available to HMSHandler.
---------------------------------------------------------------

Key: HIVE-2797
URL: https://issues.apache.org/jira/browse/HIVE-2797
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Attachments: HIVE-2797.D1701.1.patch, HIVE-2797.D1701.2.patch, HIVE-2797.D1701.3.patch, HIVE-2797.D1701.4.patch, HIVE-2797.D1701.5.patch

Currently, in unsecured mode, metastore Thrift calls are, from the HMSHandler's point of view, anonymous. If we expose the IP address of the Thrift client to the HMSHandler from the Processor, this will help to give some context, in particular for audit logging, of where the call is coming from.
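The getSocket approach described in this update can be sketched as follows. This is not the patch: to stay self-contained it uses hypothetical stand-in types (`Transport`, `SocketTransport`, `WrappingTransport`) in place of Thrift's TTransport, TSocket, and Hive's TUGIContainingTransport, and `ipAddressOf` is an invented helper that only shows the null handling.

```java
import java.net.Socket;

public class GetSocketSketch {
    /** Stand-in for a generic transport (plays the role of TTransport). */
    public interface Transport {}

    /** Stand-in for a socket-backed transport (plays the role of TSocket). */
    public static class SocketTransport implements Transport {
        private final Socket socket;
        public SocketTransport(Socket socket) { this.socket = socket; }
        public Socket getSocket() { return socket; }
    }

    /** Stand-in for a transport that wraps another one without extending it. */
    public static class WrappingTransport implements Transport {
        private final Transport wrapped;
        public WrappingTransport(Transport wrapped) { this.wrapped = wrapped; }

        /** Returns the underlying socket if there is one, otherwise null. */
        public Socket getSocket() {
            if (wrapped instanceof SocketTransport) {
                return ((SocketTransport) wrapped).getSocket();
            }
            return null;
        }
    }

    /** The caller must handle null: no socket, or a socket that is not connected. */
    public static String ipAddressOf(WrappingTransport transport) {
        Socket socket = transport.getSocket();
        if (socket == null || socket.getInetAddress() == null) {
            return null;
        }
        return socket.getInetAddress().getHostAddress();
    }

    public static void main(String[] args) {
        WrappingTransport noSocket = new WrappingTransport(new Transport() {});
        System.out.println(ipAddressOf(noSocket)); // no underlying socket, so null
    }
}
```

Returning null rather than throwing keeps the wrapper usable over non-socket transports, at the cost of requiring every caller to check for null before using the address.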
[jira] [Updated] (HIVE-2797) Make the IP address of a Thrift client available to HMSHandler.
[ https://issues.apache.org/jira/browse/HIVE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-2797: Status: Patch Available (was: Open) Make the IP address of a Thrift client available to HMSHandler. --- Key: HIVE-2797 URL: https://issues.apache.org/jira/browse/HIVE-2797 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2797.D1701.1.patch, HIVE-2797.D1701.2.patch, HIVE-2797.D1701.3.patch, HIVE-2797.D1701.4.patch, HIVE-2797.D1701.5.patch Currently, in unsecured mode, metastore Thrift calls are, from the HMSHandler's point of view, anonymous. If we expose the IP address of the Thrift client to the HMSHandler from the Processor, this will help to give some context, in particular for audit logging, of where the call is coming from. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2797) Make the IP address of a Thrift client available to HMSHandler.
[ https://issues.apache.org/jira/browse/HIVE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233897#comment-13233897 ] Phabricator commented on HIVE-2797: --- kevinwilfong has commented on the revision HIVE-2797 [jira] Make the IP address of a Thrift client available to HMSHandler.. I added a test case which verifies that the IP address is accessible when the setugi config variable is true. In addition I ran the entire test suite, TestHiveServerSessions timed out, but that seems to be an unrelated issue as it does the same on a fresh checkout. REVISION DETAIL https://reviews.facebook.net/D1701 Make the IP address of a Thrift client available to HMSHandler. --- Key: HIVE-2797 URL: https://issues.apache.org/jira/browse/HIVE-2797 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2797.D1701.1.patch, HIVE-2797.D1701.2.patch, HIVE-2797.D1701.3.patch, HIVE-2797.D1701.4.patch, HIVE-2797.D1701.5.patch Currently, in unsecured mode, metastore Thrift calls are, from the HMSHandler's point of view, anonymous. If we expose the IP address of the Thrift client to the HMSHandler from the Processor, this will help to give some context, in particular for audit logging, of where the call is coming from. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-2084:
Attachment: HIVE-2084.D2397.1.patch

khorgath requested code review of "HIVE-2084 [jira] Upgrade datanucleus from 2.0.3 to 3.0.1".
Reviewers: JIRA

Updated HIVE-2084 to work off DataNucleus release 3.0.8.

TEST PLAN
existing tests (this is a library dep upgrade)

REVISION DETAIL
https://reviews.facebook.net/D2397

AFFECTED FILES
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
conf/hive-default.xml.template
ivy/libraries.properties
metastore/ivy.xml
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java

Upgrade datanucleus from 2.0.3 to 3.0.1
Key: HIVE-2084
URL: https://issues.apache.org/jira/browse/HIVE-2084
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Ning Zhang
Assignee: Sushanth Sowmyan
Labels: datanucleus
Attachments: HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, HIVE-2084.D2397.1.patch, HIVE-2084.patch

It seems that datanucleus 2.2.3 does a better job of caching: fetching the same set of partition objects a second time takes about 1/4 of the time of the first fetch, whereas with 2.0.3 the second execution took almost as long as the first. We should retest the test cases mentioned in HIVE-1853 and HIVE-1862.
[jira] [Commented] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233928#comment-13233928 ]

Sushanth Sowmyan commented on HIVE-2084:

Updated: https://reviews.facebook.net/D2397
(Also, please ignore the patch file I attached here earlier; I generated it from the hcatalog root dir, so it contains extra directory structure.)

Upgrade datanucleus from 2.0.3 to 3.0.1
Key: HIVE-2084
URL: https://issues.apache.org/jira/browse/HIVE-2084
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Ning Zhang
Assignee: Sushanth Sowmyan
Labels: datanucleus
Attachments: HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, HIVE-2084.D2397.1.patch, HIVE-2084.patch

It seems that datanucleus 2.2.3 does a better job of caching: fetching the same set of partition objects a second time takes about 1/4 of the time of the first fetch, whereas with 2.0.3 the second execution took almost as long as the first. We should retest the test cases mentioned in HIVE-1853 and HIVE-1862.
[jira] [Updated] (HIVE-2881) Remove redundant key comparing in SMBMapJoinOperator
[ https://issues.apache.org/jira/browse/HIVE-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-2881:
Status: Patch Available (was: Open)

Passed all tests.

Remove redundant key comparing in SMBMapJoinOperator
Key: HIVE-2881
URL: https://issues.apache.org/jira/browse/HIVE-2881
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
Attachments: HIVE-2881.D2379.1.patch

Currently, SMBJoin compares keys twice in #findSmallestKey and #joinObject.
[jira] [Commented] (HIVE-2797) Make the IP address of a Thrift client available to HMSHandler.
[ https://issues.apache.org/jira/browse/HIVE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233950#comment-13233950 ]

Phabricator commented on HIVE-2797:

ashutoshc has accepted the revision "HIVE-2797 [jira] Make the IP address of a Thrift client available to HMSHandler."

No worries, Kevin. Thanks for making the changes. +1
Feel free to commit if tests pass.

REVISION DETAIL
https://reviews.facebook.net/D1701

BRANCH
svn

Make the IP address of a Thrift client available to HMSHandler.
Key: HIVE-2797
URL: https://issues.apache.org/jira/browse/HIVE-2797
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Attachments: HIVE-2797.D1701.1.patch, HIVE-2797.D1701.2.patch, HIVE-2797.D1701.3.patch, HIVE-2797.D1701.4.patch, HIVE-2797.D1701.5.patch

Currently, in unsecured mode, metastore Thrift calls are anonymous from the HMSHandler's point of view. Exposing the IP address of the Thrift client to the HMSHandler from the Processor would give some context about where a call is coming from, in particular for audit logging.
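The idea in the HIVE-2797 description, a processor handing the client's address to the handler so that audit lines can include it, might be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a thread-local as the hand-off, which the issue text does not confirm, and every name in it (ClientIpSketch, Handler, setIpAddress, process, getTable) is hypothetical rather than taken from the Hive patch or the Thrift API.

```java
import java.util.function.Supplier;

// Illustrative sketch only. All names here are hypothetical stand-ins,
// not Hive's or Thrift's actual classes or methods.
public class ClientIpSketch {

    /** Stand-in for the handler: reads the caller's address from a thread-local. */
    static final class Handler {
        private static final ThreadLocal<String> CLIENT_IP = new ThreadLocal<>();

        static void setIpAddress(String ip) { CLIENT_IP.set(ip); }

        static void clearIpAddress() { CLIENT_IP.remove(); }

        /** Returns the audit line that a real handler would write to its log. */
        String getTable(String name) {
            String ip = CLIENT_IP.get();
            return "ip=" + (ip == null ? "unknown" : ip) + " cmd=get_table:" + name;
        }
    }

    /** Stand-in for the processor: records the peer address, then delegates the call. */
    static <T> T process(String peerAddress, Supplier<T> call) {
        Handler.setIpAddress(peerAddress);
        try {
            return call.get();
        } finally {
            // Clear the value so a pooled worker thread does not leak it into the next call.
            Handler.clearIpAddress();
        }
    }

    public static void main(String[] args) {
        Handler handler = new Handler();
        // Call made through the processor: the audit line carries the peer address.
        String viaProcessor = process("10.0.0.7", () -> handler.getTable("page_views"));
        // Call made outside any processor: no address is available.
        String direct = handler.getTable("page_views");
        System.out.println(viaProcessor);
        System.out.println(direct);
    }
}
```

A thread-local suits servers that handle each call on a single worker thread; a server that hands requests across threads would need to pass the address explicitly instead.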
[jira] [Commented] (HIVE-1555) JDBC Storage Handler
[ https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233997#comment-13233997 ]

Weihua Jiang commented on HIVE-1555:

Hi Andrew,
How is the integration progressing? Where can I find your patch? I am very interested in this feature, and I think I can help with your work.

JDBC Storage Handler
Key: HIVE-1555
URL: https://issues.apache.org/jira/browse/HIVE-1555
Project: Hive
Issue Type: New Feature
Components: JDBC
Reporter: Bob Robertson
Assignee: Andrew Wilson
Attachments: JDBCStorageHandler Design Doc.pdf
Original Estimate: 24h
Remaining Estimate: 24h

With the Cassandra and HBase Storage Handlers available, I thought it would make sense to include a generic JDBC RDBMS Storage Handler so that you could import a standard DB table into Hive. Many people must want to perform HiveQL joins, etc. against tables in other systems.