Questions about hive authorization under hdfs permissions.
Hi all, I have enabled Hive authorization in my testing cluster. I used the user hive to create the database hivedb and granted the create privilege on hivedb to the user root. But I have run into the following problem: root cannot create a table in hivedb even though it has the create privilege.

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=root, access=WRITE, inode=/tmp/user/hive/warehouse/hivedb.db:hive:hadoop:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:234)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:214)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:158)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5499)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5481)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5455)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3455)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3425)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3397)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:724)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:502)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:48089)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

Clearly the hivedb.db directory in HDFS is not writable by other users. So how does Hive authorization work on top of HDFS permissions?

PS: if I create a table as user hive and grant the update privilege to user root, the same error occurs when root loads data into the table.

Looking forward to your reply! Thanks
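The denial in the trace above is a plain HDFS permission check, independent of Hive's grant model: root is neither the owner (hive) nor in the group (hadoop) of hivedb.db, so the "other" bits of drwxr-xr-x apply, and they do not include write. A minimal sketch of that check (a hypothetical helper for illustration, not HDFS code):

```java
// Hypothetical sketch: evaluates a POSIX-style listing mode such as
// "drwxr-xr-x" the way HDFS picks the owner/group/other triple.
public class PermissionCheck {
    // mode is the 10-char listing form, e.g. "drwxr-xr-x"
    public static boolean canWrite(String user, String userGroup,
                                   String owner, String group, String mode) {
        int offset;
        if (user.equals(owner)) {
            offset = 1;            // owner triple: chars 1-3
        } else if (userGroup.equals(group)) {
            offset = 4;            // group triple: chars 4-6
        } else {
            offset = 7;            // other triple: chars 7-9
        }
        return mode.charAt(offset + 1) == 'w';  // 'w' sits in the middle of each triple
    }

    public static void main(String[] args) {
        // user=root, not in group hadoop, dir owned hive:hadoop, mode drwxr-xr-x
        System.out.println(canWrite("root", "root", "hive", "hadoop", "drwxr-xr-x"));  // false
        System.out.println(canWrite("hive", "hadoop", "hive", "hadoop", "drwxr-xr-x")); // true
    }
}
```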
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031498#comment-14031498 ] Hive QA commented on HIVE-7105: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12650150/HIVE-7105.2.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 5611 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dml org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/460/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/460/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-460/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12650150 Enable ReduceRecordProcessor to generate VectorizedRowBatches - Key: HIVE-7105 URL: https://issues.apache.org/jira/browse/HIVE-7105 Project: Hive Issue Type: Bug Components: Tez, Vectorization Reporter: Rajesh Balamohan Assignee: Gopal V Fix For: 0.14.0 Attachments: HIVE-7105.1.patch, HIVE-7105.2.patch Currently, ReduceRecordProcessor sends one key,value pair at a time to its operator pipeline. It would be beneficial to send VectorizedRowBatch to downstream operators. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7232) ReduceSink is emitting NULL keys due to failed keyEval
Gopal V created HIVE-7232: - Summary: ReduceSink is emitting NULL keys due to failed keyEval Key: HIVE-7232 URL: https://issues.apache.org/jira/browse/HIVE-7232 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Reporter: Gopal V

After HIVE-4867 has been merged in, some queries have exhibited a very weird skew towards NULL keys emitted from the ReduceSinkOperator. Added extra logging to print expr.column() in ExprNodeColumnEvaluator in reduce sink.

{code}
2014-06-14 00:37:19,186 INFO [TezChild] org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: numDistributionKeys = 1 {null -- ExprNodeColumnEvaluator(_col10)} key_row={"reducesinkkey0":442}
{code}

{code}
HiveKey firstKey = toHiveKey(cachedKeys[0], tag, null);
int distKeyLength = firstKey.getDistKeyLength();
if (distKeyLength <= 1) {
  StringBuffer x1 = new StringBuffer();
  x1.append("numDistributionKeys = " + numDistributionKeys + "\n");
  for (int i = 0; i < numDistributionKeys; i++) {
    x1.append(cachedKeys[0][i] + " -- " + keyEval[i] + "\n");
  }
  x1.append("key_row=" + SerDeUtils.getJSONString(row, keyObjectInspector));
  LOG.info("GOPAL: " + x1.toString());
}
{code}

The query is TPC-H query 5, with extra NULL checks just to be sure.
{code}
SELECT n_name,
       sum(l_extendedprice * (1 - l_discount)) AS revenue
FROM customer, orders, lineitem, supplier, nation, region
WHERE c_custkey = o_custkey
  AND l_orderkey = o_orderkey
  AND l_suppkey = s_suppkey
  AND c_nationkey = s_nationkey
  AND s_nationkey = n_nationkey
  AND n_regionkey = r_regionkey
  AND r_name = 'ASIA'
  AND o_orderdate >= '1994-01-01'
  AND o_orderdate < '1995-01-01'
  AND l_orderkey IS NOT NULL
  AND c_custkey IS NOT NULL
  AND l_suppkey IS NOT NULL
  AND c_nationkey IS NOT NULL
  AND s_nationkey IS NOT NULL
  AND n_regionkey IS NOT NULL
GROUP BY n_name
ORDER BY revenue DESC;
{code}

The reducer which has the issue has the following plan:

{code}
Reducer 3
  Reduce Operator Tree:
    Join Operator
      condition map:
        Inner Join 0 to 1
      condition expressions:
        0 {KEY.reducesinkkey0} {VALUE._col2}
        1 {VALUE._col0} {KEY.reducesinkkey0} {VALUE._col3}
      outputColumnNames: _col0, _col3, _col10, _col11, _col14
      Statistics: Num rows: 18344 Data size: 95229140992 Basic stats: COMPLETE Column stats: NONE
      Reduce Output Operator
        key expressions: _col10 (type: int)
        sort order: +
        Map-reduce partition columns: _col10 (type: int)
        Statistics: Num rows: 18344 Data size: 95229140992 Basic stats: COMPLETE Column stats: NONE
        value expressions: _col0 (type: int), _col3 (type: int), _col11 (type: int), _col14 (type: string)
{code}

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6928) Beeline should not chop off describe extended results by default
[ https://issues.apache.org/jira/browse/HIVE-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031508#comment-14031508 ] Szehon Ho commented on HIVE-6928: - The concern does make sense. I only have mysql now, and saw that it wraps long lines by default. I don't have that much DB-CLI experience, but Beeline is the only one I've seen that by default sets each row's max width to the console width at startup time. Then it keeps truncating to that size even if you resize the window; it's not a great experience. Changing the default (table format) to wrap long lines like mysql does sounds better to me. Thoughts? Beeline should not chop off describe extended results by default -- Key: HIVE-6928 URL: https://issues.apache.org/jira/browse/HIVE-6928 Project: Hive Issue Type: Bug Components: CLI Reporter: Szehon Ho Assignee: Chinna Rao Lalam Attachments: HIVE-6928.1.patch, HIVE-6928.2.patch, HIVE-6928.patch By default, beeline truncates long results based on the console width like:
{code}
+-----------------------------+--------------------------------------------------------------------------------------------------------+
| col_name                    |                                                                                                        |
+-----------------------------+--------------------------------------------------------------------------------------------------------+
| pat_id                      | string                                                                                                 |
| score                       | float                                                                                                  |
| acutes                      | float                                                                                                  |
|                             |                                                                                                        |
| Detailed Table Information  | Table(tableName:refills, dbName:default, owner:hdadmin, createTime:1393882396, lastAccessTime:0, retention:0, sd:Sto |
+-----------------------------+--------------------------------------------------------------------------------------------------------+
5 rows selected (0.4 seconds)
{code}
This can be changed by !outputformat, but the default should behave better to give a better experience to the first-time beeline user. -- This message was sent by Atlassian JIRA (v6.2#6252)
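A toy illustration (not Beeline's actual code) of the two behaviors being weighed above: truncating a long cell to the console width silently drops data, while wrapping it keeps everything across extra lines.

```java
// Sketch of the two rendering strategies for an over-wide table cell.
import java.util.ArrayList;
import java.util.List;

public class CellRender {
    // Truncate: chop the cell at the console width (information is lost).
    public static String truncate(String cell, int width) {
        return cell.length() <= width ? cell : cell.substring(0, width);
    }

    // Wrap: split the cell into width-sized lines (nothing is lost).
    public static List<String> wrap(String cell, int width) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < cell.length(); i += width) {
            lines.add(cell.substring(i, Math.min(cell.length(), i + width)));
        }
        return lines;
    }

    public static void main(String[] args) {
        String detail = "Table(tableName:refills, dbName:default, owner:hdadmin)";
        System.out.println(truncate(detail, 20));
        System.out.println(wrap(detail, 20));
    }
}
```

The wrap path is what the mysql client's default behavior amounts to; the truncate path is what Beeline does today with the width frozen at startup.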
[jira] [Commented] (HIVE-7208) move SearchArgument interface into serde package
[ https://issues.apache.org/jira/browse/HIVE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031547#comment-14031547 ] Hive QA commented on HIVE-7208: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12650187/HIVE-7208.01.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5536 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_file_format org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_minimr_broken_pipe {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/461/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/461/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-461/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12650187 move SearchArgument interface into serde package Key: HIVE-7208 URL: https://issues.apache.org/jira/browse/HIVE-7208 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-7208.01.patch, HIVE-7208.patch For usage in alternative input formats/serdes, it might be useful to move SearchArgument class to a place that is not in ql (because it's hard to depend on ql). 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7212) Use resource re-localization instead of restarting sessions in Tez
[ https://issues.apache.org/jira/browse/HIVE-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031572#comment-14031572 ] Hive QA commented on HIVE-7212: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12650200/HIVE-7212.3.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 5611 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dml org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testSubmit org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/462/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/462/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-462/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12650200 Use resource re-localization instead of restarting sessions in Tez -- Key: HIVE-7212 URL: https://issues.apache.org/jira/browse/HIVE-7212 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7212.1.patch, HIVE-7212.2.patch, HIVE-7212.3.patch scriptfile1.q is failing on Tez because of a recent breakage in localization. On top of that we're currently restarting sessions if the resources have changed. (add file/add jar/etc). Instead of doing this we should just have tez relocalize these new resources. This way no session/AM restart is required. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6928) Beeline should not chop off describe extended results by default
[ https://issues.apache.org/jira/browse/HIVE-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6928: -- Status: Open (was: Patch Available) Cancel the patch until the above concerns are addressed. Beeline should not chop off describe extended results by default -- Key: HIVE-6928 URL: https://issues.apache.org/jira/browse/HIVE-6928 Project: Hive Issue Type: Bug Components: CLI Reporter: Szehon Ho Assignee: Chinna Rao Lalam Attachments: HIVE-6928.1.patch, HIVE-6928.2.patch, HIVE-6928.patch By default, beeline truncates long results based on the console width like:
{code}
+-----------------------------+--------------------------------------------------------------------------------------------------------+
| col_name                    |                                                                                                        |
+-----------------------------+--------------------------------------------------------------------------------------------------------+
| pat_id                      | string                                                                                                 |
| score                       | float                                                                                                  |
| acutes                      | float                                                                                                  |
|                             |                                                                                                        |
| Detailed Table Information  | Table(tableName:refills, dbName:default, owner:hdadmin, createTime:1393882396, lastAccessTime:0, retention:0, sd:Sto |
+-----------------------------+--------------------------------------------------------------------------------------------------------+
5 rows selected (0.4 seconds)
{code}
This can be changed by !outputformat, but the default should behave better to give a better experience to the first-time beeline user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7100) Users of hive should be able to specify skipTrash when dropping tables.
[ https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031582#comment-14031582 ] Xuefu Zhang commented on HIVE-7100: --- [~ashutoshc] Since you voiced opinion before, do you have any thoughts on the above proposal? Users of hive should be able to specify skipTrash when dropping tables. --- Key: HIVE-7100 URL: https://issues.apache.org/jira/browse/HIVE-7100 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Ravi Prakash Assignee: Jayesh Attachments: HIVE-7100.patch Users of our clusters are often running up against their quota limits because of Hive tables. When they drop tables, they have to then manually delete the files from HDFS using skipTrash. This is cumbersome and unnecessary. We should enable users to skipTrash directly when dropping tables. We should also be able to provide this functionality without polluting SQL syntax. -- This message was sent by Atlassian JIRA (v6.2#6252)
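The quota pain point described in HIVE-7100 comes from trash semantics: a normal drop moves the warehouse directory into the user's .Trash, where it keeps counting against quota until trash expiry, while skipTrash deletes it outright. A runnable sketch of the two behaviors on a local filesystem (a hypothetical helper for illustration, not Hive's implementation):

```java
// Simulates drop-table data handling: with trash, the directory is moved
// aside and still exists; with skipTrash it is deleted immediately.
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DropTable {
    // Returns true if the table's data still exists somewhere after the drop.
    public static boolean dataSurvivesDrop(boolean skipTrash) {
        try {
            Path warehouse = Files.createTempDirectory("warehouse");
            Path table = Files.createDirectory(warehouse.resolve("t"));
            Path trash = warehouse.resolve(".Trash");
            if (skipTrash) {
                Files.delete(table);                    // quota freed right away
            } else {
                Files.createDirectories(trash);
                Files.move(table, trash.resolve("t"));  // lingers until trash expires
            }
            return Files.exists(trash.resolve("t"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(dataSurvivesDrop(false)); // true: data parked in trash
        System.out.println(dataSurvivesDrop(true));  // false: data gone
    }
}
```

In HDFS terms, the two paths correspond to `hdfs dfs -rm -r <path>` versus `hdfs dfs -rm -r -skipTrash <path>`, which is the manual cleanup the issue says users are forced to do today.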
[jira] [Commented] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031629#comment-14031629 ] Brock Noland commented on HIVE-6938: Yes, parquet.column.index.access needs to be documented. Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Assignee: Daniel Weeks Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-6938.1.patch, HIVE-6938.2.patch, HIVE-6938.2.patch, HIVE-6938.3.patch, HIVE-6938.3.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7211) Throws exception if the name of conf var starts with hive. but does not exist in HiveConf
[ https://issues.apache.org/jira/browse/HIVE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031630#comment-14031630 ] Hive QA commented on HIVE-7211: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12650209/HIVE-7211.3.patch.txt {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 5536 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_binary_map_queries org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats2 org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats3 org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats_empty_partition org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/463/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/463/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-463/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12650209 Throws exception if the name of conf var starts with hive. 
but does not exist in HiveConf Key: HIVE-7211 URL: https://issues.apache.org/jira/browse/HIVE-7211 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-7211.1.patch.txt, HIVE-7211.2.patch.txt, HIVE-7211.3.patch.txt Some typos in configurations are very hard to find. -- This message was sent by Atlassian JIRA (v6.2#6252)
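The idea in HIVE-7211 is to make typos fail fast: any key in the reserved "hive." namespace that is not a known HiveConf variable should be rejected instead of silently ignored. A sketch of that validation (the method names and the known-variable set are stand-ins, not Hive's actual API):

```java
// Hypothetical validator: keys outside the "hive." namespace pass through,
// but unknown "hive."-prefixed keys are treated as typos and rejected.
import java.util.Set;

public class ConfValidator {
    // Stand-in for the real set of HiveConf variable names.
    private static final Set<String> KNOWN =
        Set.of("hive.exec.parallel", "hive.cli.print.header");

    public static boolean isValid(String key) {
        return !key.startsWith("hive.") || KNOWN.contains(key);
    }

    public static void set(String key, String value) {
        if (!isValid(key)) {
            throw new IllegalArgumentException(
                "hive configuration property does not exist: " + key);
        }
        // ... store the value ...
    }
}
```

A misspelled `hive.exec.parallell` would now throw at `set` time rather than being quietly accepted, which is exactly the "typos are very hard to find" problem the issue describes.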
[jira] [Commented] (HIVE-7224) Set incremental printing to true by default in Beeline
[ https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031645#comment-14031645 ] Vaibhav Gumashta commented on HIVE-7224: [~thejas] Sure, let me look at that. If that is the case, I think we should have it consistent for both incremental/non-incremental fetches. On the column truncation issue, I think we should not truncate columns by default. What do you think? Set incremental printing to true by default in Beeline -- Key: HIVE-7224 URL: https://issues.apache.org/jira/browse/HIVE-7224 Project: Hive Issue Type: Bug Components: Clients, JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.14.0 Attachments: HIVE-7224.1.patch See HIVE-7221. By default beeline tries to buffer the entire output relation before printing it on stdout. This can cause OOM when the output relation is large. However, beeline has the option of incremental prints. We should keep that as the default. -- This message was sent by Atlassian JIRA (v6.2#6252)
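The OOM risk described above can be sketched as follows (not Beeline's actual implementation): buffered output materializes the whole relation before printing, so peak memory grows with the result size, while incremental output holds one row at a time.

```java
// Contrast of buffered vs incremental row printing.
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class RowPrinter {
    // Buffered: collect the whole relation first (can OOM on large results).
    static int printBuffered(Iterator<String> rows) {
        List<String> all = new ArrayList<>();
        rows.forEachRemaining(all::add);   // entire result in memory at once
        all.forEach(System.out::println);
        return all.size();
    }

    // Incremental: print each row as it is fetched (constant memory).
    static int printIncremental(Iterator<String> rows) {
        int n = 0;
        while (rows.hasNext()) {
            System.out.println(rows.next());
            n++;
        }
        return n;
    }

    // Small driver; both paths print the same rows, only peak memory differs.
    public static int demo(boolean incremental) {
        Iterator<String> rows = List.of("row1", "row2", "row3").iterator();
        return incremental ? printIncremental(rows) : printBuffered(rows);
    }
}
```

The trade-off the issue mentions is that buffering lets the client compute column widths over the whole result before rendering, which incremental printing cannot do.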
[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs
[ https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031646#comment-14031646 ] Vaibhav Gumashta commented on HIVE-4629: [~raviprak] This can definitely be taken up. There is some good feedback on the jira too regarding the modifications. Let me know if you plan to take it up and need more help while you work on it. Thanks a lot! HS2 should support an API to retrieve query logs Key: HIVE-4629 URL: https://issues.apache.org/jira/browse/HIVE-4629 Project: Hive Issue Type: Sub-task Components: HiveServer2 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4629-no_thrift.1.patch, HIVE-4629.1.patch, HIVE-4629.2.patch, HIVE-4629.3.patch.txt HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-3392) Hive unnecessarily validates table SerDes when dropping a table
[ https://issues.apache.org/jira/browse/HIVE-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031705#comment-14031705 ] Hive QA commented on HIVE-3392: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12650218/HIVE-3392.3.patch.txt {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5536 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_altern1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/464/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/464/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-464/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12650218 Hive unnecessarily validates table SerDes when dropping a table --- Key: HIVE-3392 URL: https://issues.apache.org/jira/browse/HIVE-3392 Project: Hive Issue Type: Bug Affects Versions: 0.9.0 Reporter: Jonathan Natkins Assignee: Navis Labels: patch Attachments: HIVE-3392.2.patch.txt, HIVE-3392.3.patch.txt, HIVE-3392.Test Case - with_trunk_version.txt

natty@hadoop1:~$ hive
hive> add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar;
Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path
Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar
hive> create table test (a int) row format serde 'hive.serde.JSONSerDe';
OK
Time taken: 2.399 seconds
natty@hadoop1:~$ hive
hive> drop table test;
FAILED: Hive Internal Error: java.lang.RuntimeException(MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe hive.serde.JSONSerDe does not exist))
java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe hive.serde.JSONSerDe does not exist)
at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:262)
at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253)
at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:490)
at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:162)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:943)
at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeDropTable(DDLSemanticAnalyzer.java:700)
at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:210)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:211) at
[jira] [Commented] (HIVE-7224) Set incremental printing to true by default in Beeline
[ https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031708#comment-14031708 ] Thejas M Nair commented on HIVE-7224: - bq. On the column truncation issue, I think we should not truncate columns by default. What do you think? I agree, truncation is bad. That is the opinion I express in HIVE-6928 as well. Set incremental printing to true by default in Beeline -- Key: HIVE-7224 URL: https://issues.apache.org/jira/browse/HIVE-7224 Project: Hive Issue Type: Bug Components: Clients, JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.14.0 Attachments: HIVE-7224.1.patch See HIVE-7221. By default Beeline tries to buffer the entire output relation before printing it on stdout. This can cause an OOM when the output relation is large. However, Beeline has an incremental-printing option; we should make that the default. -- This message was sent by Atlassian JIRA (v6.2#6252)
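The trade-off the issue describes can be sketched in a few lines of plain Python (an illustration of the buffering idea only, not Beeline code): buffered output holds the whole relation in memory at once, while incremental output keeps peak memory flat, and both produce the same bytes.

```python
# Illustrative sketch (plain Python, not Beeline code) of buffered vs
# incremental result printing.

def rows(n):
    """Simulate a large result set arriving row by row."""
    for i in range(n):
        yield (i, "val_%d" % i)

def print_buffered(result, out):
    # Buffer everything first (e.g. to compute column widths), then emit.
    buffered = list(result)      # O(n) memory: the whole relation at once
    for row in buffered:
        out.append(str(row))

def print_incremental(result, out):
    # Emit each row as it arrives; memory stays flat in the row count.
    for row in result:
        out.append(str(row))

buf, inc = [], []
print_buffered(rows(1000), buf)
print_incremental(rows(1000), inc)
assert buf == inc  # identical output either way; only peak memory differs
```

Buffering does allow nicer formatting (column widths known up front), which is the column-truncation discussion in the comment above.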
[jira] [Commented] (HIVE-7005) MiniTez tests have non-deterministic explain plans
[ https://issues.apache.org/jira/browse/HIVE-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031741#comment-14031741 ] Hive QA commented on HIVE-7005: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12650238/HIVE-7005.1.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5611 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/466/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/466/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-466/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12650238 MiniTez tests have non-deterministic explain plans -- Key: HIVE-7005 URL: https://issues.apache.org/jira/browse/HIVE-7005 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere Assignee: Gunther Hagleitner Attachments: HIVE-7005.1.patch TestMiniTezCliDriver has a few test failures where there is a diff in the explain plan generated. 
According to Vikram, the plan generated is correct, but the plan can be generated in a couple of different ways and so sometimes the plan will not diff against the expected output. We should probably come up with a way to validate this explain plan in a reproducible way. -- This message was sent by Atlassian JIRA (v6.2#6252)
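One possible shape for the reproducible validation the description asks for (a hypothetical sketch, not the committed HIVE-7005 fix) is to canonicalize the order-insensitive parts of the plan before diffing:

```python
# Hypothetical sketch: make an explain-plan diff order-insensitive by sorting
# the children of nodes whose ordering carries no meaning. An illustration of
# the idea only, not Hive's actual fix for HIVE-7005.

def canonicalize(plan):
    """plan: nested dict of {node_name: [child_plans]}; sort unordered children."""
    if isinstance(plan, dict):
        return {k: sorted((canonicalize(c) for c in v), key=repr)
                for k, v in plan.items()}
    return plan

# Two runs emit the same vertices in different orders:
run_a = {"Tez": [{"Map 1": []}, {"Map 2": []}, {"Reducer 3": []}]}
run_b = {"Tez": [{"Map 2": []}, {"Reducer 3": []}, {"Map 1": []}]}

assert run_a != run_b                              # a naive diff flags a failure
assert canonicalize(run_a) == canonicalize(run_b)  # canonical forms match
```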
[jira] [Commented] (HIVE-7005) MiniTez tests have non-deterministic explain plans
[ https://issues.apache.org/jira/browse/HIVE-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031749#comment-14031749 ] Gunther Hagleitner commented on HIVE-7005: -- No new/related test failures, and the typically non-deterministic tests didn't show up as failures. MiniTez tests have non-deterministic explain plans -- Key: HIVE-7005 URL: https://issues.apache.org/jira/browse/HIVE-7005 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere Assignee: Gunther Hagleitner Attachments: HIVE-7005.1.patch TestMiniTezCliDriver has a few test failures where there is a diff in the explain plan generated. According to Vikram, the plan generated is correct, but the plan can be generated in a couple of different ways and so sometimes the plan will not diff against the expected output. We should probably come up with a way to validate this explain plan in a reproducible way. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7005) MiniTez tests have non-deterministic explain plans
[ https://issues.apache.org/jira/browse/HIVE-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7005: - Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) MiniTez tests have non-deterministic explain plans -- Key: HIVE-7005 URL: https://issues.apache.org/jira/browse/HIVE-7005 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere Assignee: Gunther Hagleitner Fix For: 0.14.0 Attachments: HIVE-7005.1.patch TestMiniTezCliDriver has a few test failures where there is a diff in the explain plan generated. According to Vikram, the plan generated is correct, but the plan can be generated in a couple of different ways and so sometimes the plan will not diff against the expected output. We should probably come up with a way to validate this explain plan in a reproducible way. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7005) MiniTez tests have non-deterministic explain plans
[ https://issues.apache.org/jira/browse/HIVE-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031752#comment-14031752 ] Gunther Hagleitner commented on HIVE-7005: -- Committed to trunk. Thanks for the review [~jdere]! MiniTez tests have non-deterministic explain plans -- Key: HIVE-7005 URL: https://issues.apache.org/jira/browse/HIVE-7005 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere Assignee: Gunther Hagleitner Fix For: 0.14.0 Attachments: HIVE-7005.1.patch TestMiniTezCliDriver has a few test failures where there is a diff in the explain plan generated. According to Vikram, the plan generated is correct, but the plan can be generated in a couple of different ways and so sometimes the plan will not diff against the expected output. We should probably come up with a way to validate this explain plan in a reproducible way. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7219) Improve performance of serialization utils in ORC
[ https://issues.apache.org/jira/browse/HIVE-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031758#comment-14031758 ] Gunther Hagleitner commented on HIVE-7219: -- +1 assuming tests will pass. Improve performance of serialization utils in ORC - Key: HIVE-7219 URL: https://issues.apache.org/jira/browse/HIVE-7219 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-7219.1.patch, HIVE-7219.2.patch, HIVE-7219.3.patch, orc-read-perf-jmh-benchmark.png ORC uses serialization utils heavily for reading and writing data. The bitpacking and unpacking code in writeInts() and readInts() can be unrolled for better performance. Also double reader/writer performance can be improved by bulk reading/writing from/to byte array. -- This message was sent by Atlassian JIRA (v6.2#6252)
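The bitpacking the description refers to writes each integer using a fixed number of bits; the patch unrolls the Java inner loops for specific widths. A small Python sketch of the computation (the semantics only, not the Java patch): a generic any-width packer next to a hand-specialized width-4 packer whose inner bit loop has been unrolled away.

```python
# Sketch of fixed-bit-width integer packing like ORC's SerializationUtils
# writeInts()/readInts(), with a hand-specialized (loop-unrolled) variant for
# one bit width. Illustrative Python, not the Java patch itself.

def pack_generic(values, bit_width):
    """Pack each value into bit_width bits, MSB first, returning bytes."""
    acc, nbits, out = 0, 0, bytearray()
    for v in values:
        acc = (acc << bit_width) | (v & ((1 << bit_width) - 1))
        nbits += bit_width
        while nbits >= 8:            # generic inner loop over bit boundaries
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
    if nbits:
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)

def pack4_unrolled(values):
    """Specialized width-4 packer: two values per byte, no inner bit loop."""
    out = bytearray()
    for i in range(0, len(values) - 1, 2):   # body handles two values per pass
        out.append(((values[i] & 0xF) << 4) | (values[i + 1] & 0xF))
    if len(values) % 2:
        out.append((values[-1] & 0xF) << 4)
    return bytes(out)

vals = [1, 15, 7, 2, 9, 4]
assert pack_generic(vals, 4) == pack4_unrolled(vals)
```

The specialized version does the same work with fewer branches per value, which is the kind of win the benchmark attachment measures.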
[jira] [Updated] (HIVE-7204) Use NULL vertex location hint for Prewarm DAG vertices
[ https://issues.apache.org/jira/browse/HIVE-7204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7204: - Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks [~gopalv]! Use NULL vertex location hint for Prewarm DAG vertices -- Key: HIVE-7204 URL: https://issues.apache.org/jira/browse/HIVE-7204 Project: Hive Issue Type: Sub-task Components: Tez Affects Versions: 0.14.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7204.1.patch The current 0.5.x branch of Tez added extra preconditions which check for parallelism settings to match between the number of containers and the vertex location hints. {code} Caused by: org.apache.hadoop.ipc.RemoteException(java.lang.IllegalArgumentException): Locations array length must match the parallelism set for the vertex at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) at org.apache.tez.dag.api.Vertex.setTaskLocationsHint(Vertex.java:105) at org.apache.tez.dag.app.DAGAppMaster.startPreWarmContainers(DAGAppMaster.java:1004) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7201) Fix TestHiveConf#testConfProperties test case
[ https://issues.apache.org/jira/browse/HIVE-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031781#comment-14031781 ] Hive QA commented on HIVE-7201: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12650163/HIVE-7201.03.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/468/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/468/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-468/ Messages: {noformat} This message was trimmed, see log for full details {noformat}
[jira] [Commented] (HIVE-7159) For inner joins push a 'is not null predicate' to the join sources for every non nullSafe join condition
[ https://issues.apache.org/jira/browse/HIVE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14031780#comment-14031780 ] Hive QA commented on HIVE-7159: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12650258/HIVE-7159.4.patch {color:red}ERROR:{color} -1 due to 63 failed/errored test(s), 5536 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join33 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_alt_syntax org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_4 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_hive_626 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_reorder2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_reorder3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_star org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_vc org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_hook org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_mapjoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoins org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_join_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf_streaming org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoinopt13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoinopt14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subq_where_serialization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists_explain_rewrite org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists_having org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_explain_rewrite org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notexists org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notexists_having org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_views org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_context org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
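The rewrite in the HIVE-7159 title is safe because, for an inner join on a non-null-safe equality, a row whose key is NULL can never satisfy the condition. A plain-Python sketch of that equivalence (an illustration of the semantics, not Hive code):

```python
# Sketch of why pushing "key IS NOT NULL" to the sources of an inner join is
# safe for non-null-safe conditions: NULL keys never match, so pre-filtering
# them preserves the result. Illustrative Python, not Hive code.

def inner_join(left, right):
    # SQL semantics: NULL = NULL is not true, so None never matches.
    return [(l, r) for l in left for r in right
            if l["k"] is not None and r["k"] is not None and l["k"] == r["k"]]

def not_null(rows):
    return [r for r in rows if r["k"] is not None]

left  = [{"k": 1}, {"k": None}, {"k": 2}]
right = [{"k": 1}, {"k": None}]

assert inner_join(left, right) == inner_join(not_null(left), not_null(right))
```

Pushing the filter below the join shrinks the join inputs, which is the point of the optimization; the test churn above is expected, since many golden explain-plan files gain the new predicate.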
[jira] [Updated] (HIVE-2397) Support with rollup option for group by
[ https://issues.apache.org/jira/browse/HIVE-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-2397: --- Labels: TODOC10 (was: ) Support with rollup option for group by --- Key: HIVE-2397 URL: https://issues.apache.org/jira/browse/HIVE-2397 Project: Hive Issue Type: New Feature Reporter: Kevin Wilfong Assignee: Namit Jain Labels: TODOC10 Fix For: 0.10.0 Attachments: HIVE-2397.2.patch.txt, HIVE-2397.3.patch.txt, HIVE-2397.4.patch.txt, HIVE-2397.5.patch.txt We should support the ROLLUP operator similar to the way MySQL implements it. Excerpted from the MySQL documentation:
{noformat}
mysql> SELECT year, country, product, SUM(profit)
    -> FROM sales
    -> GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+-------------+
| year | country | product    | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer   |        1500 |
| 2000 | Finland | Phone      |         100 |
| 2000 | Finland | NULL       |        1600 |
| 2000 | India   | Calculator |         150 |
| 2000 | India   | Computer   |        1200 |
| 2000 | India   | NULL       |        1350 |
| 2000 | USA     | Calculator |          75 |
| 2000 | USA     | Computer   |        1500 |
| 2000 | USA     | NULL       |        1575 |
| 2000 | NULL    | NULL       |        4525 |
| 2001 | Finland | Phone      |          10 |
| 2001 | Finland | NULL       |          10 |
| 2001 | USA     | Calculator |          50 |
| 2001 | USA     | Computer   |        2700 |
| 2001 | USA     | TV         |         250 |
| 2001 | USA     | NULL       |        3000 |
| 2001 | NULL    | NULL       |        3010 |
| NULL | NULL    | NULL       |        7535 |
+------+---------+------------+-------------+
{noformat}
http://dev.mysql.com/doc/refman/5.0/en/group-by-modifiers.html -- This message was sent by Atlassian JIRA (v6.2#6252)
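WITH ROLLUP aggregates over every prefix of the grouping keys, padding the dropped keys with NULL. A short Python sketch of those semantics (illustration only, not Hive's implementation), reproducing the 2001 super-aggregate rows from the MySQL example above:

```python
# Sketch of WITH ROLLUP semantics: aggregate over every prefix of the grouping
# keys, padding the dropped keys with NULL (None). Illustrative Python, not
# Hive's implementation.
from collections import defaultdict

def rollup(rows, keys, measure):
    out = []
    for n in range(len(keys), -1, -1):   # full grouping down to the grand total
        groups = defaultdict(int)
        for r in rows:
            g = tuple(r[k] for k in keys[:n]) + (None,) * (len(keys) - n)
            groups[g] += r[measure]
        out.extend(groups.items())
    return out

sales = [
    {"year": 2001, "country": "Finland", "product": "Phone",      "profit": 10},
    {"year": 2001, "country": "USA",     "product": "Calculator", "profit": 50},
    {"year": 2001, "country": "USA",     "product": "Computer",   "profit": 2700},
    {"year": 2001, "country": "USA",     "product": "TV",         "profit": 250},
]
result = dict(rollup(sales, ["year", "country", "product"], "profit"))
assert result[(2001, "USA", None)] == 3000   # matches the MySQL example row
assert result[(2001, None, None)] == 3010
assert result[(None, None, None)] == 3010
```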
[jira] [Updated] (HIVE-3004) RegexSerDe should support other column types in addition to STRING
[ https://issues.apache.org/jira/browse/HIVE-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-3004: --- Labels: TODOC11 (was: ) RegexSerDe should support other column types in addition to STRING -- Key: HIVE-3004 URL: https://issues.apache.org/jira/browse/HIVE-3004 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Carl Steinbach Assignee: Shreepadma Venugopalan Labels: TODOC11 Fix For: 0.11.0 Attachments: HIVE-3004-1.patch, HIVE-3004.2.patch, HIVE-3004.3.patch.txt, HIVE-3004.4.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
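What typed-column support means for a regex-based deserializer can be sketched in a few lines: capture groups get cast to the declared column types instead of all coming back as strings. A hypothetical Python illustration (not the HIVE-3004 patch):

```python
# Sketch of typed columns in a regex-based deserializer: each capture group is
# cast to its declared column type. Illustrative Python with hypothetical
# names, not the HIVE-3004 patch.
import re

def deserialize(line, pattern, column_types):
    m = re.match(pattern, line)
    if m is None:
        return None            # RegexSerDe-style behavior: unmatched row -> NULL
    casts = {"string": str, "int": int, "double": float}
    return [casts[t](g) for g, t in zip(m.groups(), column_types)]

row = deserialize("alice 42 3.5",
                  r"(\w+) (\d+) ([\d.]+)",
                  ["string", "int", "double"])
assert row == ["alice", 42, 3.5]
```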
[jira] [Updated] (HIVE-3073) Hive List Bucketing - DML support
[ https://issues.apache.org/jira/browse/HIVE-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-3073: --- Labels: TODOC11 (was: ) Hive List Bucketing - DML support -- Key: HIVE-3073 URL: https://issues.apache.org/jira/browse/HIVE-3073 Project: Hive Issue Type: New Feature Components: SQL Affects Versions: 0.10.0 Reporter: Gang Tim Liu Assignee: Gang Tim Liu Labels: TODOC11 Fix For: 0.11.0 Attachments: HIVE-3073.patch.12, HIVE-3073.patch.13, HIVE-3073.patch.15, HIVE-3073.patch.18, HIVE-3073.patch.19, HIVE-3073.patch.21, HIVE-3073.patch.22, HIVE-3073.patch.24, HIVE-3073.patch.26, HIVE-3073.patch.27 If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it: https://cwiki.apache.org/Hive/listbucketing.html This jira issue will track DML change for the feature: 1. single skewed column 2. manual load data -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support
[ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-3072: --- Labels: TODOC10 (was: ) Hive List Bucketing - DDL support - Key: HIVE-3072 URL: https://issues.apache.org/jira/browse/HIVE-3072 Project: Hive Issue Type: New Feature Components: SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu Labels: TODOC10 Fix For: 0.10.0 Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5, HIVE-3072.patch.6, HIVE-3072.patch.7 If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it: https://cwiki.apache.org/Hive/listbucketing.html This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-3056) Create a new metastore tool to bulk update location field in Db/Table/Partition records
[ https://issues.apache.org/jira/browse/HIVE-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-3056: --- Labels: TODOC10 (was: ) Create a new metastore tool to bulk update location field in Db/Table/Partition records Key: HIVE-3056 URL: https://issues.apache.org/jira/browse/HIVE-3056 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Carl Steinbach Assignee: Shreepadma Venugopalan Labels: TODOC10 Fix For: 0.10.0 Attachments: HIVE-3056.2.patch.txt, HIVE-3056.3.patch.txt, HIVE-3056.4.patch.txt, HIVE-3056.5.patch.txt, HIVE-3056.7.patch.txt, HIVE-3056.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5161) Additional SerDe support for varchar type
[ https://issues.apache.org/jira/browse/HIVE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-5161: --- Labels: TODOC12 (was: ) Additional SerDe support for varchar type - Key: HIVE-5161 URL: https://issues.apache.org/jira/browse/HIVE-5161 Project: Hive Issue Type: Bug Components: Serializers/Deserializers, Types Reporter: Jason Dere Assignee: Jason Dere Labels: TODOC12 Fix For: 0.12.0 Attachments: D12897.1.patch, HIVE-5161.1.patch, HIVE-5161.2.patch, HIVE-5161.3.patch, HIVE-5161.v12.1.patch Breaking out support for varchar for the various SerDes as an additional task. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-2694) Add FORMAT UDF
[ https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-2694: --- Labels: TODOC10 (was: ) Add FORMAT UDF -- Key: HIVE-2694 URL: https://issues.apache.org/jira/browse/HIVE-2694 Project: Hive Issue Type: New Feature Components: UDF Reporter: Carl Steinbach Assignee: Zhenxiao Luo Labels: TODOC10 Fix For: 0.10.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2694.D1149.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2694.D1149.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2694.D1149.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2694.D2673.1.patch, HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, HIVE-2694.3.patch.txt, HIVE-2694.4.patch.txt, HIVE-2694.D2673.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-2517) Support group by on struct type
[ https://issues.apache.org/jira/browse/HIVE-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-2517: --- Labels: TODOC12 structtype uniontype (was: structtype uniontype) Support group by on struct type --- Key: HIVE-2517 URL: https://issues.apache.org/jira/browse/HIVE-2517 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Labels: TODOC12, structtype, uniontype Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2517.D2151.1.patch, HIVE-2517_3.patch, hive-2517.patch, hive-2517_1.patch, hive-2517_2.patch Currently group by on struct and union types are not supported. This issue will enable support for those. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-2482) Convenience UDFs for binary data type
[ https://issues.apache.org/jira/browse/HIVE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-2482: --- Labels: TODOC12 (was: ) Convenience UDFs for binary data type - Key: HIVE-2482 URL: https://issues.apache.org/jira/browse/HIVE-2482 Project: Hive Issue Type: New Feature Reporter: Ashutosh Chauhan Assignee: Mark Wagner Labels: TODOC12 Fix For: 0.12.0 Attachments: HIVE-2482.1.patch, HIVE-2482.2.patch, HIVE-2482.3.patch, HIVE-2482.4.patch HIVE-2380 introduced binary data type in Hive. It will be good to have following udfs to make it more useful: * UDF's to convert to/from hex string * UDF's to convert to/from string using a specific encoding * UDF's to convert to/from base64 string -- This message was sent by Atlassian JIRA (v6.2#6252)
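The three conversions the description asks UDFs for can be sketched with the Python standard library (an illustration of the intended semantics, not the Hive UDF implementations):

```python
# The three to/from conversions HIVE-2482 requests for the binary type,
# sketched with the Python standard library. Illustration of the semantics
# only, not Hive code.
import base64
import binascii

data = "hello".encode("utf-8")

hex_str = binascii.hexlify(data).decode("ascii")   # to/from a hex string
assert binascii.unhexlify(hex_str) == data

text = data.decode("utf-8")                        # to/from a chosen encoding
assert text.encode("utf-8") == data

b64 = base64.b64encode(data).decode("ascii")       # to/from a base64 string
assert base64.b64decode(b64) == data
```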
[jira] [Updated] (HIVE-7201) Fix TestHiveConf#testConfProperties test case
[ https://issues.apache.org/jira/browse/HIVE-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7201: --- Status: Open (was: Patch Available) Patch doesn't apply cleanly on trunk. Fix TestHiveConf#testConfProperties test case - Key: HIVE-7201 URL: https://issues.apache.org/jira/browse/HIVE-7201 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Pankit Thapar Assignee: Pankit Thapar Priority: Minor Attachments: HIVE-7201-1.patch, HIVE-7201-2.patch, HIVE-7201.03.patch, HIVE-7201.patch
CHANGE 1:
TEST CASE: The intention of TestHiveConf#testConfProperties() is to test that HiveConf properties are set with the expected priority. Each HiveConf object is initialized as follows:
1) Hadoop configuration properties are applied.
2) ConfVar properties with non-null values are overlaid.
3) hive-site.xml properties are overlaid.
ISSUE: The mapreduce-related configurations are loaded by JobConf, not Configuration. The current test tries to get configuration properties like HADOOPNUMREDUCERS (mapred.job.reduces) from the Configuration class, but these mapreduce-related properties are loaded by the JobConf class from mapred-default.xml.
DETAILS:
LINE 63: checkHadoopConf(ConfVars.HADOOPNUMREDUCERS.varname, 1); fails because, in
private void checkHadoopConf(String name, String expectedHadoopVal) { Assert.assertEquals(expectedHadoopVal, new Configuration().get(name)); }
the second argument of the assertion is null, since it is the JobConf class, not the Configuration class, that initializes the mapred-default values. The code that loads mapreduce resources is in ConfigUtil, and JobConf calls it like this (in a static block):
public class JobConf extends Configuration { private static final Log LOG = LogFactory.getLog(JobConf.class); static { ConfigUtil.loadResources(); // loads mapreduce-related resources (mapred-default.xml) } ... }
Please note, the test case assertion works fine if the HiveConf() constructor is called before this assertion, since HiveConf() triggers JobConf(), which sets the default values of the properties pertaining to mapreduce. This is why there won't be any failures if testHiveSitePath() runs before testConfProperties(): that would load the mapreduce properties into the config properties.
FIX: Instead of using a Configuration object, we can use a JobConf object to get the default values used by hadoop/mapreduce.
CHANGE 2: In TestHiveConf#testHiveSitePath(), the static method getHiveSiteLocation() should be called statically instead of through an object. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4281) add hive.map.groupby.sorted.testmode
[ https://issues.apache.org/jira/browse/HIVE-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-4281: --- Labels: TODOC11 (was: ) add hive.map.groupby.sorted.testmode Key: HIVE-4281 URL: https://issues.apache.org/jira/browse/HIVE-4281 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Labels: TODOC11 Fix For: 0.11.0 Attachments: hive.4281.1.patch, hive.4281.2.patch, hive.4281.2.patch-nohcat, hive.4281.3.patch The idea behind this would be to test hive.map.groupby.sorted. Since this is a new feature, it might be a good idea to run it in test mode, where a query property would denote that this query plan would have changed. If a customer wants, they can run those queries offline, compare the results for correctness, and set hive.map.groupby.sorted only if all the results are the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-3903) Allow updating bucketing/sorting metadata of a partition through the CLI
[ https://issues.apache.org/jira/browse/HIVE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-3903: --- Labels: TODOC11 (was: ) Allow updating bucketing/sorting metadata of a partition through the CLI Key: HIVE-3903 URL: https://issues.apache.org/jira/browse/HIVE-3903 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Samuel Yuan Labels: TODOC11 Fix For: 0.11.0 Attachments: HIVE-3903.1.patch.txt, HIVE-3903.2.patch.txt Right now users can update the bucketing/sorting metadata of a table through the CLI, but not a partition. Use case: Need to merge a partition's files, but it's bucketed/sorted, so want to mark the partition as unbucketed/unsorted. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-3527) Allow CREATE TABLE LIKE command to take TBLPROPERTIES
[ https://issues.apache.org/jira/browse/HIVE-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swarnim Kulkarni updated HIVE-3527:
-----------------------------------
    Labels: TODOC12  (was: )

Allow CREATE TABLE LIKE command to take TBLPROPERTIES

                Key: HIVE-3527
                URL: https://issues.apache.org/jira/browse/HIVE-3527
            Project: Hive
         Issue Type: Improvement
         Components: Query Processor
   Affects Versions: 0.11.0
           Reporter: Kevin Wilfong
           Assignee: Kevin Wilfong
             Labels: TODOC11
            Fix For: 0.11.0
        Attachments: HIVE-3527.1.patch.txt, HIVE-3527.3.patch.txt, HIVE-3527.4.patch.txt, HIVE-3527.D5883.1.patch, hive.3527.2.patch

CREATE TABLE ... LIKE ... commands currently don't take TBLPROPERTIES. I think it would be a useful feature.
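A hedged sketch of the requested syntax. The table names and properties are hypothetical, and the grammar is an assumption based on the existing CREATE TABLE ... TBLPROPERTIES clause rather than the committed patch:

```sql
-- Sketch: copy old_table's schema but attach the new table's own properties.
CREATE TABLE new_table LIKE old_table
TBLPROPERTIES ('created_by' = 'etl_job', 'retention_days' = '30');
```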
[jira] [Commented] (HIVE-3628) Provide a way to use counters in Hive through UDF
[ https://issues.apache.org/jira/browse/HIVE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031786#comment-14031786 ]

Sudarshan Rangarajan commented on HIVE-3628:

Newbie here: is there any example showing how to access files in the distributed cache from inside a UDTF? (It's been marked that HIVE-1016 duplicates this item.)

Provide a way to use counters in Hive through UDF

                Key: HIVE-3628
                URL: https://issues.apache.org/jira/browse/HIVE-3628
            Project: Hive
         Issue Type: Improvement
         Components: UDF
           Reporter: Viji
           Assignee: Navis
           Priority: Minor
            Fix For: 0.11.0
        Attachments: HIVE-3628.D8007.1.patch, HIVE-3628.D8007.2.patch, HIVE-3628.D8007.3.patch, HIVE-3628.D8007.4.patch, HIVE-3628.D8007.5.patch, HIVE-3628.D8007.6.patch

Currently it is not possible to generate counters through UDF. We should support this. Pig currently allows this.
[jira] [Updated] (HIVE-3527) Allow CREATE TABLE LIKE command to take TBLPROPERTIES
[ https://issues.apache.org/jira/browse/HIVE-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swarnim Kulkarni updated HIVE-3527:
-----------------------------------
    Labels: TODOC11  (was: TODOC12)

Allow CREATE TABLE LIKE command to take TBLPROPERTIES

                Key: HIVE-3527
                URL: https://issues.apache.org/jira/browse/HIVE-3527
            Project: Hive
         Issue Type: Improvement
         Components: Query Processor
   Affects Versions: 0.11.0
           Reporter: Kevin Wilfong
           Assignee: Kevin Wilfong
             Labels: TODOC11
            Fix For: 0.11.0
        Attachments: HIVE-3527.1.patch.txt, HIVE-3527.3.patch.txt, HIVE-3527.4.patch.txt, HIVE-3527.D5883.1.patch, hive.3527.2.patch

CREATE TABLE ... LIKE ... commands currently don't take TBLPROPERTIES. I think it would be a useful feature.
[jira] [Updated] (HIVE-2288) Adding the oracle nvl function to the UDF
[ https://issues.apache.org/jira/browse/HIVE-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swarnim Kulkarni updated HIVE-2288:
-----------------------------------
    Labels: TODOC11 hive  (was: hive)

Adding the oracle nvl function to the UDF

                Key: HIVE-2288
                URL: https://issues.apache.org/jira/browse/HIVE-2288
            Project: Hive
         Issue Type: New Feature
         Components: UDF
   Affects Versions: 0.9.0
           Reporter: Guy Doulberg
           Assignee: Edward Capriolo
           Priority: Minor
             Labels: TODOC11, hive
            Fix For: 0.11.0
        Attachments: 0002-HIVE-2288-Adding-the-oracle-nvl-function-to-the-UDF.patch, hive-2288.2.patch.txt

It would be nice if we could use the nvl function, described at oracle: http://www.techonthenet.com/oracle/functions/nvl.php
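For reference, a minimal usage sketch of the function this issue adds; the table and column names are hypothetical:

```sql
-- nvl(expr, default) follows Oracle semantics: return expr unless it is
-- NULL, in which case return default.
SELECT order_id,
       nvl(discount, 0) AS discount
FROM orders;  -- hypothetical table
```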
[jira] [Updated] (HIVE-3725) Add support for pulling HBase columns with prefixes
[ https://issues.apache.org/jira/browse/HIVE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swarnim Kulkarni updated HIVE-3725:
-----------------------------------
    Labels: TODOC12  (was: )

Add support for pulling HBase columns with prefixes

                Key: HIVE-3725
                URL: https://issues.apache.org/jira/browse/HIVE-3725
            Project: Hive
         Issue Type: Improvement
         Components: HBase Handler
   Affects Versions: 0.9.0
           Reporter: Swarnim Kulkarni
           Assignee: Swarnim Kulkarni
             Labels: TODOC12
            Fix For: 0.12.0
        Attachments: HIVE-3725.1.patch.txt, HIVE-3725.2.patch.txt, HIVE-3725.3.patch.txt, HIVE-3725.4.patch.txt, HIVE-3725.patch.3.txt

The current HBase-Hive integration supports reading many values from the same row by specifying a column family, and specifying just the column family pulls in all qualifiers within that family. We should add support for specifying a prefix for the qualifier, so that all columns whose names start with the prefix are automatically pulled in. Wildcard support would be ideal.
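A hedged sketch of what a prefix mapping could look like, assuming the prefix is expressed as "family:prefix.*" in hbase.columns.mapping and lands in a map-typed column. All table, family, and column names here are hypothetical, and the exact syntax may differ from the committed patch:

```sql
-- Sketch: pull every qualifier in family 'cf' starting with 'tag_' into a
-- single map-typed Hive column, instead of mapping the entire family.
CREATE EXTERNAL TABLE hbase_tags (
  row_key STRING,
  tags    MAP<STRING, STRING>
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:tag_.*")
TBLPROPERTIES ("hbase.table.name" = "tags_table");
```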
Re: Documentation Policy
A few more from older releases:

*0.10*: https://issues.apache.org/jira/browse/HIVE-2397?jql=project%20%3D%20HIVE%20AND%20labels%20%3D%20TODOC10%20AND%20status%20in%20(Resolved%2C%20Closed)%20ORDER%20BY%20priority%20DESC
*0.11:* https://issues.apache.org/jira/browse/HIVE-3073?jql=project%20%3D%20HIVE%20AND%20labels%20%3D%20TODOC11%20AND%20status%20in%20(Resolved%2C%20Closed)%20ORDER%20BY%20priority%20DESC
*0.12:* https://issues.apache.org/jira/browse/HIVE-5161?jql=project%20%3D%20HIVE%20AND%20labels%20%3D%20TODOC12%20AND%20status%20in%20(Resolved%2C%20Closed)%20ORDER%20BY%20priority%20DESC

Should we create JIRAs for these so that the work to be done on them does not get lost?

On Fri, Jun 13, 2014 at 5:59 PM, Lefty Leverenz leftylever...@gmail.com wrote:

    Agreed, deleting TODOC## simplifies the labels field, so we should just use comments to keep track of docs done. Besides, doc tasks can get complicated -- my gmail inbox has a few messages with simultaneous done and to-do labels -- so comments are best for tracking progress. Also, as Szehon noticed, links in the comments make it easy to find the docs.

    +1 on (a): delete TODOCs when done; don't add any new labels.

    -- Lefty

On Fri, Jun 13, 2014 at 1:31 PM, kulkarni.swar...@gmail.com kulkarni.swar...@gmail.com wrote:

    +1 on deleting the TODOC tag, as I think it's assumed by default that once an enhancement is done, it will be doc'ed. We may consider adding an additional docdone tag, but I think we can instead just wait for a +1 from the contributor that the documentation is satisfactory (and assume an implicit +1 for no reply) before deleting the TODOC tag.

On Fri, Jun 13, 2014 at 1:32 PM, Szehon Ho sze...@cloudera.com wrote:

    Yea, I'd imagine the TODOC tag pollutes the query of TODOCs and confuses the state of a JIRA, so it's probably best to remove it. The idea of docdone is to query what docs got produced and need review?
    It might be nice to have a tag for that, to easily signal to the contributor or interested parties to take a look. On a side note, I already find your JIRA comments with links to doc wikis very helpful, both to inform the contributor and just as a reference for others. Thanks again for the great work.

On Fri, Jun 13, 2014 at 1:33 AM, Lefty Leverenz leftylever...@gmail.com wrote:

    One more question: what should we do after the documentation is done for a JIRA ticket?

    (a) Just remove the TODOC## label.
    (b) Replace TODOC## with docdone (no caps, no version number).
    (c) Add a docdone label but keep TODOC##.
    (d) Something else.

    -- Lefty

On Thu, Jun 12, 2014 at 12:54 PM, Brock Noland br...@cloudera.com wrote:

    Thank you guys! This is great work.

On Wed, Jun 11, 2014 at 6:20 PM, kulkarni.swar...@gmail.com kulkarni.swar...@gmail.com wrote:

    Going through the issues, I think overall Lefty did an awesome job catching and documenting most of them in time. Following are some of the 0.13 and 0.14 ones I found which either have no documentation or have outdated documentation, and probably need some to be consumable. Contributors, feel free to remove the label if you disagree.

    *TODOC13:* https://issues.apache.org/jira/browse/HIVE-6827?jql=project%20%3D%20HIVE%20AND%20labels%20%3D%20TODOC13%20AND%20status%20in%20(Resolved%2C%20Closed)
    *TODOC14:* https://issues.apache.org/jira/browse/HIVE-6999?jql=project%20%3D%20HIVE%20AND%20labels%20%3D%20TODOC14%20AND%20status%20in%20(Resolved%2C%20Closed)

    I'll continue digging through the queue going backwards to 0.12 and 0.11 and see if I find similar stuff there as well.

On Wed, Jun 11, 2014 at 10:36 AM, kulkarni.swar...@gmail.com kulkarni.swar...@gmail.com wrote:

    Feel free to label such jiras with this keyword and ask the contributors for more information if you need any.

Cool. I'll start chugging through the queue today adding labels as apt.
On Tue, Jun 10, 2014 at 9:45 PM, Thejas Nair the...@hortonworks.com wrote:

    Shall we lump 0.13.0 and 0.13.1 doc tasks as TODOC13?

Sounds good to me.
[jira] [Commented] (HIVE-7232) ReduceSink is emitting NULL keys due to failed keyEval
[ https://issues.apache.org/jira/browse/HIVE-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031791#comment-14031791 ]

Ashutosh Chauhan commented on HIVE-7232:

[~gopalv] Is this producing wrong results (because a NULL key got emitted incorrectly), or is it lowering performance (because it resulted in a skew towards NULL)?

ReduceSink is emitting NULL keys due to failed keyEval

                Key: HIVE-7232
                URL: https://issues.apache.org/jira/browse/HIVE-7232
            Project: Hive
         Issue Type: Bug
         Components: Query Processor
   Affects Versions: 0.14.0
           Reporter: Gopal V

After HIVE-4867 has been merged in, some queries have exhibited a very weird skew towards NULL keys emitted from the ReduceSinkOperator. Added extra logging to print expr.column() in ExprNodeColumnEvaluator in the reduce sink:

{code}
2014-06-14 00:37:19,186 INFO [TezChild] org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: numDistributionKeys = 1
{null --> ExprNodeColumnEvaluator(_col10)}
key_row={"reducesinkkey0":442}
{code}

{code}
HiveKey firstKey = toHiveKey(cachedKeys[0], tag, null);
int distKeyLength = firstKey.getDistKeyLength();
if (distKeyLength <= 1) {
  StringBuffer x1 = new StringBuffer();
  x1.append("numDistributionKeys = " + numDistributionKeys + "\n");
  for (int i = 0; i < numDistributionKeys; i++) {
    x1.append(cachedKeys[0][i] + " --> " + keyEval[i] + "\n");
  }
  x1.append("key_row=" + SerDeUtils.getJSONString(row, keyObjectInspector));
  LOG.info("GOPAL: " + x1.toString());
}
{code}

The query is TPC-H query 5, with extra NULL checks just to be sure.
{code}
SELECT n_name,
       sum(l_extendedprice * (1 - l_discount)) AS revenue
FROM customer, orders, lineitem, supplier, nation, region
WHERE c_custkey = o_custkey
  AND l_orderkey = o_orderkey
  AND l_suppkey = s_suppkey
  AND c_nationkey = s_nationkey
  AND s_nationkey = n_nationkey
  AND n_regionkey = r_regionkey
  AND r_name = 'ASIA'
  AND o_orderdate >= '1994-01-01'
  AND o_orderdate < '1995-01-01'
  AND l_orderkey IS NOT NULL
  AND c_custkey IS NOT NULL
  AND l_suppkey IS NOT NULL
  AND c_nationkey IS NOT NULL
  AND s_nationkey IS NOT NULL
  AND n_regionkey IS NOT NULL
GROUP BY n_name
ORDER BY revenue DESC;
{code}

The reducer which has the issue has the following plan:

{code}
Reducer 3
  Reduce Operator Tree:
    Join Operator
      condition map:
        Inner Join 0 to 1
      condition expressions:
        0 {KEY.reducesinkkey0} {VALUE._col2}
        1 {VALUE._col0} {KEY.reducesinkkey0} {VALUE._col3}
      outputColumnNames: _col0, _col3, _col10, _col11, _col14
      Statistics: Num rows: 18344 Data size: 95229140992 Basic stats: COMPLETE Column stats: NONE
      Reduce Output Operator
        key expressions: _col10 (type: int)
        sort order: +
        Map-reduce partition columns: _col10 (type: int)
        Statistics: Num rows: 18344 Data size: 95229140992 Basic stats: COMPLETE Column stats: NONE
        value expressions: _col0 (type: int), _col3 (type: int), _col11 (type: int), _col14 (type: string)
{code}