[jira] [Updated] (HIVE-10971) count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true

[ https://issues.apache.org/jira/browse/HIVE-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wangmeng updated HIVE-10971:
    Attachment: HIVE-10971.01.patch

count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true
---
                 Key: HIVE-10971
                 URL: https://issues.apache.org/jira/browse/HIVE-10971
             Project: Hive
          Issue Type: Bug
          Components: Logical Optimizer
    Affects Versions: 1.2.0
            Reporter: wangmeng
            Assignee: wangmeng
         Attachments: HIVE-10971.01.patch

When hive.groupby.skewindata=true, the following query based on TPC-H gives wrong results:
{code}
set hive.groupby.skewindata=true;
select l_returnflag, count(*), count(distinct l_linestatus)
from lineitem
group by l_returnflag
limit 10;
{code}
The query plan shows that only one MapReduce job is generated, instead of the two that hive.groupby.skewindata=true is supposed to produce. The problem arises only when {noformat}count(*){noformat} and {noformat}count(distinct){noformat} appear together.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10903) Add hive.in.test for HoS tests

[ https://issues.apache.org/jira/browse/HIVE-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578621#comment-14578621 ]

Hive QA commented on HIVE-10903:

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12738527/HIVE-10903.2.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9004 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_merge_multi_expressions
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_louter_join_ppr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_outer_join_ppr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_cast_constant
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4222/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4222/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4222/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12738527 - PreCommit-HIVE-TRUNK-Build

Add hive.in.test for HoS tests
---
                 Key: HIVE-10903
                 URL: https://issues.apache.org/jira/browse/HIVE-10903
             Project: Hive
          Issue Type: Test
            Reporter: Rui Li
            Assignee: Rui Li
         Attachments: HIVE-10903.1.patch, HIVE-10903.2.patch

Missing the property can make CBO fail to run during unit tests. There may be other effects that can be identified here.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10971) count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true
[ https://issues.apache.org/jira/browse/HIVE-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangmeng updated HIVE-10971: Component/s: (was: Hive) Logical Optimizer count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true --- Key: HIVE-10971 URL: https://issues.apache.org/jira/browse/HIVE-10971 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 1.2.0 Reporter: wangmeng Assignee: wangmeng When hive.groupby.skewindata=true, the following query based on TPC-H gives wrong results: {code} set hive.groupby.skewindata=true; select l_returnflag, count(*), count(distinct l_linestatus) from lineitem group by l_returnflag limit 10; {code} The query plan shows that it generates only one MapReduce job instead of two theoretically, which is dictated by hive.groupby.skewindata=true. The problem arises only when {noformat}count(*){noformat} and {noformat}count(distinct){noformat} exist together. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10971) count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true
[ https://issues.apache.org/jira/browse/HIVE-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangmeng updated HIVE-10971: Attachment: HIVE-10971.01.patch count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true --- Key: HIVE-10971 URL: https://issues.apache.org/jira/browse/HIVE-10971 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 1.2.0 Reporter: wangmeng Assignee: wangmeng Attachments: HIVE-10971.01.patch, HIVE-10971.01.patch When hive.groupby.skewindata=true, the following query based on TPC-H gives wrong results: {code} set hive.groupby.skewindata=true; select l_returnflag, count(*), count(distinct l_linestatus) from lineitem group by l_returnflag limit 10; {code} The query plan shows that it generates only one MapReduce job instead of two theoretically, which is dictated by hive.groupby.skewindata=true. The problem arises only when {noformat}count(*){noformat} and {noformat}count(distinct){noformat} exist together. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10971) count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true
[ https://issues.apache.org/jira/browse/HIVE-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangmeng updated HIVE-10971: Attachment: (was: HIVE-10971.01.patch) count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true --- Key: HIVE-10971 URL: https://issues.apache.org/jira/browse/HIVE-10971 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 1.2.0 Reporter: wangmeng Assignee: wangmeng Attachments: HIVE-10971.01.patch When hive.groupby.skewindata=true, the following query based on TPC-H gives wrong results: {code} set hive.groupby.skewindata=true; select l_returnflag, count(*), count(distinct l_linestatus) from lineitem group by l_returnflag limit 10; {code} The query plan shows that it generates only one MapReduce job instead of two theoretically, which is dictated by hive.groupby.skewindata=true. The problem arises only when {noformat}count(*){noformat} and {noformat}count(distinct){noformat} exist together. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10933) Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns()

[ https://issues.apache.org/jira/browse/HIVE-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-10933:
    Component/s: (was: API)
                 JDBC

Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns()
---
                 Key: HIVE-10933
                 URL: https://issues.apache.org/jira/browse/HIVE-10933
             Project: Hive
          Issue Type: Bug
          Components: JDBC
    Affects Versions: 0.13.0
            Reporter: Son Nguyen
            Assignee: Chaoyu Tang

DatabaseMetaData.getColumns() returns COLUMN_SIZE as 0 for a column defined as varchar(32) or char(32), while ResultSetMetaData.getPrecision() returns the correct value, 32. Here is a segment of a program that reproduces the issue.
{code}
try {
    statement = connection.createStatement();
    statement.execute("drop table if exists son_table");
    statement.execute("create table son_table( col1 varchar(32) )");
    statement.close();
} catch (Exception e) {
    return;
}

// get column info using metadata
try {
    DatabaseMetaData dmd = connection.getMetaData();
    ResultSet resultSet = dmd.getColumns(null, null, "son_table", "col1");
    if (resultSet.next()) {
        String tabName = resultSet.getString("TABLE_NAME");
        String colName = resultSet.getString("COLUMN_NAME");
        String dataType = resultSet.getString("DATA_TYPE");
        String typeName = resultSet.getString("TYPE_NAME");
        int precision = resultSet.getInt("COLUMN_SIZE");
        // output is: colName = col1, dataType = 12, typeName = VARCHAR, precision = 0.
        System.out.format("colName = %s, dataType = %s, typeName = %s, precision = %d.",
                colName, dataType, typeName, precision);
    }
} catch (Exception e) {
    return;
}
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10933) Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns()

[ https://issues.apache.org/jira/browse/HIVE-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-10933:
    Description:
DatabaseMetaData.getColumns() returns COLUMN_SIZE as 0 for a column defined as varchar(32) or char(32), while ResultSetMetaData.getPrecision() returns the correct value, 32. Here is a segment of a program that reproduces the issue.
{code}
try {
    statement = connection.createStatement();
    statement.execute("drop table if exists son_table");
    statement.execute("create table son_table( col1 varchar(32) )");
    statement.close();
} catch (Exception e) {
    return;
}

// get column info using metadata
try {
    DatabaseMetaData dmd = connection.getMetaData();
    ResultSet resultSet = dmd.getColumns(null, null, "son_table", "col1");
    if (resultSet.next()) {
        String tabName = resultSet.getString("TABLE_NAME");
        String colName = resultSet.getString("COLUMN_NAME");
        String dataType = resultSet.getString("DATA_TYPE");
        String typeName = resultSet.getString("TYPE_NAME");
        int precision = resultSet.getInt("COLUMN_SIZE");
        // output is: colName = col1, dataType = 12, typeName = VARCHAR, precision = 0.
        System.out.format("colName = %s, dataType = %s, typeName = %s, precision = %d.",
                colName, dataType, typeName, precision);
    }
} catch (Exception e) {
    return;
}
{code}

  was:
DatabaseMetaData.getColumns() returns COLUMN_SIZE as 0 for a column defined as varchar(32) or char(32), while ResultSetMetaData.getPrecision() returns the correct value, 32. Here is a segment of a program that reproduces the issue.

try {
    statement = connection.createStatement();
    statement.execute("drop table if exists son_table");
    statement.execute("create table son_table( col1 varchar(32) )");
    statement.close();
} catch (Exception e) {
    return;
}

// get column info using metadata
try {
    DatabaseMetaData dmd = connection.getMetaData();
    ResultSet resultSet = dmd.getColumns(null, null, "son_table", "col1");
    if (resultSet.next()) {
        String tabName = resultSet.getString("TABLE_NAME");
        String colName = resultSet.getString("COLUMN_NAME");
        String dataType = resultSet.getString("DATA_TYPE");
        String typeName = resultSet.getString("TYPE_NAME");
        int precision = resultSet.getInt("COLUMN_SIZE");
        // output is: colName = col1, dataType = 12, typeName = VARCHAR, precision = 0.
        System.out.format("colName = %s, dataType = %s, typeName = %s, precision = %d.",
                colName, dataType, typeName, precision);
    }
} catch (Exception e) {
    return;
}

Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns()
---
                 Key: HIVE-10933
                 URL: https://issues.apache.org/jira/browse/HIVE-10933
             Project: Hive
          Issue Type: Bug
          Components: API
    Affects Versions: 0.13.0
            Reporter: Son Nguyen
            Assignee: Chaoyu Tang

DatabaseMetaData.getColumns() returns COLUMN_SIZE as 0 for a column defined as varchar(32) or char(32), while ResultSetMetaData.getPrecision() returns the correct value, 32. Here is a segment of a program that reproduces the issue.
{code}
try {
    statement = connection.createStatement();
    statement.execute("drop table if exists son_table");
    statement.execute("create table son_table( col1 varchar(32) )");
    statement.close();
} catch (Exception e) {
    return;
}

// get column info using metadata
try {
    DatabaseMetaData dmd = connection.getMetaData();
    ResultSet resultSet = dmd.getColumns(null, null, "son_table", "col1");
    if (resultSet.next()) {
        String tabName = resultSet.getString("TABLE_NAME");
        String colName = resultSet.getString("COLUMN_NAME");
[jira] [Commented] (HIVE-10971) count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true

[ https://issues.apache.org/jira/browse/HIVE-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578602#comment-14578602 ]

wangmeng commented on HIVE-10971:

{code}
hive> set hive.groupby.skewindata=true;
hive> explain select l_returnflag, count(*), count(distinct l_linestatus) from lineitem group by l_returnflag limit 10;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: lineitem
            Statistics: Num rows: 1008537518 Data size: 201707503616 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: l_returnflag (type: string), l_linestatus (type: string)
              outputColumnNames: l_returnflag, l_linestatus
              Statistics: Num rows: 1008537518 Data size: 201707503616 Basic stats: COMPLETE Column stats: NONE
              Group By Operator
                aggregations: count(), count(DISTINCT l_linestatus)
                keys: l_returnflag (type: string), l_linestatus (type: string)
                mode: hash
                outputColumnNames: _col0, _col1, _col2, _col3
                Statistics: Num rows: 1008537518 Data size: 201707503616 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: _col0 (type: string), _col1 (type: string)
                  sort order: ++
                  Map-reduce partition columns: _col0 (type: string)
                  Statistics: Num rows: 1008537518 Data size: 201707503616 Basic stats: COMPLETE Column stats: NONE
                  value expressions: _col2 (type: bigint)
      Reduce Operator Tree:
        Group By Operator
          aggregations: count(VALUE._col0), count(DISTINCT KEY._col1:0._col0)
          keys: KEY._col0 (type: string)
          mode: complete
          outputColumnNames: _col0, _col1, _col2
          Statistics: Num rows: 504268759 Data size: 100853751808 Basic stats: COMPLETE Column stats: NONE
          Select Operator
            expressions: _col0 (type: string), _col1 (type: bigint), _col2 (type: bigint)
            outputColumnNames: _col0, _col1, _col2
            Statistics: Num rows: 504268759 Data size: 100853751808 Basic stats: COMPLETE Column stats: NONE
            Limit
              Number of rows: 10
              Statistics: Num rows: 10 Data size: 2000 Basic stats: COMPLETE Column stats: NONE
              File Output Operator
                compressed: true
                Statistics: Num rows: 10 Data size: 2000 Basic stats: COMPLETE Column stats: NONE
                table:
                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: 10
{code}

When hive.groupby.skewindata=false, the Group By operator has mode mergepartial, which gives the correct results.

count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true
---
                 Key: HIVE-10971
                 URL: https://issues.apache.org/jira/browse/HIVE-10971
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 1.2.0
            Reporter: wangmeng
            Assignee: wangmeng

When hive.groupby.skewindata=true, the following query based on TPC-H gives wrong results:
{code}
set hive.groupby.skewindata=true;
select l_returnflag, count(*), count(distinct l_linestatus)
from lineitem
group by l_returnflag
limit 10;
{code}
The query plan shows that only one MapReduce job is generated, instead of the two that hive.groupby.skewindata=true is supposed to produce. The problem arises only when {noformat}count(*){noformat} and {noformat}count(distinct){noformat} appear together.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
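[Editorial note] The mergepartial-vs-complete distinction in the comment above can be modeled outside Hive. The sketch below is an illustrative Python model with toy data, not Hive internals: it shows why a reducer that re-counts incoming records ("complete" over map-side pre-aggregated input) undercounts count(*), while summing the map-side partial counts ("mergepartial") is correct.

```python
from collections import Counter, defaultdict

# Toy (l_returnflag, l_linestatus) rows standing in for lineitem.
rows = [("A", "O"), ("A", "O"), ("A", "F"), ("R", "F")]

# Map-side hash aggregation: partial count() per (flag, status) composite key.
partials = Counter(rows)

def reduce_side(mode):
    """Combine map-side partials per l_returnflag under the given GroupBy mode."""
    grouped = defaultdict(list)
    for (flag, status), cnt in partials.items():
        grouped[flag].append((status, cnt))
    out = {}
    for flag, items in grouped.items():
        distinct = len({status for status, _ in items})
        if mode == "mergepartial":
            # Correct: count(*) is the sum of the partial counts.
            total = sum(cnt for _, cnt in items)
        else:
            # "complete" treats each incoming record as one raw row,
            # discarding the partial counts -- the undercount pattern in the bug.
            total = len(items)
        out[flag] = (total, distinct)
    return out

print(reduce_side("mergepartial"))  # flag 'A': count(*) = 3, count(distinct) = 2
print(reduce_side("complete"))      # flag 'A': count(*) collapses to 2 -- wrong
```

count(distinct) is unaffected in this model because the distinct set can be derived from the composite keys alone, which matches the report that only count(*) goes wrong when the two appear together.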
[jira] [Updated] (HIVE-686) add UDF substring_index

[ https://issues.apache.org/jira/browse/HIVE-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-686:
    Description:
SUBSTRING_INDEX(str,delim,count)

Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. SUBSTRING_INDEX() performs a case-sensitive match when searching for delim.

Examples:
{code:sql}
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 3);   --www.mysql.com
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);   --www.mysql
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 1);   --www
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 0);   --''
SELECT SUBSTRING_INDEX('www.mysql.com', '.', -1);  --com
SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2);  --mysql.com
SELECT SUBSTRING_INDEX('www.mysql.com', '.', -3);  --www.mysql.com
{code}
{code:sql}
--#delim does not exist in str
SELECT SUBSTRING_INDEX('www.mysql.com', 'Q', 1);    --www.mysql.com
--#delim is 2 chars
SELECT SUBSTRING_INDEX('www||mysql||com', '||', 2); --www||mysql
--#delim is empty string
SELECT SUBSTRING_INDEX('www.mysql.com', '', 2);     --''
--#str is empty string
SELECT SUBSTRING_INDEX('', '.', 2);                 --''
{code}
{code:sql}
--#null params
SELECT SUBSTRING_INDEX(null, '.', 1);               --null
SELECT SUBSTRING_INDEX('www.mysql.com', null, 1);   --null
SELECT SUBSTRING_INDEX('www.mysql.com', '.', null); --null
{code}

  was:
SUBSTRING_INDEX(str,delim,count)

Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. SUBSTRING_INDEX() performs a case-sensitive match when searching for delim.

Examples:
{code}
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 3);   --www.mysql.com
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);   --www.mysql
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 1);   --www
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 0);   --''
SELECT SUBSTRING_INDEX('www.mysql.com', '.', -1);  --com
SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2);  --mysql.com
SELECT SUBSTRING_INDEX('www.mysql.com', '.', -3);  --www.mysql.com
{code}
{code}
--#delim does not exist in str
SELECT SUBSTRING_INDEX('www.mysql.com', 'Q', 1);    --www.mysql.com
--#delim is 2 chars
SELECT SUBSTRING_INDEX('www||mysql||com', '||', 2); --www||mysql
--#delim is empty string
SELECT SUBSTRING_INDEX('www.mysql.com', '', 2);     --''
--#str is empty string
SELECT SUBSTRING_INDEX('', '.', 2);                 --''
{code}
{code}
--#null params
SELECT SUBSTRING_INDEX(null, '.', 1);               --null
SELECT SUBSTRING_INDEX('www.mysql.com', null, 1);   --null
SELECT SUBSTRING_INDEX('www.mysql.com', '.', null); --null
{code}

add UDF substring_index
---
                 Key: HIVE-686
                 URL: https://issues.apache.org/jira/browse/HIVE-686
             Project: Hive
          Issue Type: New Feature
          Components: UDF
            Reporter: Namit Jain
            Assignee: Alexander Pivovarov
             Fix For: 1.3.0, 2.0.0
         Attachments: HIVE-686.1.patch, HIVE-686.1.patch, HIVE-686.patch, HIVE-686.patch

SUBSTRING_INDEX(str,delim,count)

Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. SUBSTRING_INDEX() performs a case-sensitive match when searching for delim.

Examples:
{code:sql}
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 3);   --www.mysql.com
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);   --www.mysql
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 1);   --www
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 0);   --''
SELECT SUBSTRING_INDEX('www.mysql.com', '.', -1);  --com
SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2);  --mysql.com
SELECT SUBSTRING_INDEX('www.mysql.com', '.', -3);  --www.mysql.com
{code}
{code:sql}
--#delim does not exist in str
SELECT SUBSTRING_INDEX('www.mysql.com', 'Q', 1);    --www.mysql.com
--#delim is 2 chars
SELECT SUBSTRING_INDEX('www||mysql||com', '||', 2); --www||mysql
--#delim is empty string
SELECT SUBSTRING_INDEX('www.mysql.com', '', 2);     --''
--#str is empty string
SELECT SUBSTRING_INDEX('', '.', 2);                 --''
{code}
{code:sql}
--#null params
SELECT SUBSTRING_INDEX(null, '.', 1);               --null
SELECT SUBSTRING_INDEX('www.mysql.com', null, 1);   --null
SELECT SUBSTRING_INDEX('www.mysql.com', '.', null); --null
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
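[Editorial note] The semantics above can be modeled in a few lines of Python. This is an illustrative sketch that matches the documented examples, not the proposed UDF implementation:

```python
def substring_index(s, delim, count):
    """Reference model of the SUBSTRING_INDEX semantics described above.
    Any NULL (None) argument yields None, matching the examples."""
    if s is None or delim is None or count is None:
        return None
    # Empty delimiter or count of 0 yields the empty string.
    if delim == "" or count == 0:
        return ""
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    # Negative count: everything to the right, counting from the right.
    return delim.join(parts[count:])

print(substring_index("www.mysql.com", ".", 2))   # www.mysql
print(substring_index("www.mysql.com", ".", -2))  # mysql.com
print(substring_index("www.mysql.com", "Q", 1))   # www.mysql.com
```

Note that when delim does not occur in str, or |count| exceeds the number of delimiters, the whole string comes back, which is consistent with the first and last examples above.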
[jira] [Updated] (HIVE-10971) count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true
[ https://issues.apache.org/jira/browse/HIVE-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangmeng updated HIVE-10971: Description: When hive.groupby.skewindata=true, the following query based on TPC-H gives wrong results: {code} set hive.groupby.skewindata=true; select l_returnflag, count(*), count(distinct l_linestatus) from lineitem group by l_returnflag limit 10; {code} The query plan shows that it generates only one MapReduce job instead of two theoretically, which is dictated by hive.groupby.skewindata=true. The problem arises only when {noformat}count(*){noformat} and {noformat}count(distinct){noformat} exist together. was: When hive.groupby.skewindata=true, the following query based on TPC-H gives wrong results: {code} set hive.groupby.skewindata=true; select l_returnflag, count(*), count(distinct l_linestatus) from lineitem group by l_returnflag limit 10; {code} The query plan shows that it generates only one MapReduce job instead of two, which is dictated by hive.groupby.skewindata=true. The problem arises only when {noformat}count(*){noformat} and {noformat}count(distinct){noformat} exist together. count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true --- Key: HIVE-10971 URL: https://issues.apache.org/jira/browse/HIVE-10971 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Reporter: wangmeng Assignee: wangmeng When hive.groupby.skewindata=true, the following query based on TPC-H gives wrong results: {code} set hive.groupby.skewindata=true; select l_returnflag, count(*), count(distinct l_linestatus) from lineitem group by l_returnflag limit 10; {code} The query plan shows that it generates only one MapReduce job instead of two theoretically, which is dictated by hive.groupby.skewindata=true. The problem arises only when {noformat}count(*){noformat} and {noformat}count(distinct){noformat} exist together. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10903) Add hive.in.test for HoS tests [Spark Branch]

[ https://issues.apache.org/jira/browse/HIVE-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-10903:
    Attachment: (was: HIVE-10903.1-spark.patch)

Add hive.in.test for HoS tests [Spark Branch]
---
                 Key: HIVE-10903
                 URL: https://issues.apache.org/jira/browse/HIVE-10903
             Project: Hive
          Issue Type: Test
            Reporter: Rui Li
            Assignee: Rui Li
         Attachments: HIVE-10903.1.patch

Missing the property can make CBO fail to run during unit tests. There may be other effects that can be identified here.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-10453) HS2 leaking open file descriptors when using UDFs
[ https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta reopened HIVE-10453: - [~ychena] In our internal testing, we found that closing the classloader removes certain jars from the classpath, resulting in ClassNotFoundException. I'm reopening the jira and reverting the patch from the respective branches via the linked jira. HS2 leaking open file descriptors when using UDFs - Key: HIVE-10453 URL: https://issues.apache.org/jira/browse/HIVE-10453 Project: Hive Issue Type: Bug Components: UDF Reporter: Yongzhi Chen Assignee: Yongzhi Chen Fix For: 1.3.0, 1.2.1, 2.0.0 Attachments: HIVE-10453.1.patch, HIVE-10453.2.patch 1. create a custom function by CREATE FUNCTION myfunc AS 'someudfclass' using jar 'hdfs:///tmp/myudf.jar'; 2. Create a simple jdbc client, just do connect, run simple query which using the function such as: select myfunc(col1) from sometable 3. Disconnect. Check open file for HiveServer2 by: lsof -p HSProcID | grep myudf.jar You will see the leak as: {noformat} java 28718 ychen txt REG1,4741 212977666 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar java 28718 ychen 330r REG1,4741 212977666 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10959) Templeton launcher job should reconnect to the running child job on task retry
[ https://issues.apache.org/jira/browse/HIVE-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Mitic updated HIVE-10959: -- Attachment: HIVE-10959.2.patch Attaching updated patch based on above comments and additional testing. Templeton launcher job should reconnect to the running child job on task retry -- Key: HIVE-10959 URL: https://issues.apache.org/jira/browse/HIVE-10959 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.15.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Attachments: HIVE-10959.2.patch, HIVE-10959.patch Currently, Templeton launcher kills all child jobs (jobs tagged with the parent job's id) upon task retry. Upon templeton launcher task retry, templeton should reconnect to the running job and continue tracking its progress that way. This logic cannot be used for all job kinds (e.g. for jobs that are driven by the client side like regular hive). However, for MapReduceV2, and possibly Tez and HiveOnTez, this should be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10903) Add hive.in.test for HoS tests

[ https://issues.apache.org/jira/browse/HIVE-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-10903:
    Summary: Add hive.in.test for HoS tests  (was: Add hive.in.test for HoS tests [Spark Branch])

Add hive.in.test for HoS tests
---
                 Key: HIVE-10903
                 URL: https://issues.apache.org/jira/browse/HIVE-10903
             Project: Hive
          Issue Type: Test
            Reporter: Rui Li
            Assignee: Rui Li
         Attachments: HIVE-10903.1.patch, HIVE-10903.2.patch

Missing the property can make CBO fail to run during unit tests. There may be other effects that can be identified here.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10903) Add hive.in.test for HoS tests [Spark Branch]

[ https://issues.apache.org/jira/browse/HIVE-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-10903:
    Attachment: HIVE-10903.2.patch

Verified that the changes to the golden files are in line with the MR version. This patch also makes {{cbo_subq_in.q}} and {{groupby_complex_types_multi_single_reducer.q}} deterministic, so it is for the master branch.

Add hive.in.test for HoS tests [Spark Branch]
---
                 Key: HIVE-10903
                 URL: https://issues.apache.org/jira/browse/HIVE-10903
             Project: Hive
          Issue Type: Test
            Reporter: Rui Li
            Assignee: Rui Li
         Attachments: HIVE-10903.1.patch, HIVE-10903.2.patch

Missing the property can make CBO fail to run during unit tests. There may be other effects that can be identified here.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements

[ https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578474#comment-14578474 ]

Laljo John Pullokkaran commented on HIVE-10841:

[~a.semyannikov]
#1. We need both: the SemanticAnalyzer (JoinTree) change for predicate push down, and the OpProc factory change to prevent illegal push downs.
#2. Yes, the failed tests need to be analyzed.

[WHERE col is not null] does not work sometimes for queries with many JOIN statements
---
                 Key: HIVE-10841
                 URL: https://issues.apache.org/jira/browse/HIVE-10841
             Project: Hive
          Issue Type: Bug
          Components: Query Planning, Query Processor
    Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0, 1.3.0
            Reporter: Alexander Pivovarov
            Assignee: Alexander Pivovarov
         Attachments: HIVE-10841.1.patch, HIVE-10841.patch

The result from the following SELECT query is 3 rows but it should be 1 row. I checked it in MySQL - it returned 1 row.

To reproduce the issue in Hive:
1. prepare tables
{code}
drop table if exists L;
drop table if exists LA;
drop table if exists FR;
drop table if exists A;
drop table if exists PI;
drop table if exists acct;

create table L as select 4436 id;
create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
create table FR as select 4436 loan_id;
create table A as select 4748 id;
create table PI as select 4415 id;
create table acct as select 4748 aid, 10 acc_n, 122 brn;
insert into table acct values(4748, null, null);
insert into table acct values(4748, null, null);
{code}
2. run SELECT query
{code}
select acct.ACC_N, acct.brn
FROM L
JOIN LA ON L.id = LA.loan_id
JOIN FR ON L.id = FR.loan_id
JOIN A ON LA.aid = A.id
JOIN PI ON PI.id = LA.pi_id
JOIN acct ON A.id = acct.aid
WHERE L.id = 4436 and acct.brn is not null;
{code}
the result is 3 rows
{code}
10    122
NULL  NULL
NULL  NULL
{code}
but it should be 1 row
{code}
10    122
{code}
2.1 explain select ...
output for hive-1.3.0 MR {code} STAGE DEPENDENCIES: Stage-12 is a root stage Stage-9 depends on stages: Stage-12 Stage-0 depends on stages: Stage-9 STAGE PLANS: Stage: Stage-12 Map Reduce Local Work Alias - Map Local Tables: a Fetch Operator limit: -1 acct Fetch Operator limit: -1 fr Fetch Operator limit: -1 l Fetch Operator limit: -1 pi Fetch Operator limit: -1 Alias - Map Local Operator Tree: a TableScan alias: a Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col5 (type: int) 1 id (type: int) 2 aid (type: int) acct TableScan alias: acct Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: aid is not null (type: boolean) Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col5 (type: int) 1 id (type: int) 2 aid (type: int) fr TableScan alias: fr Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (loan_id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) l TableScan alias: l Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) pi TableScan alias: pi Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator
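[Editorial note] Setting the explain plan aside, the expected answer can be re-derived with a small standalone model. This is plain Python over hypothetical in-memory tables mirroring the repro data, not Hive code:

```python
# Tables mirroring the repro DDL above; acct holds one complete row
# plus the two (NULL, NULL) rows added by the inserts.
L    = [{"id": 4436}]
LA   = [{"loan_id": 4436, "aid": 4748, "pi_id": 4415}]
FR   = [{"loan_id": 4436}]
A    = [{"id": 4748}]
PI   = [{"id": 4415}]
acct = [
    {"aid": 4748, "acc_n": 10, "brn": 122},
    {"aid": 4748, "acc_n": None, "brn": None},
    {"aid": 4748, "acc_n": None, "brn": None},
]

# Inner joins plus the WHERE clause, applied faithfully:
# "acct.brn is not null" must filter out the two NULL rows.
result = [
    (ac["acc_n"], ac["brn"])
    for l in L
    for la in LA if l["id"] == la["loan_id"]
    for fr in FR if l["id"] == fr["loan_id"]
    for a in A if la["aid"] == a["id"]
    for pi in PI if pi["id"] == la["pi_id"]
    for ac in acct if a["id"] == ac["aid"]
    if l["id"] == 4436 and ac["brn"] is not None
]

print(result)  # one row, as the reporter (and MySQL) expects
```

The 3-row output reported in the bug corresponds to dropping the `brn is not null` predicate during planning; the model above shows the semantically required answer is a single (10, 122) row.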
[jira] [Commented] (HIVE-10866) Throw error when client try to insert into bucketed table

[ https://issues.apache.org/jira/browse/HIVE-10866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579110#comment-14579110 ]

Yongzhi Chen commented on HIVE-10866:

Insert into a bucketed table that already has data should not succeed when hive.enforce.bucketing is true. Fixed the unit test.

Throw error when client try to insert into bucketed table
---
                 Key: HIVE-10866
                 URL: https://issues.apache.org/jira/browse/HIVE-10866
             Project: Hive
          Issue Type: Improvement
    Affects Versions: 1.2.0, 1.3.0
            Reporter: Yongzhi Chen
            Assignee: Yongzhi Chen
         Attachments: HIVE-10866.1.patch, HIVE-10866.2.patch

Currently, Hive does not support appends (insert into) on bucketed tables; see open jira HIVE-3608. When inserting into such a table, the data will be corrupted and no longer fit for bucketmapjoin. We need to find a way to prevent clients from inserting into such tables.

Reproduce:
{noformat}
CREATE TABLE IF NOT EXISTS buckettestoutput1(
data string
)CLUSTERED BY(data)
INTO 2 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

CREATE TABLE IF NOT EXISTS buckettestoutput2(
data string
)CLUSTERED BY(data)
INTO 2 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

set hive.enforce.bucketing = true;
set hive.enforce.sorting=true;
insert into table buckettestoutput1 select code from sample_07 where total_emp 134354250 limit 10;

After this first insert, I did:

set hive.auto.convert.sortmerge.join=true;
set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;
set hive.auto.convert.sortmerge.join.noconditionaltask=true;

0: jdbc:hive2://localhost:1 select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data);
+---+---+
| data | data |
+---+---+
+---+---+

So select works fine.

Second insert:
0: jdbc:hive2://localhost:1 insert into table buckettestoutput1 select code from sample_07 where total_emp = 134354250 limit 10;
No rows affected (61.235 seconds)

Then select:
0: jdbc:hive2://localhost:1 select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data);
Error: Error while compiling statement: FAILED: SemanticException [Error 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number of buckets for table buckettestoutput1 is 2, whereas the number of files is 4 (state=42000,code=10141)
0: jdbc:hive2://localhost:1
{noformat}

Insert into an empty table or partition is fine, but after an insert into a non-empty one (the second insert in the repro), bucketmapjoin throws an error. We should not let the second insert succeed.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10963) Hive throws NPE rather than meaningful error message when window is missing
[ https://issues.apache.org/jira/browse/HIVE-10963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10963: Attachment: (was: HIVE-10963.patch) Hive throws NPE rather than meaningful error message when window is missing --- Key: HIVE-10963 URL: https://issues.apache.org/jira/browse/HIVE-10963 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 1.3.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10963.patch {{select sum(salary) over w1 from emp;}} throws NPE rather than meaningful error message like missing window. And also give the right window name rather than the classname in the error message after NPE issue is fixed. {noformat} org.apache.hadoop.hive.ql.parse.SemanticException: Window Spec org.apache.hadoop.hive.ql.parse.WindowingSpec$WindowSpec@7954e1de refers to an unknown source {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10903) Add hive.in.test for HoS tests
[ https://issues.apache.org/jira/browse/HIVE-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579073#comment-14579073 ] Hive QA commented on HIVE-10903: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738582/HIVE-10903.3.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9004 tests executed *Failed tests:* {noformat} org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4225/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4225/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4225/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12738582 - PreCommit-HIVE-TRUNK-Build Add hive.in.test for HoS tests -- Key: HIVE-10903 URL: https://issues.apache.org/jira/browse/HIVE-10903 Project: Hive Issue Type: Test Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-10903.1.patch, HIVE-10903.2.patch, HIVE-10903.3.patch Missing the property can make CBO fail to run during UT. There may be other effects that can be identified here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10880) The bucket number is not respected in insert overwrite.
[ https://issues.apache.org/jira/browse/HIVE-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579119#comment-14579119 ] Yongzhi Chen commented on HIVE-10880: - [~xuefuz], I agree with you, there is something more serious than the missing files. I think the bucket algorithm is broken. I just tried an insert overwrite from a very big table, and all the data goes to one bucket too. It seems the hashing is no longer working. I will try to figure out why. The bucket number is not respected in insert overwrite. --- Key: HIVE-10880 URL: https://issues.apache.org/jira/browse/HIVE-10880 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Priority: Blocker Attachments: HIVE-10880.1.patch, HIVE-10880.2.patch, HIVE-10880.3.patch When hive.enforce.bucketing is true, the bucket number defined in the table is no longer respected in current master and 1.2. This is a regression. Reproduce: {noformat} CREATE TABLE IF NOT EXISTS buckettestinput( data string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; CREATE TABLE IF NOT EXISTS buckettestoutput1( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; CREATE TABLE IF NOT EXISTS buckettestoutput2( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; Then I inserted the following data into the buckettestinput table firstinsert1 firstinsert2 firstinsert3 firstinsert4 firstinsert5 firstinsert6 firstinsert7 firstinsert8 secondinsert1 secondinsert2 secondinsert3 secondinsert4 secondinsert5 secondinsert6 secondinsert7 secondinsert8 set hive.enforce.bucketing = true; set hive.enforce.sorting=true; insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%'; set hive.auto.convert.sortmerge.join=true; set hive.optimize.bucketmapjoin = true; set hive.optimize.bucketmapjoin.sortedmerge = true; select * from buckettestoutput1 a join 
buckettestoutput2 b on (a.data=b.data); Error: Error while compiling statement: FAILED: SemanticException [Error 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number of buckets for table buckettestoutput1 is 2, whereas the number of files is 1 (state=42000,code=10141) {noformat} The related debug information related to insert overwrite: {noformat} 0: jdbc:hive2://localhost:1 insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%'insert overwrite table buckettestoutput1 0: jdbc:hive2://localhost:1 ; select * from buckettestinput where data like ' first%'; INFO : Number of reduce tasks determined at compile time: 2 INFO : In order to change the average load for a reducer (in bytes): INFO : set hive.exec.reducers.bytes.per.reducer=number INFO : In order to limit the maximum number of reducers: INFO : set hive.exec.reducers.max=number INFO : In order to set a constant number of reducers: INFO : set mapred.reduce.tasks=number INFO : Job running in-process (local Hadoop) INFO : 2015-06-01 11:09:29,650 Stage-1 map = 86%, reduce = 100% INFO : Ended Job = job_local107155352_0001 INFO : Loading data to table default.buckettestoutput1 from file:/user/hive/warehouse/buckettestoutput1/.hive-staging_hive_2015-06-01_11-09-28_166_3109203968904090801-1/-ext-1 INFO : Table default.buckettestoutput1 stats: [numFiles=1, numRows=4, totalSize=52, rawDataSize=48] No rows affected (1.692 seconds) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10943) Beeline-cli: Enable precommit for beelie-cli branch
[ https://issues.apache.org/jira/browse/HIVE-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579076#comment-14579076 ] Sergio Peña commented on HIVE-10943: The variable BEELINE-CLI_URL is not valid for shell scripts because shell variable names cannot contain hyphens. Could you try with BEELINE_CLI_URL? {code} $ BEELINE-CLI_URL=url BEELINE-CLI_URL=url: command not found $ BEELINE_CLI_URL=url $ {code} Beeline-cli: Enable precommit for beelie-cli branch Key: HIVE-10943 URL: https://issues.apache.org/jira/browse/HIVE-10943 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10943.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10866) Throw error when client try to insert into bucketed table
[ https://issues.apache.org/jira/browse/HIVE-10866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-10866: Attachment: HIVE-10866.2.patch Throw error when client try to insert into bucketed table - Key: HIVE-10866 URL: https://issues.apache.org/jira/browse/HIVE-10866 Project: Hive Issue Type: Improvement Affects Versions: 1.2.0, 1.3.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-10866.1.patch, HIVE-10866.2.patch Currently, Hive does not support appends (insert into) on bucketed tables; see open JIRA HIVE-3608. When inserting into such a table, the data will be corrupted and no longer fit for bucketmapjoin. We need to find a way to prevent clients from inserting into such tables. Reproduce: {noformat} CREATE TABLE IF NOT EXISTS buckettestoutput1( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; CREATE TABLE IF NOT EXISTS buckettestoutput2( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; set hive.enforce.bucketing = true; set hive.enforce.sorting=true; insert into table buckettestoutput1 select code from sample_07 where total_emp > 134354250 limit 10; After this first insert, I did: set hive.auto.convert.sortmerge.join=true; set hive.optimize.bucketmapjoin = true; set hive.optimize.bucketmapjoin.sortedmerge = true; set hive.auto.convert.sortmerge.join.noconditionaltask=true; 0: jdbc:hive2://localhost:1 select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data); +---+---+ | data | data | +---+---+ +---+---+ So select works fine. 
Second insert: 0: jdbc:hive2://localhost:1 insert into table buckettestoutput1 select code from sample_07 where total_emp = 134354250 limit 10; No rows affected (61.235 seconds) Then select: 0: jdbc:hive2://localhost:1 select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data); Error: Error while compiling statement: FAILED: SemanticException [Error 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number of buckets for table buckettestoutput1 is 2, whereas the number of files is 4 (state=42000,code=10141) 0: jdbc:hive2://localhost:1 {noformat} Insert into empty table or partition will be fine, but insert into the non-empty one (after second insert in the reproduce), the bucketmapjoin will throw an error. We should not let second insert succeed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10963) Hive throws NPE rather than meaningful error message when window is missing
[ https://issues.apache.org/jira/browse/HIVE-10963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579011#comment-14579011 ] Ashutosh Chauhan commented on HIVE-10963: - +1 Hive throws NPE rather than meaningful error message when window is missing --- Key: HIVE-10963 URL: https://issues.apache.org/jira/browse/HIVE-10963 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 1.3.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10963.patch {{select sum(salary) over w1 from emp;}} throws NPE rather than meaningful error message like missing window. And also give the right window name rather than the classname in the error message after NPE issue is fixed. {noformat} org.apache.hadoop.hive.ql.parse.SemanticException: Window Spec org.apache.hadoop.hive.ql.parse.WindowingSpec$WindowSpec@7954e1de refers to an unknown source {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10963) Hive throws NPE rather than meaningful error message when window is missing
[ https://issues.apache.org/jira/browse/HIVE-10963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10963: Attachment: HIVE-10963.patch Hive throws NPE rather than meaningful error message when window is missing --- Key: HIVE-10963 URL: https://issues.apache.org/jira/browse/HIVE-10963 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 1.3.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10963.patch {{select sum(salary) over w1 from emp;}} throws NPE rather than meaningful error message like missing window. And also give the right window name rather than the classname in the error message after NPE issue is fixed. {noformat} org.apache.hadoop.hive.ql.parse.SemanticException: Window Spec org.apache.hadoop.hive.ql.parse.WindowingSpec$WindowSpec@7954e1de refers to an unknown source {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7180) BufferedReader is not closed in MetaStoreSchemaInfo ctor
[ https://issues.apache.org/jira/browse/HIVE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-7180: - Description: Here is related code: {code} BufferedReader bfReader = new BufferedReader(new FileReader(upgradeListFile)); String currSchemaVersion; while ((currSchemaVersion = bfReader.readLine()) != null) { upgradeOrderList.add(currSchemaVersion.trim()); {code} BufferedReader / FileReader should be closed upon return from ctor. was: Here is related code: {code} BufferedReader bfReader = new BufferedReader(new FileReader(upgradeListFile)); String currSchemaVersion; while ((currSchemaVersion = bfReader.readLine()) != null) { upgradeOrderList.add(currSchemaVersion.trim()); {code} BufferedReader / FileReader should be closed upon return from ctor. BufferedReader is not closed in MetaStoreSchemaInfo ctor Key: HIVE-7180 URL: https://issues.apache.org/jira/browse/HIVE-7180 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Ted Yu Assignee: skrho Priority: Minor Labels: patch Attachments: HIVE-7180.patch, HIVE-7180_001.patch Here is related code: {code} BufferedReader bfReader = new BufferedReader(new FileReader(upgradeListFile)); String currSchemaVersion; while ((currSchemaVersion = bfReader.readLine()) != null) { upgradeOrderList.add(currSchemaVersion.trim()); {code} BufferedReader / FileReader should be closed upon return from ctor. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
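The fix this report asks for is the standard try-with-resources pattern. A minimal sketch, with illustrative class and method names (not the actual MetaStoreSchemaInfo code), taking a Reader so it is testable without a file:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

public class UpgradeListReader {
    // try-with-resources closes the BufferedReader (and the underlying
    // FileReader it would wrap in the ctor) even when readLine() throws.
    static List<String> readUpgradeOrder(Reader source) throws IOException {
        List<String> upgradeOrderList = new ArrayList<>();
        try (BufferedReader bfReader = new BufferedReader(source)) {
            String currSchemaVersion;
            while ((currSchemaVersion = bfReader.readLine()) != null) {
                upgradeOrderList.add(currSchemaVersion.trim());
            }
        }
        return upgradeOrderList;
    }
}
```

In the real ctor the source would be `new FileReader(upgradeListFile)`; the try-with-resources block is the only change to the loop shown in the report.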
[jira] [Updated] (HIVE-7172) Potential resource leak in HiveSchemaTool#getMetaStoreSchemaVersion()
[ https://issues.apache.org/jira/browse/HIVE-7172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-7172: - Description: {code} ResultSet res = stmt.executeQuery(versionQuery); if (!res.next()) { throw new HiveMetaException("Didn't find version data in metastore"); } String currentSchemaVersion = res.getString(1); metastoreConn.close(); {code} When HiveMetaException is thrown, metastoreConn.close() would be skipped. stmt is not closed upon return from the method. was: {code} ResultSet res = stmt.executeQuery(versionQuery); if (!res.next()) { throw new HiveMetaException("Didn't find version data in metastore"); } String currentSchemaVersion = res.getString(1); metastoreConn.close(); {code} When HiveMetaException is thrown, metastoreConn.close() would be skipped. stmt is not closed upon return from the method. Potential resource leak in HiveSchemaTool#getMetaStoreSchemaVersion() - Key: HIVE-7172 URL: https://issues.apache.org/jira/browse/HIVE-7172 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: HIVE-7172.patch {code} ResultSet res = stmt.executeQuery(versionQuery); if (!res.next()) { throw new HiveMetaException("Didn't find version data in metastore"); } String currentSchemaVersion = res.getString(1); metastoreConn.close(); {code} When HiveMetaException is thrown, metastoreConn.close() would be skipped. stmt is not closed upon return from the method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
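The fix is to close both resources in a finally block so a thrown HiveMetaException cannot skip them. A sketch of the pattern only, using stand-in AutoCloseable objects instead of live JDBC types (the names and the FakeResource class are illustrative, not Hive's code):

```java
public class VersionQueryCleanup {
    // Stand-in for a java.sql.Statement / Connection; records whether
    // close() was called so the pattern can be verified without a database.
    static class FakeResource implements AutoCloseable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    // Shape of the suggested fix for getMetaStoreSchemaVersion():
    // the finally block runs whether the version row is found or not.
    static String getSchemaVersion(FakeResource stmt, FakeResource conn,
                                   String versionRow) throws Exception {
        try {
            if (versionRow == null) {
                throw new Exception("Didn't find version data in metastore");
            }
            return versionRow;
        } finally {
            stmt.close();
            conn.close();
        }
    }
}
```

With real JDBC objects the same shape applies: close the Statement (which also closes its ResultSet) and the Connection in finally, or use try-with-resources.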
[jira] [Updated] (HIVE-7305) Return value from in.read() is ignored in SerializationUtils#readLongLE()
[ https://issues.apache.org/jira/browse/HIVE-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-7305: - Description: {code} long readLongLE(InputStream in) throws IOException { in.read(readBuffer, 0, 8); return (((readBuffer[0] & 0xff) << 0) + ((readBuffer[1] & 0xff) << 8) {code} Return value from read() may indicate fewer than 8 bytes read. The return value should be checked. was: {code} long readLongLE(InputStream in) throws IOException { in.read(readBuffer, 0, 8); return (((readBuffer[0] & 0xff) << 0) + ((readBuffer[1] & 0xff) << 8) {code} Return value from read() may indicate fewer than 8 bytes read. The return value should be checked. Return value from in.read() is ignored in SerializationUtils#readLongLE() - Key: HIVE-7305 URL: https://issues.apache.org/jira/browse/HIVE-7305 Project: Hive Issue Type: Bug Reporter: Ted Yu Assignee: skrho Priority: Minor Attachments: HIVE-7305_001.patch {code} long readLongLE(InputStream in) throws IOException { in.read(readBuffer, 0, 8); return (((readBuffer[0] & 0xff) << 0) + ((readBuffer[1] & 0xff) << 8) {code} Return value from read() may indicate fewer than 8 bytes read. The return value should be checked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
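A hedged sketch of the kind of fix the report calls for: loop until all 8 bytes have arrived (or EOF) instead of trusting a single read() call, then assemble the little-endian value as the truncated snippet does. This is not the actual Hive patch, just the checked-read pattern:

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class LongLEReader {
    static long readLongLE(InputStream in) throws IOException {
        byte[] readBuffer = new byte[8];
        int off = 0;
        // read() may return fewer than 8 bytes; keep reading until full
        while (off < 8) {
            int n = in.read(readBuffer, off, 8 - off);
            if (n < 0) {
                throw new EOFException("stream ended after " + off + " of 8 bytes");
            }
            off += n;
        }
        long value = 0;
        for (int i = 0; i < 8; i++) {
            value |= (readBuffer[i] & 0xffL) << (8 * i);  // little-endian
        }
        return value;
    }
}
```

The same effect can be had with `java.io.DataInputStream.readFully(byte[])`, which throws EOFException on a short read.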
[jira] [Updated] (HIVE-10956) HS2 leaks HMS connections
[ https://issues.apache.org/jira/browse/HIVE-10956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-10956: --- Attachment: HIVE-10956.2.patch Attached patch v2. It is also on RB: https://reviews.apache.org/r/35256/ The new patch closes the connection when the Hive session is closed. HS2 leaks HMS connections - Key: HIVE-10956 URL: https://issues.apache.org/jira/browse/HIVE-10956 Project: Hive Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 1.3.0, 2.0.0 Attachments: HIVE-10956.1.patch, HIVE-10956.2.patch HS2 uses threadlocal to cache HMS client in class Hive. When the thread is dead, the HMS client is not closed. So the connection to the HMS is leaked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
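The patch note describes closing the cached client when the Hive session closes rather than waiting for thread death. A minimal sketch of that idea with a threadlocal cache; the class and method names are illustrative stand-ins, not HS2's actual API:

```java
public class SessionClientCache {
    // Stand-in for the HMS client cached per thread in class Hive.
    static class MetaStoreClient implements AutoCloseable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    private static final ThreadLocal<MetaStoreClient> CACHE =
        ThreadLocal.withInitial(MetaStoreClient::new);

    static MetaStoreClient get() { return CACHE.get(); }

    // Called when the owning session closes: release the HMS connection
    // explicitly instead of leaking it when the thread dies.
    static void closeCurrent() {
        MetaStoreClient c = CACHE.get();
        c.close();       // close the connection
        CACHE.remove();  // drop the threadlocal entry
    }
}
```

A subsequent get() on the same thread creates a fresh client, so an explicit session-close hook is safe to call eagerly.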
[jira] [Commented] (HIVE-10963) Hive throws NPE rather than meaningful error message when window is missing
[ https://issues.apache.org/jira/browse/HIVE-10963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579215#comment-14579215 ] Hive QA commented on HIVE-10963: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738596/HIVE-10963.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9005 tests executed *Failed tests:* {noformat} org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4226/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4226/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4226/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12738596 - PreCommit-HIVE-TRUNK-Build Hive throws NPE rather than meaningful error message when window is missing --- Key: HIVE-10963 URL: https://issues.apache.org/jira/browse/HIVE-10963 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 1.3.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10963.patch {{select sum(salary) over w1 from emp;}} throws NPE rather than meaningful error message like missing window. And also give the right window name rather than the classname in the error message after NPE issue is fixed. 
{noformat} org.apache.hadoop.hive.ql.parse.SemanticException: Window Spec org.apache.hadoop.hive.ql.parse.WindowingSpec$WindowSpec@7954e1de refers to an unknown source {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10882) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filterMap of join operator causes NPE exception
[ https://issues.apache.org/jira/browse/HIVE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579242#comment-14579242 ] Pengcheng Xiong commented on HIVE-10882: [~jcamachorodriguez], I have started but I have not figured out a solution yet. Thus, please go ahead and take it as I am busy with UT failures these days. Also ccing [~jpullokkaran]. Thanks. CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filterMap of join operator causes NPE exception -- Key: HIVE-10882 URL: https://issues.apache.org/jira/browse/HIVE-10882 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong CBO return path creates join operator with empty filters. However, vectorization is checking the filters of bigTable in join. This causes NPE exception. To reproduce, run vector_outer_join2.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10882) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filterMap of join operator causes NPE exception
[ https://issues.apache.org/jira/browse/HIVE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-10882: --- Assignee: Jesus Camacho Rodriguez (was: Pengcheng Xiong) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filterMap of join operator causes NPE exception -- Key: HIVE-10882 URL: https://issues.apache.org/jira/browse/HIVE-10882 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Jesus Camacho Rodriguez CBO return path creates join operator with empty filters. However, vectorization is checking the filters of bigTable in join. This causes NPE exception. To reproduce, run vector_outer_join2.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7598) Potential null pointer dereference in MergeTask#closeJob()
[ https://issues.apache.org/jira/browse/HIVE-7598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-7598: - Description: Call to Utilities.mvFileToFinalPath() passes null as second last parameter, conf. null gets passed to createEmptyBuckets() which dereferences conf directly: {code} boolean isCompressed = conf.getCompressed(); TableDesc tableInfo = conf.getTableInfo(); {code} was: Call to Utilities.mvFileToFinalPath() passes null as second last parameter, conf. null gets passed to createEmptyBuckets() which dereferences conf directly: {code} boolean isCompressed = conf.getCompressed(); TableDesc tableInfo = conf.getTableInfo(); {code} Potential null pointer dereference in MergeTask#closeJob() -- Key: HIVE-7598 URL: https://issues.apache.org/jira/browse/HIVE-7598 Project: Hive Issue Type: Bug Reporter: Ted Yu Assignee: SUYEON LEE Priority: Minor Attachments: HIVE-7598.patch Call to Utilities.mvFileToFinalPath() passes null as second last parameter, conf. null gets passed to createEmptyBuckets() which dereferences conf directly: {code} boolean isCompressed = conf.getCompressed(); TableDesc tableInfo = conf.getTableInfo(); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
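The report implies either the caller should pass a real conf or createEmptyBuckets() should tolerate null. A sketch of the defensive option, with stand-in types (TableDesc/FileSinkConf here are simplified placeholders, not Hive's real classes):

```java
public class EmptyBucketsGuard {
    static class TableDesc { }
    // Stand-in for the conf object that MergeTask#closeJob() passes along.
    static class FileSinkConf {
        boolean getCompressed() { return false; }
        TableDesc getTableInfo() { return new TableDesc(); }
    }

    // Guarding the dereference avoids the NPE when a caller passes null,
    // as closeJob() does in this report.
    static boolean createEmptyBuckets(FileSinkConf conf) {
        if (conf == null) {
            return false;  // nothing to create without a conf
        }
        boolean isCompressed = conf.getCompressed();
        TableDesc tableInfo = conf.getTableInfo();
        return tableInfo != null && !isCompressed;
    }
}
```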
[jira] [Updated] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()
[ https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-7150: - Description: Here is related code: {code} sslTrustStore.load(new FileInputStream(sslTrustStorePath), sslTrustStorePassword.toCharArray()); {code} The FileInputStream is not closed upon returning from the method. was: Here is related code: {code} sslTrustStore.load(new FileInputStream(sslTrustStorePath), sslTrustStorePassword.toCharArray()); {code} The FileInputStream is not closed upon returning from the method. FileInputStream is not closed in HiveConnection#getHttpClient() --- Key: HIVE-7150 URL: https://issues.apache.org/jira/browse/HIVE-7150 Project: Hive Issue Type: Bug Reporter: Ted Yu Labels: jdbc Attachments: HIVE-7150.1.patch, HIVE-7150.2.patch Here is related code: {code} sslTrustStore.load(new FileInputStream(sslTrustStorePath), sslTrustStorePassword.toCharArray()); {code} The FileInputStream is not closed upon returning from the method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
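The fix is to hold the stream in a variable and close it; with try-with-resources that looks like the sketch below. The keystore type and password handling are placeholders, not HiveConnection's exact code:

```java
import java.io.FileInputStream;
import java.security.KeyStore;

public class TrustStoreLoader {
    static KeyStore loadTrustStore(String sslTrustStorePath,
                                   char[] sslTrustStorePassword) throws Exception {
        KeyStore sslTrustStore = KeyStore.getInstance(KeyStore.getDefaultType());
        // try-with-resources closes the FileInputStream even if load() throws
        try (FileInputStream fis = new FileInputStream(sslTrustStorePath)) {
            sslTrustStore.load(fis, sslTrustStorePassword);
        }
        return sslTrustStore;
    }
}
```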
[jira] [Commented] (HIVE-10943) Beeline-cli: Enable precommit for beelie-cli branch
[ https://issues.apache.org/jira/browse/HIVE-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579420#comment-14579420 ] Sergio Peña commented on HIVE-10943: +1 I created the job in jenkins, and added the new properties file to the jenkins instance. I think I need to restart jenkins, I'll wait until there are no more jobs running. Beeline-cli: Enable precommit for beelie-cli branch Key: HIVE-10943 URL: https://issues.apache.org/jira/browse/HIVE-10943 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10943.patch, HIVE-10943.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements
[ https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran reassigned HIVE-10841: - Assignee: Laljo John Pullokkaran (was: Alexander Pivovarov) [WHERE col is not null] does not work sometimes for queries with many JOIN statements - Key: HIVE-10841 URL: https://issues.apache.org/jira/browse/HIVE-10841 Project: Hive Issue Type: Bug Components: Query Planning, Query Processor Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0, 1.3.0 Reporter: Alexander Pivovarov Assignee: Laljo John Pullokkaran Attachments: HIVE-10841.1.patch, HIVE-10841.patch The result from the following SELECT query is 3 rows but it should be 1 row. I checked it in MySQL - it returned 1 row. To reproduce the issue in Hive 1. prepare tables {code} drop table if exists L; drop table if exists LA; drop table if exists FR; drop table if exists A; drop table if exists PI; drop table if exists acct; create table L as select 4436 id; create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id; create table FR as select 4436 loan_id; create table A as select 4748 id; create table PI as select 4415 id; create table acct as select 4748 aid, 10 acc_n, 122 brn; insert into table acct values(4748, null, null); insert into table acct values(4748, null, null); {code} 2. run SELECT query {code} select acct.ACC_N, acct.brn FROM L JOIN LA ON L.id = LA.loan_id JOIN FR ON L.id = FR.loan_id JOIN A ON LA.aid = A.id JOIN PI ON PI.id = LA.pi_id JOIN acct ON A.id = acct.aid WHERE L.id = 4436 and acct.brn is not null; {code} the result is 3 rows {code} 10 122 NULL NULL NULL NULL {code} but it should be 1 row {code} 10 122 {code} 2.1 explain select ... 
output for hive-1.3.0 MR {code} STAGE DEPENDENCIES: Stage-12 is a root stage Stage-9 depends on stages: Stage-12 Stage-0 depends on stages: Stage-9 STAGE PLANS: Stage: Stage-12 Map Reduce Local Work Alias - Map Local Tables: a Fetch Operator limit: -1 acct Fetch Operator limit: -1 fr Fetch Operator limit: -1 l Fetch Operator limit: -1 pi Fetch Operator limit: -1 Alias - Map Local Operator Tree: a TableScan alias: a Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col5 (type: int) 1 id (type: int) 2 aid (type: int) acct TableScan alias: acct Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: aid is not null (type: boolean) Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col5 (type: int) 1 id (type: int) 2 aid (type: int) fr TableScan alias: fr Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (loan_id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) l TableScan alias: l Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) pi TableScan alias: pi Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
[jira] [Commented] (HIVE-10968) Windows: analyze json table via beeline failed throwing Class org.apache.hive.hcatalog.data.JsonSerDe not found
[ https://issues.apache.org/jira/browse/HIVE-10968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579286#comment-14579286 ] Thejas M Nair commented on HIVE-10968: -- +1 Thanks for the patch Hari! Windows: analyze json table via beeline failed throwing Class org.apache.hive.hcatalog.data.JsonSerDe not found --- Key: HIVE-10968 URL: https://issues.apache.org/jira/browse/HIVE-10968 Project: Hive Issue Type: Bug Components: HiveServer2 Environment: Windows Reporter: Takahiko Saito Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.1 Attachments: HIVE-10968.1.patch NO PRECOMMIT TESTS Run the following via beeline: {noformat}0: jdbc:hive2://localhost:10001 analyze table all100kjson compute statistics; 15/06/05 20:44:11 INFO log.PerfLogger: PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver 15/06/05 20:44:11 INFO parse.ParseDriver: Parsing command: analyze table all100kjson compute statistics 15/06/05 20:44:11 INFO parse.ParseDriver: Parse Completed 15/06/05 20:44:11 INFO log.PerfLogger: /PERFLOG method=parse start=1433537051075 end=1433537051077 duration=2 from=org. 
apache.hadoop.hive.ql.Driver 15/06/05 20:44:11 INFO log.PerfLogger: PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver 15/06/05 20:44:11 INFO parse.ColumnStatsSemanticAnalyzer: Invoking analyze on original query 15/06/05 20:44:11 INFO parse.ColumnStatsSemanticAnalyzer: Starting Semantic Analysis 15/06/05 20:44:11 INFO parse.ColumnStatsSemanticAnalyzer: Completed phase 1 of Semantic Analysis 15/06/05 20:44:11 INFO parse.ColumnStatsSemanticAnalyzer: Get metadata for source tables 15/06/05 20:44:11 INFO metastore.HiveMetaStore: 5: get_table : db=default tbl=all100kjson 15/06/05 20:44:11 INFO HiveMetaStore.audit: ugi=hadoopqa ip=unknown-ip-addr cmd=get_table : db=default tbl=a ll100kjson 15/06/05 20:44:11 INFO metastore.HiveMetaStore: 5: get_table : db=default tbl=all100kjson 15/06/05 20:44:11 INFO HiveMetaStore.audit: ugi=hadoopqa ip=unknown-ip-addr cmd=get_table : db=default tbl=a ll100kjson 15/06/05 20:44:11 INFO parse.ColumnStatsSemanticAnalyzer: Get metadata for subqueries 15/06/05 20:44:11 INFO parse.ColumnStatsSemanticAnalyzer: Get metadata for destination tables 15/06/05 20:44:11 INFO parse.ColumnStatsSemanticAnalyzer: Completed getting MetaData in Semantic Analysis 15/06/05 20:44:11 INFO common.FileUtils: Creating directory if it doesn't exist: hdfs://dal-hs211:8020/user/hcat/tests/d ata/all100kjson/.hive-staging_hive_2015-06-05_20-44-11_075_4520028480897676073-5 15/06/05 20:44:11 INFO parse.ColumnStatsSemanticAnalyzer: Set stats collection dir : hdfs://dal-hs211:8020/user/hcat/tes ts/data/all100kjson/.hive-staging_hive_2015-06-05_20-44-11_075_4520028480897676073-5/-ext-1 15/06/05 20:44:11 INFO ppd.OpProcFactory: Processing for TS(5) 15/06/05 20:44:11 INFO log.PerfLogger: PERFLOG method=partition-retrieving from=org.apache.hadoop.hive.ql.optimizer.ppr .PartitionPruner 15/06/05 20:44:11 INFO log.PerfLogger: /PERFLOG method=partition-retrieving start=1433537051345 end=1433537051345 durat ion=0 
from=org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner 15/06/05 20:44:11 INFO metastore.HiveMetaStore: 5: get_indexes : db=default tbl=all100kjson 15/06/05 20:44:11 INFO HiveMetaStore.audit: ugi=hadoopqa ip=unknown-ip-addr cmd=get_indexes : db=default tbl =all100kjson 15/06/05 20:44:11 INFO metastore.HiveMetaStore: 5: get_indexes : db=default tbl=all100kjson 15/06/05 20:44:11 INFO HiveMetaStore.audit: ugi=hadoopqa ip=unknown-ip-addr cmd=get_indexes : db=default tbl =all100kjson 15/06/05 20:44:11 INFO physical.NullScanTaskDispatcher: Looking for table scans where optimization is applicable 15/06/05 20:44:11 INFO physical.NullScanTaskDispatcher: Found 0 null table scans 15/06/05 20:44:11 INFO physical.NullScanTaskDispatcher: Looking for table scans where optimization is applicable 15/06/05 20:44:11 INFO physical.NullScanTaskDispatcher: Found 0 null table scans 15/06/05 20:44:11 INFO physical.NullScanTaskDispatcher: Looking for table scans where optimization is applicable 15/06/05 20:44:11 INFO physical.NullScanTaskDispatcher: Found 0 null table scans 15/06/05 20:44:11 INFO physical.Vectorizer: Validating MapWork... 15/06/05 20:44:11 INFO physical.Vectorizer: Input format: org.apache.hadoop.mapred.TextInputFormat, doesn't provide vect orized input 15/06/05 20:44:11 INFO parse.ColumnStatsSemanticAnalyzer: Completed plan generation 15/06/05 20:44:11 INFO ql.Driver: Semantic Analysis Completed 15/06/05 20:44:11 INFO log.PerfLogger: /PERFLOG
[jira] [Updated] (HIVE-10966) direct SQL for stats has a cast exception on some databases
[ https://issues.apache.org/jira/browse/HIVE-10966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-10966: - Attachment: HIVE-10966.patch Uploading patch for kicking off another test run. direct SQL for stats has a cast exception on some databases --- Key: HIVE-10966 URL: https://issues.apache.org/jira/browse/HIVE-10966 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 1.3.0, 1.2.1, 2.0.0 Attachments: HIVE-10966.patch, HIVE-10966.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10866) Throw error when client try to insert into bucketed table
[ https://issues.apache.org/jira/browse/HIVE-10866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579366#comment-14579366 ] Hive QA commented on HIVE-10866: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738599/HIVE-10866.2.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9005 tests executed *Failed tests:* {noformat} org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4227/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4227/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4227/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12738599 - PreCommit-HIVE-TRUNK-Build Throw error when client try to insert into bucketed table - Key: HIVE-10866 URL: https://issues.apache.org/jira/browse/HIVE-10866 Project: Hive Issue Type: Improvement Affects Versions: 1.2.0, 1.3.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-10866.1.patch, HIVE-10866.2.patch Currently, Hive does not support appends (insert into) to bucketed tables; see open JIRA HIVE-3608. When inserting into such a table, the data becomes corrupted and unfit for bucketmapjoin. 
We need to find a way to prevent clients from inserting into such tables. Reproduce: {noformat} CREATE TABLE IF NOT EXISTS buckettestoutput1( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; CREATE TABLE IF NOT EXISTS buckettestoutput2( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; set hive.enforce.bucketing = true; set hive.enforce.sorting=true; insert into table buckettestoutput1 select code from sample_07 where total_emp 134354250 limit 10; After this first insert, I did: set hive.auto.convert.sortmerge.join=true; set hive.optimize.bucketmapjoin = true; set hive.optimize.bucketmapjoin.sortedmerge = true; set hive.auto.convert.sortmerge.join.noconditionaltask=true; 0: jdbc:hive2://localhost:1 select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data); +---+---+ | data | data | +---+---+ +---+---+ So select works fine. Second insert: 0: jdbc:hive2://localhost:1 insert into table buckettestoutput1 select code from sample_07 where total_emp = 134354250 limit 10; No rows affected (61.235 seconds) Then select: 0: jdbc:hive2://localhost:1 select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data); Error: Error while compiling statement: FAILED: SemanticException [Error 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number of buckets for table buckettestoutput1 is 2, whereas the number of files is 4 (state=42000,code=10141) 0: jdbc:hive2://localhost:1 {noformat} Inserting into an empty table or partition is fine, but after inserting into a non-empty one (the second insert in the reproduction above), the bucketmapjoin throws an error. We should not let the second insert succeed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10943) Beeline-cli: Enable precommit for beelie-cli branch
[ https://issues.apache.org/jira/browse/HIVE-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-10943: --- Attachment: HIVE-10943.patch Fixed the typo in the patch and reattached. Beeline-cli: Enable precommit for beelie-cli branch Key: HIVE-10943 URL: https://issues.apache.org/jira/browse/HIVE-10943 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10943.patch, HIVE-10943.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements
[ https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579397#comment-14579397 ] Laljo John Pullokkaran commented on HIVE-10841: --- [~a.semyannikov] I hope you don't mind; i have assigned the bug to myself. [WHERE col is not null] does not work sometimes for queries with many JOIN statements - Key: HIVE-10841 URL: https://issues.apache.org/jira/browse/HIVE-10841 Project: Hive Issue Type: Bug Components: Query Planning, Query Processor Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0, 1.3.0 Reporter: Alexander Pivovarov Assignee: Laljo John Pullokkaran Attachments: HIVE-10841.1.patch, HIVE-10841.patch The result from the following SELECT query is 3 rows but it should be 1 row. I checked it in MySQL - it returned 1 row. To reproduce the issue in Hive 1. prepare tables {code} drop table if exists L; drop table if exists LA; drop table if exists FR; drop table if exists A; drop table if exists PI; drop table if exists acct; create table L as select 4436 id; create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id; create table FR as select 4436 loan_id; create table A as select 4748 id; create table PI as select 4415 id; create table acct as select 4748 aid, 10 acc_n, 122 brn; insert into table acct values(4748, null, null); insert into table acct values(4748, null, null); {code} 2. run SELECT query {code} select acct.ACC_N, acct.brn FROM L JOIN LA ON L.id = LA.loan_id JOIN FR ON L.id = FR.loan_id JOIN A ON LA.aid = A.id JOIN PI ON PI.id = LA.pi_id JOIN acct ON A.id = acct.aid WHERE L.id = 4436 and acct.brn is not null; {code} the result is 3 rows {code} 10122 NULL NULL NULL NULL {code} but it should be 1 row {code} 10122 {code} 2.1 explain select ... 
output for hive-1.3.0 MR {code} STAGE DEPENDENCIES: Stage-12 is a root stage Stage-9 depends on stages: Stage-12 Stage-0 depends on stages: Stage-9 STAGE PLANS: Stage: Stage-12 Map Reduce Local Work Alias - Map Local Tables: a Fetch Operator limit: -1 acct Fetch Operator limit: -1 fr Fetch Operator limit: -1 l Fetch Operator limit: -1 pi Fetch Operator limit: -1 Alias - Map Local Operator Tree: a TableScan alias: a Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col5 (type: int) 1 id (type: int) 2 aid (type: int) acct TableScan alias: acct Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: aid is not null (type: boolean) Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col5 (type: int) 1 id (type: int) 2 aid (type: int) fr TableScan alias: fr Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (loan_id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) l TableScan alias: l Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) pi TableScan alias: pi Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4
[jira] [Commented] (HIVE-10929) In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values
[ https://issues.apache.org/jira/browse/HIVE-10929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579525#comment-14579525 ] Vaibhav Gumashta commented on HIVE-10929: - Also committed to branch-1. In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values -- Key: HIVE-10929 URL: https://issues.apache.org/jira/browse/HIVE-10929 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 1.3.0, 1.2.1, 2.0.0 Attachments: HIVE-10929.1.patch, HIVE-10929.2.patch, HIVE-10929.3.patch, HIVE-10929.4.patch {code} create table dummy(i int); insert into table dummy values (1); select * from dummy; create table partunion1(id1 int) partitioned by (part1 string); set hive.exec.dynamic.partition.mode=nonstrict; set hive.execution.engine=tez; explain insert into table partunion1 partition(part1) select temps.* from ( select 1 as id1, '2014' as part1 from dummy union all select 2 as id1, '2014' as part1 from dummy ) temps; insert into table partunion1 partition(part1) select temps.* from ( select 1 as id1, '2014' as part1 from dummy union all select 2 as id1, '2014' as part1 from dummy ) temps; select * from partunion1; {code} fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10974) Use Configuration::getRaw() for the Base64 data
[ https://issues.apache.org/jira/browse/HIVE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-10974: --- Attachment: HIVE-10974.1.patch Use Configuration::getRaw() for the Base64 data --- Key: HIVE-10974 URL: https://issues.apache.org/jira/browse/HIVE-10974 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-10974.1.patch Inspired by the Twitter HadoopSummit talk {code} if (HiveConf.getBoolVar(conf, ConfVars.HIVE_RPC_QUERY_PLAN)) { LOG.debug("Loading plan from string: " + path.toUri().getPath()); String planString = conf.get(path.toUri().getPath()); {code} Use getRaw() in other places where Base64 data is present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10974) Use Configuration::getRaw() for the Base64 data
[ https://issues.apache.org/jira/browse/HIVE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-10974: --- Summary: Use Configuration::getRaw() for the Base64 data (was: Use Cofiguration::getRaw() for the Base64 data) Use Configuration::getRaw() for the Base64 data --- Key: HIVE-10974 URL: https://issues.apache.org/jira/browse/HIVE-10974 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-10974.1.patch Inspired by the Twitter HadoopSummit talk {code} if (HiveConf.getBoolVar(conf, ConfVars.HIVE_RPC_QUERY_PLAN)) { LOG.debug("Loading plan from string: " + path.toUri().getPath()); String planString = conf.get(path.toUri().getPath()); {code} Use getRaw() in other places where Base64 data is present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
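The point of getRaw() above is that Configuration.get() performs ${var} substitution on the value, which is wasted work on a large Base64 plan string and can even rewrite the payload if it happens to contain a ${...}-like sequence. A minimal self-contained sketch of the difference, using a toy property store rather than Hadoop's actual Configuration class (MiniConf and its methods are illustrative stand-ins, not Hadoop's API):

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a configuration store that, like Hadoop's Configuration,
// expands ${var} references in get() but returns the stored value verbatim
// from getRaw(). Illustrative only -- not the real Hadoop API.
class MiniConf {
    private final Map<String, String> props = new HashMap<>();

    public void set(String key, String value) {
        props.put(key, value);
    }

    // Expands ${var} references -- extra work for Base64 data, and wrong if
    // the payload accidentally contains a substitution-like sequence.
    public String get(String key) {
        String v = props.get(key);
        if (v == null) return null;
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < v.length()) {
            int start = v.indexOf("${", i);
            if (start < 0) { out.append(v.substring(i)); break; }
            int end = v.indexOf('}', start);
            if (end < 0) { out.append(v.substring(i)); break; }
            out.append(v, i, start);
            String sub = props.get(v.substring(start + 2, end));
            out.append(sub != null ? sub : "");
            i = end + 1;
        }
        return out.toString();
    }

    // Returns the stored value verbatim -- the right call for Base64 plans.
    public String getRaw(String key) {
        return props.get(key);
    }
}
```

With a value like "AAAA${user.name}BBBB", get() rewrites the payload while getRaw() returns it untouched, which is the behavior the patch wants everywhere Base64 plan data is read.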
[jira] [Commented] (HIVE-6991) History not able to disable/enable after session started
[ https://issues.apache.org/jira/browse/HIVE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579438#comment-14579438 ] Chinna Rao Lalam commented on HIVE-6991: Hi [~jxiang], Yes.. I have tested it in my live cluster by disabling and enabling this property, in CLI and HiveServer2. It is working as expected. Thanks for the review. History not able to disable/enable after session started Key: HIVE-6991 URL: https://issues.apache.org/jira/browse/HIVE-6991 Project: Hive Issue Type: Bug Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-6991.1.patch, HIVE-6991.2.patch, HIVE-6991.patch By default history is disabled; enabling history after the session has started via set hive.session.history.enabled=true does not take effect. I think it will help with this user query http://mail-archives.apache.org/mod_mbox/hive-user/201404.mbox/%3ccajqy7afapa_pjs6buon0o8zyt2qwfn2wt-mtznwfmurav_8...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements
[ https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579530#comment-14579530 ] Alexander Pivovarov commented on HIVE-10841: Sure, np. Btw, my apache id is apivovarov [WHERE col is not null] does not work sometimes for queries with many JOIN statements - Key: HIVE-10841 URL: https://issues.apache.org/jira/browse/HIVE-10841 Project: Hive Issue Type: Bug Components: Query Planning, Query Processor Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0, 1.3.0 Reporter: Alexander Pivovarov Assignee: Laljo John Pullokkaran Attachments: HIVE-10841.1.patch, HIVE-10841.patch The result from the following SELECT query is 3 rows but it should be 1 row. I checked it in MySQL - it returned 1 row. To reproduce the issue in Hive 1. prepare tables {code} drop table if exists L; drop table if exists LA; drop table if exists FR; drop table if exists A; drop table if exists PI; drop table if exists acct; create table L as select 4436 id; create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id; create table FR as select 4436 loan_id; create table A as select 4748 id; create table PI as select 4415 id; create table acct as select 4748 aid, 10 acc_n, 122 brn; insert into table acct values(4748, null, null); insert into table acct values(4748, null, null); {code} 2. run SELECT query {code} select acct.ACC_N, acct.brn FROM L JOIN LA ON L.id = LA.loan_id JOIN FR ON L.id = FR.loan_id JOIN A ON LA.aid = A.id JOIN PI ON PI.id = LA.pi_id JOIN acct ON A.id = acct.aid WHERE L.id = 4436 and acct.brn is not null; {code} the result is 3 rows {code} 10122 NULL NULL NULL NULL {code} but it should be 1 row {code} 10122 {code} 2.1 explain select ... 
output for hive-1.3.0 MR {code} STAGE DEPENDENCIES: Stage-12 is a root stage Stage-9 depends on stages: Stage-12 Stage-0 depends on stages: Stage-9 STAGE PLANS: Stage: Stage-12 Map Reduce Local Work Alias - Map Local Tables: a Fetch Operator limit: -1 acct Fetch Operator limit: -1 fr Fetch Operator limit: -1 l Fetch Operator limit: -1 pi Fetch Operator limit: -1 Alias - Map Local Operator Tree: a TableScan alias: a Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col5 (type: int) 1 id (type: int) 2 aid (type: int) acct TableScan alias: acct Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: aid is not null (type: boolean) Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col5 (type: int) 1 id (type: int) 2 aid (type: int) fr TableScan alias: fr Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (loan_id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) l TableScan alias: l Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) pi TableScan alias: pi Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
[jira] [Updated] (HIVE-10961) LLAP: ShuffleHandler + Submit work init race condition
[ https://issues.apache.org/jira/browse/HIVE-10961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-10961: -- Attachment: HIVE-10961.1.txt LLAP: ShuffleHandler + Submit work init race condition -- Key: HIVE-10961 URL: https://issues.apache.org/jira/browse/HIVE-10961 Project: Hive Issue Type: Sub-task Affects Versions: llap Reporter: Gopal V Assignee: Siddharth Seth Fix For: llap Attachments: HIVE-10961.1.txt When flexing in a new node, it accepts DAG requests before the shuffle handler is setup, causing fatals {code} DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:2 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1433459966952_0729_1_00, diagnostics=[Task failed, taskId=task_1t at com.google.common.base.Preconditions.checkState(Preconditions.java:145) at org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler.get(ShuffleHandler.java:353) at org.apache.hadoop.hive.llap.daemon.impl.ContainerRunnerImpl.submitWork(ContainerRunnerImpl.java:192) at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.submitWork(LlapDaemon.java:301) at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemonProtocolServerImpl.submitWork(LlapDaemonProtocolServerImpl.java:75) at org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos$LlapDaemonProtocol$2.callBlockingMethod(LlapDaemonProtocolProtos.java:12094) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:972) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2085) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2081) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1654) at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2081) ], TaskAttempt 1 failed, info=[org.apache.hadoop.ipc.RemoteException(java.lang.IllegalStateException): ShuffleHandler must be started before invoking get at com.google.common.base.Preconditions.checkState(Preconditions.java:145) at org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler.get(ShuffleHandler.java:353) at org.apache.hadoop.hive.llap.daemon.impl.ContainerRunnerImpl.submitWork(ContainerRunnerImpl.java:192) at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.submitWork(LlapDaemon.java:301) at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemonProtocolServerImpl.submitWork(LlapDaemonProtocolServerImpl.java:75) at org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos$LlapDaemonProtocol$2.callBlockingMethod(LlapDaemonProtocolProtos.java:12094) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:972) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2085) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
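The stack trace above is a startup-ordering race: submitWork becomes callable before the shuffle handler finishes initializing, so the Preconditions.checkState fires. One common guard, sketched here as a self-contained toy (ToyDaemon and its methods are illustrative, not the actual LLAP daemon classes), is to gate work submission on a startup latch instead of failing fast:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Toy daemon that refuses work until its shuffle handler has started.
// Illustrative sketch only -- not the real LLAP code.
class ToyDaemon {
    private final CountDownLatch shuffleStarted = new CountDownLatch(1);

    // Called once the shuffle handler is fully initialized.
    public void markShuffleStarted() {
        shuffleStarted.countDown();
    }

    // Instead of throwing IllegalStateException immediately (the observed
    // fatal), wait briefly for startup to complete before accepting work.
    public boolean submitWork(String dag, long timeoutMs) {
        try {
            if (!shuffleStarted.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException(
                    "ShuffleHandler must be started before invoking submitWork");
            }
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            return false;
        }
        return true; // work accepted
    }
}
```

A request that arrives during flex-in then blocks for a bounded time rather than killing the task attempt outright.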
[jira] [Updated] (HIVE-10972) DummyTxnManager always locks the current database in shared mode, which is incorrect.
[ https://issues.apache.org/jira/browse/HIVE-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10972: Description: In DummyTxnManager [line 163 | http://grepcode.com/file/repo1.maven.org/maven2/co.cask.cdap/hive-exec/0.13.0/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java#163], it always locks the current database. That is not correct since the current database can be db1, and the query can be select * from db2.tb1, which will lock db1 unnecessarily. was: In DummyTxnManager [line 163 | http://grepcode.com/file/repo1.maven.org/maven2/co.cask.cdap/hive-exec/0.13.0/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java#163], it always locks the current database. That is not correct since the current database can be db1, and the query can be select * from db2.tb1, which will lock db1 unnessary. DummyTxnManager always locks the current database in shared mode, which is incorrect. - Key: HIVE-10972 URL: https://issues.apache.org/jira/browse/HIVE-10972 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu In DummyTxnManager [line 163 | http://grepcode.com/file/repo1.maven.org/maven2/co.cask.cdap/hive-exec/0.13.0/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java#163], it always locks the current database. That is not correct since the current database can be db1, and the query can be select * from db2.tb1, which will lock db1 unnecessarily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
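The fix direction the description implies is to derive the shared-lock target from the databases the query actually references, not from the session's current database. A minimal sketch of that resolution step (LockTargets and databasesToLock are hypothetical helpers, not the DummyTxnManager API):

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy resolver: given the session's current database and the table names a
// query references, return the databases that actually need a shared lock.
// Hypothetical helper for illustration, not Hive's real lock-manager code.
class LockTargets {
    public static Set<String> databasesToLock(String currentDb, List<String> referencedTables) {
        Set<String> dbs = new LinkedHashSet<>();
        for (String t : referencedTables) {
            int dot = t.indexOf('.');
            // "db2.tb1" -> lock db2; an unqualified "tb1" -> lock currentDb.
            dbs.add(dot >= 0 ? t.substring(0, dot) : currentDb);
        }
        return dbs;
    }
}
```

For the example in the report, a session on db1 running select * from db2.tb1 would lock only db2, avoiding the unnecessary shared lock on db1.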
[jira] [Updated] (HIVE-10956) HS2 leaks HMS connections
[ https://issues.apache.org/jira/browse/HIVE-10956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-10956: --- Attachment: HIVE-10956.3.patch HS2 leaks HMS connections - Key: HIVE-10956 URL: https://issues.apache.org/jira/browse/HIVE-10956 Project: Hive Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 1.3.0, 2.0.0 Attachments: HIVE-10956.1.patch, HIVE-10956.2.patch, HIVE-10956.3.patch HS2 uses a ThreadLocal in class Hive to cache the HMS client. When the thread dies, the HMS client is never closed, so the connection to the HMS is leaked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
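The leak pattern above can be shown in a self-contained sketch: a ThreadLocal-cached client becomes unreachable once its thread dies, but nothing ever calls close() on it. Keeping a central registry of handed-out clients so a cleanup pass can close stragglers is one way out (ClientCache below is an illustrative toy, not Hive's actual Hive/HiveMetaStoreClient code):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy metastore-client cache illustrating the leak: each thread caches a
// client in a ThreadLocal, and when the thread dies the client is never
// closed. A central registry lets a cleanup pass close stragglers.
// Illustrative sketch only -- not Hive's real code.
class ClientCache {
    static class Client {
        volatile boolean open = true;
        void close() { open = false; }
    }

    // Every client ever handed out, so leaked ones can still be closed.
    private final Set<Client> registry = ConcurrentHashMap.newKeySet();

    private final ThreadLocal<Client> cache = ThreadLocal.withInitial(() -> {
        Client c = new Client();
        registry.add(c);
        return c;
    });

    public Client get() { return cache.get(); }

    // Cleanup pass: close every client that is still open. This kind of
    // bookkeeping is what prevents connections from leaking when worker
    // threads die without closing their cached client.
    public int closeAll() {
        int closed = 0;
        for (Client c : registry) {
            if (c.open) { c.close(); closed++; }
        }
        return closed;
    }
}
```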
[jira] [Commented] (HIVE-10415) hive.start.cleanup.scratchdir configuration is not taking effect
[ https://issues.apache.org/jira/browse/HIVE-10415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579446#comment-14579446 ] Chinna Rao Lalam commented on HIVE-10415: - Hi [~jxiang], Yes.. By default this setting is off. Thanks for the review. hive.start.cleanup.scratchdir configuration is not taking effect Key: HIVE-10415 URL: https://issues.apache.org/jira/browse/HIVE-10415 Project: Hive Issue Type: Bug Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-10415.patch This configuration hive.start.cleanup.scratchdir is not taking effect -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10956) HS2 leaks HMS connections
[ https://issues.apache.org/jira/browse/HIVE-10956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579491#comment-14579491 ] Hive QA commented on HIVE-10956: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738615/HIVE-10956.2.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8990 tests executed *Failed tests:* {noformat} TestEncryptedHDFSCliDriver - did not produce a TEST-*.xml file org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4228/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4228/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4228/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12738615 - PreCommit-HIVE-TRUNK-Build HS2 leaks HMS connections - Key: HIVE-10956 URL: https://issues.apache.org/jira/browse/HIVE-10956 Project: Hive Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 1.3.0, 2.0.0 Attachments: HIVE-10956.1.patch, HIVE-10956.2.patch HS2 uses threadlocal to cache HMS client in class Hive. When the thread is dead, the HMS client is not closed. So the connection to the HMS is leaked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10857) Accumulo storage handler fail throwing java.lang.IllegalArgumentException: Cannot determine SASL mechanism for token class: class org.apache.accumulo.core.client.securi
[ https://issues.apache.org/jira/browse/HIVE-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579452#comment-14579452 ] Sushanth Sowmyan commented on HIVE-10857: - +1. I don't know enough about Accumulo to know that the patch does fix the issue, but the code makes sense if it works the way it reads, and I see that this does not negatively impact the backward-compatible case, and does not negatively impact hive. Also, the relevant .q and unit tests all succeed. Accumulo storage handler fail throwing java.lang.IllegalArgumentException: Cannot determine SASL mechanism for token class: class org.apache.accumulo.core.client.security.tokens.PasswordToken --- Key: HIVE-10857 URL: https://issues.apache.org/jira/browse/HIVE-10857 Project: Hive Issue Type: Bug Affects Versions: 1.2.1 Reporter: Takahiko Saito Assignee: Josh Elser Fix For: 1.2.1 Attachments: HIVE-10857.2.patch, HIVE-10857.patch Creating a table with the Accumulo storage handler fails due to ACCUMULO-2815. {noformat} create table accumulo_1(key string, age int) stored by 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler' with serdeproperties ("accumulo.columns.mapping" = ":rowid,info:age"); {noformat} The error shows: {noformat} FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
MetaException(message:org.apache.accumulo.core.client.AccumuloException: java.lang.IllegalArgumentException: Cannot determine SASL mechanism for token class: class org.apache.accumulo.core.client.security.tokens.PasswordToken at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:67) at org.apache.accumulo.core.client.impl.ConnectorImpl.init(ConnectorImpl.java:67) at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:248) at org.apache.hadoop.hive.accumulo.AccumuloConnectionParameters.getConnector(AccumuloConnectionParameters.java:125) at org.apache.hadoop.hive.accumulo.AccumuloConnectionParameters.getConnector(AccumuloConnectionParameters.java:111) at org.apache.hadoop.hive.accumulo.AccumuloStorageHandler.preCreateTable(AccumuloStorageHandler.java:245) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:664) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:657) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156) at com.sun.proxy.$Proxy5.createTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:714) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4135) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:306) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1650) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1409) at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311) at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409) at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at
[jira] [Commented] (HIVE-10971) count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true
[ https://issues.apache.org/jira/browse/HIVE-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579482#comment-14579482 ] Ashutosh Chauhan commented on HIVE-10971: - LGTM +1 need to update test files though. count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true --- Key: HIVE-10971 URL: https://issues.apache.org/jira/browse/HIVE-10971 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 1.2.0 Reporter: wangmeng Assignee: wangmeng Attachments: HIVE-10971.01.patch When hive.groupby.skewindata=true, the following query based on TPC-H gives wrong results: {code} set hive.groupby.skewindata=true; select l_returnflag, count(*), count(distinct l_linestatus) from lineitem group by l_returnflag limit 10; {code} The query plan shows that it generates only one MapReduce job instead of two theoretically, which is dictated by hive.groupby.skewindata=true. The problem arises only when {noformat}count(*){noformat} and {noformat}count(distinct){noformat} exist together. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10961) LLAP: ShuffleHandler + Submit work init race condition
[ https://issues.apache.org/jira/browse/HIVE-10961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved HIVE-10961. --- Resolution: Fixed LLAP: ShuffleHandler + Submit work init race condition -- Key: HIVE-10961 URL: https://issues.apache.org/jira/browse/HIVE-10961 Project: Hive Issue Type: Sub-task Affects Versions: llap Reporter: Gopal V Assignee: Siddharth Seth Fix For: llap Attachments: HIVE-10961.1.txt When flexing in a new node, it accepts DAG requests before the shuffle handler is set up, causing fatal errors: {code} DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:2 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1433459966952_0729_1_00, diagnostics=[Task failed, taskId=task_1t at com.google.common.base.Preconditions.checkState(Preconditions.java:145) at org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler.get(ShuffleHandler.java:353) at org.apache.hadoop.hive.llap.daemon.impl.ContainerRunnerImpl.submitWork(ContainerRunnerImpl.java:192) at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.submitWork(LlapDaemon.java:301) at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemonProtocolServerImpl.submitWork(LlapDaemonProtocolServerImpl.java:75) at org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos$LlapDaemonProtocol$2.callBlockingMethod(LlapDaemonProtocolProtos.java:12094) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:972) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2085) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2081) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1654) at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2081) ], TaskAttempt 1 failed, info=[org.apache.hadoop.ipc.RemoteException(java.lang.IllegalStateException): ShuffleHandler must be started before invoking get at com.google.common.base.Preconditions.checkState(Preconditions.java:145) at org.apache.hadoop.hive.llap.shufflehandler.ShuffleHandler.get(ShuffleHandler.java:353) at org.apache.hadoop.hive.llap.daemon.impl.ContainerRunnerImpl.submitWork(ContainerRunnerImpl.java:192) at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.submitWork(LlapDaemon.java:301) at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemonProtocolServerImpl.submitWork(LlapDaemonProtocolServerImpl.java:75) at org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos$LlapDaemonProtocol$2.callBlockingMethod(LlapDaemonProtocolProtos.java:12094) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:972) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2085) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
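One common shape for fixing this kind of init race (a guess at the pattern, not the actual patch) is to gate work submission on a startup signal instead of failing a freshly flexed-in node's first DAG. A minimal Python sketch with illustrative names:

```python
import threading

class ShuffleHandlerSketch:
    """Hypothetical stand-in: work must not be accepted until start()
    has completed. Names are illustrative, not Hive's real classes."""
    def __init__(self):
        self._started = threading.Event()

    def start(self):
        # ... bind shuffle port, initialize state ...
        self._started.set()

    def submit_work(self, dag, timeout=1.0):
        # Wait briefly for startup to complete, then reject.
        if not self._started.wait(timeout):
            raise RuntimeError("ShuffleHandler must be started before invoking get")
        return "accepted " + dag

handler = ShuffleHandlerSketch()
handler.start()
print(handler.submit_work("dag_1"))  # accepted dag_1
```

The alternative, registering the RPC endpoint only after the shuffle handler is up, removes the race entirely at the cost of a slightly later service-ready time.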
[jira] [Commented] (HIVE-10956) HS2 leaks HMS connections
[ https://issues.apache.org/jira/browse/HIVE-10956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579580#comment-14579580 ] Xuefu Zhang commented on HIVE-10956: +1 HS2 leaks HMS connections - Key: HIVE-10956 URL: https://issues.apache.org/jira/browse/HIVE-10956 Project: Hive Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 1.3.0, 2.0.0 Attachments: HIVE-10956.1.patch, HIVE-10956.2.patch, HIVE-10956.3.patch HS2 uses a ThreadLocal to cache the HMS client in class Hive. When the thread dies, the HMS client is not closed, so the connection to the HMS is leaked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
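The leak pattern described here (a thread-local cache whose owning thread exits without closing the cached resource) can be sketched in Python; the class and function names below are illustrative, not Hive's actual code:

```python
import threading

class TrackingClient:
    open_count = 0                      # process-wide count of open connections
    def __init__(self):
        TrackingClient.open_count += 1
    def close(self):
        TrackingClient.open_count -= 1

_local = threading.local()

def get_client():
    # Leaky pattern: cache one client per thread, never closed on thread death.
    if not hasattr(_local, "client"):
        _local.client = TrackingClient()
    return _local.client

def worker_leaky():
    get_client()                        # connection outlives the thread

def worker_fixed():
    client = get_client()
    try:
        pass                            # ... use client ...
    finally:
        client.close()                  # fix: close before the thread exits
        del _local.client

t = threading.Thread(target=worker_leaky); t.start(); t.join()
print(TrackingClient.open_count)        # 1 -> leaked
t = threading.Thread(target=worker_fixed); t.start(); t.join()
print(TrackingClient.open_count)        # still 1: the fixed worker cleaned up
```

Garbage collection of the thread-local value does not help here, because collecting the client object does not close the underlying network connection.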
[jira] [Commented] (HIVE-10943) Beeline-cli: Enable precommit for beeline-cli branch
[ https://issues.apache.org/jira/browse/HIVE-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579728#comment-14579728 ] Hive QA commented on HIVE-10943: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738633/HIVE-10943.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9006 tests executed *Failed tests:* {noformat} org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4230/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4230/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4230/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12738633 - PreCommit-HIVE-TRUNK-Build Beeline-cli: Enable precommit for beeline-cli branch Key: HIVE-10943 URL: https://issues.apache.org/jira/browse/HIVE-10943 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10943.patch, HIVE-10943.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10857) Accumulo storage handler fails throwing java.lang.IllegalArgumentException: Cannot determine SASL mechanism for token class: class org.apache.accumulo.core.client.securi
[ https://issues.apache.org/jira/browse/HIVE-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579747#comment-14579747 ] Josh Elser commented on HIVE-10857: --- Thanks so much for the review [~sushanth]! I know these changes are a little obtuse for Hive -- I greatly appreciate the effort. Accumulo storage handler fails throwing java.lang.IllegalArgumentException: Cannot determine SASL mechanism for token class: class org.apache.accumulo.core.client.security.tokens.PasswordToken --- Key: HIVE-10857 URL: https://issues.apache.org/jira/browse/HIVE-10857 Project: Hive Issue Type: Bug Affects Versions: 1.2.1 Reporter: Takahiko Saito Assignee: Josh Elser Fix For: 1.2.1 Attachments: HIVE-10857.2.patch, HIVE-10857.patch Creating a table with the Accumulo storage handler fails due to ACCUMULO-2815. {noformat} create table accumulo_1(key string, age int) stored by 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler' with serdeproperties ("accumulo.columns.mapping" = ":rowid,info:age"); {noformat} The error shows: {noformat} FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
MetaException(message:org.apache.accumulo.core.client.AccumuloException: java.lang.IllegalArgumentException: Cannot determine SASL mechanism for token class: class org.apache.accumulo.core.client.security.tokens.PasswordToken at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:67) at org.apache.accumulo.core.client.impl.ConnectorImpl.init(ConnectorImpl.java:67) at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:248) at org.apache.hadoop.hive.accumulo.AccumuloConnectionParameters.getConnector(AccumuloConnectionParameters.java:125) at org.apache.hadoop.hive.accumulo.AccumuloConnectionParameters.getConnector(AccumuloConnectionParameters.java:111) at org.apache.hadoop.hive.accumulo.AccumuloStorageHandler.preCreateTable(AccumuloStorageHandler.java:245) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:664) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:657) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156) at com.sun.proxy.$Proxy5.createTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:714) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4135) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:306) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1650) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1409) at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311) at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409) at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at
[jira] [Commented] (HIVE-10415) hive.start.cleanup.scratchdir configuration is not taking effect
[ https://issues.apache.org/jira/browse/HIVE-10415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579754#comment-14579754 ] Lefty Leverenz commented on HIVE-10415: --- Doc note: *hive.start.cleanup.scratchdir* is documented in the wiki here: * [Configuration Properties -- hive.start.cleanup.scratchdir | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.start.cleanup.scratchdir] I added a Fixed In line with a link to this issue. hive.start.cleanup.scratchdir configuration is not taking effect Key: HIVE-10415 URL: https://issues.apache.org/jira/browse/HIVE-10415 Project: Hive Issue Type: Bug Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Fix For: 1.3.0, 2.0.0 Attachments: HIVE-10415.patch This configuration hive.start.cleanup.scratchdir is not taking effect -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-2181) Clean up the scratch.dir (tmp/hive-root) while restarting Hive server.
[ https://issues.apache.org/jira/browse/HIVE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579749#comment-14579749 ] Lefty Leverenz commented on HIVE-2181: -- Doc note: This issue added the configuration parameter *hive.start.cleanup.scratchdir* to HiveConf.java in release 0.8.1 (not 0.8.0 as indicated in Fix Version). HIVE-10415 fixed a bug in release 1.3.0. *hive.start.cleanup.scratchdir* is documented in the wiki here: * [Configuration Properties -- hive.start.cleanup.scratchdir | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.start.cleanup.scratchdir] Clean up the scratch.dir (tmp/hive-root) while restarting Hive server. Key: HIVE-2181 URL: https://issues.apache.org/jira/browse/HIVE-2181 Project: Hive Issue Type: Bug Components: Server Infrastructure Affects Versions: 0.8.0 Environment: Suse linux, Hadoop 20.1, Hive 0.8 Reporter: sanoj mathew Assignee: Chinna Rao Lalam Priority: Minor Fix For: 0.8.0 Attachments: HIVE-2181.1.patch, HIVE-2181.2.patch, HIVE-2181.3.patch, HIVE-2181.4.patch, HIVE-2181.5.patch, HIVE-2181.6.patch, HIVE-2181.patch Original Estimate: 48h Remaining Estimate: 48h Currently, queries leave their map outputs under scratch.dir after execution. If the hive server is stopped, we need not keep the stopped server's map outputs. So while starting the server we can clear the scratch.dir. This can help improve disk usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
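The behaviour hive.start.cleanup.scratchdir enables amounts to a recursive delete of leftover scratch entries before the server accepts queries. A minimal Python sketch (the function name and demo paths are made up for illustration):

```python
import os
import shutil
import tempfile

def cleanup_scratch_dir(scratch_dir):
    """Remove leftover per-query outputs from a previous server run."""
    if not os.path.isdir(scratch_dir):
        return
    for entry in os.listdir(scratch_dir):
        path = os.path.join(scratch_dir, entry)
        if os.path.isdir(path):
            shutil.rmtree(path, ignore_errors=True)
        else:
            os.remove(path)

# Demo with a throwaway directory standing in for the real scratch.dir
scratch = tempfile.mkdtemp()
os.makedirs(os.path.join(scratch, "stale_query_1"))
cleanup_scratch_dir(scratch)
print(os.listdir(scratch))  # []
```

Clearing only the directory's contents, rather than the directory itself, keeps the scratch root in place with its existing permissions for the new server instance.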
[jira] [Updated] (HIVE-10959) Templeton launcher job should reconnect to the running child job on task retry
[ https://issues.apache.org/jira/browse/HIVE-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Mitic updated HIVE-10959: -- Attachment: HIVE-10959.3.patch Attaching updated patch based on offline feedback from [~thejas]. I introduced a user arg which allows specifying whether templeton should attempt to reconnect to a running job or not. This is because the user jar might be doing additional work after the MR job itself, and by reconnecting templeton would lose track of this work. Templeton launcher job should reconnect to the running child job on task retry -- Key: HIVE-10959 URL: https://issues.apache.org/jira/browse/HIVE-10959 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.15.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Attachments: HIVE-10959.2.patch, HIVE-10959.3.patch, HIVE-10959.patch Currently, Templeton launcher kills all child jobs (jobs tagged with the parent job's id) upon task retry. Upon templeton launcher task retry, templeton should reconnect to the running job and continue tracking its progress that way. This logic cannot be used for all job kinds (e.g. for jobs that are driven by the client side like regular hive). However, for MapReduceV2, and possibly Tez and HiveOnTez, this should be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
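The retry policy described above can be sketched as a small decision function; every name below is hypothetical, not WebHCat's actual API:

```python
def handle_launcher_retry(tagged_child_jobs, enable_reconnect, kill_job, track_job):
    """Sketch of the retry policy: with reconnect enabled, re-attach to the
    already-running child job and keep tracking it; otherwise fall back to
    the old behaviour of killing every job tagged with the parent's id."""
    if enable_reconnect and tagged_child_jobs:
        return track_job(tagged_child_jobs[0])
    for job in tagged_child_jobs:
        kill_job(job)
    return None

# Reconnect path: keep tracking the running MR child job
print(handle_launcher_retry(["job_42"], True, None, lambda j: "tracking " + j))
```

The user-facing flag matters because, as the comment notes, reconnecting is only safe when the MR job is the whole of the work; a user jar that runs extra client-side steps after the job would be silently skipped.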
[jira] [Commented] (HIVE-10816) NPE in ExecDriver::handleSampling when submitted via child JVM
[ https://issues.apache.org/jira/browse/HIVE-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579784#comment-14579784 ] Xuefu Zhang commented on HIVE-10816: Yeah. I think it makes sense to commit it to branch-1 as well. NPE in ExecDriver::handleSampling when submitted via child JVM -- Key: HIVE-10816 URL: https://issues.apache.org/jira/browse/HIVE-10816 Project: Hive Issue Type: Bug Reporter: Rui Li Assignee: Rui Li Fix For: 1.3.0 Attachments: HIVE-10816.1.patch, HIVE-10816.1.patch When {{hive.exec.submitviachild = true}}, parallel order by fails with NPE and falls back to single-reducer mode. Stack trace: {noformat} 2015-05-25 08:41:04,446 ERROR [main]: mr.ExecDriver (ExecDriver.java:execute(386)) - Sampling error java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.handleSampling(ExecDriver.java:513) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:379) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:750) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10965) direct SQL for stats fails in 0-column case
[ https://issues.apache.org/jira/browse/HIVE-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10965: Attachment: HIVE-10965.01.patch direct SQL for stats fails in 0-column case --- Key: HIVE-10965 URL: https://issues.apache.org/jira/browse/HIVE-10965 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 1.3.0, 1.2.1, 2.0.0 Attachments: HIVE-10965.01.patch, HIVE-10965.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10967) add mapreduce.job.tags to sql std authorization config whitelist
[ https://issues.apache.org/jira/browse/HIVE-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579639#comment-14579639 ] Lefty Leverenz commented on HIVE-10967: --- Doc note: Updated the description of *hive.security.authorization.sqlstd.confwhitelist* in the wiki to include this jira. * [Configuration Properties -- hive.security.authorization.sqlstd.confwhitelist | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.security.authorization.sqlstd.confwhitelist] add mapreduce.job.tags to sql std authorization config whitelist Key: HIVE-10967 URL: https://issues.apache.org/jira/browse/HIVE-10967 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.2.1 Attachments: HIVE-10967.1.patch mapreduce.job.tags is set by oozie for HiveServer2 actions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10816) NPE in ExecDriver::handleSampling when submitted via child JVM
[ https://issues.apache.org/jira/browse/HIVE-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579672#comment-14579672 ] Lefty Leverenz commented on HIVE-10816: --- Version question: Since this was committed to master shouldn't it say Fix Version 2.0.0, or will it also be committed to branch-1 for 1.3.0? NPE in ExecDriver::handleSampling when submitted via child JVM -- Key: HIVE-10816 URL: https://issues.apache.org/jira/browse/HIVE-10816 Project: Hive Issue Type: Bug Reporter: Rui Li Assignee: Rui Li Fix For: 1.3.0 Attachments: HIVE-10816.1.patch, HIVE-10816.1.patch When {{hive.exec.submitviachild = true}}, parallel order by fails with NPE and falls back to single-reducer mode. Stack trace: {noformat} 2015-05-25 08:41:04,446 ERROR [main]: mr.ExecDriver (ExecDriver.java:execute(386)) - Sampling error java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.handleSampling(ExecDriver.java:513) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:379) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:750) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10947) LLAP: preemption appears to count against failure count for the task
[ https://issues.apache.org/jira/browse/HIVE-10947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579604#comment-14579604 ] Siddharth Seth commented on HIVE-10947: --- If this happens again, please capture the logs. I'm not sure these tasks were actually preempted. They may have failed for other reasons. There are 20 additional attempts, most of which were KILLED (likely due to preemption) before the 2 FAILED attempts, which caused the task to fail. LLAP: preemption appears to count against failure count for the task Key: HIVE-10947 URL: https://issues.apache.org/jira/browse/HIVE-10947 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Siddharth Seth Looks like the following stack in a very parallel workload counts as a task error and the DAG fails: {noformat} : Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1433459966952_0482_4_03, diagnostics=[Task failed, taskId=task_1433459966952_0482_4_03_22, diagnostics=[TaskAttempt 0 killed, TaskAttempt 1 killed, TaskAttempt 2 killed, TaskAttempt 3 killed, TaskAttempt 4 killed, TaskAttempt 5 killed, TaskAttempt 6 killed, TaskAttempt 7 killed, TaskAttempt 8 killed, TaskAttempt 9 killed, TaskAttempt 10 killed, TaskAttempt 11 killed, TaskAttempt 12 killed, TaskAttempt 13 killed, TaskAttempt 14 killed, TaskAttempt 15 killed, TaskAttempt 16 killed, TaskAttempt 17 killed, TaskAttempt 18 killed, TaskAttempt 19 failed, info=[Error: Failure while running task: attempt_1433459966952_0482_4_03_22_19:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:181) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:146) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349) 
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1654) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:256) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:157) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async initialization failed at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:416) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:388) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:511) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:464) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:378) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:241) ... 
15 more Caused by: java.util.concurrent.CancellationException at java.util.concurrent.FutureTask.report(FutureTask.java:121) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:408) ... 20 more ], TaskAttempt 20 failed, info=[Error: Failure while running task: attempt_1433459966952_0482_4_03_22_20:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:181) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:146) at
[jira] [Commented] (HIVE-10966) direct SQL for stats has a cast exception on some databases
[ https://issues.apache.org/jira/browse/HIVE-10966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579606#comment-14579606 ] Hive QA commented on HIVE-10966: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738624/HIVE-10966.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9006 tests executed *Failed tests:* {noformat} org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4229/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4229/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4229/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12738624 - PreCommit-HIVE-TRUNK-Build direct SQL for stats has a cast exception on some databases --- Key: HIVE-10966 URL: https://issues.apache.org/jira/browse/HIVE-10966 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 1.3.0, 1.2.1, 2.0.0 Attachments: HIVE-10966.patch, HIVE-10966.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10685) Alter table concatenate operator will cause duplicate data
[ https://issues.apache.org/jira/browse/HIVE-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10685: - Priority: Critical (was: Major) Alter table concatenate operator will cause duplicate data -- Key: HIVE-10685 URL: https://issues.apache.org/jira/browse/HIVE-10685 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 1.2.1 Reporter: guoliming Assignee: guoliming Priority: Critical Fix For: 1.2.0, 1.1.0 Attachments: HIVE-10685.1.patch, HIVE-10685.patch The Orders table has 15 rows and is stored as ORC. {noformat} hive> select count(*) from orders; OK 15 Time taken: 37.692 seconds, Fetched: 1 row(s) {noformat} The table contains 14 files; the size of each file is about 2.1 ~ 3.2 GB. After executing the command ALTER TABLE orders CONCATENATE; the table now has 1530115000 rows. My hive version is 1.1.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10970) Revert HIVE-10453: HS2 leaking open file descriptors when using UDFs
[ https://issues.apache.org/jira/browse/HIVE-10970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579597#comment-14579597 ] Vaibhav Gumashta commented on HIVE-10970: - [~xuefuz] I'm trying to reproduce it locally. Internally, we saw TestJdbcDriver2 fail with several ClassNotFoundExceptions. As a quick fix, I tried reverting this and it seems to fix the issue. However, before reverting on apache, I'm going to try to get a repro and also come up with an explanation of why that was happening. Revert HIVE-10453: HS2 leaking open file descriptors when using UDFs Key: HIVE-10970 URL: https://issues.apache.org/jira/browse/HIVE-10970 Project: Hive Issue Type: Bug Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10453) HS2 leaking open file descriptors when using UDFs
[ https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579593#comment-14579593 ] Vaibhav Gumashta commented on HIVE-10453: - I'm trying to reproduce this locally before I revert this (will add more comments on HIVE-10970). HS2 leaking open file descriptors when using UDFs - Key: HIVE-10453 URL: https://issues.apache.org/jira/browse/HIVE-10453 Project: Hive Issue Type: Bug Components: UDF Reporter: Yongzhi Chen Assignee: Yongzhi Chen Fix For: 1.3.0, 1.2.1, 2.0.0 Attachments: HIVE-10453.1.patch, HIVE-10453.2.patch 1. create a custom function by CREATE FUNCTION myfunc AS 'someudfclass' using jar 'hdfs:///tmp/myudf.jar'; 2. Create a simple jdbc client: just connect and run a simple query using the function, such as: select myfunc(col1) from sometable 3. Disconnect. Check open files for HiveServer2 with: lsof -p HSProcID | grep myudf.jar You will see the leak as: {noformat} java 28718 ychen txt REG1,4741 212977666 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar java 28718 ychen 330r REG1,4741 212977666 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
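The descriptor leak being discussed can be pictured with a small, purely illustrative sketch; the session and resource names below are made up, not HS2's real classes:

```python
import os
import tempfile

class UdfSessionSketch:
    """Hypothetical stand-in for a session that localizes a UDF jar."""
    def __init__(self):
        self.resource_dir = tempfile.mkdtemp(suffix="_resources")
        jar_path = os.path.join(self.resource_dir, "myudf.jar")
        with open(jar_path, "wb") as f:
            f.write(b"\x50\x4b\x03\x04")  # zip/jar magic bytes as a placeholder
        # The bug: this handle stays open after the client disconnects.
        self._jar = open(jar_path, "rb")

    def disconnect(self):
        # The intended fix: release the descriptor when the session ends,
        # so lsof no longer shows the localized jar.
        self._jar.close()

session = UdfSessionSketch()
print(session._jar.closed)   # False: descriptor still held
session.disconnect()
print(session._jar.closed)   # True
```

In a long-lived server process, every session that leaks one descriptor pushes the process toward its open-file limit, which is why lsof is the right diagnostic here.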
[jira] [Updated] (HIVE-10533) CBO (Calcite Return Path): Join to MultiJoin support for outer joins
[ https://issues.apache.org/jira/browse/HIVE-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10533: --- Attachment: HIVE-10533.02.patch CBO (Calcite Return Path): Join to MultiJoin support for outer joins Key: HIVE-10533 URL: https://issues.apache.org/jira/browse/HIVE-10533 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10533.01.patch, HIVE-10533.02.patch, HIVE-10533.02.patch, HIVE-10533.patch CBO return path: auto_join7.q can be used to reproduce the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10971) count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true
[ https://issues.apache.org/jira/browse/HIVE-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578802#comment-14578802 ] Hive QA commented on HIVE-10971: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738553/HIVE-10971.01.patch {color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 9004 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8_map_skew org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_cube1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_rollup1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_groupby2 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_groupby2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby10 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby11 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby8 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby8_map_skew org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_cube1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_rollup1 org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4223/testReport 
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4223/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4223/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 20 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12738553 - PreCommit-HIVE-TRUNK-Build count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true --- Key: HIVE-10971 URL: https://issues.apache.org/jira/browse/HIVE-10971 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 1.2.0 Reporter: wangmeng Assignee: wangmeng Attachments: HIVE-10971.01.patch When hive.groupby.skewindata=true, the following query based on TPC-H gives wrong results: {code} set hive.groupby.skewindata=true; select l_returnflag, count(*), count(distinct l_linestatus) from lineitem group by l_returnflag limit 10; {code} The query plan shows that it generates only one MapReduce job instead of the two that hive.groupby.skewindata=true should dictate. The problem arises only when {noformat}count(*){noformat} and {noformat}count(distinct){noformat} exist together. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10903) Add hive.in.test for HoS tests
[ https://issues.apache.org/jira/browse/HIVE-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-10903: -- Attachment: HIVE-10903.3.patch Update more outputs. Add hive.in.test for HoS tests -- Key: HIVE-10903 URL: https://issues.apache.org/jira/browse/HIVE-10903 Project: Hive Issue Type: Test Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-10903.1.patch, HIVE-10903.2.patch, HIVE-10903.3.patch Missing the property can make CBO fail to run during UT. There may be other effects yet to be identified here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10533) CBO (Calcite Return Path): Join to MultiJoin support for outer joins
[ https://issues.apache.org/jira/browse/HIVE-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578793#comment-14578793 ] Jesus Camacho Rodriguez commented on HIVE-10533: Failures are unrelated; reuploading the patch for another QA run. CBO (Calcite Return Path): Join to MultiJoin support for outer joins Key: HIVE-10533 URL: https://issues.apache.org/jira/browse/HIVE-10533 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10533.01.patch, HIVE-10533.02.patch, HIVE-10533.patch CBO return path: auto_join7.q can be used to reproduce the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10970) Revert HIVE-10453: HS2 leaking open file descriptors when using UDFs
[ https://issues.apache.org/jira/browse/HIVE-10970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578852#comment-14578852 ] Xuefu Zhang commented on HIVE-10970: Could you please add a description giving the reasoning? Thanks. Revert HIVE-10453: HS2 leaking open file descriptors when using UDFs Key: HIVE-10970 URL: https://issues.apache.org/jira/browse/HIVE-10970 Project: Hive Issue Type: Bug Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10943) Beeline-cli: Enable precommit for beeline-cli branch
[ https://issues.apache.org/jira/browse/HIVE-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579799#comment-14579799 ] Ferdinand Xu commented on HIVE-10943: - Thanks [~spena] for creating the instance. Beeline-cli: Enable precommit for beeline-cli branch Key: HIVE-10943 URL: https://issues.apache.org/jira/browse/HIVE-10943 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10943.patch, HIVE-10943.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10857) Accumulo storage handler fail throwing java.lang.IllegalArgumentException: Cannot determine SASL mechanism for token class: class org.apache.accumulo.core.client.securi
[ https://issues.apache.org/jira/browse/HIVE-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579849#comment-14579849 ] Sushanth Sowmyan commented on HIVE-10857: - (Also committed to branch-1, forgot earlier) Accumulo storage handler fail throwing java.lang.IllegalArgumentException: Cannot determine SASL mechanism for token class: class org.apache.accumulo.core.client.security.tokens.PasswordToken --- Key: HIVE-10857 URL: https://issues.apache.org/jira/browse/HIVE-10857 Project: Hive Issue Type: Bug Affects Versions: 1.2.1 Reporter: Takahiko Saito Assignee: Josh Elser Fix For: 1.2.1 Attachments: HIVE-10857.2.patch, HIVE-10857.patch Creating a table with the Accumulo storage handler fails due to ACCUMULO-2815. {noformat} create table accumulo_1(key string, age int) stored by 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler' with serdeproperties ("accumulo.columns.mapping" = ":rowid,info:age"); {noformat} The error shows: {noformat} FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:org.apache.accumulo.core.client.AccumuloException: java.lang.IllegalArgumentException: Cannot determine SASL mechanism for token class: class org.apache.accumulo.core.client.security.tokens.PasswordToken at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:67) at org.apache.accumulo.core.client.impl.ConnectorImpl.init(ConnectorImpl.java:67) at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:248) at org.apache.hadoop.hive.accumulo.AccumuloConnectionParameters.getConnector(AccumuloConnectionParameters.java:125) at org.apache.hadoop.hive.accumulo.AccumuloConnectionParameters.getConnector(AccumuloConnectionParameters.java:111) at org.apache.hadoop.hive.accumulo.AccumuloStorageHandler.preCreateTable(AccumuloStorageHandler.java:245) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:664) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:657) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156) at com.sun.proxy.$Proxy5.createTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:714) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4135) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:306) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1650) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1409) at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311) at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409) at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.lang.IllegalArgumentException: Cannot determine SASL
[jira] [Commented] (HIVE-6791) Support variable substitution for Beeline shell command
[ https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579868#comment-14579868 ] Ferdinand Xu commented on HIVE-6791: Thanks [~xuefuz] for your reviews. I left some comments on the review board. Thank you! Support variable substitution for Beeline shell command - Key: HIVE-6791 URL: https://issues.apache.org/jira/browse/HIVE-6791 Project: Hive Issue Type: New Feature Components: CLI, Clients Affects Versions: 0.14.0 Reporter: Xuefu Zhang Assignee: Ferdinand Xu Attachments: HIVE-6791-beeline-cli.patch A follow-up task from HIVE-6694. Similar to HIVE-6570. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10974) Use Configuration::getRaw() for the Base64 data
[ https://issues.apache.org/jira/browse/HIVE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579879#comment-14579879 ] Hive QA commented on HIVE-10974: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738665/HIVE-10974.1.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9006 tests executed *Failed tests:* {noformat} org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4232/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4232/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4232/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12738665 - PreCommit-HIVE-TRUNK-Build Use Configuration::getRaw() for the Base64 data --- Key: HIVE-10974 URL: https://issues.apache.org/jira/browse/HIVE-10974 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-10974.1.patch Inspired by the Twitter HadoopSummit talk {code} if (HiveConf.getBoolVar(conf, ConfVars.HIVE_RPC_QUERY_PLAN)) { LOG.debug("Loading plan from string: " + path.toUri().getPath()); String planString = conf.get(path.toUri().getPath()); {code} Use getRaw() in other places where Base64 data is present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
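The get()/getRaw() distinction can be sketched with a toy configuration class. MiniConf is illustrative, not Hadoop's API: the point is that get() runs ${var} substitution over every value, which is unnecessary work on a large Base64-encoded plan string, while getRaw() returns the stored value verbatim.

```python
# Sketch (assumed simplification of Hadoop's Configuration) of why getRaw()
# is preferable for opaque Base64 payloads: no variable expansion pass.
import re

class MiniConf:
    def __init__(self):
        self._props = {}

    def set(self, key, value):
        self._props[key] = value

    def get_raw(self, key):
        # Returns the value exactly as stored -- no substitution pass.
        return self._props.get(key)

    def get(self, key):
        # Expands ${other.key} references, in the spirit of
        # Configuration.get(); unresolved keys are left as-is.
        val = self._props.get(key)
        if val is None:
            return None
        return re.sub(r"\$\{([^}]+)\}",
                      lambda m: self._props.get(m.group(1), m.group(0)), val)

conf = MiniConf()
conf.set("warehouse.dir", "/user/hive/warehouse")
conf.set("plan.path", "${warehouse.dir}/plan")        # substitution wanted here
conf.set("plan.base64", "aGVsbG8gd29ybGQ=")           # opaque Base64 payload
```

Here conf.get("plan.path") expands to "/user/hive/warehouse/plan", while conf.get_raw("plan.base64") hands back the Base64 string untouched and skips the regex scan entirely.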
[jira] [Commented] (HIVE-10958) Centos: TestMiniTezCliDriver.testCliDriver_mergejoin fails
[ https://issues.apache.org/jira/browse/HIVE-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579895#comment-14579895 ] Thejas M Nair commented on HIVE-10958: -- Is it also committed to branch-1 ? Everything in branch-1.2 should go into branch-1 as well, as branch-1 is the 'trunk' for future 1.x releases. Centos: TestMiniTezCliDriver.testCliDriver_mergejoin fails -- Key: HIVE-10958 URL: https://issues.apache.org/jira/browse/HIVE-10958 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 1.2.1, 2.0.0 Attachments: HIVE-10958.01.patch Centos: TestMiniTezCliDriver.testCliDriver_mergejoin fails due to the statement set mapred.reduce.tasks = 18; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10956) HS2 leaks HMS connections
[ https://issues.apache.org/jira/browse/HIVE-10956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579811#comment-14579811 ] Hive QA commented on HIVE-10956: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738657/HIVE-10956.3.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9004 tests executed *Failed tests:* {noformat} org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4231/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4231/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4231/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12738657 - PreCommit-HIVE-TRUNK-Build HS2 leaks HMS connections - Key: HIVE-10956 URL: https://issues.apache.org/jira/browse/HIVE-10956 Project: Hive Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 1.3.0, 2.0.0 Attachments: HIVE-10956.1.patch, HIVE-10956.2.patch, HIVE-10956.3.patch HS2 uses a thread-local to cache the HMS client in class Hive. When the thread dies, the HMS client is not closed, so the connection to the HMS is leaked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
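The leak pattern described in this issue can be sketched outside Hive. FakeMetastoreClient is a stand-in, not the real HMS client: a client cached in a thread-local is never closed when its owning thread dies, so the underlying connection lingers on the server side.

```python
# Minimal sketch (assumed names, not HS2 code) of the thread-local caching
# pattern that leaks: the per-thread client is created lazily and cached,
# but nothing closes it when the thread exits.
import threading

open_connections = []  # stands in for connections held open on the HMS side

class FakeMetastoreClient:
    def __init__(self):
        open_connections.append(self)   # "opens" a server-side connection
        self.closed = False
    def close(self):
        self.closed = True
        open_connections.remove(self)

_local = threading.local()

def get_client():
    # The caching pattern from the description: one client per thread.
    if not hasattr(_local, "client"):
        _local.client = FakeMetastoreClient()
    return _local.client

def worker():
    get_client()  # cached in the thread-local, never explicitly closed

t = threading.Thread(target=worker)
t.start()
t.join()
# The worker thread is dead, but its connection was never closed.
```

A fix along the lines of the patch would close the cached client when the owning thread terminates (or hand out clients from a pool that tracks ownership), rather than relying on thread-local storage alone.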
[jira] [Commented] (HIVE-10816) NPE in ExecDriver::handleSampling when submitted via child JVM
[ https://issues.apache.org/jira/browse/HIVE-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579865#comment-14579865 ] Rui Li commented on HIVE-10816: --- Thanks [~leftylev] and [~xuefuz] for catching this. I didn't realize master is on 2.0.0 now. Committed to branch-1 as well. NPE in ExecDriver::handleSampling when submitted via child JVM -- Key: HIVE-10816 URL: https://issues.apache.org/jira/browse/HIVE-10816 Project: Hive Issue Type: Bug Reporter: Rui Li Assignee: Rui Li Fix For: 1.3.0 Attachments: HIVE-10816.1.patch, HIVE-10816.1.patch When {{hive.exec.submitviachild = true}}, parallel order by fails with NPE and falls back to single-reducer mode. Stack trace: {noformat} 2015-05-25 08:41:04,446 ERROR [main]: mr.ExecDriver (ExecDriver.java:execute(386)) - Sampling error java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.handleSampling(ExecDriver.java:513) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:379) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:750) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10959) Templeton launcher job should reconnect to the running child job on task retry
[ https://issues.apache.org/jira/browse/HIVE-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579845#comment-14579845 ] Thejas M Nair commented on HIVE-10959: -- +1 Templeton launcher job should reconnect to the running child job on task retry -- Key: HIVE-10959 URL: https://issues.apache.org/jira/browse/HIVE-10959 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.15.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Attachments: HIVE-10959.2.patch, HIVE-10959.3.patch, HIVE-10959.patch Currently, the Templeton launcher kills all child jobs (jobs tagged with the parent job's id) upon task retry. Instead, on launcher task retry, Templeton should reconnect to the running job and continue tracking its progress that way. This logic cannot be used for all job kinds (e.g. jobs that are driven by the client side, like regular Hive). However, for MapReduceV2, and possibly Tez and HiveOnTez, this should be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10903) Add hive.in.test for HoS tests
[ https://issues.apache.org/jira/browse/HIVE-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579846#comment-14579846 ] Rui Li commented on HIVE-10903: --- cc [~xuefuz] Add hive.in.test for HoS tests -- Key: HIVE-10903 URL: https://issues.apache.org/jira/browse/HIVE-10903 Project: Hive Issue Type: Test Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-10903.1.patch, HIVE-10903.2.patch, HIVE-10903.3.patch Missing the property can make CBO fail to run during UT. There may be other effects yet to be identified here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10816) NPE in ExecDriver::handleSampling when submitted via child JVM
[ https://issues.apache.org/jira/browse/HIVE-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-10816: -- Fix Version/s: 2.0.0 NPE in ExecDriver::handleSampling when submitted via child JVM -- Key: HIVE-10816 URL: https://issues.apache.org/jira/browse/HIVE-10816 Project: Hive Issue Type: Bug Reporter: Rui Li Assignee: Rui Li Fix For: 1.3.0, 2.0.0 Attachments: HIVE-10816.1.patch, HIVE-10816.1.patch When {{hive.exec.submitviachild = true}}, parallel order by fails with NPE and falls back to single-reducer mode. Stack trace: {noformat} 2015-05-25 08:41:04,446 ERROR [main]: mr.ExecDriver (ExecDriver.java:execute(386)) - Sampling error java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.handleSampling(ExecDriver.java:513) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:379) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:750) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10533) CBO (Calcite Return Path): Join to MultiJoin support for outer joins
[ https://issues.apache.org/jira/browse/HIVE-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578900#comment-14578900 ] Hive QA commented on HIVE-10533: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738569/HIVE-10533.02.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9005 tests executed *Failed tests:* {noformat} org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4224/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4224/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4224/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12738569 - PreCommit-HIVE-TRUNK-Build CBO (Calcite Return Path): Join to MultiJoin support for outer joins Key: HIVE-10533 URL: https://issues.apache.org/jira/browse/HIVE-10533 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10533.01.patch, HIVE-10533.02.patch, HIVE-10533.02.patch, HIVE-10533.patch CBO return path: auto_join7.q can be used to reproduce the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10933) Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns()
[ https://issues.apache.org/jira/browse/HIVE-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang resolved HIVE-10933. Resolution: Cannot Reproduce As I understand it, this has been resolved by HIVE-5847. If you see any related issue in the future, please feel free to reopen this JIRA or open a new one. Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns() Key: HIVE-10933 URL: https://issues.apache.org/jira/browse/HIVE-10933 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.13.0 Reporter: Son Nguyen Assignee: Chaoyu Tang DatabaseMetadata.getColumns() returns COLUMN_SIZE as 0 for a column defined as varchar(32) or char(32), while ResultSetMetaData.getPrecision() returns the correct value 32. Here is the segment of the program that reproduces the issue. {code} try { statement = connection.createStatement(); statement.execute("drop table if exists son_table"); statement.execute("create table son_table( col1 varchar(32) )"); statement.close(); } catch ( Exception e) { return; } // get column info using metadata try { DatabaseMetaData dmd = null; ResultSet resultSet = null; dmd = connection.getMetaData(); resultSet = dmd.getColumns(null, null, "son_table", "col1"); if ( resultSet.next() ) { String tabName = resultSet.getString("TABLE_NAME"); String colName = resultSet.getString("COLUMN_NAME"); String dataType = resultSet.getString("DATA_TYPE"); String typeName = resultSet.getString("TYPE_NAME"); int precision = resultSet.getInt("COLUMN_SIZE"); // output is: colName = col1, dataType = 12, typeName = VARCHAR, precision = 0. System.out.format("colName = %s, dataType = %s, typeName = %s, precision = %d.", colName, dataType, typeName, precision); } } catch ( Exception e) { return; } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10971) count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true
[ https://issues.apache.org/jira/browse/HIVE-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580047#comment-14580047 ] Hive QA commented on HIVE-10971: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738683/HIVE-10971.1.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9006 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby2 org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4234/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4234/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4234/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12738683 - PreCommit-HIVE-TRUNK-Build count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true --- Key: HIVE-10971 URL: https://issues.apache.org/jira/browse/HIVE-10971 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 1.2.0 Reporter: wangmeng Assignee: wangmeng Attachments: HIVE-10971.01.patch, HIVE-10971.1.patch When hive.groupby.skewindata=true, the following query based on TPC-H gives wrong results: {code} set hive.groupby.skewindata=true; select l_returnflag, count(*), count(distinct l_linestatus) from lineitem group by l_returnflag limit 10; {code} The query plan shows that it generates only one MapReduce job instead of the two that hive.groupby.skewindata=true dictates. The problem arises only when {noformat}count(*){noformat} and {noformat}count(distinct){noformat} appear together. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10959) Templeton launcher job should reconnect to the running child job on task retry
[ https://issues.apache.org/jira/browse/HIVE-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580058#comment-14580058 ] Hive QA commented on HIVE-10959: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738697/HIVE-10959.3.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4235/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4235/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4235/ Messages: {noformat} This message was trimmed, see log for full details [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-hcatalog-server-extensions --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/tmp/conf [copy] Copying 11 files to /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hive-hcatalog-server-extensions --- [INFO] Compiling 2 source files to /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/test-classes [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-hcatalog-server-extensions --- [INFO] Tests are skipped. 
[INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-hcatalog-server-extensions --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/hive-hcatalog-server-extensions-2.0.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-hcatalog-server-extensions --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-hcatalog-server-extensions --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/target/hive-hcatalog-server-extensions-2.0.0-SNAPSHOT.jar to /home/hiveptest/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-server-extensions/2.0.0-SNAPSHOT/hive-hcatalog-server-extensions-2.0.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/hcatalog/server-extensions/pom.xml to /home/hiveptest/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-server-extensions/2.0.0-SNAPSHOT/hive-hcatalog-server-extensions-2.0.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive HCatalog Webhcat Java Client 2.0.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-webhcat-java-client --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/hcatalog/webhcat/java-client/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/hcatalog/webhcat/java-client (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-webhcat-java-client --- [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-webhcat-java-client --- [INFO] [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-webhcat-java-client --- [INFO] Using 'UTF-8' encoding to copy filtered resources. 
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-github-source-source/hcatalog/webhcat/java-client/src/main/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-webhcat-java-client --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-webhcat-java-client --- [INFO] Compiling 36 source files to /data/hive-ptest/working/apache-github-source-source/hcatalog/webhcat/java-client/target/classes [WARNING] /data/hive-ptest/working/apache-github-source-source/hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatClientHMSImpl.java: /data/hive-ptest/working/apache-github-source-source/hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatClientHMSImpl.java uses or overrides a deprecated API. [WARNING]
[jira] [Commented] (HIVE-10729) Query failed when select complex columns from joined table (tez map join only)
[ https://issues.apache.org/jira/browse/HIVE-10729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578856#comment-14578856 ] Greg Senia commented on HIVE-10729: --- Here is the query and source table describe that shows the array<string> column which seems to be the cause... drop table debug.ct_gsd_events1_test; create table debug.ct_gsd_events1_test as select a.*, b.svcrqst_id, b.svcrqct_cds, b.svcrtyp_cd, b.cmpltyp_cd, b.sum_reason_cd as src, b.cnctmd_cd, b.notes from ctm.ct_gsd_events a inner join mbr.gsd_service_request b on a.contact_event_id = b.cnctevn_id; hive> describe formatted ctm.ct_gsd_events; OK # col_name data_type comment hmoid string cumb_id_no int mbrind_id string contact_event_id string ce_create_dt string ce_end_dt string contact_type string cnctevs_cd string contact_mode string cntvnst_stts_cd string total_transfers int ce_notes array<string> # Detailed Table Information Database: ctm Owner: LOAD_USER CreateTime: Fri May 29 09:41:58 EDT 2015 LastAccessTime: UNKNOWN Protect Mode: None Retention: 0 Location: hdfs://xhadnnm1p.example.com:8020/apps/hive/warehouse/ctm.db/ct_gsd_events Table Type: MANAGED_TABLE Table Parameters: COLUMN_STATS_ACCURATE true numFiles 154 numRows 0 rawDataSize 0 totalSize 5464108 transient_lastDdlTime 1432906919 # Storage Information SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat: org.apache.hadoop.mapred.TextInputFormat OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Compressed: No Num Buckets: -1 Bucket Columns: [] Sort Columns: [] Storage Desc Params: serialization.format 1 Time taken: 2.968 seconds, Fetched: 42 row(s) Query failed when select complex columns from joined table (tez map join only) --- Key: HIVE-10729 URL: https://issues.apache.org/jira/browse/HIVE-10729 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0 Reporter: Selina Zhang Assignee: Selina Zhang Attachments: HIVE-10729.1.patch, 
HIVE-10729.2.patch When map join happens, if projection columns include complex data types, the query will fail. Steps to reproduce: {code:sql} hive> set hive.auto.convert.join; hive.auto.convert.join=true hive> desc foo; a array<int> hive> select * from foo; [1,2] hive> desc src_int; key int value string hive> select * from src_int where key=2; 2 val_2 hive> select * from foo join src_int src on src.key = foo.a[1]; {code} The query will fail with stack trace {noformat} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryArray cannot be cast to [Ljava.lang.Object; at org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector.getList(StandardListObjectInspector.java:111) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:314) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:262) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:246) at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:50) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:692) at
[jira] [Commented] (HIVE-10729) Query failed when select complex columns from joined table (tez map join only)
[ https://issues.apache.org/jira/browse/HIVE-10729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578874#comment-14578874 ] Greg Senia commented on HIVE-10729: --- Here is a sample of the data. I think the cause is that there is a null in the array<string> field notes... This was not a problem with Hive 0.13; it definitely started with the Hive 0.14/1.x line. Vertex failed, vertexName=Map 2, vertexId=vertex_1426958683478_216665_2_01, diagnostics=[Task failed, taskId=task_1426958683478_216665_2_01_000104, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {cumb_id_no:31585,cumb_id_no_sub:31585,cnctevn_id:0021XXX86715,svcrqst_id:003XXX346030,svcrqst_crt_dts:2015-03-09 11:25:10.927722,subject_seq_no:1,cntmbrp_id:692XX60 ,plan_component:H ,psuniq_id:14XXX279,cust_segment:RM ,idcard:MEXX ,cnctyp_cd:001,cnctmd_cd:D01,cnctevs_cd:007,svcrtyp_cd:722,svrstyp_cd:832,cmpltyp_cd: ,catsrsn_cd:,apealvl_cd: ,cnstnty_cd:001,svcrqst_asrqst_ind:Y,svcrqst_rtnorig_in:N,svcrqst_vwasof_dt:null,svcrqst_lupdusr_id:XXX ,sum_reason_cd:98,sum_reason:Exclude,crsr_master_claim_index:null,svcrqct_cds:[ ],svcrqst_lupdt:2015-03-09 11:25:10.927722,crsr_lupdt:null,cntmbrp_lupdt:2015-03-09 11:24:51.315134,cntevsds_lupdt:2015-03-09 11:25:13.429458,ignore_me:1,notes:null} at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {cumb_id_no:31XXX585,cumb_id_no_sub:31XXX585,cnctevn_id:0021XXX86715,svcrqst_id:003XXX346030,svcrqst_crt_dts:2015-03-09 11:25:10.927722,subject_seq_no:1,cntmbrp_id:692XX60 ,plan_component:H ,psuniq_id:14XXX279,cust_segment:RM ,idcard:MEXX ,cnctyp_cd:001,cnctmd_cd:D01,cnctevs_cd:007,svcrtyp_cd:722,svrstyp_cd:832,cmpltyp_cd: ,catsrsn_cd:,apealvl_cd: ,cnstnty_cd:001,svcrqst_asrqst_ind:Y,svcrqst_rtnorig_in:N,svcrqst_vwasof_dt:null,svcrqst_lupdusr_id:XXX ,sum_reason_cd:98,sum_reason:Exclude,crsr_master_claim_index:null,svcrqct_cds:[ ],svcrqst_lupdt:2015-03-09 11:25:10.927722,crsr_lupdt:null,cntmbrp_lupdt:2015-03-09 11:24:51.315134,cntevsds_lupdt:2015-03-09 11:25:13.429458,ignore_me:1,notes:null} at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:290) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 
13 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {cumb_id_no:31585,cumb_id_no_sub:31585,cnctevn_id:0021XXX86715,svcrqst_id:003XXX346030,svcrqst_crt_dts:2015-03-09 11:25:10.927722,subject_seq_no:1,cntmbrp_id:692XX60 ,plan_component:H ,psuniq_id:14XXX279,cust_segment:RM ,idcard:MEXX ,cnctyp_cd:001,cnctmd_cd:D01,cnctevs_cd:007,svcrtyp_cd:722,svrstyp_cd:832,cmpltyp_cd: ,catsrsn_cd:,apealvl_cd: ,cnstnty_cd:001,svcrqst_asrqst_ind:Y,svcrqst_rtnorig_in:N,svcrqst_vwasof_dt:null,svcrqst_lupdusr_id:XXX ,sum_reason_cd:98,sum_reason:Exclude,crsr_master_claim_index:null,svcrqct_cds:[ ],svcrqst_lupdt:2015-03-09 11:25:10.927722,crsr_lupdt:null,cntmbrp_lupdt:2015-03-09 11:24:51.315134,cntevsds_lupdt:2015-03-09
[jira] [Commented] (HIVE-10880) The bucket number is not respected in insert overwrite.
[ https://issues.apache.org/jira/browse/HIVE-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578871#comment-14578871 ] Yongzhi Chen commented on HIVE-10880: - [~xuefuz], when I debugged the issue, I noticed that the right number of reducers is used. I also noticed that dynamic partition insert works fine because it adds the missing files. I think we should treat static partitions and ordinary tables the same way, so I fixed the issue by adding the missing buckets. The following is the code for the dynamic partition part:

{noformat}
taskIDToFile = removeTempOrDuplicateFiles(items, fs);
// if the table is bucketed and enforce bucketing, we should check and generate all buckets
if (dpCtx.getNumBuckets() > 0 && taskIDToFile != null) {
  // refresh the file list
  items = fs.listStatus(parts[i].getPath());
  // get the missing buckets and generate empty buckets
  String taskID1 = taskIDToFile.keySet().iterator().next();
  Path bucketPath = taskIDToFile.values().iterator().next().getPath();
  for (int j = 0; j < dpCtx.getNumBuckets(); ++j) {
    String taskID2 = replaceTaskId(taskID1, j);
    if (!taskIDToFile.containsKey(taskID2)) {
      // create empty bucket, file name should be derived from taskID2
      String path2 = replaceTaskIdFromFilename(bucketPath.toUri().getPath().toString(), j);
      result.add(path2);
    }
  }
}
{noformat}

The bucket number is not respected in insert overwrite. --- Key: HIVE-10880 URL: https://issues.apache.org/jira/browse/HIVE-10880 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Priority: Blocker Attachments: HIVE-10880.1.patch, HIVE-10880.2.patch, HIVE-10880.3.patch When hive.enforce.bucketing is true, the bucket number defined in the table is no longer respected in current master and 1.2. This is a regression. 
Reproduce:

{noformat}
CREATE TABLE IF NOT EXISTS buckettestinput(
  data string
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

CREATE TABLE IF NOT EXISTS buckettestoutput1(
  data string
) CLUSTERED BY(data) INTO 2 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

CREATE TABLE IF NOT EXISTS buckettestoutput2(
  data string
) CLUSTERED BY(data) INTO 2 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
{noformat}

Then I inserted the following data into the buckettestinput table:

{noformat}
firstinsert1
firstinsert2
firstinsert3
firstinsert4
firstinsert5
firstinsert6
firstinsert7
firstinsert8
secondinsert1
secondinsert2
secondinsert3
secondinsert4
secondinsert5
secondinsert6
secondinsert7
secondinsert8
{noformat}

{noformat}
set hive.enforce.bucketing = true;
set hive.enforce.sorting=true;
insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%';
set hive.auto.convert.sortmerge.join=true;
set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;
select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data);

Error: Error while compiling statement: FAILED: SemanticException [Error 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number of buckets for table buckettestoutput1 is 2, whereas the number of files is 1 (state=42000,code=10141)
{noformat}

The debug information related to the insert overwrite:

{noformat}
0: jdbc:hive2://localhost:1> insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%';
INFO  : Number of reduce tasks determined at compile time: 2
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapred.reduce.tasks=<number>
INFO  : Job running in-process (local Hadoop)
INFO  : 2015-06-01 11:09:29,650 Stage-1 map = 86%, reduce = 100%
INFO  : Ended Job = job_local107155352_0001
INFO  : Loading data to table default.buckettestoutput1 from file:/user/hive/warehouse/buckettestoutput1/.hive-staging_hive_2015-06-01_11-09-28_166_3109203968904090801-1/-ext-1
INFO  : Table default.buckettestoutput1 stats: [numFiles=1, numRows=4, totalSize=52, rawDataSize=48]
No rows affected (1.692 seconds)
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
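The fix quoted in the comment above derives names for the missing (empty) bucket files by substituting each bucket number into an existing task id. A minimal sketch of that substitution, assuming file names shaped like 000001_0 (the helper below is illustrative; Hive's real replaceTaskId/replaceTaskIdFromFilename handle more naming variants):

```java
public class BucketFileUtil {
    // Replace the numeric task-id prefix of a name like "000001_0" with
    // the given bucket number, preserving the zero padding. Illustrative
    // only; Hive's real helpers cover more file-name shapes.
    static String replaceTaskId(String taskId, int bucketNum) {
        int us = taskId.indexOf('_');
        String digits = (us >= 0) ? taskId.substring(0, us) : taskId;
        String suffix = (us >= 0) ? taskId.substring(us) : "";
        String num = String.valueOf(bucketNum);
        StringBuilder padded = new StringBuilder();
        for (int i = num.length(); i < digits.length(); i++) {
            padded.append('0'); // keep the original width
        }
        return padded.append(num).append(suffix).toString();
    }

    public static void main(String[] args) {
        // Bucket 3 gets an empty file named after the existing task id.
        System.out.println(replaceTaskId("000001_0", 3)); // 000003_0
    }
}
```

With this kind of substitution, a bucketed table always ends up with exactly getNumBuckets() files, which is what the bucket-map-join metadata check requires.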
[jira] [Commented] (HIVE-10903) Add hive.in.test for HoS tests
[ https://issues.apache.org/jira/browse/HIVE-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579985#comment-14579985 ] Xuefu Zhang commented on HIVE-10903: +1 LGTM Add hive.in.test for HoS tests -- Key: HIVE-10903 URL: https://issues.apache.org/jira/browse/HIVE-10903 Project: Hive Issue Type: Test Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-10903.1.patch, HIVE-10903.2.patch, HIVE-10903.3.patch Missing the property can make CBO fail to run during UTs. There could be other effects that should be identified here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10816) NPE in ExecDriver::handleSampling when submitted via child JVM
[ https://issues.apache.org/jira/browse/HIVE-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580013#comment-14580013 ] Navis commented on HIVE-10816: -- [~lirui] I don't know why I've not been notified but here is my late +1 NPE in ExecDriver::handleSampling when submitted via child JVM -- Key: HIVE-10816 URL: https://issues.apache.org/jira/browse/HIVE-10816 Project: Hive Issue Type: Bug Reporter: Rui Li Assignee: Rui Li Fix For: 1.3.0, 2.0.0 Attachments: HIVE-10816.1.patch, HIVE-10816.1.patch When {{hive.exec.submitviachild = true}}, parallel order by fails with NPE and falls back to single-reducer mode. Stack trace: {noformat} 2015-05-25 08:41:04,446 ERROR [main]: mr.ExecDriver (ExecDriver.java:execute(386)) - Sampling error java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.handleSampling(ExecDriver.java:513) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:379) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:750) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
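The degrade-to-one-reducer behavior described in the report follows a simple pattern: attempt the sampling step, and on failure log the error and fall back. A hedged sketch of that control flow (the names and structure below are illustrative, not ExecDriver's actual code):

```java
public class SamplingFallback {
    // Illustrative control flow only: run the sampling step; if it blows
    // up (e.g. the NPE in the report), fall back to a single reducer.
    static int chooseReducers(Runnable handleSampling, int parallelReducers) {
        try {
            handleSampling.run();
            return parallelReducers; // sampling succeeded: keep the parallel plan
        } catch (RuntimeException e) {
            // ExecDriver logs "Sampling error" at this point in the trace.
            return 1; // single-reducer fallback
        }
    }

    public static void main(String[] args) {
        System.out.println(chooseReducers(() -> {}, 8)); // 8
        System.out.println(chooseReducers(() -> {
            throw new NullPointerException();
        }, 8)); // 1
    }
}
```

The bug is that in child-JVM mode the fallback is always taken, so the patch presumably fixes whatever state is null in that code path rather than the fallback itself.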
[jira] [Commented] (HIVE-10890) Provide implementable engine selector
[ https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580028#comment-14580028 ] Navis commented on HIVE-10890: -- Right, that should also be checked. The included implementation was just to show the intention. I'll think of a way to determine whether the engine is configured properly. Anyway, I don't know why I haven't been getting notifications from the Hive community lately. Provide implementable engine selector - Key: HIVE-10890 URL: https://issues.apache.org/jira/browse/HIVE-10890 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Hive now supports three kinds of execution engines. It would be good to have an automatic engine selector so that the engine need not be set explicitly for execution. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
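An "implementable engine selector" could be as small as a single-method interface that inspects query/session state and returns an engine name. The sketch below is purely hypothetical; the interface, class names, and the size-based policy are made up for illustration and are not the patch's API:

```java
public class EngineSelectorDemo {
    // Hypothetical stand-in for whatever query/session state a selector
    // would inspect.
    static class QueryContext {
        final long inputBytes;
        QueryContext(long inputBytes) { this.inputBytes = inputBytes; }
    }

    // Hypothetical pluggable selector: return "mr", "tez", or "spark".
    interface EngineSelector {
        String selectEngine(QueryContext ctx);
    }

    // One made-up policy: small inputs run on MR, larger ones on Tez.
    static class SizeBasedSelector implements EngineSelector {
        public String selectEngine(QueryContext ctx) {
            return ctx.inputBytes < 128L * 1024 * 1024 ? "mr" : "tez";
        }
    }

    public static void main(String[] args) {
        EngineSelector selector = new SizeBasedSelector();
        System.out.println(selector.selectEngine(new QueryContext(1024)));     // mr
        System.out.println(selector.selectEngine(new QueryContext(1L << 30))); // tez
    }
}
```

As the comment notes, a real selector would also have to verify that the chosen engine is actually configured and available before committing to it.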
[jira] [Commented] (HIVE-10958) Centos: TestMiniTezCliDriver.testCliDriver_mergejoin fails
[ https://issues.apache.org/jira/browse/HIVE-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579898#comment-14579898 ] Pengcheng Xiong commented on HIVE-10958: I am not sure whether it is also committed to branch-1. [~ashutoshc], could you please take a look at [~thejas]'s question? Thanks. Centos: TestMiniTezCliDriver.testCliDriver_mergejoin fails -- Key: HIVE-10958 URL: https://issues.apache.org/jira/browse/HIVE-10958 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 1.2.1, 2.0.0 Attachments: HIVE-10958.01.patch Centos: TestMiniTezCliDriver.testCliDriver_mergejoin fails due to the statement set mapred.reduce.tasks = 18; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10478) resolved
[ https://issues.apache.org/jira/browse/HIVE-10478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579978#comment-14579978 ] wangmeng commented on HIVE-10478: - Hi, I also encountered the same problem. How did you solve it? SET hive.exec.parallel=false? Thanks. resolved Key: HIVE-10478 URL: https://issues.apache.org/jira/browse/HIVE-10478 Project: Hive Issue Type: Task Components: Hive Reporter: anna ken Labels: hadoop, hive, hue, kryo resolved -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10965) direct SQL for stats fails in 0-column case
[ https://issues.apache.org/jira/browse/HIVE-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579993#comment-14579993 ] Hive QA commented on HIVE-10965: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12738677/HIVE-10965.01.patch {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9006 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_stats org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_stats org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4233/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4233/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4233/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12738677 - PreCommit-HIVE-TRUNK-Build direct SQL for stats fails in 0-column case --- Key: HIVE-10965 URL: https://issues.apache.org/jira/browse/HIVE-10965 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 1.3.0, 1.2.1, 2.0.0 Attachments: HIVE-10965.01.patch, HIVE-10965.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)