[jira] [Updated] (HIVE-7296) big data approximate processing at a very low cost based on hive sql
[ https://issues.apache.org/jira/browse/HIVE-7296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangmeng updated HIVE-7296: --- Description: For big data analysis, we often need the following kinds of queries and statistics: 1. Cardinality Estimation: count the number of distinct elements in a collection, such as Unique Visitors (UV). Today we can use the Hive query: select distinct(id) from TestTable; 2. Frequency Estimation: estimate how many times an element is repeated, such as the number of site visits by one user. Hive query: select count(1) from TestTable where name='wangmeng' 3. Heavy Hitters (top-k elements): for example, the top-100 shops. Hive query: select count(1), name from TestTable group by name; (needs a UDF) 4. Range Query: for example, find the number of users between 20 and 30 years old. Hive query: select count(1) from TestTable where age > 20 and age < 30 5. Membership Query: for example, is a given user name already registered? Given the way Hive executes these queries, they consume a large amount of memory and take a long time. However, in many cases we do not need exact results, and a small error can be tolerated. In such cases, approximate processing can greatly improve both time and space efficiency. Now, based on some theoretical analysis materials, I would very much like to work on these new features if possible. So, is there anything I can do? Many thanks. was: For big data analysis, we often need the following kinds of queries and statistics: 1. Cardinality Estimation: count the number of distinct elements in a collection, such as Unique Visitors (UV). Today we can use the Hive query: select distinct(id) from TestTable; 2. Frequency Estimation: estimate how many times an element is repeated, such as the number of site visits by one user. Hive query: select count(1) from TestTable where name='wangmeng' 3. Heavy Hitters (top-k elements): for example, the top-100 shops. Hive query: select count(1), name from TestTable group by name; (needs a UDF) 4. Range Query: for example, find the number of users between 20 and 30 years old. Hive query: select count(1) from TestTable where age > 20 and age < 30 5. Membership Query: for example, is a given user name already registered? Given the way Hive executes these queries, they consume a large amount of memory and take a long time. However, in many cases we do not need exact results, and a small error can be tolerated. In such cases, approximate processing can greatly improve both time and space efficiency. Now, based on some theoretical analysis materials, I would very much like to work on these new features if possible. I am familiar with Hive and Hadoop, and I have implemented an efficient storage format based on Hive (https://github.com/sjtufighter/Data---Storage--). So, is there anything I can do? Many thanks.
big data approximate processing at a very low cost based on hive sql Key: HIVE-7296 URL: https://issues.apache.org/jira/browse/HIVE-7296 Project: Hive Issue Type: New Feature Reporter: wangmeng For big data analysis, we often need the following kinds of queries and statistics: 1. Cardinality Estimation: count the number of distinct elements in a collection, such as Unique Visitors (UV). Today we can use the Hive query: select distinct(id) from TestTable; 2. Frequency Estimation: estimate how many times an element is repeated, such as the number of site visits by one user. Hive query: select count(1) from TestTable where name='wangmeng' 3. Heavy Hitters (top-k elements): for example, the top-100 shops. Hive query: select count(1), name from TestTable group by name; (needs a UDF) 4. Range Query: for example, find the number of users between 20 and 30 years old. Hive query: select count(1) from TestTable where age > 20 and age < 30 5. Membership Query: for example, is a given user name already registered? Given the way Hive executes these queries, they consume a large amount of memory and take a long time. However, in many cases we do not need exact results, and a small error can be tolerated. In such cases, approximate processing can greatly improve both time and space efficiency. Now, based on some theoretical analysis materials, I would very much like to work on these new features if possible. So, is there anything I can do? Many thanks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7296) big data approximate processing at a very low cost based on hive sql
[ https://issues.apache.org/jira/browse/HIVE-7296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045604#comment-14045604 ] wangmeng commented on HIVE-7296: Sorry, they are different features. big data approximate processing at a very low cost based on hive sql Key: HIVE-7296 URL: https://issues.apache.org/jira/browse/HIVE-7296 Project: Hive Issue Type: New Feature Reporter: wangmeng For big data analysis, we often need the following kinds of queries and statistics: 1. Cardinality Estimation: count the number of distinct elements in a collection, such as Unique Visitors (UV). Today we can use the Hive query: select distinct(id) from TestTable; 2. Frequency Estimation: estimate how many times an element is repeated, such as the number of site visits by one user. Hive query: select count(1) from TestTable where name='wangmeng' 3. Heavy Hitters (top-k elements): for example, the top-100 shops. Hive query: select count(1), name from TestTable group by name; (needs a UDF) 4. Range Query: for example, find the number of users between 20 and 30 years old. Hive query: select count(1) from TestTable where age > 20 and age < 30 5. Membership Query: for example, is a given user name already registered? Given the way Hive executes these queries, they consume a large amount of memory and take a long time. However, in many cases we do not need exact results, and a small error can be tolerated. In such cases, approximate processing can greatly improve both time and space efficiency. Now, based on some theoretical analysis materials, I would very much like to work on these new features if possible. I am familiar with Hive and Hadoop, and I have implemented an efficient storage format based on Hive (https://github.com/sjtufighter/Data---Storage--). So, is there anything I can do? Many thanks. -- This message was sent by Atlassian JIRA (v6.2#6252)
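For context on the kind of approximate technique HIVE-7296 asks about, below is a minimal, illustrative HyperLogLog-style cardinality estimator in Java. It is only a sketch of the idea (register array, leading-zero ranks, bias correction); the class name, the CRC32-based hash, and the precision value are assumptions made for this example and are not part of any Hive patch.

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Minimal HyperLogLog-style distinct-count sketch (illustrative only). */
public class CardinalitySketch {
  private final int p;            // precision bits; 2^p one-byte registers
  private final byte[] registers;

  public CardinalitySketch(int p) {
    this.p = p;
    this.registers = new byte[1 << p];
  }

  private long hash(String value) {
    CRC32 crc = new CRC32();
    crc.update(value.getBytes(StandardCharsets.UTF_8));
    // CRC32 is a weak 32-bit hash; multiply by an odd constant to spread bits.
    return crc.getValue() * 0x9E3779B97F4A7C15L;
  }

  public void add(String value) {
    long h = hash(value);
    int idx = (int) (h >>> (64 - p));                 // top p bits pick a register
    int rank = Long.numberOfLeadingZeros(h << p) + 1; // position of first 1-bit in the rest
    if (rank > registers[idx]) {
      registers[idx] = (byte) rank;
    }
  }

  public double estimate() {
    int m = registers.length;
    double alpha = 0.7213 / (1 + 1.079 / m);
    double sum = 0;
    int zeros = 0;
    for (byte r : registers) {
      sum += Math.pow(2, -r);
      if (r == 0) zeros++;
    }
    double e = alpha * m * m / sum;
    if (e <= 2.5 * m && zeros > 0) {  // small-range correction (linear counting)
      e = m * Math.log((double) m / zeros);
    }
    return e;
  }

  public static void main(String[] args) {
    CardinalitySketch sketch = new CardinalitySketch(14); // 16384 registers
    for (int i = 0; i < 1_000_000; i++) {
      sketch.add("user-" + (i % 250_000));              // 250,000 distinct ids
    }
    System.out.printf("estimated distinct ids: %.0f%n", sketch.estimate());
  }
}
{code}

Exposed as a Hive UDAF, a sketch like this could be merged across mappers and would give an approximate distinct count in constant memory, which is the time/space trade-off the reporter describes.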
[jira] [Commented] (HIVE-6013) Supporting Quoted Identifiers in Column Names
[ https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045607#comment-14045607 ] Lefty Leverenz commented on HIVE-6013: -- The configuration parameter *hive.support.quoted.identifiers* is documented in the wiki: * [Configuration Properties -- hive.support.quoted.identifiers | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.support.quoted.identifiers] Supporting Quoted Identifiers in Column Names - Key: HIVE-6013 URL: https://issues.apache.org/jira/browse/HIVE-6013 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Fix For: 0.13.0 Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, QuotedIdentifier.html Hive's current behavior for quoted identifiers differs from the standard interpretation: a quoted identifier (using backticks) has a special interpretation in select expressions (it is treated as a regular expression). The current behavior and a proposed solution are documented in the attached doc. Summary of the solution: - Introduce 'standard' quoted identifiers for columns only. - At the language level this is turned on by a flag. - At the metadata level we relax the constraint on column names. -- This message was sent by Atlassian JIRA (v6.2#6252)
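As a usage illustration of the parameter documented above, here is a small, hedged JDBC sketch. The table and column names are made up and it assumes a HiveServer2 instance on localhost; the only parts taken from the issue are the hive.support.quoted.identifiers=column setting and the backtick-quoted column names it enables.

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class QuotedIdentifierExample {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "", "");
         Statement stmt = conn.createStatement()) {
      // With 'column', backticks quote arbitrary column names instead of
      // being interpreted as regular expressions over column names.
      stmt.execute("SET hive.support.quoted.identifiers=column");
      stmt.execute("CREATE TABLE IF NOT EXISTS quoted_demo (`user id` INT, `x+y` DOUBLE)");
      try (ResultSet rs = stmt.executeQuery("SELECT `user id`, `x+y` FROM quoted_demo")) {
        while (rs.next()) {
          System.out.println(rs.getInt(1) + "\t" + rs.getDouble(2));
        }
      }
    }
  }
}
{code}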
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045609#comment-14045609 ] Hive QA commented on HIVE-7220: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12652750/HIVE-7220.5.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5670 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/613/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/613/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-613/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12652750 Empty dir in external table causes issue (root_dir_external_table.q failure) Key: HIVE-7220 URL: https://issues.apache.org/jira/browse/HIVE-7220 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7220.2.patch, HIVE-7220.3.patch, HIVE-7220.4.patch, HIVE-7220.5.patch, HIVE-7220.patch While looking at root_dir_external_table.q failure, which is doing a query on an external table located at root ('/'), I noticed that latest Hadoop2 CombineFileInputFormat returns split representing empty directories (like '/Users'), which leads to failure in Hive's CombineFileRecordReader as it tries to open the directory for processing. Tried with an external table in a normal HDFS directory, and it also returns the same error. Looks like a real bug. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045614#comment-14045614 ] Szehon Ho commented on HIVE-7220: - dynpart_sort_optimization seems to be failing consistently, I'll try to take a look. Empty dir in external table causes issue (root_dir_external_table.q failure) Key: HIVE-7220 URL: https://issues.apache.org/jira/browse/HIVE-7220 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7220.2.patch, HIVE-7220.3.patch, HIVE-7220.4.patch, HIVE-7220.5.patch, HIVE-7220.patch While looking at the root_dir_external_table.q failure, which runs a query on an external table located at root ('/'), I noticed that the latest Hadoop2 CombineFileInputFormat returns splits representing empty directories (like '/Users'), which leads to a failure in Hive's CombineFileRecordReader as it tries to open the directory for processing. Tried with an external table in a normal HDFS directory, and it also returns the same error. Looks like a real bug. -- This message was sent by Atlassian JIRA (v6.2#6252)
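To make the failure mode above concrete, the sketch below shows the kind of guard that avoids handing directories (or empty files) to a record reader when listing an external table's location. It is illustrative only, with a made-up path, and is not the actual HIVE-7220 patch, which changes Hive's input-format handling.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListDataFiles {
  /** Return only real, non-empty data files under the given table location. */
  public static List<Path> listDataFiles(Configuration conf, Path tableLocation)
      throws IOException {
    FileSystem fs = tableLocation.getFileSystem(conf);
    List<Path> files = new ArrayList<Path>();
    for (FileStatus stat : fs.listStatus(tableLocation)) {
      if (stat.isDirectory()) {
        // Recurse instead of treating the directory itself as a split.
        files.addAll(listDataFiles(conf, stat.getPath()));
      } else if (stat.getLen() > 0) {
        files.add(stat.getPath());
      }
    }
    return files;
  }
}
{code}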
[jira] [Updated] (HIVE-7302) Allow Auto-reducer parallelism to be turned off by a logical optimizer
[ https://issues.apache.org/jira/browse/HIVE-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7302: - Attachment: HIVE-7302.1.patch Allow Auto-reducer parallelism to be turned off by a logical optimizer -- Key: HIVE-7302 URL: https://issues.apache.org/jira/browse/HIVE-7302 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-7302.1.patch Auto reducer parallelism cannot be used for cases where a custom routing VertexManager is used. Make ReduceSinkDesc::isAutoParallel a tri-state with allow, disable, and unset mechanics. The allowed transitions for this setting are now: unset -> allow, unset -> disable, allow -> disable, with no transition for disable -> allow. -- This message was sent by Atlassian JIRA (v6.2#6252)
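A minimal sketch of the tri-state and its one-way transitions described above; the enum and method names here are invented for illustration, while the real change lives on ReduceSinkDesc::isAutoParallel.

{code:java}
/** Illustrative tri-state for auto reducer parallelism (names are hypothetical). */
public enum AutoParallelism {
  UNSET, ALLOW, DISABLE;

  /**
   * Allowed transitions: unset -> allow, unset -> disable, allow -> disable.
   * There is deliberately no disable -> allow: once an optimizer turns
   * auto-parallelism off, no later optimizer may turn it back on.
   */
  public AutoParallelism transition(AutoParallelism requested) {
    if (this == DISABLE) {
      return DISABLE;      // disable is sticky
    }
    if (requested == UNSET) {
      return this;         // never fall back to unset
    }
    return requested;      // UNSET or ALLOW may move to ALLOW or DISABLE
  }
}
{code}

This mirrors the state machine in the issue: an optimizer that installs a custom-routing VertexManager can force DISABLE and rely on it staying disabled.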
[jira] [Updated] (HIVE-7288) Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs
[ https://issues.apache.org/jira/browse/HIVE-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shanyu zhao updated HIVE-7288: -- Attachment: (was: hive-7288.patch) Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs - Key: HIVE-7288 URL: https://issues.apache.org/jira/browse/HIVE-7288 Project: Hive Issue Type: New Feature Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1 Environment: HDInsight deploying HDP 2.1; Also HDP 2.1 on Windows Reporter: Azim Uddin Assignee: shanyu zhao Attachments: hive-7288.patch Issue: == Due to the lack of parameters equivalent to '-libjars' and '-archives' in the WebHCat REST API, we cannot use external Java JARs or archive files with a Streaming MapReduce job when the job is submitted via WebHCat/Templeton. I am citing a few use cases here, but there can be plenty of scenarios like this: #1 (for -archives): In order to use R with a hadoop distribution like HDInsight or HDP on Windows, we could package the R directory up in a zip file, rename it to r.jar, and put it into HDFS or WASB. We can then do something like this from the hadoop command line (ignore the wasb syntax; the same command can be run with hdfs) - hadoop jar %HADOOP_HOME%\lib\hadoop-streaming.jar -archives wasb:///example/jars/r.jar -files wasb:///example/apps/mapper.r,wasb:///example/apps/reducer.r -mapper ./r.jar/bin/Rscript.exe mapper.r -reducer ./r.jar/bin/Rscript.exe reducer.r -input /example/data/gutenberg -output /probe/r/wordcount This works from the hadoop command line, but due to the lack of support for the '-archives' parameter in WebHCat, we can't submit the same Streaming MR job via WebHCat. #2 (for -libjars): Consider a scenario where a user would like to use a custom InputFormat with a Streaming MapReduce job and has written his own custom InputFormat JAR. From the hadoop command line we can do something like this - hadoop jar /path/to/hadoop-streaming.jar \ -libjars /path/to/custom-formats.jar \ -D map.output.key.field.separator=, \ -D mapred.text.key.partitioner.options=-k1,1 \ -input my_data/ \ -output my_output/ \ -outputformat test.example.outputformat.DateFieldMultipleOutputFormat \ -mapper my_mapper.py \ -reducer my_reducer.py \ But due to the lack of support for the '-libjars' parameter for streaming MapReduce jobs in WebHCat, we can't submit the above streaming MR job (which uses a custom Java JAR) via WebHCat. Impact: We think being able to submit jobs remotely is a vital feature for hadoop to be enterprise-ready, and WebHCat plays an important role there. Streaming MapReduce jobs are also very important for interoperability. So, it would be very useful to keep WebHCat on par with the hadoop command line in terms of streaming MR job submission capability. Ask: Enable parameter support for '-libjars' and '-archives' for Hadoop streaming jobs in WebHCat. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7302) Allow Auto-reducer parallelism to be turned off by a logical optimizer
[ https://issues.apache.org/jira/browse/HIVE-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045619#comment-14045619 ] Gunther Hagleitner commented on HIVE-7302: -- .1 has a draft. Two things: a) [~vikram.dixit], is this the right place to disable? b) We need a way to test this. I am considering adding auto-parallelism to the explain plan - maybe just the extended one - so we can add some tests to make sure this doesn't break. Allow Auto-reducer parallelism to be turned off by a logical optimizer -- Key: HIVE-7302 URL: https://issues.apache.org/jira/browse/HIVE-7302 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-7302.1.patch Auto reducer parallelism cannot be used for cases where a custom routing VertexManager is used. Make ReduceSinkDesc::isAutoParallel a tri-state with allow, disable, and unset mechanics. The allowed transitions for this setting are now: unset -> allow, unset -> disable, allow -> disable, with no transition for disable -> allow. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7288) Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs
[ https://issues.apache.org/jira/browse/HIVE-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shanyu zhao updated HIVE-7288: -- Attachment: hive-7288.patch Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs - Key: HIVE-7288 URL: https://issues.apache.org/jira/browse/HIVE-7288 Project: Hive Issue Type: New Feature Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1 Environment: HDInsight deploying HDP 2.1; Also HDP 2.1 on Windows Reporter: Azim Uddin Assignee: shanyu zhao Attachments: hive-7288.patch Issue: == Due to lack of parameters (or support for) equivalent of '-libjars' and '-archives' in WebHcat REST API, we cannot use an external Java Jars or Archive files with a Streaming MapReduce job, when the job is submitted via WebHcat/templeton. I am citing a few use cases here, but there can be plenty of scenarios like this- #1 (for -archives):In order to use R with a hadoop distribution like HDInsight or HDP on Windows, we could package the R directory up in a zip file and rename it to r.jar and put it into HDFS or WASB. We can then do something like this from hadoop command line (ignore the wasb syntax, same command can be run with hdfs) - hadoop jar %HADOOP_HOME%\lib\hadoop-streaming.jar -archives wasb:///example/jars/r.jar -files wasb:///example/apps/mapper.r,wasb:///example/apps/reducer.r -mapper ./r.jar/bin/Rscript.exe mapper.r -reducer ./r.jar/bin/Rscript.exe reducer.r -input /example/data/gutenberg -output /probe/r/wordcount This works from hadoop command line, but due to lack of support for '-archives' parameter in WebHcat, we can't submit the same Streaming MR job via WebHcat. #2 (for -libjars): Consider a scenario where a user would like to use a custom inputFormat with a Streaming MapReduce job and wrote his own custom InputFormat JAR. From a hadoop command line we can do something like this - hadoop jar /path/to/hadoop-streaming.jar \ -libjars /path/to/custom-formats.jar \ -D map.output.key.field.separator=, \ -D mapred.text.key.partitioner.options=-k1,1 \ -input my_data/ \ -output my_output/ \ -outputformat test.example.outputformat.DateFieldMultipleOutputFormat \ -mapper my_mapper.py \ -reducer my_reducer.py \ But due to lack of support for '-libjars' parameter for streaming MapReduce job in WebHcat, we can't submit the above streaming MR job (that uses a custom Java JAR) via WebHcat. Impact: We think, being able to submit jobs remotely is a vital feature for hadoop to be enterprise-ready and WebHcat plays an important role there. Streaming MapReduce job is also very important for interoperability. So, it would be very useful to keep WebHcat on par with hadoop command line in terms of streaming MR job submission capability. Ask: Enable parameter support for 'libjars' and 'archives' in WebHcat for Hadoop streaming jobs in WebHcat. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf
[ https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045622#comment-14045622 ] Lefty Leverenz commented on HIVE-6037: -- Will this be committed anytime soon? Synchronize HiveConf with hive-default.xml.template and support show conf - Key: HIVE-6037 URL: https://issues.apache.org/jira/browse/HIVE-6037 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Assignee: Navis Priority: Minor Fix For: 0.14.0 Attachments: CHIVE-6037.3.patch.txt, HIVE-6037-0.13.0, HIVE-6037.1.patch.txt, HIVE-6037.10.patch.txt, HIVE-6037.11.patch.txt, HIVE-6037.12.patch.txt, HIVE-6037.14.patch.txt, HIVE-6037.15.patch.txt, HIVE-6037.16.patch.txt, HIVE-6037.17.patch, HIVE-6037.2.patch.txt, HIVE-6037.4.patch.txt, HIVE-6037.5.patch.txt, HIVE-6037.6.patch.txt, HIVE-6037.7.patch.txt, HIVE-6037.8.patch.txt, HIVE-6037.9.patch.txt, HIVE-6037.patch see HIVE-5879 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7299) Enable metadata only optimization on Tez
[ https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7299: - Status: Patch Available (was: Open) Enable metadata only optimization on Tez Key: HIVE-7299 URL: https://issues.apache.org/jira/browse/HIVE-7299 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch, HIVE-7299.3.patch, HIVE-7299.4.patch Enables the metadata only optimization (the one with OneNullRowInputFormat, not the query-result-from-stats optimization) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7299) Enable metadata only optimization on Tez
[ https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7299: - Attachment: HIVE-7299.4.patch Enable metadata only optimization on Tez Key: HIVE-7299 URL: https://issues.apache.org/jira/browse/HIVE-7299 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch, HIVE-7299.3.patch, HIVE-7299.4.patch Enables the metadata only optimization (the one with OneNullRowInputFormat, not the query-result-from-stats optimization) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7299) Enable metadata only optimization on Tez
[ https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7299: - Status: Open (was: Patch Available) Enable metadata only optimization on Tez Key: HIVE-7299 URL: https://issues.apache.org/jira/browse/HIVE-7299 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch, HIVE-7299.3.patch, HIVE-7299.4.patch Enables the metadata only optimization (the one with OneNullRowInputFormat, not the query-result-from-stats optimization) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7299) Enable metadata only optimization on Tez
[ https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045624#comment-14045624 ] Hive QA commented on HIVE-7299: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12652765/HIVE-7299.4.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/614/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/614/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-614/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-614/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'shims/common-secure/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java' Reverted 'itests/qtest/testconfiguration.properties' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target ql/src/test/results/clientpositive/empty_dir_in_table.q.out ql/src/test/queries/clientpositive/empty_dir_in_table.q + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1605969. At revision 1605969. 
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12652765 Enable metadata only optimization on Tez Key: HIVE-7299 URL: https://issues.apache.org/jira/browse/HIVE-7299 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch, HIVE-7299.3.patch, HIVE-7299.4.patch Enables the metadata only optimization (the one with OneNullRowInputFormat not the query-result-from-stats optimizaton) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6375) Fix CTAS for parquet
[ https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045627#comment-14045627 ] Lefty Leverenz commented on HIVE-6375: -- Great, I'll revise the Parquet doc. Thanks [~szehon]. Fix CTAS for parquet Key: HIVE-6375 URL: https://issues.apache.org/jira/browse/HIVE-6375 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Brock Noland Assignee: Szehon Ho Priority: Critical Labels: Parquet Fix For: 0.13.0 Attachments: HIVE-6375.2.patch, HIVE-6375.3.patch, HIVE-6375.4.patch, HIVE-6375.patch More details here: https://github.com/Parquet/parquet-mr/issues/272 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs
[ https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045634#comment-14045634 ] Ravi Prakash commented on HIVE-4629: Hi Vaibhav! Sorry for the delay. I wish I had enough cycles to take this up, but that's just not happening. HS2 should support an API to retrieve query logs Key: HIVE-4629 URL: https://issues.apache.org/jira/browse/HIVE-4629 Project: Hive Issue Type: Sub-task Components: HiveServer2 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4629-no_thrift.1.patch, HIVE-4629.1.patch, HIVE-4629.2.patch, HIVE-4629.3.patch.txt HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7302) Allow Auto-reducer parallelism to be turned off by a logical optimizer
[ https://issues.apache.org/jira/browse/HIVE-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7302: - Status: Patch Available (was: Open) Allow Auto-reducer parallelism to be turned off by a logical optimizer -- Key: HIVE-7302 URL: https://issues.apache.org/jira/browse/HIVE-7302 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-7302.1.patch Auto reducer parallelism cannot be used for cases where a custom routing VertexManager is used. Make ReduceSinkDesc::isAutoParallel a tri-state with allow, disable, and unset mechanics. The allowed transitions for this setting are now: unset -> allow, unset -> disable, allow -> disable, with no transition for disable -> allow. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7299) Enable metadata only optimization on Tez
[ https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7299: - Status: Patch Available (was: Open) Enable metadata only optimization on Tez Key: HIVE-7299 URL: https://issues.apache.org/jira/browse/HIVE-7299 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch, HIVE-7299.3.patch, HIVE-7299.4.patch, HIVE-7299.5.patch Enables the metadata only optimization (the one with OneNullRowInputFormat, not the query-result-from-stats optimization) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7299) Enable metadata only optimization on Tez
[ https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7299: - Attachment: HIVE-7299.5.patch .5 is rebased. Enable metadata only optimization on Tez Key: HIVE-7299 URL: https://issues.apache.org/jira/browse/HIVE-7299 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch, HIVE-7299.3.patch, HIVE-7299.4.patch, HIVE-7299.5.patch Enables the metadata only optimization (the one with OneNullRowInputFormat, not the query-result-from-stats optimization) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7299) Enable metadata only optimization on Tez
[ https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7299: - Status: Open (was: Patch Available) Enable metadata only optimization on Tez Key: HIVE-7299 URL: https://issues.apache.org/jira/browse/HIVE-7299 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch, HIVE-7299.3.patch, HIVE-7299.4.patch, HIVE-7299.5.patch Enables the metadata only optimization (the one with OneNullRowInputFormat, not the query-result-from-stats optimization) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7288) Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs
[ https://issues.apache.org/jira/browse/HIVE-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045647#comment-14045647 ] shanyu zhao commented on HIVE-7288: --- [~ekoifman] and [~thejas], would you please help review this patch? Thanks. Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs - Key: HIVE-7288 URL: https://issues.apache.org/jira/browse/HIVE-7288 Project: Hive Issue Type: New Feature Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1 Environment: HDInsight deploying HDP 2.1; Also HDP 2.1 on Windows Reporter: Azim Uddin Assignee: shanyu zhao Attachments: hive-7288.patch Issue: == Due to lack of parameters (or support for) equivalent of '-libjars' and '-archives' in WebHcat REST API, we cannot use an external Java Jars or Archive files with a Streaming MapReduce job, when the job is submitted via WebHcat/templeton. I am citing a few use cases here, but there can be plenty of scenarios like this- #1 (for -archives):In order to use R with a hadoop distribution like HDInsight or HDP on Windows, we could package the R directory up in a zip file and rename it to r.jar and put it into HDFS or WASB. We can then do something like this from hadoop command line (ignore the wasb syntax, same command can be run with hdfs) - hadoop jar %HADOOP_HOME%\lib\hadoop-streaming.jar -archives wasb:///example/jars/r.jar -files wasb:///example/apps/mapper.r,wasb:///example/apps/reducer.r -mapper ./r.jar/bin/Rscript.exe mapper.r -reducer ./r.jar/bin/Rscript.exe reducer.r -input /example/data/gutenberg -output /probe/r/wordcount This works from hadoop command line, but due to lack of support for '-archives' parameter in WebHcat, we can't submit the same Streaming MR job via WebHcat. #2 (for -libjars): Consider a scenario where a user would like to use a custom inputFormat with a Streaming MapReduce job and wrote his own custom InputFormat JAR. From a hadoop command line we can do something like this - hadoop jar /path/to/hadoop-streaming.jar \ -libjars /path/to/custom-formats.jar \ -D map.output.key.field.separator=, \ -D mapred.text.key.partitioner.options=-k1,1 \ -input my_data/ \ -output my_output/ \ -outputformat test.example.outputformat.DateFieldMultipleOutputFormat \ -mapper my_mapper.py \ -reducer my_reducer.py \ But due to lack of support for '-libjars' parameter for streaming MapReduce job in WebHcat, we can't submit the above streaming MR job (that uses a custom Java JAR) via WebHcat. Impact: We think, being able to submit jobs remotely is a vital feature for hadoop to be enterprise-ready and WebHcat plays an important role there. Streaming MapReduce job is also very important for interoperability. So, it would be very useful to keep WebHcat on par with hadoop command line in terms of streaming MR job submission capability. Ask: Enable parameter support for 'libjars' and 'archives' in WebHcat for Hadoop streaming jobs in WebHcat. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045648#comment-14045648 ] Lefty Leverenz commented on HIVE-6938: -- I mentioned this in the Limitations section of the Parquet wikidoc, but it could use an example and some usage notes in a new subsection. * [Language Manual -- Parquet -- Limitations | https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Limitations] Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Assignee: Daniel Weeks Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-6938.1.patch, HIVE-6938.2.patch, HIVE-6938.2.patch, HIVE-6938.3.patch, HIVE-6938.3.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for Parquet is for the SerDe to access columns by name as opposed to by index. Parquet should allow for either columnar (index-based) access or name-based access, because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-1662) Add file pruning into Hive.
[ https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045656#comment-14045656 ] Lefty Leverenz commented on HIVE-1662: -- Or you could put the description of *hive.optimize.ppd.vc.filename* in a HiveConf.java comment. bq. Please supply a definition for hive.optimize.ppd.vc.filename in hive-default.xml.template. (Or is it an internal parameter that doesn't need documentation?) Add file pruning into Hive. --- Key: HIVE-1662 URL: https://issues.apache.org/jira/browse/HIVE-1662 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Assignee: Navis Attachments: HIVE-1662.10.patch.txt, HIVE-1662.11.patch.txt, HIVE-1662.12.patch.txt, HIVE-1662.13.patch.txt, HIVE-1662.14.patch.txt, HIVE-1662.15.patch.txt, HIVE-1662.16.patch.txt, HIVE-1662.17.patch.txt, HIVE-1662.8.patch.txt, HIVE-1662.9.patch.txt, HIVE-1662.D8391.1.patch, HIVE-1662.D8391.2.patch, HIVE-1662.D8391.3.patch, HIVE-1662.D8391.4.patch, HIVE-1662.D8391.5.patch, HIVE-1662.D8391.6.patch, HIVE-1662.D8391.7.patch Hive now supports a filename virtual column. If a filename filter is present in a query, Hive should be able to add only the files that pass the filter to the input paths. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7302) Allow Auto-reducer parallelism to be turned off by a logical optimizer
[ https://issues.apache.org/jira/browse/HIVE-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045662#comment-14045662 ] Vikram Dixit K commented on HIVE-7302: -- I discussed this with Gopal today and I think that is the right place to replace it. I tested this on the bucket map join tests I am currently modifying and I am getting the right behavior. Showing this info in the explain extended plan is a good idea. One issue: if during execution we actually change the number of reducers in Tez, the number shown may not have any real meaning beyond indicating that an estimation took place. Allow Auto-reducer parallelism to be turned off by a logical optimizer -- Key: HIVE-7302 URL: https://issues.apache.org/jira/browse/HIVE-7302 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-7302.1.patch Auto reducer parallelism cannot be used for cases where a custom routing VertexManager is used. Make ReduceSinkDesc::isAutoParallel a tri-state with allow, disable, and unset mechanics. The allowed transitions for this setting are now: unset -> allow, unset -> disable, allow -> disable, with no transition for disable -> allow. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6584) Add HiveHBaseTableSnapshotInputFormat
[ https://issues.apache.org/jira/browse/HIVE-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teng Yutong updated HIVE-6584: -- Attachment: HIVE-6584.6.patch Hi, sorry for the late reply... this is the regenerated patch. It won't work unless HBase is modified, because we need HBase to expose TableSnapshotRegionSplit and convertStringToScan. BR Add HiveHBaseTableSnapshotInputFormat - Key: HIVE-6584 URL: https://issues.apache.org/jira/browse/HIVE-6584 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Nick Dimiduk Assignee: Nick Dimiduk Fix For: 0.14.0 Attachments: HIVE-6584.0.patch, HIVE-6584.1.patch, HIVE-6584.2.patch, HIVE-6584.3.patch, HIVE-6584.4.patch, HIVE-6584.5.patch, HIVE-6584.6.patch HBASE-8369 provided mapreduce support for reading from HBase table snapshots. This allows an MR job to consume a stable, read-only view of an HBase table directly off of HDFS. Bypassing the online region server API provides a nice performance boost for the full scan. HBASE-10642 is backporting that feature to 0.94/0.96 and also adding a {{mapred}} implementation. Once that's available, we should add an input format. A follow-on patch could work out how to integrate this functionality into the StorageHandler, similar to how HIVE-6473 integrates the HFileOutputFormat into existing table definitions. -- This message was sent by Atlassian JIRA (v6.2#6252)
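For readers unfamiliar with the HBase side, the sketch below shows roughly how a plain MapReduce job is pointed at a table snapshot via TableSnapshotInputFormat; the snapshot name, restore directory, and job details are placeholders, and the Hive integration this issue adds wraps the same mechanism behind the HBase storage handler.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat;
import org.apache.hadoop.mapreduce.Job;

public class SnapshotScanJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "scan-over-snapshot");
    job.setJarByClass(SnapshotScanJob.class);

    // Read the snapshot "my_table_snapshot" directly from HDFS, restoring its
    // region metadata under a scratch directory instead of going through region servers.
    TableSnapshotInputFormat.setInput(job, "my_table_snapshot",
        new Path("/tmp/snapshot-restore"));
    job.setInputFormatClass(TableSnapshotInputFormat.class);

    // Mapper/reducer setup omitted for brevity; addDependencyJars ships the
    // HBase classes the tasks need.
    TableMapReduceUtil.addDependencyJars(job);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
{code}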
Review Request 23124: HIVE-7291: Refactor TestParser to understand test-property file
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23124/ --- Review request for hive and Brock Noland. Bugs: HIVE-7291 https://issues.apache.org/jira/browse/HIVE-7291 Repository: hive-git Description --- Add some new properties: qFileTests.propertyFiles.${propertyFileName}=${propertyFilePath} The test parser will look into ${propertyFilePath} and populate a list of {propertyName, propertyValue} pairs. If any qtest group specifies ${propertyFileName}.${propertyName}, it will substitute that property name with the property value from the corresponding property file. Diffs - testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/conf/TestParser.java 3155f08 testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/conf/TestTestParser.java 35680b4 Diff: https://reviews.apache.org/r/23124/diff/ Testing --- Added a unit test. Also did manual testing on the current trunk-mr2.properties to verify. Thanks, Szehon Ho
[jira] [Commented] (HIVE-7302) Allow Auto-reducer parallelism to be turned off by a logical optimizer
[ https://issues.apache.org/jira/browse/HIVE-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045688#comment-14045688 ] Hive QA commented on HIVE-7302: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12652762/HIVE-7302.1.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5670 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/615/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/615/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-615/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12652762 Allow Auto-reducer parallelism to be turned off by a logical optimizer -- Key: HIVE-7302 URL: https://issues.apache.org/jira/browse/HIVE-7302 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-7302.1.patch Auto reducer parallelism cannot be used for cases where a custom routing VertexManager is used. Allow a tri-state for ReduceSinkDesc::isAutoParallel to allow allow, disable, unset mechanics. The state machine for this setting will now be Allowed transitions unset - allow unset - disable allow - disable with no transition case for disable - allow. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7291) Refactor TestParser to understand test-property file
[ https://issues.apache.org/jira/browse/HIVE-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-7291: Attachment: HIVE-7291.patch Attaching a patch, which adds the following syntax to the property file: qFileTests.propertyFiles.${propertyFileName}=${propertyFilePath} The test parser will look into ${propertyFilePath} and populate a list of {propertyName, propertyValue} pairs. If any qtest group specifies ${propertyFileName}.${propertyName}, it will substitute that property name with the property value from the corresponding property file. Refactor TestParser to understand test-property file Key: HIVE-7291 URL: https://issues.apache.org/jira/browse/HIVE-7291 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7291.patch, trunk-mr2.properties -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7291) Refactor TestParser to understand test-property file
[ https://issues.apache.org/jira/browse/HIVE-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-7291: Attachment: trunk-mr2.properties For reference, I am attaching the new trunk-mr2.properties that it will run on. Since the hard-coded list in the old trunk-mr2.properties (see the parent JIRA) is out of date, I believe we might see some failures of new tests once we switch. Refactor TestParser to understand test-property file Key: HIVE-7291 URL: https://issues.apache.org/jira/browse/HIVE-7291 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7291.patch, trunk-mr2.properties -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 23124: HIVE-7291: Refactor TestParser to understand test-property file
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23124/ --- (Updated June 27, 2014, 8:06 a.m.) Review request for hive and Brock Noland. Bugs: HIVE-7291 https://issues.apache.org/jira/browse/HIVE-7291 Repository: hive-git Description (updated) --- Add some new properties: qFileTests.propertyFiles.${propertyFileName}=${propertyFilePath} The test parser will look into ${propertyFilePath} and populate a list of {propertyName, propertyValue} pairs. If any qtest group specifies ${propertyFileName}.${propertyName}, it will substitute that property name with the property value from the corresponding property file. Also, {excluded} was previously mutually exclusive with {included}. I'm changing that for qtest so it's easier to disable tests in the ptest framework in an emergency. (Before, we hard-coded the tests in the {included} group and could change them at will, but now this list is shared with the main build, so we can no longer change it at will.) Diffs - testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/conf/TestParser.java 3155f08 testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/conf/TestTestParser.java 35680b4 Diff: https://reviews.apache.org/r/23124/diff/ Testing --- Added a unit test. Also did manual testing on the current trunk-mr2.properties to verify. Thanks, Szehon Ho
[jira] [Updated] (HIVE-7291) Refactor TestParser to understand test-property file
[ https://issues.apache.org/jira/browse/HIVE-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-7291: Status: Patch Available (was: Open) Refactor TestParser to understand test-property file Key: HIVE-7291 URL: https://issues.apache.org/jira/browse/HIVE-7291 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7291.patch, trunk-mr2.properties NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7291) Refactor TestParser to understand test-property file
[ https://issues.apache.org/jira/browse/HIVE-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-7291: Description: NO PRECOMMIT TESTS Refactor TestParser to understand test-property file Key: HIVE-7291 URL: https://issues.apache.org/jira/browse/HIVE-7291 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7291.patch, trunk-mr2.properties NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7291) Refactor TestParser to understand test-property file
[ https://issues.apache.org/jira/browse/HIVE-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045703#comment-14045703 ] Szehon Ho commented on HIVE-7291: - That first comment auto-formatted strangely; writing the same again in formatted blocks. With the new syntax: {noformat} qFileTests.propertyFiles.${propertyFileName}=${propertyFilePath} {noformat} the test parser will open the file specified at {noformat}${propertyFilePath} {noformat} and populate a list of {noformat}{propertyName,propertyValue}{noformat} Then, if any qtest test-list specifies {noformat} ${propertyFileName}.${propertyName} {noformat} as an item, it will resolve that by looking up the corresponding propertyValue (a list of tests) in the corresponding property file. In the real example, the property file will be testconfiguration.properties added in HIVE-7258, and the property names listed in that file are miniTez.query.files, minimr.query.files, etc. Refactor TestParser to understand test-property file Key: HIVE-7291 URL: https://issues.apache.org/jira/browse/HIVE-7291 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7291.patch, trunk-mr2.properties NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
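A small sketch of the substitution logic described above, using java.util.Properties. The group prefix "mainProperties" and the q-file names are illustrative stand-ins, not the actual TestParser code; only the testconfiguration.properties file and the miniTez.query.files property name are taken from the comment.

{code:java}
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class TestListResolver {
  public static void main(String[] args) throws Exception {
    // Corresponds to: qFileTests.propertyFiles.mainProperties=itests/qtest/testconfiguration.properties
    Properties qfileProps = new Properties();
    try (FileReader reader = new FileReader("itests/qtest/testconfiguration.properties")) {
      qfileProps.load(reader);
    }

    // A test group may now list "mainProperties.miniTez.query.files" instead of
    // hard-coding every q-file name; the parser resolves it against the loaded file.
    String[] groupItems = {"mainProperties.miniTez.query.files", "some_other_test.q"};
    List<String> resolved = new ArrayList<String>();
    for (String item : groupItems) {
      if (item.startsWith("mainProperties.")) {
        String value = qfileProps.getProperty(item.substring("mainProperties.".length()), "");
        for (String qfile : value.split(",")) {
          if (!qfile.trim().isEmpty()) {
            resolved.add(qfile.trim());
          }
        }
      } else {
        resolved.add(item);
      }
    }
    System.out.println(resolved);
  }
}
{code}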
[jira] [Commented] (HIVE-7299) Enable metadata only optimization on Tez
[ https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045798#comment-14045798 ] Hive QA commented on HIVE-7299: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12652770/HIVE-7299.5.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5655 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/616/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/616/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-616/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12652770 Enable metadata only optimization on Tez Key: HIVE-7299 URL: https://issues.apache.org/jira/browse/HIVE-7299 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch, HIVE-7299.3.patch, HIVE-7299.4.patch, HIVE-7299.5.patch Enables the metadata only optimization (the one with OneNullRowInputFormat not the query-result-from-stats optimizaton) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6584) Add HiveHBaseTableSnapshotInputFormat
[ https://issues.apache.org/jira/browse/HIVE-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045804#comment-14045804 ] Hive QA commented on HIVE-6584: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12652773/HIVE-6584.6.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/617/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/617/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-617/ Messages: {noformat} This message was trimmed, see log for full details [INFO] - [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSnapshotSplit.java:[10,66] package org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat does not exist [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSnapshotSplit.java:[17,17] cannot find symbol symbol: class TableSnapshotRegionSplit location: class org.apache.hadoop.hive.hbase.HBaseSnapshotSplit [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSnapshotSplit.java:[24,29] cannot find symbol symbol: class TableSnapshotRegionSplit location: class org.apache.hadoop.hive.hbase.HBaseSnapshotSplit [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSnapshotSplit.java:[29,10] cannot find symbol symbol: class TableSnapshotRegionSplit location: class org.apache.hadoop.hive.hbase.HBaseSnapshotSplit [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableSnapshotInputFormat.java:[28,41] cannot find symbol symbol: class TableSnapshotInputFormat location: package org.apache.hadoop.hbase.mapreduce [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableSnapshotInputFormat.java:[31,66] package org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat does not exist [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableSnapshotInputFormat.java:[33,47] cannot find symbol symbol: class ColumnMapping location: class org.apache.hadoop.hive.hbase.HBaseSerDe [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableSnapshotInputFormat.java:[76,3] cannot find symbol symbol: class TableSnapshotInputFormat location: class org.apache.hadoop.hive.hbase.HiveHBaseTableSnapshotInputFormat [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSnapshotStorageHandler.java:[34,41] cannot find symbol symbol: class TableSnapshotInputFormatImpl location: package org.apache.hadoop.hbase.mapreduce [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSnapshotStorageHandler.java:[37,47] cannot find symbol symbol: class ColumnMapping location: class org.apache.hadoop.hive.hbase.HBaseSerDe [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSnapshotSplit.java:[21,17] cannot find symbol symbol: class 
TableSnapshotRegionSplit location: class org.apache.hadoop.hive.hbase.HBaseSnapshotSplit [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableSnapshotInputFormat.java:[76,43] cannot find symbol symbol: class TableSnapshotInputFormat location: class org.apache.hadoop.hive.hbase.HiveHBaseTableSnapshotInputFormat [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableSnapshotInputFormat.java:[87,10] cannot find symbol symbol: class ColumnMapping location: class org.apache.hadoop.hive.hbase.HiveHBaseTableSnapshotInputFormat [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableSnapshotInputFormat.java:[89,54] incompatible types required: java.util.ListColumnMapping found:org.apache.hadoop.hive.hbase.ColumnMappings [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableSnapshotInputFormat.java:[108,9] cannot find symbol symbol: variable HiveHBaseInputFormatUtil location: class org.apache.hadoop.hive.hbase.HiveHBaseTableSnapshotInputFormat [ERROR]
Re: Review Request 23006: Escape control characters for explain result
On June 26, 2014, 6:51 p.m., Xuefu Zhang wrote: ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java, line 195 https://reviews.apache.org/r/23006/diff/1/?file=618089#file618089line195 For my understanding, why can't we simply replace 0x00 with a different character such as ' '? Why are we dealing with quotes and commas? Can you give an example of what's transformed to what? Navis Ryu wrote: For comments that contain spaces, just replacing 0x00 with a space would be confusing, IMHO. comment 1<0x00>comment 2<0x00> will be printed like 'comment 1','comment 2',''. For <0x00><0x00>, null will be returned (nothing in comments). Xuefu Zhang wrote: I don't know how 0x00 came into the picture in the first place. It seems reasonable to me that a comment should contain nothing but a string. Navis Ryu wrote: Like other schema information (column names and types), they are the comments for multiple columns, delimited by the special character 0x00. Xuefu Zhang wrote: It seems we cannot avoid separator conflicts completely anyway if we replace 0x00 with any character. For instance, if a comment contains ' or ',' then your solution can also lead to confusion. If that's the case, I'm wondering if replacing 0x00 with some uncommonly used character such as | would make it simpler. That is: comment1<0x00>comment2<0x00> = comment1|comment2| and <0x00><0x00> = || Of course, this is very subjective. Navis Ryu wrote: Agree. I thought quote+',' would be enough, but it seems better to use '|'. How about null when no column has a comment? Is '||' a better option? Using 0x00 as the separator doesn't solve the null problem either, so I don't think we have to address that in this JIRA. This is just for comments, so it probably doesn't really matter much. - Xuefu --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23006/#review46773 --- On June 26, 2014, 9:05 a.m., Navis Ryu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23006/ --- (Updated June 26, 2014, 9:05 a.m.) Review request for hive. Bugs: HIVE-7024 https://issues.apache.org/jira/browse/HIVE-7024 Repository: hive-git Description --- Comments for columns are now delimited by 0x00, which is binary and makes git refuse to produce a proper diff file.
Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 92545d8 ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 1149bda ql/src/test/results/clientpositive/alter_partition_coltype.q.out e86cc06 ql/src/test/results/clientpositive/annotate_stats_filter.q.out c7d58f6 ql/src/test/results/clientpositive/annotate_stats_groupby.q.out 6f72964 ql/src/test/results/clientpositive/annotate_stats_join.q.out cc816c8 ql/src/test/results/clientpositive/annotate_stats_part.q.out a0b4602 ql/src/test/results/clientpositive/annotate_stats_select.q.out 97e9473 ql/src/test/results/clientpositive/annotate_stats_table.q.out bb2d18c ql/src/test/results/clientpositive/annotate_stats_union.q.out 6d179b6 ql/src/test/results/clientpositive/auto_join_reordering_values.q.out 3f4f902 ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 72640df ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out c660cd0 ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 4abda32 ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 52a3194 ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out d807791 ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 35e0a30 ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out af3d9d6 ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out 05ef5d8 ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out e423d14 ql/src/test/results/clientpositive/binary_output_format.q.out 294aabb ql/src/test/results/clientpositive/bucket1.q.out f3eb15c ql/src/test/results/clientpositive/bucket2.q.out 9a22160 ql/src/test/results/clientpositive/bucket3.q.out 8fa9c7b ql/src/test/results/clientpositive/bucket4.q.out 032272b ql/src/test/results/clientpositive/bucket5.q.out d19fbe5 ql/src/test/results/clientpositive/bucket_map_join_1.q.out 8674a6c ql/src/test/results/clientpositive/bucket_map_join_2.q.out 8a5984d ql/src/test/results/clientpositive/bucketcontext_1.q.out 1513515 ql/src/test/results/clientpositive/bucketcontext_2.q.out d18a9be ql/src/test/results/clientpositive/bucketcontext_3.q.out e12c155 ql/src/test/results/clientpositive/bucketcontext_4.q.out 77b4882
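For readers skimming the HIVE-7024 thread above, here is a minimal, hypothetical Java sketch of the '|' substitution the reviewers converge on; the class and method names are illustrative and this is not the actual ExplainTask/PartitionDesc code.
{code}
// Hypothetical helper: render a 0x00-delimited multi-column comment string with a
// printable '|' separator, as discussed in the review above.
public final class CommentFormatting {
  private CommentFormatting() {}

  static String formatComments(String rawComments) {
    if (rawComments == null || rawComments.isEmpty()) {
      return rawComments;
    }
    // Split on the binary delimiter; the -1 limit keeps trailing empty comments,
    // so "comment 1\u0000comment 2\u0000" still yields three fields.
    String[] parts = rawComments.split("\u0000", -1);
    return String.join("|", parts);
  }

  public static void main(String[] args) {
    System.out.println(formatComments("comment 1\u0000comment 2\u0000")); // comment 1|comment 2|
    System.out.println(formatComments("\u0000\u0000"));                   // ||
  }
}
{code}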
[jira] [Resolved] (HIVE-7097) The Support for REGEX Column Broken in HIVE 0.13
[ https://issues.apache.org/jira/browse/HIVE-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sumit Kumar resolved HIVE-7097. --- Resolution: Not a Problem Thank you [~leftylev]. Marking this Resolved/Not a problem The Support for REGEX Column Broken in HIVE 0.13 Key: HIVE-7097 URL: https://issues.apache.org/jira/browse/HIVE-7097 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Sun Rui The Support for REGEX Column is OK in HIVE 0.12, but is broken in HIVE 0.13. For example: {code:sql} select `key.*` from src limit 1; {code} will fail in HIVE 0.13 with the following error from SemanticAnalyzer: {noformat} FAILED: SemanticException [Error 10004]: Line 1:7 Invalid table alias or column reference 'key.*': (possible column names are: key, value) {noformat} This issue is related to HIVE-6037. When set hive.support.quoted.identifiers=none, the issue will be gone. I am not sure the configuration was intended to break regex column. But at least the documentation needs to be updated: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-REGEXColumnSpecification I would argue backward compatibility is more important. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7304) Transitive Predicate Propagation doesn't happen properly after HIVE-7159
Harish Butani created HIVE-7304: --- Summary: Transitive Predicate Propagation doesn't happen properly after HIVE-7159 Key: HIVE-7304 URL: https://issues.apache.org/jira/browse/HIVE-7304 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani The reason for the regression is that PredicateTransitivePropagate looks at the FilterOperator below the ReduceSink. SemanticAnalyzer::genNotNullFilterForJoinSourcePlan was stacking another FilterOp for the not null check, so only that predicate was being applied transitively by PredicateTransitivePropagate. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-3628) Provide a way to use counters in Hive through UDF
[ https://issues.apache.org/jira/browse/HIVE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046073#comment-14046073 ] Sudarshan Rangarajan commented on HIVE-3628: Thanks, Navis. Is this applicable to add a file that I would want placed on the distributed cache from within my UDTF ? [not before invoking the UDTF but from within the UDTF, once I know which files to add programmatically] Lefty - I find the documentation for UDTF to be wanting. Is there a way I can edit the wikidoc too ? Provide a way to use counters in Hive through UDF - Key: HIVE-3628 URL: https://issues.apache.org/jira/browse/HIVE-3628 Project: Hive Issue Type: Improvement Components: UDF Reporter: Viji Assignee: Navis Priority: Minor Fix For: 0.11.0 Attachments: HIVE-3628.D8007.1.patch, HIVE-3628.D8007.2.patch, HIVE-3628.D8007.3.patch, HIVE-3628.D8007.4.patch, HIVE-3628.D8007.5.patch, HIVE-3628.D8007.6.patch Currently it is not possible to generate counters through UDF. We should support this. Pig currently allows this. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7295) FileStatus.getOwner on Windows returns name of group the user belongs to, instead of user name expected, fails many authorization related unit tests
[ https://issues.apache.org/jira/browse/HIVE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046119#comment-14046119 ] Xiaobing Zhou commented on HIVE-7295: - Thanks [~cnauroth] for pointing that out. Not setting ownership of new files to Administrators is preferable; that is a looser constraint by comparison. FileStatus.getOwner on Windows returns name of group the user belongs to, instead of user name expected, fails many authorization related unit tests Key: HIVE-7295 URL: https://issues.apache.org/jira/browse/HIVE-7295 Project: Hive Issue Type: Bug Components: Authorization, HCatalog, Security, Windows Affects Versions: 0.13.0 Environment: Windows Server 2008 R2 Reporter: Xiaobing Zhou Priority: Critical Unit tests in TestHdfsAuthorizationProvider, e.g. org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testTableOps, fail to run. Running org.apache.hcatalog.security.TestHdfsAuthorizationProvider Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 15.799 sec FAILURE! - in org.apache.hcatalog.security.TestHdfsAuthorizationProvider testTableOps(org.apache.hcatalog.security.TestHdfsAuthorizationProvider) Time elapsed: 15.546 sec FAILURE! junit.framework.AssertionFailedError: FAILED: AuthorizationException org.apache.hadoop.security.AccessControlException: action WRITE not permitted on path pfile:/Users/xzhou/hworks/workspace/hwx-hive-ws/hive/hcatalog/core/target/warehouse for user xzhou expected:<0> but was:<4> at junit.framework.Assert.fail(Assert.java:50) at junit.framework.Assert.failNotEquals(Assert.java:287) at junit.framework.Assert.assertEquals(Assert.java:67) at junit.framework.Assert.assertEquals(Assert.java:199) at org.apache.hcatalog.security.TestHdfsAuthorizationProvider.exec(TestHdfsAuthorizationProvider.java:172) at org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testTableOps(TestHdfsAuthorizationProvider.java:307) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7304) Transitive Predicate Propagation doesn't happen properly after HIVE-7159
[ https://issues.apache.org/jira/browse/HIVE-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-7304: Attachment: HIVE-7304.1.patch Transitive Predicate Propagation doesn't happen properly after HIVE-7159 Key: HIVE-7304 URL: https://issues.apache.org/jira/browse/HIVE-7304 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7304.1.patch The reason for the regression is that PredicateTransitivePropagate looks at the FilterOperator below the ReduceSink. SemanticAnalyzer::genNotNullFilterForJoinSourcePlan was stacking another FilterOp for the not null check, so only that predicate was being applied transitively by PredicateTransitivePropagate. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7304) Transitive Predicate Propagation doesn't happen properly after HIVE-7159
[ https://issues.apache.org/jira/browse/HIVE-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-7304: Status: Patch Available (was: Open) Transitive Predicate Propagation doesn't happen properly after HIVE-7159 Key: HIVE-7304 URL: https://issues.apache.org/jira/browse/HIVE-7304 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7304.1.patch The reason for the regression is that PredicateTransitivePropagate looks at the FilterOperator below the ReduceSink. SemanticAnalyzer::genNotNullFilterForJoinSourcePlan was stacking another FilterOp for the not null check, so only that predicate was being applied transitively by PredicateTransitivePropagate. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-2089) Add a new input format to be able to combine multiple .gz text files
[ https://issues.apache.org/jira/browse/HIVE-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sumit Kumar resolved HIVE-2089. --- Resolution: Won't Fix I verified [~slider]'s observation. It indeed works. Marking this JIRA as Won't Fix Add a new input format to be able to combine multiple .gz text files Key: HIVE-2089 URL: https://issues.apache.org/jira/browse/HIVE-2089 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Attachments: HIVE-2089.1.patch For files that are not splittable, CombineHiveInputFormat won't help. This jira is to add a new input format to support this feature. This is very useful for partitions with tens of thousands of .gz files. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 22772: HIVE-6637: UDF in_file() doesn't take CHAR or VARCHAR as input
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22772/#review46879 --- data/files/in_file.dat https://reviews.apache.org/r/22772/#comment82447 It's fine. I just wanted to point out that reusing an existing file/table is preferred. We don't want to create a one-line file for each test we add. - Xuefu Zhang On June 26, 2014, 5:12 p.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22772/ --- (Updated June 26, 2014, 5:12 p.m.) Review request for hive. Bugs: HIVE-6637 https://issues.apache.org/jira/browse/HIVE-6637 Repository: hive-git Description --- HIVE-6637: UDF in_file() doesn't take CHAR or VARCHAR as input Diffs - data/files/in_file.dat PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInFile.java ea52537d0b85191f0b633a29aa3f7ddb556c288d ql/src/test/queries/clientpositive/udf_in_file.q 9d9efe8e23d6e73429ee5cd2c8470359ba2b3498 ql/src/test/results/clientpositive/udf_in_file.q.out b63143760d80f3f6a8ba0a23c0d87e8bb86fce66 Diff: https://reviews.apache.org/r/22772/diff/ Testing --- Tested with qtest. Thanks, Ashish Singh
[jira] [Created] (HIVE-7305) Return value from in.read() is ignored in SerializationUtils#readLongLE()
Ted Yu created HIVE-7305: Summary: Return value from in.read() is ignored in SerializationUtils#readLongLE() Key: HIVE-7305 URL: https://issues.apache.org/jira/browse/HIVE-7305 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor
{code}
long readLongLE(InputStream in) throws IOException {
  in.read(readBuffer, 0, 8);
  return (((readBuffer[0] & 0xff) << 0)
      + ((readBuffer[1] & 0xff) << 8)
{code}
Return value from read() may indicate fewer than 8 bytes read. The return value should be checked. -- This message was sent by Atlassian JIRA (v6.2#6252)
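For context, here is a minimal sketch of the kind of check HIVE-7305 asks for: keep reading until the full 8 bytes arrive, or fail fast. The field and method names mirror the snippet above, but this is not the actual ORC SerializationUtils code.
{code}
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class ReadFullyExample {
  private final byte[] readBuffer = new byte[8];

  long readLongLE(InputStream in) throws IOException {
    // InputStream.read() may return fewer than 8 bytes, so loop until the buffer is full.
    int off = 0;
    while (off < 8) {
      int n = in.read(readBuffer, off, 8 - off);
      if (n < 0) {
        throw new EOFException("EOF while reading 8 bytes for little-endian long");
      }
      off += n;
    }
    // Assemble the little-endian long from the fully populated buffer.
    long result = 0;
    for (int i = 7; i >= 0; i--) {
      result = (result << 8) | (readBuffer[i] & 0xff);
    }
    return result;
  }
}
{code}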
[jira] [Created] (HIVE-7306) Ineffective null check in GenericUDAFAverage#GenericUDAFAverageEvaluatorDouble#getNextResult()
Ted Yu created HIVE-7306: Summary: Ineffective null check in GenericUDAFAverage#GenericUDAFAverageEvaluatorDouble#getNextResult() Key: HIVE-7306 URL: https://issues.apache.org/jira/browse/HIVE-7306 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor {code} Object[] o = ss.intermediateVals.remove(0); Double d = o == null ? 0.0 : (Double) o[0]; r = r == null ? null : r - d; cnt = cnt - ((Long) o[1]); {code} Array o is accessed without null check in the last line above. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7306) Ineffective null check in GenericUDAFAverage#GenericUDAFAverageEvaluatorDouble#getNextResult()
[ https://issues.apache.org/jira/browse/HIVE-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046215#comment-14046215 ] Ted Yu commented on HIVE-7306: -- Similar problem exists in GenericUDAFAverage#GenericUDAFAverageEvaluatorDecimal as well. Ineffective null check in GenericUDAFAverage#GenericUDAFAverageEvaluatorDouble#getNextResult() -- Key: HIVE-7306 URL: https://issues.apache.org/jira/browse/HIVE-7306 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor {code} Object[] o = ss.intermediateVals.remove(0); Double d = o == null ? 0.0 : (Double) o[0]; r = r == null ? null : r - d; cnt = cnt - ((Long) o[1]); {code} Array o is accessed without null check in the last line above. -- This message was sent by Atlassian JIRA (v6.2#6252)
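For illustration, a self-contained sketch of an effective null check for this retraction path, where every use of the removed element (including o[1]) stays behind the guard. The surrounding state is simplified to plain fields; this is not the actual GenericUDAFAverage code.
{code}
import java.util.ArrayDeque;
import java.util.Deque;

final class WindowedAverageSketch {
  private final Deque<Object[]> intermediateVals = new ArrayDeque<>();
  private Double r = 0.0;
  private long cnt = 0;

  void retract() {
    Object[] o = intermediateVals.poll();
    if (o != null) {                       // the guard now covers o[0] *and* o[1]
      Double d = (Double) o[0];
      r = (r == null) ? null : r - d;
      cnt = cnt - (Long) o[1];
    }
  }
}
{code}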
[jira] [Created] (HIVE-7307) Lack of synchronization for TxnHandler#getDbConn()
Ted Yu created HIVE-7307: Summary: Lack of synchronization for TxnHandler#getDbConn() Key: HIVE-7307 URL: https://issues.apache.org/jira/browse/HIVE-7307 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor TxnHandler#getDbConn() accesses connPool without holding lock on TxnHandler.class -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7307) Lack of synchronization for TxnHandler#getDbConn()
[ https://issues.apache.org/jira/browse/HIVE-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-7307: - Description: TxnHandler#getDbConn() accesses connPool without holding lock on TxnHandler.class {code} Connection dbConn = connPool.getConnection(); dbConn.setAutoCommit(false); {code} null check should be performed on the return value, dbConn. was:TxnHandler#getDbConn() accesses connPool without holding lock on TxnHandler.class Lack of synchronization for TxnHandler#getDbConn() -- Key: HIVE-7307 URL: https://issues.apache.org/jira/browse/HIVE-7307 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor TxnHandler#getDbConn() accesses connPool without holding lock on TxnHandler.class {code} Connection dbConn = connPool.getConnection(); dbConn.setAutoCommit(false); {code} null check should be performed on the return value, dbConn. -- This message was sent by Atlassian JIRA (v6.2#6252)
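A hypothetical sketch of the two points raised in HIVE-7307: serialize access to the shared pool and check the connection it hands back. The pool is assumed here to be a javax.sql.DataSource; this is not the actual TxnHandler code.
{code}
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

final class DbConnSketch {
  private static DataSource connPool;

  static Connection getDbConn() throws SQLException {
    Connection dbConn;
    // Take the same class-level lock that pool setup code would use.
    synchronized (DbConnSketch.class) {
      dbConn = connPool.getConnection();
    }
    // Fail clearly instead of letting a null connection cause an NPE later.
    if (dbConn == null) {
      throw new SQLException("Connection pool returned a null connection");
    }
    dbConn.setAutoCommit(false);
    return dbConn;
  }
}
{code}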
Re: Review Request 22772: HIVE-6637: UDF in_file() doesn't take CHAR or VARCHAR as input
On June 27, 2014, 5:33 p.m., Xuefu Zhang wrote: data/files/in_file.dat, line 1 https://reviews.apache.org/r/22772/diff/6/?file=618531#file618531line1 It's fine. I just wanted to point out that reusing an existing file/table is preferred. We don't want to create a one-line file for each test we add. Agreed, Xuefu. - Ashish --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22772/#review46879 --- On June 26, 2014, 5:12 p.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22772/ --- (Updated June 26, 2014, 5:12 p.m.) Review request for hive. Bugs: HIVE-6637 https://issues.apache.org/jira/browse/HIVE-6637 Repository: hive-git Description --- HIVE-6637: UDF in_file() doesn't take CHAR or VARCHAR as input Diffs - data/files/in_file.dat PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInFile.java ea52537d0b85191f0b633a29aa3f7ddb556c288d ql/src/test/queries/clientpositive/udf_in_file.q 9d9efe8e23d6e73429ee5cd2c8470359ba2b3498 ql/src/test/results/clientpositive/udf_in_file.q.out b63143760d80f3f6a8ba0a23c0d87e8bb86fce66 Diff: https://reviews.apache.org/r/22772/diff/ Testing --- Tested with qtest. Thanks, Ashish Singh
[jira] [Commented] (HIVE-7282) HCatLoader fail to load Orc map with null key
[ https://issues.apache.org/jira/browse/HIVE-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046250#comment-14046250 ] Eugene Koifman commented on HIVE-7282: -- Would it not make more sense to add the new test to TestHCatLoaderComplexSchema, so that it's run with both ORC and RCFile? HCatLoader fail to load Orc map with null key - Key: HIVE-7282 URL: https://issues.apache.org/jira/browse/HIVE-7282 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.14.0 Attachments: HIVE-7282-1.patch, HIVE-7282-2.patch Here is the stack: Get exception: AttemptID:attempt_1403634189382_0011_m_00_0 Info:Error: org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533) at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) Caused by: java.lang.NullPointerException at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToPigMap(PigHCatUtil.java:469) at org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:404) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:456) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:374) at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64) ... 13 more -- This message was sent by Atlassian JIRA (v6.2#6252)
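As background on the NPE above, one possible shape of a fix is to skip (or otherwise specially handle) map entries whose key is null before handing the map to Pig, since Pig map keys must be non-null strings. The sketch below is illustrative only and is not the actual PigHCatUtil patch.
{code}
import java.util.HashMap;
import java.util.Map;

final class NullKeyMapSketch {
  static Map<String, Object> toPigMap(Map<?, ?> hiveMap) {
    Map<String, Object> result = new HashMap<>();
    if (hiveMap == null) {
      return result;
    }
    for (Map.Entry<?, ?> e : hiveMap.entrySet()) {
      if (e.getKey() == null) {
        continue;                   // drop null keys instead of calling toString() on them
      }
      result.put(e.getKey().toString(), e.getValue());
    }
    return result;
  }
}
{code}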
Re: [ANNOUNCE] New Hive Committers - Gopal Vijayaraghavan and Szehon Ho
Congrats! On Mon, Jun 23, 2014 at 11:05 AM, Jayesh Senjaliya jhsonl...@gmail.com wrote: Congratulations Gopal and Szehon !! On Mon, Jun 23, 2014 at 10:35 AM, Vikram Dixit vik...@hortonworks.com wrote: Congrats Gopal and Szehon! On Mon, Jun 23, 2014 at 10:34 AM, Jason Dere jd...@hortonworks.com wrote: Congrats! On Jun 23, 2014, at 10:28 AM, Hari Subramaniyan hsubramani...@hortonworks.com wrote: congrats to Gopal and Szehon! Thanks Hari On Mon, Jun 23, 2014 at 9:59 AM, Xiaobing Zhou xz...@hortonworks.com wrote: Congrats! On Mon, Jun 23, 2014 at 9:52 AM, Vaibhav Gumashta vgumas...@hortonworks.com wrote: Congrats Gopal and Szehon! --Vaibhav On Mon, Jun 23, 2014 at 8:48 AM, Szehon Ho sze...@cloudera.com wrote: Thank you all very much, and congrats Gopal! Szehon On Sun, Jun 22, 2014 at 8:42 PM, Carl Steinbach c...@apache.org wrote: The Apache Hive PMC has voted to make Gopal Vijayaraghavan and Szehon Ho committers on the Apache Hive Project. Please join me in congratulating Gopal and Szehon! Thanks. - Carl
[jira] [Commented] (HIVE-7288) Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs
[ https://issues.apache.org/jira/browse/HIVE-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046261#comment-14046261 ] Eugene Koifman commented on HIVE-7288: -- [~shanyu] Please add tests for this feature. Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs - Key: HIVE-7288 URL: https://issues.apache.org/jira/browse/HIVE-7288 Project: Hive Issue Type: New Feature Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1 Environment: HDInsight deploying HDP 2.1; Also HDP 2.1 on Windows Reporter: Azim Uddin Assignee: shanyu zhao Attachments: hive-7288.patch Issue: == Due to lack of parameters (or support for) equivalent of '-libjars' and '-archives' in WebHcat REST API, we cannot use an external Java Jars or Archive files with a Streaming MapReduce job, when the job is submitted via WebHcat/templeton. I am citing a few use cases here, but there can be plenty of scenarios like this- #1 (for -archives):In order to use R with a hadoop distribution like HDInsight or HDP on Windows, we could package the R directory up in a zip file and rename it to r.jar and put it into HDFS or WASB. We can then do something like this from hadoop command line (ignore the wasb syntax, same command can be run with hdfs) - hadoop jar %HADOOP_HOME%\lib\hadoop-streaming.jar -archives wasb:///example/jars/r.jar -files wasb:///example/apps/mapper.r,wasb:///example/apps/reducer.r -mapper ./r.jar/bin/Rscript.exe mapper.r -reducer ./r.jar/bin/Rscript.exe reducer.r -input /example/data/gutenberg -output /probe/r/wordcount This works from hadoop command line, but due to lack of support for '-archives' parameter in WebHcat, we can't submit the same Streaming MR job via WebHcat. #2 (for -libjars): Consider a scenario where a user would like to use a custom inputFormat with a Streaming MapReduce job and wrote his own custom InputFormat JAR. From a hadoop command line we can do something like this - hadoop jar /path/to/hadoop-streaming.jar \ -libjars /path/to/custom-formats.jar \ -D map.output.key.field.separator=, \ -D mapred.text.key.partitioner.options=-k1,1 \ -input my_data/ \ -output my_output/ \ -outputformat test.example.outputformat.DateFieldMultipleOutputFormat \ -mapper my_mapper.py \ -reducer my_reducer.py \ But due to lack of support for '-libjars' parameter for streaming MapReduce job in WebHcat, we can't submit the above streaming MR job (that uses a custom Java JAR) via WebHcat. Impact: We think, being able to submit jobs remotely is a vital feature for hadoop to be enterprise-ready and WebHcat plays an important role there. Streaming MapReduce job is also very important for interoperability. So, it would be very useful to keep WebHcat on par with hadoop command line in terms of streaming MR job submission capability. Ask: Enable parameter support for 'libjars' and 'archives' in WebHcat for Hadoop streaming jobs in WebHcat. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7286) Parameterize HCatMapReduceTest for testing against all Hive storage formats
[ https://issues.apache.org/jira/browse/HIVE-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046268#comment-14046268 ] Szehon Ho commented on HIVE-7286: - Yea it seems worthwhile to get HIVE-5976 in, thanks for your help with that. Sorry for the late response. Those SerDes in SERDEUSINGMETASTOREFORSCHEMA use the native hive metadata to determine schemas, instead of ones like avro that specify it outside. Hence those can easily plug into HCatMapReduceTest via params, as it creates tables using native hive metadata. But I'm personally not that eager to force other SerDes to plug into the test, as you had to write lengthy schema-conversion code for avro to do that; that is test-only code and a burden to maintain, as there's no real use-case elsewhere for it. I think it's wonderful if a test framework can automatically generate tests for new serdes, but I don't think it should enforce this unnecessary work on new-serde devs, as test coverage can be achieved in more natural ways. Hence, would it make sense to just automate/enforce parameterization of the test for SerDes in SERDEUSINGMETASTOREFORSCHEMA, and handle other serdes like avro as a one-off? Parameterize HCatMapReduceTest for testing against all Hive storage formats --- Key: HIVE-7286 URL: https://issues.apache.org/jira/browse/HIVE-7286 Project: Hive Issue Type: Test Components: HCatalog Reporter: David Chen Assignee: David Chen Attachments: HIVE-7286.1.patch Currently, HCatMapReduceTest is extended by the following test suites: * TestHCatDynamicPartitioned * TestHCatNonPartitioned * TestHCatPartitioned * TestHCatExternalDynamicPartitioned * TestHCatExternalNonPartitioned * TestHCatExternalPartitioned * TestHCatMutableDynamicPartitioned * TestHCatMutableNonPartitioned * TestHCatMutablePartitioned These tests run against RCFile. Currently, only TestHCatDynamicPartitioned is run against any other storage format (ORC). Ideally, HCatalog should be tested against all storage formats supported by Hive. The easiest way to accomplish this is to turn HCatMapReduceTest into a parameterized test fixture that enumerates all Hive storage formats. Until HIVE-5976 is implemented, we would need to manually create the mapping of SerDe to InputFormat and OutputFormat. This way, we can explicitly keep track of which storage formats currently work with HCatalog or which ones are untested or have test failures. The test fixture should also use Reflection to find all classes in the classpath that implement the SerDe interface and raise a failure if any of them are not enumerated. -- This message was sent by Atlassian JIRA (v6.2#6252)
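For concreteness, a hypothetical JUnit sketch of the parameterization being discussed: the Parameterized runner enumerates storage formats so every format gets its own run of the fixture. The format list and fixture body are illustrative, not the real HCatMapReduceTest.
{code}
import java.util.Arrays;
import java.util.Collection;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

@RunWith(Parameterized.class)
public class StorageFormatTestSketch {
  private final String storageFormat;

  public StorageFormatTestSketch(String storageFormat) {
    this.storageFormat = storageFormat;
  }

  // Each entry becomes one constructor invocation and one run of every @Test method.
  @Parameters(name = "{0}")
  public static Collection<Object[]> formats() {
    return Arrays.asList(new Object[][] {
        {"RCFILE"}, {"ORC"}, {"SEQUENCEFILE"}, {"TEXTFILE"}
    });
  }

  @Test
  public void readWriteRoundTrip() {
    // A real fixture would create a table STORED AS <storageFormat> and run the
    // HCatalog read/write checks against it; here we only assert the parameter exists.
    org.junit.Assert.assertNotNull(storageFormat);
  }
}
{code}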
[jira] [Commented] (HIVE-7282) HCatLoader fail to load Orc map with null key
[ https://issues.apache.org/jira/browse/HIVE-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046273#comment-14046273 ] Eugene Koifman commented on HIVE-7282: -- Also, can HIVE-5020 now be closed as duplicate? HCatLoader fail to load Orc map with null key - Key: HIVE-7282 URL: https://issues.apache.org/jira/browse/HIVE-7282 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.14.0 Attachments: HIVE-7282-1.patch, HIVE-7282-2.patch Here is the stack: Get exception: AttemptID:attempt_1403634189382_0011_m_00_0 Info:Error: org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533) at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) Caused by: java.lang.NullPointerException at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToPigMap(PigHCatUtil.java:469) at org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:404) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:456) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:374) at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64) ... 13 more -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6637) UDF in_file() doesn't take CHAR or VARCHAR as input
[ https://issues.apache.org/jira/browse/HIVE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046315#comment-14046315 ] Xuefu Zhang commented on HIVE-6637: --- +1 UDF in_file() doesn't take CHAR or VARCHAR as input --- Key: HIVE-6637 URL: https://issues.apache.org/jira/browse/HIVE-6637 Project: Hive Issue Type: Bug Components: Types, UDF Affects Versions: 0.14.0 Reporter: Xuefu Zhang Assignee: Ashish Kumar Singh Attachments: HIVE-6637.1.patch, HIVE-6637.2.patch, HIVE-6637.3.patch
{code}
hive> desc alter_varchar_1;
key     string        None
value   varchar(3)    None
key2    int           None
value2  varchar(10)   None
hive> select in_file(value, value2) from alter_varchar_1;
FAILED: SemanticException [Error 10016]: Line 1:15 Argument type mismatch 'value': The 1st argument of function IN_FILE must be a string but org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableHiveVarcharObjectInspector@10f1f34a was given.
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
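As general background on this class of fix, a hedged sketch of accepting any string-family argument (STRING, CHAR, VARCHAR) by converting it to a plain string with an ObjectInspector converter, rather than requiring a StringObjectInspector. This is a simplified illustration, not the actual GenericUDFInFile patch.
{code}
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector.PrimitiveCategory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

final class StringFamilyArgSketch {
  // Returns a converter that turns STRING/CHAR/VARCHAR arguments into plain Java Strings.
  static ObjectInspectorConverters.Converter stringConverter(ObjectInspector oi)
      throws UDFArgumentException {
    if (!(oi instanceof PrimitiveObjectInspector)) {
      throw new UDFArgumentException("string-family primitive expected");
    }
    PrimitiveCategory cat = ((PrimitiveObjectInspector) oi).getPrimitiveCategory();
    switch (cat) {
      case STRING:
      case CHAR:
      case VARCHAR:
        return ObjectInspectorConverters.getConverter(
            oi, PrimitiveObjectInspectorFactory.javaStringObjectInspector);
      default:
        throw new UDFArgumentException("string-family primitive expected, found " + cat);
    }
  }
}
{code}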
[jira] [Commented] (HIVE-7275) optimize these functions for windowing function.
[ https://issues.apache.org/jira/browse/HIVE-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046320#comment-14046320 ] Harish Butani commented on HIVE-7275: - 1. most of these are UDFs: coalesce floor sign abs ltrim substring to_char nvl cast decode nothing to be done for these. 2. ranking fns already support streaming. 3. We need streaming for count, row_num, mean, stddev. Also HIVE-7062 didn't add streaming for fVal, lVal. optimize these functions for windowing function. Key: HIVE-7275 URL: https://issues.apache.org/jira/browse/HIVE-7275 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: Hadoop 2.4.0, Hive 13.0 Reporter: Kiet Ly Please apply the window streaming optimization from issue HIVE-7143/7062 to these functions if they are applicable. row_number count rank dense_rank nvl rank dense_rank nvl cast decode median stddev coalesce floor sign abs ltrim substring to_char -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7294) sql std auth - authorize show grant statements
[ https://issues.apache.org/jira/browse/HIVE-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-7294: Status: Patch Available (was: Open) sql std auth - authorize show grant statements -- Key: HIVE-7294 URL: https://issues.apache.org/jira/browse/HIVE-7294 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-7294.1.patch A non admin user should only be allowed to run show grant commands for themselves or a role they belong to. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7294) sql std auth - authorize show grant statements
[ https://issues.apache.org/jira/browse/HIVE-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-7294: Attachment: HIVE-7294.1.patch HIVE-7294.1.patch - the patch also fixes 'show grant;' in sql std auth, and authorizes that as well. sql std auth - authorize show grant statements -- Key: HIVE-7294 URL: https://issues.apache.org/jira/browse/HIVE-7294 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-7294.1.patch A non admin user should only be allowed to run show grant commands for themselves or a role they belong to. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7308) Show auto-parallelism in extended explain
Gunther Hagleitner created HIVE-7308: Summary: Show auto-parallelism in extended explain Key: HIVE-7308 URL: https://issues.apache.org/jira/browse/HIVE-7308 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Add auto-parallelism flag to explain so that we can write tests verifying that we don't break bmj, etc... -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7302) Allow Auto-reducer parallelism to be turned off by a logical optimizer
[ https://issues.apache.org/jira/browse/HIVE-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046346#comment-14046346 ] Gunther Hagleitner commented on HIVE-7302: -- Opened HIVE-7303 for the explain changes. Allow Auto-reducer parallelism to be turned off by a logical optimizer -- Key: HIVE-7302 URL: https://issues.apache.org/jira/browse/HIVE-7302 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-7302.1.patch Auto reducer parallelism cannot be used for cases where a custom routing VertexManager is used. Allow a tri-state for ReduceSinkDesc::isAutoParallel to allow allow/disable/unset mechanics. The state machine for this setting will now be: allowed transitions are unset -> allow, unset -> disable, and allow -> disable, with no transition case for disable -> allow. -- This message was sent by Atlassian JIRA (v6.2#6252)
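A tiny, hypothetical Java sketch of the tri-state transition rules described in the HIVE-7302 summary above; the enum and method are illustrative, not the actual ReduceSinkDesc API.
{code}
enum AutoParallelState {
  UNSET, ALLOW, DISABLE;

  AutoParallelState transition(AutoParallelState requested) {
    // unset -> allow, unset -> disable, and allow -> disable are permitted;
    // disable -> allow is not, so DISABLE is sticky once set.
    if (this == DISABLE && requested == ALLOW) {
      return DISABLE;
    }
    return requested;
  }
}
{code}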
[jira] [Commented] (HIVE-7299) Enable metadata only optimization on Tez
[ https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046347#comment-14046347 ] Gunther Hagleitner commented on HIVE-7299: -- https://reviews.apache.org/r/23007/ Enable metadata only optimization on Tez Key: HIVE-7299 URL: https://issues.apache.org/jira/browse/HIVE-7299 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch, HIVE-7299.3.patch, HIVE-7299.4.patch, HIVE-7299.5.patch Enables the metadata only optimization (the one with OneNullRowInputFormat, not the query-result-from-stats optimization) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7302) Allow Auto-reducer parallelism to be turned off by a logical optimizer
[ https://issues.apache.org/jira/browse/HIVE-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046355#comment-14046355 ] Vikram Dixit K commented on HIVE-7302: -- Left a minor comment on the jira which can be fixed at commit time. LGTM +1. Allow Auto-reducer parallelism to be turned off by a logical optimizer -- Key: HIVE-7302 URL: https://issues.apache.org/jira/browse/HIVE-7302 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-7302.1.patch Auto reducer parallelism cannot be used for cases where a custom routing VertexManager is used. Allow a tri-state for ReduceSinkDesc::isAutoParallel to allow allow/disable/unset mechanics. The state machine for this setting will now be: allowed transitions are unset -> allow, unset -> disable, and allow -> disable, with no transition case for disable -> allow. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7304) Transitive Predicate Propagation doesn't happen properly after HIVE-7159
[ https://issues.apache.org/jira/browse/HIVE-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046362#comment-14046362 ] Hive QA commented on HIVE-7304: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12652864/HIVE-7304.1.patch {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 5655 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_gby_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join_filter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_regex_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_table_access_keys_stats org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/619/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/619/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-619/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12652864 Transitive Predicate Propagation doesn't happen properly after HIVE-7159 Key: HIVE-7304 URL: https://issues.apache.org/jira/browse/HIVE-7304 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7304.1.patch The reason for the regression is that PredicateTransitivePropagate looks at the FilterOperator below the ReduceSink. SemanticAnalyzer::genNotNullFilterForJoinSourcePlan was stacking another FilterOp for the not null check, so only that predicate was being applied transitively by PredicateTransitivePropagate. -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 23139: HIVE-7294 : sql std auth - authorize show grant statements
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23139/ --- Review request for hive. Bugs: HIVE-7294 https://issues.apache.org/jira/browse/HIVE-7294 Repository: hive-git Description --- See jira Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 24f829f ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java e4f5aac ql/src/test/queries/clientnegative/authorization_insertoverwrite_nodel.q 90fe6e1 ql/src/test/queries/clientnegative/authorization_priv_current_role_neg.q bbf3b66 ql/src/test/queries/clientnegative/authorization_show_grant_otherrole.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_show_grant_otheruser_all.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_show_grant_otheruser_alltabs.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_show_grant_otheruser_wtab.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_grant_public_role.q 8473178 ql/src/test/queries/clientpositive/authorization_grant_table_priv.q 02d364e ql/src/test/queries/clientpositive/authorization_insert.q 5de6f50 ql/src/test/queries/clientpositive/authorization_revoke_table_priv.q ccda3b5 ql/src/test/queries/clientpositive/authorization_show_grant.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_view_sqlstd.q bd7bbfe ql/src/test/results/clientnegative/authorization_insertoverwrite_nodel.q.out de1d230 ql/src/test/results/clientnegative/authorization_show_grant_otherrole.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_show_grant_otheruser_all.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_show_grant_otheruser_alltabs.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_show_grant_otheruser_wtab.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_grant_public_role.q.out a0a45f7 ql/src/test/results/clientpositive/authorization_grant_table_priv.q.out 9a6ec17 ql/src/test/results/clientpositive/authorization_insert.q.out f94d9a9 ql/src/test/results/clientpositive/authorization_show_grant.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_view_sqlstd.q.out 50c0247 Diff: https://reviews.apache.org/r/23139/diff/ Testing --- test cases included. Thanks, Thejas Nair
[jira] [Commented] (HIVE-6637) UDF in_file() doesn't take CHAR or VARCHAR as input
[ https://issues.apache.org/jira/browse/HIVE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046364#comment-14046364 ] Ashish Kumar Singh commented on HIVE-6637: -- [~xuefuz], thanks again for reviewing :) UDF in_file() doesn't take CHAR or VARCHAR as input --- Key: HIVE-6637 URL: https://issues.apache.org/jira/browse/HIVE-6637 Project: Hive Issue Type: Bug Components: Types, UDF Affects Versions: 0.14.0 Reporter: Xuefu Zhang Assignee: Ashish Kumar Singh Attachments: HIVE-6637.1.patch, HIVE-6637.2.patch, HIVE-6637.3.patch
{code}
hive> desc alter_varchar_1;
key     string        None
value   varchar(3)    None
key2    int           None
value2  varchar(10)   None
hive> select in_file(value, value2) from alter_varchar_1;
FAILED: SemanticException [Error 10016]: Line 1:15 Argument type mismatch 'value': The 1st argument of function IN_FILE must be a string but org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableHiveVarcharObjectInspector@10f1f34a was given.
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-3628) Provide a way to use counters in Hive through UDF
[ https://issues.apache.org/jira/browse/HIVE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046400#comment-14046400 ] Lefty Leverenz commented on HIVE-3628: -- [~trsudarshan], to get wiki edit access you just need to create a Confluence username and send a request to the u...@hive.apache.org mailing list. Instructions here: * [Hive: About This Wiki | https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki] Agreed, UDTF documentation could be improved. The UDF and Lateral View docs in the Language Manual cover UDTFs and there's an orphan (unlinked) doc called Writing UDTF's which I'll add links to today: * [LanguageManual -- Operators and UDFs -- Built-in Table-Generating Functions (UDTF) | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inTable-GeneratingFunctions(UDTF)] * [LanguageManual -- Lateral View | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView] * [DeveloperGuide UDTF -- Writing UDTFs | https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide+UDTF] Other wikidocs that mention UDTF, not counting Configuration Properties: * [LanguageManual -- XPath UDF (just an empty heading for UDTFs) | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+XPathUDF] * [HiveServer2 Thrift API (design doc, just shows UDTF as a value for FUNCTION_TYPE) | https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API] * [Plugin Developer Kit (trivial mention) | https://cwiki.apache.org/confluence/display/Hive/PluginDeveloperKit] Thanks for offering to improve the docs! Provide a way to use counters in Hive through UDF - Key: HIVE-3628 URL: https://issues.apache.org/jira/browse/HIVE-3628 Project: Hive Issue Type: Improvement Components: UDF Reporter: Viji Assignee: Navis Priority: Minor Fix For: 0.11.0 Attachments: HIVE-3628.D8007.1.patch, HIVE-3628.D8007.2.patch, HIVE-3628.D8007.3.patch, HIVE-3628.D8007.4.patch, HIVE-3628.D8007.5.patch, HIVE-3628.D8007.6.patch Currently it is not possible to generate counters through UDF. We should support this. Pig currently allows this. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7309) Add query string to the HiveDriverFilterHookContext constructor
Arun Suresh created HIVE-7309: - Summary: Add query string to the HiveDriverFilterHookContext constructor Key: HIVE-7309 URL: https://issues.apache.org/jira/browse/HIVE-7309 Project: Hive Issue Type: Bug Components: Authorization Reporter: Arun Suresh There are cases where more context is required by the {{HiveDriverFilterHook}} to make an authorization decision. For example: In the case of the SHOW TABLES in some_db command, currently only the current DB (the landing DB when a user logs in, or the DB to which the user has switched) is available in the {{HiveDriverFilterHookContext}}. The target, some_db in this example, is also required. One suggestion is to pass the complete query string in the hook context, since at that point there are no query inputs available. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7304) Transitive Predicate Propagation doesn't happen properly after HIVE-7159
[ https://issues.apache.org/jira/browse/HIVE-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-7304: Status: Patch Available (was: Open) Transitive Predicate Propagation doesn't happen properly after HIVE-7159 Key: HIVE-7304 URL: https://issues.apache.org/jira/browse/HIVE-7304 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7304.1.patch, HIVE-7304.2.patch The reason for the regression is that PredicateTransitivePropagate looks at the FilterOperator below the ReduceSink. SemanticAnalyzer::genNotNullFilterForJoinSourcePlan was stacking another FilterOp for the not null check, so only that predicate was being applied transitively by PredicateTransitivePropagate. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7304) Transitive Predicate Propagation doesn't happen properly after HIVE-7159
[ https://issues.apache.org/jira/browse/HIVE-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-7304: Status: Open (was: Patch Available) Transitive Predicate Propagation doesn't happen properly after HIVE-7159 Key: HIVE-7304 URL: https://issues.apache.org/jira/browse/HIVE-7304 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7304.1.patch, HIVE-7304.2.patch The reason for the regression is that PredicateTransitivePropagate looks at the FilterOperator below the ReduceSink. SemanticAnalyzer::genNotNullFilterForJoinSourcePlan was stacking another FilterOp for the not null check, so only that predicate was being applied transitively by PredicateTransitivePropagate. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7304) Transitive Predicate Propagation doesn't happen properly after HIVE-7159
[ https://issues.apache.org/jira/browse/HIVE-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-7304: Attachment: HIVE-7304.2.patch Transitive Predicate Propagation doesn't happen properly after HIVE-7159 Key: HIVE-7304 URL: https://issues.apache.org/jira/browse/HIVE-7304 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7304.1.patch, HIVE-7304.2.patch The reason for the regression is that PredicateTransitivePropagate looks at the FilterOperator below the ReduceSink. SemanticAnalyzer::genNotNullFilterForJoinSourcePlan was stacking another FilterOp for the not null check, so only that predicate was being applied transitively by PredicateTransitivePropagate. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7310) Turning CBO on results in NPE on some queries
[ https://issues.apache.org/jira/browse/HIVE-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7310: - Labels: cbo (was: ) Turning CBO on results in NPE on some queries - Key: HIVE-7310 URL: https://issues.apache.org/jira/browse/HIVE-7310 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Assignee: Laljo John Pullokkaran Labels: cbo On the CBO branch if I do the following: hive set hive.cbo.enable=true; hive select i_item_id, s_state, GROUPING__ID, avg(ss_quantity) agg1, avg(ss_list_price) agg2, avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4 from store_sales ss join customer_demographics cd on (ss.ss_cdemo_sk = cd.cd_demo_sk) join date_dim d on (ss.ss_sold_date_sk = d.d_date_sk) join store s on (ss.ss_store_sk = s.s_store_sk) join item i on (ss.ss_item_sk = i.i_item_sk) where cd_gender = 'M' and cd_marital_status = 'S' and cd_education_status = 'Secondary' and d_year = 2002 and s_state in ('OH','SD', 'LA', 'MO', 'WA', 'MN') group by i_item_id, s_state with rollup order by i_item_id ,s_state limit 100 ; I get an NPE. The stack trace is: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9555) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:328) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:412) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:962) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1027) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:898) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:277) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:439) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:812) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11732) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.access$200(SemanticAnalyzer.java:11711) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9532) ... 18 more Caused by: java.lang.RuntimeException: java.lang.NullPointerException at net.hydromatic.optiq.tools.Frameworks.withPrepare(Frameworks.java:170) at net.hydromatic.optiq.tools.Frameworks.withPlanner(Frameworks.java:142) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11727) ... 
20 more Caused by: java.lang.NullPointerException at org.eigenbase.reltype.RelDataTypeImpl.getField(RelDataTypeImpl.java:79) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.convertAgg(SemanticAnalyzer.java:12129) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.genGBRelNode(SemanticAnalyzer.java:12184) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.genGBLogicalPlan(SemanticAnalyzer.java:12324) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.genLogicalPlan(SemanticAnalyzer.java:12749) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.apply(SemanticAnalyzer.java:11758) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.apply(SemanticAnalyzer.java:11711) at net.hydromatic.optiq.tools.Frameworks$1.apply(Frameworks.java:146) at
[jira] [Created] (HIVE-7310) Turning CBO on results in NPE on some queries
Gunther Hagleitner created HIVE-7310: Summary: Turning CBO on results in NPE on some queries Key: HIVE-7310 URL: https://issues.apache.org/jira/browse/HIVE-7310 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Assignee: Laljo John Pullokkaran On the CBO branch if I do the following: hive set hive.cbo.enable=true; hive select i_item_id, s_state, GROUPING__ID, avg(ss_quantity) agg1, avg(ss_list_price) agg2, avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4 from store_sales ss join customer_demographics cd on (ss.ss_cdemo_sk = cd.cd_demo_sk) join date_dim d on (ss.ss_sold_date_sk = d.d_date_sk) join store s on (ss.ss_store_sk = s.s_store_sk) join item i on (ss.ss_item_sk = i.i_item_sk) where cd_gender = 'M' and cd_marital_status = 'S' and cd_education_status = 'Secondary' and d_year = 2002 and s_state in ('OH','SD', 'LA', 'MO', 'WA', 'MN') group by i_item_id, s_state with rollup order by i_item_id ,s_state limit 100 ; I get an NPE. The stack trace is: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9555) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:328) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:412) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:962) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1027) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:898) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:277) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:439) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:812) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11732) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.access$200(SemanticAnalyzer.java:11711) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9532) ... 18 more Caused by: java.lang.RuntimeException: java.lang.NullPointerException at net.hydromatic.optiq.tools.Frameworks.withPrepare(Frameworks.java:170) at net.hydromatic.optiq.tools.Frameworks.withPlanner(Frameworks.java:142) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11727) ... 
20 more Caused by: java.lang.NullPointerException at org.eigenbase.reltype.RelDataTypeImpl.getField(RelDataTypeImpl.java:79) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.convertAgg(SemanticAnalyzer.java:12129) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.genGBRelNode(SemanticAnalyzer.java:12184) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.genGBLogicalPlan(SemanticAnalyzer.java:12324) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.genLogicalPlan(SemanticAnalyzer.java:12749) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.apply(SemanticAnalyzer.java:11758) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.apply(SemanticAnalyzer.java:11711) at net.hydromatic.optiq.tools.Frameworks$1.apply(Frameworks.java:146) at net.hydromatic.optiq.prepare.OptiqPrepareImpl.perform(OptiqPrepareImpl.java:536) at net.hydromatic.optiq.tools.Frameworks.withPrepare(Frameworks.java:168) ... 22 more -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7310) Turning CBO on results in NPE on some queries
[ https://issues.apache.org/jira/browse/HIVE-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-7310: - Attachment: HIVE-7310.patch Turning CBO on results in NPE on some queries - Key: HIVE-7310 URL: https://issues.apache.org/jira/browse/HIVE-7310 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Assignee: Laljo John Pullokkaran Labels: cbo Attachments: HIVE-7310.patch On the CBO branch if I do the following: hive set hive.cbo.enable=true; hive select i_item_id, s_state, GROUPING__ID, avg(ss_quantity) agg1, avg(ss_list_price) agg2, avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4 from store_sales ss join customer_demographics cd on (ss.ss_cdemo_sk = cd.cd_demo_sk) join date_dim d on (ss.ss_sold_date_sk = d.d_date_sk) join store s on (ss.ss_store_sk = s.s_store_sk) join item i on (ss.ss_item_sk = i.i_item_sk) where cd_gender = 'M' and cd_marital_status = 'S' and cd_education_status = 'Secondary' and d_year = 2002 and s_state in ('OH','SD', 'LA', 'MO', 'WA', 'MN') group by i_item_id, s_state with rollup order by i_item_id ,s_state limit 100 ; I get an NPE. The stack trace is: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9555) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:328) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:412) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:962) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1027) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:898) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:277) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:439) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:812) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11732) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.access$200(SemanticAnalyzer.java:11711) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9532) ... 18 more Caused by: java.lang.RuntimeException: java.lang.NullPointerException at net.hydromatic.optiq.tools.Frameworks.withPrepare(Frameworks.java:170) at net.hydromatic.optiq.tools.Frameworks.withPlanner(Frameworks.java:142) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11727) ... 
[jira] [Updated] (HIVE-5775) Introduce Cost Based Optimizer to Hive
[ https://issues.apache.org/jira/browse/HIVE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-5775: - Attachment: HIVE-7310.patch Introduce Cost Based Optimizer to Hive -- Key: HIVE-5775 URL: https://issues.apache.org/jira/browse/HIVE-5775 Project: Hive Issue Type: New Feature Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Attachments: CBO-2.pdf, HIVE-5775.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5775) Introduce Cost Based Optimizer to Hive
[ https://issues.apache.org/jira/browse/HIVE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-5775: - Attachment: (was: HIVE-7310.patch) Introduce Cost Based Optimizer to Hive -- Key: HIVE-5775 URL: https://issues.apache.org/jira/browse/HIVE-5775 Project: Hive Issue Type: New Feature Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Attachments: CBO-2.pdf, HIVE-5775.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7310) Turning CBO on results in NPE on some queries
[ https://issues.apache.org/jira/browse/HIVE-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-7310: - Status: Patch Available (was: Open) Turning CBO on results in NPE on some queries - Key: HIVE-7310 URL: https://issues.apache.org/jira/browse/HIVE-7310 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Assignee: Laljo John Pullokkaran Labels: cbo Attachments: HIVE-7310.patch On the CBO branch if I do the following: hive set hive.cbo.enable=true; hive select i_item_id, s_state, GROUPING__ID, avg(ss_quantity) agg1, avg(ss_list_price) agg2, avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4 from store_sales ss join customer_demographics cd on (ss.ss_cdemo_sk = cd.cd_demo_sk) join date_dim d on (ss.ss_sold_date_sk = d.d_date_sk) join store s on (ss.ss_store_sk = s.s_store_sk) join item i on (ss.ss_item_sk = i.i_item_sk) where cd_gender = 'M' and cd_marital_status = 'S' and cd_education_status = 'Secondary' and d_year = 2002 and s_state in ('OH','SD', 'LA', 'MO', 'WA', 'MN') group by i_item_id, s_state with rollup order by i_item_id ,s_state limit 100 ; I get an NPE. The stack trace is: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9555) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:328) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:412) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:962) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1027) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:898) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:277) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:439) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:812) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11732) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.access$200(SemanticAnalyzer.java:11711) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9532) ... 18 more Caused by: java.lang.RuntimeException: java.lang.NullPointerException at net.hydromatic.optiq.tools.Frameworks.withPrepare(Frameworks.java:170) at net.hydromatic.optiq.tools.Frameworks.withPlanner(Frameworks.java:142) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11727) ... 
[jira] [Reopened] (HIVE-7283) CBO: plumb in HepPlanner and FieldTrimmer(ColumnPruner) into Optiq based planning
[ https://issues.apache.org/jira/browse/HIVE-7283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner reopened HIVE-7283: -- I'm seeing this in the unit tests: Exception: null java.lang.AssertionError at org.eigenbase.relopt.RelOptUtil.classifyFilters(RelOptUtil.java:1871) at org.apache.hadoop.hive.ql.optimizer.optiq.rules.HivePushFilterPastJoinRule.perform(HivePushFilterPastJoinRule.java:95) at org.apache.hadoop.hive.ql.optimizer.optiq.rules.HivePushFilterPastJoinRule$2.onMatch(HivePushFilterPastJoinRule.java:41) at org.eigenbase.relopt.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:280) at org.eigenbase.relopt.hep.HepPlanner.applyRule(HepPlanner.java:482) at org.eigenbase.relopt.hep.HepPlanner.applyRules(HepPlanner.java:359) at org.eigenbase.relopt.hep.HepPlanner.executeInstruction(HepPlanner.java:222) at org.eigenbase.relopt.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:119) at org.eigenbase.relopt.hep.HepPlanner.executeProgram(HepPlanner.java:173) at org.eigenbase.relopt.hep.HepPlanner.findBestExp(HepPlanner.java:160) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.applyPreCBOTransforms(SemanticAnalyzer.java:11818) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.apply(SemanticAnalyzer.java:11768) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.apply(SemanticAnalyzer.java:11715) at net.hydromatic.optiq.tools.Frameworks$1.apply(Frameworks.java:146) at net.hydromatic.optiq.prepare.OptiqPrepareImpl.perform(OptiqPrepareImpl.java:536) at net.hydromatic.optiq.tools.Frameworks.withPrepare(Frameworks.java:168) at net.hydromatic.optiq.tools.Frameworks.withPlanner(Frameworks.java:142) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11731) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.access$200(SemanticAnalyzer.java:11715) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9536) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:328) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:412) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:962) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1027) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:898) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:277) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:439) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:375) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:920) at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:133) at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_correctness(TestCliDriver.java:117) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at junit.framework.TestCase.runTest(TestCase.java:168) at junit.framework.TestCase.runBare(TestCase.java:134) at junit.framework.TestResult$1.protect(TestResult.java:110) at 
junit.framework.TestResult.runProtected(TestResult.java:128) at junit.framework.TestResult.run(TestResult.java:113) at junit.framework.TestCase.run(TestCase.java:124) at junit.framework.TestSuite.runTest(TestSuite.java:243) at junit.framework.TestSuite.run(TestSuite.java:238) at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) CBO: plumb in HepPlanner and FieldTrimmer(ColumnPruner) into Optiq based planning - Key: HIVE-7283 URL: https://issues.apache.org/jira/browse/HIVE-7283 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Harish Butani
[jira] [Commented] (HIVE-7294) sql std auth - authorize show grant statements
[ https://issues.apache.org/jira/browse/HIVE-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046517#comment-14046517 ] Hive QA commented on HIVE-7294: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12652889/HIVE-7294.1.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5660 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/620/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/620/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-620/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12652889 sql std auth - authorize show grant statements -- Key: HIVE-7294 URL: https://issues.apache.org/jira/browse/HIVE-7294 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-7294.1.patch A non admin user should not be allowed to run show grant commands only for themselves or a role they belong to. -- This message was sent by Atlassian JIRA (v6.2#6252)
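For context, a minimal HiveQL sketch of the kind of statements this authorization check covers; the user, role, and table names are illustrative and not taken from the patch:
{code}
-- Statements a non-admin user would typically be permitted to run under SQL standard auth:
SHOW GRANT USER current_user_name ON TABLE my_table;   -- grants on themselves
SHOW GRANT ROLE my_role ON TABLE my_table;             -- grants on a role they belong to
{code}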
[jira] [Commented] (HIVE-7040) TCP KeepAlive for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046520#comment-14046520 ] Xuefu Zhang commented on HIVE-7040: --- [~nicothieb] Thanks for working on this. A few questions though: 1. Please generate the patch according to Hive guidelines. 2. It seems that the config default value is true. Is this desirable? 3. Are we going to do this for SSL also? 4. A review board link for code review would be great. TCP KeepAlive for HiveServer2 - Key: HIVE-7040 URL: https://issues.apache.org/jira/browse/HIVE-7040 Project: Hive Issue Type: Improvement Components: HiveServer2, Server Infrastructure Reporter: Nicolas Thiébaud Attachments: HIVE-7040.patch, HIVE-7040.patch.2 Implement TCP KeepAlive for HiveServer2 to avoid half-open connections. A setting could be added:
{code}
<property>
  <name>hive.server2.tcp.keepalive</name>
  <value>true</value>
  <description>Whether to enable TCP keepalive for Hive Server 2</description>
</property>
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
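To illustrate the mechanics behind the proposal, a minimal, hedged sketch (not the attached patch): TCP keepalive is a per-connection socket option, so a server gated by a flag like the proposed hive.server2.tcp.keepalive would enable it on each accepted socket. The class name and port are illustrative.
{code}
import java.net.ServerSocket;
import java.net.Socket;

// Minimal sketch only; HiveServer2's real accept loop lives in its Thrift transport layer.
public class KeepAliveServerSketch {
  public static void main(String[] args) throws Exception {
    boolean keepAliveEnabled = true; // would come from hive.server2.tcp.keepalive
    try (ServerSocket server = new ServerSocket(10000)) { // 10000 is HS2's usual port
      while (true) {
        Socket client = server.accept();
        if (keepAliveEnabled) {
          // Ask the kernel to send keepalive probes so half-open connections
          // are eventually detected and torn down.
          client.setKeepAlive(true);
        }
        client.close(); // a real server would hand the socket to a request handler
      }
    }
  }
}
{code}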
[jira] [Commented] (HIVE-7283) CBO: plumb in HepPlanner and FieldTrimmer(ColumnPruner) into Optiq based planning
[ https://issues.apache.org/jira/browse/HIVE-7283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046539#comment-14046539 ] Gunther Hagleitner commented on HIVE-7283: -- Tests pass without patch. Reverted change in cbo branch. CBO: plumb in HepPlanner and FieldTrimmer(ColumnPruner) into Optiq based planning - Key: HIVE-7283 URL: https://issues.apache.org/jira/browse/HIVE-7283 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7283.1.patch 1. HepPlanner initially used for: - Predicate Pushdown - Transitive Predicate inference - Partition Pruning 2. Use Optiq's FieldTrimmer for ColumnPruner To begin with the rules are copies of Optiq base rules. Once Optiq is refactored to work on Base RelNode classes, the copied rules will be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7283) CBO: plumb in HepPlanner and FieldTrimmer(ColumnPruner) into Optiq based planning
[ https://issues.apache.org/jira/browse/HIVE-7283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046543#comment-14046543 ] Harish Butani commented on HIVE-7283: - ok, sorry, I was running cbo_correctness.q w/o setting hive.cbo.enable=true on the command line. I see the issue: for semijoins, joinType is set to Inner on JoinRelBase, so the assertion that num cols in Join = left + right fails in RelOptUtil.classifyFilters. For now, going to bail in HivePushFilterPastJoinRule for SemiJoins. Should have a patch soon. CBO: plumb in HepPlanner and FieldTrimmer(ColumnPruner) into Optiq based planning - Key: HIVE-7283 URL: https://issues.apache.org/jira/browse/HIVE-7283 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7283.1.patch 1. HepPlanner initially used for: - Predicate Pushdown - Transitive Predicate inference - Partition Pruning 2. Use Optiq's FieldTrimmer for ColumnPruner To begin with the rules are copies of Optiq base rules. Once Optiq is refactored to work on Base RelNode classes, the copied rules will be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 22996: HIVE-7090 Support session-level temporary tables in Hive
On June 26, 2014, 4:21 a.m., Brock Noland wrote: itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniMr.java, line 238 https://reviews.apache.org/r/22996/diff/1/?file=617741#file617741line238 We should be checking for something more specific than a sql exception. error code, message, etc. Sure, will try to check for Table not found. - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22996/#review46712 --- On June 26, 2014, 2:05 a.m., Jason Dere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22996/ --- (Updated June 26, 2014, 2:05 a.m.) Review request for hive, Gunther Hagleitner, Navis Ryu, and Harish Butani. Bugs: HIVE-7090 https://issues.apache.org/jira/browse/HIVE-7090 Repository: hive-git Description --- Temp tables managed in memory by SessionState. SessionHiveMetaStoreClient overrides table-related methods in HiveMetaStore to access the temp tables saved in the SessionState when appropriate. Diffs - itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniMr.java 9fb7550 itests/qtest/testconfiguration.properties 6731561 metastore/if/hive_metastore.thrift cc802c6 metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 9e8d912 ql/src/java/org/apache/hadoop/hive/ql/Context.java abc4290 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 24f829f ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 4d35176 ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 3df2690 ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java 1270520 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f934ac4 ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 71471f4 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83d09c0 ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 2537b75 ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableLikeDesc.java cb5d64c ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 2143d0c ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezTask.java 43125f7 ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 98c3cc3 ql/src/test/org/apache/hadoop/hive/ql/parse/TestMacroSemanticAnalyzer.java 91de8da ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestHiveAuthorizationTaskFactory.java 20d08b3 ql/src/test/queries/clientnegative/temp_table_authorize_create_tbl.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_column_stats.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_create_like_partitions.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_index.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_partitions.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_rename.q PRE-CREATION ql/src/test/queries/clientpositive/show_create_table_temp_table.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_external.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_gb1.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_join1.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_names.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_options1.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_precedence.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_subquery1.q 
PRE-CREATION ql/src/test/queries/clientpositive/temp_table_windowing_expressions.q PRE-CREATION ql/src/test/results/clientnegative/temp_table_authorize_create_tbl.q.out PRE-CREATION ql/src/test/results/clientnegative/temp_table_column_stats.q.out PRE-CREATION ql/src/test/results/clientnegative/temp_table_create_like_partitions.q.out PRE-CREATION ql/src/test/results/clientnegative/temp_table_index.q.out PRE-CREATION ql/src/test/results/clientnegative/temp_table_partitions.q.out PRE-CREATION ql/src/test/results/clientnegative/temp_table_rename.q.out PRE-CREATION ql/src/test/results/clientpositive/show_create_table_temp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/temp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/temp_table_external.q.out PRE-CREATION ql/src/test/results/clientpositive/temp_table_gb1.q.out PRE-CREATION ql/src/test/results/clientpositive/temp_table_join1.q.out
Re: Review Request 22996: HIVE-7090 Support session-level temporary tables in Hive
On June 26, 2014, 4:26 a.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/Context.java, line 76 https://reviews.apache.org/r/22996/diff/1/?file=617745#file617745line76 What is the purpose of removing from here? I don't see any other changes? Looks like that change was not necessary, will remove. On June 26, 2014, 4:26 a.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java, line 226 https://reviews.apache.org/r/22996/diff/1/?file=617748#file617748line226 if (not) else This should be reversed. will fix in next patch On June 26, 2014, 4:26 a.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java, line 317 https://reviews.apache.org/r/22996/diff/1/?file=617748#file617748line317 We shouldn't lose the stack trace here MetaException does not have a constructor to propagate the cause, will try calling initCause() on the exception before throwing. On June 26, 2014, 4:26 a.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java, line 332 https://reviews.apache.org/r/22996/diff/1/?file=617748#file617748line332 Losing stack trace here will fix in next patch On June 26, 2014, 4:26 a.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 10170 https://reviews.apache.org/r/22996/diff/1/?file=617753#file617753line10170 Could we have a better message? will fix in next patch On June 26, 2014, 4:26 a.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java, line 382 https://reviews.apache.org/r/22996/diff/1/?file=617756#file617756line382 a bunch of if (not) else statements here This is an issue? I can change in next patch. - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22996/#review46713 --- On June 26, 2014, 2:05 a.m., Jason Dere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22996/ --- (Updated June 26, 2014, 2:05 a.m.) Review request for hive, Gunther Hagleitner, Navis Ryu, and Harish Butani. Bugs: HIVE-7090 https://issues.apache.org/jira/browse/HIVE-7090 Repository: hive-git Description --- Temp tables managed in memory by SessionState. SessionHiveMetaStoreClient overrides table-related methods in HiveMetaStore to access the temp tables saved in the SessionState when appropriate. 
Diffs - itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniMr.java 9fb7550 itests/qtest/testconfiguration.properties 6731561 metastore/if/hive_metastore.thrift cc802c6 metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 9e8d912 ql/src/java/org/apache/hadoop/hive/ql/Context.java abc4290 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 24f829f ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 4d35176 ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 3df2690 ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java 1270520 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f934ac4 ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 71471f4 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83d09c0 ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 2537b75 ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableLikeDesc.java cb5d64c ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 2143d0c ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezTask.java 43125f7 ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 98c3cc3 ql/src/test/org/apache/hadoop/hive/ql/parse/TestMacroSemanticAnalyzer.java 91de8da ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestHiveAuthorizationTaskFactory.java 20d08b3 ql/src/test/queries/clientnegative/temp_table_authorize_create_tbl.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_column_stats.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_create_like_partitions.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_index.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_partitions.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_rename.q PRE-CREATION
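The initCause() idea mentioned in the review exchange above can be illustrated with a small, hedged helper; the class and method names are invented, not the actual patch code. MetaException only takes a message, so the original exception is chained afterwards to keep its stack trace.
{code}
import org.apache.hadoop.hive.metastore.api.MetaException;

// Sketch of the wrapping approach discussed above; not Hive's actual code.
public final class MetaExceptionUtil {
  private MetaExceptionUtil() {}

  public static MetaException toMetaException(Exception cause) {
    MetaException me = new MetaException(cause.getMessage());
    me.initCause(cause); // preserve the underlying stack trace instead of losing it
    return me;
  }
}
{code}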
[jira] [Commented] (HIVE-7310) Turning CBO on results in NPE on some queries
[ https://issues.apache.org/jira/browse/HIVE-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046558#comment-14046558 ] Gunther Hagleitner commented on HIVE-7310: -- LGTM +1 Turning CBO on results in NPE on some queries - Key: HIVE-7310 URL: https://issues.apache.org/jira/browse/HIVE-7310 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Assignee: Laljo John Pullokkaran Labels: cbo Attachments: HIVE-7310.patch On the CBO branch if I do the following: hive set hive.cbo.enable=true; hive select i_item_id, s_state, GROUPING__ID, avg(ss_quantity) agg1, avg(ss_list_price) agg2, avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4 from store_sales ss join customer_demographics cd on (ss.ss_cdemo_sk = cd.cd_demo_sk) join date_dim d on (ss.ss_sold_date_sk = d.d_date_sk) join store s on (ss.ss_store_sk = s.s_store_sk) join item i on (ss.ss_item_sk = i.i_item_sk) where cd_gender = 'M' and cd_marital_status = 'S' and cd_education_status = 'Secondary' and d_year = 2002 and s_state in ('OH','SD', 'LA', 'MO', 'WA', 'MN') group by i_item_id, s_state with rollup order by i_item_id ,s_state limit 100 ; I get an NPE. The stack trace is: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9555) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:328) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:412) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:962) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1027) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:898) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:277) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:439) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:812) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11732) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.access$200(SemanticAnalyzer.java:11711) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9532) ... 18 more Caused by: java.lang.RuntimeException: java.lang.NullPointerException at net.hydromatic.optiq.tools.Frameworks.withPrepare(Frameworks.java:170) at net.hydromatic.optiq.tools.Frameworks.withPlanner(Frameworks.java:142) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11727) ... 
[jira] [Updated] (HIVE-7090) Support session-level temporary tables in Hive
[ https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-7090: - Attachment: HIVE-7090.6.patch Patch v6, changes based on Brock's comments. Support session-level temporary tables in Hive -- Key: HIVE-7090 URL: https://issues.apache.org/jira/browse/HIVE-7090 Project: Hive Issue Type: Bug Components: SQL Reporter: Gunther Hagleitner Assignee: Jason Dere Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch, HIVE-7090.3.patch, HIVE-7090.4.patch, HIVE-7090.5.patch, HIVE-7090.6.patch It's common to see sql scripts that create some temporary table as an intermediate result, run some additional queries against it and then clean up at the end. We should support temporary tables properly, meaning automatically manage the life cycle and make sure the visibility is restricted to the creating connection/session. Without these it's common to see left over tables in meta-store or weird errors with clashing tmp table names. Proposed syntax: CREATE TEMPORARY TABLE CTAS, CTL, INSERT INTO, should all be supported as usual. Knowing that a user wants a temp table can enable us to further optimize access to it. E.g.: temp tables should be kept in memory where possible, compactions and merging table files aren't required, ... -- This message was sent by Atlassian JIRA (v6.2#6252)
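A short HiveQL sketch of the usage the description above aims to support; the table and column names are illustrative:
{code}
-- Visible only to the creating session and cleaned up automatically when it ends
CREATE TEMPORARY TABLE tmp_results AS
SELECT id, COUNT(*) AS cnt
FROM base_table
GROUP BY id;

SELECT * FROM tmp_results WHERE cnt > 10;
-- No explicit cleanup required, though DROP TABLE tmp_results; still works
{code}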
Re: Review Request 22996: HIVE-7090 Support session-level temporary tables in Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22996/ --- (Updated June 28, 2014, 12:35 a.m.) Review request for hive, Gunther Hagleitner, Navis Ryu, and Harish Butani. Changes --- Update patches based on comments from Brock. Bugs: HIVE-7090 https://issues.apache.org/jira/browse/HIVE-7090 Repository: hive-git Description --- Temp tables managed in memory by SessionState. SessionHiveMetaStoreClient overrides table-related methods in HiveMetaStore to access the temp tables saved in the SessionState when appropriate. Diffs (updated) - itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniMr.java 9fb7550 itests/qtest/testconfiguration.properties 1462ecd metastore/if/hive_metastore.thrift cc802c6 metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 9e8d912 ql/src/java/org/apache/hadoop/hive/ql/Context.java abc4290 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java d8d900b ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 4d35176 ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 3df2690 ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java 1270520 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f934ac4 ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 71471f4 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83d09c0 ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 2537b75 ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableLikeDesc.java cb5d64c ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 2143d0c ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezTask.java 43125f7 ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 98c3cc3 ql/src/test/org/apache/hadoop/hive/ql/parse/TestMacroSemanticAnalyzer.java 91de8da ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestHiveAuthorizationTaskFactory.java 20d08b3 ql/src/test/queries/clientnegative/temp_table_authorize_create_tbl.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_column_stats.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_create_like_partitions.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_index.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_partitions.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_rename.q PRE-CREATION ql/src/test/queries/clientpositive/show_create_table_temp_table.q PRE-CREATION ql/src/test/queries/clientpositive/stats19.q 51514bd ql/src/test/queries/clientpositive/temp_table.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_external.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_gb1.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_join1.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_names.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_options1.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_precedence.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_subquery1.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_windowing_expressions.q PRE-CREATION ql/src/test/results/clientnegative/temp_table_authorize_create_tbl.q.out PRE-CREATION ql/src/test/results/clientnegative/temp_table_column_stats.q.out PRE-CREATION ql/src/test/results/clientnegative/temp_table_create_like_partitions.q.out PRE-CREATION 
ql/src/test/results/clientnegative/temp_table_index.q.out PRE-CREATION ql/src/test/results/clientnegative/temp_table_partitions.q.out PRE-CREATION ql/src/test/results/clientnegative/temp_table_rename.q.out PRE-CREATION ql/src/test/results/clientpositive/nullformat.q.out d311825 ql/src/test/results/clientpositive/nullformatCTAS.q.out cab23d5 ql/src/test/results/clientpositive/show_create_table_alter.q.out 206f4f8 ql/src/test/results/clientpositive/show_create_table_db_table.q.out 528dd36 ql/src/test/results/clientpositive/show_create_table_delimited.q.out d4ffd53 ql/src/test/results/clientpositive/show_create_table_serde.q.out a9e92b4 ql/src/test/results/clientpositive/show_create_table_temp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/temp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/temp_table_external.q.out PRE-CREATION ql/src/test/results/clientpositive/temp_table_gb1.q.out PRE-CREATION ql/src/test/results/clientpositive/temp_table_join1.q.out PRE-CREATION ql/src/test/results/clientpositive/temp_table_names.q.out PRE-CREATION ql/src/test/results/clientpositive/temp_table_options1.q.out PRE-CREATION
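The "check the session first, then fall back to the metastore" behaviour described in the review request above can be sketched roughly as follows; this is a simplified illustration with invented names, not the SessionHiveMetaStoreClient code itself.
{code}
import java.util.HashMap;
import java.util.Map;

// Simplified illustration of the lookup order only; the real client overrides the
// table-related metastore methods and keeps temp tables in SessionState.
public class SessionFirstTableLookup {
  // "db.table" -> table definition held only for this session
  private final Map<String, String> sessionTempTables = new HashMap<>();
  private final Map<String, String> metastoreTables = new HashMap<>();

  public String getTable(String dbName, String tableName) {
    String key = dbName.toLowerCase() + "." + tableName.toLowerCase();
    String temp = sessionTempTables.get(key);
    if (temp != null) {
      return temp; // a session temp table shadows a permanent table of the same name
    }
    return metastoreTables.get(key); // otherwise delegate to the shared metastore
  }
}
{code}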
[jira] [Updated] (HIVE-7310) Turning CBO on results in NPE on some queries
[ https://issues.apache.org/jira/browse/HIVE-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7310: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to cbo branch. Thanks [~jpullokkaran]! Turning CBO on results in NPE on some queries - Key: HIVE-7310 URL: https://issues.apache.org/jira/browse/HIVE-7310 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Assignee: Laljo John Pullokkaran Labels: cbo Attachments: HIVE-7310.patch On the CBO branch if I do the following: hive set hive.cbo.enable=true; hive select i_item_id, s_state, GROUPING__ID, avg(ss_quantity) agg1, avg(ss_list_price) agg2, avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4 from store_sales ss join customer_demographics cd on (ss.ss_cdemo_sk = cd.cd_demo_sk) join date_dim d on (ss.ss_sold_date_sk = d.d_date_sk) join store s on (ss.ss_store_sk = s.s_store_sk) join item i on (ss.ss_item_sk = i.i_item_sk) where cd_gender = 'M' and cd_marital_status = 'S' and cd_education_status = 'Secondary' and d_year = 2002 and s_state in ('OH','SD', 'LA', 'MO', 'WA', 'MN') group by i_item_id, s_state with rollup order by i_item_id ,s_state limit 100 ; I get an NPE. The stack trace is: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9555) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:328) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:412) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:962) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1027) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:898) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:277) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:439) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:812) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11732) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.access$200(SemanticAnalyzer.java:11711) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9532) ... 18 more Caused by: java.lang.RuntimeException: java.lang.NullPointerException at net.hydromatic.optiq.tools.Frameworks.withPrepare(Frameworks.java:170) at net.hydromatic.optiq.tools.Frameworks.withPlanner(Frameworks.java:142) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:11727) ... 
[jira] [Updated] (HIVE-7311) add cbo enable flag to cbo_correctness script and add it to the Tez tests as well
[ https://issues.apache.org/jira/browse/HIVE-7311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7311: - Attachment: HIVE-7311.1.q add cbo enable flag to cbo_correctness script and add it to the Tez tests as well - Key: HIVE-7311 URL: https://issues.apache.org/jira/browse/HIVE-7311 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7311.1.q -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7311) add cbo enable flag to cbo_correctness script and add it to the Tez tests as well
Gunther Hagleitner created HIVE-7311: Summary: add cbo enable flag to cbo_correctness script and add it to the Tez tests as well Key: HIVE-7311 URL: https://issues.apache.org/jira/browse/HIVE-7311 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7311.1.q -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7301) Restore constants moved to HiveConf by HIVE-7211
[ https://issues.apache.org/jira/browse/HIVE-7301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-7301: - Resolution: Fixed Fix Version/s: 0.14.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Patch committed. Thanks Navis! Restore constants moved to HiveConf by HIVE-7211 Key: HIVE-7301 URL: https://issues.apache.org/jira/browse/HIVE-7301 Project: Hive Issue Type: Task Components: Configuration Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.14.0 Attachments: HIVE-7301.1.patch.txt NO PRECOMMIT TESTS HIVE-7211 moved RCFile-related constants to HiveConf. For backward compatibility, restore them as they were. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-7311) add cbo enable flag to cbo_correctness script and add it to the Tez tests as well
[ https://issues.apache.org/jira/browse/HIVE-7311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner resolved HIVE-7311. -- Resolution: Fixed Committed to cbo branch. add cbo enable flag to cbo_correctness script and add it to the Tez tests as well - Key: HIVE-7311 URL: https://issues.apache.org/jira/browse/HIVE-7311 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7311.1.q -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5976) Decouple input formats from STORED as keywords
[ https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5976: - Status: Open (was: Patch Available) [~davidzchen], I'd like to review this and get it committed. Can you post an RB? Thanks! Decouple input formats from STORED as keywords -- Key: HIVE-5976 URL: https://issues.apache.org/jira/browse/HIVE-5976 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5976.2.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch As noted in HIVE-5783, we hard code the input formats mapped to keywords. It'd be nice if there was a registration system so we didn't need to do that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7304) Transitive Predicate Propagation doesn't happen properly after HIVE-7159
[ https://issues.apache.org/jira/browse/HIVE-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046644#comment-14046644 ] Hive QA commented on HIVE-7304: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12652901/HIVE-7304.2.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5655 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/621/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/621/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-621/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12652901 Transitive Predicate Propagation doesn't happen properly after HIVE-7159 Key: HIVE-7304 URL: https://issues.apache.org/jira/browse/HIVE-7304 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7304.1.patch, HIVE-7304.2.patch The reason for the regression is that PredicateTransitivePropagate looks at the FilterOperator below the ReduceSink. SemanticAnalyzer::genNotNullFilterForJoinSourcePlan was stacking another FilterOp for the not null check, so only that predicate was being applied transitively by PredicateTransitivePropagate. -- This message was sent by Atlassian JIRA (v6.2#6252)
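To make the regression above concrete, a hedged HiveQL example of what transitive predicate propagation should do; the table and column names are illustrative, not from the patch:
{code}
SELECT o.*, c.*
FROM orders o JOIN customers c ON (o.cust_id = c.id)
WHERE o.cust_id > 100;
-- With transitive propagation the planner can also apply c.id > 100 to the customers
-- scan. The regression described above meant only the IS NOT NULL filter added by
-- genNotNullFilterForJoinSourcePlan was being propagated, not the user predicate.
{code}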
[jira] [Commented] (HIVE-7063) Optimize for the Top N within a Group use case
[ https://issues.apache.org/jira/browse/HIVE-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046696#comment-14046696 ] Ashutosh Chauhan commented on HIVE-7063: This is not going to optimize limit with rank like the following: {code} select * from ( select p_mfgr, rank() over(..) from part) a limit 4; {code} Rather, this optimization is targeted at rank with filter predicates. It does seem like users are likely to write the query with a filter predicate, given the semantics of rank, so this may not be an issue, but I think it's good to note here so expectations are clear. Optimize for the Top N within a Group use case -- Key: HIVE-7063 URL: https://issues.apache.org/jira/browse/HIVE-7063 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7063.1.patch, HIVE-7063.2.patch It is common to rank within a Group/Partition and then only return the Top N entries within each Group. With Streaming mode for Windowing, we should push the post filter on the rank into the Windowing processing as a Limit expression. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7312) CBO throws ArrayIndexOutOfBounds
Gunther Hagleitner created HIVE-7312: Summary: CBO throws ArrayIndexOutOfBounds Key: HIVE-7312 URL: https://issues.apache.org/jira/browse/HIVE-7312 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Running tpcds query 17. Still confirming if col stats are available. When I turn CBO on (this is just the relevant snippet, the actual exception is pages long): Caused by: java.lang.IndexOutOfBoundsException: Index: 24, Size: 0 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.ql.optimizer.optiq.RelOptHiveTable.getColStat(RelOptHiveTable.java:97) at org.apache.hadoop.hive.ql.optimizer.optiq.reloperators.HiveTableScanRel.getColStat(HiveTableScanRel.java:73) at org.apache.hadoop.hive.ql.optimizer.optiq.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:47) at org.apache.hadoop.hive.ql.optimizer.optiq.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:36) ... 272 more -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7063) Optimize for the Top N within a Group use case
[ https://issues.apache.org/jira/browse/HIVE-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046699#comment-14046699 ] Harish Butani commented on HIVE-7063: - Yes, in your case we can optimize as though 'rank < 5' was specified. Though I cannot see a valid use case for writing a limit after a windowing expression; as you point out, the more common case is a predicate on rank. Optimize for the Top N within a Group use case -- Key: HIVE-7063 URL: https://issues.apache.org/jira/browse/HIVE-7063 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7063.1.patch, HIVE-7063.2.patch It is common to rank within a Group/Partition and then only return the Top N entries within each Group. With Streaming mode for Windowing, we should push the post filter on the rank into the Windowing processing as a Limit expression. -- This message was sent by Atlassian JIRA (v6.2#6252)
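A hedged sketch of the filter-on-rank form the optimization does target, using the part table mentioned above; the other column names are illustrative:
{code}
SELECT p_mfgr, p_name, r
FROM (
  SELECT p_mfgr, p_name,
         rank() OVER (PARTITION BY p_mfgr ORDER BY p_retailprice DESC) AS r
  FROM part
) ranked
WHERE r < 5;  -- top 4 per manufacturer; this predicate can be pushed into windowing as a limit
{code}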
[jira] [Created] (HIVE-7313) Allow session-level temp-tables to be marked as in-memory tables
Gopal V created HIVE-7313: - Summary: Allow session-level temp-tables to be marked as in-memory tables Key: HIVE-7313 URL: https://issues.apache.org/jira/browse/HIVE-7313 Project: Hive Issue Type: Improvement Components: Tez Reporter: Gopal V When the hadoop-2.3C shims are in action, APIs which can pin small tables into memory are available. Any session with an in-memory table can create HDFS in-memory pools with default caching semantics and add its files into the cache pool. Example code to implement the behaviour was prototyped for the Tez Application Master, but the AM does not have enough information to determine the cache policies. https://github.com/rajeshbalamohan/hdfs-cache-tool/blob/master/src/main/java/org/apache/hadoop/hdfs/tools/HDFSCache.java#L74 -- This message was sent by Atlassian JIRA (v6.2#6252)
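For reference, a minimal, hedged sketch of the Hadoop 2.3 centralized-cache calls the description alludes to; the pool name and path are illustrative, and this is not the prototype linked above.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;
import org.apache.hadoop.hdfs.protocol.CachePoolInfo;

// Sketch only: create a cache pool for the session and pin a temp table's directory.
public class PinTempTableSketch {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    dfs.addCachePool(new CachePoolInfo("hive-session-pool"));
    dfs.addCacheDirective(new CacheDirectiveInfo.Builder()
        .setPool("hive-session-pool")
        .setPath(new Path("/tmp/hive/session123/tmp_results"))
        .build()); // datanodes begin caching the blocks under this path in memory
  }
}
{code}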
[jira] [Commented] (HIVE-7312) CBO throws ArrayIndexOutOfBounds
[ https://issues.apache.org/jira/browse/HIVE-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046700#comment-14046700 ] Gunther Hagleitner commented on HIVE-7312: -- Only happens when col stats haven't been gathered. Leaving this open though. There should be a graceful way to detect that and maybe log what's happening. CBO throws ArrayIndexOutOfBounds Key: HIVE-7312 URL: https://issues.apache.org/jira/browse/HIVE-7312 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Running tpcds query 17. Still confirming if col stats are available. When I turn CBO on (this is just the relevant snipped, the actual exception is pages long): Caused by: java.lang.IndexOutOfBoundsException: Index: 24, Size: 0 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.ql.optimizer.optiq.RelOptHiveTable.getColStat(RelOptHiveTable.java:97) at org.apache.hadoop.hive.ql.optimizer.optiq.reloperators.HiveTableScanRel.getColStat(HiveTableScanRel.java:73) at org.apache.hadoop.hive.ql.optimizer.optiq.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:47) at org.apache.hadoop.hive.ql.optimizer.optiq.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:36) ... 272 more -- This message was sent by Atlassian JIRA (v6.2#6252)
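A hedged illustration of the "graceful way to detect that" suggested above, with invented names; the actual fix would live in RelOptHiveTable.getColStat() rather than a helper like this.
{code}
import java.util.List;

// Sketch: bounds-check the stats list and fall back instead of throwing
// IndexOutOfBoundsException when column statistics were never gathered.
public class ColStatLookupSketch {
  public static <T> T getColStatOrNull(List<T> colStats, int projIndx) {
    if (colStats == null || projIndx < 0 || projIndx >= colStats.size()) {
      System.err.println("No column statistics for projection index " + projIndx
          + "; falling back to default estimates.");
      return null;
    }
    return colStats.get(projIndx);
  }
}
{code}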
[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive
[ https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046710#comment-14046710 ] Hive QA commented on HIVE-7090: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12652926/HIVE-7090.6.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5686 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/623/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/623/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-623/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12652926 Support session-level temporary tables in Hive -- Key: HIVE-7090 URL: https://issues.apache.org/jira/browse/HIVE-7090 Project: Hive Issue Type: Bug Components: SQL Reporter: Gunther Hagleitner Assignee: Jason Dere Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch, HIVE-7090.3.patch, HIVE-7090.4.patch, HIVE-7090.5.patch, HIVE-7090.6.patch It's common to see sql scripts that create some temporary table as an intermediate result, run some additional queries against it and then clean up at the end. We should support temporary tables properly, meaning automatically manage the life cycle and make sure the visibility is restricted to the creating connection/session. Without these it's common to see left over tables in meta-store or weird errors with clashing tmp table names. Proposed syntax: CREATE TEMPORARY TABLE CTAS, CTL, INSERT INTO, should all be supported as usual. Knowing that a user wants a temp table can enable us to further optimize access to it. E.g.: temp tables should be kept in memory where possible, compactions and merging table files aren't required, ... -- This message was sent by Atlassian JIRA (v6.2#6252)