[jira] [Commented] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192975#comment-14192975 ]

Hive QA commented on HIVE-8689:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678641/HIVE-8689.01.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6608 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1586/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1586/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1586/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678641 - PreCommit-HIVE-TRUNK-Build

handle overflows in statistics better
Key: HIVE-8689
URL: https://issues.apache.org/jira/browse/HIVE-8689
Project: Hive
Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Fix For: 0.14.0
Attachments: HIVE-8689.01.patch, HIVE-8689.patch

Improve overflow checks in StatsAnnotation optimizer.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
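[Editor's note] A minimal sketch of the kind of saturating arithmetic such an overflow check typically uses; this is a hypothetical illustration, not the contents of HIVE-8689.01.patch, and the class and method names are made up.

{code}
// Hypothetical helper: combine row-count / data-size estimates without
// wrapping past Long.MAX_VALUE. Overflow is clamped to Long.MAX_VALUE
// instead of silently producing a wrapped (possibly negative) value.
public final class StatsMathSketch {
  private StatsMathSketch() {}

  // Estimates are assumed non-negative.
  public static long safeMult(long a, long b) {
    if (a == 0 || b == 0) {
      return 0;
    }
    // If a * b would exceed Long.MAX_VALUE, saturate rather than overflow.
    if (a > Long.MAX_VALUE / b) {
      return Long.MAX_VALUE;
    }
    return a * b;
  }

  public static long safeAdd(long a, long b) {
    long sum = a + b;
    // Two non-negative operands yielding a negative sum means the add wrapped.
    return (sum < 0) ? Long.MAX_VALUE : sum;
  }
}
{code}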
[jira] [Commented] (HIVE-8688) serialized plan OutputStream is not being closed
[ https://issues.apache.org/jira/browse/HIVE-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192984#comment-14192984 ]

Thejas M Nair commented on HIVE-8688:
--------------------------------------

I found this issue while trying to find the cause of the failure below. I am not sure if this fixes the following issue, because it's hard to reproduce.

{code}
Error: java.lang.RuntimeException: org.apache.hive.com.esotericsoftware.kryo.KryoException: Buffer underflow.
    at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:422)
    at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:285)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:263)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:475)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:468)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:648)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:169)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: Buffer underflow.
    at org.apache.hive.com.esotericsoftware.kryo.io.Input.require(Input.java:181)
    at org.apache.hive.com.esotericsoftware.kryo.io.Input.readVarInt(Input.java:355)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:809)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:670)
    at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:1023)
    at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:931)
    at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:945)
    at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:389)
    ... 13 more
{code}

serialized plan OutputStream is not being closed
Key: HIVE-8688
URL: https://issues.apache.org/jira/browse/HIVE-8688
Project: Hive
Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8688.1.patch

The OutputStream to which the serialized plan is written is not being closed in several places. This can result in the plan not getting written correctly. I have seen intermittent issues in deserializing the plan, and I think this could be the/a cause.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
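[Editor's note] A minimal sketch of the resource-handling pattern this issue is about, assuming a hypothetical plan-writing method; it is not the actual Utilities code from HIVE-8688.1.patch.

{code}
// Illustration: if the stream the serialized plan is written to is never
// closed (and therefore possibly never flushed), readers can later hit
// "Buffer underflow"-style errors on a truncated plan file.
import java.io.IOException;
import java.io.OutputStream;

class PlanWriterSketch {
  void writePlan(byte[] serializedPlan, OutputStream out) throws IOException {
    // try-with-resources guarantees close() even on error paths,
    // so the plan bytes are fully flushed to the destination.
    try (OutputStream os = out) {
      os.write(serializedPlan);
    }
  }
}
{code}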
[jira] [Commented] (HIVE-8688) serialized plan OutputStream is not being closed
[ https://issues.apache.org/jira/browse/HIVE-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192985#comment-14192985 ]

Thejas M Nair commented on HIVE-8688:
--------------------------------------

[~hagleitn] This is a simple bug fix. It might help avoid issues like the stack trace above. I think it's useful for 0.14.

serialized plan OutputStream is not being closed
Key: HIVE-8688
URL: https://issues.apache.org/jira/browse/HIVE-8688
Project: Hive
Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8688.1.patch

The OutputStream to which the serialized plan is written is not being closed in several places. This can result in the plan not getting written correctly. I have seen intermittent issues in deserializing the plan, and I think this could be the/a cause.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192997#comment-14192997 ]

Hive QA commented on HIVE-8656:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678609/HIVE-8656.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1587/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1587/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1587/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678609 - PreCommit-HIVE-TRUNK-Build

CBO: auto_join_filters fails
Key: HIVE-8656
URL: https://issues.apache.org/jira/browse/HIVE-8656
Project: Hive
Issue Type: Sub-task
Components: CBO
Reporter: Sergey Shelukhin
Assignee: Ashutosh Chauhan
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8656.patch

Haven't looked into why yet.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8461) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
[ https://issues.apache.org/jira/browse/HIVE-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192998#comment-14192998 ] Hive QA commented on HIVE-8461: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678616/HIVE-8461.04.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1588/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1588/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1588/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-1588/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/TypeConverter.java' ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/hcatalog-pig-adapter/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target accumulo-handler/target hwi/target common/target common/src/gen contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1635895. At revision 1635895. 
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12678616 - PreCommit-HIVE-TRUNK-Build Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... . - Key: HIVE-8461 URL: https://issues.apache.org/jira/browse/HIVE-8461 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8461.01.patch, HIVE-8461.02.patch, HIVE-8461.03.patch, HIVE-8461.04.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8395) CBO: enable by default
[ https://issues.apache.org/jira/browse/HIVE-8395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193022#comment-14193022 ]

Hive QA commented on HIVE-8395:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678618/HIVE-8395.20.patch

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 6608 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_outer_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_router_join_ppr
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1589/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1589/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1589/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678618 - PreCommit-HIVE-TRUNK-Build

CBO: enable by default
Key: HIVE-8395
URL: https://issues.apache.org/jira/browse/HIVE-8395
Project: Hive
Issue Type: Improvement
Components: CBO
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Fix For: 0.15.0
Attachments: HIVE-8395.01.patch, HIVE-8395.02.patch, HIVE-8395.03.patch, HIVE-8395.04.patch, HIVE-8395.05.patch, HIVE-8395.06.patch, HIVE-8395.07.patch, HIVE-8395.08.patch, HIVE-8395.09.patch, HIVE-8395.10.patch, HIVE-8395.11.patch, HIVE-8395.12.patch, HIVE-8395.12.patch, HIVE-8395.13.patch, HIVE-8395.13.patch, HIVE-8395.14.patch, HIVE-8395.15.patch, HIVE-8395.16.patch, HIVE-8395.17.patch, HIVE-8395.18.patch, HIVE-8395.18.patch, HIVE-8395.19.patch, HIVE-8395.20.patch, HIVE-8395.patch

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8690) Move Avro dependency to 1.7.7
[ https://issues.apache.org/jira/browse/HIVE-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193029#comment-14193029 ]

Hive QA commented on HIVE-8690:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678631/HIVE-8690.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1590/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1590/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1590/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678631 - PreCommit-HIVE-TRUNK-Build

Move Avro dependency to 1.7.7
Key: HIVE-8690
URL: https://issues.apache.org/jira/browse/HIVE-8690
Project: Hive
Issue Type: New Feature
Affects Versions: 0.13.1
Reporter: Thomas Friedrich
Assignee: Thomas Friedrich
Priority: Minor
Fix For: 0.14.0, 0.15.0
Attachments: HIVE-8690.1.patch

Move Avro dependency from 1.7.5 to current release 1.7.7.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193049#comment-14193049 ]

Hive QA commented on HIVE-8687:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678649/HIVE-8687.3.patch

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 6634 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_charvarchar
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_decimal_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1591/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1591/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1591/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12678649 - PreCommit-HIVE-TRUNK-Build

Support Avro through HCatalog
Key: HIVE-8687
URL: https://issues.apache.org/jira/browse/HIVE-8687
Project: Hive
Issue Type: Bug
Components: HCatalog, Serializers/Deserializers
Affects Versions: 0.14.0
Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch

Attempting to write to an HCatalog-defined table backed by the AvroSerde fails with the following stack trace:
{code}
java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84)
    at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253)
    at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
    at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242)
    at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
{code}
The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable.

It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way, fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue?

The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
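[Editor's note] An illustrative-only sketch of the key-type mismatch described above; class and method names are invented and do not come from the Hive or HCatalog source.

{code}
// A writer whose signature demands LongWritable keys throws ClassCastException
// at runtime when a container layer hands it a NullWritable, even though the
// writer never actually uses the key -- analogous to the stack trace above.
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Writable;

class KeyMismatchSketch {
  // Mirrors a RecordWriter<LongWritable, V>.write(...) that ignores its key.
  static void writeIgnoringKey(LongWritable key, Writable value) {
    // key is unused; only the value would be serialized
  }

  public static void main(String[] args) {
    Writable key = NullWritable.get();
    // Compiles, but fails at runtime:
    // java.lang.ClassCastException: NullWritable cannot be cast to LongWritable
    writeIgnoringKey((LongWritable) key, null);
  }
}
{code}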
[jira] [Commented] (HIVE-4490) HS2 - 'select null ..' fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193052#comment-14193052 ]

qiaohaijun commented on HIVE-4490:
----------------------------------

+1

HS2 - 'select null ..' fails with NPE
Key: HIVE-4490
URL: https://issues.apache.org/jira/browse/HIVE-4490
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Thejas M Nair

Eg, from beeline
{code}
select null, i from t1 ;
Error: Error running query: java.lang.NullPointerException (state=,code=0)
Error: Error running query: java.lang.NullPointerException (state=,code=0)
{code}
In HS2 log:
org.apache.hive.service.cli.HiveSQLException: Error running query: java.lang.NullPointerException
    at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:113)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:169)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:62)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1178)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:524)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:57)
    at $Proxy8.executeStatement(Unknown Source)
    at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:203)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:565)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4172) JDBC2 does not support VOID type
[ https://issues.apache.org/jira/browse/HIVE-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193053#comment-14193053 ]

qiaohaijun commented on HIVE-4172:
----------------------------------

+1

JDBC2 does not support VOID type
Key: HIVE-4172
URL: https://issues.apache.org/jira/browse/HIVE-4172
Project: Hive
Issue Type: Improvement
Components: HiveServer2, JDBC
Affects Versions: 0.11.0
Reporter: Navis
Assignee: Navis
Priority: Minor
Labels: HiveServer2
Fix For: 0.12.0
Attachments: HIVE-4172.D9555.1.patch, HIVE-4172.D9555.2.patch, HIVE-4172.D9555.3.patch, HIVE-4172.D9555.4.patch, HIVE-4172.D9555.5.patch

In beeline, select key, null from src fails with exception:
{noformat}
org.apache.hive.service.cli.HiveSQLException: Error running query: java.lang.NullPointerException
    at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:112)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:166)
    at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:183)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:39)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5683) JDBC support for char
[ https://issues.apache.org/jira/browse/HIVE-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193061#comment-14193061 ]

qiaohaijun commented on HIVE-5683:
----------------------------------

+1

JDBC support for char
Key: HIVE-5683
URL: https://issues.apache.org/jira/browse/HIVE-5683
Project: Hive
Issue Type: Bug
Components: JDBC, Types
Reporter: Jason Dere
Assignee: Jason Dere
Fix For: 0.13.0
Attachments: HIVE-5683.1.patch, HIVE-5683.2.patch, HIVE-5683.3.patch

Support char type in JDBC, including char length in result set metadata.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5230) Better error reporting by async threads in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193060#comment-14193060 ]

qiaohaijun commented on HIVE-5230:
----------------------------------

+1

Better error reporting by async threads in HiveServer2
Key: HIVE-5230
URL: https://issues.apache.org/jira/browse/HIVE-5230
Project: Hive
Issue Type: Sub-task
Components: HiveServer2
Affects Versions: 0.12.0, 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Fix For: 0.13.0
Attachments: HIVE-5230.1.patch, HIVE-5230.1.patch, HIVE-5230.10.patch, HIVE-5230.2.patch, HIVE-5230.3.patch, HIVE-5230.4.patch, HIVE-5230.6.patch, HIVE-5230.7.patch, HIVE-5230.8.patch, HIVE-5230.9.patch

[HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. When a background thread gets an error, currently the client can only poll for the operation state, and the error with its stack trace is only logged. However, it would be useful to provide a richer error response like the thrift API does with TStatus (which is constructed while building a Thrift response object).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8529) HiveSessionImpl#fetchResults should not try to fetch operation log when hive.server2.logging.operation.enabled is false.
[ https://issues.apache.org/jira/browse/HIVE-8529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193070#comment-14193070 ]

qiaohaijun commented on HIVE-8529:
----------------------------------

+1

HiveSessionImpl#fetchResults should not try to fetch operation log when hive.server2.logging.operation.enabled is false.
Key: HIVE-8529
URL: https://issues.apache.org/jira/browse/HIVE-8529
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Reporter: Vaibhav Gumashta
Fix For: 0.15.0

Throws this even when it is disabled:
{code}
14/10/20 15:53:14 [HiveServer2-Handler-Pool: Thread-53]: DEBUG security.UserGroupInformation: PrivilegedActionException as:vgumashta (auth:SIMPLE) cause:org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b3d05ca6-e3e8-4bef-b869-0ea0732c3ac5]
14/10/20 15:53:14 [HiveServer2-Handler-Pool: Thread-53]: WARN thrift.ThriftCLIService: Error fetching results:
org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b3d05ca6-e3e8-4bef-b869-0ea0732c3ac5]
    at org.apache.hive.service.cli.operation.OperationManager.getOperationLogRowSet(OperationManager.java:240)
    at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:665)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:394)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:508)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
    at com.sun.proxy.$Proxy20.fetchResults(Unknown Source)
    at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:427)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:582)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:695)
14/10/20 15:53:14 [HiveServer2-Handler-Pool: Thread-53]: DEBUG transport.TSaslTransport: writing data length: 2525
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
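[Editor's note] A hypothetical sketch of the guard the issue title asks for; the class, method names, and return types below are illustrative only and are not the HiveSessionImpl code.

{code}
// Skip the operation-log fetch path entirely when operation logging is
// disabled (hive.server2.logging.operation.enabled=false), instead of letting
// it fail with "Couldn't find log associated with operation handle".
class FetchResultsSketch {
  private final boolean operationLoggingEnabled; // value of the config flag

  FetchResultsSketch(boolean operationLoggingEnabled) {
    this.operationLoggingEnabled = operationLoggingEnabled;
  }

  Object fetchResults(Object opHandle, boolean wantOperationLog) {
    if (wantOperationLog) {
      if (!operationLoggingEnabled) {
        // Logging is off: return an empty result (or a clear error) rather
        // than probing for a log file that was never created.
        return emptyRowSet();
      }
      return fetchOperationLog(opHandle);
    }
    return fetchQueryResults(opHandle);
  }

  private Object emptyRowSet() { return new Object(); }
  private Object fetchOperationLog(Object h) { return new Object(); }
  private Object fetchQueryResults(Object h) { return new Object(); }
}
{code}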
[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193079#comment-14193079 ]

qiaohaijun commented on HIVE-6050:
----------------------------------

+1

JDBC backward compatibility is broken
Key: HIVE-6050
URL: https://issues.apache.org/jira/browse/HIVE-6050
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Carl Steinbach
Priority: Blocker

Connect from JDBC driver of Hive 0.13 (TProtocolVersion=v4) to HiveServer2 of Hive 0.10 (TProtocolVersion=v1), will return the following exception:
{noformat}
java.sql.SQLException: Could not establish connection to jdbc:hive2://localhost:1/default: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
    at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:158)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:187)
    at org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73)
    at org.apache.hive.jdbc.MyTestJdbcDriver2.<init>(MyTestJdbcDriver2.java:49)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187)
    at org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914)
Caused by: org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327)
    ... 37 more
{noformat}
On code analysis, it looks like the 'client_protocol' scheme is a ThriftEnum, which doesn't seem to be backward-compatible. Look at the code path in the generated file 'TOpenSessionReq.java', method TOpenSessionReqStandardScheme.read():
1. The method will call 'TProtocolVersion.findValue()' on the thrift protocol's byte stream, which returns null if the client is sending an enum value unknown to the server. (v4 is unknown to the server.)
2. The method will then call struct.validate(), which will throw the above exception because of the null version.
So it doesn't look like the current backward-compatibility scheme will work.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
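[Editor's note] A small sketch of the enum behavior described in the analysis above. Thrift-generated enums expose a findValue(int) that returns null for unknown values; the enum constants below are illustrative, not the actual TProtocolVersion source.

{code}
// An "old server" enum that predates V4: findValue(3) returns null, which is
// what later trips the "Required field 'client_protocol' is unset!" check.
enum ProtocolVersionSketch {
  V1(0), V2(1), V3(2);

  private final int value;
  ProtocolVersionSketch(int value) { this.value = value; }

  static ProtocolVersionSketch findValue(int value) {
    for (ProtocolVersionSketch v : values()) {
      if (v.value == value) {
        return v;
      }
    }
    return null; // unknown (newer) value -> null, as thrift-generated code does
  }

  public static void main(String[] args) {
    // A v4 client sends 3; the old server cannot map it and gets null.
    System.out.println(findValue(3)); // prints "null"
  }
}
{code}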
[jira] [Commented] (HIVE-6160) Follow-on to HS2 ResultSet Serialization Performance Regression
[ https://issues.apache.org/jira/browse/HIVE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193080#comment-14193080 ]

qiaohaijun commented on HIVE-6160:
----------------------------------

+1

Follow-on to HS2 ResultSet Serialization Performance Regression
Key: HIVE-6160
URL: https://issues.apache.org/jira/browse/HIVE-6160
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 0.13.0
Reporter: George Chow
Assignee: Xiao Meng
Priority: Minor

As suggested by Brock, this is a follow-on to HIVE-3746 to address:
1) test backwards compatibility with the older driver and fix any outstanding issues
2) remove the debug stuff that is included (printStackTrace and System.out)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8693) Separate out fair scheduler dependency from hadoop 0.23 shim
[ https://issues.apache.org/jira/browse/HIVE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193081#comment-14193081 ]

Hive QA commented on HIVE-8693:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678653/HIVE-8693.1.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1592/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1592/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1592/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678653 - PreCommit-HIVE-TRUNK-Build

Separate out fair scheduler dependency from hadoop 0.23 shim
Key: HIVE-8693
URL: https://issues.apache.org/jira/browse/HIVE-8693
Project: Hive
Issue Type: Bug
Components: HiveServer2, Shims
Affects Versions: 0.14.0, 0.15.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Attachments: HIVE-8693.1.patch

As part of HIVE-8424, HiveServer2 uses Fair Scheduler APIs to determine resource queue allocation for the non-impersonation case. This adds a hard dependency on Yarn server jars for Hive.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193085#comment-14193085 ]

qiaohaijun commented on HIVE-6050:
----------------------------------

14/11/01 19:12:44 ERROR jdbc.HiveConnection: Error opening session
org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:156)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:143)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:415)
    at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:193)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:187)
    at org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:145)
    at org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:186)
    at org.apache.hive.beeline.Commands.connect(Commands.java:959)
    at org.apache.hive.beeline.Commands.connect(Commands.java:880)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:44)
    at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:801)
    at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659)
    at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368)
    at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351)
Error: Invalid URL: jdbc:hive2://10.134.34.181:1 (state=08S01,code=0)
---
spark 1.1.1
hive 0.12-probuf-2.5

JDBC backward compatibility is broken
Key: HIVE-6050
URL: https://issues.apache.org/jira/browse/HIVE-6050
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Carl Steinbach
Priority: Blocker

Connect from JDBC driver of Hive 0.13 (TProtocolVersion=v4) to HiveServer2 of Hive 0.10 (TProtocolVersion=v1), will return the following exception:
{noformat}
java.sql.SQLException: Could not establish connection to jdbc:hive2://localhost:1/default: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
    at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:158)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:187)
    at org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73)
    at org.apache.hive.jdbc.MyTestJdbcDriver2.<init>(MyTestJdbcDriver2.java:49)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187)
    at org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at
[jira] [Commented] (HIVE-8688) serialized plan OutputStream is not being closed
[ https://issues.apache.org/jira/browse/HIVE-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193117#comment-14193117 ]

Hive QA commented on HIVE-8688:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678590/HIVE-8688.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1593/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1593/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1593/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678590 - PreCommit-HIVE-TRUNK-Build

serialized plan OutputStream is not being closed
Key: HIVE-8688
URL: https://issues.apache.org/jira/browse/HIVE-8688
Project: Hive
Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8688.1.patch

The OutputStream to which the serialized plan is written is not being closed in several places. This can result in the plan not getting written correctly. I have seen intermittent issues in deserializing the plan, and I think this could be the/a cause.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8671) Overflow in estimate row count and data size with fetch column stats
[ https://issues.apache.org/jira/browse/HIVE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193142#comment-14193142 ]

Hive QA commented on HIVE-8671:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678665/HIVE-8671.5.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1594/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1594/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1594/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678665 - PreCommit-HIVE-TRUNK-Build

Overflow in estimate row count and data size with fetch column stats
Key: HIVE-8671
URL: https://issues.apache.org/jira/browse/HIVE-8671
Project: Hive
Issue Type: Bug
Components: Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Prasanth J
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8671.1.patch, HIVE-8671.2.patch, HIVE-8671.3.patch, HIVE-8671.4.patch, HIVE-8671.5.patch

Overflow in row counts and data size for several TPC-DS queries. Interestingly the operators which have overflow end up running with a small parallelism. For instance Reducer 2 has an overflow but it only runs with parallelism of 2.
{code}
Reducer 2
  Reduce Operator Tree:
    Group By Operator
      aggregations: sum(VALUE._col0)
      keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: float)
      mode: mergepartial
      outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
      Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE
      Reduce Output Operator
        key expressions: _col3 (type: string), _col3 (type: string)
        sort order: ++
        Map-reduce partition columns: _col3 (type: string)
        Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE
        value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: float), _col5 (type: double)
  Execution mode: vectorized
{code}

{code}
VERTEX     TOTAL_TASKS  DURATION_SECONDS  CPU_TIME_MILLIS  INPUT_RECORDS  OUTPUT_RECORDS
Map 1      62           26.41             1,779,510        211,978,502    60,628,390
Map 5      1            4.28              6,950            138,098        138,098
Map 6      1            2.44              3,910            31             31
Reducer 2  2            22.69             61,320           60,628,390     69,182
Reducer 3  1            2.63              3,910            69,182         100
Reducer 4  1            1.01              1,180            100            100
{code}

Query
{code}
explain
select i_item_desc
      ,i_category
      ,i_class
      ,i_current_price
      ,i_item_id
      ,sum(ws_ext_sales_price) as itemrevenue
      ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over (partition by i_class) as revenueratio
from web_sales
    ,item
    ,date_dim
where web_sales.ws_item_sk = item.i_item_sk
  and item.i_category in ('Jewelry', 'Sports', 'Books')
  and web_sales.ws_sold_date_sk = date_dim.d_date_sk
  and date_dim.d_date between '2001-01-12' and '2001-02-11'
group by i_item_id
        ,i_item_desc
        ,i_category
        ,i_class
        ,i_current_price
order by i_category
        ,i_class
        ,i_item_id
        ,i_item_desc
        ,revenueratio
limit 100
{code}

Explain
{code}
STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Map 1
[jira] [Commented] (HIVE-8435) Add identity project remover optimization
[ https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193176#comment-14193176 ] Hive QA commented on HIVE-8435: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678666/HIVE-8435.06.patch {color:red}ERROR:{color} -1 due to 891 failed/errored test(s), 6610 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver_accumulo_predicate_pushdown org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver_accumulo_queries org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver_accumulo_single_sourced_multi_insert org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_allcolref_in_udf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2_orc org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguous_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_analyze_table_null_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_limit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_create_temp_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join0 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join21 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join22 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join28 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join29 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join30 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join31 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-8685:
---------------------------------

Description:
This makes DDL commands fail
This was stupidly broken in HIVE-8643

was:
This makes DDL commands fail
This was stupidly broken in HIVE-8643
NO PRECOMMIT TESTS

DDL operations in WebHCat set proxy user to null in unsecure mode
Key: HIVE-8685
URL: https://issues.apache.org/jira/browse/HIVE-8685
Project: Hive
Issue Type: Bug
Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Critical
Attachments: HIVE-8685.2.patch, HIVE-8685.patch

This makes DDL commands fail
This was stupidly broken in HIVE-8643

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Patch Available (was: Open) DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Open (was: Patch Available) DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Attachment: HIVE-8685.3.patch patch 2 and 3 are the same - just trying to kick off build bot DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Patch Available (was: Open) DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Open (was: Patch Available) DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-8687: --- Status: Open (was: Patch Available) Support Avro through HCatalog - Key: HIVE-8687 URL: https://issues.apache.org/jira/browse/HIVE-8687 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
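To make the signature discussion above concrete, here is a minimal, hypothetical sketch (not the actual AvroContainerOutputFormat or FileRecordWriterContainer code) of why widening the key parameter to WritableComparable lets both callers coexist when the key is ignored anyway:
{code}
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical stand-in for a record writer: because the key is never used,
// accepting any WritableComparable lets it take the LongWritable that Hive
// passes as well as the NullWritable that HCatalog's container passes.
class KeyAgnosticWriter<V> {
  public void write(WritableComparable<?> ignoredKey, V value) {
    // serialize only 'value'; the key is intentionally ignored
    System.out.println("wrote: " + value);
  }
}

public class KeyCompatDemo {
  public static void main(String[] args) {
    KeyAgnosticWriter<Text> writer = new KeyAgnosticWriter<Text>();
    writer.write(new LongWritable(1L), new Text("row from Hive"));   // Hive-style key
    writer.write(NullWritable.get(), new Text("row from HCatalog")); // HCatalog-style key
  }
}
{code}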
[jira] [Updated] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-8687: --- Attachment: HIVE-8687.4.patch Attaching updated version of trunk patch to fix above issue (branch-0.14 version 3 of the patch was good for branch-0.14) Support Avro through HCatalog - Key: HIVE-8687 URL: https://issues.apache.org/jira/browse/HIVE-8687 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.4.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-8687: --- Status: Patch Available (was: Open) Support Avro through HCatalog - Key: HIVE-8687 URL: https://issues.apache.org/jira/browse/HIVE-8687 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.4.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8461) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
[ https://issues.apache.org/jira/browse/HIVE-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8461: --- Status: In Progress (was: Patch Available) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... . - Key: HIVE-8461 URL: https://issues.apache.org/jira/browse/HIVE-8461 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8461.01.patch, HIVE-8461.02.patch, HIVE-8461.03.patch, HIVE-8461.04.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8461) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
[ https://issues.apache.org/jira/browse/HIVE-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8461: --- Attachment: HIVE-8461.05.patch Merge conflicts from recent commit (HIVE-8632) that touched VectorHashKeyWrapper. Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... . - Key: HIVE-8461 URL: https://issues.apache.org/jira/browse/HIVE-8461 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8461.01.patch, HIVE-8461.02.patch, HIVE-8461.03.patch, HIVE-8461.04.patch, HIVE-8461.05.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8395) CBO: enable by default
[ https://issues.apache.org/jira/browse/HIVE-8395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193297#comment-14193297 ] Sergey Shelukhin commented on HIVE-8395: [~ashutoshc] after fixing a bug usually some out files would need to be updated (assuming they have acceptable changes) as a followup... this might be such a case CBO: enable by default -- Key: HIVE-8395 URL: https://issues.apache.org/jira/browse/HIVE-8395 Project: Hive Issue Type: Improvement Components: CBO Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.15.0 Attachments: HIVE-8395.01.patch, HIVE-8395.02.patch, HIVE-8395.03.patch, HIVE-8395.04.patch, HIVE-8395.05.patch, HIVE-8395.06.patch, HIVE-8395.07.patch, HIVE-8395.08.patch, HIVE-8395.09.patch, HIVE-8395.10.patch, HIVE-8395.11.patch, HIVE-8395.12.patch, HIVE-8395.12.patch, HIVE-8395.13.patch, HIVE-8395.13.patch, HIVE-8395.14.patch, HIVE-8395.15.patch, HIVE-8395.16.patch, HIVE-8395.17.patch, HIVE-8395.18.patch, HIVE-8395.18.patch, HIVE-8395.19.patch, HIVE-8395.20.patch, HIVE-8395.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8461) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
[ https://issues.apache.org/jira/browse/HIVE-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8461: --- Status: Patch Available (was: In Progress) Try again. Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... . - Key: HIVE-8461 URL: https://issues.apache.org/jira/browse/HIVE-8461 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8461.01.patch, HIVE-8461.02.patch, HIVE-8461.03.patch, HIVE-8461.04.patch, HIVE-8461.05.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8594) Wrong condition in SettableConfigUpdater#setHiveConfWhiteList()
[ https://issues.apache.org/jira/browse/HIVE-8594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-8594: - Attachment: hive-8594.txt Wrong condition in SettableConfigUpdater#setHiveConfWhiteList() --- Key: HIVE-8594 URL: https://issues.apache.org/jira/browse/HIVE-8594 Project: Hive Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: hive-8594.txt {code} if(whiteListParamsStr == null && whiteListParamsStr.trim().isEmpty()) { {code} If whiteListParamsStr is null, the call to trim() would result in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
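For illustration, a standalone sketch of the corrected guard written with ||; the class name, method body, and exception are placeholders rather than the actual SettableConfigUpdater code:
{code}
public class WhiteListGuardDemo {
  // With ||, trim() is only reached once the string is known to be non-null,
  // so neither a null nor an empty/blank whitelist can slip through.
  static void setHiveConfWhiteList(String whiteListParamsStr) {
    if (whiteListParamsStr == null || whiteListParamsStr.trim().isEmpty()) {
      throw new IllegalStateException("whitelist parameter is not set");
    }
    // ... apply the whitelist ...
  }

  public static void main(String[] args) {
    setHiveConfWhiteList("hive.exec.compress.output|hive.exec.parallel");
  }
}
{code}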
[jira] [Assigned] (HIVE-8594) Wrong condition in SettableConfigUpdater#setHiveConfWhiteList()
[ https://issues.apache.org/jira/browse/HIVE-8594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HIVE-8594: Assignee: Ted Yu Wrong condition in SettableConfigUpdater#setHiveConfWhiteList() --- Key: HIVE-8594 URL: https://issues.apache.org/jira/browse/HIVE-8594 Project: Hive Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: hive-8594.txt {code} if(whiteListParamsStr == null && whiteListParamsStr.trim().isEmpty()) { {code} If whiteListParamsStr is null, the call to trim() would result in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8594) Wrong condition in SettableConfigUpdater#setHiveConfWhiteList()
[ https://issues.apache.org/jira/browse/HIVE-8594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-8594: - Status: Patch Available (was: Open) Wrong condition in SettableConfigUpdater#setHiveConfWhiteList() --- Key: HIVE-8594 URL: https://issues.apache.org/jira/browse/HIVE-8594 Project: Hive Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: hive-8594.txt {code} if(whiteListParamsStr == null && whiteListParamsStr.trim().isEmpty()) { {code} If whiteListParamsStr is null, the call to trim() would result in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193339#comment-14193339 ] Hive QA commented on HIVE-8687: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678690/HIVE-8687.4.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6637 tests executed *Failed tests:* {noformat} org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1596/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1596/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1596/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12678690 - PreCommit-HIVE-TRUNK-Build Support Avro through HCatalog - Key: HIVE-8687 URL: https://issues.apache.org/jira/browse/HIVE-8687 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.4.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? 
The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8671) Overflow in estimate row count and data size with fetch column stats
[ https://issues.apache.org/jira/browse/HIVE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193351#comment-14193351 ] Prasanth J commented on HIVE-8671: -- [~hagleitn] Can we have this is 0.14? Overflow in estimate row count and data size with fetch column stats Key: HIVE-8671 URL: https://issues.apache.org/jira/browse/HIVE-8671 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth J Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8671.1.patch, HIVE-8671.2.patch, HIVE-8671.3.patch, HIVE-8671.4.patch, HIVE-8671.5.patch Overflow in row counts and data size for several TPC-DS queries. Interestingly the operators which have overflow end up running with a small parallelism. For instance Reducer 2 has an overflow but it only runs with parallelism of 2. {code} Reducer 2 Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0) keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: float) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5 Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col3 (type: string), _col3 (type: string) sort order: ++ Map-reduce partition columns: _col3 (type: string) Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: float), _col5 (type: double) Execution mode: vectorized {code} {code} VERTEX TOTAL_TASKSDURATION_SECONDS CPU_TIME_MILLIS INPUT_RECORDS OUTPUT_RECORDS Map 1 62 26.41 1,779,510 211,978,502 60,628,390 Map 5 14.28 6,950 138,098 138,098 Map 6 12.44 3,910 31 31 Reducer 2 2 22.69 61,320 60,628,390 69,182 Reducer 3 12.63 3,910 69,182 100 Reducer 4 11.01 1,180 100 100 {code} Query {code} explain select i_item_desc ,i_category ,i_class ,i_current_price ,i_item_id ,sum(ws_ext_sales_price) as itemrevenue ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over (partition by i_class) as revenueratio from web_sales ,item ,date_dim where web_sales.ws_item_sk = item.i_item_sk and item.i_category in ('Jewelry', 'Sports', 'Books') and web_sales.ws_sold_date_sk = date_dim.d_date_sk and date_dim.d_date between '2001-01-12' and '2001-02-11' group by i_item_id ,i_item_desc ,i_category ,i_class ,i_current_price order by i_category ,i_class ,i_item_id ,i_item_desc ,revenueratio limit 100 {code} Explain {code} STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 - Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE) Reducer 2 - Map 1 (SIMPLE_EDGE) Reducer 3 - Reducer 2 (SIMPLE_EDGE) Reducer 4 - Reducer 3 (SIMPLE_EDGE) DagName: mmokhtar_20141019164343_854cb757-01bd-40cb-843e-9ada7c5e6f38:1 Vertices: Map 1 Map Operator Tree: TableScan alias: web_sales filterExpr: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 2850189889652 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 172746300152 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ws_item_sk (type: int), ws_ext_sales_price (type: float), ws_sold_date_sk (type: int) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 21594638446
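As background for the overflow handling discussed in this issue (and in HIVE-8689), here is a generic saturating-multiply sketch, not the actual patch: the idea is to clamp at Long.MAX_VALUE instead of silently wrapping when row counts and per-row sizes are multiplied.
{code}
public class SaturatingStatsDemo {
  // Saturating multiply for non-negative statistics values: clamp to
  // Long.MAX_VALUE instead of wrapping around on overflow.
  static long multiplyClamped(long a, long b) {
    if (a == 0 || b == 0) {
      return 0;
    }
    long result = a * b;
    if (result / b != a) {   // overflow happened
      return Long.MAX_VALUE;
    }
    return result;
  }

  public static void main(String[] args) {
    // 21,594,638,446 rows times a large per-row size overflows a signed long.
    System.out.println(multiplyClamped(21_594_638_446L, 1_000_000_000L)); // 9223372036854775807
  }
}
{code}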
[jira] [Commented] (HIVE-8671) Overflow in estimate row count and data size with fetch column stats
[ https://issues.apache.org/jira/browse/HIVE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193366#comment-14193366 ] Gunther Hagleitner commented on HIVE-8671: -- +1 for 0.14 Overflow in estimate row count and data size with fetch column stats Key: HIVE-8671 URL: https://issues.apache.org/jira/browse/HIVE-8671 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth J Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8671.1.patch, HIVE-8671.2.patch, HIVE-8671.3.patch, HIVE-8671.4.patch, HIVE-8671.5.patch Overflow in row counts and data size for several TPC-DS queries. Interestingly the operators which have overflow end up running with a small parallelism. For instance Reducer 2 has an overflow but it only runs with parallelism of 2. {code} Reducer 2 Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0) keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: float) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5 Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col3 (type: string), _col3 (type: string) sort order: ++ Map-reduce partition columns: _col3 (type: string) Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: float), _col5 (type: double) Execution mode: vectorized {code} {code} VERTEX TOTAL_TASKSDURATION_SECONDS CPU_TIME_MILLIS INPUT_RECORDS OUTPUT_RECORDS Map 1 62 26.41 1,779,510 211,978,502 60,628,390 Map 5 14.28 6,950 138,098 138,098 Map 6 12.44 3,910 31 31 Reducer 2 2 22.69 61,320 60,628,390 69,182 Reducer 3 12.63 3,910 69,182 100 Reducer 4 11.01 1,180 100 100 {code} Query {code} explain select i_item_desc ,i_category ,i_class ,i_current_price ,i_item_id ,sum(ws_ext_sales_price) as itemrevenue ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over (partition by i_class) as revenueratio from web_sales ,item ,date_dim where web_sales.ws_item_sk = item.i_item_sk and item.i_category in ('Jewelry', 'Sports', 'Books') and web_sales.ws_sold_date_sk = date_dim.d_date_sk and date_dim.d_date between '2001-01-12' and '2001-02-11' group by i_item_id ,i_item_desc ,i_category ,i_class ,i_current_price order by i_category ,i_class ,i_item_id ,i_item_desc ,revenueratio limit 100 {code} Explain {code} STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 - Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE) Reducer 2 - Map 1 (SIMPLE_EDGE) Reducer 3 - Reducer 2 (SIMPLE_EDGE) Reducer 4 - Reducer 3 (SIMPLE_EDGE) DagName: mmokhtar_20141019164343_854cb757-01bd-40cb-843e-9ada7c5e6f38:1 Vertices: Map 1 Map Operator Tree: TableScan alias: web_sales filterExpr: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 2850189889652 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 172746300152 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ws_item_sk (type: int), ws_ext_sales_price (type: float), ws_sold_date_sk (type: int) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 21594638446 Data size:
[jira] [Resolved] (HIVE-8424) Support fair scheduler user queue mapping in non-impersonation mode
[ https://issues.apache.org/jira/browse/HIVE-8424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V resolved HIVE-8424. --- Resolution: Fixed Sure, [~brocknoland] - I see that HIVE-8693 has been opened for this. Support fair scheduler user queue mapping in non-impersonation mode --- Key: HIVE-8424 URL: https://issues.apache.org/jira/browse/HIVE-8424 Project: Hive Issue Type: Improvement Components: Shims Reporter: Mohit Sabharwal Assignee: Mohit Sabharwal Labels: TODOC15 Fix For: 0.15.0 Attachments: HIVE-8424.1.patch, HIVE-8424.2.patch, HIVE-8424.3.patch, HIVE-8424.patch Under non-impersonation mode, all MR jobs run as the hive system user. The default scheduler queue mapping is one queue per user. This is problematic for users who use the queues to regulate and track their MR resource usage. Yarn exposes an API to retrieve the fair scheduler queue mapping, which we can use to set the appropriate MR queue for the current user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
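A rough sketch of the non-impersonation idea described above, with the queue-resolution step stubbed out; the exact YARN fair-scheduler call is not shown in this thread, so the helper below is an assumption made purely for illustration:
{code}
import org.apache.hadoop.conf.Configuration;

public class UserQueueMappingDemo {
  // Placeholder: the real feature resolves this from the fair scheduler's
  // queue placement rules; here we just fake a per-user queue name.
  static String resolveQueueForUser(String endUser) {
    return "root." + endUser;
  }

  public static void main(String[] args) {
    Configuration jobConf = new Configuration(false);
    String endUser = "alice"; // the connected end user, not the shared hive system user
    // Standard MapReduce property the scheduler reads to pick the queue.
    jobConf.set("mapreduce.job.queuename", resolveQueueForUser(endUser));
    System.out.println(jobConf.get("mapreduce.job.queuename")); // root.alice
  }
}
{code}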
[jira] [Commented] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193383#comment-14193383 ] Gunther Hagleitner commented on HIVE-8687: -- +1 for hive .14 Support Avro through HCatalog - Key: HIVE-8687 URL: https://issues.apache.org/jira/browse/HIVE-8687 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.4.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8671) Overflow in estimate row count and data size with fetch column stats
[ https://issues.apache.org/jira/browse/HIVE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-8671: - Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) Patch committed to trunk and branch-0.14. Overflow in estimate row count and data size with fetch column stats Key: HIVE-8671 URL: https://issues.apache.org/jira/browse/HIVE-8671 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth J Priority: Critical Fix For: 0.14.0, 0.15.0 Attachments: HIVE-8671.1.patch, HIVE-8671.2.patch, HIVE-8671.3.patch, HIVE-8671.4.patch, HIVE-8671.5.patch Overflow in row counts and data size for several TPC-DS queries. Interestingly the operators which have overflow end up running with a small parallelism. For instance Reducer 2 has an overflow but it only runs with parallelism of 2. {code} Reducer 2 Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0) keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: float) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5 Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col3 (type: string), _col3 (type: string) sort order: ++ Map-reduce partition columns: _col3 (type: string) Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: float), _col5 (type: double) Execution mode: vectorized {code} {code} VERTEX TOTAL_TASKSDURATION_SECONDS CPU_TIME_MILLIS INPUT_RECORDS OUTPUT_RECORDS Map 1 62 26.41 1,779,510 211,978,502 60,628,390 Map 5 14.28 6,950 138,098 138,098 Map 6 12.44 3,910 31 31 Reducer 2 2 22.69 61,320 60,628,390 69,182 Reducer 3 12.63 3,910 69,182 100 Reducer 4 11.01 1,180 100 100 {code} Query {code} explain select i_item_desc ,i_category ,i_class ,i_current_price ,i_item_id ,sum(ws_ext_sales_price) as itemrevenue ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over (partition by i_class) as revenueratio from web_sales ,item ,date_dim where web_sales.ws_item_sk = item.i_item_sk and item.i_category in ('Jewelry', 'Sports', 'Books') and web_sales.ws_sold_date_sk = date_dim.d_date_sk and date_dim.d_date between '2001-01-12' and '2001-02-11' group by i_item_id ,i_item_desc ,i_category ,i_class ,i_current_price order by i_category ,i_class ,i_item_id ,i_item_desc ,revenueratio limit 100 {code} Explain {code} STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 - Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE) Reducer 2 - Map 1 (SIMPLE_EDGE) Reducer 3 - Reducer 2 (SIMPLE_EDGE) Reducer 4 - Reducer 3 (SIMPLE_EDGE) DagName: mmokhtar_20141019164343_854cb757-01bd-40cb-843e-9ada7c5e6f38:1 Vertices: Map 1 Map Operator Tree: TableScan alias: web_sales filterExpr: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 2850189889652 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 172746300152 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ws_item_sk (type: int), ws_ext_sales_price (type: float), ws_sold_date_sk (type: int) outputColumnNames: _col0, _col1, _col2
[jira] [Commented] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193394#comment-14193394 ] Prasanth J commented on HIVE-8689: -- [~sershe] HIVE-8671 committed now. Can you rebase this patch now? Also can you fix Mostafa's change to reducer estimation. It will estimate one reducer less than the previous code. For example: if totalInputFileSize is 140 and bytesPerReducer is 100 then current change will just say 1 reducer. We should either have Math.ceil or Math.max(totalInputFileSize, totalInputFileSize + bytesPerReducer - 1)/bytesPerReducer.. handle overflows in statistics better - Key: HIVE-8689 URL: https://issues.apache.org/jira/browse/HIVE-8689 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-8689.01.patch, HIVE-8689.patch Improve overflow checks in StatsAnnotation optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
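A small worked example of the ceiling-style estimate the comment above asks for; the method and variable names are illustrative, not the actual Hive code:
{code}
public class ReducerEstimateDemo {
  // Truncating division (140 / 100) yields 1; rounding up yields 2, which is
  // the behavior the comment is asking for.
  static long estimateReducers(long totalInputFileSize, long bytesPerReducer) {
    return Math.max(1, (totalInputFileSize + bytesPerReducer - 1) / bytesPerReducer);
  }

  public static void main(String[] args) {
    System.out.println(estimateReducers(140, 100)); // 2
    System.out.println(estimateReducers(100, 100)); // 1
  }
}
{code}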
[jira] [Commented] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193395#comment-14193395 ] Hive QA commented on HIVE-8685: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678688/HIVE-8685.3.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6609 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1597/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1597/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1597/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12678688 - PreCommit-HIVE-TRUNK-Build DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8694) every WebHCat e2e test should specify statusdir parameter
Eugene Koifman created HIVE-8694: Summary: every WebHCat e2e test should specify statusdir parameter Key: HIVE-8694 URL: https://issues.apache.org/jira/browse/HIVE-8694 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman e.g. 'statusdir=TestSqoop_:TNUM:' This captures stdout/stderr for job submission and helps diagnosing failures. See if it's easy to add something to the test harness to collect all the info in these dirs to make it available after cluster shutdown. NO _PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
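For illustration only, a hedged example of passing statusdir when submitting a job through WebHCat; the host, port, user, and query are placeholders, and the e2e harness does this through its own drivers rather than raw HTTP:
{code}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class StatusDirDemo {
  public static void main(String[] args) throws Exception {
    // statusdir tells WebHCat where on HDFS to write the submitted job's
    // stdout/stderr/exit value, which is what makes failures diagnosable later.
    URL url = new URL("http://webhcat-host:50111/templeton/v1/hive");
    String body = "user.name=ekoifman&execute=show%20tables%3B&statusdir=TestSqoop_1";
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST");
    conn.setDoOutput(true);
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body.getBytes(StandardCharsets.UTF_8));
    }
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}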
[jira] [Updated] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter
[ https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4329: - Priority: Major (was: Critical) HCatalog should use getHiveRecordWriter rather than getRecordWriter --- Key: HIVE-4329 URL: https://issues.apache.org/jira/browse/HIVE-4329 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sean Busbey Assignee: David Chen Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch, HIVE-4329.3.patch, HIVE-4329.4.patch, HIVE-4329.5.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter
[ https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193398#comment-14193398 ] Gunther Hagleitner commented on HIVE-4329: -- Setting Major because with HIVE-8687 it's not critical for hive .14 anymore. HCatalog should use getHiveRecordWriter rather than getRecordWriter --- Key: HIVE-4329 URL: https://issues.apache.org/jira/browse/HIVE-4329 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sean Busbey Assignee: David Chen Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch, HIVE-4329.3.patch, HIVE-4329.4.patch, HIVE-4329.5.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193402#comment-14193402 ] Ashutosh Chauhan commented on HIVE-8656: +1 [~julianhyde] Please update CALCITE-448 with correct description. CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Ashutosh Chauhan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193409#comment-14193409 ] Eugene Koifman commented on HIVE-8685: -- the 2 test failures are not related testNegativeTokenAuth has been failing for many builds now org.apache.hive.hcatalog.streaming.TestStreaming is failing intermittently, for example, http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1594/testReport/junit/org.apache.hive.hcatalog.streaming/TestStreaming/testRemainingTransactions/ has exactly the same stack trace DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8693) Separate out fair scheduler dependency from hadoop 0.23 shim
[ https://issues.apache.org/jira/browse/HIVE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8693: --- Attachment: HIVE-8693.1.patch Separate out fair scheduler dependency from hadoop 0.23 shim Key: HIVE-8693 URL: https://issues.apache.org/jira/browse/HIVE-8693 Project: Hive Issue Type: Bug Components: HiveServer2, Shims Affects Versions: 0.14.0, 0.15.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-8693.1.patch, HIVE-8693.1.patch As part of HIVE-8424 HiveServer2 uses Fair scheduler APIs to determine resource queue allocation for the non-impersonation case. This adds a hard dependency on Yarn server jars for Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193423#comment-14193423 ] Ashutosh Chauhan commented on HIVE-8656: [~hagleitn] ok for 0.14 ? CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Ashutosh Chauhan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8675) Increase thrift server protocol test coverage
[ https://issues.apache.org/jira/browse/HIVE-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8675: --- Attachment: HIVE-8675.patch Increase thrift server protocol test coverage - Key: HIVE-8675 URL: https://issues.apache.org/jira/browse/HIVE-8675 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.14.0 Attachments: HIVE-8675.patch, HIVE-8675.patch, HIVE-8675.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8688) serialized plan OutputStream is not being closed
[ https://issues.apache.org/jira/browse/HIVE-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193427#comment-14193427 ] Gunther Hagleitner commented on HIVE-8688: -- +1 for hive.14 serialized plan OutputStream is not being closed Key: HIVE-8688 URL: https://issues.apache.org/jira/browse/HIVE-8688 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8688.1.patch The OutputStream to which the serialized plan is written is not being closed in several places. This can result in the plan not getting written correctly. I have seen intermittent issues in deserializing the plan, and I think this could be the/a cause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8688) serialized plan OutputStream is not being closed
[ https://issues.apache.org/jira/browse/HIVE-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-8688: Resolution: Fixed Status: Resolved (was: Patch Available) Patch committed to trunk and 0.14 branch. Thanks for the review Jason and Gunther! serialized plan OutputStream is not being closed Key: HIVE-8688 URL: https://issues.apache.org/jira/browse/HIVE-8688 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8688.1.patch The OutputStream to which the serialized plan is written is not being closed in several places. This can result in the plan not getting written correctly. I have seen intermittent issues in deserializing the plan, and I think this could be the/a cause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
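As a generic illustration of the fix pattern (not the actual Hive/Kryo code), the serialization output stream should be closed, and therefore flushed, even when serialization throws:
{code}
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ClosePlanStreamDemo {
  // Placeholder for Hive's plan serialization; the detail that matters is the
  // try-with-resources block around it.
  static void serializePlan(Object plan, OutputStream out) throws IOException {
    out.write(plan.toString().getBytes(StandardCharsets.UTF_8));
  }

  public static void main(String[] args) throws IOException {
    try (OutputStream out = Files.newOutputStream(Paths.get("plan.bin"))) { // placeholder path
      serializePlan("the-plan", out);
    } // the stream is always closed here, so a partially written plan is not left behind
  }
}
{code}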
[jira] [Updated] (HIVE-7576) Add PartitionSpec support in HCatClient API
[ https://issues.apache.org/jira/browse/HIVE-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7576: --- Attachment: HIVE-7576.2.patch After verbal confirmation from Mithun that he's okay with me adding InterfaceAudience.LimitedPrivate(Hive) and InterfaceStability.Evolving on all the new methods using PartitionSpec, I updated his patch with them. Add PartitionSpec support in HCatClient API --- Key: HIVE-7576 URL: https://issues.apache.org/jira/browse/HIVE-7576 Project: Hive Issue Type: Bug Components: HCatalog, Metastore Affects Versions: 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-7576.1.patch, HIVE-7576.2.patch HIVE-7223 adds support for PartitionSpecs in Hive Metastore. The HCatClient API must add support to fetch partitions, add partitions, etc. using PartitionSpec semantics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
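For reference, a hedged illustration of the annotation usage described above; the method shown is a made-up stand-in, not an actual HCatClient signature, and it assumes Hive's own copy of the audience/stability annotations:
{code}
import org.apache.hadoop.hive.common.classification.InterfaceAudience;
import org.apache.hadoop.hive.common.classification.InterfaceStability;

public abstract class PartitionSpecApiSketch {
  // Limited-private + evolving: usable from Hive's own code, but the
  // PartitionSpec-based signature may still change between releases.
  @InterfaceAudience.LimitedPrivate({"Hive"})
  @InterfaceStability.Evolving
  public abstract int addPartitionSpec(Object partitionSpec) throws Exception;
}
{code}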
[jira] [Updated] (HIVE-8680) Set Max Message for Binary Thrift endpoints
[ https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8680: --- Attachment: HIVE-8680.patch Set Max Message for Binary Thrift endpoints --- Key: HIVE-8680 URL: https://issues.apache.org/jira/browse/HIVE-8680 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-8680.patch, HIVE-8680.patch Thrift has a configuration option to restrict incoming message size. If we configure this we'll stop OOM'ing when someone sends us an HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
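A sketch of the kind of knob being referred to, assuming a libthrift version whose TBinaryProtocol.Factory accepts string/container length limits (present in the 0.9.x line, but check the exact version in use); this is not necessarily what the attached patch does:
{code}
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocolFactory;

public class MaxMessageSizeDemo {
  public static void main(String[] args) {
    long maxMessageSize = 100L * 1024 * 1024; // placeholder 100MB cap
    // Bounding string/container lengths makes the server reject garbage or
    // oversized frames (e.g. a stray HTTP request hitting the binary port)
    // instead of trying to allocate huge buffers and OOM'ing.
    TProtocolFactory factory = new TBinaryProtocol.Factory(maxMessageSize, maxMessageSize);
    System.out.println("configured protocol factory: " + factory);
  }
}
{code}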
[jira] [Commented] (HIVE-7576) Add PartitionSpec support in HCatClient API
[ https://issues.apache.org/jira/browse/HIVE-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193433#comment-14193433 ] Sushanth Sowmyan commented on HIVE-7576: (Submitted to pre-commit queue manually - http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1601 will test it.) Add PartitionSpec support in HCatClient API --- Key: HIVE-7576 URL: https://issues.apache.org/jira/browse/HIVE-7576 Project: Hive Issue Type: Bug Components: HCatalog, Metastore Affects Versions: 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-7576.1.patch, HIVE-7576.2.patch HIVE-7223 adds support for PartitionSpecs in Hive Metastore. The HCatClient API must add support to fetch partitions, add partitions, etc. using PartitionSpec semantics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8666) hive.metastore.server.max.threads default is too high
[ https://issues.apache.org/jira/browse/HIVE-8666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8666: --- Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) hive.metastore.server.max.threads default is too high - Key: HIVE-8666 URL: https://issues.apache.org/jira/browse/HIVE-8666 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.15.0 Attachments: HIVE-8666.patch {{hive.metastore.server.max.threads}} defaults to 100K. Each thread requires a 1024KB stack which is 100GB. We should move the default to something more sensible like 1000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
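The arithmetic behind that estimate, as a tiny worked example (thread stack size is taken as roughly 1MB, per the description):
{code}
public class ThreadStackBudgetDemo {
  public static void main(String[] args) {
    long maxThreads = 100_000;               // old default of hive.metastore.server.max.threads
    long stackBytesPerThread = 1024L * 1024; // ~1MB of stack per thread
    long totalGiB = (maxThreads * stackBytesPerThread) / (1024L * 1024 * 1024);
    System.out.println(totalGiB + " GiB of stack if all threads are created"); // ~97 GiB, i.e. ~100GB
  }
}
{code}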
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Resolution: Fixed Fix Version/s: 0.15.0 0.14.0 Status: Resolved (was: Patch Available) Committed to 0.14 and 0.15. Thanks [~thejas] for review DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Fix For: 0.14.0, 0.15.0 Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8687: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to branch and trunk. Thanks [~sushanth] Support Avro through HCatalog - Key: HIVE-8687 URL: https://issues.apache.org/jira/browse/HIVE-8687 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.4.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26968/#review59500 --- Hi, This approach looks great! I think we should try to avoid creating FilterPredicateType, which duplicates Type. We can update Type and make the associated changes in ORC as needed. Additionally, the latest Parquet supports Timestamp and Decimal. Thanks!! serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java https://reviews.apache.org/r/26968/#comment100778 Perhaps we can change the Type enum to separate out the types we need and then alter ORC to perform a check like type == STRING || type == CHAR || type == VARCHAR? - Brock Noland On Oct. 21, 2014, 8:13 a.m., cheng xu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26968/ --- (Updated Oct. 21, 2014, 8:13 a.m.) Review request for hive. Repository: hive-git Description --- HIVE-8122: convert ExprNode to Parquet supported FilterPredict Diffs - pom.xml c69498004cdf93d3955c863031858a2dde2d8ccc ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d392d8ac5f5589f66c37d567b1d3bd8843 ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641545ed0ad69f3bbc9a8383697fc7efe37d ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java 831ef8c8ec64c270ef62d5336b4cc78d9e34b398 serde/pom.xml 98e55061b6b3abe18030b0b8d3f511bd98bee5f7 serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6dbd1ec71ad178f41e8666bad2500e68e151 serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f0148e2a995534a4c1369fc4c542cd0b4e6ab Diff: https://reviews.apache.org/r/26968/diff/ Testing --- local UT passed Thanks, cheng xu
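For readers following the review, the end goal is to turn SearchArgument leaves into Parquet filter2 predicates. Below is only a rough sketch of that translation: the helper class is hypothetical, the package name depends on the Parquet version in use (pre-Apache releases used the parquet.* namespace), and this is not the API introduced by the patch.
{code:java}
import org.apache.parquet.filter2.predicate.FilterApi;
import org.apache.parquet.filter2.predicate.FilterPredicate;

/** Hypothetical helper: maps simple SARG-style leaves onto Parquet predicates. */
public final class LeafToParquetFilterSketch {

  private LeafToParquetFilterSketch() {}

  /** An "equals" leaf on a long column, e.g. id = 42. */
  public static FilterPredicate longEquals(String columnName, long literal) {
    return FilterApi.eq(FilterApi.longColumn(columnName), literal);
  }

  /** Two leaves joined by AND, mirroring SearchArgument's AND node. */
  public static FilterPredicate and(FilterPredicate left, FilterPredicate right) {
    return FilterApi.and(left, right);
  }
}
{code}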
[jira] [Commented] (HIVE-8461) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
[ https://issues.apache.org/jira/browse/HIVE-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193452#comment-14193452 ] Hive QA commented on HIVE-8461: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678693/HIVE-8461.05.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6640 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1598/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1598/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1598/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12678693 - PreCommit-HIVE-TRUNK-Build Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... . - Key: HIVE-8461 URL: https://issues.apache.org/jira/browse/HIVE-8461 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8461.01.patch, HIVE-8461.02.patch, HIVE-8461.03.patch, HIVE-8461.04.patch, HIVE-8461.05.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193474#comment-14193474 ] Gunther Hagleitner commented on HIVE-8656: -- Yes, please. +1 for 0.14. CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Ashutosh Chauhan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8461) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
[ https://issues.apache.org/jira/browse/HIVE-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8461: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to .14 and trunk. (Test failures are unrelated). Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... . - Key: HIVE-8461 URL: https://issues.apache.org/jira/browse/HIVE-8461 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8461.01.patch, HIVE-8461.02.patch, HIVE-8461.03.patch, HIVE-8461.04.patch, HIVE-8461.05.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7111) Extend join transitivity PPD to non-column expressions
[ https://issues.apache.org/jira/browse/HIVE-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193481#comment-14193481 ] Ashutosh Chauhan commented on HIVE-7111: +1 The new logic is much cleaner than before, as it doesn't refer to parse info and ASTs. Good work, Navis! Extend join transitivity PPD to non-column expressions -- Key: HIVE-7111 URL: https://issues.apache.org/jira/browse/HIVE-7111 Project: Hive Issue Type: Task Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-7111.1.patch.txt, HIVE-7111.2.patch.txt, HIVE-7111.2.patch.txt, HIVE-7111.3.patch.txt, HIVE-7111.4.patch.txt Join transitivity in PPD only supports column expressions, but it's possible to extend this to generic expressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7576) Add PartitionSpec support in HCatClient API
[ https://issues.apache.org/jira/browse/HIVE-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193484#comment-14193484 ] Mithun Radhakrishnan commented on HIVE-7576: Thanks for adding the Interface annotations. That's a good idea. Add PartitionSpec support in HCatClient API --- Key: HIVE-7576 URL: https://issues.apache.org/jira/browse/HIVE-7576 Project: Hive Issue Type: Bug Components: HCatalog, Metastore Affects Versions: 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-7576.1.patch, HIVE-7576.2.patch HIVE-7223 adds support for PartitionSpecs in Hive Metastore. The HCatClient API must add support to fetch partitions, add partitions, etc. using PartitionSpec semantics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8656: --- Resolution: Fixed Status: Resolved (was: Patch Available) CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Julian Hyde Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8656: --- Assignee: Julian Hyde (was: Ashutosh Chauhan) CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Julian Hyde Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193490#comment-14193490 ] Ashutosh Chauhan commented on HIVE-8656: Committed to trunk and 0.14. Thanks, Julian! CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Julian Hyde Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7803) Enable Hadoop speculative execution may cause corrupt output directory (dynamic partition)
[ https://issues.apache.org/jira/browse/HIVE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7803: --- Resolution: Fixed Status: Resolved (was: Patch Available) (Closing as duplicate without committing, since this functionality is subsumed and improved by HIVE-8394) Enable Hadoop speculative execution may cause corrupt output directory (dynamic partition) -- Key: HIVE-7803 URL: https://issues.apache.org/jira/browse/HIVE-7803 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.1 Environment: Reporter: Selina Zhang Assignee: Selina Zhang Priority: Critical Attachments: HIVE-7803.1.patch, HIVE-7803.2.patch One of our users reports they see intermittent failures due to attempt directories in the input paths. We found with speculative execution turned on, two mappers tried to commit task at the same time using the same committed task path, which cause the corrupt output directory. The original Pig script: {code} STORE AdvertiserDataParsedClean INTO '$DB_NAME.$ADVERTISER_META_TABLE_NAME' USING org.apache.hcatalog.pig.HCatStorer(); {code} Two mappers attempt_1405021984947_5394024_m_000523_0: KILLED attempt_1405021984947_5394024_m_000523_1: SUCCEEDED attempt_1405021984947_5394024_m_000523_0 was killed right after the commit. As a result, it created corrupt directory as /projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523/ containing part-m-00523 (from attempt_1405021984947_5394024_m_000523_0) and attempt_1405021984947_5394024_m_000523_1/part-m-00523 Namenode Audit log == 1. 2014-08-05 05:04:36,811 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 cmd=create src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0/part-m-00523 dst=null perm=user:group:rw-r- 2. 2014-08-05 05:04:53,112 INFO FSNamesystem.audit: ugi=* ip=ipaddress2 cmd=create src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1/part-m-00523 dst=null perm=user:group:rw-r- 3. 2014-08-05 05:05:13,001 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 cmd=rename src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0 dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523 perm=user:group:rwxr-x--- 4. 
2014-08-05 05:05:13,004 INFO FSNamesystem.audit: ugi=* ip=ipaddress2 cmd=rename src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1 dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523 perm=user:group:rwxr-x--- After consulting our Hadoop core team, we were told that some HCat code does not participate in the two-phase commit protocol, for example in FileRecordWriterContainer.close():
{code}
for (Map.Entry<String, org.apache.hadoop.mapred.OutputCommitter> entry : baseDynamicCommitters.entrySet()) {
  org.apache.hadoop.mapred.TaskAttemptContext currContext = dynamicContexts.get(entry.getKey());
  OutputCommitter baseOutputCommitter = entry.getValue();
  if (baseOutputCommitter.needsTaskCommit(currContext)) {
    baseOutputCommitter.commitTask(currContext);
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
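To make the race above concrete: the speculative-execution-safe pattern is to leave all promotion of attempt output to the OutputCommitter that the MapReduce framework drives, because the framework asks the application master for permission before invoking commitTask, so at most one attempt ever publishes into the task path. The class below is only an illustrative, hypothetical sketch of that shape, not the HCatalog fix.
{code:java}
import java.io.IOException;
import org.apache.hadoop.mapred.JobContext;
import org.apache.hadoop.mapred.OutputCommitter;
import org.apache.hadoop.mapred.TaskAttemptContext;

/** Hypothetical committer showing where task output should be promoted. */
public class SpeculationSafeCommitterSketch extends OutputCommitter {

  @Override
  public void setupJob(JobContext jobContext) throws IOException {
    // create the job-level temporary directory
  }

  @Override
  public void setupTask(TaskAttemptContext taskContext) throws IOException {
    // nothing to do: the attempt directory is created lazily on first write
  }

  @Override
  public boolean needsTaskCommit(TaskAttemptContext taskContext) throws IOException {
    return true; // this attempt produced output that still needs promotion
  }

  @Override
  public void commitTask(TaskAttemptContext taskContext) throws IOException {
    // Rename the attempt directory into the task directory here, and only
    // here. Because the framework asks the AM for permission before calling
    // this, two speculative attempts can never both publish to the task path.
  }

  @Override
  public void abortTask(TaskAttemptContext taskContext) throws IOException {
    // delete the attempt directory
  }
}
{code}
A RecordWriter that renames its own output in close() bypasses this handshake, which is exactly how the two attempts above ended up sharing one committed task path.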
[jira] [Commented] (HIVE-8693) Separate out fair scheduler dependency from hadoop 0.23 shim
[ https://issues.apache.org/jira/browse/HIVE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193500#comment-14193500 ] Hive QA commented on HIVE-8693: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678709/HIVE-8693.1.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6637 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1599/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1599/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1599/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12678709 - PreCommit-HIVE-TRUNK-Build Separate out fair scheduler dependency from hadoop 0.23 shim Key: HIVE-8693 URL: https://issues.apache.org/jira/browse/HIVE-8693 Project: Hive Issue Type: Bug Components: HiveServer2, Shims Affects Versions: 0.14.0, 0.15.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-8693.1.patch, HIVE-8693.1.patch As part of HIVE-8424 HiveServer2 uses Fair scheduler APIs to determine resource queue allocation for non-impersonation case. This adds a hard dependency of Yarn server jars for Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193503#comment-14193503 ] Sergey Shelukhin commented on HIVE-8689: [~hagleitn] 14? I will address the latest comment handle overflows in statistics better - Key: HIVE-8689 URL: https://issues.apache.org/jira/browse/HIVE-8689 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-8689.01.patch, HIVE-8689.patch Improve overflow checks in StatsAnnotation optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8313) Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator
[ https://issues.apache.org/jira/browse/HIVE-8313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193513#comment-14193513 ] Mithun Radhakrishnan commented on HIVE-8313: FWIW, the test failure doesn't look related to this change. Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator --- Key: HIVE-8313 URL: https://issues.apache.org/jira/browse/HIVE-8313 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-8313.1.patch, HIVE-8313.2.patch Consider the following query: {code:sql} SELECT foo, bar, goo, id FROM myTable WHERE id IN ( 'A', 'B', 'C', 'D', ... , 'ZZ' ); {code} One finds that when the IN clause has several thousand elements (and the table has several million rows), the query above takes orders-of-magnitude longer to run on Hive 0.12 than say Hive 0.10. I have a possibly incomplete fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
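The usual remedy for this kind of slowdown is to stop re-evaluating constant expressions once per row and instead materialize the constant's writable form once. The class below is only a minimal, hypothetical sketch of that caching idea, not the attached patch (which works through Hive's ObjectInspector machinery).
{code:java}
/** Hypothetical evaluator: converts a constant once, returns the cached object per row. */
public class CachedConstantEvaluatorSketch {

  private final Object constantValue;  // e.g. the literal from the IN clause
  private Object cachedWritable;

  public CachedConstantEvaluatorSketch(Object constantValue) {
    this.constantValue = constantValue;
  }

  /** Called once per query, not once per row. */
  public void initialize() {
    cachedWritable = toWritable(constantValue);
  }

  /** Called once per row: no per-row conversion work remains. */
  public Object evaluate(Object row) {
    return cachedWritable;
  }

  private Object toWritable(Object value) {
    // Placeholder for the ObjectInspector-based conversion Hive performs.
    return value;
  }
}
{code}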
[jira] [Updated] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-8689: --- Attachment: HIVE-8689.02.patch rebased, added ceil handle overflows in statistics better - Key: HIVE-8689 URL: https://issues.apache.org/jira/browse/HIVE-8689 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-8689.01.patch, HIVE-8689.02.patch, HIVE-8689.patch Improve overflow checks in StatsAnnotation optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193521#comment-14193521 ] Prasanth J commented on HIVE-8689: -- +1 handle overflows in statistics better - Key: HIVE-8689 URL: https://issues.apache.org/jira/browse/HIVE-8689 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-8689.01.patch, HIVE-8689.02.patch, HIVE-8689.patch Improve overflow checks in StatsAnnotation optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193522#comment-14193522 ] Prasanth J commented on HIVE-8689: -- [~sershe] minor nit: Can you remove the getMaxIfOverflow() method? Since we are using safeAdd, safeMultiply methods we don't need that anymore. handle overflows in statistics better - Key: HIVE-8689 URL: https://issues.apache.org/jira/browse/HIVE-8689 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-8689.01.patch, HIVE-8689.02.patch, HIVE-8689.patch Improve overflow checks in StatsAnnotation optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
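For context, safeAdd/safeMultiply-style helpers are just saturating long arithmetic: instead of wrapping around, the result is clamped to Long.MAX_VALUE, which is why overflowed operators show row counts of 9223372036854775807 in explain plans. The class and method names below are assumptions for illustration, not Hive's actual stats utilities.
{code:java}
/** Sketch of saturating arithmetic for statistics (assumed helper shapes). */
public final class SaturatingMath {

  private SaturatingMath() {}

  /** Adds two non-negative counts, clamping to Long.MAX_VALUE on overflow. */
  public static long safeAdd(long a, long b) {
    long sum = a + b;
    // Overflow of non-negative operands shows up as a negative wrap-around.
    return (sum < 0) ? Long.MAX_VALUE : sum;
  }

  /** Multiplies two non-negative counts, clamping to Long.MAX_VALUE on overflow. */
  public static long safeMultiply(long a, long b) {
    if (a == 0 || b == 0) {
      return 0;
    }
    long product = a * b;
    // Division round-trip check catches wrap-around without relying on sign.
    return (product / a != b) ? Long.MAX_VALUE : product;
  }
}
{code}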
[jira] [Commented] (HIVE-8675) Increase thrift server protocol test coverage
[ https://issues.apache.org/jira/browse/HIVE-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193531#comment-14193531 ] Hive QA commented on HIVE-8675: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678711/HIVE-8675.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6669 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1600/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1600/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1600/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12678711 - PreCommit-HIVE-TRUNK-Build Increase thrift server protocol test coverage - Key: HIVE-8675 URL: https://issues.apache.org/jira/browse/HIVE-8675 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.14.0 Attachments: HIVE-8675.patch, HIVE-8675.patch, HIVE-8675.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8671) Overflow in estimate row count and data size with fetch column stats
[ https://issues.apache.org/jira/browse/HIVE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8671: - Fix Version/s: (was: 0.15.0) Overflow in estimate row count and data size with fetch column stats Key: HIVE-8671 URL: https://issues.apache.org/jira/browse/HIVE-8671 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth J Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8671.1.patch, HIVE-8671.2.patch, HIVE-8671.3.patch, HIVE-8671.4.patch, HIVE-8671.5.patch Overflow in row counts and data size for several TPC-DS queries. Interestingly the operators which have overflow end up running with a small parallelism. For instance Reducer 2 has an overflow but it only runs with parallelism of 2. {code} Reducer 2 Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0) keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: float) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5 Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col3 (type: string), _col3 (type: string) sort order: ++ Map-reduce partition columns: _col3 (type: string) Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: float), _col5 (type: double) Execution mode: vectorized {code} {code} VERTEX TOTAL_TASKSDURATION_SECONDS CPU_TIME_MILLIS INPUT_RECORDS OUTPUT_RECORDS Map 1 62 26.41 1,779,510 211,978,502 60,628,390 Map 5 14.28 6,950 138,098 138,098 Map 6 12.44 3,910 31 31 Reducer 2 2 22.69 61,320 60,628,390 69,182 Reducer 3 12.63 3,910 69,182 100 Reducer 4 11.01 1,180 100 100 {code} Query {code} explain select i_item_desc ,i_category ,i_class ,i_current_price ,i_item_id ,sum(ws_ext_sales_price) as itemrevenue ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over (partition by i_class) as revenueratio from web_sales ,item ,date_dim where web_sales.ws_item_sk = item.i_item_sk and item.i_category in ('Jewelry', 'Sports', 'Books') and web_sales.ws_sold_date_sk = date_dim.d_date_sk and date_dim.d_date between '2001-01-12' and '2001-02-11' group by i_item_id ,i_item_desc ,i_category ,i_class ,i_current_price order by i_category ,i_class ,i_item_id ,i_item_desc ,revenueratio limit 100 {code} Explain {code} STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 - Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE) Reducer 2 - Map 1 (SIMPLE_EDGE) Reducer 3 - Reducer 2 (SIMPLE_EDGE) Reducer 4 - Reducer 3 (SIMPLE_EDGE) DagName: mmokhtar_20141019164343_854cb757-01bd-40cb-843e-9ada7c5e6f38:1 Vertices: Map 1 Map Operator Tree: TableScan alias: web_sales filterExpr: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 2850189889652 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 172746300152 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ws_item_sk (type: int), ws_ext_sales_price (type: float), ws_sold_date_sk (type: int) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 21594638446 Data size: 172746300152 Basic stats:
[jira] [Commented] (HIVE-7576) Add PartitionSpec support in HCatClient API
[ https://issues.apache.org/jira/browse/HIVE-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193534#comment-14193534 ] Hive QA commented on HIVE-7576: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678712/HIVE-7576.2.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1601/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1601/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1601/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-1601/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . 
Reverted 'itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestSSL.java' Reverted 'itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java' Reverted 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java' Reverted 'service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java' Reverted 'service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java' Reverted 'service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java' ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target accumulo-handler/target hwi/target common/target common/src/gen service/target contrib/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1636066. At revision 1636066. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12678712 - PreCommit-HIVE-TRUNK-Build Add PartitionSpec support in HCatClient API --- Key: HIVE-7576 URL: https://issues.apache.org/jira/browse/HIVE-7576 Project: Hive Issue Type: Bug Components: HCatalog, Metastore Affects Versions: 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-7576.1.patch, HIVE-7576.2.patch HIVE-7223 adds support for PartitionSpecs in Hive Metastore. The HCatClient API must add support to fetch partitions, add partitions, etc.
[jira] [Commented] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193537#comment-14193537 ] Gunther Hagleitner commented on HIVE-8689: -- +1 for .14 handle overflows in statistics better - Key: HIVE-8689 URL: https://issues.apache.org/jira/browse/HIVE-8689 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-8689.01.patch, HIVE-8689.02.patch, HIVE-8689.patch Improve overflow checks in StatsAnnotation optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8313) Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator
[ https://issues.apache.org/jira/browse/HIVE-8313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193540#comment-14193540 ] Gopal V commented on HIVE-8313: --- [~mithun]: are you planning to include this for 0.14? This would be a good addition. Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator --- Key: HIVE-8313 URL: https://issues.apache.org/jira/browse/HIVE-8313 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-8313.1.patch, HIVE-8313.2.patch Consider the following query: {code:sql} SELECT foo, bar, goo, id FROM myTable WHERE id IN ( 'A', 'B', 'C', 'D', ... , 'ZZ' ); {code} One finds that when the IN clause has several thousand elements (and the table has several million rows), the query above takes orders-of-magnitude longer to run on Hive 0.12 than say Hive 0.10. I have a possibly incomplete fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8695) TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages
Xiaobing Zhou created HIVE-8695: --- Summary: TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages Key: HIVE-8695 URL: https://issues.apache.org/jira/browse/HIVE-8695 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Xiaobing Zhou -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8695) TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages
[ https://issues.apache.org/jira/browse/HIVE-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HIVE-8695: Description: repro steps: {noformat} run mvn test -Phadoop-2 -Dtest=TestJdbcWithMiniKdc#testNegativeTokenAuth {noformat} It fails because '*Failed to validate proxy privilege*' is the expected error and cause message, but '*Error retrieving delegation token for user*' and '*is not allowed to impersonate*' are returned instead. TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages -- Key: HIVE-8695 URL: https://issues.apache.org/jira/browse/HIVE-8695 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Xiaobing Zhou repro steps: {noformat} run mvn test -Phadoop-2 -Dtest=TestJdbcWithMiniKdc#testNegativeTokenAuth {noformat} It fails because '*Failed to validate proxy privilege*' is the expected error and cause message, but '*Error retrieving delegation token for user*' and '*is not allowed to impersonate*' are returned instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8394) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
[ https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-8394: --- Status: Open (was: Patch Available) Ok, HIVE-8394.2.patch assumes FileOutputCommitters. Must switch to using the {{baseDynamicCommitters}} list instead. HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss. - Key: HIVE-8394 URL: https://issues.apache.org/jira/browse/HIVE-8394 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.1, 0.12.0, 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Attachments: HIVE-8394.1.patch, HIVE-8394.2.patch We've found situations in production where Pig queries using {{HCatStorer}}, dynamic partitioning and {{opt.multiquery=true}} that produce partitions in the output table, but the corresponding directories have no data files (in spite of Pig reporting non-zero records written to HDFS). I don't yet have a distilled test-case for this. Here's the code from FileOutputCommitterContainer after HIVE-7803: {code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#CE} @Override public void commitTask(TaskAttemptContext context) throws IOException { String jobInfoStr = context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO); if (!dynamicPartitioningUsed) { //See HCATALOG-499 FileOutputFormatContainer.setWorkOutputPath(context); getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context)); } else if (jobInfoStr != null) { ArrayListString jobInfoList = (ArrayListString)HCatUtil.deserialize(jobInfoStr); org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = HCatMapRedUtil.createTaskAttemptContext(context); for (String jobStr : jobInfoList) { OutputJobInfo localJobInfo = (OutputJobInfo)HCatUtil.deserialize(jobStr); FileOutputCommitter committer = new FileOutputCommitter(new Path(localJobInfo.getLocation()), currTaskContext); committer.commitTask(currTaskContext); } } } {code} The serialized jobInfoList can't be retrieved, and hence the commit never completes. This is because Pig's MapReducePOStoreImpl deliberately clones both the TaskAttemptContext and the contained Configuration instance, thus separating the Configuration instances passed to {{FileOutputCommitterContainer::commitTask()}} and {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is unavailable to the Committer. One approach would have been to store state in the FileOutputFormatContainer. But that won't work since this is constructed via reflection in HCatOutputFormat (itself constructed via reflection by PigOutputFormat via HCatStorer). There's no guarantee that the instance is preserved. My only recourse seems to be to use a Singleton to store shared state. I'm loath to indulge in this brand of shenanigans. (Statics and container-reuse in Tez might not play well together, for instance.) It might work if we're careful about tearing down the singleton. Any other ideas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
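For what it's worth, the singleton option floated in the description above would look roughly like the sketch below: a process-wide registry keyed by task attempt, where removal doubles as the teardown needed so container reuse (e.g. under Tez) does not leak state. The class is hypothetical and is not the committed fix.
{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry sharing writer state with the committer across cloned configs. */
public final class TaskCommitterStateRegistry {

  private static final TaskCommitterStateRegistry INSTANCE = new TaskCommitterStateRegistry();

  // attempt id -> serialized OutputJobInfo strings produced by the writer
  private final Map<String, List<String>> jobInfoByAttempt =
      new ConcurrentHashMap<String, List<String>>();

  private TaskCommitterStateRegistry() {}

  public static TaskCommitterStateRegistry get() {
    return INSTANCE;
  }

  /** Called from the RecordWriter when it closes. */
  public void register(String attemptId, List<String> serializedJobInfos) {
    jobInfoByAttempt.put(attemptId, serializedJobInfos);
  }

  /** Called from the OutputCommitter; removing the entry doubles as teardown. */
  public List<String> consume(String attemptId) {
    return jobInfoByAttempt.remove(attemptId);
  }
}
{code}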
[jira] [Updated] (HIVE-8394) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
[ https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-8394: --- Attachment: HIVE-8394.3.patch Updated patch. HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss. - Key: HIVE-8394 URL: https://issues.apache.org/jira/browse/HIVE-8394 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0, 0.14.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Attachments: HIVE-8394.1.patch, HIVE-8394.2.patch, HIVE-8394.3.patch We've found situations in production where Pig queries using {{HCatStorer}}, dynamic partitioning and {{opt.multiquery=true}} that produce partitions in the output table, but the corresponding directories have no data files (in spite of Pig reporting non-zero records written to HDFS). I don't yet have a distilled test-case for this. Here's the code from FileOutputCommitterContainer after HIVE-7803: {code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#CE} @Override public void commitTask(TaskAttemptContext context) throws IOException { String jobInfoStr = context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO); if (!dynamicPartitioningUsed) { //See HCATALOG-499 FileOutputFormatContainer.setWorkOutputPath(context); getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context)); } else if (jobInfoStr != null) { ArrayListString jobInfoList = (ArrayListString)HCatUtil.deserialize(jobInfoStr); org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = HCatMapRedUtil.createTaskAttemptContext(context); for (String jobStr : jobInfoList) { OutputJobInfo localJobInfo = (OutputJobInfo)HCatUtil.deserialize(jobStr); FileOutputCommitter committer = new FileOutputCommitter(new Path(localJobInfo.getLocation()), currTaskContext); committer.commitTask(currTaskContext); } } } {code} The serialized jobInfoList can't be retrieved, and hence the commit never completes. This is because Pig's MapReducePOStoreImpl deliberately clones both the TaskAttemptContext and the contained Configuration instance, thus separating the Configuration instances passed to {{FileOutputCommitterContainer::commitTask()}} and {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is unavailable to the Committer. One approach would have been to store state in the FileOutputFormatContainer. But that won't work since this is constructed via reflection in HCatOutputFormat (itself constructed via reflection by PigOutputFormat via HCatStorer). There's no guarantee that the instance is preserved. My only recourse seems to be to use a Singleton to store shared state. I'm loath to indulge in this brand of shenanigans. (Statics and container-reuse in Tez might not play well together, for instance.) It might work if we're careful about tearing down the singleton. Any other ideas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8695) TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages
[ https://issues.apache.org/jira/browse/HIVE-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HIVE-8695: Attachment: HIVE-8695.1.patch After check, this is a result of HIVE-8557. Made a patch. Can anyone please review it? Thanks! TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages -- Key: HIVE-8695 URL: https://issues.apache.org/jira/browse/HIVE-8695 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Xiaobing Zhou Attachments: HIVE-8695.1.patch repo steps: {noformat} run mvn test -Phadoop-2 -Dtest=TestJdbcWithMiniKdc#testNegativeTokenAuth {noformat} , it fails since '*Failed to validate proxy privilege*' is expected error message and cause message, however, '*Error retrieving delegation token for user*' and '*is not allowed to impersonate*' are the returned exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8695) TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages
[ https://issues.apache.org/jira/browse/HIVE-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193572#comment-14193572 ] Xiaobing Zhou commented on HIVE-8695: - [~thejas] is it safe to do this change in this patch, since you were working on HIVE-8557? Thanks! TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages -- Key: HIVE-8695 URL: https://issues.apache.org/jira/browse/HIVE-8695 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Xiaobing Zhou Attachments: HIVE-8695.1.patch repo steps: {noformat} run mvn test -Phadoop-2 -Dtest=TestJdbcWithMiniKdc#testNegativeTokenAuth {noformat} , it fails since '*Failed to validate proxy privilege*' is expected error message and cause message, however, '*Error retrieving delegation token for user*' and '*is not allowed to impersonate*' are the returned exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7576) Add PartitionSpec support in HCatClient API
[ https://issues.apache.org/jira/browse/HIVE-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7576: --- Attachment: HIVE-7576.3.patch Looks like there were a couple more changes that went in to TestHCatClient yesterday that made a rebase necessary. Rebased and reuploaded .3.patch. Add PartitionSpec support in HCatClient API --- Key: HIVE-7576 URL: https://issues.apache.org/jira/browse/HIVE-7576 Project: Hive Issue Type: Bug Components: HCatalog, Metastore Affects Versions: 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-7576.1.patch, HIVE-7576.2.patch, HIVE-7576.3.patch HIVE-7223 adds support for PartitionSpecs in Hive Metastore. The HCatClient API must add support to fetch partitions, add partitions, etc. using PartitionSpec semantics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193577#comment-14193577 ] Julian Hyde commented on HIVE-8656: --- I have updated CALCITE-448's description, and have (finally) found a repro case in pure Calcite. I think it is a minor issue, now that TypeConverter has been fixed in Hive. CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Julian Hyde Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-8584) Setting hive.exec.orc.default.compress to ZLIB will lead to orc file size delta byte(s) shorter on Windows than Linux
[ https://issues.apache.org/jira/browse/HIVE-8584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou resolved HIVE-8584. - Resolution: Invalid Setting hive.exec.orc.default.compress to ZLIB will lead to orc file size delta byte(s) shorter on Windows than Linux - Key: HIVE-8584 URL: https://issues.apache.org/jira/browse/HIVE-8584 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Environment: Windows Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Minor Attachments: HIVE-8584.1.patch, orc-win-none-1.dump, orc-win-none-2.dump, orc-win-snappy-1.dump, orc-win-snappy-2.dump, orc-win-zlib-1.dump, orc-win-zlib-2.dump, orc_analyze.q repo steps: 1. run query orc_analyze.q 2. hive --orcfiledump target_orc_file_generated run 1 and 2 on PST timezone on Linux, and one more time on other timezone e.g. CST on Windows. Compare two target orc file dumping. Windows orc file is 1 byte shorter than Linux one. That's the case even if running 1 and 2 on Windows for different timezones, however, no problem on Linux. The issue only exists by using ZLIB mode, eventually OS native compression lib is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8584) Setting hive.exec.orc.default.compress to ZLIB will lead to orc file size delta byte(s) shorter on Windows than Linux
[ https://issues.apache.org/jira/browse/HIVE-8584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193581#comment-14193581 ] Xiaobing Zhou commented on HIVE-8584: - Thanks all for the comments. After deeper investigation, ZLIB mode actually works fine on both platforms; the qtest output is exactly the same on both. Other issues led to the output diff, and those will be tracked in a separate JIRA. I'll mark this as invalid. Setting hive.exec.orc.default.compress to ZLIB will lead to orc file size delta byte(s) shorter on Windows than Linux - Key: HIVE-8584 URL: https://issues.apache.org/jira/browse/HIVE-8584 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Environment: Windows Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Minor Attachments: HIVE-8584.1.patch, orc-win-none-1.dump, orc-win-none-2.dump, orc-win-snappy-1.dump, orc-win-snappy-2.dump, orc-win-zlib-1.dump, orc-win-zlib-2.dump, orc_analyze.q repro steps: 1. run query orc_analyze.q 2. hive --orcfiledump target_orc_file_generated Run 1 and 2 in the PST timezone on Linux, and once more in another timezone, e.g. CST, on Windows. Compare the two ORC file dumps: the Windows ORC file is 1 byte shorter than the Linux one. The same happens when running 1 and 2 on Windows across different timezones; there is no such problem on Linux. The issue only occurs in ZLIB mode, where the OS-native compression library is eventually used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8680) Set Max Message for Binary Thrift endpoints
[ https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193582#comment-14193582 ] Hive QA commented on HIVE-8680: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678713/HIVE-8680.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6668 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1602/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1602/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1602/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12678713 - PreCommit-HIVE-TRUNK-Build Set Max Message for Binary Thrift endpoints --- Key: HIVE-8680 URL: https://issues.apache.org/jira/browse/HIVE-8680 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-8680.patch, HIVE-8680.patch Thrift has a configuration open to restrict incoming message size. If we configure this we'll stop OOM'ing when someone sends us an HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
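To illustrate why an unbounded binary endpoint can OOM on a stray HTTP request: the first bytes of the request get interpreted as a length prefix, and the server then tries to allocate a buffer of that (huge) size. The guard below is only a generic, hypothetical sketch of the "cap the declared frame size before allocating" idea; it is not Thrift's API or the attached patch.
{code:java}
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical guard: reject length-prefixed frames above a configured cap. */
public final class FrameSizeGuard {

  private FrameSizeGuard() {}

  public static byte[] readFrame(DataInputStream in, int maxBytes) throws IOException {
    int declared = in.readInt();  // length prefix sent by the client
    if (declared < 0 || declared > maxBytes) {
      // Refuse before allocating, instead of OOM'ing on a huge or garbage
      // length (e.g. the leading bytes of an HTTP request read as an int).
      throw new IOException("Frame of " + declared + " bytes exceeds limit " + maxBytes);
    }
    byte[] frame = new byte[declared];
    in.readFully(frame);
    return frame;
  }
}
{code}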
[jira] [Updated] (HIVE-8394) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
[ https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-8394: --- Attachment: (was: HIVE-8394.3.patch) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss. - Key: HIVE-8394 URL: https://issues.apache.org/jira/browse/HIVE-8394 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0, 0.14.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Attachments: HIVE-8394.1.patch, HIVE-8394.2.patch We've found situations in production where Pig queries using {{HCatStorer}}, dynamic partitioning and {{opt.multiquery=true}} that produce partitions in the output table, but the corresponding directories have no data files (in spite of Pig reporting non-zero records written to HDFS). I don't yet have a distilled test-case for this. Here's the code from FileOutputCommitterContainer after HIVE-7803: {code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#CE} @Override public void commitTask(TaskAttemptContext context) throws IOException { String jobInfoStr = context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO); if (!dynamicPartitioningUsed) { //See HCATALOG-499 FileOutputFormatContainer.setWorkOutputPath(context); getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context)); } else if (jobInfoStr != null) { ArrayListString jobInfoList = (ArrayListString)HCatUtil.deserialize(jobInfoStr); org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = HCatMapRedUtil.createTaskAttemptContext(context); for (String jobStr : jobInfoList) { OutputJobInfo localJobInfo = (OutputJobInfo)HCatUtil.deserialize(jobStr); FileOutputCommitter committer = new FileOutputCommitter(new Path(localJobInfo.getLocation()), currTaskContext); committer.commitTask(currTaskContext); } } } {code} The serialized jobInfoList can't be retrieved, and hence the commit never completes. This is because Pig's MapReducePOStoreImpl deliberately clones both the TaskAttemptContext and the contained Configuration instance, thus separating the Configuration instances passed to {{FileOutputCommitterContainer::commitTask()}} and {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is unavailable to the Committer. One approach would have been to store state in the FileOutputFormatContainer. But that won't work since this is constructed via reflection in HCatOutputFormat (itself constructed via reflection by PigOutputFormat via HCatStorer). There's no guarantee that the instance is preserved. My only recourse seems to be to use a Singleton to store shared state. I'm loath to indulge in this brand of shenanigans. (Statics and container-reuse in Tez might not play well together, for instance.) It might work if we're careful about tearing down the singleton. Any other ideas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-7276) BaseSemanticAnalyzer.unescapeSQLString fails to parse Windows like path
[ https://issues.apache.org/jira/browse/HIVE-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou resolved HIVE-7276. - Resolution: Cannot Reproduce Resolved it since it's not reproducible any more. BaseSemanticAnalyzer.unescapeSQLString fails to parse Windows like path --- Key: HIVE-7276 URL: https://issues.apache.org/jira/browse/HIVE-7276 Project: Hive Issue Type: Bug Components: Query Processor, Windows Affects Versions: 0.13.0 Environment: Windows Server 2008 R2 Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Critical BaseSemanticAnalyzer.unescapeSQLString fails to parse windows-like path, e.g. C:\Users\xzhou\hworks. This will cause a large quantity of queries on windows to fail. For example, 'C:\Users\xzhou\hworks\workspace\hwx-hive-ws\hive\hcatalog\core\target\tmp\hive-junit-960740885870900' will be parsed as 'C:Usersxzhouhworksworkspacehwx-hive-wshivehcatalogcore arget mphive-junit-960740885870900', since \ is interpreted as start char in unicode string, e.g. \002 for delimiter, and thus swallowed. \0, \b, \n, \r, \t, \Z, and so on within normal Windows like path will also be swallowed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8394) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
[ https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-8394: --- Attachment: HIVE-8394.3.patch Minor logging adjustment. HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss. - Key: HIVE-8394 URL: https://issues.apache.org/jira/browse/HIVE-8394 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0, 0.14.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Attachments: HIVE-8394.1.patch, HIVE-8394.2.patch, HIVE-8394.3.patch We've found situations in production where Pig queries using {{HCatStorer}}, dynamic partitioning and {{opt.multiquery=true}} that produce partitions in the output table, but the corresponding directories have no data files (in spite of Pig reporting non-zero records written to HDFS). I don't yet have a distilled test-case for this. Here's the code from FileOutputCommitterContainer after HIVE-7803: {code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#CE} @Override public void commitTask(TaskAttemptContext context) throws IOException { String jobInfoStr = context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO); if (!dynamicPartitioningUsed) { //See HCATALOG-499 FileOutputFormatContainer.setWorkOutputPath(context); getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context)); } else if (jobInfoStr != null) { ArrayListString jobInfoList = (ArrayListString)HCatUtil.deserialize(jobInfoStr); org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = HCatMapRedUtil.createTaskAttemptContext(context); for (String jobStr : jobInfoList) { OutputJobInfo localJobInfo = (OutputJobInfo)HCatUtil.deserialize(jobStr); FileOutputCommitter committer = new FileOutputCommitter(new Path(localJobInfo.getLocation()), currTaskContext); committer.commitTask(currTaskContext); } } } {code} The serialized jobInfoList can't be retrieved, and hence the commit never completes. This is because Pig's MapReducePOStoreImpl deliberately clones both the TaskAttemptContext and the contained Configuration instance, thus separating the Configuration instances passed to {{FileOutputCommitterContainer::commitTask()}} and {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is unavailable to the Committer. One approach would have been to store state in the FileOutputFormatContainer. But that won't work since this is constructed via reflection in HCatOutputFormat (itself constructed via reflection by PigOutputFormat via HCatStorer). There's no guarantee that the instance is preserved. My only recourse seems to be to use a Singleton to store shared state. I'm loath to indulge in this brand of shenanigans. (Statics and container-reuse in Tez might not play well together, for instance.) It might work if we're careful about tearing down the singleton. Any other ideas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8680) Set Max Message for Binary Thrift endpoints
[ https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193588#comment-14193588 ] Szehon Ho commented on HIVE-8680: - +1 thanks Brock Set Max Message for Binary Thrift endpoints --- Key: HIVE-8680 URL: https://issues.apache.org/jira/browse/HIVE-8680 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-8680.patch, HIVE-8680.patch Thrift has a configuration option to restrict incoming message size. If we configure this we'll stop OOM'ing when someone sends us an HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7511) Hive: output is incorrect if there are UTF-8 characters in where clause of a hive select query.
[ https://issues.apache.org/jira/browse/HIVE-7511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193589#comment-14193589 ] Xiaobing Zhou commented on HIVE-7511: - This can be resolved by applying java options, like -Dfile.encoding=UTF-8. Setting it as env variable(_JAVA_OPTIONS=-Dfile.encoding=UTF-8) or passing as java start argument both work fine. Hive: output is incorrect if there are UTF-8 characters in where clause of a hive select query. --- Key: HIVE-7511 URL: https://issues.apache.org/jira/browse/HIVE-7511 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Environment: Windows Server 2008 R2 Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Critical Attachments: HIVE-7511.1.patch When we put UTF-8 characters in where clause of a hive query the results are empty for where content like '%丄%' and results contain all rows for where content not like '%丄%'; even when few rows contain this character. Steps to reproduce: 1. Save a file called data.txt in the root container. The contents of the files are as follows. 190 丄f齄啊c狛䶴h䶴c狝 899 d狜狜㐁geg阿狚ea䶴eead狜e 137 齄鼾h狝ge㐀狛g狚阿 21﨩﨩e㐀c狛鼾d䶴﨨 767 﨩c﨩g狜㐁狜狛齄阿﨩狚齄﨨䶵狝﨨 281 﨨㐀啊aga啊c狝e鼾鼾 573 㐁䶴hc﨨b狝㐁﨩䶴狜丄hc齄 966 䶴丄狜﨨e狝eb狜㐁c㐀鼾﨩丄ga狚丄 565 䶵㐀﨩㐀bb狛ehd丄ea丄㐀 778 﨩㐁阿﨨狚bbea丄䶵丄狚鼾狚a䶵 363 gd齄a鼾a䶴b㐁㐁fg鼾 822 a阿狜䶵h䶵e狛h﨩gac狜阿㐀啊b 338 b齄㐁ff阿e狜e㐀ba齄 2. Execute the following queries to setup the table. a. CREATE TABLE hivetable(row INT, content STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ' t' LOCATION '/hivetable'; b. LOAD DATA INPATH 'wasb:///data.txt' OVERWRITE INTO TABLE hivetable; 3. create a query file query.hql with following contents INSERT OVERWRITE DIRECTORY 'wasb:///hiveoutput' select * from hivetable where content like '%丄%'; 4. even though few rows contains this character the output is empty. 5. change the contents of query.hql to INSERT OVERWRITE DIRECTORY 'wasb:///hiveoutput' select * from hivetable where content not like '%丄%'; 6. The output contains all rows including those containing the given character. 7. Similar results are observed when using where content = '丄f齄啊c狛䶴h䶴c狝'; 8. We get expected results when using where content like '%a%'; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
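The workaround above works because Java APIs that omit an explicit charset fall back to the file.encoding default. The small, self-contained demonstration below (not Hive code) shows the effect: the same UTF-8 bytes only match a multi-byte literal when decoded as UTF-8.
{code:java}
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {
  public static void main(String[] args) {
    byte[] utf8Bytes = "丄f齄啊c".getBytes(StandardCharsets.UTF_8);

    String decodedWithDefault = new String(utf8Bytes);  // uses file.encoding
    String decodedAsUtf8 = new String(utf8Bytes, StandardCharsets.UTF_8);

    System.out.println("default charset: " + Charset.defaultCharset());
    // On a non-UTF-8 default (common on Windows), the first check prints false.
    System.out.println("matches with default charset: " + decodedWithDefault.contains("丄"));
    System.out.println("matches as UTF-8: " + decodedAsUtf8.contains("丄"));
  }
}
{code}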
[jira] [Updated] (HIVE-8695) TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages
[ https://issues.apache.org/jira/browse/HIVE-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-8695: Status: Patch Available (was: Open) TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages -- Key: HIVE-8695 URL: https://issues.apache.org/jira/browse/HIVE-8695 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Xiaobing Zhou Attachments: HIVE-8695.1.patch Repro steps: {noformat} run mvn test -Phadoop-2 -Dtest=TestJdbcWithMiniKdc#testNegativeTokenAuth {noformat} It fails because '*Failed to validate proxy privilege*' is the expected error and cause message; however, '*Error retrieving delegation token for user*' and '*is not allowed to impersonate*' are what the thrown exception actually contains. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8394) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
[ https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-8394: --- Attachment: HIVE-8394.4.patch Now with more logging, and ASF header. HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss. - Key: HIVE-8394 URL: https://issues.apache.org/jira/browse/HIVE-8394 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0, 0.14.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Attachments: HIVE-8394.1.patch, HIVE-8394.2.patch, HIVE-8394.3.patch, HIVE-8394.4.patch We've found situations in production where Pig queries using {{HCatStorer}}, dynamic partitioning and {{opt.multiquery=true}} produce partitions in the output table, but the corresponding directories have no data files (in spite of Pig reporting non-zero records written to HDFS). I don't yet have a distilled test-case for this. Here's the code from FileOutputCommitterContainer after HIVE-7803:
{code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#CE}
@Override
public void commitTask(TaskAttemptContext context) throws IOException {
    String jobInfoStr = context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO);
    if (!dynamicPartitioningUsed) {
        // See HCATALOG-499
        FileOutputFormatContainer.setWorkOutputPath(context);
        getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context));
    } else if (jobInfoStr != null) {
        ArrayList<String> jobInfoList = (ArrayList<String>) HCatUtil.deserialize(jobInfoStr);
        org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = HCatMapRedUtil.createTaskAttemptContext(context);
        for (String jobStr : jobInfoList) {
            OutputJobInfo localJobInfo = (OutputJobInfo) HCatUtil.deserialize(jobStr);
            FileOutputCommitter committer = new FileOutputCommitter(new Path(localJobInfo.getLocation()), currTaskContext);
            committer.commitTask(currTaskContext);
        }
    }
}
{code}
The serialized jobInfoList can't be retrieved, and hence the commit never completes. This is because Pig's MapReducePOStoreImpl deliberately clones both the TaskAttemptContext and the contained Configuration instance, thus separating the Configuration instances passed to {{FileOutputCommitterContainer::commitTask()}} and {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is unavailable to the Committer. One approach would have been to store state in the FileOutputFormatContainer. But that won't work, since this is constructed via reflection in HCatOutputFormat (itself constructed via reflection by PigOutputFormat via HCatStorer). There's no guarantee that the instance is preserved. My only recourse seems to be to use a Singleton to store shared state. I'm loath to indulge in this brand of shenanigans. (Statics and container-reuse in Tez might not play well together, for instance.) It might work if we're careful about tearing down the singleton. Any other ideas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
Mithun Radhakrishnan created HIVE-8696: -- Summary: HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient. Key: HIVE-8696 URL: https://issues.apache.org/jira/browse/HIVE-8696 Project: Hive Issue Type: Bug Components: HCatalog, Metastore Affects Versions: 0.13.1, 0.12.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the HCatClient API who log in through keytabs will fail without retry when their TGTs expire. The fix is inbound. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
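A rough sketch of the direction the description implies, presumably via org.apache.hadoop.hive.metastore.RetryingMetaStoreClient; the exact getProxy(...) overload varies across Hive versions, so treat the call below as approximate rather than as the actual HIVE-8696 patch:
{code:java}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaHookLoader;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.RetryingMetaStoreClient;

public class RetryingClientSketch {
  // Builds the metastore client through the retrying proxy instead of
  // instantiating HiveMetaStoreClient directly, so transient failures
  // (e.g. while a keytab login re-acquires an expired TGT) are retried.
  public static IMetaStoreClient createClient(HiveConf conf) throws Exception {
    // A null HiveMetaHookLoader is used here purely for brevity.
    return RetryingMetaStoreClient.getProxy(
        conf, (HiveMetaHookLoader) null, HiveMetaStoreClient.class.getName());
  }
}
{code}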