[jira] [Commented] (HIVE-12847) ORC file footer cache should be memory sensitive
[ https://issues.apache.org/jira/browse/HIVE-12847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103066#comment-15103066 ] Nemon Lou commented on HIVE-12847:
----------------------------------

The serialized size of a protobuf object is much smaller than the size of the corresponding Java object (only about 10% in my test cases), so the maximum weight of the cache needs to be decreased accordingly.

> ORC file footer cache should be memory sensitive
> ------------------------------------------------
>
>                 Key: HIVE-12847
>                 URL: https://issues.apache.org/jira/browse/HIVE-12847
>             Project: Hive
>          Issue Type: Improvement
>          Components: File Formats, ORC
>    Affects Versions: 1.2.1
>            Reporter: Nemon Lou
>            Assignee: Nemon Lou
>         Attachments: HIVE-12847.patch
>
> The size-based footer cache cannot control memory usage properly.
> We have seen a HiveServer2 hang due to the ORC file footer cache taking up too much heap memory.
> A simple query like "select * from orc_table limit 1" can make HiveServer2 hang.
> The input table has about 1000 ORC files, and each ORC file contains about 2500 stripes.
> {noformat}
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:     214653601    25758432120  org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics
>    3:     122233301     8800797672  org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics
>    5:      89439001     6439608072  org.apache.hadoop.hive.ql.io.orc.OrcProto$IntegerStatistics
>    7:       2981300      262354400  org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeInformation
>    9:       2981300      143102400  org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics
>   12:       2983691       71608584  org.apache.hadoop.hive.ql.io.orc.ReaderImpl$StripeInformationImpl
>   15:         80929        7121752  org.apache.hadoop.hive.ql.io.orc.OrcProto$Type
>   17:        103282        5783792  org.apache.hadoop.mapreduce.lib.input.FileSplit
>   20:         51641        3305024  org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit
>   21:         51641        3305024  org.apache.hadoop.hive.ql.io.orc.OrcSplit
>   31:             1         413152  [Lorg.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit;
>  100:          1122          26928  org.apache.hadoop.hive.ql.io.orc.Metadata
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
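The tradeoff in Nemon's comment (weight cache entries by estimated heap footprint, roughly 10x the serialized protobuf size, rather than by serialized bytes) can be sketched as a weight-bounded LRU cache. This is an illustrative sketch in plain Java, not Hive's actual implementation; the class name, the constant 10x expansion factor, and the byte[]-valued entries are all assumptions made for the example.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class FooterCacheSketch {
  // Assumption from the comment above: a deserialized footer occupies
  // roughly 10x its serialized protobuf size on the heap.
  static final int EXPANSION_FACTOR = 10;

  private final long maxWeightBytes;
  private long currentWeightBytes = 0;
  // access-order map gives least-recently-used iteration order for eviction
  private final LinkedHashMap<String, byte[]> cache =
      new LinkedHashMap<>(16, 0.75f, true);

  public FooterCacheSketch(long maxWeightBytes) {
    this.maxWeightBytes = maxWeightBytes;
  }

  // Weight an entry by its estimated heap footprint, not its serialized size.
  static long estimateWeight(int serializedSize) {
    return (long) serializedSize * EXPANSION_FACTOR;
  }

  public synchronized void put(String path, byte[] serializedFooter) {
    byte[] old = cache.remove(path);
    if (old != null) {
      currentWeightBytes -= estimateWeight(old.length);
    }
    cache.put(path, serializedFooter);
    currentWeightBytes += estimateWeight(serializedFooter.length);
    // Evict least-recently-used entries until under the weight limit.
    Iterator<Map.Entry<String, byte[]>> it = cache.entrySet().iterator();
    while (currentWeightBytes > maxWeightBytes && it.hasNext()) {
      Map.Entry<String, byte[]> eldest = it.next();
      currentWeightBytes -= estimateWeight(eldest.getValue().length);
      it.remove();
    }
  }

  public synchronized byte[] get(String path) {
    return cache.get(path);
  }

  public synchronized int size() {
    return cache.size();
  }
}
```

The point of the sketch is only the accounting: bounding the cache by serialized bytes (as a size-based cache does) understates heap usage by the expansion factor, which matches the histogram above where OrcProto objects dominate the heap.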
[jira] [Commented] (HIVE-12827) Vectorization: VectorCopyRow/VectorAssignRow/VectorDeserializeRow assign needs explicit isNull[offset] modification
[ https://issues.apache.org/jira/browse/HIVE-12827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103060#comment-15103060 ] Hive QA commented on HIVE-12827:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782483/HIVE-12827.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10004 tests executed

*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6644/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6644/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6644/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12782483 - PreCommit-HIVE-TRUNK-Build > Vectorization: VectorCopyRow/VectorAssignRow/VectorDeserializeRow assign > needs explicit isNull[offset] modification > --- > > Key: HIVE-12827 > URL: https://issues.apache.org/jira/browse/HIVE-12827 > Project: Hive > Issue Type: Bug >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-12827.2.patch > > > Some scenarios do set Double.NaN instead of isNull=true, but all types aren't > consistent. > Examples of un-set isNull for the valid values are > {code} > private class FloatReader extends AbstractDoubleReader { > FloatReader(int columnIndex) { > super(columnIndex); > } > @Override > void apply(VectorizedRowBatch batch, int batchIndex) throws IOException { > DoubleColumnVector colVector = (DoubleColumnVector) > batch.cols[columnIndex]; > if (deserializeRead.readCheckNull()) { > VectorizedBatchUtil.setNullColIsNullValue(colVector, batchIndex); > } else { > float value = deserializeRead.readFloat(); > colVector.vector[batchIndex] = (double) value; > } > } > } > {code} > {code} > private class DoubleCopyRow extends CopyRow { > DoubleCopyRow(int inColumnIndex, int outColumnIndex) { > super(inColumnIndex, outColumnIndex); > } > @Override > void copy(VectorizedRowBatch inBatch, int inBatchIndex, > VectorizedRowBatch outBatch, int outBatchIndex) { > DoubleColumnVector inColVector = (DoubleColumnVector) > inBatch.cols[inColumnIndex]; > DoubleColumnVector outColVector = (DoubleColumnVector) > outBatch.cols[outColumnIndex]; > if (inColVector.isRepeating) { > if (inColVector.noNulls || !inColVector.isNull[0]) { > outColVector.vector[outBatchIndex] = inColVector.vector[0]; > } else { > VectorizedBatchUtil.setNullColIsNullValue(outColVector, > outBatchIndex); > } > } else { > if (inColVector.noNulls || !inColVector.isNull[inBatchIndex]) { > outColVector.vector[outBatchIndex] = > inColVector.vector[inBatchIndex]; > } else { > VectorizedBatchUtil.setNullColIsNullValue(outColVector, > outBatchIndex); > } > } > } > } > 
{code} > {code} > private static abstract class VectorDoubleColumnAssign > extends VectorColumnAssignVectorBase { > protected void assignDouble(double value, int destIndex) { > outCol.vector[destIndex] = value; > } > } > {code} > The pattern to imitate would be the earlier code from VectorBatchUtil > {code} > case DOUBLE: { > DoubleColumnVector dcv = (DoubleColumnVector) batch.cols[offset + > colIndex]; > if (writableCol != null) { > dcv.vector[rowIndex] = ((DoubleWritable) writableCol).get(); > dcv.isNull[rowIndex] = false; > } else { >
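The fix pattern described in this issue (every valid assign must explicitly write isNull[offset], as the older VectorizedBatchUtil code does) can be demonstrated with a toy column vector. The class and method names below are hypothetical stand-ins, not Hive's real DoubleColumnVector; the point is only that a reused batch keeps stale isNull flags unless the assign path clears them.

```java
public class IsNullResetSketch {
  // Toy stand-in for a column vector: values plus per-row null flags.
  static class DoubleColVector {
    double[] vector = new double[8];
    boolean[] isNull = new boolean[8];
    boolean noNulls = false;
  }

  // Buggy pattern: assigns the value but never clears a stale null flag
  // left over from a previous use of the (reused) batch.
  static void assignWithoutReset(DoubleColVector col, int i, double v) {
    col.vector[i] = v;
  }

  // Pattern to imitate from the older VectorizedBatchUtil code: every
  // valid assign explicitly writes isNull[i] = false.
  static void assignWithReset(DoubleColVector col, int i, double v) {
    col.vector[i] = v;
    col.isNull[i] = false;
  }
}
```

With assignWithoutReset, a row that was null in the batch's previous use still reads as null even though a fresh value was written, which is exactly the inconsistency the FloatReader/DoubleCopyRow snippets above can exhibit.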
[jira] [Updated] (HIVE-12847) ORC file footer cache should be memory sensitive
[ https://issues.apache.org/jira/browse/HIVE-12847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemon Lou updated HIVE-12847:
-----------------------------
    Description: 
The size-based footer cache cannot control memory usage properly. We have seen a HiveServer2 hang (full GC all the time) due to the ORC file footer cache taking up too much heap memory. A simple query like "select * from orc_table limit 1" can make HiveServer2 hang. The input table has about 1000 ORC files, and each ORC file contains about 2500 stripes.
{noformat}
 num     #instances         #bytes  class name
----------------------------------------------
   1:     214653601    25758432120  org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics
   3:     122233301     8800797672  org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics
   5:      89439001     6439608072  org.apache.hadoop.hive.ql.io.orc.OrcProto$IntegerStatistics
   7:       2981300      262354400  org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeInformation
   9:       2981300      143102400  org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics
  12:       2983691       71608584  org.apache.hadoop.hive.ql.io.orc.ReaderImpl$StripeInformationImpl
  15:         80929        7121752  org.apache.hadoop.hive.ql.io.orc.OrcProto$Type
  17:        103282        5783792  org.apache.hadoop.mapreduce.lib.input.FileSplit
  20:         51641        3305024  org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit
  21:         51641        3305024  org.apache.hadoop.hive.ql.io.orc.OrcSplit
  31:             1         413152  [Lorg.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit;
 100:          1122          26928  org.apache.hadoop.hive.ql.io.orc.Metadata
{noformat}

  was:
The size-based footer cache cannot control memory usage properly. We have seen a HiveServer2 hang due to the ORC file footer cache taking up too much heap memory. A simple query like "select * from orc_table limit 1" can make HiveServer2 hang. The input table has about 1000 ORC files, and each ORC file contains about 2500 stripes.
{noformat}
 num     #instances         #bytes  class name
----------------------------------------------
   1:     214653601    25758432120  org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics
   3:     122233301     8800797672  org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics
   5:      89439001     6439608072  org.apache.hadoop.hive.ql.io.orc.OrcProto$IntegerStatistics
   7:       2981300      262354400  org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeInformation
   9:       2981300      143102400  org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics
  12:       2983691       71608584  org.apache.hadoop.hive.ql.io.orc.ReaderImpl$StripeInformationImpl
  15:         80929        7121752  org.apache.hadoop.hive.ql.io.orc.OrcProto$Type
  17:        103282        5783792  org.apache.hadoop.mapreduce.lib.input.FileSplit
  20:         51641        3305024  org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit
  21:         51641        3305024  org.apache.hadoop.hive.ql.io.orc.OrcSplit
  31:             1         413152  [Lorg.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit;
 100:          1122          26928  org.apache.hadoop.hive.ql.io.orc.Metadata
{noformat}

> ORC file footer cache should be memory sensitive
> ------------------------------------------------
>
>                 Key: HIVE-12847
>                 URL: https://issues.apache.org/jira/browse/HIVE-12847
>             Project: Hive
>          Issue Type: Improvement
>          Components: File Formats, ORC
>    Affects Versions: 1.2.1
>            Reporter: Nemon Lou
>            Assignee: Nemon Lou
>         Attachments: HIVE-12847.patch
>
> The size-based footer cache cannot control memory usage properly.
> We have seen a HiveServer2 hang (full GC all the time) due to the ORC file footer cache taking up too much heap memory.
> A simple query like "select * from orc_table limit 1" can make HiveServer2 hang.
> The input table has about 1000 ORC files, and each ORC file contains about 2500 stripes.
> {noformat}
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:     214653601    25758432120  org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics
>    3:     122233301     8800797672  org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics
>    5:      89439001     6439608072  org.apache.hadoop.hive.ql.io.orc.OrcProto$IntegerStatistics
>    7:       2981300      262354400  org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeInformation
>    9:       2981300      143102400  org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics
>   12:       2983691       71608584  org.apache.hadoop.hive.ql.io.orc.ReaderImpl$StripeInformationImpl
>   15:         80929        7121752  org.apache.hadoop.hive.ql.io.orc.OrcProto$Type
[jira] [Comment Edited] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103489#comment-15103489 ] Vaibhav Gumashta edited comment on HIVE-12049 at 1/16/16 11:38 PM:
-------------------------------------------------------------------

A WIP patch (this one has the new SerDe) while I'm cleaning up local commits to generate an end-to-end patch. I think it'll be easier to run unit tests if this jira and HIVE-12428 are merged eventually; however, keeping them separate might be easier for review. [~rohitdholakia] [~thejas] what do you think?

was (Author: vgumashta): A WIP patch while I'm cleaning up local commits to generate an end-to-end patch. I think it'll be easier to run unit tests if this jira and HIVE-12428 are merged eventually; however, keeping them separate might be easier for review. [~rohitdholakia] [~thejas] what do you think?

> Provide an option to write serialized thrift objects in final tasks
> -------------------------------------------------------------------
>
>                 Key: HIVE-12049
>                 URL: https://issues.apache.org/jira/browse/HIVE-12049
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2
>            Reporter: Rohit Dholakia
>            Assignee: Rohit Dholakia
>         Attachments: HIVE-12049.1.patch
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing the row objects and translating them into a different representation suitable for the RPC transfer. In moderate to high concurrency scenarios, this can result in significant CPU and memory wastage. By having each task write the appropriate thrift objects to the output files, HiveServer2 can simply stream a batch of rows on the wire without incurring any of the additional cost of deserialization and translation.
> This can be implemented by writing a new SerDe, which the FileSinkOperator can use to write thrift-formatted row batches to the output file. Using the pluggable property of the {{hive.query.result.fileformat}}, we can set it to use SequenceFile and write a batch of thrift-formatted rows as a value blob. The FetchTask can now simply read the blob and send it over the wire. On the client side, the *DBC driver can read the blob and, since it is already formatted in the way it expects, it can continue building the ResultSet the way it does in the current implementation.
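The idea in the description (serialize row batches once at write time so FetchTask can stream the stored blob without per-row deserialization) can be sketched roughly as below. The encoding uses plain DataOutputStream strings instead of Thrift, and all class and method names are illustrative assumptions, not Hive or Thrift APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SerializedFetchSketch {
  // Task side: rows are encoded once, when the result file is written.
  static byte[] encodeBatch(List<String> rows) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bos);
    out.writeInt(rows.size());
    for (String row : rows) {
      out.writeUTF(row);
    }
    out.flush();
    return bos.toByteArray();
  }

  // Server side: a fetch is a pass-through of the stored blob; no per-row
  // deserialization or translation happens on the fetch path.
  static byte[] fetchBlob(byte[] storedValueBlob) {
    return storedValueBlob;
  }

  // Client side: the driver decodes the blob into rows.
  static List<String> decodeBatch(byte[] blob) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(blob));
    int n = in.readInt();
    List<String> rows = new ArrayList<>(n);
    for (int i = 0; i < n; i++) {
      rows.add(in.readUTF());
    }
    return rows;
  }
}
```

The design point is that the expensive step (encoding) moves from the concurrent fetch path to the one-time write path, which is where the CPU and memory savings in the description come from.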
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-12049: Attachment: HIVE-12049.1.patch A wip patch while I'm cleaning up local commits to generate an end to end patch. I think it'll be easier to run UT if this jira and HIVE-12428 are merged eventually, however keeping them separate might be easier for review. [~rohitdholakia] [~thejas] what do you think? > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In a moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. Using the > pluggable property of the {{hive.query.result.fileformat}}, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12481) Occasionally "Request is a replay" will be thrown from HS2
[ https://issues.apache.org/jira/browse/HIVE-12481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103497#comment-15103497 ] Lefty Leverenz commented on HIVE-12481: --- Update: Patch to master was reverted, so no doc needed (yet). > Occasionally "Request is a replay" will be thrown from HS2 > -- > > Key: HIVE-12481 > URL: https://issues.apache.org/jira/browse/HIVE-12481 > Project: Hive > Issue Type: Improvement > Components: Authentication >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-12481.2.patch, HIVE-12481.patch > > > We have seen the following exception thrown from HS2 in secured cluster when > many queries are running simultaneously on single HS2 instance. > The cause I can guess is that it happens that two queries are submitted at > the same time and have the same timestamp. For such case, we can add a retry > for the query. > > {noformat} > 2015-11-18 16:12:33,117 ERROR org.apache.thrift.transport.TSaslTransport: > SASL negotiation failure > javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: Failure unspecified at GSS-API level (Mechanism level: Request > is a replay (34))] > at > com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:177) > at > org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539) > at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283) > at > org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736) > at 
java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:356) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1651) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:736) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism > level: Request is a replay (34)) > at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:788) > at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342) > at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285) > at > com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:155) > ... 14 more > Caused by: KrbException: Request is a replay (34) > at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:308) > at sun.security.krb5.KrbApReq.(KrbApReq.java:144) > at > sun.security.jgss.krb5.InitSecContextToken.(InitSecContextToken.java:108) > at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:771) > ... 17 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
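The retry proposed in the description (retry a request that the Kerberos replay cache rejected because two requests happened to carry the same timestamp) might look roughly like the sketch below. Matching on the "Request is a replay" message text and the class/method names are assumptions made for illustration, not how a production fix would necessarily classify the failure.

```java
import java.util.concurrent.Callable;

public class ReplayRetrySketch {
  // Heuristic: walk the cause chain looking for the Kerberos replay-cache
  // message seen in the stack trace above.
  static boolean isReplayError(Throwable t) {
    for (Throwable c = t; c != null; c = c.getCause()) {
      String m = c.getMessage();
      if (m != null && m.contains("Request is a replay")) {
        return true;
      }
    }
    return false;
  }

  // Retry only the transient replay collision; rethrow everything else.
  static <T> T callWithRetry(Callable<T> call, int maxAttempts) throws Exception {
    for (int attempt = 1; ; attempt++) {
      try {
        return call.call();
      } catch (Exception e) {
        if (!isReplayError(e) || attempt >= maxAttempts) {
          throw e;
        }
        // Two concurrent requests collided on the same Kerberos timestamp;
        // retrying generates a fresh authenticator.
      }
    }
  }
}
```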
[jira] [Commented] (HIVE-12853) LLAP: localize permanent UDF jars to daemon and add them to classloader
[ https://issues.apache.org/jira/browse/HIVE-12853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103564#comment-15103564 ] Hive QA commented on HIVE-12853: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12782642/HIVE-12853.02.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10021 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6653/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6653/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6653/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12782642 - PreCommit-HIVE-TRUNK-Build > LLAP: localize permanent UDF jars to daemon and add them to classloader > --- > > Key: HIVE-12853 > URL: https://issues.apache.org/jira/browse/HIVE-12853 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12853.01.patch, HIVE-12853.02.patch, > HIVE-12853.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
[ https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103505#comment-15103505 ] Hive QA commented on HIVE-12661: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12782611/HIVE-12661.12.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 9993 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-tez_bmj_schema_evolution.q-orc_merge5.q-vectorization_limit.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.TestTxnCommands2.testOrcPPD org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.hcatalog.listener.TestDbNotificationListener.sqlInsertPartition org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6651/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6651/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6651/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing 
org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 11 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12782611 - PreCommit-HIVE-TRUNK-Build > StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly > --- > > Key: HIVE-12661 > URL: https://issues.apache.org/jira/browse/HIVE-12661 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12661.01.patch, HIVE-12661.02.patch, > HIVE-12661.03.patch, HIVE-12661.04.patch, HIVE-12661.05.patch, > HIVE-12661.06.patch, HIVE-12661.07.patch, HIVE-12661.08.patch, > HIVE-12661.09.patch, HIVE-12661.10.patch, HIVE-12661.11.patch, > HIVE-12661.12.patch > > > PROBLEM: > Hive stats are autogathered properly till an 'analyze table [tablename] > compute statistics for columns' is run. Then it does not auto-update the > stats till the command is run again. repo: > {code} > set hive.stats.autogather=true; > set hive.stats.atomic=false ; > set hive.stats.collect.rawdatasize=true ; > set hive.stats.collect.scancols=false ; > set hive.stats.collect.tablekeys=false ; > set hive.stats.fetch.column.stats=true; > set hive.stats.fetch.partition.stats=true ; > set hive.stats.reliable=false ; > set hive.compute.query.using.stats=true; > CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( > 'orc.compress'='NONE') ; > insert into calendar values (2010), (2011), (2012); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > ++--+ > select max(year) from calendar; > | 2012 | > insert into calendar values (2013); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > | 2013 | > ++--+ 
> select max(year) from calendar; > | 2013 | > insert into calendar values (2014); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > insert into calendar values (2015); > select max(year) from calendar; > | 2014 | > insert into calendar values (2016), (2017), (2018); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > select max(year) from calendar; > | 2018 | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
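The repro above can be modeled as a tiny in-memory table: once column stats are captured by ANALYZE and marked accurate, a stats-only fast path for max() keeps returning the snapshot until the next ANALYZE, because inserts never invalidate the accuracy flag. Everything below is a toy model of that failure mode, not Hive code; the field and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class StaleStatsSketch {
  private final List<Integer> rows = new ArrayList<>();
  private int statsMax;               // column-stats snapshot from ANALYZE
  private boolean statsAccurate = false;

  void insert(int value) {
    rows.add(value);
    // Bug pattern: the accuracy flag is NOT cleared here, so the stats-only
    // fast path below keeps serving the stale snapshot after new inserts.
  }

  void analyze() {
    // models "analyze table ... compute statistics for columns"
    statsMax = Collections.max(rows);
    statsAccurate = true;
  }

  int selectMax() {
    if (statsAccurate) {
      return statsMax;                // answered from stats, no scan
    }
    return Collections.max(rows);     // full scan of the data
  }
}
```

With hive.compute.query.using.stats=true the real system takes the analogous fast path, which is why the repro sees max(year) stuck at 2014 until ANALYZE runs again.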
[jira] [Updated] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
[ https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12661: --- Attachment: HIVE-12661.final.patch > StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly > --- > > Key: HIVE-12661 > URL: https://issues.apache.org/jira/browse/HIVE-12661 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12661.01.patch, HIVE-12661.02.patch, > HIVE-12661.03.patch, HIVE-12661.04.patch, HIVE-12661.05.patch, > HIVE-12661.06.patch, HIVE-12661.07.patch, HIVE-12661.08.patch, > HIVE-12661.09.patch, HIVE-12661.10.patch, HIVE-12661.11.patch, > HIVE-12661.12.patch, HIVE-12661.final.patch > > > PROBLEM: > Hive stats are autogathered properly till an 'analyze table [tablename] > compute statistics for columns' is run. Then it does not auto-update the > stats till the command is run again. repo: > {code} > set hive.stats.autogather=true; > set hive.stats.atomic=false ; > set hive.stats.collect.rawdatasize=true ; > set hive.stats.collect.scancols=false ; > set hive.stats.collect.tablekeys=false ; > set hive.stats.fetch.column.stats=true; > set hive.stats.fetch.partition.stats=true ; > set hive.stats.reliable=false ; > set hive.compute.query.using.stats=true; > CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( > 'orc.compress'='NONE') ; > insert into calendar values (2010), (2011), (2012); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > ++--+ > select max(year) from calendar; > | 2012 | > insert into calendar values (2013); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > | 2013 | > ++--+ > select max(year) from calendar; > | 2013 | > 
insert into calendar values (2014); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > insert into calendar values (2015); > select max(year) from calendar; > | 2014 | > insert into calendar values (2016), (2017), (2018); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > select max(year) from calendar; > | 2018 | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12883) Support basic stats and column stats in table properties in HBaseStore
[ https://issues.apache.org/jira/browse/HIVE-12883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12883:
-----------------------------------
    Attachment: HIVE-12883.01.patch

> Support basic stats and column stats in table properties in HBaseStore
> ----------------------------------------------------------------------
>
>                 Key: HIVE-12883
>                 URL: https://issues.apache.org/jira/browse/HIVE-12883
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-12883.01.patch
>
> Need to add support for HBase store too.
[jira] [Commented] (HIVE-12875) Verify sem.getInputs() and sem.getOutputs()
[ https://issues.apache.org/jira/browse/HIVE-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103114#comment-15103114 ] Hive QA commented on HIVE-12875: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12782506/HIVE-12875.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10004 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6645/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6645/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6645/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12782506 - PreCommit-HIVE-TRUNK-Build > Verify sem.getInputs() and sem.getOutputs() > --- > > Key: HIVE-12875 > URL: https://issues.apache.org/jira/browse/HIVE-12875 > Project: Hive > Issue Type: Bug >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-12875.patch > > > For every partition entity object present in sem.getInputs() and > sem.getOutputs(), we must verify the appropriate Table in the list of > Entities. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10632) Make sure TXN_COMPONENTS gets cleaned up if table is dropped before compaction.
[ https://issues.apache.org/jira/browse/HIVE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103288#comment-15103288 ] Eugene Koifman commented on HIVE-10632:
---------------------------------------

The presence of values_tmp_table_N is probably not triggering actual compactions because it has no delta files. Since it's not an ACID table, it shouldn't be in TXN_COMPONENTS in the first place. Need to investigate this.

> Make sure TXN_COMPONENTS gets cleaned up if table is dropped before compaction.
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-10632
>                 URL: https://issues.apache.org/jira/browse/HIVE-10632
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore, Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Critical
>
> The compaction process will clean up entries in TXNS, COMPLETED_TXN_COMPONENTS, and TXN_COMPONENTS. If the table/partition is dropped before compaction is complete, there will be data left in these tables. Need to investigate if there are other situations where this may happen and address them.
> See HIVE-10595 for additional info.
[jira] [Commented] (HIVE-12863) fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union
[ https://issues.apache.org/jira/browse/HIVE-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103338#comment-15103338 ] Daniel Dai commented on HIVE-12863: --- This is broken by HIVE-12590. [~ashutoshc], can you check if this is the right fix? > fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union > - > > Key: HIVE-12863 > URL: https://issues.apache.org/jira/browse/HIVE-12863 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12863.01.patch, HIVE-12863.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12863) fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union
[ https://issues.apache.org/jira/browse/HIVE-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103349#comment-15103349 ] Daniel Dai commented on HIVE-12863: --- Missed the previous comments. If it is intended by HIVE-12590, and toLower should happen in storage layer, then I am fine with the fix. > fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union > - > > Key: HIVE-12863 > URL: https://issues.apache.org/jira/browse/HIVE-12863 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12863.01.patch, HIVE-12863.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103432#comment-15103432 ] Eugene Koifman commented on HIVE-12366: --- committed to master https://github.com/apache/hive/commit/aa0f8e062827245b05c353d02537e51b9957bf36 patch doesn't apply to branch-1/branch-2.0 > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, > HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, > HIVE-12366.15.patch, HIVE-12366.2.patch, HIVE-12366.3.patch, > HIVE-12366.4.patch, HIVE-12366.5.patch, HIVE-12366.6.patch, > HIVE-12366.7.patch, HIVE-12366.8.patch, HIVE-12366.9.patch > > > Currently there is a gap between the time locks acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause query fail since the locks are timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103382#comment-15103382 ] Eugene Koifman commented on HIVE-12366: --- +1 patch 15 > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, > HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, > HIVE-12366.15.patch, HIVE-12366.2.patch, HIVE-12366.3.patch, > HIVE-12366.4.patch, HIVE-12366.5.patch, HIVE-12366.6.patch, > HIVE-12366.7.patch, HIVE-12366.8.patch, HIVE-12366.9.patch > > > Currently there is a gap between the time locks acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause query fail since the locks are timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
[ https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103443#comment-15103443 ] Hive QA commented on HIVE-12657: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12782609/HIVE-12657.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9974 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-tez_joins_explain.q-vector_decimal_aggregate.q-vector_groupby_mapjoin.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-tez_self_join.q-schema_evol_text_nonvec_mapwork_table.q-orc_vectorization_ppd.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6650/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6650/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6650/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: 
TestsFailedException: 9 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12782609 - PreCommit-HIVE-TRUNK-Build > selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8 > --- > > Key: HIVE-12657 > URL: https://issues.apache.org/jira/browse/HIVE-12657 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Sergey Shelukhin > Attachments: HIVE-12657.01.patch, HIVE-12657.patch > > > Encountered this issue when analysing test failures of HIVE-12609. > selectDistinctStar.q produces the following diff when I ran with java version > "1.7.0_55" and java version "1.8.0_60" > {code} > < 128 val_128 128 > --- > > 128 128 val_128 > 1770c1770 > < 224 val_224 224 > --- > > 224 224 val_224 > 1776c1776 > < 369 val_369 369 > --- > > 369 369 val_369 > 1799,1810c1799,1810 > < 146 val_146 146 val_146 146 val_146 2008-04-08 11 > < 150 val_150 150 val_150 150 val_150 2008-04-08 11 > < 213 val_213 213 val_213 213 val_213 2008-04-08 11 > < 238 val_238 238 val_238 238 val_238 2008-04-08 11 > < 255 val_255 255 val_255 255 val_255 2008-04-08 11 > < 273 val_273 273 val_273 273 val_273 2008-04-08 11 > < 278 val_278 278 val_278 278 val_278 2008-04-08 11 > < 311 val_311 311 val_311 311 val_311 2008-04-08 11 > < 401 val_401 401 val_401 401 val_401 2008-04-08 11 > < 406 val_406 406 val_406 406 val_406 2008-04-08 11 > < 66val_66 66 val_66 66 val_66 2008-04-08 11 > < 98val_98 98 val_98 98 val_98 2008-04-08 11 > --- > > 146 val_146 2008-04-08 11 146 val_146 146 val_146 > > 150 val_150 2008-04-08 11 150 val_150 150 val_150 > > 213 val_213 2008-04-08 11 213 val_213 213 val_213 > > 238 val_238 2008-04-08 11 238 val_238 238 val_238 > > 255 val_255 2008-04-08 11 255 val_255 255 val_255 > > 273 val_273 2008-04-08 11 273 val_273 273 val_273 > > 278 val_278 2008-04-08 11 278 val_278 278 val_278 > > 311 val_311 2008-04-08 11 311 val_311 311 val_311 > > 401 val_401 2008-04-08 11 401 val_401 401 val_401 > > 406 val_406 2008-04-08 
11 406 val_406 406 val_406 > > 66val_66 2008-04-08 11 66 val_66 66 val_66 > > 98val_98 2008-04-08 11 98 val_98 98 val_98 > 4212c4212 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
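A commonly cited cause of golden-file diffs between JDK 7 and JDK 8 is iteration-order-sensitive code: java.util.HashMap's internal layout changed in JDK 8, so any output ordering derived from hash iteration can legally differ between the two runtimes. Whether that is the root cause of this particular diff is an assumption, not something the report states; the sketch below only illustrates why hash-ordered iteration makes golden files fragile and an explicit sort does not.

```python
# Iterating a hash-based container yields an implementation-defined order;
# a golden file is only stable if the producer sorts explicitly.
rows = {"128", "224", "369", "146", "150"}

unordered = list(rows)   # order is an implementation detail of the runtime
stable = sorted(rows)    # deterministic on every runtime
```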
[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-12366: - Attachment: HIVE-12366.branch-2.0.patch HIVE-12366.branch-1.patch [~ekoifman] Please find the attached patches for branch-1 and branch-2.0 > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, > HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, > HIVE-12366.15.patch, HIVE-12366.2.patch, HIVE-12366.3.patch, > HIVE-12366.4.patch, HIVE-12366.5.patch, HIVE-12366.6.patch, > HIVE-12366.7.patch, HIVE-12366.8.patch, HIVE-12366.9.patch, > HIVE-12366.branch-1.patch, HIVE-12366.branch-2.0.patch > > > Currently there is a gap between the time locks acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause query fail since the locks are timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11538) Add an option to skip init script while running tests
[ https://issues.apache.org/jira/browse/HIVE-11538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103507#comment-15103507 ] Lefty Leverenz commented on HIVE-11538: --- Removing the TODOC2.0 label, on the assumption that the documentation is correct and does not need any explanation of a profile-not-found warning emitted for "-Phadoop-2" -- that could be added easily enough, but the information is available here via the JIRA link in the doc. * [How do I modify the init script when testing? | https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ#HiveDeveloperFAQ-HowdoImodifytheinitscriptwhentesting?] Thanks [~sladymon], and thanks for the clarification [~sershe]. > Add an option to skip init script while running tests > - > > Key: HIVE-11538 > URL: https://issues.apache.org/jira/browse/HIVE-11538 > Project: Hive > Issue Type: Improvement > Components: Testing Infrastructure >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Fix For: 2.0.0 > > Attachments: HIVE-11538.2.patch, HIVE-11538.3.patch, HIVE-11538.patch > > > {{q_test_init.sql}} has grown over time. Now, it takes substantial amount of > time. When debugging a particular query which doesn't need such > initialization, this delay is annoyance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11538) Add an option to skip init script while running tests
[ https://issues.apache.org/jira/browse/HIVE-11538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-11538: -- Labels: (was: TODOC2.0) > Add an option to skip init script while running tests > - > > Key: HIVE-11538 > URL: https://issues.apache.org/jira/browse/HIVE-11538 > Project: Hive > Issue Type: Improvement > Components: Testing Infrastructure >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Fix For: 2.0.0 > > Attachments: HIVE-11538.2.patch, HIVE-11538.3.patch, HIVE-11538.patch > > > {{q_test_init.sql}} has grown over time. Now, it takes substantial amount of > time. When debugging a particular query which doesn't need such > initialization, this delay is annoyance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12809) Vectorization: fast-path for coalesce if input.noNulls = true
[ https://issues.apache.org/jira/browse/HIVE-12809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103543#comment-15103543 ] Hive QA commented on HIVE-12809: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12782629/HIVE-12809.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10021 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6652/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6652/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6652/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12782629 - PreCommit-HIVE-TRUNK-Build > Vectorization: fast-path for coalesce if input.noNulls = true > - > > Key: HIVE-12809 > URL: https://issues.apache.org/jira/browse/HIVE-12809 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-12809.1.patch, HIVE-12809.2.patch > > > Coalesce can skip processing other columns, if all the input columns are > non-null. > Possibly retaining, isRepeating=true. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
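The fast path this issue proposes — when the first input column vector has noNulls = true, coalesce can return it directly, skip evaluating the remaining inputs, and retain isRepeating — can be sketched over simplified column vectors. This is an illustrative model, not Hive's actual VectorCoalesce over ColumnVector batches.

```python
class ColVector:
    """Minimal stand-in for a vectorized column batch."""
    def __init__(self, values, no_nulls, is_repeating=False):
        self.values = values          # None marks a null when no_nulls is False
        self.no_nulls = no_nulls
        self.is_repeating = is_repeating

def coalesce(cols):
    first = cols[0]
    if first.no_nulls:
        # Fast path: the first input supplies every row, so the remaining
        # columns are never touched and isRepeating can be carried over.
        return ColVector(list(first.values), True, first.is_repeating)
    # Slow path: per-row scan for the first non-null value across inputs.
    out = []
    for i in range(len(first.values)):
        out.append(next((c.values[i] for c in cols if c.values[i] is not None), None))
    return ColVector(out, all(v is not None for v in out))
```

The win is that in the common all-non-null case the cost is independent of the number of coalesce arguments.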
[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103553#comment-15103553 ] Eugene Koifman commented on HIVE-12366: --- committed to branch-1 [~sershe] this wasn't marked for 2.0 but would be good to add. Is that OK? > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, > HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, > HIVE-12366.15.patch, HIVE-12366.2.patch, HIVE-12366.3.patch, > HIVE-12366.4.patch, HIVE-12366.5.patch, HIVE-12366.6.patch, > HIVE-12366.7.patch, HIVE-12366.8.patch, HIVE-12366.9.patch, > HIVE-12366.branch-1.patch, HIVE-12366.branch-2.0.patch > > > Currently there is a gap between the time locks acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause query fail since the locks are timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
[ https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103599#comment-15103599 ] Pengcheng Xiong commented on HIVE-12661: address golden file diff in encryption_join_with_different_encryption_keys and TestDbNotificationListener in the final patch. TestTxnCommands2 is not reproducible. Checked in master and branch-2.0. Will address follow up issues in the subtasks. Thanks [~ashutoshc] for the review. > StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly > --- > > Key: HIVE-12661 > URL: https://issues.apache.org/jira/browse/HIVE-12661 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12661.01.patch, HIVE-12661.02.patch, > HIVE-12661.03.patch, HIVE-12661.04.patch, HIVE-12661.05.patch, > HIVE-12661.06.patch, HIVE-12661.07.patch, HIVE-12661.08.patch, > HIVE-12661.09.patch, HIVE-12661.10.patch, HIVE-12661.11.patch, > HIVE-12661.12.patch > > > PROBLEM: > Hive stats are autogathered properly till an 'analyze table [tablename] > compute statistics for columns' is run. Then it does not auto-update the > stats till the command is run again. 
repo: > {code} > set hive.stats.autogather=true; > set hive.stats.atomic=false ; > set hive.stats.collect.rawdatasize=true ; > set hive.stats.collect.scancols=false ; > set hive.stats.collect.tablekeys=false ; > set hive.stats.fetch.column.stats=true; > set hive.stats.fetch.partition.stats=true ; > set hive.stats.reliable=false ; > set hive.compute.query.using.stats=true; > CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( > 'orc.compress'='NONE') ; > insert into calendar values (2010), (2011), (2012); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > ++--+ > select max(year) from calendar; > | 2012 | > insert into calendar values (2013); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > | 2013 | > ++--+ > select max(year) from calendar; > | 2013 | > insert into calendar values (2014); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > insert into calendar values (2015); > select max(year) from calendar; > | 2014 | > insert into calendar values (2016), (2017), (2018); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > select max(year) from calendar; > | 2018 | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
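The stale max(year) values in the repro above are the signature of stats-backed query answering without invalidation: once hive.compute.query.using.stats is on and column stats exist, the query is served from the stored stats, and if inserts do not reset the accuracy flag the answer lags the data until the next ANALYZE. The toy model below (illustrative names only, not Hive's implementation) reproduces that behavior.

```python
class TableStats:
    """Toy model of stats-backed answering with a missing invalidation."""
    def __init__(self):
        self.rows = []
        self.col_max = None
        self.stats_accurate = False   # analogous to COLUMN_STATS_ACCURATE

    def insert(self, value):
        self.rows.append(value)
        # The bug being modeled: the accuracy flag is NOT reset here, so the
        # optimizer keeps trusting column stats computed before this insert.

    def analyze(self):                # "analyze table ... for columns"
        self.col_max = max(self.rows)
        self.stats_accurate = True

    def query_max(self):              # "select max(year) from calendar"
        if self.stats_accurate:
            return self.col_max       # answered from stats, no scan
        return max(self.rows)         # full scan of the data

t = TableStats()
for v in (2010, 2011, 2012, 2013, 2014):
    t.insert(v)
t.analyze()
t.insert(2015)
stale = t.query_max()
```

The fix direction implied by the JIRA title is to manage the accuracy marker correctly, i.e. clear it on writes so the stats path is only taken when the stats actually reflect the data.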
[jira] [Commented] (HIVE-12863) fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union
[ https://issues.apache.org/jira/browse/HIVE-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103607#comment-15103607 ] Hive QA commented on HIVE-12863: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12782668/HIVE-12863.02.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 10006 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_decimal_round.q-cbo_windowing.q-tez_schema_evolution.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.alterTable org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.createTable org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.dropTable org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.skewInfo org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreCached.alterTable org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreCached.createTable org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreCached.dropTable org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.alterTable org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.createTable org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.dropTable org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.getAllDbs org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.getAllTables org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.getDbsRegex org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.grantRevokeTablePrivileges org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.listTableGrants org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.tableStatistics 
org.apache.hadoop.hive.metastore.hbase.TestStorageDescriptorSharing.createManyPartitions org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6654/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6654/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6654/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 23 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12782668 - PreCommit-HIVE-TRUNK-Build > fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union > - > > Key: HIVE-12863 > URL: https://issues.apache.org/jira/browse/HIVE-12863 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12863.01.patch, HIVE-12863.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-12366: -- Labels: TODOC1.3 (was: ) > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Wei Zheng >Assignee: Wei Zheng > Labels: TODOC1.3 > Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, > HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, > HIVE-12366.15.patch, HIVE-12366.2.patch, HIVE-12366.3.patch, > HIVE-12366.4.patch, HIVE-12366.5.patch, HIVE-12366.6.patch, > HIVE-12366.7.patch, HIVE-12366.8.patch, HIVE-12366.9.patch, > HIVE-12366.branch-1.patch, HIVE-12366.branch-2.0.patch > > > Currently there is a gap between the time locks acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause query fail since the locks are timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6806) CREATE TABLE should support STORED AS AVRO
[ https://issues.apache.org/jira/browse/HIVE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wziyong updated HIVE-6806: -- Assignee: Jeremy Beard (was: wziyong) > CREATE TABLE should support STORED AS AVRO > -- > > Key: HIVE-6806 > URL: https://issues.apache.org/jira/browse/HIVE-6806 > Project: Hive > Issue Type: New Feature > Components: Serializers/Deserializers >Affects Versions: 0.12.0 >Reporter: Jeremy Beard >Assignee: Jeremy Beard >Priority: Minor > Labels: Avro > Fix For: 0.14.0 > > Attachments: HIVE-6806.1.patch, HIVE-6806.2.patch, HIVE-6806.3.patch, > HIVE-6806.patch > > > Avro is well established and widely used within Hive, however creating > Avro-backed tables requires the messy listing of the SerDe, InputFormat and > OutputFormat classes. > Similarly to HIVE-5783 for Parquet, Hive would be easier to use if it had > native Avro support. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
[ https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12661: --- Fix Version/s: 2.1.0 2.0.0 > StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly > --- > > Key: HIVE-12661 > URL: https://issues.apache.org/jira/browse/HIVE-12661 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0, 1.2.1 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.0.0, 2.1.0 > > Attachments: HIVE-12661.01.patch, HIVE-12661.02.patch, > HIVE-12661.03.patch, HIVE-12661.04.patch, HIVE-12661.05.patch, > HIVE-12661.06.patch, HIVE-12661.07.patch, HIVE-12661.08.patch, > HIVE-12661.09.patch, HIVE-12661.10.patch, HIVE-12661.11.patch, > HIVE-12661.12.patch, HIVE-12661.final.patch > > > PROBLEM: > Hive stats are autogathered properly till an 'analyze table [tablename] > compute statistics for columns' is run. Then it does not auto-update the > stats till the command is run again. repo: > {code} > set hive.stats.autogather=true; > set hive.stats.atomic=false ; > set hive.stats.collect.rawdatasize=true ; > set hive.stats.collect.scancols=false ; > set hive.stats.collect.tablekeys=false ; > set hive.stats.fetch.column.stats=true; > set hive.stats.fetch.partition.stats=true ; > set hive.stats.reliable=false ; > set hive.compute.query.using.stats=true; > CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( > 'orc.compress'='NONE') ; > insert into calendar values (2010), (2011), (2012); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > ++--+ > select max(year) from calendar; > | 2012 | > insert into calendar values (2013); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > | 2013 | > 
++--+ > select max(year) from calendar; > | 2013 | > insert into calendar values (2014); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > insert into calendar values (2015); > select max(year) from calendar; > | 2014 | > insert into calendar values (2016), (2017), (2018); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > select max(year) from calendar; > | 2018 | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103626#comment-15103626 ] Lefty Leverenz commented on HIVE-12366: --- Doc note: This adds *hive.txn.heartbeat.threadpool.size* to HiveConf.java in 1.3.0 and 2.1.0 (and perhaps 2.0.0) so it needs to be documented in Configuration Properties and the Hive Transactions doc. * [Configuration Properties -- Transactions and Compactor | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-TransactionsandCompactor] * [Hive Transactions -- Configuration | https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration] > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Wei Zheng >Assignee: Wei Zheng > Labels: TODOC1.3 > Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, > HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, > HIVE-12366.15.patch, HIVE-12366.2.patch, HIVE-12366.3.patch, > HIVE-12366.4.patch, HIVE-12366.5.patch, HIVE-12366.6.patch, > HIVE-12366.7.patch, HIVE-12366.8.patch, HIVE-12366.9.patch, > HIVE-12366.branch-1.patch, HIVE-12366.branch-2.0.patch > > > Currently there is a gap between the time locks acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause query fail since the locks are timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12851) Add slider security setting support to LLAP packager
[ https://issues.apache.org/jira/browse/HIVE-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103299#comment-15103299 ] Hive QA commented on HIVE-12851: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12782677/HIVE-12851.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10019 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6649/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6649/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6649/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12782677 - PreCommit-HIVE-TRUNK-Build > Add slider security setting support to LLAP packager > > > Key: HIVE-12851 > URL: https://issues.apache.org/jira/browse/HIVE-12851 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12851.2.patch, HIVE-12851.patch > > > {noformat} > "slider.hdfs.keytab.dir": "...", > "slider.am.login.keytab.name": "...", > "slider.keytab.principal.name": "..." > {noformat} > should be emitted into appConfig.json for Slider AM. Right now, they have to > be added manually on a secure cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)