[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf
[ https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909298#comment-13909298 ] Lefty Leverenz commented on HIVE-6037: -- But wait! There's more: * HIVE-3635: hive.lazysimple.extended_boolean_literal Synchronize HiveConf with hive-default.xml.template and support show conf - Key: HIVE-6037 URL: https://issues.apache.org/jira/browse/HIVE-6037 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Assignee: Navis Priority: Minor Fix For: 0.13.0 Attachments: CHIVE-6037.3.patch.txt, HIVE-6037.1.patch.txt, HIVE-6037.10.patch.txt, HIVE-6037.11.patch.txt, HIVE-6037.12.patch.txt, HIVE-6037.14.patch.txt, HIVE-6037.15.patch.txt, HIVE-6037.2.patch.txt, HIVE-6037.4.patch.txt, HIVE-6037.5.patch.txt, HIVE-6037.6.patch.txt, HIVE-6037.7.patch.txt, HIVE-6037.8.patch.txt, HIVE-6037.9.patch.txt, HIVE-6037.patch see HIVE-5879 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5954) SQL std auth - get_privilege_set should check role hierarchy
[ https://issues.apache.org/jira/browse/HIVE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5954: Attachment: HIVE-5954.1.patch HIVE-5954.1.patch - With this change, the current roles used in SQL standard auth now also include roles in the hierarchy. SQL std auth - get_privilege_set should check role hierarchy Key: HIVE-5954 URL: https://issues.apache.org/jira/browse/HIVE-5954 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Attachments: HIVE-5954.1.patch Original Estimate: 24h Remaining Estimate: 24h A role can belong to another role. But get_privilege_set in hive metastore api checks only the privileges of the immediate roles a user belongs to. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5954) SQL std auth - get_privilege_set should check role hierarchy
[ https://issues.apache.org/jira/browse/HIVE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5954: Status: Patch Available (was: Open) SQL std auth - get_privilege_set should check role hierarchy Key: HIVE-5954 URL: https://issues.apache.org/jira/browse/HIVE-5954 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Attachments: HIVE-5954.1.patch Original Estimate: 24h Remaining Estimate: 24h A role can belong to another role. But get_privilege_set in hive metastore api checks only the privileges of the immediate roles a user belongs to. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6203) Privileges of role granted indirectly to user is not applied
[ https://issues.apache.org/jira/browse/HIVE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909304#comment-13909304 ] Thejas M Nair commented on HIVE-6203: - [~navis] I have uploaded a patch to HIVE-5954 that updates get_privilege_set to include privileges granted indirectly through roles. Can you please review that one? We won't need an additional thrift API with that patch. Privileges of role granted indirectly to user is not applied Key: HIVE-6203 URL: https://issues.apache.org/jira/browse/HIVE-6203 Project: Hive Issue Type: Bug Components: Authorization Reporter: Navis Assignee: Navis Attachments: HIVE-6203.1.patch.txt, HIVE-6203.2.patch.txt, HIVE-6203.3.patch.txt, HIVE-6203.4.patch.txt For example, {noformat} create role r1; create role r2; grant select on table eq to role r1; grant role r1 to role r2; grant role r2 to user admin; select * from eq limit 5; {noformat} admin - r2 - r1 - SEL on table eq, but user admin fails to access table eq -- This message was sent by Atlassian JIRA (v6.1.5#6160)
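The failing example above is a transitive grant chain (admin -> r2 -> r1 -> SEL on table eq), so checking only a user's immediate roles misses r1's privileges. A minimal sketch of resolving the full role closure with a breadth-first walk over the role-grant graph; names and data structures are illustrative only, not the metastore implementation:

```java
import java.util.*;

// Sketch: resolve a principal's full role set by walking the role-grant
// graph transitively (admin -> r2 -> r1), so privileges granted to r1
// become visible to admin. Illustrative names, not the metastore code.
public class RoleClosure {
    // grants maps a principal to the roles granted directly to it,
    // e.g. "grant role r1 to role r2" becomes r2 -> [r1].
    static Set<String> resolve(String principal, Map<String, List<String>> grants) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>(grants.getOrDefault(principal, List.of()));
        while (!queue.isEmpty()) {
            String role = queue.poll();
            if (seen.add(role)) {          // guard against cycles in role grants
                queue.addAll(grants.getOrDefault(role, List.of()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> grants = Map.of(
                "admin", List.of("r2"),
                "r2", List.of("r1"));
        System.out.println(resolve("admin", grants)); // prints [r2, r1]
    }
}
```

The cycle guard (the `seen` set) matters because role grants can form loops once nesting is allowed.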
Hive 0.9 error in HiveServer and DataNucleus
Hi, We ran into this problem with Hive 0.9.0 while trying to get table info from the HiveMetastore via HiveServer/Thrift. The problem seems to be fixed in Hive 0.11+, but our effort to locate the Jira has not been successful. I would appreciate it if someone familiar with this problem could point out the Jira/patch to us. We consistently hit this error when the number of calls to the HiveMetastore reaches the hive.metastore.server.min.threads setting in the server's hive-site.xml. Error: Hive Client side error: Exception in thread main MetaException(message:Got exception: org.apache.thrift.transport.TTransportException null) at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:785) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:615) Hive Server side error: 2014-02-13 08:28:27,866 INFO metastore.HiveMetaStore (HiveMetaStore.java:logInfo(385)) - 47: get_all_databases 2014-02-13 08:28:27,870 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(182)) - Error occurred during processing of message. 
javax.jdo.JDOFatalUserException: Persistence Manager has been closed at org.datanucleus.jdo.JDOPersistenceManager.assertIsOpen(JDOPersistenceManager.java:2088) at org.datanucleus.jdo.JDOPersistenceManager.currentTransaction(JDOPersistenceManager.java:305) at org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:294) at org.apache.hadoop.hive.metastore.ObjectStore.getDatabases(ObjectStore.java:488) at org.apache.hadoop.hive.metastore.ObjectStore.getAllDatabases(ObjectStore.java:522) at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37) at java.lang.reflect.Method.invoke(Method.java:611) at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111) at $Proxy1.getAllDatabases(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_all_databases(HiveMetaStore.java:660) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_all_databases.getResult(ThriftHiveMetastore.java:4749) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_all_databases.getResult(ThriftHiveMetastore.java:4737) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34) at org.apache.hadoop.hive.metastore.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:48) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919) at java.lang.Thread.run(Thread.java:738) Thanks for your help, Tuong
[jira] [Updated] (HIVE-5954) SQL std auth - get_privilege_set should check role hierarchy
[ https://issues.apache.org/jira/browse/HIVE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5954: Attachment: HIVE-5954.2.patch HIVE-5954.2.patch - sorting show roles and show current roles output for deterministic results. SQL std auth - get_privilege_set should check role hierarchy Key: HIVE-5954 URL: https://issues.apache.org/jira/browse/HIVE-5954 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Attachments: HIVE-5954.1.patch, HIVE-5954.2.patch Original Estimate: 24h Remaining Estimate: 24h A role can belong to another role. But get_privilege_set in hive metastore api checks only the privileges of the immediate roles a user belongs to. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5235) Infinite loop with ORC file and Hive 0.11
[ https://issues.apache.org/jira/browse/HIVE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909321#comment-13909321 ] Iván de Prado commented on HIVE-5235: - The same issue appeared in a cluster with the following software: {noformat} Linux version 2.6.32-431.3.1.el6.x86_64 (mockbu...@x86-003.build.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Dec 13 06:58:20 EST 2013 java version 1.6.0_30 OpenJDK Runtime Environment (IcedTea6 1.13.1) (rhel-3.1.13.1.el6_5-x86_64) OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode) CDH 5 beta2 {noformat} The exception I got when reading is: {noformat} 2014-02-22 09:47:44,912 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 3 kind DATA position: 10739597 length: 10739597 range: 0 offset: 24245452 limit: 24245452 range 0 = 0 to 10739597 uncompressed: 62451 to 62451 at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:46) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:287) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:422) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1193) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2240) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:105) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:56) at org.apache.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:182) at com.datasalt.pangool.tuplemr.mapred.lib.input.HCatTupleInputFormat$1.nextKeyValue(HCatTupleInputFormat.java:159) at 
com.datasalt.pangool.tuplemr.mapred.lib.input.DelegatingRecordReader.nextKeyValue(DelegatingRecordReader.java:89) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533) at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at com.datasalt.pangool.tuplemr.mapred.lib.input.DelegatingMapper.run(DelegatingMapper.java:50) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) {noformat} Infinite loop with ORC file and Hive 0.11 - Key: HIVE-5235 URL: https://issues.apache.org/jira/browse/HIVE-5235 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Environment: Gentoo linux with Hortonworks Hadoop hadoop-1.1.2.23.tar.gz and Apache Hive 0.11d Reporter: Iván de Prado Priority: Blocker Attachments: gendata.py We are using Hive 0.11 with ORC file format and we get some tasks blocked in some kind of infinite loop. They keep working indefinitely when we set a huge task expiry timeout. If we set the expiry time to 600 seconds, the tasks fail for not reporting progress and, finally, the job fails. That is not consistent, and sometimes the behavior changes between job executions. It happens for different queries. We are using Hive 0.11 with Hadoop hadoop-1.1.2.23 from Hortonworks. The task that is blocked keeps consuming 100% of CPU usage, and the stack trace is consistently the same. Everything points to some kind of infinite loop. 
My guess is that it is related to the ORC file. Maybe some pointer is written incorrectly, causing some kind of infinite loop when reading. Or maybe there is a bug in the reading stage. More information below. The stack trace: {noformat} main prio=10 tid=0x7f20a000a800 nid=0x1ed2 runnable [0x7f20a8136000] java.lang.Thread.State: RUNNABLE at java.util.zip.Inflater.inflateBytes(Native Method) at java.util.zip.Inflater.inflate(Inflater.java:256) - locked 0xf42a6ca0 (a java.util.zip.ZStreamRef) at org.apache.hadoop.hive.ql.io.orc.ZlibCodec.decompress(ZlibCodec.java:64) at
[jira] [Commented] (HIVE-5235) Infinite loop with ORC file and Hive 0.11
[ https://issues.apache.org/jira/browse/HIVE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909323#comment-13909323 ] Iván de Prado commented on HIVE-5235: - Sorry, the Java version is: {noformat} java version 1.6.0_31 Java(TM) SE Runtime Environment (build 1.6.0_31-b04) Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) {noformat} Infinite loop with ORC file and Hive 0.11 - Key: HIVE-5235 URL: https://issues.apache.org/jira/browse/HIVE-5235 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Environment: Gentoo linux with Hortonworks Hadoop hadoop-1.1.2.23.tar.gz and Apache Hive 0.11d Reporter: Iván de Prado Priority: Blocker Attachments: gendata.py We are using Hive 0.11 with ORC file format and we get some tasks blocked in some kind of infinite loop. They keep working indefinitely when we set a huge task expiry timeout. If we set the expiry time to 600 seconds, the tasks fail for not reporting progress and, finally, the job fails. That is not consistent, and sometimes the behavior changes between job executions. It happens for different queries. We are using Hive 0.11 with Hadoop hadoop-1.1.2.23 from Hortonworks. The task that is blocked keeps consuming 100% of CPU usage, and the stack trace is consistently the same. Everything points to some kind of infinite loop. My guess is that it is related to the ORC file. Maybe some pointer is written incorrectly, causing some kind of infinite loop when reading. Or maybe there is a bug in the reading stage. More information below. 
The stack trace: {noformat} main prio=10 tid=0x7f20a000a800 nid=0x1ed2 runnable [0x7f20a8136000] java.lang.Thread.State: RUNNABLE at java.util.zip.Inflater.inflateBytes(Native Method) at java.util.zip.Inflater.inflate(Inflater.java:256) - locked 0xf42a6ca0 (a java.util.zip.ZStreamRef) at org.apache.hadoop.hive.ql.io.orc.ZlibCodec.decompress(ZlibCodec.java:64) at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:128) at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:143) at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readVulong(SerializationUtils.java:54) at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readVslong(SerializationUtils.java:65) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.readValues(RunLengthIntegerReader.java:66) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.next(RunLengthIntegerReader.java:81) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:332) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:802) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1214) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:71) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:46) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274) at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101) at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108) at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:300) at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236) - eliminated 0xe1459700 (a org.apache.hadoop.mapred.MapTask$TrackedRecordReader) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216) - locked 0xe1459700 (a org.apache.hadoop.mapred.MapTask$TrackedRecordReader) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1178) at org.apache.hadoop.mapred.Child.main(Child.java:249)
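Both stack traces above end in run-length integer decoding (readValues / readVulong). A varint-style decoder loops on a continuation bit, so a corrupt or truncated stream can either spin forever (the reported hang) or, with an end-of-stream check, fail fast with the "Read past end of RLE integer" EOFException from the other report. A minimal sketch of such a decoder with the guard, assuming a LEB128-style encoding; this is not the ORC implementation:

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Sketch of the failure mode discussed above: a varint decoder loops
// reading bytes until the continuation bit clears. Without the EOF
// guard, a corrupt stream that never clears the bit can loop forever;
// with it, decoding fails fast. Illustrative only, not the ORC code.
public class VarintReader {
    static long readVulong(InputStream in) throws IOException {
        long result = 0;
        int shift = 0;
        int b;
        do {
            b = in.read();
            if (b < 0) {
                // The guard that turns a potential hang into a clean error.
                throw new EOFException("Read past end of varint");
            }
            result |= (long) (b & 0x7f) << shift;  // low 7 bits are payload
            shift += 7;
        } while ((b & 0x80) != 0);                 // high bit = "more bytes follow"
        return result;
    }

    public static void main(String[] args) throws IOException {
        // 300 encodes as 0xAC 0x02 in LEB128.
        InputStream ok = new ByteArrayInputStream(new byte[]{(byte) 0xAC, 0x02});
        System.out.println(readVulong(ok)); // prints 300
    }
}
```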
[jira] [Commented] (HIVE-5217) Add long polling to asynchronous execution in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909361#comment-13909361 ] Hive QA commented on HIVE-5217: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12630053/HIVE-5217.6.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5173 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hcatalog.hbase.TestHBaseDirectOutputFormat.org.apache.hcatalog.hbase.TestHBaseDirectOutputFormat {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1439/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1439/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12630053 Add long polling to asynchronous execution in HiveServer2 - Key: HIVE-5217 URL: https://issues.apache.org/jira/browse/HIVE-5217 Project: Hive Issue Type: Sub-task Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5217.2.patch, HIVE-5217.3.patch, HIVE-5217.4.patch, HIVE-5217.5.patch, HIVE-5217.6.patch, HIVE-5217.D12801.2.patch, HIVE-5217.D12801.3.patch, HIVE-5217.D12801.4.patch, HIVE-5217.D12801.5.patch, HIVE-5217.D12801.6.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. The client gets an operation handle which it can poll to check on the operation status. 
However, the polling frequency is entirely left to the client which can be resource inefficient. Long polling will solve this, by blocking the client request to check the operation status for a configurable amount of time (a new HS2 config) if the data is not available, but responding immediately if the data is available. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
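The long-polling behavior described above can be sketched with a lock and condition variable: the status call blocks for up to a configurable timeout when no result is ready, and returns immediately once one is available. A hypothetical illustration, not the actual HiveServer2 code:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of long polling: poll() blocks up to timeoutMs if the operation
// status is not yet available, but returns as soon as complete() runs.
// Illustrative only; names do not come from HiveServer2.
public class LongPollStatus {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition ready = lock.newCondition();
    private volatile String status; // null until the operation finishes

    public void complete(String result) {
        lock.lock();
        try {
            status = result;
            ready.signalAll();          // wake any blocked pollers immediately
        } finally {
            lock.unlock();
        }
    }

    // Returns the status, waiting at most timeoutMs if it is not yet set.
    public String poll(long timeoutMs) throws InterruptedException {
        lock.lock();
        try {
            long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            while (status == null && nanos > 0) {
                nanos = ready.awaitNanos(nanos);   // loop guards spurious wakeups
            }
            return status;              // may still be null: caller polls again
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        LongPollStatus op = new LongPollStatus();
        new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) {}
            op.complete("FINISHED");
        }).start();
        System.out.println(op.poll(1000)); // blocks ~50 ms, then prints FINISHED
    }
}
```

Compared with a fixed client-side polling interval, this keeps the server in control: one blocked request per client instead of many short ones.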
[jira] [Created] (HIVE-6487) PTest2 do not copy failed source directories
Brock Noland created HIVE-6487: -- Summary: PTest2 do not copy failed source directories Key: HIVE-6487 URL: https://issues.apache.org/jira/browse/HIVE-6487 Project: Hive Issue Type: Bug Reporter: Brock Noland Right now we copy the entire source directory for failed tests back to the master (up to 5). They are about 10GB each, so it takes a very long time. We should remove this feature. Remove the cp command from batch-exec.vm: https://github.com/apache/hive/blob/trunk/testutils/ptest2/src/main/resources/batch-exec.vm#L91 also don't publish the number of failed tests as a template variable: -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Precommit queue
Prices have been stable but they do seem to be having capacity problems[1], as I have been unable to get machines when bidding $0.70/hr, which is 10x the normal price. However, there is one item which is taking time for no good reason: https://issues.apache.org/jira/browse/HIVE-6487. I am going to put a break fix in for that this morning. Brock [1] An economically inclined observer would expect lower capacity to result in increased prices, but that is not always the case. Whatever black magic they have running the spot market seems to have a lot of non-market logic. On Fri, Feb 21, 2014 at 10:40 PM, Thejas Nair the...@hortonworks.com wrote: Hi Brock, Do you know why the tests are taking almost twice as long in recent runs ? Is it related to the ec2 spot price spikes ? Thanks, Thejas On Fri, Feb 21, 2014 at 7:11 AM, Brock Noland br...@cloudera.com wrote: There was an ec2 spot price spike overnight which, combined with everyone trying to get patches in for the branching, has resulted in a massive queue: http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/ ~25 builds in the queue Brock -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
Re: Precommit queue
In other news, TestBeeLineWithArgs created a 9GB hive.log before failing in the last run: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-1440/failed/TestBeeLineWithArgs/ On Sat, Feb 22, 2014 at 10:23 AM, Brock Noland br...@cloudera.com wrote: Prices have been stable but they do seem to be having capacity problems[1], as I have been unable to get machines when bidding $0.70/hr, which is 10x the normal price. However, there is one item which is taking time for no good reason: https://issues.apache.org/jira/browse/HIVE-6487. I am going to put a break fix in for that this morning. Brock [1] An economically inclined observer would expect lower capacity to result in increased prices, but that is not always the case. Whatever black magic they have running the spot market seems to have a lot of non-market logic. On Fri, Feb 21, 2014 at 10:40 PM, Thejas Nair the...@hortonworks.com wrote: Hi Brock, Do you know why the tests are taking almost twice as long in recent runs ? Is it related to the ec2 spot price spikes ? Thanks, Thejas On Fri, Feb 21, 2014 at 7:11 AM, Brock Noland br...@cloudera.com wrote: There was an ec2 spot price spike overnight which, combined with everyone trying to get patches in for the branching, has resulted in a massive queue: http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/ ~25 builds in the queue Brock 
-- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Commented] (HIVE-5232) Make JDBC use the new HiveServer2 async execution API by default
[ https://issues.apache.org/jira/browse/HIVE-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909426#comment-13909426 ] Hive QA commented on HIVE-5232: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12630024/HIVE-5232.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5175 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1440/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1440/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12630024 Make JDBC use the new HiveServer2 async execution API by default Key: HIVE-5232 URL: https://issues.apache.org/jira/browse/HIVE-5232 Project: Hive Issue Type: Sub-task Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5232.1.patch, HIVE-5232.2.patch, HIVE-5232.3.patch HIVE-4617 provides support for async execution in HS2. There are some proposed improvements in followup JIRAs: HIVE-5217 HIVE-5229 HIVE-5230 HIVE-5441 There is also [HIVE-5060] which assumes execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This can enable the server to report errors back to the client sooner. 
It can also be useful in cases where a statement.cancel is done in a different thread - the original thread will now be able to detect the cancel, as opposed to the use of the blocking execute calls, in which statement.cancel will be a no-op. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
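The cancel scenario in the last paragraph can be sketched as a client-side polling loop that observes a cancel flag set from another thread, which a single blocking execute call could never do. Hypothetical names throughout; this is not the JDBC driver code:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Sketch of why async execution makes cancel() effective: the client
// loops on the operation status between short sleeps, so a cancel flag
// set from another thread is noticed. A blocking execute() call has no
// such checkpoint. Illustrative only.
public class AsyncCancelDemo {
    final AtomicBoolean cancelled = new AtomicBoolean(false);

    public void cancel() { cancelled.set(true); } // called from another thread

    // Polls until done or cancelled; returns the final state.
    public String waitForCompletion(Supplier<Boolean> isDone, long pollIntervalMs)
            throws InterruptedException {
        while (!isDone.get()) {
            if (cancelled.get()) {
                return "CANCELED";   // a blocking execute() could never reach here
            }
            Thread.sleep(pollIntervalMs);
        }
        return "FINISHED";
    }

    public static void main(String[] args) throws Exception {
        AsyncCancelDemo demo = new AsyncCancelDemo();
        demo.cancel();               // simulate cancel from a second thread
        System.out.println(demo.waitForCompletion(() -> false, 1)); // prints CANCELED
    }
}
```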
[jira] [Created] (HIVE-6488) Investigate TestBeeLineWithArgs
Brock Noland created HIVE-6488: -- Summary: Investigate TestBeeLineWithArgs Key: HIVE-6488 URL: https://issues.apache.org/jira/browse/HIVE-6488 Project: Hive Issue Type: Bug Reporter: Brock Noland Priority: Blocker TestBeeLineWithArgs started taking many, many hours and eventually timing out, which is one cause of precommit runs taking a long time. For now I have skipped it for precommit tests, so we should figure out what is going on and re-enable the test. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Precommit queue
Looking more into that test... I decided to skip it and open a blocker to look at it: https://issues.apache.org/jira/browse/HIVE-6488 On Sat, Feb 22, 2014 at 10:26 AM, Brock Noland br...@cloudera.com wrote: In other news, TestBeeLineWithArgs created a 9GB hive.log before failing in the last run: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-1440/failed/TestBeeLineWithArgs/ On Sat, Feb 22, 2014 at 10:23 AM, Brock Noland br...@cloudera.com wrote: Prices have been stable but they do seem to be having capacity problems[1], as I have been unable to get machines when bidding $0.70/hr, which is 10x the normal price. However, there is one item which is taking time for no good reason: https://issues.apache.org/jira/browse/HIVE-6487. I am going to put a break fix in for that this morning. Brock [1] An economically inclined observer would expect lower capacity to result in increased prices, but that is not always the case. Whatever black magic they have running the spot market seems to have a lot of non-market logic. On Fri, Feb 21, 2014 at 10:40 PM, Thejas Nair the...@hortonworks.com wrote: Hi Brock, Do you know why the tests are taking almost twice as long in recent runs ? Is it related to the ec2 spot price spikes ? Thanks, Thejas On Fri, Feb 21, 2014 at 7:11 AM, Brock Noland br...@cloudera.com wrote: There was an ec2 spot price spike overnight which, combined with everyone trying to get patches in for the branching, has resulted in a massive queue: http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/ ~25 builds in the queue Brock
-- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
Re: Review Request 18202: HiveServer2 running in http mode should support doAs functionality
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18202/#review35218 --- service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java https://reviews.apache.org/r/18202/#comment65672 This check looks unnecessary: it returns the same object irrespective of the value. - Thejas Nair On Feb. 18, 2014, 3:02 a.m., Vaibhav Gumashta wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18202/ --- (Updated Feb. 18, 2014, 3:02 a.m.) Review request for hive and Thejas Nair. Bugs: HIVE-6306 https://issues.apache.org/jira/browse/HIVE-6306 Repository: hive-git Description --- HiveServer2 running in http mode should support doAs functionality Diffs - service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java d8ba3aa service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java PRE-CREATION service/src/java/org/apache/hive/service/auth/HttpCLIServiceProcessor.java PRE-CREATION service/src/java/org/apache/hive/service/auth/HttpCLIServiceUGIProcessor.java PRE-CREATION service/src/java/org/apache/hive/service/cli/session/SessionManager.java bfe0e7b service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java a6ff6ce service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java e77f043 shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 9e9a60d Diff: https://reviews.apache.org/r/18202/diff/ Testing --- Thanks, Vaibhav Gumashta
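The review comment above flags a conditional in HttpAuthHelper whose branches return the same object. A contrived before/after illustration of that pattern; this is not the actual HttpAuthHelper code:

```java
// Contrived illustration of the pattern flagged in the review: both
// branches return the same object, so the condition is dead weight.
// Hypothetical names; not the code under review.
public class RedundantCheck {
    static final Object SHARED = new Object();

    static Object before(boolean flag) {
        if (flag) {
            return SHARED;   // same object either way,
        } else {
            return SHARED;   // so the check adds nothing
        }
    }

    static Object after(boolean flag) {
        return SHARED;       // equivalent, and clearer
    }

    public static void main(String[] args) {
        // Both methods return the same instance regardless of the argument.
        System.out.println(before(true) == after(false)); // prints true
    }
}
```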
[jira] [Assigned] (HIVE-6487) PTest2 do not copy failed source directories
[ https://issues.apache.org/jira/browse/HIVE-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho reassigned HIVE-6487: --- Assignee: Szehon Ho PTest2 do not copy failed source directories Key: HIVE-6487 URL: https://issues.apache.org/jira/browse/HIVE-6487 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Szehon Ho Right now we copy the entire source directory for failed tests back to the master (up to 5). They are about 10GB each, so it takes a very long time. We should remove this feature. Remove the cp command from batch-exec.vm: https://github.com/apache/hive/blob/trunk/testutils/ptest2/src/main/resources/batch-exec.vm#L91 also don't publish the number of failed tests as a template variable: -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Precommit queue
FYI the changes 1) excluding TestBeeLineWithArgs and 2) not copying the source dir on failed tests have improved run times by ~1.5 hours. On Sat, Feb 22, 2014 at 10:34 AM, Brock Noland br...@cloudera.com wrote: Looking more into that test... I decided to skip it and open a blocker to look at it: https://issues.apache.org/jira/browse/HIVE-6488 On Sat, Feb 22, 2014 at 10:26 AM, Brock Noland br...@cloudera.com wrote: In other news, TestBeeLineWithArgs created a 9GB hive.log before failing in the last run: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-1440/failed/TestBeeLineWithArgs/ On Sat, Feb 22, 2014 at 10:23 AM, Brock Noland br...@cloudera.com wrote: Prices have been stable but they do seem to be having capacity problems[1], as I have been unable to get machines when bidding $0.70/hr, which is 10x the normal price. However, there is one item which is taking time for no good reason: https://issues.apache.org/jira/browse/HIVE-6487. I am going to put a break fix in for that this morning. Brock [1] An economically inclined observer would expect lower capacity to result in increased prices, but that is not always the case. Whatever black magic they have running the spot market seems to have a lot of non-market logic. On Fri, Feb 21, 2014 at 10:40 PM, Thejas Nair the...@hortonworks.com wrote: Hi Brock, Do you know why the tests are taking almost twice as long in recent runs ? Is it related to the ec2 spot price spikes ? 
Thanks, Thejas On Fri, Feb 21, 2014 at 7:11 AM, Brock Noland br...@cloudera.com wrote: There was an ec2 spot price spike overnight which, combined with everyone trying to get patches in for the branching, has resulted in a massive queue: http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/ ~25 builds in the queue Brock -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Commented] (HIVE-5761) Implement vectorized support for the DATE data type
[ https://issues.apache.org/jira/browse/HIVE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909509#comment-13909509 ] Hive QA commented on HIVE-5761: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12630055/HIVE-5761.3.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5192 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_date_funcs org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1443/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1443/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12630055 Implement vectorized support for the DATE data type --- Key: HIVE-5761 URL: https://issues.apache.org/jira/browse/HIVE-5761 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-5761.1.patch, HIVE-5761.2.patch, HIVE-5761.3.patch Add support to allow queries referencing DATE columns and expression results to run efficiently in vectorized mode. This should re-use the code for the integer/timestamp types to the extent possible and beneficial. Include unit tests and end-to-end tests. Consider re-using or extending existing end-to-end tests for vectorized integer and/or timestamp operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6459) Change the precision/scale for intermediate sum result in the avg() udf
[ https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909547#comment-13909547 ] Hive QA commented on HIVE-6459: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12630096/HIVE-6459.2.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5175 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1444/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1444/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12630096 Change the precision/scale for intermediate sum result in the avg() udf --- Key: HIVE-6459 URL: https://issues.apache.org/jira/browse/HIVE-6459 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-6459.1.patch, HIVE-6459.2.patch, HIVE-6459.patch The avg() udf, when applied to a decimal column, selects the precision/scale of the intermediate sum field as (p+4, s+4), which is the same as the precision/scale of the avg() result. However, the additional scale increase is unnecessary and may cause the intermediate sum to overflow. The requested change is that for the intermediate sum result, the precision/scale is set to (p+10, s), which is consistent with the sum() udf. The avg() result still keeps its precision/scale.
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
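The difference between the two typing rules can be sketched as follows. This is a minimal Python model of the precision/scale arithmetic described above, not Hive's actual avg() UDAF code; the 38-digit cap reflects Hive's maximum decimal precision.

```python
HIVE_MAX_PRECISION = 38  # Hive caps decimal precision at 38 digits

def old_sum_type(p, s):
    # Pre-patch rule: the intermediate sum uses (p+4, s+4), the same
    # type as the final avg() result, spending 4 of the extra digits
    # on scale rather than on integer headroom.
    return min(p + 4, HIVE_MAX_PRECISION), min(s + 4, HIVE_MAX_PRECISION)

def new_sum_type(p, s):
    # Proposed rule: add 10 digits of integer headroom and keep the
    # scale, matching what the sum() UDF already does.
    return min(p + 10, HIVE_MAX_PRECISION), s
```

For a decimal(10,2) column, the old rule gives the sum type decimal(14,6), leaving only 8 integer digits for the running total, while the new rule gives decimal(20,2), leaving 18.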
[jira] [Updated] (HIVE-5217) Add long polling to asynchronous execution in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5217: Resolution: Fixed Release Note: Changed to use long polling as described in the description. Adds the hive.server2.long.polling.timeout configuration parameter, which controls how long the long poll waits. Most users will not need to change this parameter. Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks for the contribution Vaibhav! Can you see if the release note looks OK? Please feel free to edit it. Add long polling to asynchronous execution in HiveServer2 - Key: HIVE-5217 URL: https://issues.apache.org/jira/browse/HIVE-5217 Project: Hive Issue Type: Sub-task Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5217.2.patch, HIVE-5217.3.patch, HIVE-5217.4.patch, HIVE-5217.5.patch, HIVE-5217.6.patch, HIVE-5217.D12801.2.patch, HIVE-5217.D12801.3.patch, HIVE-5217.D12801.4.patch, HIVE-5217.D12801.5.patch, HIVE-5217.D12801.6.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. The client gets an operation handle which it can poll to check on the operation status. However, the polling frequency is entirely left to the client, which can be resource inefficient. Long polling solves this by blocking the client's status request for a configurable amount of time (a new HS2 config) if the data is not available, but responding immediately if the data is available. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
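The long-polling behavior can be sketched like this. It is a toy Python model of the idea, not the HS2 Thrift API; the timeout parameter here plays the role of hive.server2.long.polling.timeout.

```python
import threading

class ToyAsyncOperation:
    """Toy stand-in for an HS2 async operation handle (not the real API)."""

    def __init__(self):
        self._done = threading.Event()

    def complete(self):
        # Called by the worker thread when the operation finishes.
        self._done.set()

    def get_status(self, long_polling_timeout=5.0):
        # Long poll: block up to the configured timeout while the
        # operation is still running, but return as soon as it finishes.
        finished = self._done.wait(timeout=long_polling_timeout)
        return "FINISHED" if finished else "RUNNING"
```

A client loop calling get_status() repeatedly now waits inside the server for up to the timeout per call, instead of issuing a tight stream of status requests.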
[jira] [Created] (HIVE-6489) Data loaded with LOAD DATA LOCAL INPATH has incorrect group ownership
Joseph Warren Rao IV created HIVE-6489: -- Summary: Data loaded with LOAD DATA LOCAL INPATH has incorrect group ownership Key: HIVE-6489 URL: https://issues.apache.org/jira/browse/HIVE-6489 Project: Hive Issue Type: Bug Components: Authorization, Clients, Import/Export Affects Versions: 0.12.0, 0.11.0, 0.10.0, 0.9.0 Environment: OS and hardware are irrelevant. Tested and reproduced on multiple configurations, including SLES, RHEL, VM, Teradata Hadoop Appliance, HDP 1.1, HDP 1.3.2, HDP 2.0. Reporter: Joseph Warren Rao IV Priority: Minor Data uploaded by a user via the Hive client with the LOAD DATA LOCAL INPATH method will have the group ownership of hdfs://tmp/hive-user instead of the primary group that the user belongs to. The group ownership of hdfs://tmp/hive-user is, by default, the group that the user running the hadoop daemons runs under. This means that, on a Hadoop system with default file permissions of 770, any data loaded into Hive via the LOAD DATA LOCAL INPATH method by one user cannot be seen by another user in the same group until the group ownership is manually changed in Hive's internal directory, or the group ownership is manually changed on hdfs://tmp/hive-user. This problem is not present with the LOAD DATA INPATH method, or with regular HDFS loads. Steps to reproduce the problem on a pseudo-distributed Hadoop cluster: - In hdfs-site.xml, modify the umask to 007 (meaning that default permissions on files are 770). The property changes names in Hadoop 2.0 but used to be called dfs.umaskmode. - Restart hdfs. - Create a group called testgroup. - Create two users that have testgroup as their primary group. Call them testuser1 and testuser2. - Create a test file containing "Hello World" and call it test.txt. It should be stored on the local filesystem. - Create a table called testtable in Hive using testuser1. Give it a single string column, textfile format, comma-delimited fields. 
- Have testuser1 use the LOAD DATA LOCAL INPATH command to load test.txt into testtable. - Attempt to read testtable using testuser2. The read will fail on a permissions error, when it should not. - Examine the contents of the hdfs://apps/hive/warehouse/testtable directory. The file will belong to the hadoop, users, or an analogous group instead of the correct group, testgroup. It will have correct permissions of 770. - Change the group ownership of the folder hdfs://tmp/hive-testuser1 to testgroup. - Repeat the data load. testuser2 will now be able to correctly read the data, and the file will have the correct group ownership. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
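The failing read in the steps above comes down to a plain POSIX-style group check. A hedged sketch (hypothetical helper, not HDFS code) of why mode 770 plus the wrong group denies testuser2:

```python
def can_read(file_group, file_mode, user_groups):
    # With mode 770 the "other" bits are zero, so a non-owner can read
    # the file only by sharing its group. A file that inherited the
    # staging directory's group (e.g. "hadoop") is therefore unreadable
    # to a user whose groups only include "testgroup".
    if file_mode & 0o004:                 # world-readable
        return True
    group_read = bool(file_mode & 0o040)  # group read bit
    return group_read and file_group in user_groups
```

Once the staging directory's group is changed to testgroup, the loaded file carries the right group and the same check succeeds.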
[jira] [Updated] (HIVE-6489) Data loaded with LOAD DATA LOCAL INPATH has incorrect group ownership
[ https://issues.apache.org/jira/browse/HIVE-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Rao updated HIVE-6489: -- Component/s: (was: Authorization) (was: Clients) Data loaded with LOAD DATA LOCAL INPATH has incorrect group ownership - Key: HIVE-6489 URL: https://issues.apache.org/jira/browse/HIVE-6489 Project: Hive Issue Type: Bug Components: Import/Export Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0 Environment: OS and hardware are irrelevant. Tested and reproduced on multiple configurations, including SLES, RHEL, VM, Teradata Hadoop Appliance, HDP 1.1, HDP 1.3.2, HDP 2.0. Reporter: Joe Rao Priority: Minor Original Estimate: 24h Remaining Estimate: 24h Data uploaded by a user via the Hive client with the LOAD DATA LOCAL INPATH method will have the group ownership of hdfs://tmp/hive-user instead of the primary group that the user belongs to. The group ownership of hdfs://tmp/hive-user is, by default, the group that the user running the hadoop daemons runs under. This means that, on a Hadoop system with default file permissions of 770, any data loaded into Hive via the LOAD DATA LOCAL INPATH method by one user cannot be seen by another user in the same group until the group ownership is manually changed in Hive's internal directory, or the group ownership is manually changed on hdfs://tmp/hive-user. This problem is not present with the LOAD DATA INPATH method, or with regular HDFS loads. Steps to reproduce the problem on a pseudo-distributed Hadoop cluster: - In hdfs-site.xml, modify the umask to 007 (meaning that default permissions on files are 770). The property changes names in Hadoop 2.0 but used to be called dfs.umaskmode. - Restart hdfs. - Create a group called testgroup. - Create two users that have testgroup as their primary group. Call them testuser1 and testuser2. - Create a test file containing "Hello World" and call it test.txt. It should be stored on the local filesystem. 
- Create a table called testtable in Hive using testuser1. Give it a single string column, textfile format, comma-delimited fields. - Have testuser1 use the LOAD DATA LOCAL INPATH command to load test.txt into testtable. - Attempt to read testtable using testuser2. The read will fail on a permissions error, when it should not. - Examine the contents of the hdfs://apps/hive/warehouse/testtable directory. The file will belong to the hadoop, users, or an analogous group instead of the correct group, testgroup. It will have correct permissions of 770. - Change the group ownership of the folder hdfs://tmp/hive-testuser1 to testgroup. - Repeat the data load. testuser2 will now be able to correctly read the data, and the file will have the correct group ownership. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6473) Allow writing HFiles via HBaseStorageHandler table
[ https://issues.apache.org/jira/browse/HIVE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909582#comment-13909582 ] Hive QA commented on HIVE-6473: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12630117/HIVE-6473.0.patch.txt {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5178 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_bulk org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_handler_bulk org.apache.hadoop.hive.cli.TestHBaseMinimrCliDriver.testCliDriver_hbase_bulk org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver_generatehfiles_require_family_path {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1446/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1446/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12630117 Allow writing HFiles via HBaseStorageHandler table -- Key: HIVE-6473 URL: https://issues.apache.org/jira/browse/HIVE-6473 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Nick Dimiduk Assignee: Nick Dimiduk Attachments: HIVE-6473.0.patch.txt Generating HFiles for bulkload into HBase could be more convenient. Right now we require the user to register a new table with the appropriate output format. This patch allows the exact same functionality, but through an existing table managed by the HBaseStorageHandler. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14950/#review35234 --- jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java https://reviews.apache.org/r/14950/#comment65691 I think we should retry on exceptions that indicate a network connection error. TProtocolException seems to be the exception that is thrown in such cases. - Thejas Nair On Feb. 20, 2014, 9:18 a.m., Vaibhav Gumashta wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14950/ --- (Updated Feb. 20, 2014, 9:18 a.m.) Review request for hive and Thejas Nair. Repository: hive-git Description --- Should be applied on top of: HIVE-5217 [Add long polling to asynchronous execution in HiveServer2] HIVE-5229 [Better thread management for HiveServer2 async threads] HIVE-5230 [Better error reporting by async threads in HiveServer2] HIVE-5441 [Async query execution doesn't return resultset status] Diffs - jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 Diff: https://reviews.apache.org/r/14950/diff/ Testing --- TestJdbcDriver2 Thanks, Vaibhav Gumashta
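The retry-only-on-transport-errors policy suggested in the review can be sketched as follows. This is a Python stand-in for the Java client logic; TransportError is a hypothetical placeholder for Thrift's TProtocolException.

```python
class TransportError(Exception):
    """Hypothetical stand-in for a Thrift transport/protocol failure."""

def call_with_retry(fn, retries=3):
    # Retry only on errors that indicate a broken network connection;
    # any other exception is a genuine failure and propagates at once.
    for attempt in range(retries):
        try:
            return fn()
        except TransportError:
            if attempt == retries - 1:
                raise
```

The key design point is the narrow except clause: a query that fails with, say, a SQL error must not be silently re-executed, while a dropped connection is safe to retry.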
[jira] [Commented] (HIVE-6475) Implement support for appending to mutable tables in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909617#comment-13909617 ] Hive QA commented on HIVE-6475: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12630129/HIVE-6475.patch {color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 5179 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_scriptfile1 org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask org.apache.hive.hcatalog.mapreduce.TestHCatDynamicPartitioned.testHCatDynamicPartitionedTable org.apache.hive.hcatalog.mapreduce.TestHCatDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatDynamicPartitionedTable org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatExternalDynamicCustomLocation org.apache.hive.hcatalog.mapreduce.TestHCatMutableDynamicPartitioned.testHCatDynamicPartitionedTable org.apache.hive.hcatalog.mapreduce.TestHCatMutableDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask org.apache.hive.hcatalog.pig.TestHCatStorerWrapper.testStoreExternalTableWithExternalDir {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1447/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1447/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase 
Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 12 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12630129 Implement support for appending to mutable tables in HCatalog - Key: HIVE-6475 URL: https://issues.apache.org/jira/browse/HIVE-6475 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6475.patch Part of HIVE-6405, this is the implementation of the append feature on the HCatalog side. If a table is mutable, we must support being able to append to existing data instead of erroring out as a duplicate publish. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6393) Support unqualified column references in Joining conditions
[ https://issues.apache.org/jira/browse/HIVE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6393: Status: Patch Available (was: Open) Support unqualified column references in Joining conditions --- Key: HIVE-6393 URL: https://issues.apache.org/jira/browse/HIVE-6393 Project: Hive Issue Type: Improvement Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-6393.1.patch, HIVE-6393.2.patch Support queries of the form: {noformat} create table r1(a int); create table r2(b int); select a, b from r1 join r2 on a = b {noformat} This becomes more useful in old-style syntax: {noformat} select a, b from r1, r2 where a = b {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
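Resolving an unqualified column in a join amounts to checking which join input produces it, and rejecting the reference when more than one does. A small Python sketch of that resolution (hypothetical helper, not Hive's SemanticAnalyzer logic):

```python
def resolve_column(name, schemas):
    # `schemas` maps each table alias in the join to its column names.
    matches = [alias for alias, cols in schemas.items() if name in cols]
    if not matches:
        raise ValueError(f"Invalid column reference: {name}")
    if len(matches) > 1:
        # The same name in two inputs must stay qualified by the user.
        raise ValueError(f"Ambiguous column reference: {name}")
    return f"{matches[0]}.{name}"
```

For the example query, `a` resolves uniquely to r1 and `b` to r2, so `on a = b` behaves like `on r1.a = r2.b`.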
[jira] [Updated] (HIVE-6393) Support unqualified column references in Joining conditions
[ https://issues.apache.org/jira/browse/HIVE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6393: Status: Open (was: Patch Available) Support unqualified column references in Joining conditions --- Key: HIVE-6393 URL: https://issues.apache.org/jira/browse/HIVE-6393 Project: Hive Issue Type: Improvement Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-6393.1.patch, HIVE-6393.2.patch Support queries of the form: {noformat} create table r1(a int); create table r2(b int); select a, b from r1 join r2 on a = b {noformat} This becomes more useful in old-style syntax: {noformat} select a, b from r1, r2 where a = b {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6393) Support unqualified column references in Joining conditions
[ https://issues.apache.org/jira/browse/HIVE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6393: Attachment: HIVE-6393.2.patch Support unqualified column references in Joining conditions --- Key: HIVE-6393 URL: https://issues.apache.org/jira/browse/HIVE-6393 Project: Hive Issue Type: Improvement Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-6393.1.patch, HIVE-6393.2.patch Support queries of the form: {noformat} create table r1(a int); create table r2(b int); select a, b from r1 join r2 on a = b {noformat} This becomes more useful in old-style syntax: {noformat} select a, b from r1, r2 where a = b {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18293: HIVE-6393: Support unqualified column references in Joining conditions
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18293/ --- (Updated Feb. 23, 2014, 3:28 a.m.) Review request for hive, Ashutosh Chauhan and Gunther Hagleitner. Changes --- fix .out files Bugs: HIVE-6393 https://issues.apache.org/jira/browse/HIVE-6393 Repository: hive-git Description --- Support queries of the form: create table r1(a int); create table r2(b int); select a, b from r1 join r2 on a = b This becomes more useful in old-style syntax: select a, b from r1, r2 where a = b Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java a01aa0e ql/src/test/org/apache/hadoop/hive/ql/parse/TestQBJoinTreeApplyPredicate.java 9e77949 ql/src/test/queries/clientpositive/join_cond_pushdown_unqual1.q PRE-CREATION ql/src/test/queries/clientpositive/join_cond_pushdown_unqual2.q PRE-CREATION ql/src/test/queries/clientpositive/join_cond_pushdown_unqual3.q PRE-CREATION ql/src/test/queries/clientpositive/join_cond_pushdown_unqual4.q PRE-CREATION ql/src/test/queries/clientpositive/subquery_unqualcolumnrefs.q PRE-CREATION ql/src/test/results/clientpositive/join_cond_pushdown_unqual1.q.out PRE-CREATION ql/src/test/results/clientpositive/join_cond_pushdown_unqual2.q.out PRE-CREATION ql/src/test/results/clientpositive/join_cond_pushdown_unqual3.q.out PRE-CREATION ql/src/test/results/clientpositive/join_cond_pushdown_unqual4.q.out PRE-CREATION ql/src/test/results/clientpositive/subquery_unqualcolumnrefs.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18293/diff/ Testing --- added new tests ran all existing join tests Thanks, Harish Butani
[jira] [Updated] (HIVE-6459) Change the precision/scale for intermediate sum result in the avg() udf
[ https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6459: -- Attachment: HIVE-6459.3.patch Patch #3 fixes one of the test failures; the other test has been flaky and is unrelated. Change the precision/scale for intermediate sum result in the avg() udf --- Key: HIVE-6459 URL: https://issues.apache.org/jira/browse/HIVE-6459 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-6459.1.patch, HIVE-6459.2.patch, HIVE-6459.3.patch, HIVE-6459.patch The avg() udf, when applied to a decimal column, selects the precision/scale of the intermediate sum field as (p+4, s+4), which is the same as the precision/scale of the avg() result. However, the additional scale increase is unnecessary and may cause the intermediate sum to overflow. The requested change is that for the intermediate sum result, the precision/scale is set to (p+10, s), which is consistent with the sum() udf. The avg() result still keeps its precision/scale. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6429) MapJoinKey has large memory overhead in typical cases
[ https://issues.apache.org/jira/browse/HIVE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909683#comment-13909683 ] Hive QA commented on HIVE-6429: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12630468/HIVE-6429.04.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5175 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2 org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1448/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1448/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12630468 MapJoinKey has large memory overhead in typical cases - Key: HIVE-6429 URL: https://issues.apache.org/jira/browse/HIVE-6429 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6429.01.patch, HIVE-6429.02.patch, HIVE-6429.03.patch, HIVE-6429.04.patch, HIVE-6429.WIP.patch, HIVE-6429.patch The only thing that MJK really needs is hashCode and equals (well, and construction), so there's no need to have an array of writables in there. Assuming all the keys for a table have the same structure, for the common case where keys are primitive types, we can store something like a byte array combination of keys to reduce the memory usage. Will probably speed up compares too. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
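The idea of replacing an array of writables with one byte-serialized key can be sketched like this. It is a Python model of the concept, not Hive's actual MapJoinKey implementation; the tag bytes and encodings are illustrative choices.

```python
import struct

def pack_key(values):
    # Serialize a tuple of primitive join-key values into a single
    # bytes object. Equality and hashing then operate on the raw bytes,
    # avoiding one wrapper object per key column, and a compare becomes
    # a single memcmp-style scan.
    out = bytearray()
    for v in values:
        if isinstance(v, bool):            # check bool before int
            out += b'b' + (b'\x01' if v else b'\x00')
        elif isinstance(v, int):
            out += b'i' + struct.pack('>q', v)
        elif isinstance(v, str):
            enc = v.encode('utf-8')
            out += b's' + struct.pack('>i', len(enc)) + enc
        else:
            raise TypeError(f"unsupported key type: {type(v).__name__}")
    return bytes(out)
```

The type tags and length prefixes keep distinct tuples from colliding (e.g. an int 1 and a bool True encode differently), which is what makes the packed bytes safe to use directly as a hash-table key.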