[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf

2014-02-22 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909298#comment-13909298
 ] 

Lefty Leverenz commented on HIVE-6037:
--

But wait!  There's more:

* HIVE-3635:  hive.lazysimple.extended_boolean_literal

 Synchronize HiveConf with hive-default.xml.template and support show conf
 -

 Key: HIVE-6037
 URL: https://issues.apache.org/jira/browse/HIVE-6037
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.13.0

 Attachments: CHIVE-6037.3.patch.txt, HIVE-6037.1.patch.txt, 
 HIVE-6037.10.patch.txt, HIVE-6037.11.patch.txt, HIVE-6037.12.patch.txt, 
 HIVE-6037.14.patch.txt, HIVE-6037.15.patch.txt, HIVE-6037.2.patch.txt, 
 HIVE-6037.4.patch.txt, HIVE-6037.5.patch.txt, HIVE-6037.6.patch.txt, 
 HIVE-6037.7.patch.txt, HIVE-6037.8.patch.txt, HIVE-6037.9.patch.txt, 
 HIVE-6037.patch


 see HIVE-5879



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5954) SQL std auth - get_privilege_set should check role hierarchy

2014-02-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5954:


Attachment: HIVE-5954.1.patch

HIVE-5954.1.patch - With this change, the current roles used in SQL standard 
auth now also include the roles in the hierarchy.

 SQL std auth - get_privilege_set should check role hierarchy
 

 Key: HIVE-5954
 URL: https://issues.apache.org/jira/browse/HIVE-5954
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
 Attachments: HIVE-5954.1.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 A role can belong to another role. But get_privilege_set in the Hive metastore 
 API checks only the privileges of the immediate roles a user belongs to.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5954) SQL std auth - get_privilege_set should check role hierarchy

2014-02-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5954:


Status: Patch Available  (was: Open)

 SQL std auth - get_privilege_set should check role hierarchy
 

 Key: HIVE-5954
 URL: https://issues.apache.org/jira/browse/HIVE-5954
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
 Attachments: HIVE-5954.1.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 A role can belong to another role. But get_privilege_set in the Hive metastore 
 API checks only the privileges of the immediate roles a user belongs to.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6203) Privileges of role granted indirectly to user are not applied

2014-02-22 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909304#comment-13909304
 ] 

Thejas M Nair commented on HIVE-6203:
-

[~navis] I have uploaded a patch to HIVE-5954 that updates get_privilege_set to 
include privileges granted indirectly through roles. Can you please review that 
one? We won't need an additional Thrift API with that patch.



 Privileges of role granted indirectly to user are not applied
 

 Key: HIVE-6203
 URL: https://issues.apache.org/jira/browse/HIVE-6203
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6203.1.patch.txt, HIVE-6203.2.patch.txt, 
 HIVE-6203.3.patch.txt, HIVE-6203.4.patch.txt


 For example, 
 {noformat}
 create role r1;
 create role r2;
 grant select on table eq to role r1;
 grant role r1 to role r2;
 grant role r2 to user admin;
 select * from eq limit 5;
 {noformat}
 admin -> r2 -> r1 -> SELECT on table eq
 but user admin fails to access table eq
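The expected behavior can be sketched as a transitive walk over the role graph: checking only the immediate roles misses r1's privilege. A minimal Python sketch, assuming toy data structures (not Hive's actual metastore code):

```python
from collections import deque

def transitive_privileges(user, user_roles, granted_roles, role_privileges):
    """Collect privileges from every role reachable from the user's direct
    roles, following role-to-role grants (a simple BFS over the role graph)."""
    seen = set(user_roles.get(user, []))
    queue = deque(seen)
    privs = set()
    while queue:
        role = queue.popleft()
        privs.update(role_privileges.get(role, []))
        for inner in granted_roles.get(role, []):  # roles granted TO this role
            if inner not in seen:
                seen.add(inner)
                queue.append(inner)
    return privs

# The chain from the example above:
#   grant select on table eq to role r1; grant role r1 to role r2;
#   grant role r2 to user admin
user_roles = {"admin": ["r2"]}
granted_roles = {"r2": ["r1"]}                 # "grant role r1 to role r2"
role_privileges = {"r1": [("eq", "SELECT")]}
```

With this expansion, admin resolves to r2 and then to r1, so the SELECT privilege on eq is found; an immediate-roles-only check would stop at r2 and fail, which is the reported bug.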



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Hive 9 error in HiveServer and DataNucleus

2014-02-22 Thread Tuong Tr.
Hi,

We ran into this problem with Hive 0.9.0 trying to get tables info from the 
HiveMetastore via HiveServer/Thrift.  The problem seems to be fixed in Hive 
0.11+, but our effort to locate the JIRA has not been successful.  I would 
appreciate it if someone familiar with this problem could point out the 
JIRA/patch to us.

We consistently hit this error when the number of calls to the HiveMetastore 
hits the hive.metastore.server.min.threads setting in the server's 
hive-site.xml.

Error :

Hive Client side error: Exception in thread "main" MetaException(message:Got 
exception: org.apache.thrift.transport.TTransportException null)
at 
org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:785)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:615)


Hive Server side error:
2014-02-13 08:28:27,866 INFO  metastore.HiveMetaStore 
(HiveMetaStore.java:logInfo(385)) - 47: get_all_databases
2014-02-13 08:28:27,870 ERROR server.TThreadPoolServer 
(TThreadPoolServer.java:run(182)) - Error occurred during processing of message.
javax.jdo.JDOFatalUserException: Persistence Manager has been closed
at 
org.datanucleus.jdo.JDOPersistenceManager.assertIsOpen(JDOPersistenceManager.java:2088)
at 
org.datanucleus.jdo.JDOPersistenceManager.currentTransaction(JDOPersistenceManager.java:305)
at 
org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:294)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getDatabases(ObjectStore.java:488)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getAllDatabases(ObjectStore.java:522)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at 
org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
at $Proxy1.getAllDatabases(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_all_databases(HiveMetaStore.java:660)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_all_databases.getResult(ThriftHiveMetastore.java:4749)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_all_databases.getResult(ThriftHiveMetastore.java:4737)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
at 
org.apache.hadoop.hive.metastore.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:48)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
at java.lang.Thread.run(Thread.java:738)



Thanks for your help,
Tuong

[jira] [Updated] (HIVE-5954) SQL std auth - get_privilege_set should check role hierarchy

2014-02-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5954:


Attachment: HIVE-5954.2.patch

HIVE-5954.2.patch - sorts the output of show roles and show current roles for 
deterministic results.


 SQL std auth - get_privilege_set should check role hierarchy
 

 Key: HIVE-5954
 URL: https://issues.apache.org/jira/browse/HIVE-5954
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
 Attachments: HIVE-5954.1.patch, HIVE-5954.2.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 A role can belong to another role. But get_privilege_set in the Hive metastore 
 API checks only the privileges of the immediate roles a user belongs to.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5235) Infinite loop with ORC file and Hive 0.11

2014-02-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909321#comment-13909321
 ] 

Iván de Prado commented on HIVE-5235:
-

The same issue appeared in a cluster with the following software:

{noformat}
Linux version 2.6.32-431.3.1.el6.x86_64 
(mockbu...@x86-003.build.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 
4.4.7-4) (GCC) ) #1 SMP Fri Dec 13 06:58:20 EST 2013
java version 1.6.0_30
OpenJDK Runtime Environment (IcedTea6 1.13.1) (rhel-3.1.13.1.el6_5-x86_64)
OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)
CDH 5 beta2
{noformat}

The exception I got when reading is:

{noformat}
2014-02-22 09:47:44,912 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : java.io.EOFException: Read past end of RLE integer 
from compressed stream Stream for column 3 kind DATA position: 10739597 length: 
10739597 range: 0 offset: 24245452 limit: 24245452 range 0 = 0 to 10739597 
uncompressed: 62451 to 62451
at 
org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:46)
at 
org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:287)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:422)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1193)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2240)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:105)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:56)
at 
org.apache.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:182)
at 
com.datasalt.pangool.tuplemr.mapred.lib.input.HCatTupleInputFormat$1.nextKeyValue(HCatTupleInputFormat.java:159)
at 
com.datasalt.pangool.tuplemr.mapred.lib.input.DelegatingRecordReader.nextKeyValue(DelegatingRecordReader.java:89)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at 
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at 
com.datasalt.pangool.tuplemr.mapred.lib.input.DelegatingMapper.run(DelegatingMapper.java:50)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
{noformat}

 Infinite loop with ORC file and Hive 0.11
 -

 Key: HIVE-5235
 URL: https://issues.apache.org/jira/browse/HIVE-5235
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
 Environment: Gentoo linux with Hortonworks Hadoop 
 hadoop-1.1.2.23.tar.gz and Apache Hive 0.11d
Reporter: Iván de Prado
Priority: Blocker
 Attachments: gendata.py


 We are using Hive 0.11 with the ORC file format and we get some tasks blocked 
 in some kind of infinite loop. They keep working indefinitely when we set a 
 huge task expiry timeout. If we set the expiry time to 600 seconds, the tasks 
 fail for not reporting progress and, finally, the job fails. 
 That is not consistent, and sometimes the behavior changes between job 
 executions. It happens for different queries.
 We are using Hive 0.11 with Hadoop hadoop-1.1.2.23 from Hortonworks. The task 
 that is blocked keeps consuming 100% of CPU, and the stack trace is 
 consistently the same. Everything points to some kind of infinite loop. My 
 guess is that it has some relation to the ORC file. Maybe some pointer is 
 written incorrectly, generating some kind of infinite loop when reading. Or 
 maybe there is a bug in the reading stage.
 More information below. The stack trace:
 {noformat} 
 main prio=10 tid=0x7f20a000a800 nid=0x1ed2 runnable [0x7f20a8136000]
java.lang.Thread.State: RUNNABLE
   at java.util.zip.Inflater.inflateBytes(Native Method)
   at java.util.zip.Inflater.inflate(Inflater.java:256)
   - locked 0xf42a6ca0 (a java.util.zip.ZStreamRef)
   at 
 org.apache.hadoop.hive.ql.io.orc.ZlibCodec.decompress(ZlibCodec.java:64)
   at 
 

[jira] [Commented] (HIVE-5235) Infinite loop with ORC file and Hive 0.11

2014-02-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909323#comment-13909323
 ] 

Iván de Prado commented on HIVE-5235:
-

Sorry, the Java version is:

{noformat}
java version 1.6.0_31
Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
{noformat}

 Infinite loop with ORC file and Hive 0.11
 -

 Key: HIVE-5235
 URL: https://issues.apache.org/jira/browse/HIVE-5235
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
 Environment: Gentoo linux with Hortonworks Hadoop 
 hadoop-1.1.2.23.tar.gz and Apache Hive 0.11d
Reporter: Iván de Prado
Priority: Blocker
 Attachments: gendata.py


 We are using Hive 0.11 with the ORC file format and we get some tasks blocked 
 in some kind of infinite loop. They keep working indefinitely when we set a 
 huge task expiry timeout. If we set the expiry time to 600 seconds, the tasks 
 fail for not reporting progress and, finally, the job fails. 
 That is not consistent, and sometimes the behavior changes between job 
 executions. It happens for different queries.
 We are using Hive 0.11 with Hadoop hadoop-1.1.2.23 from Hortonworks. The task 
 that is blocked keeps consuming 100% of CPU, and the stack trace is 
 consistently the same. Everything points to some kind of infinite loop. My 
 guess is that it has some relation to the ORC file. Maybe some pointer is 
 written incorrectly, generating some kind of infinite loop when reading. Or 
 maybe there is a bug in the reading stage.
 More information below. The stack trace:
 {noformat} 
 main prio=10 tid=0x7f20a000a800 nid=0x1ed2 runnable [0x7f20a8136000]
java.lang.Thread.State: RUNNABLE
   at java.util.zip.Inflater.inflateBytes(Native Method)
   at java.util.zip.Inflater.inflate(Inflater.java:256)
   - locked 0xf42a6ca0 (a java.util.zip.ZStreamRef)
   at 
 org.apache.hadoop.hive.ql.io.orc.ZlibCodec.decompress(ZlibCodec.java:64)
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:128)
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:143)
   at 
 org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readVulong(SerializationUtils.java:54)
   at 
 org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readVslong(SerializationUtils.java:65)
   at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.readValues(RunLengthIntegerReader.java:66)
   at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.next(RunLengthIntegerReader.java:81)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:332)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:802)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1214)
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:71)
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:46)
   at 
 org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
   at 
 org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
   at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:300)
   at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
   - eliminated 0xe1459700 (a 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
   - locked 0xe1459700 (a 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1178)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 

[jira] [Commented] (HIVE-5217) Add long polling to asynchronous execution in HiveServer2

2014-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909361#comment-13909361
 ] 

Hive QA commented on HIVE-5217:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630053/HIVE-5217.6.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5173 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hcatalog.hbase.TestHBaseDirectOutputFormat.org.apache.hcatalog.hbase.TestHBaseDirectOutputFormat
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1439/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1439/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630053

 Add long polling to asynchronous execution in HiveServer2
 -

 Key: HIVE-5217
 URL: https://issues.apache.org/jira/browse/HIVE-5217
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-5217.2.patch, HIVE-5217.3.patch, HIVE-5217.4.patch, 
 HIVE-5217.5.patch, HIVE-5217.6.patch, HIVE-5217.D12801.2.patch, 
 HIVE-5217.D12801.3.patch, HIVE-5217.D12801.4.patch, HIVE-5217.D12801.5.patch, 
 HIVE-5217.D12801.6.patch


 [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
 for async execution in HS2. The client gets an operation handle which it can 
 poll to check on the operation status. However, the polling frequency is 
 entirely left to the client, which can be resource-inefficient. Long polling 
 will solve this by blocking the client's request to check the operation status 
 for a configurable amount of time (a new HS2 config) if the data is not 
 available, but responding immediately if the data is available.
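The long-polling proposal can be sketched with a condition variable: the status call returns immediately when the operation is done, and otherwise blocks up to a configurable timeout. A minimal Python sketch with made-up class and parameter names (not the HS2 API):

```python
import threading

class OperationHandle:
    """Toy operation whose status a client can long-poll (names illustrative)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._state = "RUNNING"

    def finish(self, state="FINISHED"):
        with self._cond:
            self._state = state
            self._cond.notify_all()      # wake any long-polling status calls

    def get_status(self, long_poll_timeout=0.0):
        """Respond immediately if the operation is done; otherwise block for
        up to long_poll_timeout seconds (standing in for the proposed HS2
        config) before reporting the current state."""
        with self._cond:
            if self._state == "RUNNING" and long_poll_timeout > 0:
                self._cond.wait(long_poll_timeout)
            return self._state
```

A client calling get_status with a timeout is woken as soon as finish() runs, instead of burning requests in a tight polling loop.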



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6487) PTest2 do not copy failed source directories

2014-02-22 Thread Brock Noland (JIRA)
Brock Noland created HIVE-6487:
--

 Summary: PTest2 do not copy failed source directories
 Key: HIVE-6487
 URL: https://issues.apache.org/jira/browse/HIVE-6487
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland


Right now we copy the entire source directory for failed tests back to the 
master (up to 5). They are ~10GB each, so it takes a very long time. We should 
remove this feature.

Remove the cp command from batch-exec.vm:
https://github.com/apache/hive/blob/trunk/testutils/ptest2/src/main/resources/batch-exec.vm#L91
also don't publish the number of failed tests as a template variable:




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Precommit queue

2014-02-22 Thread Brock Noland
Prices have been stable but they do seem to be having capacity
problems[1] as I have been unable to get machines when bidding
$0.70/hr which is 10x the normal price. However, there is one item
which is taking time for no good reason:
https://issues.apache.org/jira/browse/HIVE-6487. I am going to put a
break fix in for that this morning.

Brock

[1] An economically inclined observer would expect lower capacity to
result in increased prices but that is not always the case. Whatever
black magic they have running the spot market seems to have a lot of
non-market logic.


On Fri, Feb 21, 2014 at 10:40 PM, Thejas Nair the...@hortonworks.com wrote:
 Hi Brock,
 Do you know why the tests are taking almost twice as long in recent runs ?
 Is it related to the ec2 spot price spikes ?
 Thanks,
 Thejas


 On Fri, Feb 21, 2014 at 7:11 AM, Brock Noland br...@cloudera.com wrote:

 There was a ec2 spot price spike overnight which combined with
 everyone trying to get patches in for the branching has resulted in a
 massive queue:

 http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/

 ~25 builds in the queue

 Brock


 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender immediately
 and delete it from your system. Thank You.



-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org


Re: Precommit queue

2014-02-22 Thread Brock Noland
In other news, TestBeeLineWithArgs created a 9GB hive.log, before
failing, in the last run:

http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-1440/failed/TestBeeLineWithArgs/

On Sat, Feb 22, 2014 at 10:23 AM, Brock Noland br...@cloudera.com wrote:
 Prices have been stable but they do seem to be having capacity
 problems[1] as I have been unable to get machines when bidding
 $0.70/hr which is 10x the normal price. However, there is one item
 which is taking time for no good reason:
 https://issues.apache.org/jira/browse/HIVE-6487. I am going to put a
 break fix in for that this morning.

 Brock

 [1] An economically inclined observer would expect lower capacity to
 result in increased prices but that is not always the case. Whatever
 black magic they have running the spot market seems to have a lot of
 non-market logic.


 On Fri, Feb 21, 2014 at 10:40 PM, Thejas Nair the...@hortonworks.com wrote:
 Hi Brock,
 Do you know why the tests are taking almost twice as long in recent runs ?
 Is it related to the ec2 spot price spikes ?
 Thanks,
 Thejas


 On Fri, Feb 21, 2014 at 7:11 AM, Brock Noland br...@cloudera.com wrote:

 There was a ec2 spot price spike overnight which combined with
 everyone trying to get patches in for the branching has resulted in a
 massive queue:

 http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/

 ~25 builds in the queue

 Brock








-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org


[jira] [Commented] (HIVE-5232) Make JDBC use the new HiveServer2 async execution API by default

2014-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909426#comment-13909426
 ] 

Hive QA commented on HIVE-5232:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630024/HIVE-5232.3.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5175 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1440/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1440/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630024

 Make JDBC use the new HiveServer2 async execution API by default
 

 Key: HIVE-5232
 URL: https://issues.apache.org/jira/browse/HIVE-5232
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-5232.1.patch, HIVE-5232.2.patch, HIVE-5232.3.patch


 HIVE-4617 provides support for async execution in HS2. There are some 
 proposed improvements in followup JIRAs:
 HIVE-5217
 HIVE-5229
 HIVE-5230
 HIVE-5441
 There is also [HIVE-5060], which assumes that execute is asynchronous by 
 default.
  
 Once they are in, we can think of using the async API as the default for 
 JDBC. This can enable the server to report errors back to the client sooner. 
 It can also be useful in cases where a statement.cancel is done in a 
 different thread - the original thread will now be able to detect the cancel, 
 as opposed to the blocking execute calls, with which 
 statement.cancel would be a no-op. 
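The cancel scenario can be sketched as follows: the original thread polls the operation handle and observes a cancel issued from another thread, which a blocking execute call could not do. A minimal Python sketch with hypothetical names (not the real JDBC/HS2 classes):

```python
import threading
import time

class AsyncStatement:
    """Toy stand-in for an async operation handle (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = "RUNNING"

    def cancel(self):
        """May be called from a different thread than the one executing."""
        with self._lock:
            if self._state == "RUNNING":
                self._state = "CANCELED"

    def wait_for_completion(self, poll_interval=0.02, timeout=2.0):
        """The original thread polls the handle, so it observes a cancel
        (or completion) instead of being stuck inside a blocking execute."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            with self._lock:
                if self._state != "RUNNING":
                    return self._state
            time.sleep(poll_interval)
        return "TIMEOUT"
```

Here the polling thread returns CANCELED shortly after another thread calls cancel(), illustrating why async-by-default makes statement.cancel effective.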



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6488) Investigate TestBeeLineWithArgs

2014-02-22 Thread Brock Noland (JIRA)
Brock Noland created HIVE-6488:
--

 Summary: Investigate TestBeeLineWithArgs
 Key: HIVE-6488
 URL: https://issues.apache.org/jira/browse/HIVE-6488
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Priority: Blocker


TestBeeLineWithArgs started taking many, many hours and eventually timing out, 
which is one cause of precommit runs taking a long time. For now I have skipped 
it in precommit tests, so we should figure out what is going on and then 
re-enable the test.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Precommit queue

2014-02-22 Thread Brock Noland
Looking more into that test...I decided to skip it and open a blocker
to look at it: https://issues.apache.org/jira/browse/HIVE-6488

On Sat, Feb 22, 2014 at 10:26 AM, Brock Noland br...@cloudera.com wrote:
 In other news, TestBeeLineWithArgs created a 9GB hive.log, before
 failing, in the last run:

 http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-1440/failed/TestBeeLineWithArgs/

 On Sat, Feb 22, 2014 at 10:23 AM, Brock Noland br...@cloudera.com wrote:
 Prices have been stable but they do seem to be having capacity
 problems[1] as I have been unable to get machines when bidding
 $0.70/hr which is 10x the normal price. However, there is one item
 which is taking time for no good reason:
 https://issues.apache.org/jira/browse/HIVE-6487. I am going to put a
 break fix in for that this morning.

 Brock

 [1] An economically inclined observer would expect lower capacity to
 result in increased prices but that is not always the case. Whatever
 black magic they have running the spot market seems to have a lot of
 non-market logic.


 On Fri, Feb 21, 2014 at 10:40 PM, Thejas Nair the...@hortonworks.com wrote:
 Hi Brock,
 Do you know why the tests are taking almost twice as long in recent runs ?
 Is it related to the ec2 spot price spikes ?
 Thanks,
 Thejas


 On Fri, Feb 21, 2014 at 7:11 AM, Brock Noland br...@cloudera.com wrote:

 There was a ec2 spot price spike overnight which combined with
 everyone trying to get patches in for the branching has resulted in a
 massive queue:

 http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/

 ~25 builds in the queue

 Brock











-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org


Re: Review Request 18202: HiveServer2 running in http mode should support for doAs functionality

2014-02-22 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18202/#review35218
---



service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java
https://reviews.apache.org/r/18202/#comment65672

This check looks unnecessary: it returns the same object irrespective of the 
value.


- Thejas Nair


On Feb. 18, 2014, 3:02 a.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18202/
 ---
 
 (Updated Feb. 18, 2014, 3:02 a.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-6306
 https://issues.apache.org/jira/browse/HIVE-6306
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HiveServer2 running in http mode should support for doAs functionality
 
 
 Diffs
 -
 
   service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java d8ba3aa 
   service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java 
 PRE-CREATION 
   service/src/java/org/apache/hive/service/auth/HttpCLIServiceProcessor.java 
 PRE-CREATION 
   
 service/src/java/org/apache/hive/service/auth/HttpCLIServiceUGIProcessor.java 
 PRE-CREATION 
   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
 bfe0e7b 
   
 service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java 
 a6ff6ce 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java 
 e77f043 
   shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
 9e9a60d 
 
 Diff: https://reviews.apache.org/r/18202/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Vaibhav Gumashta
 




[jira] [Assigned] (HIVE-6487) PTest2 do not copy failed source directories

2014-02-22 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-6487:
---

Assignee: Szehon Ho

 PTest2 do not copy failed source directories
 

 Key: HIVE-6487
 URL: https://issues.apache.org/jira/browse/HIVE-6487
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Szehon Ho

 Right now we copy the entire source directory for failed tests back to the 
 master (up to 5). They are 10GB each, so it takes a very long time. We should 
 remove this feature.
 Remove the cp command from batch-exec.vm:
 https://github.com/apache/hive/blob/trunk/testutils/ptest2/src/main/resources/batch-exec.vm#L91
 also don't publish the number of failed tests as a template variable:



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Precommit queue

2014-02-22 Thread Brock Noland
FYI the changes 1) excluding TestWithBeelineArgs and 2) not copying the
source dir on failed tests have improved run times by ~1.5 hours.

On Sat, Feb 22, 2014 at 10:34 AM, Brock Noland br...@cloudera.com wrote:
 Looking more into that test...I decided to skip it and open a blocker
 to look at it: https://issues.apache.org/jira/browse/HIVE-6488

 On Sat, Feb 22, 2014 at 10:26 AM, Brock Noland br...@cloudera.com wrote:
 In other news, TestWithBeelineArgs created a 9GB hive.log, before
 failing, in the last run:

 http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-1440/failed/TestBeeLineWithArgs/

 On Sat, Feb 22, 2014 at 10:23 AM, Brock Noland br...@cloudera.com wrote:
 Prices have been stable but they do seem to be having capacity
 problems[1], as I have been unable to get machines when bidding
 $0.70/hr, which is 10x the normal price. However, there is one item
 which is taking time for no good reason:
 https://issues.apache.org/jira/browse/HIVE-6487. I am going to put a
 break fix in for that this morning.

 Brock

 [1] An economically inclined observer would expect lower capacity to
 result in increased prices but that is not always the case. Whatever
 black magic they have running the spot market seems to have a lot of
 non-market logic.


 On Fri, Feb 21, 2014 at 10:40 PM, Thejas Nair the...@hortonworks.com 
 wrote:
 Hi Brock,
 Do you know why the tests are taking almost twice as long in recent runs ?
 Is it related to the ec2 spot price spikes ?
 Thanks,
 Thejas


 On Fri, Feb 21, 2014 at 7:11 AM, Brock Noland br...@cloudera.com wrote:

 There was a ec2 spot price spike overnight which combined with
 everyone trying to get patches in for the branching has resulted in a
 massive queue:

 http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/

 ~25 builds in the queue

 Brock




[jira] [Commented] (HIVE-5761) Implement vectorized support for the DATE data type

2014-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909509#comment-13909509
 ] 

Hive QA commented on HIVE-5761:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630055/HIVE-5761.3.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5192 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_date_funcs
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1443/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1443/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630055

 Implement vectorized support for the DATE data type
 ---

 Key: HIVE-5761
 URL: https://issues.apache.org/jira/browse/HIVE-5761
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-5761.1.patch, HIVE-5761.2.patch, HIVE-5761.3.patch


 Add support to allow queries referencing DATE columns and expression results 
 to run efficiently in vectorized mode. This should re-use the code for the 
 the integer/timestamp types to the extent possible and beneficial. Include 
 unit tests and end-to-end tests. Consider re-using or extending existing 
 end-to-end tests for vectorized integer and/or timestamp operations.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6459) Change the precision/scale for intermediate sum result in the avg() udf

2014-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909547#comment-13909547
 ] 

Hive QA commented on HIVE-6459:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630096/HIVE-6459.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5175 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1444/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1444/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630096

 Change the precision/scale for intermediate sum result in the avg() udf 
 ---

 Key: HIVE-6459
 URL: https://issues.apache.org/jira/browse/HIVE-6459
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-6459.1.patch, HIVE-6459.2.patch, HIVE-6459.patch


 The avg() udf, when applied to a decimal column, selects the precision/scale 
 of the intermediate sum field as (p+4, s+4), the same as the precision/scale 
 of the avg() result. However, the additional scale increase is unnecessary, 
 and data overflow may occur. The requested change is that, for the 
 intermediate sum result, the precision/scale be set to (p+10, s), which is 
 consistent with the sum() udf. The avg() result still keeps its 
 precision/scale.
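
The proposed type arithmetic can be sketched in a few lines (illustrative 
Python; the function names are hypothetical, and the 38-digit cap is assumed 
to be Hive's maximum decimal precision):

```python
HIVE_MAX_PRECISION = 38  # assumed: Hive caps decimal precision at 38 digits

def old_sum_type(p, s):
    # Current behavior: intermediate sum uses (p+4, s+4),
    # the same type as the avg() result.
    return (min(p + 4, HIVE_MAX_PRECISION), s + 4)

def new_sum_type(p, s):
    # Proposed: (p+10, s), matching the sum() udf; scale unchanged.
    return (min(p + 10, HIVE_MAX_PRECISION), s)

# For a decimal(10, 2) column:
print(old_sum_type(10, 2))  # (14, 6): extra scale digits eat into integer room
print(new_sum_type(10, 2))  # (20, 2): more headroom for the integer part
```

Keeping the scale at s is what avoids the overflow risk: all ten extra digits 
go to the integer part of the running sum.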



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5217) Add long polling to asynchronous execution in HiveServer2

2014-02-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5217:


  Resolution: Fixed
Release Note: 
Changed to use long polling, as described in the description.
Adds the hive.server2.long.polling.timeout configuration parameter, which 
controls how long the long poll waits. Most users should not need to change it.
  Status: Resolved  (was: Patch Available)

Patch committed to trunk.
Thanks for the contribution Vaibhav!

Can you see if the release note looks OK? Please feel free to edit it.


 Add long polling to asynchronous execution in HiveServer2
 -

 Key: HIVE-5217
 URL: https://issues.apache.org/jira/browse/HIVE-5217
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-5217.2.patch, HIVE-5217.3.patch, HIVE-5217.4.patch, 
 HIVE-5217.5.patch, HIVE-5217.6.patch, HIVE-5217.D12801.2.patch, 
 HIVE-5217.D12801.3.patch, HIVE-5217.D12801.4.patch, HIVE-5217.D12801.5.patch, 
 HIVE-5217.D12801.6.patch


 [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
 for async execution in HS2. The client gets an operation handle which it can 
 poll to check on the operation status. However, the polling frequency is 
 entirely left to the client, which can be resource-inefficient. Long polling 
 solves this by blocking the client's status-check request for a configurable 
 amount of time (a new HS2 config) if the data is not available, but responding 
 immediately if the data is available.
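
The long-polling idea — block the status call until the operation completes or 
a timeout elapses, whichever comes first — can be sketched like this 
(illustrative Python, not HiveServer2's actual Java code; the class and method 
names are hypothetical):

```python
import threading

class OperationHandle:
    """Minimal long-polling sketch: the status call blocks until the
    operation finishes or the configured timeout elapses."""

    def __init__(self):
        self._done = threading.Event()

    def complete(self):
        self._done.set()  # operation finished; wake any blocked status call

    def get_status(self, long_polling_timeout_s):
        # Block up to the timeout (cf. hive.server2.long.polling.timeout),
        # but return as soon as the operation completes.
        finished = self._done.wait(timeout=long_polling_timeout_s)
        return "FINISHED" if finished else "RUNNING"

op = OperationHandle()
threading.Timer(0.05, op.complete).start()
# Returns "FINISHED" shortly after 0.05 s, not after the full 1 s timeout.
print(op.get_status(1.0))
```

The client's polling loop stays unchanged; only the server-side wait makes 
each poll cheap.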



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6489) Data loaded with LOAD DATA LOCAL INPATH has incorrect group ownership

2014-02-22 Thread Joseph Warren Rao IV (JIRA)
Joseph Warren Rao IV created HIVE-6489:
--

 Summary: Data loaded with LOAD DATA LOCAL INPATH has incorrect 
group ownership
 Key: HIVE-6489
 URL: https://issues.apache.org/jira/browse/HIVE-6489
 Project: Hive
  Issue Type: Bug
  Components: Authorization, Clients, Import/Export
Affects Versions: 0.12.0, 0.11.0, 0.10.0, 0.9.0
 Environment: OS and hardware are irrelevant.  Tested and reproduced on 
multiple configurations, including SLES, RHEL, VM, Teradata Hadoop Appliance, 
HDP 1.1, HDP 1.3.2, HDP 2.0.
Reporter: Joseph Warren Rao IV
Priority: Minor


Data uploaded by user via the Hive client with the LOAD DATA LOCAL INPATH 
method will have group ownership of the hdfs://tmp/hive-user instead of the 
primary group that user belongs to.  The group ownership of the 
hdfs://tmp/hive-user is, by default, the group that the user running the 
hadoop daemons run under.  This means that, on a Hadoop system with default 
file permissions of 770, any data loaded to hive via the LOAD DATA LOCAL INPATH 
method by one user cannot be seen by another user in the same group until the 
group ownership is manually changed in Hive's internal directory, or the group 
ownership is manually changed on hdfs://tmp/hive-user.  This problem is not 
present with the LOAD DATA INPATH method, or by using regular HDFS loads.

Steps to reproduce the problem on a pseudodistributed Hadoop cluster:
- In hdfs-site.xml, modify the umask to 007 (meaning that default permissions 
on files are 770).  The property changes names in Hadoop 2.0 but used to be 
called dfs.umaskmode.
- Restart hdfs
- Create a group called testgroup.
- Create two users that have testgroup as their primary group.  Call them 
testuser1 and testuser2
- Create a test file containing Hello World and call it test.txt.  It 
should be stored on the local filesystem.
- Create a table called testtable in Hive using testuser1.  Give it a single 
string column, textfile format, comma delimited fields.
- Have testuser1 use the LOAD DATA LOCAL INPATH command to load test.txt into 
testtable.
- Attempt to read testtable using testuser2.  The read will fail on a 
permissions error, when it should not.
- Examine the contents of the hdfs://apps/hive/warehouse/testtable directory.  
The file will belong to the hadoop or users or analogous group, instead of 
the correct group testgroup.  It will have correct permissions of 770.
- Change the group ownership of the folder hdfs://tmp/hive-testuser1 to 
testgroup.
- Repeat the data load.  testuser2 will now be able to correctly read the data, 
and the file will have the correct group ownership.
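
The permission arithmetic behind the umask step above can be checked directly 
(plain Python, not HDFS-specific; effective_mode is a hypothetical helper):

```python
def effective_mode(base_mode, umask):
    # HDFS applies the fs umask the POSIX way: resulting mode = base & ~umask
    return base_mode & ~umask

# A umask of 007 on a 777 base yields 770: rwx for owner and group,
# nothing for others. The common default umask 022 would yield 755.
print(oct(effective_mode(0o777, 0o007)))  # 0o770
print(oct(effective_mode(0o777, 0o022)))  # 0o755
```

With 770 modes, group ownership alone decides whether a second user in the 
same group can read the loaded file — which is why the wrong group on 
hdfs://tmp/hive-user makes the table unreadable.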



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6489) Data loaded with LOAD DATA LOCAL INPATH has incorrect group ownership

2014-02-22 Thread Joe Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Rao updated HIVE-6489:
--

Component/s: (was: Authorization)
 (was: Clients)

 Data loaded with LOAD DATA LOCAL INPATH has incorrect group ownership
 -

 Key: HIVE-6489
 URL: https://issues.apache.org/jira/browse/HIVE-6489
 Project: Hive
  Issue Type: Bug
  Components: Import/Export
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0
 Environment: OS and hardware are irrelevant.  Tested and reproduced 
 on multiple configurations, including SLES, RHEL, VM, Teradata Hadoop 
 Appliance, HDP 1.1, HDP 1.3.2, HDP 2.0.
Reporter: Joe Rao
Priority: Minor
   Original Estimate: 24h
  Remaining Estimate: 24h

 Data uploaded by user via the Hive client with the LOAD DATA LOCAL INPATH 
 method will have group ownership of the hdfs://tmp/hive-user instead of the 
 primary group that user belongs to.  The group ownership of the 
 hdfs://tmp/hive-user is, by default, the group that the user running the 
 hadoop daemons run under.  This means that, on a Hadoop system with default 
 file permissions of 770, any data loaded to hive via the LOAD DATA LOCAL 
 INPATH method by one user cannot be seen by another user in the same group 
 until the group ownership is manually changed in Hive's internal directory, 
 or the group ownership is manually changed on hdfs://tmp/hive-user.  This 
 problem is not present with the LOAD DATA INPATH method, or by using regular 
 HDFS loads.
 Steps to reproduce the problem on a pseudodistributed Hadoop cluster:
 - In hdfs-site.xml, modify the umask to 007 (meaning that default permissions 
 on files are 770).  The property changes names in Hadoop 2.0 but used to be 
 called dfs.umaskmode.
 - Restart hdfs
 - Create a group called testgroup.
 - Create two users that have testgroup as their primary group.  Call them 
 testuser1 and testuser2
 - Create a test file containing Hello World and call it test.txt.  It 
 should be stored on the local filesystem.
 - Create a table called testtable in Hive using testuser1.  Give it a 
 single string column, textfile format, comma delimited fields.
 - Have testuser1 use the LOAD DATA LOCAL INPATH command to load test.txt 
 into testtable.
 - Attempt to read testtable using testuser2.  The read will fail on a 
 permissions error, when it should not.
 - Examine the contents of the hdfs://apps/hive/warehouse/testtable directory. 
  The file will belong to the hadoop or users or analogous group, instead 
 of the correct group testgroup.  It will have correct permissions of 770.
 - Change the group ownership of the folder hdfs://tmp/hive-testuser1 to 
 testgroup.
 - Repeat the data load.  testuser2 will now be able to correctly read the 
 data, and the file will have the correct group ownership.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6473) Allow writing HFiles via HBaseStorageHandler table

2014-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909582#comment-13909582
 ] 

Hive QA commented on HIVE-6473:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630117/HIVE-6473.0.patch.txt

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5178 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_bulk
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_handler_bulk
org.apache.hadoop.hive.cli.TestHBaseMinimrCliDriver.testCliDriver_hbase_bulk
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver_generatehfiles_require_family_path
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1446/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1446/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630117

 Allow writing HFiles via HBaseStorageHandler table
 --

 Key: HIVE-6473
 URL: https://issues.apache.org/jira/browse/HIVE-6473
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
 Attachments: HIVE-6473.0.patch.txt


 Generating HFiles for bulkload into HBase could be more convenient. Right now 
 we require the user to register a new table with the appropriate output 
 format. This patch allows the exact same functionality, but through an 
 existing table managed by the HBaseStorageHandler.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2014-02-22 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/#review35234
---



jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
https://reviews.apache.org/r/14950/#comment65691

I think we should retry on exceptions that indicate a network connection 
error. TProtocolException seems to be the exception that is thrown in such 
cases. 
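
The suggested retry behavior can be sketched generically (illustrative 
Python; Hive's JDBC driver is Java and would catch Thrift's 
TProtocolException rather than ConnectionError — with_retries and flaky are 
hypothetical names):

```python
def with_retries(fn, retryable=(ConnectionError,), attempts=3):
    # Retry only on exceptions indicating a transient network failure;
    # any other exception propagates immediately.
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retryable:
            if attempt == attempts:
                raise

calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection reset")
    return "ok"

print(with_retries(flaky))  # prints "ok" after two retried failures
```

Keeping the retryable set narrow matters: retrying on arbitrary exceptions 
could re-execute a statement whose first attempt actually succeeded.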


- Thejas Nair


On Feb. 20, 2014, 9:18 a.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/14950/
 ---
 
 (Updated Feb. 20, 2014, 9:18 a.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Should be applied on top of:
 HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
 HIVE-5229 [Better thread management for HiveServer2 async threads]
 HIVE-5230 [Better error reporting by async threads in HiveServer2]
 HIVE-5441 [Async query execution doesn't return resultset status] 
 
 
 Diffs
 -
 
   jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 
 
 Diff: https://reviews.apache.org/r/14950/diff/
 
 
 Testing
 ---
 
 TestJdbcDriver2
 
 
 Thanks,
 
 Vaibhav Gumashta
 




[jira] [Commented] (HIVE-6475) Implement support for appending to mutable tables in HCatalog

2014-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909617#comment-13909617
 ] 

Hive QA commented on HIVE-6475:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630129/HIVE-6475.patch

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 5179 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_scriptfile1
org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable
org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask
org.apache.hive.hcatalog.mapreduce.TestHCatDynamicPartitioned.testHCatDynamicPartitionedTable
org.apache.hive.hcatalog.mapreduce.TestHCatDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask
org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatDynamicPartitionedTable
org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask
org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatExternalDynamicCustomLocation
org.apache.hive.hcatalog.mapreduce.TestHCatMutableDynamicPartitioned.testHCatDynamicPartitionedTable
org.apache.hive.hcatalog.mapreduce.TestHCatMutableDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask
org.apache.hive.hcatalog.pig.TestHCatStorerWrapper.testStoreExternalTableWithExternalDir
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1447/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1447/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630129

 Implement support for appending to mutable tables in HCatalog
 -

 Key: HIVE-6475
 URL: https://issues.apache.org/jira/browse/HIVE-6475
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6475.patch


 Part of HIVE-6405, this is the implementation of the append feature on the 
 HCatalog side. If a table is mutable, we must support appending to existing 
 data instead of erroring out as a duplicate publish.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6393) Support unqualified column references in Joining conditions

2014-02-22 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6393:


Status: Patch Available  (was: Open)

 Support unqualified column references in Joining conditions
 ---

 Key: HIVE-6393
 URL: https://issues.apache.org/jira/browse/HIVE-6393
 Project: Hive
  Issue Type: Improvement
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6393.1.patch, HIVE-6393.2.patch


 Support queries of the form:
 {noformat}
 create table r1(a int);
 create table r2(b int);
 select a, b
 from r1 join r2 on a = b
 {noformat}
 This becomes more useful in old style syntax:
 {noformat}
 select a, b
 from r1, r2
 where a = b
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6393) Support unqualified column references in Joining conditions

2014-02-22 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6393:


Status: Open  (was: Patch Available)

 Support unqualified column references in Joining conditions
 ---

 Key: HIVE-6393
 URL: https://issues.apache.org/jira/browse/HIVE-6393
 Project: Hive
  Issue Type: Improvement
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6393.1.patch, HIVE-6393.2.patch


 Support queries of the form:
 {noformat}
 create table r1(a int);
 create table r2(b int);
 select a, b
 from r1 join r2 on a = b
 {noformat}
 This becomes more useful in old style syntax:
 {noformat}
 select a, b
 from r1, r2
 where a = b
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6393) Support unqualified column references in Joining conditions

2014-02-22 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6393:


Attachment: HIVE-6393.2.patch

 Support unqualified column references in Joining conditions
 ---

 Key: HIVE-6393
 URL: https://issues.apache.org/jira/browse/HIVE-6393
 Project: Hive
  Issue Type: Improvement
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6393.1.patch, HIVE-6393.2.patch


 Support queries of the form:
 {noformat}
 create table r1(a int);
 create table r2(b int);
 select a, b
 from r1 join r2 on a = b
 {noformat}
 This becomes more useful in old style syntax:
 {noformat}
 select a, b
 from r1, r2
 where a = b
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18293: HIVE-6393: Support unqualified column references in Joining conditions

2014-02-22 Thread Harish Butani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18293/
---

(Updated Feb. 23, 2014, 3:28 a.m.)


Review request for hive, Ashutosh Chauhan and Gunther Hagleitner.


Changes
---

fix .out files


Bugs: HIVE-6393
https://issues.apache.org/jira/browse/HIVE-6393


Repository: hive-git


Description
---

Support queries of the form:

create table r1(a int);
create table r2(b int);
select a, b
from r1 join r2 on a = b

This becomes more useful in old style syntax:

select a, b
from r1, r2
where a = b


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java a01aa0e 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestQBJoinTreeApplyPredicate.java 
9e77949 
  ql/src/test/queries/clientpositive/join_cond_pushdown_unqual1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/join_cond_pushdown_unqual2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/join_cond_pushdown_unqual3.q PRE-CREATION 
  ql/src/test/queries/clientpositive/join_cond_pushdown_unqual4.q PRE-CREATION 
  ql/src/test/queries/clientpositive/subquery_unqualcolumnrefs.q PRE-CREATION 
  ql/src/test/results/clientpositive/join_cond_pushdown_unqual1.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/join_cond_pushdown_unqual2.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/join_cond_pushdown_unqual3.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/join_cond_pushdown_unqual4.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_unqualcolumnrefs.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/18293/diff/


Testing
---

added new tests
ran all existing join tests


Thanks,

Harish Butani



[jira] [Updated] (HIVE-6459) Change the precision/scale for intermediate sum result in the avg() udf

2014-02-22 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-6459:
--

Attachment: HIVE-6459.3.patch

Patch #3 again fixed one of the test failures. The other test has been flaky 
and is unrelated.

 Change the precision/scale for intermediate sum result in the avg() udf 
 ---

 Key: HIVE-6459
 URL: https://issues.apache.org/jira/browse/HIVE-6459
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-6459.1.patch, HIVE-6459.2.patch, HIVE-6459.3.patch, 
 HIVE-6459.patch


 The avg() udf, when applied to a decimal column, selects the precision/scale 
 of the intermediate sum field as (p+4, s+4), the same as the precision/scale 
 of the avg() result. However, the additional scale increase is unnecessary, 
 and data overflow may occur. The requested change is that, for the 
 intermediate sum result, the precision/scale be set to (p+10, s), which is 
 consistent with the sum() udf. The avg() result still keeps its 
 precision/scale.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6429) MapJoinKey has large memory overhead in typical cases

2014-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909683#comment-13909683
 ] 

Hive QA commented on HIVE-6429:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630468/HIVE-6429.04.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5175 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2
org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1448/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1448/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630468

 MapJoinKey has large memory overhead in typical cases
 -

 Key: HIVE-6429
 URL: https://issues.apache.org/jira/browse/HIVE-6429
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6429.01.patch, HIVE-6429.02.patch, 
 HIVE-6429.03.patch, HIVE-6429.04.patch, HIVE-6429.WIP.patch, HIVE-6429.patch


 The only thing that MJK really needs is hashCode and equals (well, and 
 construction), so there's no need to have an array of writables in there. 
 Assuming all the keys for a table have the same structure, for the common 
 case where keys are primitive types, we can store something like a byte-array 
 combination of the keys to reduce memory usage. Will probably speed up 
 compares too.
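
The byte-array key idea can be sketched as follows (illustrative Python; 
MapJoinKey itself is Java, and the two-long key format here is a hypothetical 
example):

```python
import struct

def pack_key(values, fmt=">qq"):
    # Pack fixed-width primitive key columns into one bytes object;
    # equality and hashing then work on raw bytes instead of an array
    # of per-column writable objects.
    return struct.pack(fmt, *values)

k1 = pack_key((42, 7))
k2 = pack_key((42, 7))
assert k1 == k2 and hash(k1) == hash(k2)
print(len(k1))  # 16: two 8-byte longs, with no per-column object overhead
```

Because every key of a table shares one layout, a single memcmp-style byte 
comparison replaces column-by-column equality checks.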



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)