[jira] [Created] (HIVE-21750) INSERT OVERWRITE with empty result set does not clear transactional table

2019-05-17 Thread Todd Lipcon (JIRA)
Todd Lipcon created HIVE-21750:
--

 Summary: INSERT OVERWRITE with empty result set does not clear transactional table
 Key: HIVE-21750
 URL: https://issues.apache.org/jira/browse/HIVE-21750
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 4.0.0
Reporter: Todd Lipcon


The following query:
{code}
INSERT OVERWRITE TABLE t SELECT 1 WHERE FALSE
{code}
should truncate the table by producing an empty base data directory. In fact, 
no new base directory is created, so the table is not cleared. (I saw this with 
an insert_only table; I didn't test full-ACID.)

This bug does not seem to happen with non-transactional tables.
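
One way to check for the symptom is to compare the directory names under the table location before and after the statement. The sketch below is a minimal illustration, not Hive code: the class and method names are hypothetical, and it assumes the usual transactional layout in which base directories are named `base_N` (optionally with a suffix) and delta directories start with `delta_`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical helper, not part of Hive: detects whether a new base directory appeared. */
final class BaseDirCheck {

  /** Highest N among names of the form base_N or base_N_suffix; empty if none exists. */
  static OptionalLong highestBaseId(List<String> dirNames) {
    return dirNames.stream()
        .filter(name -> name.startsWith("base_"))
        .map(name -> name.substring("base_".length()).split("_")[0])
        .filter(id -> !id.isEmpty() && id.chars().allMatch(Character::isDigit))
        .mapToLong(Long::parseLong)
        .max();
  }

  /** True if the listing taken afterwards contains a base directory newer than any seen before. */
  static boolean newBaseCreated(List<String> before, List<String> after) {
    OptionalLong oldBase = highestBaseId(before);
    OptionalLong newBase = highestBaseId(after);
    return newBase.isPresent()
        && (!oldBase.isPresent() || newBase.getAsLong() > oldBase.getAsLong());
  }

  public static void main(String[] args) {
    // Assumed example listings; the directory names are illustrative only.
    List<String> before = Arrays.asList("delta_0000001_0000001_0000");
    List<String> unchanged = Arrays.asList("delta_0000001_0000001_0000");
    List<String> withBase = Arrays.asList("delta_0000001_0000001_0000", "base_0000002");

    System.out.println("new base (unchanged listing): " + newBaseCreated(before, unchanged));
    System.out.println("new base (listing with base_0000002): " + newBaseCreated(before, withBase));
  }
}
```

If the listing is unchanged after the INSERT OVERWRITE, the check returns false, which matches the behavior described in this report.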



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21749) ACID: Provide an option to run Cleaner thread from Hive client

2019-05-17 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21749:
---

 Summary: ACID: Provide an option to run Cleaner thread from Hive client
 Key: HIVE-21749
 URL: https://issues.apache.org/jira/browse/HIVE-21749
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 4.0.0
Reporter: Vaibhav Gumashta


In some cases, it could be useful to trigger the cleaner thread manually. We 
should provide an option for that.





[jira] [Created] (HIVE-21748) HBase Operations Can Fail When Using MAPREDLOCAL

2019-05-17 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21748:
-

 Summary: HBase Operations Can Fail When Using MAPREDLOCAL
 Key: HIVE-21748
 URL: https://issues.apache.org/jira/browse/HIVE-21748
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor


https://github.com/apache/hive/blob/5634140b2beacdac20ceec8c73ff36bce5675ef8/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java#L258-L262

{code:java|title=HBaseStorageHandler.java}
if (this.configureInputJobProps) {
  LOG.info("Configuring input job properties");
  ...
  try {
    addHBaseDelegationToken(jobConf);
  } catch (IOException | MetaException e) {
    throw new IllegalStateException("Error while configuring input job properties", e);
  }
} else {
  LOG.info("Configuring output job properties");
  ...
}
{code}

What we can see here is that the HBase delegation token is only created when 
there is an input job (reading from an HBase source). For a particular stage 
of a query, if there is no HBase input, only HBase output, then the delegation 
token is not created and the stage fails.
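
The branching described above can be reduced to a toy model. The sketch below is not Hive or HBase code: every name in it is hypothetical, and the "credentials" are just strings in a set. It only illustrates why a stage that takes the output path ends up without an HBase token. The second method shows one possible direction (adding the token on both paths), which is an assumption and not a fix proposed in this issue.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model only; none of these names are Hive or HBase APIs. */
final class TokenBranchModel {
  static final String HBASE_TOKEN = "HBASE_AUTH_TOKEN";

  /** Mirrors the reported behavior: the token is added only on the input path. */
  static Set<String> configureAsReported(boolean configureInputJobProps) {
    Set<String> credentials = new HashSet<>();
    if (configureInputJobProps) {
      credentials.add(HBASE_TOKEN);
    }
    // Output path: nothing is added, so the set stays empty.
    return credentials;
  }

  /** Hypothetical alternative: add the token regardless of which path is taken. */
  static Set<String> configureOnBothPaths(boolean configureInputJobProps) {
    Set<String> credentials = new HashSet<>();
    credentials.add(HBASE_TOKEN);
    return credentials;
  }

  public static void main(String[] args) {
    System.out.println("as reported, output-only stage: " + configureAsReported(false));
    System.out.println("alternative, output-only stage: " + configureOnBothPaths(false));
  }
}
```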

{code:none|title=Error Message in HS2 Log}
2019-05-17 10:24:55,036 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-388]: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
    at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238)
    at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
    at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
    at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{code}


You can tell a job will fail because the HS2 logs report an HDFS token being 
created but no HBase token. The following is an example of a proper setup. If 
the HBASE_AUTH_TOKEN is missing, the job fails because it tries to initiate a 
Kerberos handshake, which then fails.

{code:none|title=Logging of a Proper Run}
2019-05-17 10:36:15,593 INFO  org.apache.hadoop.mapreduce.JobSubmitter: [HiveServer2-Background-Pool: Thread-455]: Submitting tokens for job: job_1557858663665_0048
2019-05-17 10:36:15,593 INFO  org.apache.hadoop.mapreduce.JobSubmitter: [HiveServer2-Background-Pool: Thread-455]: Kind: HDFS_DELEGATION_TOKEN, Service: 10.17.101.237:8020, Ident: (token for hive: HDFS_DELEGATION_TOKEN owner=hive/host-10-17-102-135.coe.cloudera@example.com, renewer=yarn, realUser=, issueDate=1558114574357, maxDate=1558719374357, sequenceNumber=75, masterKeyId=4)
2019-05-17 10:36:15,593 INFO  org.apache.hadoop.mapreduce.JobSubmitter: [HiveServer2-Background-Pool: Thread-455]: Kind: HBASE_AUTH_TOKEN, Service: 9b282733-7927-4785-92ea-dad419f6f055, Ident: (org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@b1)
2019-05-17 10:36:15,859 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: [HiveServer2-Background-Pool: Thread-455]: Submitted application application_1557858663665_0048
{code}
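
The check described above (an HDFS token is reported but no HBase token) can be automated against HS2 log lines. This is a minimal sketch with a hypothetical helper name. It assumes the `Kind: <TOKEN_KIND>,` text that the JobSubmitter lines in the log above contain, and that the caller has already selected the lines belonging to one job submission.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: scans JobSubmitter log lines for a given token kind. */
final class SubmittedTokenCheck {

  /** Returns true if any line reports a token of the given kind, e.g. "HBASE_AUTH_TOKEN". */
  static boolean hasTokenKind(List<String> logLines, String kind) {
    String marker = "Kind: " + kind + ",";
    for (String line : logLines) {
      if (line.contains(marker)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Shortened versions of the JobSubmitter lines from the proper run.
    List<String> lines = Arrays.asList(
        "JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 10.17.101.237:8020",
        "JobSubmitter: Kind: HBASE_AUTH_TOKEN, Service: 9b282733-7927-4785-92ea-dad419f6f055");

    System.out.println("HDFS token reported: " + hasTokenKind(lines, "HDFS_DELEGATION_TOKEN"));
    System.out.println("HBase token reported: " + hasTokenKind(lines, "HBASE_AUTH_TOKEN"));
  }
}
```

A submission where the first check is true and the second is false matches the failing case described in this issue.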

The error message in the local MapReduce log:

{code:none|title=Error message}
2019-05-10 07:43:24,875 WARN  [htable-pool2-t1]: security.UserGroupInformation (UserGroupInformation.java:doAs(1927)) - PriviledgedActionException as:hive (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2019-05-10 07:43:24,876 WARN  [htable-pool2-t1]: ipc.RpcClientImpl (RpcClientImpl.java:run(675)) - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2019-05-10 07:43:24,876 ERROR [htable-pool2-t1]: ipc.RpcClientImpl (RpcClientImpl.java:run(685)) - SASL authentication failed. The most likely cause is missing 

[jira] [Created] (HIVE-21747) Remove Dependency on org.cliffc.high_scale_lib.Counter

2019-05-17 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21747:
-

 Summary: Remove Dependency on org.cliffc.high_scale_lib.Counter
 Key: HIVE-21747
 URL: https://issues.apache.org/jira/browse/HIVE-21747
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor


[https://github.com/apache/hive/blob/5634140b2beacdac20ceec8c73ff36bce5675ef8/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java#L327]

 

{code:java}
  static {
    try {
      counterClass = Class.forName("org.cliffc.high_scale_lib.Counter");
    } catch (ClassNotFoundException cnfe) {
      // this dependency is removed for HBase 1.0
    }
  }
{code}

I think this _counterClass_ lookup can be removed now that Hive is firmly on 
HBase 1.0+.
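
For context, the static block above is an optional-dependency lookup: `Class.forName` either returns the class or throws `ClassNotFoundException`, which is swallowed so the field stays unset. The sketch below shows that pattern in isolation. The helper name is hypothetical and this is not the Hive code itself.

```java
/** Hypothetical illustration of an optional class lookup via reflection. */
final class OptionalClassLookup {

  /** Returns the class if it is on the classpath, or null if it is absent. */
  static Class<?> findOptional(String className) {
    try {
      return Class.forName(className);
    } catch (ClassNotFoundException cnfe) {
      // The class is not available; callers must handle a null result.
      return null;
    }
  }

  public static void main(String[] args) {
    // java.lang.String is always present; the second result depends on the classpath.
    System.out.println("java.lang.String: " + findOptional("java.lang.String"));
    System.out.println("org.cliffc.high_scale_lib.Counter: "
        + findOptional("org.cliffc.high_scale_lib.Counter"));
  }
}
```

On a classpath without the class, the lookup yields null and the program keeps running. That silence is why such a block can linger after the dependency it guards is gone.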


