Hive Custom Simple Edge NullPointerException

2020-11-19 Thread Bernard Quizon
Hi.

I'm using Hive 3.1.0 (Tez Execution Engine) and I'm running into this NPE:

INFO  : Dag name: WITH event_agg AS (WITH outcome AS (SE...ASC (Stage-1)
ERROR : Failed to execute tez graph.
java.lang.NullPointerException: null
at 
org.apache.hadoop.hive.ql.exec.tez.DagUtils.setupQuickStart(DagUtils.java:1551)
~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at 
org.apache.hadoop.hive.ql.exec.tez.DagUtils.createEdge(DagUtils.java:459)
~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:487)
~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:209)
~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:210)
~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2701)
~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2372)
~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2048)
~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1746)
~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1740)
~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226)
~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at 
org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:318)
~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at java.security.AccessController.doPrivileged(Native Method) 
~[?:1.8.0_112]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
~[hadoop-common-3.1.1.3.0.1.0-187.jar:?]
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:331)
~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
~[?:1.8.0_112]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_112]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
~[?:1.8.0_112]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_112]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
~[?:1.8.0_112]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
~[?:1.8.0_112]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
ERROR : FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.tez.TezTask


Somehow the conversion from TezWork to a Tez DAG throws a NullPointerException.
Has anyone experienced a similar error?

By the way, the query is fairly complicated, with multiple LEFT JOINs and CTEs.
It works on LLAP but fails on the default HiveServer2.

Thanks!
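
A sketch for narrowing this down, assuming the two endpoints generate different plans for the same query: run EXPLAIN against both JDBC endpoints and diff the vertex/edge sections, since the NPE is thrown while wiring an edge (DagUtils.createEdge -> setupQuickStart), so the suspect edge should only appear in the plan that fails. The URLs, credentials, and stand-in query below are placeholders, not taken from this thread.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ExplainBothEndpoints {
  // Print the EXPLAIN output of a query against one HiveServer2 endpoint.
  static void explain(String url, String query) throws Exception {
    try (Connection conn = DriverManager.getConnection(url, "hive", "");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("EXPLAIN " + query)) {
      while (rs.next()) {
        System.out.println(rs.getString(1));
      }
    }
  }

  public static void main(String[] args) throws Exception {
    String query = "SELECT 1"; // placeholder for the real CTE/left-join query
    explain("jdbc:hive2://hs2-host:10000/default", query);  // default HS2
    explain("jdbc:hive2://llap-host:10000/default", query); // LLAP endpoint
  }
}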


Re: Intermittent ArrayIndexOutOfBoundsException on Hive Merge

2020-07-15 Thread Bernard Quizon
Hi, Aaron.

Thank you, your suggestion may have solved the issue: so far I haven't seen a
failure since turning off vectorization.
I don't think it's the best long-term solution, though, since turning
vectorization off has performance implications.

Thanks,
Bernard
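
Since the failing frames in the original report sit on the reduce side (ReduceRecordSource / processVectorGroup), a narrower variant of Aaron's workaround may be enough: disable only reduce-side vectorization for the session that runs the merge, keeping map-side vectorization and most of its speedup. A sketch over JDBC; the endpoint, tables, and MERGE statement are placeholders, not the actual query from this thread.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class MergeWithoutReduceVectorization {
  public static void main(String[] args) throws Exception {
    // Placeholder HiveServer2 endpoint and a stand-in MERGE statement.
    try (Connection conn = DriverManager.getConnection(
            "jdbc:hive2://hs2-host:10000/default", "hive", "");
         Statement stmt = conn.createStatement()) {
      // The stack trace points at the reduce side, so disabling only
      // reduce-side vectorization for this session may be enough.
      stmt.execute("SET hive.vectorized.execution.reduce.enabled=false");
      stmt.execute(
          "MERGE INTO dst USING src ON dst.id = src.id "
        + "WHEN MATCHED THEN UPDATE SET col1 = src.col1 "
        + "WHEN NOT MATCHED THEN INSERT VALUES (src.id, src.col1)");
    }
  }
}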

On Tue, Jul 14, 2020 at 10:06 PM Aaron Grubb  wrote:

> This is just a suggestion, but I recently ran into an issue with vectorized
> query execution and a map column type, specifically when inserting into an
> HBase table with a map-to-column-family setup. Try using “set
> hive.vectorized.execution.enabled=false;”
>
> Thanks,
>
> Aaron
>
> From: Bernard Quizon
> Sent: Tuesday, July 14, 2020 9:57 AM
> To: user@hive.apache.org
> Subject: Re: Intermittent ArrayIndexOutOfBoundsException on Hive Merge
>
>
>
> Hi.
>
> I see that this piece of code is the source of the error:
>
> final int maxSize =
>     (vectorizedTestingReducerBatchSize > 0 ?
>         Math.min(vectorizedTestingReducerBatchSize, batch.getMaxSize()) :
>         batch.getMaxSize());
> Preconditions.checkState(maxSize > 0);
> int rowIdx = 0;
> int batchBytes = keyBytes.length;
> try {
>   for (Object value : values) {
>     if (rowIdx >= maxSize ||
>         (rowIdx > 0 && batchBytes >= BATCH_BYTES)) {
>
>       // Batch is full AND we have at least 1 more row...
>       batch.size = rowIdx;
>       if (handleGroupKey) {
>         reducer.setNextVectorBatchGroupStatus(/* isLastGroupBatch */ false);
>       }
>       reducer.process(batch, tag);
>
>       // Reset just the value columns and value buffer.
>       for (int i = firstValueColumnOffset; i < batch.numCols; i++) {
>         // Note that reset also resets the data buffer for bytes column vectors.
>         batch.cols[i].reset();
>       }
>       rowIdx = 0;
>       batchBytes = keyBytes.length;
>     }
>     if (valueLazyBinaryDeserializeToRow != null) {
>       // Deserialize value into vector row columns.
>       BytesWritable valueWritable = (BytesWritable) value;
>       byte[] valueBytes = valueWritable.getBytes();
>       int valueLength = valueWritable.getLength();
>       batchBytes += valueLength;
>
>       valueLazyBinaryDeserializeToRow.setBytes(valueBytes, 0, valueLength);
>       valueLazyBinaryDeserializeToRow.deserialize(batch, rowIdx);
>     }
>     rowIdx++;
>   }
>
> `valueLazyBinaryDeserializeToRow.deserialize(batch, rowIdx)` throws an
> exception because `rowIdx` has a value of 1024; it should be 1023 at most.
>
> But it seems to me that `maxSize` will never exceed 1024, so why would
> `rowIdx` in the expression `valueLazyBinaryDeserializeToRow.deserialize(batch,
> rowIdx)` ever reach 1024 or more?
>
> Am I missing something here?
>
> Thanks,
> Bernard
>
>
>
> On Tue, Jul 14, 2020 at 5:44 PM Bernard Quizon <
> bernard.qui...@cheetahdigital.com> wrote:
>
> Hi.
>
> I'm using Hive 3.1.0 (Tez Execution Engine) and I'm running into
> intermittent errors when doing Hive Merge.
>
> Just to clarify, the Hive Merge query probably succeeds 60% of the time
> using the same source and destination tables.
>
> By the way, both the source and destination tables have columns with complex
> data types such as ARRAY and MAP.
>
>
>
> Here's the error :
>
> TaskAttempt 0 failed, info=
> » Error: Error while running task ( failure ) :
> attempt_1594345704665_28139_1_06_07_0:java.lang.RuntimeException:
> java.lang.RuntimeException:
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
> processing vector batch (tag=0) (vectorizedVertexNum 4)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
> at
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> at
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.ja

Re: Intermittent ArrayIndexOutOfBoundsException on Hive Merge

2020-07-14 Thread Bernard Quizon
Hi.

I see that this piece of code is the source of the error:

final int maxSize =
    (vectorizedTestingReducerBatchSize > 0 ?
        Math.min(vectorizedTestingReducerBatchSize, batch.getMaxSize()) :
        batch.getMaxSize());
Preconditions.checkState(maxSize > 0);
int rowIdx = 0;
int batchBytes = keyBytes.length;
try {
  for (Object value : values) {
    if (rowIdx >= maxSize ||
        (rowIdx > 0 && batchBytes >= BATCH_BYTES)) {

      // Batch is full AND we have at least 1 more row...
      batch.size = rowIdx;
      if (handleGroupKey) {
        reducer.setNextVectorBatchGroupStatus(/* isLastGroupBatch */ false);
      }
      reducer.process(batch, tag);

      // Reset just the value columns and value buffer.
      for (int i = firstValueColumnOffset; i < batch.numCols; i++) {
        // Note that reset also resets the data buffer for bytes column vectors.
        batch.cols[i].reset();
      }
      rowIdx = 0;
      batchBytes = keyBytes.length;
    }
    if (valueLazyBinaryDeserializeToRow != null) {
      // Deserialize value into vector row columns.
      BytesWritable valueWritable = (BytesWritable) value;
      byte[] valueBytes = valueWritable.getBytes();
      int valueLength = valueWritable.getLength();
      batchBytes += valueLength;

      valueLazyBinaryDeserializeToRow.setBytes(valueBytes, 0, valueLength);
      valueLazyBinaryDeserializeToRow.deserialize(batch, rowIdx);
    }
    rowIdx++;
  }


`valueLazyBinaryDeserializeToRow.deserialize(batch, rowIdx)` throws an
exception because `rowIdx` has a value of 1024; it should be 1023 at most.
But it seems to me that `maxSize` will never exceed 1024, so why would
`rowIdx` in the expression `valueLazyBinaryDeserializeToRow.deserialize(batch,
rowIdx)` ever reach 1024 or more?
Am I missing something here?

Thanks,
Bernard
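
To make the invariant in that reasoning concrete, here is a minimal, self-contained model of the quoted loop (the same control flow, not the actual Hive code): the flush check at the top of the loop means the deserialize step can only ever see rowIdx values in [0, maxSize - 1], i.e. at most 1023 with the default batch size.

// Stand-in for the Hive loop quoted above: rows are appended at rowIdx and
// the batch is flushed once rowIdx reaches maxSize, so the "deserialize"
// step can only ever observe rowIdx in [0, maxSize - 1].
public class BatchFlushModel {
  static final int MAX_SIZE = 1024;            // batch.getMaxSize() default
  static final int BATCH_BYTES = 8 * 1024 * 1024;

  public static void main(String[] args) {
    final int keyLength = 16;                  // keyBytes.length stand-in
    int rowIdx = 0;
    int batchBytes = keyLength;
    for (int value = 0; value < 5_000; value++) {
      if (rowIdx >= MAX_SIZE || (rowIdx > 0 && batchBytes >= BATCH_BYTES)) {
        // flush: reducer.process(batch, tag) in the real code
        rowIdx = 0;
        batchBytes = keyLength;
      }
      if (rowIdx >= MAX_SIZE) {                // mirrors the failing index check
        throw new IllegalStateException("deserialize at rowIdx=" + rowIdx);
      }
      batchBytes += 100;                       // valueLength stand-in
      rowIdx++;                                // "deserialize" happened at rowIdx
    }
    System.out.println("rowIdx stayed below " + MAX_SIZE + " for every row");
  }
}

If the model holds, one plausible (unverified) reading of the ArrayIndexOutOfBoundsException is that the out-of-range index inside BytesColumnVector.setVal belongs to a child vector of the MAP column, which can hold more entries than the batch has rows, rather than to rowIdx itself.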

On Tue, Jul 14, 2020 at 5:44 PM Bernard Quizon <
bernard.qui...@cheetahdigital.com> wrote:

> Hi.
>
> I'm using Hive 3.1.0 (Tez Execution Engine) and I'm running into
> intermittent errors when doing Hive Merge.
>
> Just to clarify, the Hive Merge query probably succeeds 60% of the time
> using the same source and destination tables.
>
> By the way, both the source and destination tables have columns with complex
> data types such as ARRAY and MAP.
>
>
> Here's the error :
>
> TaskAttempt 0 failed, info=
> » Error: Error while running task ( failure ) :
> attempt_1594345704665_28139_1_06_07_0:java.lang.RuntimeException:
> java.lang.RuntimeException:
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
> processing vector batch (tag=0) (vectorizedVertexNum 4)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
> at
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> at
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
> at
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
> at
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException:
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
> processing vector batch (tag=0) (vectorizedVertexNum 4)
> at
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:396)
> at
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:249)
> at
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:318)
> at
> org.apache.hadoop.hive.ql

Intermittent ArrayIndexOutOfBoundsException on Hive Merge

2020-07-14 Thread Bernard Quizon
Hi.

I'm using Hive 3.1.0 (Tez Execution Engine) and I'm running into
intermittent errors when doing Hive Merge.

Just to clarify, the Hive Merge query probably succeeds 60% of the time
using the same source and destination tables.

By the way, both the source and destination tables have columns with complex
data types such as ARRAY and MAP.


Here's the error :

TaskAttempt 0 failed, info=
» Error: Error while running task ( failure ) :
attempt_1594345704665_28139_1_06_07_0:java.lang.RuntimeException:
java.lang.RuntimeException:
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
processing vector batch (tag=0) (vectorizedVertexNum 4)
at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException:
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
processing vector batch (tag=0) (vectorizedVertexNum 4)
at
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:396)
at
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:249)
at
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:318)
at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime
Error while processing vector batch (tag=0) (vectorizedVertexNum 4)
at
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:493)
at
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:387)
... 19 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
at
org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(BytesColumnVector.java:187)
at
org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storePrimitiveRowColumn(VectorDeserializeRow.java:588)
at
org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeComplexFieldRowColumn(VectorDeserializeRow.java:778)
at
org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeMapRowColumn(VectorDeserializeRow.java:855)
at
org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeRowColumn(VectorDeserializeRow.java:941)
at
org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(VectorDeserializeRow.java:1360)
at
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:470)
... 20 more

Would someone know a workaround for this?

Thanks,
Bernard


Error communicating with the metastore | org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset

2020-02-12 Thread Bernard Quizon
Hi, everyone.

We're getting this error on Hive 3.1.0:

org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with
the metastore
at
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:714)
at org.apache.hadoop.hive.ql.Driver.recordValidWriteIds(Driver.java:1463)
at org.apache.hadoop.hive.ql.Driver.acquireLocks(Driver.java:1653)
at org.apache.hadoop.hive.ql.Driver.lockAndRespond(Driver.java:1832)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2002)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1746)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1740)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
at
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226)
at
org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
at
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:318)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:331)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException:
java.net.SocketException: Connection reset
at
org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
at
org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:202)
at
org.apache.hadoop.hive.metastore.api.GetValidWriteIdsRequest$GetValidWriteIdsRequestStandardScheme.write(GetValidWriteIdsRequest.java:489)
at
org.apache.hadoop.hive.metastore.api.GetValidWriteIdsRequest$GetValidWriteIdsRequestStandardScheme.write(GetValidWriteIdsRequest.java:424)
at
org.apache.hadoop.hive.metastore.api.GetValidWriteIdsRequest.write(GetValidWriteIdsRequest.java:362)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args$get_valid_write_ids_argsStandardScheme.write(ThriftHiveMetastore.java)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args$get_valid_write_ids_argsStandardScheme.write(ThriftHiveMetastore.java)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args.write(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:71)
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_valid_write_ids(ThriftHiveMetastore.java:5443)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_valid_write_ids(ThriftHiveMetastore.java:5435)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getValidWriteIds(HiveMetaStoreClient.java:2589)
at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
at com.sun.proxy.$Proxy37.getValidWriteIds(Unknown Source)
at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2934)
at com.sun.proxy.$Proxy37.getValidWriteIds(Unknown Source)
at
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:712)
... 21 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at
org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
... 44 more

It looks like the server resets the connection while the Thrift client is
writing the request to the OutputStream.
Is there any workaround for this? Or has anyone else experienced it?
This only started showing up recently. Thank you in advance.

Regards,
Bernard
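
A sketch of settings worth trying while the root cause is unknown, assuming the resets are transient (an idle-connection timeout or a network device between HiveServer2 and the metastore): the metastore client is already wrapped in RetryingMetaStoreClient, and its retry and timeout behaviour is configurable. These are ordinary hive-site.xml properties, shown programmatically only for compactness.

import org.apache.hadoop.hive.conf.HiveConf;

public class MetastoreClientRetries {
  public static void main(String[] args) {
    HiveConf conf = new HiveConf();
    // How many times RetryingMetaStoreClient retries a failed call.
    conf.set("hive.metastore.failure.retries", "3");
    // Delay between reconnect attempts.
    conf.set("hive.metastore.client.connect.retry.delay", "5s");
    // Thrift socket timeout; resets can show up when this is shorter than
    // an intermediary's idle timeout.
    conf.set("hive.metastore.client.socket.timeout", "600s");
    System.out.println(conf.get("hive.metastore.failure.retries"));
  }
}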


Re: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.

2020-02-11 Thread Bernard Quizon
Hi.

We fixed the issue by patching protobuf-java-2.5.0.jar: we changed
CodedInputStream.DEFAULT_SIZE_LIMIT to 1 GB.
We uploaded the patched jar to our servers and set
*tez.cluster.additional.classpath.prefix* (in tez-site.xml) to
/path/to/patched/protobuf-java-2.5.0.jar:.
Please note that it must be the first entry on
*tez.cluster.additional.classpath.prefix*.
Apparently, Tez was using protobuf's default 64 MB message limit.

BTW, in the latest protobuf versions the limit is set to Integer.MAX_VALUE. See
https://github.com/protocolbuffers/protobuf/blob/v3.11.3/java/core/src/main/java/com/google/protobuf/CodedInputStream.java#L62-L65.
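
For reference, the knob the exception message itself points at (CodedInputStream.setSizeLimit) only works at a call site that owns the stream, which the Tez path through DAGProtos.parseFrom does not expose; hence the jar patch above. A minimal sketch of the per-stream knob for code one does control:

import com.google.protobuf.CodedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;

public class LargeMessageLimit {
  // Raise protobuf's per-stream size limit (64 MB by default in
  // protobuf-java 2.5.0) before parsing a potentially huge message.
  static CodedInputStream openWithOneGbLimit(InputStream in) {
    CodedInputStream cis = CodedInputStream.newInstance(in);
    cis.setSizeLimit(1024 * 1024 * 1024); // mirrors the patched DEFAULT_SIZE_LIMIT
    return cis;
  }

  public static void main(String[] args) {
    CodedInputStream cis =
        openWithOneGbLimit(new ByteArrayInputStream(new byte[0]));
    // setSizeLimit returns the previous limit, so calling it again shows
    // that the 1 GB limit took effect.
    System.out.println("limit now: " + cis.setSizeLimit(1 << 30));
  }
}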

Regards,
Bernard

On Mon, Feb 10, 2020 at 8:23 PM Bernard Quizon <
bernard.qui...@cheetahdigital.com> wrote:

> Hi.
>
> We're using Hive 3.0.1 and we're currently experiencing this issue:
>
> Error while processing statement: FAILED: Execution Error, return code 2
> from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed,
> vertexName=Map 1, vertexId=vertex_1581309524541_0094_14_00,
> diagnostics=[Vertex vertex_1581309524541_0094_14_00 [Map 1] killed/failed
> due to:INIT_FAILURE, Fail to create InputInitializerManager,
> org.apache.tez.dag.api.TezReflectionException: Unable to instantiate class
> with 1 arguments: org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator
> at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:71)
> at org.apache.tez.common.ReflectionUtils.createClazzInstance(ReflectionUtils.java:89)
> at org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:152)
> at org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:148)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.tez.dag.app.dag.RootInputInitializerManager.createInitializer(RootInputInitializerManager.java:148)
> at org.apache.tez.dag.app.dag.RootInputInitializerManager.runInputInitializers(RootInputInitializerManager.java:121)
> at org.apache.tez.dag.app.dag.impl.VertexImpl.setupInputInitializerManager(VertexImpl.java:4122)
> at org.apache.tez.dag.app.dag.impl.VertexImpl.access$3100(VertexImpl.java:207)
> at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.handleInitEvent(VertexImpl.java:2932)
> at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2879)
> at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2861)
> at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:59)
> at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1957)
> at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:206)
> at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2317)
> at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2303)
> at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:180)
> at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:115)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:68)
> ... 25 more
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol
> message was too large.  May be malicious.  Use
> CodedInputStream.setSizeLimit() to increase the size limit.
> at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
> at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
> at com.go

com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.

2020-02-10 Thread Bernard Quizon
Hi.

We're using Hive 3.0.1 and we're currently experiencing this issue:

Error while processing statement: FAILED: Execution Error, return code 2
from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed,
vertexName=Map 1, vertexId=vertex_1581309524541_0094_14_00,
diagnostics=[Vertex vertex_1581309524541_0094_14_00 [Map 1] killed/failed
due to:INIT_FAILURE, Fail to create InputInitializerManager,
org.apache.tez.dag.api.TezReflectionException: Unable to instantiate class
with 1 arguments: org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator at
org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:71)
at
org.apache.tez.common.ReflectionUtils.createClazzInstance(ReflectionUtils.java:89)
at
org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:152)
at
org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:148)
at java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:422) at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at
org.apache.tez.dag.app.dag.RootInputInitializerManager.createInitializer(RootInputInitializerManager.java:148)
at
org.apache.tez.dag.app.dag.RootInputInitializerManager.runInputInitializers(RootInputInitializerManager.java:121)
at
org.apache.tez.dag.app.dag.impl.VertexImpl.setupInputInitializerManager(VertexImpl.java:4122)
at
org.apache.tez.dag.app.dag.impl.VertexImpl.access$3100(VertexImpl.java:207)
at
org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.handleInitEvent(VertexImpl.java:2932)
at
org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2879)
at
org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2861)
at
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at
org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
at
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
at
org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:59)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1957)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:206)
at
org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2317)
at
org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2303)
at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:180)
at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:115) at
java.lang.Thread.run(Thread.java:745)
Caused by:
java.lang.reflect.InvocationTargetException at
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at
org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:68)
... 25 more
Caused by: com.google.protobuf.InvalidProtocolBufferException:
Protocol message was too large.  May be malicious.  Use
CodedInputStream.setSizeLimit() to increase the size limit. at
com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
at
com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
at com.google.protobuf.CodedInputStream.isAtEnd(CodedInputStream.java:701)
at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:99)
at
org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto.<init>(DAGProtos.java:19294)
at
org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto.<init>(DAGProtos.java:19258)
at
org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto$1.parsePartialFrom(DAGProtos.java:19360)
at
org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto$1.parsePartialFrom(DAGProtos.java:19355)
at
com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217) at
com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223) at
com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) at
org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto.parseFrom(DAGProtos.java:19552)
at
org.apache.tez.common.TezUtils.createConfFromByteString(TezUtils.java:116)
at
org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.<init>(HiveSplitGenerator.java:130)
... 30 more]
Vertex killed, vertexName=Reducer 2,
vertexId=vertex_1581309524541_0094_14_01, diagnostics=[Vertex received Kill
in NEW state., Vertex vertex_1581309

Hive on Tez vs LLAP Count difference

2019-07-16 Thread Bernard Quizon
Hi.

We've encountered an issue where counts on transactional tables sometimes
differ between Hive on Tez and LLAP.
We're using Hive 3.1.0, by the way; maybe there's already a fix or a
workaround for this. Thanks in advance.

-Bernard


Re: Generic UDF with Map Return Type -> Class Not Found Error on JOINs

2019-07-04 Thread Bernard Quizon
Hi.

Just an update: it works when I use the default HiveServer2 JDBC URL.
The error only occurs when I use LLAP.

Regards,
Bernard
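
A guess at the mechanism, since the failure is LLAP-only: LLAP daemons are long-lived JVMs, so a jar added per session via ADD JAR is visible to HiveServer2 but not necessarily to the daemons deserializing the plan, which would match a Kryo class-resolution failure on the genericUDF field. The usual suggestion is to register the UDF as a permanent function backed by a jar on HDFS; a sketch follows, where the class name and jar path are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class RegisterPermanentUdf {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection(
            "jdbc:hive2://llap-hs2-host:10000/default", "hive", "");
         Statement stmt = conn.createStatement()) {
      // Permanent function: the metadata lives in the metastore and the jar
      // is localized from HDFS, so it does not depend on session state.
      stmt.execute(
          "CREATE FUNCTION cust100.map_merge AS 'com.example.udf.MapMerge' "
        + "USING JAR 'hdfs:///apps/udfs/map-merge.jar'");
    }
  }
}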

On Fri, Jul 5, 2019 at 10:40 AM Bernard Quizon <
bernard.qui...@cheetahdigital.com> wrote:

> Hi.
>
> So I created a GenericUDF that returns a map, and it works fine in simple
> SELECT statements.
> For example:
> SELECT member_id, map_merge(src_map, dest_map, array('key1')) from
> test_table limit 100;
>
> But it returns an error when I use it in JOINs, for example:
>
> SELECT
> cust100.map_merge(e.map_1, t.map_1, array('key1'))
> FROM test_table t
> INNER JOIN ext_test_table e
> ON t.id = e.id
>
> Please see stack trace below:
>
> Serialization trace:
>
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
>
> colExprMap (org.apache.hadoop.hive.ql.plan.SelectDesc)
>
> conf (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
>
> childOperators
> (org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerStringOperator)
>
> childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
>
> childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator)
>
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
>
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:156)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
>
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
>
> at
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:185)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
>
> at
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:180)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:161)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:39)
>
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
>
> at
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:218)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
>
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
>
> at
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:218)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
>
> at
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:180)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
>
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
>
> at
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:218)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
>
> at
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:180)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
>
> at
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
>
> at org.apache.hiv

Generic UDF with Map Return Type -> Class Not Found Error on JOINs

2019-07-04 Thread Bernard Quizon
Hi.

So I created a GenericUDF that returns a map, and it works fine in simple
SELECT statements.
For example:
SELECT member_id, map_merge(src_map, dest_map, array('key1')) from
test_table limit 100;

But it returns an error when I use it in JOINs, for example:

SELECT
cust100.map_merge(e.map_1, t.map_1, array('key1'))
FROM test_table t
INNER JOIN ext_test_table e
ON t.id = e.id

Please see stack trace below:

Serialization trace:

genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)

colExprMap (org.apache.hadoop.hive.ql.plan.SelectDesc)

conf (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)

childOperators
(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerStringOperator)

childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)

childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator)

childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)

aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)

at
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:156)

at
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)

at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)

at
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:185)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)

at
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)

at
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:180)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:161)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:39)

at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)

at
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:218)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)

at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)

at
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:218)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)

at
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)

at
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:180)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)

at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)

at
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:218)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)

at
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)

at
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:180)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)

at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)

at
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:218)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)

at
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)

at
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:180)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)

at
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)

at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708

Embedded Hive Server Error on Table Creation

2019-05-08 Thread Bernard Quizon
Hi,

So I'm using Hive 3.1.0 and I'm trying to get an embedded HiveServer2
instance to work but I'm running into an issue.

Here's my sample code:

System.setProperty("hive.execution.engine", "tez")
System.setProperty("javax.jdo.option.ConnectionURL",
"jdbc:derby:;databaseName=/tmp/hive/metastore_db;create=true")
System.setProperty("hive.metastore.warehouse.dir",
"file:tmp/hive/warehouse")
System.setProperty("hive.metastore.metadb.dir",
"file:///tmp/hive/metastore_db")
System.setProperty("hive.metastore.local", "true")
System.setProperty("fs.defaultFS", "file:///")

val config = new HiveConf()
hiveServer2 = new HiveServer2()
hiveServer2.init(config)
hiveServer2.start()
Thread.sleep(5000)

Try(Class.forName("org.apache.hive.jdbc.HiveDriver"))

val hiveConnection = DriverManager.getConnection(s"jdbc:hive2:///", "", "")
val stmt = hiveConnection.createStatement
stmt.execute(s"CREATE DATABASE IF NOT EXISTS tmp")
stmt.execute(
  s"""|CREATE TABLE IF NOT EXISTS tmp.test_table(
      |  col1 STRING,
      |  col2 STRING
      |)
      |COMMENT 'test'
      |STORED AS ORC
      |TBLPROPERTIES ('transactional'='true')""".stripMargin
)

Here's the actual error I'm getting:

2019-05-08 19:02:55,147 WARN  [main] thrift.ThriftCLIService
(ThriftCLIService.java:ExecuteStatement(571)) - Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error while compiling
statement: FAILED: IllegalStateException Unexpected Exception thrown:
Unable to fetch table test_table. Exception thrown when executing query :
SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MTable' AS
NUCLEUS_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.OWNER,A0.OWNER_TYPE,A0.RETENTION,A0.IS_REWRITE_ENABLED,A0.TBL_NAME,A0.TBL_TYPE,A0.WRITE_ID,A0.TBL_ID
FROM TBLS A0 LEFT OUTER JOIN DBS B0 ON A0.DB_ID = B0.DB_ID WHERE
A0.TBL_NAME = ? AND B0."NAME" = ? AND B0.CTLG_NAME = ?
at
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:335)
at
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:199)
at
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:262)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)
at
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:541)
at
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:527)
at
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315)
at
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:562)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1572)
at com.sun.proxy.$Proxy22.ExecuteStatement(Unknown Source)
at
org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:323)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:265)
at
com.cheetahdigital.hive.HiveEmbeddedServer$.main(HiveEmbeddedServer.scala:54)
at com.cheetahdigital.hive.HiveEmbeddedServer.main(HiveEmbeddedServer.scala)

It throws an exception whenever I create a table; creating a database works
fine.

Thanks,
Bernard