Turns out I was using the wrong JAR to provide the base classes for LlapDaemon. 
Removing hadoop-client-* from the classpath and using hadoop-common instead 
fixed this problem.
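For anyone hitting the same thing: a minimal sketch of the kind of classpath filtering involved (the paths below are illustrative, not my actual install) — dropping the hadoop-client-* shaded jars so the daemon resolves classes from hadoop-common instead:

```shell
# Strip hadoop-client-* jars from a colon-separated classpath so the
# unshaded classes from hadoop-common are picked up instead.
strip_hadoop_client() {
  printf '%s\n' "$1" | tr ':' '\n' \
    | grep -v 'hadoop-client-' \
    | paste -sd: -
}

# Example paths only; substitute your real Hadoop lib directories.
cp="/opt/hadoop/share/hadoop/common/hadoop-common-3.2.1.jar:/opt/hadoop/share/hadoop/client/hadoop-client-api-3.2.1.jar:/opt/hadoop/share/hadoop/client/hadoop-client-runtime-3.2.1.jar"
strip_hadoop_client "$cp"
```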

From: Aaron Grubb
Sent: Monday, November 11, 2019 1:11 PM
To: user@hive.apache.org
Subject: LLAP/Protobuffers Error: Class Cannot Be Cast to Class

Hello all,

I'm running an LLAP daemon through YARN + ZK. The container for a Hive query 
begins to execute, but there's a class cast error that I don't know how to 
debug. Here are the logs:

cat syslog_dag_<container_id>
---------------------------------------------------
...
2019-11-11 17:32:02,631 [INFO] [LlapScheduler] 
|tezplugins.LlapTaskSchedulerService|: Assigned #1, 
task=TaskInfo{task=attempt_1573233179705_0050_1_00_000000_0, priority=5, 
startTime=0, containerId=null, uniqueId=0, localityDelayTimeout=0} on 
node={hostname:43033, id=d84432aa-f08f-467d-8688-c9150430f05e, 
canAcceptTask=true, st=0, ac=12, commF=false, disabled=false}, to 
container=container_222212222_0050_01_000001
2019-11-11 17:32:02,631 [INFO] [LlapScheduler] |GuaranteedTasks|: Registering 
attempt_1573233179705_0050_1_00_000000_0; false
2019-11-11 17:32:02,648 [INFO] [TaskSchedulerAppCallbackExecutor #0] 
|node.PerSourceNodeTracker|: Adding new node hostname:43033 to nodeTracker 2
2019-11-11 17:32:02,680 [INFO] [Dispatcher thread {Central}] 
|tezplugins.LlapTaskCommunicator|: CurrentDagId set to: 1, name=select 
count(device_id) from ...'impression' (Stage-1), 
queryId=root_20191111173153_2e979533-4d13-4b66-a0a5-fd7d48c07e2f
2019-11-11 17:32:02,680 [INFO] [Dispatcher thread {Central}] 
|tezplugins.LlapTaskCommunicator|: Added new known node: hostname:43033
2019-11-11 17:32:02,721 [INFO] [Dispatcher thread {Central}] 
|HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:N/A][Event:CONTAINER_LAUNCHED]: 
containerId=container_222212222_0050_01_000001, launchTime=1573493522721
2019-11-11 17:32:02,722 [INFO] [TaskCommunicator # 0] 
|impl.LlapProtocolClientImpl|: Creating protocol proxy as null
2019-11-11 17:32:02,722 [INFO] [Dispatcher thread {Central}] 
|impl.TaskAttemptImpl|: TaskAttempt: [attempt_1573233179705_0050_1_00_000000_0] 
submitted. Is using containerId: [container_222212222_0050_01_000001] on NM: 
[hostname:43033]
2019-11-11 17:32:02,723 [INFO] [Dispatcher thread {Central}] 
|HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1573233179705_0050_1][Event:TASK_ATTEMPT_STARTED]: 
vertexName=Map 1, taskAttemptId=attempt_1573233179705_0050_1_00_000000_0, 
startTime=1573493522722, containerId=container_222212222_0050_01_000001, 
nodeId=hostname:43033
2019-11-11 17:32:02,823 [INFO] [TaskCommunicator # 0] 
|tezplugins.LlapTaskCommunicator|: Failed to run task: 
attempt_1573233179705_0050_1_00_000000_0 on containerId: 
container_222212222_0050_01_000001
org.apache.hadoop.ipc.RemoteException(java.lang.ClassCastException): 
org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos$LlapDaemonProtocol$2
 cannot be cast to org.apache.hadoop.shaded.com.google.protobuf.BlockingService
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:510)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)

        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545)
        at org.apache.hadoop.ipc.Client.call(Client.java:1491)
        at org.apache.hadoop.ipc.Client.call(Client.java:1388)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
        at com.sun.proxy.$Proxy50.submitWork(Unknown Source)
        at 
org.apache.hadoop.hive.llap.impl.LlapProtocolClientImpl.submitWork(LlapProtocolClientImpl.java:81)
        at 
org.apache.hadoop.hive.llap.tez.LlapProtocolClientProxy$SubmitWorkCallable.call(LlapProtocolClientProxy.java:99)
        at 
org.apache.hadoop.hive.llap.tez.LlapProtocolClientProxy$SubmitWorkCallable.call(LlapProtocolClientProxy.java:89)
        at 
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
        at 
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
        at 
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

2019-11-11 17:32:02,828 [INFO] [Dispatcher thread {Central}] 
|HistoryEventHandler.criticalEvents|: 
[HISTORY][DAG:dag_1573233179705_0050_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 1, taskAttemptId=attempt_1573233179705_0050_1_00_000000_0, 
creationTime=1573493522614, allocationTime=1573493522672, 
startTime=1573493522722, finishTime=1573493522826, timeTaken=104, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=UNKNOWN_ERROR, 
diagnostics=org.apache.hadoop.ipc.RemoteException(java.lang.ClassCastException):
 
org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos$LlapDaemonProtocol$2
 cannot be cast to org.apache.hadoop.shaded.com.google.protobuf.BlockingService
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:510)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)
, nodeHttpAddress=http://hostname:15002, counters=Counters: 1, 
org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
2019-11-11 17:32:02,832 [INFO] [Dispatcher thread {Central}] |impl.TaskImpl|: 
Scheduling new attempt for task: task_1573233179705_0050_1_00_000000, 
currentFailedAttempts: 1, maxFailedAttempts: 4
...
---------------------------------------------------

The query then fails for good on the 4th attempt. Is this a JAR version 
mismatch, a protobuf mismatch, a classpath error, or something else? Let me 
know what other information I should provide. Any help is much appreciated!
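For context on why the cast itself fails: here is a minimal, self-contained illustration (not LLAP's actual code) of the mechanism. When a shaded jar bundles its own copy of protobuf, the same interface effectively exists under two different package names, and an implementation of one copy cannot be cast to the other:

```java
// Stands in for com.google.protobuf.BlockingService (unshaded copy).
interface BlockingService {}

// Stands in for org.apache.hadoop.shaded.com.google.protobuf.BlockingService.
class Shaded {
    interface BlockingService {}
}

public class CastDemo {
    // Implements the unshaded interface, like the generated protocol class.
    static class ProtocolImpl implements BlockingService {}

    // Returns true when casting the unshaded implementation to the shaded
    // interface throws, mirroring the ClassCastException in the daemon log.
    static boolean castFails() {
        Object service = new ProtocolImpl();
        try {
            Shaded.BlockingService s = (Shaded.BlockingService) service;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("cast fails: " + castFails());
    }
}
```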

Software versions are:

Hadoop 3.2.1
Tez 0.9.2
Hive 3.1.2
