[ https://issues.apache.org/jira/browse/HBASE-22769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896426#comment-16896426 ]
Noah Banholzer commented on HBASE-22769:
----------------------------------------

Possibly related to [HBASE-17989|https://issues.apache.org/jira/browse/HBASE-17989]

> Runtime Error on join (with filter) when using hbase-spark connector
> --------------------------------------------------------------------
>
>                 Key: HBASE-22769
>                 URL: https://issues.apache.org/jira/browse/HBASE-22769
>             Project: HBase
>          Issue Type: Bug
>          Components: hbase-connectors
>    Affects Versions: connector-1.0.0
>         Environment: Built using the Maven Scala plugin in IntelliJ IDEA with Maven 3.3.9. Run on an Azure HDInsight Spark cluster using YARN.
>                      Spark version: 2.4.0
>                      Scala version: 2.11.12
>                      hbase-spark version: 1.0.0
>            Reporter: Noah Banholzer
>            Priority: Blocker
>
> I am attempting to do a left outer join (though any join with a pushdown filter causes this issue) between a Spark Structured Streaming DataFrame and a DataFrame read from HBase. Running a simple Spark app that reads from a streaming source and left-outer joins with the HBase DataFrame produces the following stack trace:
>
> {code}
> 19/07/30 18:30:25 INFO DAGScheduler: ShuffleMapStage 1 (start at SparkAppTest.scala:88) failed in 3.575 s due to Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 10, wn5-edpspa.hnyo2upsdeau1bffc34wwrkgwc.ex.internal.cloudapp.net, executor 2): org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.reflect.InvocationTargetException
>     at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toFilter(ProtobufUtil.java:1609)
>     at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:1154)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:2967)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3301)
>     at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42002)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.GeneratedMethodAccessor15461.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toFilter(ProtobufUtil.java:1605)
>     ... 8 more
> Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/spark/datasources/JavaBytesEncoder$
>     at org.apache.hadoop.hbase.spark.datasources.JavaBytesEncoder.create(JavaBytesEncoder.scala)
>     at org.apache.hadoop.hbase.spark.SparkSQLPushDownFilter.parseFrom(SparkSQLPushDownFilter.java:196)
>     ... 12 more
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>     at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>     at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:359)
>     at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.handleRemoteException(ProtobufUtil.java:347)
>     at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:344)
>     at org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:242)
>     at org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:58)
>     at org.apache.hadoop.hbase.client.RegionServerCallable.call(RegionServerCallable.java:127)
>     at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:387)
>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:361)
>     at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
>     at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
>
> It appears to be attempting to reference a file called "JavaBytesEncoder$.class", resulting in a NoClassDefFoundError. Interestingly, when I unzipped the jar I found that both "JavaBytesEncoder.class" and "JavaBytesEncoder$.class" exist, but the latter is simply an empty file. However, this might just be a case of me misunderstanding how Java links classes at build time.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
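For context, the scenario the report describes can be sketched roughly as below. This is a hypothetical reproduction, not the reporter's actual app: the table name, catalog mapping, and use of the built-in `rate` streaming source are all illustrative assumptions, and it assumes the hbase-spark 1.0.0 data source (`org.apache.hadoop.hbase.spark`) with its JSON `catalog` option. The join condition is what gets serialized as a `SparkSQLPushDownFilter` and deserialized on the region server, where the `NoClassDefFoundError` for `JavaBytesEncoder$` is then raised:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch of a stream-to-HBase left outer join that would
// exercise the connector's filter pushdown path. Requires a running
// Spark cluster with HBase and the hbase-spark 1.0.0 connector jar.
object JoinRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("hbase-join-repro").getOrCreate()

    // Illustrative catalog: maps an HBase table's row key and one
    // column-family qualifier to SQL columns.
    val catalog =
      """{
        |  "table": {"namespace": "default", "name": "users"},
        |  "rowkey": "key",
        |  "columns": {
        |    "id":   {"cf": "rowkey", "col": "key",  "type": "string"},
        |    "name": {"cf": "cf1",    "col": "name", "type": "string"}
        |  }
        |}""".stripMargin

    // Static DataFrame backed by HBase via the connector.
    val hbaseDf = spark.read
      .format("org.apache.hadoop.hbase.spark")
      .option("catalog", catalog)
      .load()

    // Streaming side; the built-in rate source keeps the sketch self-contained.
    val streamDf = spark.readStream
      .format("rate")
      .load()
      .selectExpr("CAST(value AS STRING) AS id")

    // Per the report, any join whose condition can be pushed down to HBase
    // triggers the failure, not just left outer joins.
    val joined = streamDf.join(hbaseDf, Seq("id"), "left_outer")

    joined.writeStream.format("console").start().awaitTermination()
  }
}
```

The stack trace's `ProtobufUtil.toFilter` → `SparkSQLPushDownFilter.parseFrom` frames run on the region server, which is why the error surfaces only when a filter is actually pushed down: a join without a pushable condition never ships the connector's filter class to the server side.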