[ https://issues.apache.org/jira/browse/SPARK-21139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053911#comment-16053911 ]
Sean Owen commented on SPARK-21139:
-----------------------------------

That looks like an issue from the HBase client, not Spark.

> java.util.concurrent.RejectedExecutionException: rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]
> ------------------------------------------------------------------------
>
>                 Key: SPARK-21139
>                 URL: https://issues.apache.org/jira/browse/SPARK-21139
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 1.5.2
>         Environment: Spark 1.5.2 and HBase 1.1.2
>            Reporter: shining
>
> We create two tables using the Hive HBaseStorageHandler, like:
> CREATE EXTERNAL TABLE `yx_bw`(
>   `rowkey` string,
>   `occur_time` string,
>   `milli_second` string,
>   `yx_id` string,
>   `resp_area` string,
>   `st_id` string,
>   `bay_id` string,
>   `device_type_id` string,
>   `content` string,
>   ......)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.hbase.HBaseSerDe'
> STORED BY
>   'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
>   'hbase.columns.mapping'=':key,f:OCCUR_TIME,f:MILLI_SECOND,f:YX_ID,f:RESP_AREA,f:ST_ID,f:BAY_ID,f:STATUS,f:CONTENT,f:VLTY_ID,f:MEAS_TYPE,f:RESTRAIN_FLAG,f:RESERV_INT1,f:RESERV_INT2,f:CUSTOMIZED_GROUP,f:CONFIRM_STATUS,f:CONFIRM_TIME,f:CONFIRM_USER_ID,f:CONFIRM_NODE_ID,f:IF_DISPLAY',
>   'serialization.format'='1')
> TBLPROPERTIES (
>   'hbase.table.name'='yx_bw')
>
> Then we use Spark SQL to run a join between the two tables:
> select * from xxgljxb a, yx_bw b where a.YX_ID = b.YX_ID;
>
> When scanning the HBase table, we encounter this error:
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 1.0 failed 1 times, most recent failure: Lost task 2.0 in stage 1.0 (TID 3, localhost): java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@37b2d978 rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]
>     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:208)
>     at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
>     at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403)
>     at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
>     at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:205)
>     at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:147)
>     at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$1.nextKeyValue(TableInputFormatBase.java:216)
>     at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat$1.next(HiveHBaseTableInputFormat.java:156)
>     at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat$1.next(HiveHBaseTableInputFormat.java:114)
>     at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:248)
>     at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:216)
>     at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>     at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>     at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>     at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:118)
>     at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>     at org.apache.spark.scheduler.Task.run(Task.scala:88)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@37b2d978 rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]
>     at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>     at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>     at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>     at org.apache.hadoop.hbase.client.ResultBoundedCompletionService.submit(ResultBoundedCompletionService.java:142)
>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.addCallsForCurrentReplica(ScannerCallableWithReplicas.java:269)
>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:165)
>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
>     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>     ... 27 more
> Driver stacktrace:
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>     at scala.Option.foreach(Option.scala:236)
>     at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>     at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1850)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
>     at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:909)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
>     at org.apache.spark.rdd.RDD.collect(RDD.scala:908)
>     at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:405)
>     at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
>     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>     at py4j.Gateway.invoke(Gateway.java:259)
>     at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>     at py4j.commands.CallCommand.execute(CallCommand.java:79)
>     at py4j.GatewayConnection.run(GatewayConnection.java:207)
>     at java.lang.Thread.run(Thread.java:745)
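
For readers unfamiliar with the "Caused by" part of the trace: the rejection comes from the JDK's default AbortPolicy, which throws whenever a task is handed to a ThreadPoolExecutor that has already been shut down. The sketch below reproduces only that JDK behaviour in isolation. It is not HBase or Spark code, the class and method names (TerminatedPoolDemo, submitAfterShutdownIsRejected) are made up for illustration, and it does not show why the HBase client's pool was terminated in the reported job.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not taken from HBase or Spark.
public class TerminatedPoolDemo {

    /** Returns true if a task submitted after shutdown is rejected. */
    static boolean submitAfterShutdownIsRejected() {
        // Executors.newFixedThreadPool builds a ThreadPoolExecutor with the default AbortPolicy.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> { });   // accepted: the pool is still running
        pool.shutdown();          // no new tasks are accepted from here on
        try {
            // Wait for the pool to terminate so the message reports the Terminated state.
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            pool.submit(() -> { });   // AbortPolicy throws here
            return false;
        } catch (RejectedExecutionException e) {
            // The message has the same shape as the one in the report:
            // "Task ... rejected from java.util.concurrent.ThreadPoolExecutor@...[Terminated, ...]"
            System.out.println(e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected = " + submitAfterShutdownIsRejected());
    }
}
```

In the reported trace the submit call is made by ResultBoundedCompletionService inside the HBase client, which is consistent with the comment above that the failure originates there rather than in Spark.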