[ https://issues.apache.org/jira/browse/SPARK-21139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-21139.
----------------------------------
    Resolution: Invalid

I am resolving this as there seems to be no argument against the above. Please reopen 
if there are more symptoms indicating that this is a Spark issue. It does not 
look like a Spark issue to me either.
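
For context, this exception comes straight from the JDK: java.util.concurrent.ThreadPoolExecutor's default AbortPolicy throws RejectedExecutionException whenever a task is submitted after the pool has been shut down, which is what the HBase client hits here once its scanner pool is already terminated. A minimal Scala sketch of that JDK behaviour (illustrative only; the object and variable names are made up):

import java.util.concurrent.{Executors, RejectedExecutionException, TimeUnit}

object RejectedSubmitDemo {
  def main(args: Array[String]): Unit = {
    val pool = Executors.newFixedThreadPool(1)
    pool.shutdown()                            // pool transitions towards Terminated
    pool.awaitTermination(1, TimeUnit.SECONDS)
    try {
      // The default AbortPolicy rejects any task submitted after shutdown,
      // producing the same RejectedExecutionException seen in this report.
      pool.submit(new Runnable { def run(): Unit = () })
    } catch {
      case e: RejectedExecutionException => println(s"rejected as expected: $e")
    }
  }
}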

> java.util.concurrent.RejectedExecutionException: rejected from 
> java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, 
> active threads = 0, queued tasks = 0, completed tasks = 14109]
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21139
>                 URL: https://issues.apache.org/jira/browse/SPARK-21139
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 1.5.2
>            Environment: Spark 1.5.2 and HBase 1.1.2
>            Reporter: shining
>
> We create two tables using the Hive HBaseStorageHandler, for example:
> CREATE EXTERNAL TABLE `yx_bw`(
>   `rowkey` string, 
>   `occur_time` string, 
>   `milli_second` string, 
>   `yx_id` string , 
>   `resp_area` string , 
>   `st_id` string, 
>   `bay_id` string, 
>   `device_type_id` string, 
>   `content` string, 
>     ......)
> ROW FORMAT SERDE 
>   'org.apache.hadoop.hive.hbase.HBaseSerDe' 
> STORED BY 
>   'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ( 
> 'hbase.columns.mapping'=':key,f:OCCUR_TIME,f:MILLI_SECOND,f:YX_ID,f:RESP_AREA,f:ST_ID,f:BAY_ID,f:STATUS,f:CONTENT,f:VLTY_ID,f:MEAS_TYPE,f:RESTRAIN_FLAG,f:RESERV_INT1,f:RESERV_INT2,f:CUSTOMIZED_GROUP,f:CONFIRM_STATUS,f:CONFIRM_TIME,f:CONFIRM_USER_ID,f:CONFIRM_NODE_ID,f:IF_DISPLAY',
>    'serialization.format'='1')
> TBLPROPERTIES (
>   'hbase.table.name'='yx_bw')
> Then we use SparkSQL to run a join between the two tables:
> select * from xxgljxb a, yx_bw b where a.YX_ID = b.YX_ID; 
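> A driver-side sketch of how such a join can be submitted (a minimal example, assuming a Spark 1.5.x application with Hive support; the stack trace below suggests a PySpark driver was used, but the scan path is the same either way):
>
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.sql.hive.HiveContext
>
> object HBaseJoinDemo {
>   def main(args: Array[String]): Unit = {
>     val sc = new SparkContext(new SparkConf().setAppName("hbase-join"))
>     val sqlContext = new HiveContext(sc)
>     // The HBase-backed table is scanned through Hive's
>     // HiveHBaseTableInputFormat, which is where the rejection surfaces.
>     val result = sqlContext.sql(
>       "select * from xxgljxb a, yx_bw b where a.YX_ID = b.YX_ID")
>     result.collect().foreach(println)
>   }
> }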
> When the HBase table is scanned, we encounter the following issue:
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in 
> stage 1.0 failed 1 times, most recent failure: Lost task 2.0 in stage 1.0 
> (TID 3, localhost): java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@37b2d978
>  rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]
>       at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:208)
>       at 
> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
>       at 
> org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403)
>       at 
> org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
>       at 
> org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:205)
>       at 
> org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:147)
>       at 
> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$1.nextKeyValue(TableInputFormatBase.java:216)
>       at 
> org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat$1.next(HiveHBaseTableInputFormat.java:156)
>       at 
> org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat$1.next(HiveHBaseTableInputFormat.java:114)
>       at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:248)
>       at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:216)
>       at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>       at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>       at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>       at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>       at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
>       at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>       at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>       at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>       at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:118)
>       at 
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
>       at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>       at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>       at org.apache.spark.scheduler.Task.run(Task.scala:88)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@37b2d978
>  rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]
>       at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>       at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>       at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>       at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService.submit(ResultBoundedCompletionService.java:142)
>       at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.addCallsForCurrentReplica(ScannerCallableWithReplicas.java:269)
>       at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:165)
>       at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
>       at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>       ... 27 more
> Driver stacktrace:
>       at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
>       at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>       at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>       at 
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>       at scala.Option.foreach(Option.scala:236)
>       at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
>       at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>       at 
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
>       at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
>       at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
>       at org.apache.spark.SparkContext.runJob(SparkContext.scala:1850)
>       at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
>       at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:909)
>       at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>       at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>       at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
>       at org.apache.spark.rdd.RDD.collect(RDD.scala:908)
>       at 
> org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:405)
>       at 
> org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
>       at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>       at py4j.Gateway.invoke(Gateway.java:259)
>       at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>       at py4j.commands.CallCommand.execute(CallCommand.java:79)
>       at py4j.GatewayConnection.run(GatewayConnection.java:207)
>       at java.lang.Thread.run(Thread.java:745)


