To understand the issue, we need more detail about your case: which version of Spark are you using, and what does your job actually do? Also, what happens if you use the Scala interfaces directly instead of the Python ones?
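For example, something like the rough sketch below would let you compare the Scala API against PySpark and also try the earlier suggestion of using more partitions. This is only an illustration: it assumes Spark 1.6.x with a SQLContext, a Parquet source at a placeholder path, and an arbitrary partition count of 200, none of which we know from your mail.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    // Minimal local driver; adjust the app name and master to your setup.
    val conf = new SparkConf().setAppName("gc-overhead-check").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Placeholder path and format; point this at your ~5 GB dataset.
    val df = sqlContext.read.parquet("/path/to/data.parquet")

    // More partitions means smaller tasks, which usually eases GC pressure
    // when everything runs inside a single local JVM.
    val result = df.repartition(200).head(10)

Note that spark.driver.memory cannot be changed from inside a running application; in local mode it has to be set in spark-defaults.conf or via the --driver-memory flag of spark-submit before the JVM starts. If the same pipeline behaves well through the Scala API but not through PySpark, that would point at the Python side (serialization and spark.python.worker.memory) rather than the JVM heap alone.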
On Mon, May 16, 2016 at 11:56 PM, Aleksandr Modestov <aleksandrmodes...@gmail.com> wrote:
> Hi,
>
> "Why did you think you had enough memory for your task? Did you check the task
> statistics in your Web UI?" I mean that I have only about 5 GB of data, but
> spark.driver.memory is 60 GB. I did check the task statistics in the web UI.
> But Spark really says:
> *"05-16 17:50:06.254 127.0.0.1:54321 1534 #e Thread WARN: Swapping! GC CALLBACK,
> (K/V:29.74 GB + POJO:18.97 GB + FREE:8.79 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM!
> Exception in thread "Heartbeat" java.lang.OutOfMemoryError: Java heap space"*
> But why doesn't Spark spill data to disk?
>
> On Mon, May 16, 2016 at 5:11 PM, Takeshi Yamamuro <linguin....@gmail.com> wrote:
>
>> Hi,
>>
>> Why did you think you had enough memory for your task? Did you check the task
>> statistics in your Web UI?
>> Anyway, if you get stuck with the GC issue, you'd be better off increasing
>> the number of partitions.
>>
>> // maropu
>>
>> On Mon, May 16, 2016 at 10:00 PM, AlexModestov <aleksandrmodes...@gmail.com> wrote:
>>
>>> I get this error in Apache Spark...
>>>
>>> "spark.driver.memory 60g
>>> spark.python.worker.memory 60g
>>> spark.master local[*]"
>>>
>>> The amount of data is about 5 GB, but Spark says "GC overhead limit
>>> exceeded". I assume that my conf file gives it enough resources.
>>>
>>> "16/05/16 15:13:02 WARN NettyRpcEndpointRef: Error sending message
>>> [message = Heartbeat(driver,[Lscala.Tuple2;@87576f9,BlockManagerId(driver, localhost, 59407))] in 1 attempts
>>> org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
>>> at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
>>> at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
>>> at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
>>> at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
>>> at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
>>> at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
>>> at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:449)
>>> at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:470)
>>> at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
>>> at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
>>> at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
>>> at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:470)
>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>>> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>>> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> at
java.lang.Thread.run(Thread.java:745) >>> Caused by: java.util.concurrent.TimeoutException: Futures timed out after >>> [10 seconds] >>> at >>> scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219) >>> at >>> scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223) >>> at >>> scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107) >>> at >>> >>> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53) >>> at scala.concurrent.Await$.result(package.scala:107) >>> at >>> org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75) >>> ... 14 more >>> 16/05/16 15:13:02 WARN NettyRpcEnv: Ignored message: >>> HeartbeatResponse(false) >>> 05-16 15:13:26.398 127.0.0.1:54321 2059 #e Thread WARN: >>> Swapping! >>> GC CALLBACK, (K/V:29.74 GB + POJO:16.74 GB + FREE:11.03 GB == >>> MEM_MAX:57.50 >>> GB), desiredKV=7.19 GB OOM! >>> 05-16 15:13:44.528 127.0.0.1:54321 2059 #e Thread WARN: >>> Swapping! >>> GC CALLBACK, (K/V:29.74 GB + POJO:16.86 GB + FREE:10.90 GB == >>> MEM_MAX:57.50 >>> GB), desiredKV=7.19 GB OOM! >>> 05-16 15:13:56.847 127.0.0.1:54321 2059 #e Thread WARN: >>> Swapping! >>> GC CALLBACK, (K/V:29.74 GB + POJO:16.88 GB + FREE:10.88 GB == >>> MEM_MAX:57.50 >>> GB), desiredKV=7.19 GB OOM! >>> 05-16 15:14:10.215 127.0.0.1:54321 2059 #e Thread WARN: >>> Swapping! >>> GC CALLBACK, (K/V:29.74 GB + POJO:16.90 GB + FREE:10.86 GB == >>> MEM_MAX:57.50 >>> GB), desiredKV=7.19 GB OOM! >>> 05-16 15:14:33.622 127.0.0.1:54321 2059 #e Thread WARN: >>> Swapping! >>> GC CALLBACK, (K/V:29.74 GB + POJO:16.91 GB + FREE:10.85 GB == >>> MEM_MAX:57.50 >>> GB), desiredKV=7.19 GB OOM! >>> 05-16 15:14:47.075 127.0.0.1:54321 2059 #e Thread WARN: >>> Swapping! >>> GC CALLBACK, (K/V:29.74 GB + POJO:16.93 GB + FREE:10.84 GB == >>> MEM_MAX:57.50 >>> GB), desiredKV=7.19 GB OOM! >>> 05-16 15:15:10.555 127.0.0.1:54321 2059 #e Thread WARN: >>> Swapping! >>> GC CALLBACK, (K/V:29.74 GB + POJO:16.92 GB + FREE:10.84 GB == >>> MEM_MAX:57.50 >>> GB), desiredKV=7.19 GB OOM! >>> 05-16 15:15:25.520 127.0.0.1:54321 2059 #e Thread WARN: >>> Swapping! >>> GC CALLBACK, (K/V:29.74 GB + POJO:16.93 GB + FREE:10.84 GB == >>> MEM_MAX:57.50 >>> GB), desiredKV=7.19 GB OOM! >>> 05-16 15:15:39.087 127.0.0.1:54321 2059 #e Thread WARN: >>> Swapping! >>> GC CALLBACK, (K/V:29.74 GB + POJO:16.93 GB + FREE:10.84 GB == >>> MEM_MAX:57.50 >>> GB), desiredKV=7.19 GB OOM! 
>>> Exception in thread "HashSessionScavenger-0" java.lang.OutOfMemoryError: >>> GC >>> overhead limit exceeded >>> at >>> >>> java.util.concurrent.ConcurrentHashMap$ValuesView.iterator(ConcurrentHashMap.java:4683) >>> at >>> >>> org.eclipse.jetty.server.session.HashSessionManager.scavenge(HashSessionManager.java:314) >>> at >>> >>> org.eclipse.jetty.server.session.HashSessionManager$2.run(HashSessionManager.java:285) >>> at java.util.TimerThread.mainLoop(Timer.java:555) >>> at java.util.TimerThread.run(Timer.java:505) >>> 16/05/16 15:22:26 ERROR Executor: Exception in task 0.0 in stage 10.0 >>> (TID >>> 107) >>> java.lang.OutOfMemoryError: GC overhead limit exceeded >>> at java.lang.Double.valueOf(Double.java:519) >>> at scala.runtime.BoxesRunTime.boxToDouble(BoxesRunTime.java:84) >>> at >>> >>> org.apache.spark.sql.catalyst.expressions.MutableRow.setDouble(rows.scala:176) >>> at >>> >>> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown >>> Source) >>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) >>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) >>> at >>> scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30) >>> at >>> org.spark-project.guava.collect.Ordering.leastOf(Ordering.java:665) >>> at >>> org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37) >>> at >>> >>> org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1391) >>> at >>> >>> org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1388) >>> at >>> >>> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) >>> at >>> >>> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) >>> at >>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) >>> at >>> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) >>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) >>> at >>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) >>> at org.apache.spark.scheduler.Task.run(Task.scala:89) >>> at >>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) >>> at >>> >>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >>> at >>> >>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >>> at java.lang.Thread.run(Thread.java:745) >>> 05-16 15:22:26.947 127.0.0.1:54321 2059 #e Thread WARN: Unblock >>> allocations; cache below desired, but also OOM: GC CALLBACK, (K/V:29.74 >>> GB + >>> POJO:16.93 GB + FREE:10.83 GB == MEM_MAX:57.50 GB), desiredKV=38.52 GB >>> OOM! >>> 05-16 15:22:26.948 127.0.0.1:54321 2059 #e Thread WARN: >>> Swapping! >>> GC CALLBACK, (K/V:29.74 GB + POJO:14.94 GB + FREE:12.83 GB == >>> MEM_MAX:57.50 >>> GB), desiredKV=8.65 GB OOM! 
>>> 16/05/16 15:22:26 WARN HeartbeatReceiver: Removing executor driver with >>> no >>> recent heartbeats: 144662 ms exceeds timeout 120000 ms >>> 16/05/16 15:22:26 ERROR ActorSystemImpl: exception on LARS’ timer thread >>> java.lang.OutOfMemoryError: GC overhead limit exceeded >>> at >>> akka.dispatch.AbstractNodeQueue.<init>(AbstractNodeQueue.java:22) >>> at >>> >>> akka.actor.LightArrayRevolverScheduler$TaskQueue.<init>(Scheduler.scala:443) >>> at >>> >>> akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409) >>> at >>> akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375) >>> at java.lang.Thread.run(Thread.java:745) >>> 16/05/16 15:22:26 INFO ActorSystemImpl: starting new LARS thread >>> 16/05/16 15:22:26 ERROR TaskSchedulerImpl: Lost executor driver on >>> localhost: Executor heartbeat timed out after 144662 ms >>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 3.0 in stage 10.0 (TID >>> 110, >>> localhost): ExecutorLostFailure (executor driver exited caused by one of >>> the >>> running tasks) Reason: Executor heartbeat timed out after 144662 ms >>> 16/05/16 15:22:26 ERROR TaskSetManager: Task 3 in stage 10.0 failed 1 >>> times; >>> aborting job >>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 6.0 in stage 10.0 (TID >>> 113, >>> localhost): ExecutorLostFailure (executor driver exited caused by one of >>> the >>> running tasks) Reason: Executor heartbeat timed out after 144662 ms >>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 0.0 in stage 10.0 (TID >>> 107, >>> localhost): ExecutorLostFailure (executor driver exited caused by one of >>> the >>> running tasks) Reason: Executor heartbeat timed out after 144662 ms >>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 2.0 in stage 10.0 (TID >>> 109, >>> localhost): ExecutorLostFailure (executor driver exited caused by one of >>> the >>> running tasks) Reason: Executor heartbeat timed out after 144662 ms >>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 5.0 in stage 10.0 (TID >>> 112, >>> localhost): ExecutorLostFailure (executor driver exited caused by one of >>> the >>> running tasks) Reason: Executor heartbeat timed out after 144662 ms >>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 7.0 in stage 10.0 (TID >>> 114, >>> localhost): ExecutorLostFailure (executor driver exited caused by one of >>> the >>> running tasks) Reason: Executor heartbeat timed out after 144662 ms >>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 1.0 in stage 10.0 (TID >>> 108, >>> localhost): ExecutorLostFailure (executor driver exited caused by one of >>> the >>> running tasks) Reason: Executor heartbeat timed out after 144662 ms >>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 4.0 in stage 10.0 (TID >>> 111, >>> localhost): ExecutorLostFailure (executor driver exited caused by one of >>> the >>> running tasks) Reason: Executor heartbeat timed out after 144662 ms >>> 16/05/16 15:22:26 INFO TaskSchedulerImpl: Removed TaskSet 10.0, whose >>> tasks >>> have all completed, from pool >>> 16/05/16 15:22:26 ERROR ActorSystemImpl: Uncaught fatal error from thread >>> [sparkDriverActorSystem-scheduler-1] shutting down ActorSystem >>> [sparkDriverActorSystem] >>> java.lang.OutOfMemoryError: GC overhead limit exceeded >>> at >>> akka.dispatch.AbstractNodeQueue.<init>(AbstractNodeQueue.java:22) >>> at >>> >>> akka.actor.LightArrayRevolverScheduler$TaskQueue.<init>(Scheduler.scala:443) >>> at >>> >>> akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409) >>> at >>> 
akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375) >>> at java.lang.Thread.run(Thread.java:745) >>> 16/05/16 15:22:27 INFO RemoteActorRefProvider$RemotingTerminator: >>> Shutting >>> down remote daemon. >>> 16/05/16 15:22:27 INFO RemoteActorRefProvider$RemotingTerminator: Remote >>> daemon shut down; proceeding with flushing remote transports. >>> 16/05/16 15:22:27 WARN NettyRpcEnv: Ignored message: true >>> 16/05/16 15:22:27 WARN NettyRpcEnv: Ignored message: true >>> 16/05/16 15:22:27 ERROR SparkUncaughtExceptionHandler: Uncaught >>> exception in >>> thread Thread[Executor task launch worker-14,5,main] >>> java.lang.OutOfMemoryError: GC overhead limit exceeded >>> at java.lang.Double.valueOf(Double.java:519) >>> at scala.runtime.BoxesRunTime.boxToDouble(BoxesRunTime.java:84) >>> at >>> >>> org.apache.spark.sql.catalyst.expressions.MutableRow.setDouble(rows.scala:176) >>> at >>> >>> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown >>> Source) >>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) >>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) >>> at >>> scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30) >>> at >>> org.spark-project.guava.collect.Ordering.leastOf(Ordering.java:665) >>> at >>> org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37) >>> at >>> >>> org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1391) >>> at >>> >>> org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1388) >>> at >>> >>> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) >>> at >>> >>> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) >>> at >>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) >>> at >>> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) >>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) >>> at >>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) >>> at org.apache.spark.scheduler.Task.run(Task.scala:89) >>> at >>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) >>> at >>> >>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >>> at >>> >>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >>> at java.lang.Thread.run(Thread.java:745) >>> 16/05/16 15:22:27 INFO TaskSchedulerImpl: Cancelling stage 10 >>> 16/05/16 15:22:27 WARN SparkContext: Killing executors is only supported >>> in >>> coarse-grained mode >>> 16/05/16 15:22:27 INFO DAGScheduler: ResultStage 10 (head at >>> <ipython-input-13-f753ebdb6b0f>:13) failed in 667.824 s >>> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat >>> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with >>> master >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register >>> BlockManager >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager >>> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master. >>> 16/05/16 15:22:27 INFO DAGScheduler: Executor lost: driver (epoch 2) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerMasterEndpoint: Trying to remove >>> executor >>> driver from BlockManagerMaster. 
>>> 16/05/16 15:22:27 INFO BlockManagerMasterEndpoint: Removing block manager >>> BlockManagerId(driver, localhost, 59407) >>> 16/05/16 15:22:27 INFO DAGScheduler: Job 8 failed: head at >>> <ipython-input-13-f753ebdb6b0f>:13, took 667.845630 s >>> 16/05/16 15:22:27 ERROR BlockManager: Failed to report >>> broadcast_15_piece0 >>> to master; giving up. >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Removed driver successfully in >>> removeExecutor >>> 16/05/16 15:22:27 INFO DAGScheduler: Host added was in lost list earlier: >>> localhost >>> 16/05/16 15:22:27 INFO SparkContext: Invoking stop() from shutdown hook >>> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat >>> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with >>> master >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register >>> BlockManager >>> 16/05/16 15:22:27 INFO BlockManagerMasterEndpoint: Registering block >>> manager >>> localhost:59407 with 51.5 GB RAM, BlockManagerId(driver, localhost, >>> 59407) >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager >>> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master. >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in >>> memory >>> on localhost:59407 (size: 8.2 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat >>> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with >>> master >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register >>> BlockManager >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager >>> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master. >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in >>> memory >>> on localhost:59407 (size: 8.2 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat >>> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with >>> master >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register >>> BlockManager >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager >>> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master. 
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in >>> memory >>> on localhost:59407 (size: 8.2 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat >>> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with >>> master >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register >>> BlockManager >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager >>> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master. >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in >>> memory >>> on localhost:59407 (size: 8.2 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat >>> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with >>> master >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register >>> BlockManager >>> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager >>> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master. >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO SparkUI: Stopped Spark web UI at >>> http://192.168.107.30:4040 >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in >>> memory >>> on localhost:59407 (size: 8.2 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in >>> memory >>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >>> 16/05/16 15:22:56 INFO MapOutputTrackerMasterEndpoint: >>> MapOutputTrackerMasterEndpoint stopped! >>> 05-16 15:22:56.111 127.0.0.1:54321 2059 #e Thread WARN: >>> Swapping! >>> GC CALLBACK, (K/V:29.74 GB + POJO:15.20 GB + FREE:12.56 GB == >>> MEM_MAX:57.50 >>> GB), desiredKV=8.12 GB OOM! >>> 16/05/16 15:22:56 INFO RemoteActorRefProvider$RemotingTerminator: >>> Remoting >>> shut down. >>> 16/05/16 15:22:56 WARN NettyRpcEndpointRef: Error sending message >>> [message = >>> Heartbeat(driver,[Lscala.Tuple2;@797268e9,BlockManagerId(driver, >>> localhost, >>> 59407))] in 1 attempts >>> org.apache.spark.SparkException: Could not find HeartbeatReceiver or it >>> has >>> been stopped. 
>>> at >>> org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:161) >>> at >>> >>> org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:126) >>> at >>> org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:227) >>> at >>> org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:511) >>> at >>> >>> org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:100) >>> at >>> org.apache.spark.executor.Executor.org >>> $apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:449) >>> at >>> >>> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:470) >>> at >>> >>> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470) >>> at >>> >>> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470) >>> at >>> org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765) >>> at >>> org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:470) >>> at >>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) >>> at >>> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) >>> at >>> >>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) >>> at >>> >>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) >>> at >>> >>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >>> at >>> >>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >>> at java.lang.Thread.run(Thread.java:745)" >>> >>> >>> >>> -- >>> View this message in context: >>> http://apache-spark-user-list.1001560.n3.nabble.com/GC-overhead-limit-exceeded-tp26966.html >>> Sent from the Apache Spark User List mailing list archive at Nabble.com. >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >>> For additional commands, e-mail: user-h...@spark.apache.org >>> >>> >> >> >> -- >> --- >> Takeshi Yamamuro >> > > -- --- Takeshi Yamamuro