[ https://issues.apache.org/jira/browse/SPARK-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046438#comment-15046438 ]
Xu Chen commented on SPARK-11487:
---------------------------------

{code}
java.lang.OutOfMemoryError: Java heap space
{code}
Increase the Master's heap memory.

> Spark Master shutdown automatically after some applications execution
> ---------------------------------------------------------------------
>
>                 Key: SPARK-11487
>                 URL: https://issues.apache.org/jira/browse/SPARK-11487
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.0
>         Environment: Spark Standalone on CentOS 6.6,
> One Master and 5 worker nodes cluster (Each Node Memory: > 150 GB each, 72 cores each)
>            Reporter: Sandeep Pal
>              Labels: master
>
> The master logs are as follow after the spark automatic shutdown:
> 15/11/02 20:50:01 INFO master.Master: Registering app PythonWordCount
> 15/11/02 20:50:01 INFO master.Master: Registered app PythonWordCount with ID app-20151102205001-0025
> 15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/0 on worker worker-20151030135450-x.x.x.76-42502
> 15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/1 on worker worker-20151030135450-x.x.x.86-51916
> 15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/2 on worker worker-20151030135450-x.x.x.85-47388
> 15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/3 on worker worker-20151030125450-x.x.x.69-51604
> 15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/4 on worker worker-20151030135450-x.x.x.87-35705
> 15/11/02 20:57:35 INFO master.Master: Received unregister request from application app-20151102205001-0025
> 15/11/02 20:57:35 INFO master.Master: Removing app app-20151102205001-0025
> 15/11/02 20:57:35 WARN master.Master: Application PythonWordCount is still in progress, it may be terminated abnormally.
> 15/11/02 20:57:35 INFO spark.SecurityManager: Changing view acls to: root
> 15/11/02 20:57:35 INFO spark.SecurityManager: Changing modify acls to: root
> 15/11/02 20:57:35 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
> 15/11/02 20:57:43 INFO master.Master: x.x.x.x:47502 got disassociated, removing it.
> 15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/4
> 15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/3
> 15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/0
> 15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/2
> 15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/1
> 15/11/02 20:58:28 INFO master.Master: Registering app App Test
> 15/11/02 20:58:28 INFO master.Master: Registered app App Test with ID app-20151102205828-0026
> 15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/0 on worker worker-20151030135450-x.x.x.76-42502
> 15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/1 on worker worker-20151030135450-x.x.x.86-51916
> 15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/2 on worker worker-20151030135450-x.x.x.85-47388
> 15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/3 on worker worker-20151030125450-x.x.x.69-51604
> 15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/4 on worker worker-20151030135450-x.x.x.87-35705
> 15/11/02 20:59:35 INFO master.Master: Received unregister request from application app-20151102205828-0026
> 15/11/02 20:59:35 INFO master.Master: Removing app app-20151102205828-0026
> 15/11/02 20:59:35 WARN master.Master: Application App Test is still in progress, it may be terminated abnormally.
> 15/11/02 20:59:35 INFO spark.SecurityManager: Changing view acls to: root
> 15/11/02 20:59:35 INFO spark.SecurityManager: Changing modify acls to: root
> 15/11/02 20:59:35 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
> 15/11/02 21:17:46 INFO master.Master: x.x.x.x:40954 got disassociated, removing it.
> 15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/3
> 15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/1
> 15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/0
> 15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/2
> 15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/4
> 15/11/02 21:17:46 INFO master.Master: x.x.x.x:37676 got disassociated, removing it.
> 15/11/02 21:17:48 ERROR akka.ErrorMonitor: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-3] shutting down ActorSystem [sparkMaster]
> java.lang.OutOfMemoryError: Java heap space
>     at com.fasterxml.jackson.core.util.BufferRecycler.calloc(BufferRecycler.java:156)
>     at com.fasterxml.jackson.core.util.BufferRecycler.allocCharBuffer(BufferRecycler.java:124)
>     at com.fasterxml.jackson.core.io.IOContext.allocTokenBuffer(IOContext.java:181)
>     at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:830)
>     at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2161)
>     at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:19)
>     at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:44)
>     at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:58)
>     at org.apache.spark.deploy.master.Master.rebuildSparkUI(Master.scala:950)
>     at org.apache.spark.deploy.master.Master.removeApplication(Master.scala:812)
>     at org.apache.spark.deploy.master.Master.org$apache$spark$deploy$master$Master$$finishApplication(Master.scala:790)
>     at org.apache.spark.deploy.master.Master$$anonfun$receive$1$$anonfun$applyOrElse$21.apply(Master.scala:382)
>     at org.apache.spark.deploy.master.Master$$anonfun$receive$1$$anonfun$applyOrElse$21.apply(Master.scala:382)
>     at scala.Option.foreach(Option.scala:236)
>     at org.apache.spark.deploy.master.Master$$anonfun$receive$1.applyOrElse(Master.scala:382)
>     at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$processMessage(AkkaRpcEnv.scala:177)
>     at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$4.apply$mcV$sp(AkkaRpcEnv.scala:126)
>     at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$safelyCall(AkkaRpcEnv.scala:197)
>     at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1.applyOrElse(AkkaRpcEnv.scala:125)
>     at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>     at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>     at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>     at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:59)
>     at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
>     at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>     at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
>     at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
>     at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1.aroundReceive(AkkaRpcEnv.scala:92)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:220)
> 15/11/02 21:17:48 ERROR actor.ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-3] shutting down ActorSystem [sparkMaster]
> java.lang.OutOfMemoryError: Java heap space
>     at com.fasterxml.jackson.core.util.BufferRecycler.calloc(BufferRecycler.java:156)
>     at com.fasterxml.jackson.core.util.BufferRecycler.allocCharBuffer(BufferRecycler.java:124)
>     at com.fasterxml.jackson.core.io.IOContext.allocTokenBuffer(IOContext.java:181)
>     at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:830)
>     at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2161)
>     at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:19)
>     at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:44)
>     at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:58)
>     at org.apache.spark.deploy.master.Master.rebuildSparkUI(Master.scala:950)
>     at org.apache.spark.deploy.master.Master.removeApplication(Master.scala:812)
>     at org.apache.spark.deploy.master.Master.org$apache$spark$deploy$master$Master$$finishApplication(Master.scala:790)
>     at org.apache.spark.deploy.master.Master$$anonfun$receive$1$$anonfun$applyOrElse$21.apply(Master.scala:382)
>     at org.apache.spark.deploy.master.Master$$anonfun$receive$1$$anonfun$applyOrElse$21.apply(Master.scala:382)
>     at scala.Option.foreach(Option.scala:236)
>     at org.apache.spark.deploy.master.Master$$anonfun$receive$1.applyOrElse(Master.scala:382)
>     at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$processMessage(AkkaRpcEnv.scala:177)
>     at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$4.apply$mcV$sp(AkkaRpcEnv.scala:126)
>     at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$safelyCall(AkkaRpcEnv.scala:197)
>     at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1.applyOrElse(AkkaRpcEnv.scala:125)
>     at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>     at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>     at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>     at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:59)
>     at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
>     at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>     at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
>     at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
>     at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1.aroundReceive(AkkaRpcEnv.scala:92)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:220)
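
For reference, a minimal sketch of how the suggestion to raise the Master's heap might be applied. In standalone mode the master and worker daemons take their heap size from SPARK_DAEMON_MEMORY (default 1g), which is normally set in conf/spark-env.sh on the master host. The 4g value below is an assumed example, not a figure from this issue; the stack trace shows the OutOfMemoryError inside Master.rebuildSparkUI while ReplayListenerBus.replay parses an event log, so the right size likely depends on how large the finished applications' event logs are.

```shell
# conf/spark-env.sh on the master host (sketch; value is an assumption)
# Heap for the standalone Master and Worker daemons themselves,
# not for executors or drivers. The default is 1g.
export SPARK_DAEMON_MEMORY=4g
```

The master has to be restarted for the new value to take effect, for example with sbin/stop-master.sh followed by sbin/start-master.sh.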