Hello everybody, I'm running a Spark Streaming app that I plan to operate as a long-running service. However, while testing the app in an RC environment, the master daemon hit this exception after about an hour of running:
Exception in thread "master-rebuild-ui-thread" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.regex.Pattern.compile(Pattern.java:1667)
        at java.util.regex.Pattern.<init>(Pattern.java:1351)
        at java.util.regex.Pattern.compile(Pattern.java:1054)
        at java.lang.String.replace(String.java:2239)
        at org.apache.spark.util.Utils$.getFormattedClassName(Utils.scala:1632)
        at org.apache.spark.util.JsonProtocol$.sparkEventFromJson(JsonProtocol.scala:486)
        at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:58)
        at org.apache.spark.deploy.master.Master$$anonfun$17.apply(Master.scala:972)
        at org.apache.spark.deploy.master.Master$$anonfun$17.apply(Master.scala:952)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

As a stopgap measure I've increased the master memory to 1.5 GB. The job runs with a batch interval of 5 seconds, on Spark 1.6.2.

I think it might be related to these issues:

https://issues.apache.org/jira/browse/SPARK-6270
https://issues.apache.org/jira/browse/SPARK-12062
https://issues.apache.org/jira/browse/SPARK-12299

But I don't see a clear way to solve this short of upgrading Spark. What would you recommend?

Thanks in advance,
Mariano
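For reference, this is roughly what I changed so far, plus the settings I'm considering as a further mitigation. The stack trace shows the master's rebuild-ui thread replaying event logs, so I'm guessing that limiting how many finished applications the master retains (and replays) might help; the property names are from the Spark 1.6 docs, but the values are just what I tried or am guessing at:

```shell
# conf/spark-env.sh -- stopgap: give the standalone master daemon more heap
# (the "master-rebuild-ui-thread" replays event logs inside the master JVM)
export SPARK_DAEMON_MEMORY=1536m

# Limit how many completed applications/drivers the master keeps around,
# and therefore how many event logs it may have to replay for the UI.
export SPARK_MASTER_OPTS="-Dspark.deploy.retainedApplications=20 \
  -Dspark.deploy.retainedDrivers=20"

# Alternatively, in conf/spark-defaults.conf, disabling the event log
# entirely would avoid the replay (at the cost of losing the history UI):
# spark.eventLog.enabled  false
```

I haven't confirmed yet whether the retained-applications limit actually bounds the replay work in 1.6.2, so please correct me if that's off.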