Hi Ankur
If your hardware is OK, this looks like a config problem. Can you show me
your spark-env.sh or JVM config?
Thanks
Wisely Chen
2015-03-28 15:39 GMT+08:00 Ankur Srivastava ankur.srivast...@gmail.com:
Hi Wisely,
I have 26 GB for the driver, and the master is running on m3.2xlarge machines.
I see OOM errors on the workers even though they are running with 26 GB of memory.
Thanks
On Fri, Mar 27, 2015, 11:43 PM Wisely Chen wiselyc...@appier.com wrote:
Hi
In a broadcast, Spark collects the whole 3 GB object on the master node
and then broadcasts it to each slave. It is very common for the master
node not to have enough memory.
What are your master node settings?
Wisely Chen
On Saturday, March 28, 2015, Ankur Srivastava ankur.srivast...@gmail.com wrote:
I have increased spark.storage.memoryFraction to 0.4, but I still
get OOM errors on the Spark executor nodes:
15/03/27 23:19:51 INFO BlockManagerMaster: Updated info of block broadcast_5_piece10
15/03/27 23:19:51 INFO TorrentBroadcast: Reading broadcast variable 5 took 2704 ms
15/03/27 23:19:52 INFO MemoryStore: ensureFreeSpace(672530208) called with curMem=2484698683, maxMem=9631778734
15/03/27 23:19:52 INFO MemoryStore: Block broadcast_5 stored as values in memory (estimated size 641.4 MB, free 6.0 GB)
15/03/27 23:34:02 WARN AkkaUtils: Error sending message in 1 attempts
java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:107)
    at org.apache.spark.util.AkkaUtils$.askWithReply(AkkaUtils.scala:187)
    at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:407)
15/03/27 23:34:02 ERROR Executor: Exception in task 7.0 in stage 2.0 (TID 4007)
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1986)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
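
The MemoryStore figures in this log can be checked by hand. The sketch below is plain Python arithmetic on the numbers copied from the log; the function name free_after_store is made up for illustration. It suggests the storage region was far from full when broadcast_5 was stored, which would be consistent with the later OOM coming from the rest of the heap (the stack trace points at Java deserialization) rather than from the block manager limit. That reading is an inference from the log, not something the log states.

```python
# Figures copied from the MemoryStore log lines above.
block_size = 672_530_208    # bytes requested via ensureFreeSpace(...)
cur_mem = 2_484_698_683     # curMem: bytes already held by the MemoryStore
max_mem = 9_631_778_734     # maxMem: storage limit reported by the MemoryStore

MB = 1024 ** 2
GB = 1024 ** 3


def free_after_store(max_mem, cur_mem, block_size):
    """Storage bytes left once the new block has been stored (hypothetical helper)."""
    return max_mem - cur_mem - block_size


free = free_after_store(max_mem, cur_mem, block_size)
print(f"block size: {block_size / MB:.1f} MB")  # 641.4 MB, as in the log
print(f"free after: {free / GB:.1f} GB")        # 6.0 GB, as in the log
```

Both values agree with the "estimated size 641.4 MB, free 6.0 GB" message, so about two thirds of the storage region was still unused at that point.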
Thanks
Ankur
On Fri, Mar 27, 2015 at 2:52 PM, Ankur Srivastava
ankur.srivast...@gmail.com wrote:
Hi All,
I am running a Spark cluster on EC2 instances of type m3.2xlarge. I
have given 26 GB of memory and all 8 cores to my executors. I can see that
in the logs too:
*15/03/27 21:31:06 INFO AppClient$ClientActor: Executor added:
app-20150327213106-/0 on worker-20150327212934-10.x.y.z-40128
(10.x.y.z:40128) with 8 cores*
I am not caching any RDD, so I have set spark.storage.memoryFraction
to 0.2. In the Spark UI, under the Executors tab, memory used is 0.0/4.5 GB.
What confuses me is this log line:
*15/03/27 21:31:08 INFO BlockManagerMasterActor: Registering block
manager 10.77.100.196:58407 with 4.5 GB RAM,
BlockManagerId(4, 10.x.y.z, 58407)*
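
The 4.5 GB in this log line is probably not the executor heap but the block manager's share of it. As a hedged sketch, assuming Spark 1.x sizes that share as JVM max heap * spark.storage.memoryFraction * spark.storage.safetyFraction, with the safety fraction defaulting to 0.9, the arithmetic can be cross-checked in Python. The heap size is not logged anywhere in this thread; it is backed out from the maxMem=9631778734 value in the follow-up message (where memoryFraction was 0.4), and storage_limit is an illustrative helper, not a Spark API.

```python
GB = 1024 ** 3

# Assumption: Spark 1.x default for spark.storage.safetyFraction.
SAFETY_FRACTION = 0.9


def storage_limit(jvm_max_heap, memory_fraction, safety_fraction=SAFETY_FRACTION):
    """Bytes available to the block manager under the assumed formula."""
    return int(jvm_max_heap * memory_fraction * safety_fraction)


# maxMem logged in the follow-up message, where memoryFraction was 0.4.
max_mem_at_04 = 9_631_778_734

# Back out the heap the JVM reported (an inference, not a logged value).
jvm_max_heap = max_mem_at_04 / (0.4 * SAFETY_FRACTION)
print(f"inferred JVM max heap: {jvm_max_heap / GB:.1f} GB")  # 24.9 GB

# Predict the block manager size for memoryFraction = 0.2.
limit_at_02 = storage_limit(jvm_max_heap, 0.2)
print(f"storage limit at 0.2: {limit_at_02 / GB:.1f} GB")    # 4.5 GB
```

Under these assumptions a 26 GB executor whose JVM reports roughly 24.9 GB of usable heap gives 4.5 GB at a fraction of 0.2, matching the log. If that reading is right, raising the fraction only moves the boundary between storage and the rest of the heap; it does not add memory.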
I am broadcasting a large object of 3 GB. After that, when I create
an RDD, I see logs showing this 4.5 GB of memory filling up, and
then I get an OOM.
How can I make the block manager use more memory?
Is there any other fine-tuning I need to do for broadcasting large
objects?
And do broadcast variables use the cache memory or the rest of the heap?
Thanks
Ankur