Hi All,

I am running a Spark cluster on EC2 instances of type m3.2xlarge. I have given my executors 26 GB of memory and all 8 cores, and I can see that in the logs too:
*15/03/27 21:31:06 INFO AppClient$ClientActor: Executor added: app-20150327213106-0000/0 on worker-20150327212934-10.x.y.z-40128 (10.x.y.z:40128) with 8 cores*

I am not caching any RDDs, so I have set "spark.storage.memoryFraction" to 0.2. On the SparkUI executors tab I can see "Memory used: 0.0 / 4.5 GB". I am now confused by this log line:

*15/03/27 21:31:08 INFO BlockManagerMasterActor: Registering block manager 10.77.100.196:58407 with 4.5 GB RAM, BlockManagerId(4, 10.x.y.z, 58407)*

I am broadcasting a large object of 3 GB, and afterwards, when I create an RDD, I see logs showing this 4.5 GB of memory filling up, and then I get an OOM. How can I make the block manager use more memory? Is there any other fine tuning I need to do for broadcasting large objects? And do broadcast variables use the storage (cache) memory or the rest of the heap?

Thanks
Ankur
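P.S. For reference, this is roughly how I am launching the job. The master URL and jar name below are placeholders; the memory, core, and memoryFraction values are the ones described above:

```shell
# Hypothetical launch command matching the settings described above:
# 26 GB per executor, 8 cores, spark.storage.memoryFraction = 0.2.
spark-submit \
  --master spark://my-master-host:7077 \
  --executor-memory 26g \
  --total-executor-cores 8 \
  --conf spark.storage.memoryFraction=0.2 \
  my-job.jar
```

If I understand the Spark 1.x memory model correctly, storage memory is roughly executor memory x spark.storage.memoryFraction x spark.storage.safetyFraction (default 0.9), i.e. 26 x 0.2 x 0.9 ~= 4.7 GB, which (after JVM overhead shrinks the reported max heap) would roughly match the 4.5 GB in the log. Please correct me if that reasoning is wrong.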