Hi All,

I am running a Spark cluster on EC2 instances of type m3.2xlarge. I have given my executors 26 GB of memory and all 8 cores, and I can see that in the logs too:
*15/03/27 21:31:06 INFO AppClient$ClientActor: Executor added: app-20150327213106-0000/0 on worker-20150327212934-10.x.y.z-40128 (10.x.y.z:40128) with 8 cores*

I am not caching any RDDs, so I have set "spark.storage.memoryFraction" to 0.2. On the SparkUI executors tab I can see "Memory used: 0.0 / 4.5 GB". I am now confused by this log line:

*15/03/27 21:31:08 INFO BlockManagerMasterActor: Registering block manager 10.77.100.196:58407 with 4.5 GB RAM, BlockManagerId(4, 10.x.y.z, 58407)*

I am broadcasting a large object of 3 GB, and afterwards, when I create an RDD, I see logs showing this 4.5 GB of memory filling up, and then I get an OOM. How can I make the block manager use more memory? Is there any other fine tuning I need to do for broadcasting large objects? And do broadcast variables use the storage (cache) memory or the rest of the heap?

Thanks
Ankur
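P.S. For reference, this is roughly how I am launching the job. The master URL and jar name below are placeholders; the memory, core, and memoryFraction values are the ones described above:

```shell
# Hypothetical launch command matching the settings described above:
# 26 GB per executor, 8 cores, spark.storage.memoryFraction = 0.2.
spark-submit \
  --master spark://my-master-host:7077 \
  --executor-memory 26g \
  --total-executor-cores 8 \
  --conf spark.storage.memoryFraction=0.2 \
  my-job.jar
```

If I understand the Spark 1.x memory model correctly, storage memory is roughly executor memory x spark.storage.memoryFraction x spark.storage.safetyFraction (default 0.9), i.e. 26 x 0.2 x 0.9 ~= 4.7 GB, which (after JVM overhead shrinks the reported max heap) would roughly match the 4.5 GB in the log. Please correct me if that reasoning is wrong.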