Re: Memory allocation for Broadcast values

2015-12-25 Thread Chris Fregly
Note that starting with Spark 1.6, memory can be dynamically allocated by the Spark execution engine based on workload heuristics. You can still set a low watermark for the spark.storage.memoryFraction (RDD cache), but the rest can be dynamic. Here are some relevant slides from a recent
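A minimal sketch of what this looks like in Spark 1.6's unified memory model (the fraction values shown are the 1.6 defaults, used here for illustration):

```scala
import org.apache.spark.SparkConf

// Spark 1.6+: execution and storage share one unified pool,
// sized by spark.memory.fraction. spark.memory.storageFraction
// is the "low watermark": cached blocks below this share of the
// pool are protected from eviction by execution memory.
val conf = new SparkConf()
  .set("spark.memory.fraction", "0.75")        // 1.6 default
  .set("spark.memory.storageFraction", "0.5")  // storage floor within the pool
// The pre-1.6 spark.storage.memoryFraction setting only takes
// effect if spark.memory.useLegacyMode is set to true.
```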

Re: Memory allocation for Broadcast values

2015-12-22 Thread Akhil Das
If you are creating a huge map on the driver, then spark.driver.memory should be set to a higher value to hold your map. Since you are going to broadcast this map, your Spark executors must have enough memory to hold it as well, which can be set using spark.executor.memory, and
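A sketch of the advice above; the memory values and the HDFS path are illustrative placeholders, not recommendations:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// The driver needs room to assemble the map; each executor needs
// room to hold its deserialized copy of the broadcast.
// Note: spark.driver.memory must be set before the driver JVM starts,
// e.g. via `spark-submit --driver-memory 8g`; setting it here in code
// has no effect once the JVM is running.
val conf = new SparkConf()
  .setAppName("broadcast-map")
  .set("spark.executor.memory", "4g")  // each executor holds one copy

val sc = new SparkContext(conf)

val bigMap: Map[String, Int] = ???     // assembled on the driver
val bcMap = sc.broadcast(bigMap)       // shipped once per executor

// Workers read the broadcast via .value; no per-task reserialization.
val counts = sc.textFile("hdfs:///path/to/input")
  .map(line => bcMap.value.getOrElse(line, 0))
```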

Memory allocation for Broadcast values

2015-12-20 Thread Pat Ferrel
I have a large Map that is assembled in the driver and broadcast to each node. My question is how best to allocate memory for this. The driver has to have enough memory for the Maps, but only one copy is serialized to each node. What type of memory should I size to match the Maps? Is the
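One way to get a concrete number to size against: Spark ships a SizeEstimator utility that approximates an object's deserialized in-memory footprint, which is what the executors ultimately pay for (the serialized copy sent over the wire is typically smaller). A hedged sketch, with a synthetic map standing in for the real one:

```scala
import org.apache.spark.util.SizeEstimator

// Rough estimate of the map's deserialized size on the JVM heap.
// This is the figure to budget executor (and driver) memory against.
val bigMap: Map[String, Long] = (1L to 1000000L).map(i => (i.toString, i)).toMap
val bytes = SizeEstimator.estimate(bigMap)
println(s"Approximate in-memory size: ${bytes / (1024 * 1024)} MB")
```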