To explain more, we upgraded from 0.7.3 to the 0.9 incubating snapshot today
and are getting out-of-memory errors very quickly, even though our cluster
has plenty of RAM and the data is relatively small:

Using Scala version 2.9.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_21)
Initializing interpreter...
Creating SparkContext...
13/11/19 23:17:20 INFO Slf4jEventHandler: Slf4jEventHandler started
13/11/19 23:17:20 INFO SparkEnv: Registering BlockManagerMaster
13/11/19 23:17:20 INFO DiskBlockManager: Created local directory at
/opt/spark/tmp/spark-local-20131119231720-a023
13/11/19 23:17:20 INFO MemoryStore: MemoryStore started with capacity 323.9
MB.
13/11/19 23:17:20 INFO ConnectionManager: Bound socket to port 11240 with
id = ConnectionManagerId(spark-shell-01,11240)
13/11/19 23:17:20 INFO BlockManagerMaster: Trying to register BlockManager
13/11/19 23:17:20 INFO BlockManagerMasterActor$BlockManagerInfo:
Registering block manager spark-shell-01:11240 with 323.9 MB RAM
13/11/19 23:25:17 INFO BlockManagerMasterActor$BlockManagerInfo:
Registering block manager dn-02:50623 with 1943.0 MB RAM
13/11/19 23:25:17 INFO BlockManagerMasterActor$BlockManagerInfo:
Registering block manager dn-01:61960 with 1943.0 MB RAM
13/11/19 23:25:18 INFO BlockManagerMasterActor$BlockManagerInfo:
Registering block manager dn-03:45775 with 1943.0 MB RAM
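
A note on reading the sizes above: the 323.9 MB figure is reported for
spark-shell-01, the host running the shell, while the three dn-* block
managers each register 1943.0 MB. The sketch below works backwards from
those two numbers under the assumption (not stated anywhere in this thread)
that the memory store is sized as a fixed fraction of the JVM's maximum
heap, with 0.66 as that fraction. The object and method names are made up
for illustration.

```scala
// Sketch only: assumes the block manager's capacity is a fixed fraction
// of the JVM max heap. The 0.66 value is an assumption, not read from the log.
object StorageCapacitySketch {
  val AssumedStorageFraction: Double = 0.66

  // Capacity (MB) that a JVM with the given max heap would report
  // under the assumed fraction.
  def capacityMb(maxHeapMb: Double, fraction: Double): Double =
    maxHeapMb * fraction

  // Inverse: the max heap (MB) implied by a reported capacity.
  def impliedHeapMb(reportedCapacityMb: Double, fraction: Double): Double =
    reportedCapacityMb / fraction

  def main(args: Array[String]): Unit = {
    // Capacities copied from the "Registering block manager" lines.
    val shellHeap = impliedHeapMb(323.9, AssumedStorageFraction)
    val executorHeap = impliedHeapMb(1943.0, AssumedStorageFraction)
    println("Implied heap for spark-shell-01 (MB): " + shellHeap)
    println("Implied heap for each dn-* executor (MB): " + executorHeap)

    // For comparison, the max heap of whichever JVM runs this sketch.
    val localHeapMb = Runtime.getRuntime.maxMemory / (1024.0 * 1024.0)
    println("Max heap of this JVM (MB): " + localHeapMb)
  }
}
```

Under that assumption the executors come out near 2944 MB of usable heap,
which is in line with a 3 GB spark.executor.memory setting, while the shell
JVM comes out near 491 MB. If that reading holds, the 323.9 MB line
describes the shell process rather than the executors.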

I've included memory store output for more information:


13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113598) called with
curMem=0, maxMem=339585269
13/11/19 23:40:38 INFO MemoryStore: Block broadcast_0 stored as values to
memory (estimated size 110.9 KB, free 323.7 MB)
13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=113598, maxMem=339585269
13/11/19 23:40:38 INFO MemoryStore: Block broadcast_1 stored as values to
memory (estimated size 111.0 KB, free 323.6 MB)
13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=227244, maxMem=339585269
13/11/19 23:40:38 INFO MemoryStore: Block broadcast_2 stored as values to
memory (estimated size 111.0 KB, free 323.5 MB)
13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=340890, maxMem=339585269
13/11/19 23:40:38 INFO MemoryStore: Block broadcast_3 stored as values to
memory (estimated size 111.0 KB, free 323.4 MB)
13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=454536, maxMem=339585269
13/11/19 23:40:38 INFO MemoryStore: Block broadcast_4 stored as values to
memory (estimated size 111.0 KB, free 323.3 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=568182, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_5 stored as values to
memory (estimated size 111.0 KB, free 323.2 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=681828, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_6 stored as values to
memory (estimated size 111.0 KB, free 323.1 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=795474, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_7 stored as values to
memory (estimated size 111.0 KB, free 323.0 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=909120, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_8 stored as values to
memory (estimated size 111.0 KB, free 322.9 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=1022766, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_9 stored as values to
memory (estimated size 111.0 KB, free 322.8 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=1136412, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_10 stored as values to
memory (estimated size 111.0 KB, free 322.7 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=1250058, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_11 stored as values to
memory (estimated size 111.0 KB, free 322.6 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=1363704, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_12 stored as values to
memory (estimated size 111.0 KB, free 322.4 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=1477350, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_13 stored as values to
memory (estimated size 111.0 KB, free 322.3 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=1590996, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_14 stored as values to
memory (estimated size 111.0 KB, free 322.2 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=1704642, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_15 stored as values to
memory (estimated size 111.0 KB, free 322.1 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=1818288, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_16 stored as values to
memory (estimated size 111.0 KB, free 322.0 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=1931934, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_17 stored as values to
memory (estimated size 111.0 KB, free 321.9 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=2045580, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_18 stored as values to
memory (estimated size 111.0 KB, free 321.8 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=2159226, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_19 stored as values to
memory (estimated size 111.0 KB, free 321.7 MB)
13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=2272872, maxMem=339585269
13/11/19 23:40:39 INFO MemoryStore: Block broadcast_20 stored as values to
memory (estimated size 111.0 KB, free 321.6 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=2386518, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_21 stored as values to
memory (estimated size 111.0 KB, free 321.5 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=2500164, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_22 stored as values to
memory (estimated size 111.0 KB, free 321.4 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=2613810, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_23 stored as values to
memory (estimated size 111.0 KB, free 321.3 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=2727456, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_24 stored as values to
memory (estimated size 111.0 KB, free 321.1 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=2841102, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_25 stored as values to
memory (estimated size 111.0 KB, free 321.0 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=2954748, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_26 stored as values to
memory (estimated size 111.0 KB, free 320.9 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=3068394, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_27 stored as values to
memory (estimated size 111.0 KB, free 320.8 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=3182040, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_28 stored as values to
memory (estimated size 111.0 KB, free 320.7 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=3295686, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_29 stored as values to
memory (estimated size 111.0 KB, free 320.6 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=3409332, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_30 stored as values to
memory (estimated size 111.0 KB, free 320.5 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=3522978, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_31 stored as values to
memory (estimated size 111.0 KB, free 320.4 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=3636624, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_32 stored as values to
memory (estimated size 111.0 KB, free 320.3 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=3750270, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_33 stored as values to
memory (estimated size 111.0 KB, free 320.2 MB)
13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with
curMem=3863916, maxMem=339585269
13/11/19 23:40:40 INFO MemoryStore: Block broadcast_34 stored as values to
memory (estimated size 111.0 KB, free 320.1 MB)
13/11/19 23:40:40 INFO MemoryStore..
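
As a rough cross-check of the MemoryStore lines above, the sketch below
redoes the arithmetic using only the byte counts printed in the log
(maxMem, the size of broadcast_0, and the size of each later broadcast
block). The object and helper names are hypothetical.

```scala
// Recomputes the memory store usage from the numbers printed in the log.
object MemoryStoreUsage {
  val BytesPerMb: Double = 1024.0 * 1024.0
  val MaxMem: Long = 339585269L     // maxMem from the log
  val FirstBlock: Long = 113598L    // ensureFreeSpace size for broadcast_0
  val LaterBlock: Long = 113646L    // ensureFreeSpace size for broadcast_1 onward

  def toMb(bytes: Long): Double = bytes / BytesPerMb

  // Round to one decimal place, matching the precision the log prints.
  def roundTo1(x: Double): Double = math.round(x * 10.0) / 10.0

  // Bytes held once broadcast_0 through broadcast_<lastIndex> are stored.
  def usedAfter(lastIndex: Int): Long =
    FirstBlock + lastIndex.toLong * LaterBlock

  def main(args: Array[String]): Unit = {
    val used = usedAfter(34)
    val free = MaxMem - used
    println("Capacity (MB): " + roundTo1(toMb(MaxMem)))
    println("Used after broadcast_34 (MB): " + roundTo1(toMb(used)))
    println("Free after broadcast_34 (MB): " + roundTo1(toMb(free)))
    println("Fraction used: " + used.toDouble / MaxMem)
  }
}
```

By the last complete line shown, the store holds about 3.8 MB out of
323.9 MB, a little over 1 percent, so this portion of the log does not by
itself show the memory store filling up.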



Thanks,

Gary


On Tue, Nov 19, 2013 at 6:22 PM, Gary Malouf <malouf.g...@gmail.com> wrote:

> We have a 4-node Spark cluster with 3 GB of RAM available per executor
> (via the spark.executor.memory setting).  When we run a Spark job, we see
> the following output:
>
> Using Scala version 2.9.3 (Java HotSpot(TM) 64-Bit Server VM, Java
> 1.7.0_21)
> Initializing interpreter...
> Creating SparkContext...
> 13/11/19 23:17:20 INFO Slf4jEventHandler: Slf4jEventHandler started
> 13/11/19 23:17:20 INFO SparkEnv: Registering BlockManagerMaster
> 13/11/19 23:17:20 INFO DiskBlockManager: Created local directory at
> /opt/spark/tmp/spark-local-20131119231720-a023
> 13/11/19 23:17:20 INFO MemoryStore: MemoryStore started with capacity
> 323.9 MB.
> 13/11/19 23:17:20 INFO ConnectionManager: Bound socket to port 11240 with
> id = ConnectionManagerId(spark-shell-01,11240)
> 13/11/19 23:17:20 INFO BlockManagerMaster: Trying to register BlockManager
> 13/11/19 23:17:20 INFO BlockManagerMasterActor$BlockManagerInfo:
> Registering block manager spark-shell-01:11240 with 323.9 MB RAM
>
> Is this right?  I feel like much more RAM should be available.
>
